C.3. Tables #

C.3.1. What kind of tables are there in Shardman?
C.3.2. What are global tables?
C.3.3. What are global tables suitable for?
C.3.4. What are sharded tables?
C.3.5. Which partitioning parameters are optimal when creating a sharded table?
C.3.6. What are colocated tables?
C.3.7. How to create a colocated table?
C.3.8. What are local tables?
C.3.9. Are foreign keys supported in Shardman?

C.3.1. What kind of tables are there in Shardman? #

In addition to local table types Shardman supports distributed tables: global and sharded.

C.3.2. What are global tables? #

A global table in Shardman is a table that has the same schema and contents on all shards in the cluster. Global tables are created as follows:

CREATE TABLE g(id bigint PRIMARY KEY, t text) WITH(global);
    

A copy of such a table is created on each shard. Data replication of global tables is based on triggers. When data is inserted into such a table on any node of the cluster, data replication to other nodes occurs. When creating a global table, it is necessary to specify non deferrable primary key.

C.3.3. What are global tables suitable for? #

Global tables are suitable for directories and other relatively small and infrequently modified tables. Global tables are NOT suitable for storing large amounts of data and for intensive INSERT/UPDATE/DELETE workload, especially with highly competitive access (storefronts, queues, etc.)

C.3.4. What are sharded tables? #

Sharded tables are tables whose parts are hosted on different shards. Each shard stores its own piece of data from such a table. A sharded table can be created as follows:

CREATE TABLE ... WITH(distributed_by = 'column_name', num_parts = number_of_partitions);
    

Where:

distributed_by — table field being the sharding key,

num_parts — (default = 24) number of partitions into which the table is initially divided.

These parts are then distributed to shards.

C.3.5.  Which partitioning parameters are optimal when creating a sharded table? #

The number of partitions should be not less than the number of shards including the shards that can be added later. In general it may be a number with quite a few divisors like 12 or 24, so you can evenly divide the table into 2, 3, 4 or 6 shards. Large amount of partitions adds overhead on planning and execution, so it is preferable to keep it reasonable.

C.3.6.  What are colocated tables? #

Colocated tables are used when a table is often joined with another sharded table (usually by foreign keys) and therefore it is better to physically place their parts on the same shards.

C.3.7.  How to create a colocated table? #

CREATE TABLE ... WITH(distributed_by = 'column_name', num_parts = number_of_partitions, colocate_with = 'distributed_table');
    

Here:

distributed_by = 'column_name' — the name of the sharding key as it is called in the colocated table (not the colocating table) being created,

colocate_with = 'distributed_table' — the name of the table with which you want to colocate parts of the colocated table.

C.3.8.  What are local tables? #

A local table is a table only hosted on the shard where it was created.

C.3.9.  Are foreign keys supported in Shardman? #

Foreign keys are allowed in Shardman but with some limitations:

  • On global tables, both from sharded tables and from other global tables

  • Between sharded colocated tables.

Foreign keys are NOT allowed:

  • From global to sharded tables

  • Between sharded tables if they are not colocated.