Skip to main content

Create and Manage Tables in Cloudberry Database

Cloudberry Database tables are similar to tables in any relational database, except that table rows are distributed across the different segments in the system. When you create a table, you specify the table's distribution policy.

Create a table

The CREATE TABLE command creates a table and defines its structure. When you create a table, you define:

  • The columns of the table and their associated data types. See Choose column data types.
  • Any table or column constraints to limit the data that a column or table can contain. See Setting table and column constraints.
  • The distribution policy of the table, which determines how Cloudberry Database divides data across the segments.
  • The way the table is stored on disk.
  • The table partitioning strategy for large tables.

Choose column data types

The data type of a column determines the types of data values the column can contain. Choose the data type that uses the least possible space but can still accommodate your data and that best constrains the data. For example, use character data types for strings, date or timestamp data types for dates, and numeric data types for numbers.

For table columns that contain textual data, specify the data type VARCHAR or TEXT. Specifying the data type CHAR is not recommended. In Cloudberry Database, the data types VARCHAR or TEXT handle padding added to the data (space characters added after the last non-space character) as significant characters, the data type CHAR does not.

Use the smallest numeric data type that will accommodate your numeric data and allow for future expansion. For example, using BIGINT for data that fits in INT or SMALLINT wastes storage space. If you expect that your data values will expand over time, consider that changing from a smaller datatype to a larger datatype after loading large amounts of data is costly. For example, if your current data values fit in a SMALLINT but it is likely that the values will expand, INT is the better long-term choice.

Use the same data types for columns that you plan to use in cross-table joins. Cross-table joins usually use the primary key in one table and a foreign key in the other table. When the data types are different, the database must convert one of them so that the data values can be compared correctly, which adds unnecessary overhead.

Set table and column constraints

You can define constraints on columns and tables to restrict the data in your tables. Cloudberry Database support for constraints is the same as PostgreSQL with some limitations, including:

  • CHECK constraints can refer only to the table on which they are defined.

  • UNIQUE and PRIMARY KEY constraints must be compatible with their table's distribution key and partitioning key, if any.

    info

    UNIQUE and PRIMARY KEY constraints are not allowed on append-optimized tables because the UNIQUE indexes that are created by the constraints are not allowed on append-optimized tables.

  • FOREIGN KEY constraints are allowed, but not enforced.

  • Constraints that you define on partitioned tables apply to the partitioned table as a whole. You cannot define constraints on the individual parts of the table.

Check constraints

Check constraints allow you to specify that the value in a certain column must satisfy a Boolean (truth-value) expression. For example, to require positive product prices:

=> CREATE TABLE products 
( product_no integer,
name text,
price numeric CHECK (price > 0) );

Not-null constraints

Not-null constraints specify that a column must not assume the null value. A not-null constraint is always written as a column constraint. For example:

=> CREATE TABLE products 
( product_no integer NOT NULL,
name text NOT NULL,
price numeric );

Unique constraints

Unique constraints ensure that the data contained in a column or a group of columns is unique with respect to all the rows in the table. The table must be hash-distributed or replicated (not DISTRIBUTED RANDOMLY). If the table is hash-distributed, the constraint columns must be the same as (or a superset of) the table's distribution key columns. For example:

=> CREATE TABLE products 
( product_no integer `UNIQUE`,
name text,
price numeric)
DISTRIBUTED BY (product_no);

Primary keys

A primary key constraint is a combination of a UNIQUE constraint and a NOT NULL constraint. The table must be hash-distributed (not DISTRIBUTED RANDOMLY), and the primary key columns must be the same as (or a superset of) the table's distribution key columns. If a table has a primary key, this column (or group of columns) is chosen as the distribution key for the table by default. For example:

=> CREATE TABLE products 
( `product_no` integer `PRIMARY KEY`,
name text,
price numeric)
` DISTRIBUTED BY (``product_no``)`;

Foreign keys

Foreign keys are not supported. You can declare them, but referential integrity is not enforced.

Foreign key constraints specify that the values in a column or a group of columns must match the values appearing in some row of another table to maintain referential integrity between two related tables. Referential integrity checks cannot be enforced between the distributed table segments of a Cloudberry database.

Exclusion constraints

Exclusion constraints ensure that if any two rows are compared on the specified columns or expressions using the specified operators, at least one of these operator comparisons will return false or null. The syntax is:

CREATE TABLE circles (
c circle,
EXCLUDE USING gist (c WITH &&)
) DISTRIBUTED REPLICATED;

Similar to unique constraints, an exclusion constraint is permitted only for replicated tables or when the distribution key columns are part of the constraint, with the same = operator as in the distribution key's hash operator class.

Exclusion constraints are not supported for partitioned tables.

See also CREATE TABLE ... CONSTRAINT ... EXCLUDE for details.

Adding an exclusion constraint automatically creates an index of the type specified in the constraint declaration.