CREATE TABLE
Contents
CREATE TABLE#
The CREATE TABLE
statement is used to create a table. The table name must be unique in one database.
Syntax#
CreateTableStmt ::=
'CREATE' 'TABLE' IfNotExists TableName (
TableElementList CreateTableSelectOpt | LikeTableWithOrWithoutParen ) OnCommitOpt
IfNotExists ::=
('IF' 'NOT' 'EXISTS')?
TableName ::=
Identifier ('.' Identifier)?
TableElementList ::=
TableElement ( ',' TableElement )*
TableElement ::=
ColumnDef | ColumnIndex
The TableElementList
needs to be defined in the CREATE TABLE
statement. TableElementList
consists of ColumnDef
(column definition) and ColumnIndex
. OpenMLDB requires at least one ColumnDef
in the TableElementList
.
Or use Hive tables and Parquet files to create new tables.
CreateTableStmt ::=
'CREATE' 'TABLE' TableName LIKE LikeType PATH
TableName ::=
Identifier ('.' Identifier)?
LikeType ::=
'HIVE' | 'PARQUET'
PATH ::=
string_literal
Here is the known issues of creating tables with Hive.
May get timeout for the default CLI config and need to show tables to check result.
The column constraints of Hive tables such as
NOT NULL
will not copy to new tables.
ColumnDef (required)#
ColumnDef ::=
ColumnName ( ColumnType ) [ColumnOptionList]
ColumnName ::=
Identifier ( '.' Identifier ( '.' Identifier )? )?
ColumnType ::=
'INT' | 'INT32'
|'SMALLINT' | 'INT16'
|'BIGINT' | 'INT64'
|'FLOAT'
|'DOUBLE'
|'TIMESTAMP'
|'DATE'
|'BOOL'
|'STRING' | 'VARCHAR'
| 'MAP' '<' ColumnType ',' ColumnType '>'
ColumnOptionList ::=
ColumnOption*
ColumnOption ::=
['DEFAULT' DefaultValueExpr ] ['NOT' 'NULL']
DefaultValueExpr ::=
int_literal | float_literal | double_literal | string_literal
A table contains one or more columns. The column description ColumnDef
for each column describes the column name, column type, and options.
ColumnName
: The name of the column in the table. Column names within the same table must be unique.ColumnType
: The data type of the column. To learn about the data types supported by OpenMLDB, please refer to Data Types.ColumnOptionList
:NOT NULL
: The column does not allow null values.DEFAULT
: The default value of this column. It is recommended to configure the default value ifNOT NULL
is configured. In this case, when inserting data, if the value of the column is not defined, the default value will be inserted. If theNOT NULL
attribute is configured but theDEFAULT
value is not configured, OpenMLDB will throw an error when the change column value is not defined in the INSERT statement.
Example#
Example 1: Create a Table
The following SQL commands set the current database to db1
and create a table t1
in the current database, including the column named col0
. The data type of col0
is STRING
.
CREATE DATABASE db1;
-- SUCCEED
USE db1;
-- SUCCEED: Database changed
CREATE TABLE t1(col0 STRING);
-- SUCCEED
The following SQL command shows how to create a table in a database which is not the database currently used.
CREATE TABLE db1.t2 (col0 STRING, col1 int);
-- SUCCEED
Switch to database db1
to see the details of the table just created.
USE db1;
-- SUCCEED: Database changed
desc t2;
--- ------- --------- ------ ---------
# Field Type Null Default
--- ------- --------- ------ ---------
1 col0 Varchar YES
2 col1 Int YES
--- ------- --------- ------ ---------
--- -------------------- ------ ---- ------ ---------------
# name keys ts ttl ttl_type
--- -------------------- ------ ---- ------ ---------------
1 INDEX_0_1639524201 col0 - 0min kAbsoluteTime
--- -------------------- ------ ---- ------ ---------------
--------------
storage_mode
--------------
Memory
--------------
Example 2: Create a Duplicate Table
The following SQL command creates a table, whose name is the same as an existing table of this database.
CREATE TABLE t1 (col0 STRING NOT NULL, col1 int);
-- SUCCEED
CREATE TABLE t1 (col0 STRING NOT NULL, col1 int);
-- Error: table already exists
CREATE TABLE t1 (col0 STRING NOT NULL, col1 string);
-- Error: table already exists
Example 3: Create a Table with NOT NULL on Certain Columns
USE db1;
-- SUCCEED: Database changed
CREATE TABLE t3 (col0 STRING NOT NULL, col1 int);
-- SUCCEED
desc t3;
--- ------- --------- ------ ---------
# Field Type Null Default
--- ------- --------- ------ ---------
1 col0 Varchar NO
2 col1 Int YES
--- ------- --------- ------ ---------
--- -------------------- ------ ---- ------ ---------------
# name keys ts ttl ttl_type
--- -------------------- ------ ---- ------ ---------------
1 INDEX_0_1657327434 col0 - 0min kAbsoluteTime
--- -------------------- ------ ---- ------ ---------------
--------------
storage_mode
--------------
Memory
--------------
Example 4: Create a Table with Default Value
USE db1;
-- SUCCEED: Database changed
CREATE TABLE t3 (col0 STRING NOT NULL, col1 int);
-- SUCCEED
desc t3;
--- ------- --------- ------ ---------
# Field Type Null Default
--- ------- --------- ------ ---------
1 col0 Varchar NO
2 col1 Int YES
--- ------- --------- ------ ---------
--- -------------------- ------ ---- ------ ---------------
# name keys ts ttl ttl_type
--- -------------------- ------ ---- ------ ---------------
1 INDEX_0_1657327434 col0 - 0min kAbsoluteTime
--- -------------------- ------ ---- ------ ---------------
--------------
storage_mode
--------------
Memory
--------------
Example 5: Create a Table from the Hive table
At first configure OpenMLDB to support Hive, then create table with the following SQL.
CREATE TABLE t1 LIKE HIVE 'hive://hive_db.t1';
-- SUCCEED
Example 6: Create a Table from the Parquet files
CREATE TABLE t1 LIKE PARQUET 'file://t1.parquet';
-- SUCCEED
ColumnIndex (optional)#
ColumnIndex ::=
'INDEX' <OptionalIndexName> '(' IndexOptionList ')'
IndexOptionList ::=
IndexOption ( ',' IndexOption )*
IndexOption ::=
IndexOptionName '=' expr
Indexes can be used by database search engines to speed up data retrieval. Simply put, an index is a pointer to the data in a table. Configuring a column index generally requires configuring the index key (KEY
), index time column (TS
), TTL
and TTL_TYPE
.
The index key must be configured, and other configuration items are optional. The following table introduces these configuration items in detail.
Configuration Item |
Note |
Expression |
Example |
---|---|---|---|
|
It defines the index column (required). OpenMLDB supports single-column indexes as well as joint indexes. When |
Single-column index: |
Single-column index: |
|
It defines the index time column (optional). Data on the same index will be sorted by the index time column. When |
|
|
|
It defines the elimination rules (optional). Including four types. When |
Supported expr: |
For specific usage, please refer to Configuration Rules for TTL and TTL_TYP below. |
|
It defines the maximum survival time/number. Different TTL_TYPEs determines different |
Supported expr: |
For specific usage, please refer to “Configuration Rules for TTL and TTL_TYPE” below. |
Configuration details of TTL and TTL_TYPE:
TTL_TYPE |
TTL |
Note |
Example |
---|---|---|---|
|
The value of TTL represents the expiration time. The configuration value is a time period such as |
When a record expires, it is eliminated. |
|
|
The value of TTL represents the maximum number of surviving entries. That is, under the same index, the maximum number of data items allowed exists. Up to 1000 can be configured |
When the record exceeds the maximum number, it will be eliminated. |
|
|
It defines the expiration time and the maximum number of live records. The configuration value is a 2-tuple of the form |
Records will be eliminated if either the time expires or the number of records exceeds the maximum limit. |
|
|
It defines the expiration time and the maximum number of live records. The configuration value is a 2-tuple of the form |
Records will only be eliminated when both the time expires and the number of records exceeds the maximum limit. |
|
Index-Organized Table(IOT)#
When creating a Covering index using KEY in OpenMLDB, the index stores the complete data row, which results in higher memory usage. If lower memory consumption is desired at the cost of some performance degradation, an IOT table can be used. In an IOT table, three types of indexes can be created:
CKEY
:Clustered index, storing the complete data row. The configured CKEY and TS are used to uniquely identify a row of data. If a duplicate primary key is inserted, the data will be updated (this triggers the deletion of old data in all indexes, followed by the insertion of new data, which may incur performance overhead). It is also possible to use only CKEY without configuring TS, where CKEY uniquely identifies a row of data. Queries using this index suffer no performance loss.SKEY
:Secondary index, storing the primary key. If TS is not configured, data under the same SKEY is sorted by insertion time. During query execution, the corresponding primary key values are first located in the Secondary index, and then the data is retrieved based on the primary key, resulting in some performance loss.KEY
:Covering index, storing the complete data row. If TS is not configured, data under the same KEY is sorted by insertion time. Queries using this index suffer no performance loss.
When creating an IOT table, the first index must be the only Clustered index, while other indexes are optional. Changing the order of the Clustered index is currently not supported.
CREATE TABLE iot (c1 int64, c2 int64, c3 int64, INDEX(ckey=c1, ts=c2)); -- Clustered index
CREATE TABLE iot (c1 int64, c2 int64, c3 int64, INDEX(ckey=c1), INDEX(skey=c2)); -- Clustered index and Secondary index
CREATE TABLE iot (c1 int64, c2 int64, c3 int64, INDEX(ckey=c1), INDEX(skey=c2), INDEX(key=c3)); -- Clustered index、Secondary index and Covering index
The TTL behavior of indexes in an IOT table differs from that of a regular table. When data is evicted by TTL from the IOT Clustered index, it triggers deletion operations in all other indexes. In contrast, TTL eviction from a Secondary index or a Covering index only removes data within the respective index itself and does not trigger deletions in other indexes.
Generally, unless it is necessary to make Secondary and Covering indexes more memory-efficient, you can configure TTL only for the Clustered index and leave the TTL unset for Secondary and Covering indexes.
Notes#
An IOT table does not support concurrent writes of multiple rows with the same primary key, as this may lead to conflicts causing at least one write to fail. Pre-existing rows with the same primary key in the IOT table will be overwritten without requiring additional handling. To avoid the need for import repairs, perform data cleansing before import by deduplicating records with the same primary key in the source data. (Overwriting data triggers deletions in all indexes, resulting in very low efficiency even for single-threaded writes. Therefore, single-threaded import is not recommended.)
Example#
Example 1
The following sql example creates a table with a single-column index.
USE db1;
--SUCCEED: Database changed
CREATE TABLE t1 (col0 STRING, col1 int, std_time TIMESTAMP, INDEX(KEY=col1));
-- SUCCEED
desc t1;
--- ---------- ----------- ------ ---------
# Field Type Null Default
--- ---------- ----------- ------ ---------
1 col0 Varchar YES
2 col1 Int YES
3 std_time Timestamp YES
--- ---------- ----------- ------ ---------
--- -------------------- ------ ---- ------ ---------------
# name keys ts ttl ttl_type
--- -------------------- ------ ---- ------ ---------------
1 INDEX_0_1639524520 col1 - 0min kAbsoluteTime
--- -------------------- ------ ---- ------ ---------------
Example 2
The following sql example creates a table with a joint index.
USE db1;
--SUCCEED: Database changed
CREATE TABLE t1 (col0 STRING, col1 int, std_time TIMESTAMP, INDEX(KEY=(col0, col1)));
-- SUCCEED
desc t1;
--- ---------- ----------- ------ ---------
# Field Type Null Default
--- ---------- ----------- ------ ---------
1 col0 Varchar YES
2 col1 Int YES
3 std_time Timestamp YES
--- ---------- ----------- ------ ---------
--- -------------------- ----------- ---- ------ ---------------
# name keys ts ttl ttl_type
--- -------------------- ----------- ---- ------ ---------------
1 INDEX_0_1639524576 col0|col1 - 0min kAbsoluteTime
--- -------------------- ----------- ---- ------ ---------------
Example 3
The following sql example creates a table with a single-column index configuring the time column.
USE db1;
--SUCCEED: Database changed
CREATE TABLE t1 (col0 STRING, col1 int, std_time TIMESTAMP, INDEX(KEY=col1, TS=std_time));
-- SUCCEED
desc t1;
--- ---------- ----------- ------ ---------
# Field Type Null Default
--- ---------- ----------- ------ ---------
1 col0 Varchar YES
2 col1 Int YES
3 std_time Timestamp YES
--- ---------- ----------- ------ ---------
--- -------------------- ------ ---------- ------ ---------------
# name keys ts ttl ttl_type
--- -------------------- ------ ---------- ------ ---------------
1 INDEX_0_1639524645 col0 std_time 0min kAbsoluteTime
--- -------------------- ------ ---------- ------ ---------------
Example 4
The following sql example creates a table with a single-column index configuring the time column, TTL_TYPE and TTL.
USE db1;
--SUCCEED: Database changed
CREATE TABLE t1 (col0 STRING, col1 int, std_time TIMESTAMP, INDEX(KEY=col1, TS=std_time, TTL_TYPE=absolute, TTL=30d));
-- SUCCEED
desc t1;
--- ---------- ----------- ------ ---------
# Field Type Null Default
--- ---------- ----------- ------ ---------
1 col0 Varchar YES
2 col1 Int YES
3 std_time Timestamp YES
--- ---------- ----------- ------ ---------
--- -------------------- ------ ---------- ---------- ---------------
# name keys ts ttl ttl_type
--- -------------------- ------ ---------- ---------- ---------------
1 INDEX_0_1639524729 col1 std_time 43200min kAbsoluteTime
--- -------------------- ------ ---------- ---------- ---------------
Example 5
The following sql commands create a table with a single-column index and set TTL_TYPE=LATEST.
USE db1;
--SUCCEED: Database changed
CREATE TABLE t1 (col0 STRING, col1 int, std_time TIMESTAMP, INDEX(KEY=col1, TS=std_time, TTL_TYPE=latest, TTL=1));
-- SUCCEED
desc t1;
--- ---------- ----------- ------ ---------
# Field Type Null Default
--- ---------- ----------- ------ ---------
1 col0 Varchar YES
2 col1 Int YES
3 std_time Timestamp YES
--- ---------- ----------- ------ ---------
--- -------------------- ------ ---------- ----- -------------
# name keys ts ttl ttl_type
--- -------------------- ------ ---------- ----- -------------
1 INDEX_0_1639524802 col1 std_time 1 kLatestTime
--- -------------------- ------ ---------- ----- -------------
Example 6
The following sql commands create a table with a single-column index, set TTL_TYPE=absandlat and configure the maximum number of retained records as 10.
USE db1;
--SUCCEED: Database changed
CREATE TABLE t1 (col0 STRING, col1 int, std_time TIMESTAMP, INDEX(KEY=col1, TS=std_time, TTL_TYPE=absandlat, TTL=(30d,10)));
-- SUCCEED
desc t1;
--- ---------- ----------- ------ ---------
# Field Type Null Default
--- ---------- ----------- ------ ---------
1 col0 Varchar YES
2 col1 Int YES
3 std_time Timestamp YES
--- ---------- ----------- ------ ---------
--- -------------------- ------ ---------- -------------- ------------
# name keys ts ttl ttl_type
--- -------------------- ------ ---------- -------------- ------------
1 INDEX_0_1639525038 col1 std_time 43200min&&10 kAbsAndLat
--- -------------------- ------ ---------- -------------- ------------
Example 7
The following sql commands create a table with a single-column index, set TTL_TYPE=absorlat and configure the maximum number of retained records as 10.
USE db1;
--SUCCEED: Database changed
CREATE TABLE t1 (col0 STRING, col1 int, std_time TIMESTAMP, INDEX(KEY=col1, TS=std_time, TTL_TYPE=absorlat, TTL=(30d,10)));
--SUCCEED
desc t1;
--- ---------- ----------- ------ ---------
# Field Type Null Default
--- ---------- ----------- ------ ---------
1 col0 Varchar YES
2 col1 Int YES
3 std_time Timestamp YES
--- ---------- ----------- ------ ---------
--- -------------------- ------ ---------- -------------- -----------
# name keys ts ttl ttl_type
--- -------------------- ------ ---------- -------------- -----------
1 INDEX_0_1639525079 col1 std_time 43200min||10 kAbsOrLat
--- -------------------- ------ ---------- -------------- -----------
Example 8
The following sql commands create a multi-index table.
USE db1;
--SUCCEED: Database changed
CREATE TABLE t1 (col0 STRING, col1 int, std_time TIMESTAMP, INDEX(KEY=col0, TS=std_time), INDEX(KEY=col1, TS=std_time));
--SUCCEED
desc t1;
--- ---------- ----------- ------ ---------
# Field Type Null Default
--- ---------- ----------- ------ ---------
1 col0 Varchar YES
2 col1 Int YES
3 std_time Timestamp YES
--- ---------- ----------- ------ ---------
--- -------------------- ------ ---------- ------ ---------------
# name keys ts ttl ttl_type
--- -------------------- ------ ---------- ------ ---------------
1 INDEX_0_1648692457 col0 std_time 0min kAbsoluteTime
2 INDEX_1_1648692457 col1 std_time 0min kAbsoluteTime
--- -------------------- ------ ---------- ------ ---------------
Table Property TableOptions (optional)#
TableOptions
::= 'OPTIONS' '(' TableOptionItem (',' TableOptionItem)* ')'
TableOptionItem
::= PartitionNumOption
| ReplicaNumOption
| DistributeOption
| StorageModeOption
PartitionNumOption
::= 'PARTITIONNUM' '=' int_literal
ReplicaNumOption
::= 'REPLICANUM' '=' int_literal
DistributeOption
::= 'DISTRIBUTION' '=' DistributionList
DistributionList
::= DistributionItem (',' DistributionItem)*
DistributionItem
::= '(' LeaderEndpoint ',' FollowerEndpointList ')'
LeaderEndpoint
::= Endpoint
FollowerEndpointList
::= '[' Endpoint (',' Endpoint)* ']'
Endpoint
::= string_literals
StorageModeOption
::= 'STORAGE_MODE' '=' StorageMode
StorageMode
::= 'Memory'
| 'HDD'
| 'SSD'
CompressTypeOption
::= 'COMPRESS_TYPE' '=' CompressType
CompressType
::= 'NoCompress'
| 'Snappy
Configuration Item |
Note |
Example |
---|---|---|
|
It defines the number of partitions for the table. OpenMLDB divides the table into different partition blocks for storage. A partition is the basic unit of storage, replica, and fail-over related operations in OpenMLDB. When not explicitly configured, |
|
|
It defines the number of replicas for the table. Note that the number of replicas is only configurable in Cluster version. |
|
|
It defines the distributed node endpoint configuration. Generally, it contains a Leader node and several followers. |
|
|
It defines the storage mode of the table. The supported modes are |
|
|
It defines the compress types of the table. The supported compress type are |
|
The Difference between Disk Table and Memory Table#
If the value of
STORAGE_MODE
isHDD
orSSD
, the table is a disk table. IfSTORAGE_MODE
isMemory
, the table is a memory table.Currently, disk tables do not support GC operations
When inserting data into a disk table, if (
key
,ts
) are the same under the same index, the old data will be overwritten; a new piece of data will be inserted into the memory table.Disk tables do not support
addindex
ordeleteindex
operations, so you need to define all required indexes when creating a disk table. Thedeploy
command will automatically add the required indexes, so for a disk table, if the corresponding index is missing when it is created,deploy
will fail.
Example#
The following sql commands create a table and configure the number of partitions as 8, the number of replicas as 3, and the storage_mode as HDD.
USE db1;
--SUCCEED: Database changed
CREATE TABLE t1 (col0 STRING, col1 int, std_time TIMESTAMP, INDEX(KEY=col1, TS=std_time)) OPTIONS(partitionnum=8, replicanum=3, storage_mode='HDD');
--SUCCEED
DESC t1;
--- ---------- ----------- ------ ----------
# Field Type Null Default
--- ---------- ----------- ------ ---------
1 col0 Varchar YES
2 col1 Int YES
3 std_time Timestamp YES
--- ---------- ----------- ------ ---------
--- -------------------- ------ ---------- ------ ---------------
# name keys ts ttl ttl_type
--- -------------------- ------ ---------- ------ ---------------
1 INDEX_0_1651143735 col1 std_time 0min kAbsoluteTime
--- -------------------- ------ ---------- ------ ---------------
--------------- --------------
compress_type storage_mode
--------------- --------------
NoCompress HDD
--------------- --------------
The following sql command create a table with specified distribution.
create table t1 (col0 string, col1 int) options (DISTRIBUTION=[('127.0.0.1:30921', ['127.0.0.1:30922', '127.0.0.1:30923']), ('127.0.0.1:30922', ['127.0.0.1:30921', '127.0.0.1:30923'])]);
--SUCCEED