Code Structure

HybridSE SQL Engine

hybridse/
├── examples          // demo db and HybridSE integration tests
├── include           // header files, laid out like src
├── src
│   ├── base          // basic libraries
│   ├── benchmark
│   ├── case          // test cases
│   ├── cmd           // packaged demo
│   ├── codec         // encoding and decoding
│   ├── codegen       // LLVM code generation
│   ├── llvm_ext      // LLVM character parsing
│   ├── node          // definitions of logical plan, physical plan, expression, and type nodes
│   ├── passes        // SQL optimizer
│   ├── plan          // logical execution plan generation
│   ├── planv2        // transformation from the ZetaSQL syntax tree to nodes
│   ├── proto         // protobuf definitions
│   ├── sdk
│   ├── testing
│   ├── udf           // registration and generation of UDFs and UDAFs
│   └── vm            // SQL physical plan and execution plan generation; entry points for SQL compilation and execution
└── tools             // benchmark-related code
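
A listing like the one above can be regenerated from a local checkout when directories are added or renamed. The following is a minimal sketch, not a tool shipped with the project: the function name `render_tree` and the `max_depth` parameter are assumptions chosen for illustration, and the demo builds a throwaway directory rather than reading a real repository.

```python
import os
import tempfile


def render_tree(root, max_depth=2):
    """Return lines showing the subdirectories of `root`, down to `max_depth` levels."""
    lines = [os.path.basename(os.path.normpath(root)) + "/"]

    def walk(path, prefix, depth):
        if depth > max_depth:
            return
        # Keep directories only, sorted by name; skip hidden ones such as .git.
        names = sorted(
            n for n in os.listdir(path)
            if os.path.isdir(os.path.join(path, n)) and not n.startswith(".")
        )
        for i, name in enumerate(names):
            last = i == len(names) - 1
            lines.append(prefix + ("└── " if last else "├── ") + name)
            # Children of the last entry get blank padding instead of a vertical bar.
            walk(os.path.join(path, name), prefix + ("    " if last else "│   "), depth + 1)

    walk(root, "", 1)
    return lines


if __name__ == "__main__":
    # Demo on a temporary directory that mimics a small part of the layout (hypothetical subset).
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "hybridse")
        for sub in ("examples", "include", "src/base", "src/vm", "tools"):
            os.makedirs(os.path.join(root, *sub.split("/")))
        print("\n".join(render_tree(root)))
```

Pointing `render_tree` at the `hybridse`, `src`, `java`, or `python` directory of a checkout yields the bare tree; the `//` descriptions still have to be maintained by hand.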

Online Storage Engine and External Service Interface

src/
├── apiserver
├── base           // basic libraries
├── catalog
├── client         // definition and implementation of the ns/tablet/taskmanager client interfaces
├── cmd            // CLI and OpenMLDB binary generation
├── codec          // encoding and decoding
├── datacollector  // online -> offline sync tool
├── log            // formats, reading, and writing of binlog and snapshot
├── nameserver
├── proto          // protobuf definitions
├── replica        // synchronization between leader and followers
├── rpc            // wrappers for brpc requests
├── schema         // parsing and generation of schemas and indexes
├── sdk
├── storage        // storage engine
├── tablet         // implementation of the tablet interface
├── test
├── tools          // wrappers for small utilities
└── zk             // wrapper for the ZooKeeper client

Java Modules

java/
├── hybridse-native          // code generated automatically by SWIG for the SQL engine
├── hybridse-proto           // proto of the SQL engine
├── hybridse-sdk             // packaged SDK of the SQL engine
├── openmldb-batch           // offline planner that translates SQL logic into a Spark execution plan
├── openmldb-batchjob        // code for executing offline tasks
├── openmldb-common          // shared code and basic libraries of the Java SDK
├── openmldb-import          // data import tools
├── openmldb-jdbc            // Java SDK
├── openmldb-jmh             // performance and stability testing
├── openmldb-native          // code generated automatically by SWIG
├── openmldb-spark-connector // Spark connector implementation for reading from and writing to OpenMLDB
├── openmldb-synctool        // online -> offline sync tool
└── openmldb-taskmanager     // offline task management module

Python SDK

python/
├── openmldb
│   ├── dbapi                // DB-API interface
│   ├── native               // code generated automatically by SWIG
│   ├── sdk                  // calls the underlying C++ interface
│   ├── sqlalchemy_openmldb  // SQLAlchemy interface
│   ├── sql_magic            // notebook magic
│   └── test

Offline Execution Engine

https://github.com/4paradigm/spark