Data Loading Overview
Cloudberry Database loads external data mainly by using loading tools to expose it as external tables (or foreign tables). The database then reads from these external tables, or writes to them, to move the data.
Loading process
The general process of loading external data into Cloudberry Database is as follows:
- Assess the data loading scenario (such as data source location, data type, and data volume) and select an appropriate loading tool.
- Set up and enable the loading tool.
- Create an external table, specifying information such as the protocol of the loading tool, the data source address, and the data format in the `CREATE EXTERNAL TABLE` statement.
- Once the external table is created, query its data directly using the `SELECT` statement, or import data from the external table using `INSERT INTO SELECT`.
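The last two steps above can be sketched as follows. This is a minimal illustration, not a definitive recipe: it assumes a `gpfdist` instance is already running, and the host `etl-host`, port `8081`, file path, and the `ext_sales` and `sales` tables are all hypothetical names chosen for the example.

```sql
-- Hypothetical example: assumes gpfdist is already serving
-- pipe-delimited text files from host etl-host on port 8081.
CREATE EXTERNAL TABLE ext_sales (
    sale_id   int,
    sale_date date,
    amount    numeric(10,2)
)
LOCATION ('gpfdist://etl-host:8081/sales/*.txt')
FORMAT 'TEXT' (DELIMITER '|');

-- Query the external data directly.
SELECT count(*) FROM ext_sales;

-- Or import it into a regular table
-- (sales is assumed to exist with matching columns).
INSERT INTO sales SELECT * FROM ext_sales;
```

Querying the external table first is a cheap way to check the format and delimiter settings before committing rows to the target table.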
Loading methods and scenarios
Cloudberry Database offers several data loading methods. Choose one according to where your data is stored, its format, and whether you need parallel loading.
Loading method | Data source | Data format | Parallel |
---|---|---|---|
`COPY` | Local file system • Coordinator node host (for a single file) • Segment node host (for multiple files) | • TXT • CSV • Binary | No |
`file://` protocol | Local file system (local segment host, accessible only by superuser) | • TXT • CSV | Yes |
`gpfdist` | Local host files or files accessible via internal network | • TXT • CSV • Any delimited text format supported by the `FORMAT` clause • XML and JSON (require conversion to text format via YAML configuration file) | Yes |
Batch loading using `gpload` (with `gpfdist` as the underlying worker) | Local host files or files accessible via internal network | • TXT • CSV • Any delimited text format supported by the `FORMAT` clause • XML and JSON (require conversion to text format via YAML configuration file) | Yes |
Creating external web tables | Data pulled from network services or from any source accessible by command lines | • TXT • CSV | Yes |
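To make the rows of the table more concrete, here is a rough sketch of three of the methods. The syntax follows the Greenplum-style statements that Cloudberry Database is assumed to share. All table names, host names, file paths, and the script `get_sales.sh` are hypothetical, and the target table `sales` is assumed to exist.

```sql
-- COPY: non-parallel load of a single file on the coordinator host.
COPY sales FROM '/data/sales.csv' WITH (FORMAT csv, HEADER);

-- file:// protocol: files are read from the named segment host
-- (superuser only).
CREATE EXTERNAL TABLE ext_sales_file (
    sale_id   int,
    sale_date date,
    amount    numeric(10,2)
)
LOCATION ('file://seg-host1/data/sales/*.csv')
FORMAT 'CSV' (HEADER);

-- External web table: rows come from the output of a command
-- that is run once on each segment host.
CREATE EXTERNAL WEB TABLE ext_sales_web (
    sale_id   int,
    sale_date date,
    amount    numeric(10,2)
)
EXECUTE '/usr/local/bin/get_sales.sh' ON HOST
FORMAT 'TEXT' (DELIMITER '|');
```

Once created, the two external tables can be queried or used in `INSERT INTO SELECT` in the same way as any other external table.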
Learn more
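- Load data from local files (4 items)
- Load data from web services: In Cloudberry Database, you can load data from web services, or from any data source accessible from the command line, by creating external web tables. The supported data formats are TEXT and CSV.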