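PostgreSQL columnar store extension: cstore_fdw
cstore_fdw implements a columnar store for PostgreSQL, aimed at analytics workloads where data is loaded in batches.
The extension uses the Optimized Row Columnar (ORC) format for its data layout. ORC improves upon the RCFile format developed at Facebook, and brings the following benefits: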
- Compression: Reduces in-memory and on-disk data size by 2-4x. Can be extended to support different codecs.
- Column projections: Only reads column data relevant to the query. Improves performance for I/O bound queries.
- Skip indexes: Stores min/max statistics for row groups, and uses them to skip over unrelated rows.
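A minimal sketch of how these three properties surface in SQL. The table, its columns, and the file path are hypothetical. It assumes the cstore_fdw extension is installed and that a foreign server named cstore_server already exists. The option names follow the cstore_fdw README as best recalled, and the row counts are believed to match the defaults, so verify them against the version you run.

```sql
-- Hypothetical table for illustration only.
CREATE FOREIGN TABLE page_views
(
    view_date  date,
    page_id    integer,
    user_agent text,
    bytes_sent bigint
)
SERVER cstore_server
OPTIONS (
    filename '/var/tmp/page_views.cstore',  -- hypothetical path
    compression 'pglz',                     -- compress column data
    stripe_row_count '150000',              -- rows per stripe
    block_row_count '10000'                 -- rows per row group carrying min/max stats
);

-- Only view_date and bytes_sent are read (column projection).
-- Row groups whose min/max range for view_date lies outside the
-- filter can be skipped without being decompressed (skip index).
SELECT sum(bytes_sent)
FROM page_views
WHERE view_date >= DATE '2014-01-01'
  AND view_date <  DATE '2014-02-01';
```

Skip indexes tend to help most when the data is loaded roughly sorted on the filtered column, because the min/max ranges of neighbouring row groups then overlap less.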
In addition, we used the PostgreSQL foreign data wrapper APIs and type representations, which brings:
- Support for 40+ Postgres data types. The user can also create new types and use them.
- Statistics collection. PostgreSQL's query optimizer uses these stats to evaluate different query plans and pick the best one.
- Simple setup. Create foreign table and copy data. Run SQL.
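The setup and statistics steps above might look roughly like the following. This is a sketch under stated assumptions: the measurements table, its columns, and both file paths are hypothetical, and cstore_fdw is assumed to be installed on the server and listed in shared_preload_libraries.

```sql
-- One-time setup per database.
CREATE EXTENSION cstore_fdw;
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

-- Hypothetical columnar table.
CREATE FOREIGN TABLE measurements
(
    sensor_id   integer,
    recorded_at timestamp,
    reading     double precision
)
SERVER cstore_server
OPTIONS (filename '/var/tmp/measurements.cstore');

-- Batch-load data from a server-side CSV file (hypothetical path).
COPY measurements FROM '/var/tmp/measurements.csv' WITH (FORMAT csv);

-- Collect statistics so the query optimizer can cost plans for this table.
ANALYZE measurements;

-- Run SQL as usual.
SELECT sensor_id, avg(reading) AS avg_reading
FROM measurements
GROUP BY sensor_id;
```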
Highlights
Key areas improved by this extension:
- Faster Analytics: Reduce analytics query disk and memory use by 10x
- Lower Storage: Compress data by 3x
- Easy Setup: Deploy as standard PostgreSQL extension
- Flexibility: Mix row- and column-based tables in the same DB
- Community: Benefit from PostgreSQL compatibility and open development
Learn more on our blog post.
Faster Analytics
cstore_fdw brings substantial performance benefits to analytics-heavy workloads:
- Column projections: only read columns relevant to the query
- Compressed data: higher data density reduces disk I/O
- Skip indexes: row group stats permit skipping irrelevant rows
- Stats collection: integrates with PostgreSQL’s own query optimizer
- PostgreSQL-native formats: no deserialization overhead at query time
Lower Storage
Cleanly implements full-table compression:
- Uses PostgreSQL’s own LZ family compression technique
- Only decompresses columns needed by the query
- Extensible to support different codecs
Easy Setup
If you know how to use PostgreSQL extensions, you know how to use cstore_fdw:
- Deploy as standard PostgreSQL extension
- Simply specify table type at creation time using FDW commands
- Copy data into your tables using the standard PostgreSQL COPY command
Flexibility
Have the best of all worlds… mix row- and column-based tables in the same DB:
CREATE FOREIGN TABLE cstore_table
  (num integer, name text)
SERVER cstore_server
OPTIONS (filename '/var/tmp/testing.cstore');

CREATE TABLE plain_table
  (num integer, name text);

COPY cstore_table FROM STDIN (FORMAT csv);
-- 1, foo
-- 2, bar
-- 3, baz
-- \.

COPY plain_table FROM STDIN (FORMAT csv);
-- 4, foo
-- 5, bar
-- 6, baz
-- \.

SELECT * FROM cstore_table c, plain_table p WHERE c.name=p.name;
--  num | name | num | name
-- -----+------+-----+------
--    1 |  foo |   4 |  foo
--    2 |  bar |   5 |  bar
--    3 |  baz |   6 |  baz
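As a further sketch of mixing the two table types, rows collected in the row-based table can be appended to the columnar table in one batch and then aggregated there. This reuses cstore_table and plain_table from the example above. It assumes a cstore_fdw version that accepts INSERT INTO ... SELECT on cstore tables. UPDATE, DELETE, and single-row INSERT are, as far as the project README indicates, not supported on them.

```sql
-- Append the contents of the row-based table to the columnar table
-- as one batch (assumes INSERT INTO ... SELECT is supported by the
-- installed cstore_fdw version).
INSERT INTO cstore_table
SELECT num, name
FROM plain_table;

-- Aggregate over the columnar table; only the name column is read.
SELECT name, count(*) AS row_count
FROM cstore_table
GROUP BY name
ORDER BY name;
```

Keeping frequently updated rows in a regular table and moving them to the columnar table in batches is one way to combine the two storage layouts.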
-