PostgreSQL columnar storage extension: cstore_fdw

Posted by jopen 10 years ago | 35K reads | PostgreSQL, database server

cstore_fdw implements columnar storage for PostgreSQL. It is intended for analytics over data that is loaded in batches.

The extension uses the Optimized Row Columnar (ORC) format for its data layout. ORC improves on the RCFile format developed at Facebook and brings the following benefits:

  • Compression: Reduces in-memory and on-disk data size by 2-4x. Can be extended to support different codecs.

  • Column projection: Only reads column data relevant to the query. Improves performance for I/O-bound queries.

  • Skip indexes: Stores min/max statistics for row groups, and uses them to skip over unrelated rows.

In addition, we used the PostgreSQL foreign data wrapper API and type representations, which brings:

  • Support for 40+ Postgres data types. The user can also create new types and use them.

  • Statistics collection. PostgreSQL's query optimizer uses these stats to evaluate different query plans and pick the best one.

  • Simple setup. Create a foreign table and copy data. Run SQL.

Highlights

Key areas improved by this extension:

  • Faster Analytics: Reduce analytics query disk and memory use by 10x
  • Lower Storage: Compress data by 3x
  • Easy Setup: Deploy as standard PostgreSQL extension
  • Flexibility: Mix row- and column-based tables in the same DB
  • Community: Benefit from PostgreSQL compatibility and open development

Learn more on our blog post.

Faster Analytics

cstore_fdw brings substantial performance benefits to analytics-heavy workloads:

  • Column projections: only read columns relevant to the query
  • Compressed data: higher data density reduces disk I/O
  • Skip indexes: row group stats permit skipping irrelevant rows
  • Stats collection: integrates with PostgreSQL's own query optimizer
  • PostgreSQL-native formats: no deserialization overhead at query time
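
As a rough illustration of column projection and skip indexes, consider the query below. The table page_views, its columns, and the file path are hypothetical, and the sketch assumes a server named cstore_server already exists. Only the three referenced columns need to be read, and row groups whose min/max view_date lies outside the requested range can be skipped. Skipping tends to help most when the data was loaded sorted on the filtered column.

```sql
-- Hypothetical columnar table; names and path are illustrative only.
CREATE FOREIGN TABLE page_views
  (view_date date, page_id integer, user_agent text, latency_ms integer)
SERVER cstore_server
OPTIONS (filename '/var/tmp/page_views.cstore');

-- Touches view_date, page_id and latency_ms; user_agent is never read.
-- The range predicate on view_date lets row groups outside March 2014
-- be skipped using their stored min/max statistics.
SELECT page_id, avg(latency_ms) AS avg_latency_ms
FROM page_views
WHERE view_date >= DATE '2014-03-01'
  AND view_date <  DATE '2014-04-01'
GROUP BY page_id;
```

Prefixing the SELECT with EXPLAIN shows the plan PostgreSQL chose for the foreign scan.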

[Figure: I/O Utilization. Disk I/O (MiB) for TPC-H queries 3, 5, 6, and 10 on 4GB of data using PostgreSQL 9.3 on m1.xlarge. Series: PostgreSQL, cstore, cstore (LZ).]

Lower Storage

Cleanly implements full-table compression:

  • Uses PostgreSQL's own LZ family compression technique
  • Only decompresses columns needed by the query
  • Extensible to support different codecs
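
A minimal sketch of enabling compression on a table, assuming the compression option described in the cstore_fdw README (valid values none and pglz, defaulting to none). Check the README for your installed version. The table name, columns, and file path are hypothetical, and cstore_server is assumed to exist already.

```sql
-- Hypothetical table; 'pglz' selects PostgreSQL's built-in LZ-family codec.
CREATE FOREIGN TABLE events_compressed
  (event_time timestamp, payload text)
SERVER cstore_server
OPTIONS (filename '/var/tmp/events_compressed.cstore',
         compression 'pglz');
```

Compression is chosen per table, so compressed and uncompressed columnar tables can coexist in the same database.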

Easy Setup

If you know how to use PostgreSQL extensions, you know how to use cstore_fdw:

  • Deploy as standard PostgreSQL extension
  • Simply specify table type at creation time using FDW commands
  • Copy data into your tables using standard PostgreSQL COPY command
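
The three steps above might look like the following sketch. It assumes the extension is already built and installed, and that cstore_fdw is listed in shared_preload_libraries in postgresql.conf, as the project README requires. The table sensor_readings, its columns, and both file paths are hypothetical.

```sql
-- One-time setup per database.
CREATE EXTENSION cstore_fdw;
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

-- The table type is selected by creating a foreign table on that server.
CREATE FOREIGN TABLE sensor_readings
  (reading_time timestamp, sensor_id integer, value double precision)
SERVER cstore_server
OPTIONS (filename '/var/tmp/sensor_readings.cstore');

-- Bulk-load from a server-side CSV file (hypothetical path).
COPY sensor_readings FROM '/var/tmp/sensor_readings.csv' (FORMAT csv);

-- Collect statistics so the query optimizer can cost plans for this table.
ANALYZE sensor_readings;
```

Loading from a server-side file requires the usual PostgreSQL privileges for COPY from a file. COPY ... FROM STDIN avoids that requirement.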

Flexibility

Have the best of all worlds… mix row- and column-based tables in the same DB:

-- Assumes cstore_fdw is installed and a server named cstore_server exists
-- (CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw).

-- Column-store table, backed by a file on the database server
CREATE FOREIGN TABLE cstore_table
  (num integer, name text)
SERVER cstore_server
OPTIONS (filename '/var/tmp/testing.cstore');

-- Ordinary row-store table
CREATE TABLE plain_table
  (num integer, name text);

COPY cstore_table FROM STDIN (FORMAT csv);
-- 1, foo
-- 2, bar
-- 3, baz
-- \.

COPY plain_table FROM STDIN (FORMAT csv);
-- 4, foo
-- 5, bar
-- 6, baz
-- \.

-- Join the columnar and row-based tables like any other pair of tables
SELECT * FROM cstore_table c, plain_table p WHERE c.name=p.name;
--  num | name | num | name
-- -----+------+-----+------
--    1 | foo  |   4 | foo
--    2 | bar  |   5 | bar
--    3 | baz  |   6 | baz

Project home page: http://www.baiduhome.net/lib/view/home/1396571870215

