Druid Analytics 0.8.0 released: an OLAP data query engine
Druid Analytics 0.8.0 has been released. The changes are as follows:
New features
- Reworked Druid metrics to use a more understandable metrics schema
- Support compression for multi-value columns
- Added longMax/longMin aggregators alongside the previous double-typed min/max aggregators, which have been renamed to doubleMax/doubleMin
- Added a hadoop_convert_segment task for the indexer, allowing large-scale batch re-compression of old data as an indexer task
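As a rough illustration of the renamed aggregators, the sketch below builds a timeseries query body that uses both the long and the double variants. The aggregator type names come from the release notes; the data source name, column names, interval and the helper `min_max_aggregators` are hypothetical and only serve the example.

```python
import json


def min_max_aggregators(column, numeric_type):
    """Build min and max aggregator specs for one metric column.

    numeric_type must be "long" or "double", which yields the
    longMin/longMax or doubleMin/doubleMax aggregator types.
    """
    if numeric_type not in ("long", "double"):
        raise ValueError("numeric_type must be 'long' or 'double'")
    return [
        {
            "type": f"{numeric_type}{suffix}",       # e.g. "longMax"
            "name": f"{column}_{suffix.lower()}",    # output column name
            "fieldName": column,                     # input metric column
        }
        for suffix in ("Min", "Max")
    ]


# Hypothetical data source and columns, for illustration only.
query = {
    "queryType": "timeseries",
    "dataSource": "events",
    "granularity": "day",
    "intervals": ["2015-07-01/2015-07-08"],
    "aggregations": (
        min_max_aggregators("latency_ms", "long")
        + min_max_aggregators("price", "double")
    ),
}

print(json.dumps(query, indent=2))
```

Queries written against earlier releases with the old double-typed min/max aggregator names would presumably need their `type` fields updated to the new names.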
Improvements
- Index task now ignores invalid rows (#1264)
- Improved segment filtering for dataSourceMetadataQuery (#1299)
- Numerous additional unit tests
Bug fixes
- Fixed deprecation warnings in Hadoop batch indexing (#1275). Thanks @infynyxx!
- Fix groupBys applying limitSpecs to historicals on post aggregations (#1292). Thanks @guobingkun!
- Fix incorrectly typed values in metadata SQL queries (#1295). Thanks @anubhgup!
- Fix timeBoundary cache serde problems (#1303)
- Fix serde issue with pulling timestamps from cache (#1304)
- Fixed concatenated gzip files with static S3 firehose (#1311)
- Fix audit table config serde problems (#1322)
- Fix IRC firehose serde (#1331)
- Fix arithmetic exceptions on the broker (#1336)
- Fix an error where the Convert Segment Task would leave zombie tasks if the task failed (#1363)
- Fixed #1365 to return the actual complex metric name in the segment metadata query response
- Fix groupBy caching to work with renamed aggregators (#1499)
Documentation
- Numerous typo fixes. Thanks to @textractor, @rasahner, and @bobrik.
Download: https://github.com/druid-io/druid/archive/druid-0.8.0.zip
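Druid is an engine for real-time exploratory queries over large data sets. It provides an open-source analytics data store designed for OLAP, and it is intended to stay 100% available through code deployments, machine failures, and the other contingencies a production system runs into. It can also serve back-office use cases, but its design decisions explicitly target always-on online services.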
[Figure: data flow]
[Figure: cluster architecture]
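Key features:
- Designed for analytics: Druid is built for exploratory analysis in OLAP workflows. It supports a variety of filters, aggregators, and query types, and provides a framework for plugging in new functionality. Users have built on Druid's infrastructure to develop features such as top-K queries and histograms.
- Interactive queries: Druid's low-latency ingestion architecture lets events be queried within milliseconds of being created. Query latency is kept low by reading and scanning only what is needed, so you can aggregate and filter without sitting around waiting for results.
- Highly available: Druid is used to back SaaS implementations that need to be up all the time. Your data remains available and queryable during system updates, and scaling up or down does not cause data loss.
- Scalable: existing Druid deployments handle billions of events and terabytes of data per day. Druid is designed to scale to petabytes.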
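To make the filter, aggregator, and query-type concepts above a little more concrete, here is a minimal sketch that assembles a topN query body with a selector filter and a longSum aggregator. The helper `top_n_query`, the data source, and all dimension and metric names are hypothetical; only the general JSON shape follows Druid's native query format, and it should be checked against the documentation for the version in use.

```python
import json


def top_n_query(data_source, dimension, metric, source_column,
                threshold, interval, filter_spec=None):
    """Assemble a topN query body as a plain dict.

    Ranks values of `dimension` by the summed `source_column`,
    exposed under the output name `metric`, keeping `threshold` rows.
    """
    query = {
        "queryType": "topN",
        "dataSource": data_source,
        "dimension": dimension,
        "metric": metric,          # which aggregation to rank by
        "threshold": threshold,    # how many top values to return
        "granularity": "all",
        "intervals": [interval],
        "aggregations": [
            {"type": "longSum", "name": metric, "fieldName": source_column}
        ],
    }
    if filter_spec is not None:
        query["filter"] = filter_spec
    return query


# Hypothetical names throughout: top 5 pages by edit count for one country.
query = top_n_query(
    data_source="wikipedia_edits",
    dimension="page",
    metric="edits",
    source_column="count",
    threshold=5,
    interval="2015-07-01/2015-07-02",
    filter_spec={"type": "selector", "dimension": "country", "value": "US"},
)

print(json.dumps(query, indent=2))
```

Building the body as a dict keeps the filter optional and makes it easy to swap in a different aggregator type without touching the rest of the query.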