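Apache Drill 1.0 Released
Big data is often pitched as a challenge to relational databases, yet in real production environments platforms such as Hadoop and Spark still spend most of each day serving SQL queries. As a result, SQL on Hadoop has become the most fiercely contested technology area.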
On May 19, the Apache Software Foundation announced the release of Drill 1.0, a schema-free SQL query engine for Hadoop, NoSQL (MongoDB and HBase) and cloud storage (Amazon S3, Google Cloud Storage, Azure Blob Storage, Swift).
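To make "schema-free" concrete: Drill can query a self-describing file in place, without a table definition being created first. The Python sketch below only builds the JSON body that Drill's REST endpoint (`POST /query.json`) accepts and sends nothing over the network. The file path `/tmp/logs.json` and the field names `name` and `status` are hypothetical, and `dfs` is assumed to be the name of the file system storage plugin.

```python
import json

def build_drill_query(path, columns, limit=10):
    """Return the JSON body for a Drill REST query against a raw file.

    No schema is declared anywhere: the columns are resolved by Drill
    when it reads the file.
    """
    # Backticks quote identifiers and file paths in Drill SQL.
    cols = ", ".join(f"t.`{c}`" for c in columns)
    sql = f"SELECT {cols} FROM dfs.`{path}` AS t LIMIT {limit}"
    return json.dumps({"queryType": "SQL", "query": sql})

# Hypothetical file and field names, for illustration only.
body = build_drill_query("/tmp/logs.json", ["name", "status"])
print(body)
```

The helper interpolates its arguments directly into the SQL text, so it is suitable only for trusted, hard-coded inputs like the ones shown.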
Tomer Shiran, a member of the project's PMC, said:
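This is the result of almost three years of development by dozens of engineers from many companies. Apache Drill's flexibility and ease of use have already attracted thousands of users, and the enterprise-grade reliability, security and performance of the 1.0 release will further accelerate adoption.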
The release announcement lists the following major improvements over 0.9:
- Substantial improvements in stability, memory handling and performance
- Improvements in Drill CLI experience with addition of convenience shortcuts and improved colors/alignment
- Substantial additions to documentation including coverage of troubleshooting, performance tuning and many additions to the SQL reference
- Enhancements in join planning to facilitate high speed planning of large and complicated joins
- Add support for new context functions including CURRENT_USER and CURRENT_SCHEMA
- Ability to treat all numbers as approximate decimals when reading JSON
- Enhancements in Drill's text and CSV handling to support first row skipping, configurable field/line delimiters and configurable quoting
- Improved JDBC compatibility (and tracing proxy for easy debugging).
- Ability to do JDBC connections with direct urls (avoiding ZooKeeper)
- Automatic selection of spooling or back-pressure exchange semantics to avoid distributed deadlocks in complex sort-heavy queries
- Improvements in query profile reporting
- Addition of ILIKE(VARCHAR, PATTERN) and SUBSTR(VARCHAR, REGEX) functions
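The item on direct JDBC URLs is easier to see with the two connection-string shapes side by side. The Python sketch below only formats the strings and does not open a connection. The host names are hypothetical, and the defaults used here (ZooKeeper root `drill`, cluster ID `drillbits1`, Drillbit user port 31010) are assumptions based on Drill's usual default configuration; a given cluster may use other values.

```python
def zk_url(zk_hosts, zk_root="drill", cluster_id="drillbits1"):
    """JDBC URL that discovers Drillbits through a ZooKeeper quorum."""
    quorum = ",".join(zk_hosts)
    return f"jdbc:drill:zk={quorum}/{zk_root}/{cluster_id}"

def direct_url(host, port=31010):
    """JDBC URL that connects straight to one Drillbit, bypassing ZooKeeper."""
    return f"jdbc:drill:drillbit={host}:{port}"

# Hypothetical host names, for illustration only.
print(zk_url(["zk1.example.com:2181", "zk2.example.com:2181"]))
print(direct_url("drill1.example.com"))
```

The direct form is convenient for debugging or single-node setups, since the client no longer needs to reach ZooKeeper, at the cost of pinning the connection to one named node.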
More details are available on the official website: http://drill.apache.org
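Drill is in practice led by MapR: the project leads and most of the core developers come from MapR. It is one of many SQL-on-Hadoop engines. The field includes:
- Hive, native to Hadoop
- Stinger, the Hive evolution project led by Hortonworks
- Impala, led by Cloudera
- Apache Drill, led by MapR
- Presto, from Facebook
- Greenplum, from Pivotal
- Apache Phoenix, originally developed by Salesforce
- Apache Tajo, from South Korea (modeled on Google Tenzing)
- Spark SQL, from the Spark community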
- Splice Machine
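Logically, apart from Spark SQL, which will gain some advantage from Spark's momentum, the projects most worth watching are Impala, Stinger and Drill, each backed by one of the three big Hadoop vendors. Impala and Drill were both inspired by Google Dremel. Impala seemed to have the stronger momentum earlier, but Drill now appears to be catching up.
With this official release, the Drill FAQ includes a dedicated comparison.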
More Drill material on CSDN: http://www.csdn.net/tag/drill
Discussion on Hacker News: https://news.ycombinator.com/item?id=9571780