Open-Source BFS: The Baidu File System
The Baidu File System
The Baidu File System (BFS) is a distributed file system designed to support real-time applications. Like many other distributed file systems, BFS is highly fault-tolerant. Unlike most of them, however, BFS provides low read/write latency while maintaining high throughput. Together with Galaxy and Tera, BFS supports many real-time products at Baidu, including the Baidu webpage database, the Baidu incremental indexing system, and the Baidu user behavior analysis system.
Features
- Continuous availability
  - The Nameserver is implemented as a Raft group, so there is no single point of failure.
- High throughput
  - A high-performance data engine maximizes I/O utilization.
- Low latency
  - Global load balancing and slow-node detection.
- Linear scalability
  - Supports multi-data-center deployment and up to 10,000 data nodes.
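As a rough illustration of why a Raft group removes the single point of failure: Raft needs a majority of its members to agree, so a group stays available as long as more than half of its members are up. The sketch below only does that arithmetic. The function names are made up for this example, and the group sizes are illustrative, not a statement about how BFS is actually deployed.

```shell
#!/bin/sh
# Illustrative only: majority arithmetic for a Raft group.
# Function names are hypothetical and not part of BFS.

# Majority quorum size for a Raft group of $1 voting members.
raft_quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority is still reachable.
raft_tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

# Print the figures for a few common (assumed) group sizes.
for n in 3 5 7; do
    echo "members=$n quorum=$(raft_quorum "$n") tolerated_failures=$(raft_tolerated_failures "$n")"
done
```

For example, a three-member group keeps working with one member down, and a five-member group with two.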
Architecture
Quick Start
Build
./build.sh
Standalone BFS
cd sandbox
./deploy.sh
./start_bfs.sh
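The build and sandbox steps above can be chained so that a failure in one step stops the rest. This is a minimal sketch, assuming it is sourced from the repository root. The script names come from the steps above, while the `run` and `bfs_quickstart` helpers and the `DRY_RUN` variable are hypothetical conveniences, not part of BFS.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with BFS.
# Runs a command, or only prints it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build BFS, then deploy and start the standalone sandbox.
# Stops at the first step that fails.
bfs_quickstart() {
    run ./build.sh || return 1
    (
        # Subshell so the caller's working directory is left unchanged.
        run cd sandbox || exit 1
        run ./deploy.sh || exit 1
        run ./start_bfs.sh
    )
}
```

Calling `bfs_quickstart` runs the steps for real. `DRY_RUN=1 bfs_quickstart` only prints the commands, which is useful for checking the order before touching anything.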
How to Contribute
- Please read the RoadMap or source code.
- Find something you are interested in and start working on it.
- Test your code by running make test and make check.
- Make a pull request.
- Once your code has passed code review and been merged, it will run on thousands of servers :)
Contact us
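Baidu File System
Baidu's core businesses and database systems rely on a distributed file system as their underlying storage, so the availability and performance of the file system are critical to the stability and quality of the search services built on top of it. Existing distributed file systems (such as HDFS) were designed for offline batch processing and cannot deliver low latency and continuous availability while sustaining high throughput. We therefore designed the Baidu File System around the characteristics of the search workload.
Core Features
- Continuous availability
  - Data is replicated across multiple data centers and regions, and metadata consistency is maintained through Raft, so the loss of a single data center does not affect overall availability.
- High throughput
  - A high-performance single-node engine maximizes the I/O throughput of the storage media.
- Low latency
  - Global load balancing and automatic avoidance of slow nodes.
- Horizontal scalability
  - Designed to support three data centers across two regions and to manage more than 10,000 machines.
Architecture
Quick Start
Build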
./build.sh
Standalone BFS
cd sandbox
./deploy.sh
./start_bfs.sh
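How to Contribute
- Read the RoadMap file or the source code to understand our current development direction.
- Find a feature or module you are interested in working on.
- Implement it, verify that it behaves correctly, and run make test and make check to confirm that the existing test cases still pass.
- Open a pull request.
- Once it passes code review, your code has a chance to run on tens of thousands of Baidu servers.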