What We Talk About When We Talk About Distributed Systems


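I have been studying distributed systems for quite a while now. Honestly, once you start digging into distributed systems, the topics seem endless, one after another. The literature in the field is vast, including many papers published by universities and plenty of books to choose from. For an absolute beginner like me, it is hard to decide which papers to read or which books to buy.

At the same time, I found several bloggers who recommend this or that paper, claiming that a distributed systems engineer (never mind what that term means for now) must know it. So my reading list kept growing: FLP, Zab, “Time, Clocks, and the Ordering of Events in a Distributed System”, Viewstamped Replication, Paxos, Chubby, and so on. The problem is that they often do not explain why you should read this or that paper. I like the idea of learning everything, in no particular order, just to satisfy my curiosity. Still, we have to prioritize what we read, since a day has only 24 hours.

Besides the large body of papers and research material, there are also many books on distributed systems. I bought quite a few. Browsing their chapters, I found that some had attractive titles that seemed to promise what I was interested in but did not deliver, or rather, their content did not address the problems I wanted to solve.

I am still learning about distributed systems as I write this post, so please be patient and understand that it will inevitably contain errors. I will do my best to expand what is already written later on.

I would also like to mention that I have presented the content of this post at several conferences. Slides: What We Talk About When We Talk About Distributed Systems

Video of my talk at the Erlang User Conference in Stockholm: What We Talk About When We Talk About Distributed Systems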

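Main Concepts

Distributed algorithms are classified mainly according to their attributes, such as the timing model, the type of interprocess communication, and the failure model.

The main concepts covered in this article are:

Timing Model

Interprocess Communication

Failure Modes

Failure Detectors

Leader Election

Consensus

Quorums

Time in Distributed Systems

A Quick Look at FLP

Conclusion

References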

Timing Model

Synchronous model

Asynchronous model

“Impossibility of Distributed Consensus with One Faulty Process”

Partially synchronous model

Distributed Algorithms

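Interprocess Communication

Message passing model: processes send messages to each other;

Shared memory model: processes share data by reading and writing shared variables.

Linearizability

Failure Modes

“Failure Modes in Distributed Systems”

Guide to Reliable Distributed Systems

Failure Detectors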


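Introduction to Reliable and Secure Distributed Programming

Strong completeness: eventually, every crashed process is permanently suspected by every correct process;

Eventual strong accuracy: eventually, no correct process is suspected by any correct process.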
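The two properties above can be made concrete with a heartbeat-based detector. The following is only a minimal sketch under stated assumptions, not the algorithm from any of the cited books: time is passed in explicitly as a number, and the class name, method names, and the `initial_timeout` and `backoff` parameters are all hypothetical. A peer that stays silent longer than its timeout becomes suspected (aiming at strong completeness). A suspected peer that is heard from again is restored and given a longer timeout, so that false suspicions become rarer over time (aiming at eventual strong accuracy).

```python
class HeartbeatFailureDetector:
    """Illustrative heartbeat-based failure detector for one monitoring process.

    All names and parameters are assumptions made for this sketch.
    """

    def __init__(self, peers, initial_timeout=1.0, backoff=1.0):
        self.timeout = {p: initial_timeout for p in peers}
        self.last_heard = {p: 0.0 for p in peers}
        self.suspected = set()
        self.backoff = backoff

    def on_heartbeat(self, peer, now):
        """Record a heartbeat received from `peer` at time `now`."""
        self.last_heard[peer] = now
        if peer in self.suspected:
            # The suspicion was wrong: restore the peer and be more patient
            # with it from now on.
            self.suspected.discard(peer)
            self.timeout[peer] += self.backoff

    def check(self, now):
        """Suspect every peer that has been silent longer than its timeout."""
        for peer, heard in self.last_heard.items():
            if now - heard > self.timeout[peer]:
                self.suspected.add(peer)
        return set(self.suspected)
```

A crashed peer never sends another heartbeat, so once suspected it stays suspected. A slow but correct peer may be suspected a few times, but each mistake lengthens its timeout.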

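failure detectors that can avoid the problems described above

Leader Election

Consensus

“Reaching Agreement in the Presence of Faults”

Fault-tolerant systems often require a means by which independent processors or processes can reach some kind of exact mutual agreement. For example, the processors of a redundant system may need to synchronize their internal clocks periodically. Or each of them may read a slightly different value from a time-varying input sensor and they need to settle on a single value.

Fault-Tolerant Real-Time Systems

Termination: every correct process eventually decides some value;

Validity: if a process decides _v_, then _v_ was proposed by some process;

Integrity: no process decides twice;

Agreement: no two correct processes decide differently.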
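The four properties above can be checked mechanically against a recorded run of a consensus protocol. The sketch below is illustrative only: the function name and the trace format (a dict of proposals, an ordered list of decide events, and the set of processes that never crashed) are assumptions, not part of any cited work. Note that on a finite trace, termination can only be checked as "every correct process decided before the trace ended".

```python
def check_consensus_run(proposals, decisions, correct):
    """Check the four consensus properties on one finite, recorded run.

    proposals: dict mapping process -> proposed value
    decisions: list of (process, value) decide events, in order
    correct:   set of processes that never crashed during the run
    """
    decided = {}
    for process, value in decisions:
        decided.setdefault(process, []).append(value)

    proposed_values = set(proposals.values())
    values_of_correct = {v for p, v in decisions if p in correct}

    return {
        # Every correct process decided at some point in the trace.
        "termination": all(p in decided for p in correct),
        # Every decided value was proposed by some process.
        "validity": all(v in proposed_values for _, v in decisions),
        # No process decided more than once.
        "integrity": all(len(vs) == 1 for vs in decided.values()),
        # Correct processes never decided different values.
        "agreement": len(values_of_correct) <= 1,
    }
```

For example, a run in which `p` and `q` both decide `2` after proposals `{"p": 1, "q": 2, "r": 2}`, with `r` crashed, satisfies all four properties.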

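Introduction to Reliable and Secure Distributed Programming, Chapter 5

Fault-tolerant Agreement in Synchronous Message-passing Systems

Communication and Agreement Abstractions for Fault-Tolerant Asynchronous Distributed Systems

Quorums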

N

N/2-1


“The Load, Capacity, and Availability of Quorum Systems”

N

f

(N + f) / 2
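The expressions in this section can be illustrated with a little arithmetic. The sketch below assumes the usual definitions: with crash failures, a quorum is any majority of the N processes, and with up to f Byzantine processes, a quorum is any set of more than (N + f) / 2 processes. The function names are hypothetical. The point is that any two quorums overlap: majorities share at least one process, and Byzantine quorums share more than f processes, so at least one process in the overlap is correct.

```python
from itertools import combinations


def majority_quorum_size(n):
    """Smallest majority of n processes (crash-failure setting)."""
    return n // 2 + 1


def byzantine_quorum_size(n, f):
    """Smallest integer strictly greater than (n + f) / 2."""
    return (n + f) // 2 + 1


def min_overlap(n, size):
    """Fewest processes that two sets of `size` out of n must share."""
    return max(0, 2 * size - n)


def smallest_intersection(n, size):
    """Brute-force check of min_overlap for small n."""
    subsets = [set(c) for c in combinations(range(n), size)]
    return min(len(a & b) for a in subsets for b in subsets)
```

With N = 4 and f = 1, a Byzantine quorum has 3 processes and any two such quorums share at least 2 processes, which is more than f.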

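Introduction to Reliable and Secure Distributed Programming

Quorum Systems: With Applications to Storage and Consensus

Time in Distributed Systems

“happened before”

“Time, Clocks, and the Ordering of Events in a Distributed System”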
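The “happened before” relation can be tracked with logical clocks. The following is a minimal sketch assuming the standard Lamport clock rules: a process increments its counter on every local event and on every send, and on receive it sets its counter to the maximum of its own value and the message timestamp, plus one. The class and method names are illustrative.

```python
class LamportClock:
    """Logical clock for one process (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event and return its timestamp."""
        self.time += 1
        return self.time

    def send(self):
        """A send is an event too; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, message_time):
        """Merge the sender's timestamp so the receive is ordered after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time
```

If event a happened before event b, then a's timestamp is smaller than b's. The converse does not hold: two timestamps alone cannot tell whether the events were causally related or concurrent.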

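Interval Tree Clocks

“There Is No Now”

We must abandon the idea of simultaneity.

Inventing the Enemy

A Quick Look at FLP

Impossibility of Distributed Consensus with One Faulty Process

The consensus problem involves an asynchronous system of processes, some of which may be unreliable.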


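This paper presents a surprising result: if even a single process may fail, no asynchronous consensus protocol can exist. We assume that there are no Byzantine failures and that the message system is reliable: every message is delivered correctly and exactly once.

We do not assume that a process can detect the death of another process. In other words, a process cannot tell whether another process has died (stopped entirely) or is just running very slowly.

A Brief Tour of FLP

“Stumbling over Consensus Research: Misunderstandings and Issues”

Conclusion

Distributed Algorithms

Introduction to Reliable and Secure Distributed Programming

References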

Marcos K. Aguilera. 2010. Stumbling over consensus research: misunderstandings and issues. In Replication, Bernadette Charron-Bost, Fernando Pedone, and André Schiper (Eds.). Springer-Verlag, Berlin, Heidelberg, 59-72.

Paulo Sérgio Almeida, Carlos Baquero, and Victor Fonte. 2008. Interval Tree Clocks. In Proceedings of the 12th International Conference on Principles of Distributed Systems (OPODIS ’08), Theodore P. Baker, Alain Bui, and Sébastien Tixeuil (Eds.). Springer-Verlag, Berlin, Heidelberg, 259-274.

Kenneth P. Birman. 2012. Guide to Reliable Distributed Systems: Building High-Assurance Applications and Cloud-Hosted Services. Springer Publishing Company, Incorporated.

Mike Burrows. 2006. The Chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th symposium on Operating systems design and implementation (OSDI ’06). USENIX Association, Berkeley, CA, USA, 335-350.

Christian Cachin, Rachid Guerraoui, and Luis Rodrigues. 2014. Introduction to Reliable and Secure Distributed Programming (2nd ed.). Springer Publishing Company, Incorporated.

Tushar Deepak Chandra and Sam Toueg. 1996. Unreliable failure detectors for reliable distributed systems. J. ACM 43, 2 (March 1996), 225-267.

Umberto Eco. 2013. Inventing the Enemy: Essays. Mariner Books.

Colin J. Fidge. 1988. Timestamps in message-passing systems that preserve the partial ordering. Proceedings of the 11th Australian Computer Science Conference 10 (1), 56-66.

Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. 1983. Impossibility of distributed consensus with one faulty process. In Proceedings of the 2nd ACM SIGACT-SIGMOD symposium on Principles of database systems (PODS ’83). ACM, New York, NY, USA, 1-7.

Maurice P. Herlihy and Jeannette M. Wing. 1990. Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12, 3 (July 1990), 463-492.

Leslie Lamport. 1978. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (July 1978), 558-565.

Leslie Lamport. 1998. The part-time parliament. ACM Trans. Comput. Syst. 16, 2 (May 1998), 133-169.

Nancy A. Lynch. 1996. Distributed Algorithms. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.

Moni Naor and Avishai Wool. 1998. The Load, Capacity, and Availability of Quorum Systems. SIAM J. Comput. 27, 2 (April 1998), 423-447.

Brian M. Oki and Barbara H. Liskov. 1988. Viewstamped Replication: A New Primary Copy Method to Support Highly-Available Distributed Systems. In Proceedings of the seventh annual ACM Symposium on Principles of distributed computing (PODC ’88). ACM, New York, NY, USA, 8-17.

Diego Ongaro and John Ousterhout. 2014. In search of an understandable consensus algorithm. In Proceedings of the 2014 USENIX conference on USENIX Annual Technical Conference (USENIX ATC’14), Garth Gibson and Nickolai Zeldovich (Eds.). USENIX Association, Berkeley, CA, USA, 305-320.

M. Pease, R. Shostak, and L. Lamport. 1980. Reaching Agreement in the Presence of Faults. J. ACM 27, 2 (April 1980), 228-234.

Stefan Poledna. 1996. Fault-Tolerant Real-Time Systems: The Problem of Replica Determinism. Kluwer Academic Publishers, Norwell, MA, USA.

Michel Raynal. 2010. Communication and Agreement Abstractions for Fault-Tolerant Asynchronous Distributed Systems (1st ed.). Morgan and Claypool Publishers.

Michel Raynal. 2010. Fault-tolerant Agreement in Synchronous Message-passing Systems (1st ed.). Morgan and Claypool Publishers.

Benjamin Reed and Flavio P. Junqueira. 2008. A simple totally ordered broadcast protocol. In Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware (LADIS ’08). ACM, New York, NY, USA, Article 2, 6 pages.

Justin Sheehy. 2015. There Is No Now. ACM Queue.

Marko Vukolic. 2012. Quorum Systems: With Applications to Storage and Consensus. Morgan and Claypool Publishers.

Original article: What We Talk About When We Talk About Distributed Systems (translated by 柳泉波)

http://dockone.io/article/898
