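C++11 Multithreading: Data Protection

Posted by jopen 13 years ago | 22K views | C/C++ Development, C/C++
When several threads of a multithreaded program access the same shared resource at the same time, synchronization problems follow. This article covers data protection in C++11 multithreaded programming.
Data Loss

Let's start with a simple example. Consider the following code: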
#include <functional>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

using std::thread;
using std::vector;
using std::cout;
using std::endl;

class Incrementer
{
    private:
        int counter;

    public:
        Incrementer() : counter{0} { }

        // Run by each thread: adds 100000 to the shared counter.
        // Deliberately unsynchronized, so concurrent calls race.
        void operator()()
        {
            for(int i = 0; i < 100000; i++)
            {
                this->counter++;
            }
        }

        int getCounter() const
        {
            return this->counter;
        }
};

int main()
{
    // Create the threads which will each do some counting
    vector<thread> threads;

    Incrementer counter;

    // std::ref makes all three threads share the same Incrementer
    threads.push_back(thread(std::ref(counter)));
    threads.push_back(thread(std::ref(counter)));
    threads.push_back(thread(std::ref(counter)));

    for(auto &t : threads)
    {
        t.join();
    }

    cout << counter.getCounter() << endl;

    return 0;
}
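The purpose of this program is to count to 300,000. Some not-so-bright programmer wanted to speed up the counting, so he created three threads that share one variable, counter, and each thread adds 100,000 to it.

The code defines a class named Incrementer with a private member counter. Its constructor is trivial: it just sets counter to 0.

Next comes an overloaded function call operator, operator()(), which means every instance of the class can be called like a plain function. Normally you call a method on an object as object.fooMethod(), but here you call the object itself, as in object(). That is what lets us hand the whole object to the thread class: the thread simply runs the body of operator()(). Finally, there is a getCounter method that returns the value of counter.

Then comes the entry point, main(). We create three threads but only one instance of Incrementer, and pass that same instance to all three threads. Note the use of std::ref: it passes a reference to the instance rather than a copy of it.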

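Now let's see what the program does when it runs. If our not-so-bright programmer is at least bright enough, he will compile it with GCC 4.7 or newer, or with Clang 3.1, like this: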

g++ -std=c++11 -pthread -o threading_example main.cpp
Running it gives:

[lucas@lucas-desktop src]$ ./threading_example 
218141
[lucas@lucas-desktop src]$ ./threading_example 
208079
[lucas@lucas-desktop src]$ ./threading_example 
100000
[lucas@lucas-desktop src]$ ./threading_example 
202426
[lucas@lucas-desktop src]$ ./threading_example 
172209
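But wait, that's not right. The program never counted to 300,000, and one run only reached 100,000. Why? Well, the increment actually corresponds to several processor instructions: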

movl    counter(%rip), %eax
addl    $1, %eax
movl    %eax, counter(%rip)
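The first instruction loads the value of counter into the %eax register, the second adds 1 to the register, and the third stores the register back to the memory address of counter.

I can hear you muttering: fine, but why does that make the count wrong? Well, remember that we said earlier that threads share the processor, because there is only one core. Sometimes a thread gets through all three instructions in one go, but often the operating system tells it: your time is up, get back in the queue. Another thread then runs, and when the first thread is scheduled again it resumes exactly where it was paused. So guess what happens when the system hands the processor to another thread just as the current one is about to add 1 to the register?

Nobody can say exactly what will happen on a given run, but one likely outcome is a lost increment. The paused thread still holds the old value in its register; meanwhile another thread loads counter, increments it and stores the result. When the first thread resumes, it adds 1 to its stale value and stores that, overwriting the other thread's work. Because the points at which threads are interrupted differ from run to run, so does the final count.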

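The Right Way

The solution is to allow only one thread at a time to access the shared variable. The std::mutex class does this: a thread locks the mutex on the way in, performs its operation, and then releases the lock. Any other thread that wants the shared resource has to wait until the lock is released.

A mutex (short for mutual exclusion) comes with a guarantee that its lock and unlock operations are indivisible. A thread cannot be interrupted halfway through locking or unlocking a mutex: the operation completes before the operating system switches threads.

Best of all, if you try to lock a mutex that another thread has already locked, you simply wait until it is released. The operating system keeps track of which thread is waiting for which mutex. A blocked thread enters a "blocked on m" state, meaning the operating system gives it no processor time until the mutex is unlocked, so no CPU cycles are wasted. If several threads are waiting, which one gets the mutex first is up to the operating system: general-purpose systems such as Windows and Linux typically serve waiters in roughly FIFO order, though this is not guaranteed, while real-time operating systems go by priority.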

Now let's improve the code above:

#include <functional>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

using std::thread;
using std::vector;
using std::cout;
using std::endl;
using std::mutex;

class Incrementer
{
    private:
        int counter;
        mutex m;    // protects counter

    public:
        Incrementer() : counter{0} { }

        void operator()()
        {
            for(int i = 0; i < 100000; i++)
            {
                // Only one thread at a time gets past lock()
                this->m.lock();
                this->counter++;
                this->m.unlock();
            }
        }

        int getCounter() const
        {
            return this->counter;
        }
};

int main()
{
    // Create the threads which will each do some counting
    vector<thread> threads;

    Incrementer counter;

    threads.push_back(thread(std::ref(counter)));
    threads.push_back(thread(std::ref(counter)));
    threads.push_back(thread(std::ref(counter)));

    for(auto &t : threads)
    {
        t.join();
    }

    cout << counter.getCounter() << endl;

    return 0;
}
Note the changes: we include the mutex header and add a member m of type mutex. In operator()() we lock the mutex m, increment counter, and then unlock the mutex.

Running the program again gives:

[lucas@lucas-desktop src]$ ./threading_example 
300000
[lucas@lucas-desktop src]$ ./threading_example 
300000
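This time the count is right. But there is no free lunch in computer science: using a mutex slows the program down. Still, that beats a program that gives the wrong answer.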

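Guarding Against Exceptions

An operation performed while the mutex is held may throw an exception. In our example the chance is negligible (incrementing an int does not throw), but in complex systems it is very likely. The code above is not exception-safe: if an exception is thrown, control leaves the function while the mutex is still locked.

To make sure the mutex is unlocked even when an exception is thrown, we need code like this: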

for(int i = 0; i < 100000; i++)
{
    this->m.lock();
    try
    {
        this->counter++;
        this->m.unlock();
    }
    catch(...)
    {
        this->m.unlock();
        throw;
    }
}
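But that is a lot of code just to lock and unlock a mutex. Don't worry, I know you're lazy, so here is a simpler one-line solution: the std::lock_guard class. It locks the mutex object when it is created and releases it when it is destroyed.

Let's modify the code again: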

void operator()()
{
    for(int i = 0; i < 100000; i++)
    {
        std::lock_guard<mutex> lock(this->m);

        // The lock has been created now, and immediately locks the mutex
        this->counter++;

        // This is the end of the for-loop scope, and the lock will be
        // destroyed, and in the destructor of the lock, it will
        // unlock the mutex
    }
}
This code is now exception-safe: when an exception is thrown, the destructor of the lock object is called and automatically unlocks the mutex.

Remember to use the following pattern:

void long_function()
{
    // some long code

    // Just a pair of curly braces
    {
        // Temp scope, create lock
        // (m is assumed to be a std::mutex visible here, e.g. a class member)
        std::lock_guard<std::mutex> lock(m);

        // do some stuff

        // Close the scope, so the guard will unlock the mutex
    }
}
English original; original translation by OSCHINA.
Article URL: http://www.baiduhome.net/lib/view/open1341387098312.html