Elasticsearch bulk operations

Posted by goby1220 8 years ago

Source: https://segmentfault.com/a/1190000004426546


This post records how to run Elasticsearch (ES) bulk operations with curl.

Bulk request

Prepare the data

vim documents.json
{ "index": {"_index": "library", "_type": "book", "_id": "1"}}
{ "title": "All Quiet on the Western Front","otitle": "Im Westen nichts Neues","author": "Erich Maria Remarque","year": 1929,"characters": ["Paul Bäumer", "Albert Kropp", "Haie Westhus", "Fredrich Müller", "Stanislaus Katczinsky", "Tjaden"],"tags": ["novel"],"copies": 1, "available": true, "section" : 3}
{ "index": {"_index": "library", "_type": "book", "_id": "2"}}
{ "title": "Catch-22","author": "Joseph Heller","year": 1961,"characters": ["John Yossarian", "Captain Aardvark", "Chaplain Tappman", "Colonel Cathcart", "Doctor Daneeka"],"tags": ["novel"],"copies": 6, "available" : false, "section" : 1}
{ "index": {"_index": "library", "_type": "book", "_id": "3"}}
{ "title": "The Complete Sherlock Holmes","author": "Arthur Conan Doyle","year": 1936,"characters": ["Sherlock Holmes","Dr. Watson", "G. Lestrade"],"tags": [],"copies": 0, "available" : false, "section" : 12}
{ "index": {"_index": "library", "_type": "book", "_id": "4"}}
{ "title": "Crime and Punishment","otitle": "Преступлéние и наказáние","author": "Fyodor Dostoevsky","year": 1886,"characters": ["Raskolnikov", "Sofia Semyonovna Marmeladova"],"tags": [],"copies": 0, "available" : true}
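The file above follows the bulk format: one action line, then one source line per document, each terminated by a newline. A minimal Python sketch of generating such a payload is shown here; the helper name build_bulk_body and the shortened documents are illustrative assumptions, not part of the original post.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build an NDJSON bulk payload: one action line, then one source line, per document."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # ensure_ascii=False keeps characters such as "ü" readable in the output
        lines.append(json.dumps(source, ensure_ascii=False))
    # Every line of a bulk payload, including the last one, must end with a newline
    return "\n".join(lines) + "\n"


# Shortened sample documents (hypothetical subset of the data above)
docs = {
    "1": {"title": "All Quiet on the Western Front", "characters": ["Fredrich Müller"], "year": 1929},
    "2": {"title": "Catch-22", "year": 1961},
}
body = build_bulk_body("library", "book", docs)
print(body, end="")
```

Writing body to documents.json with UTF-8 encoding would give a file that can be sent with --data-binary as in the request below.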

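Disable refresh

Create the library index with automatic refresh disabled (if the index already exists, change refresh_interval through /library/_settings instead):

curl -XPUT '192.168.99.100:9200/library' -d '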
{
    "settings":{
        "refresh_interval":"-1"
    }
}
'

Send the request

curl -s -XPOST '192.168.99.100:9200/_bulk' --data-binary @documents.json
{"took":2603,"errors":false,"items":[{"index":{"_index":"library","_type":"book","_id":"1","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"status":201}},{"index":{"_index":"library","_type":"book","_id":"2","_version":2,"_shards":{"total":2,"successful":2,"failed":0},"status":200}},{"index":{"_index":"library","_type":"book","_id":"3","_version":2,"_shards":{"total":2,"successful":2,"failed":0},"status":200}},{"index":{"_index":"library","_type":"book","_id":"4","_version":2,"_shards":{"total":2,"successful":2,"failed":0},"status":200}}]}
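A bulk response reports a top-level errors flag plus one status per item, so a single failed document can hide inside an otherwise successful call. The sketch below shows one way to list failed items; the function name failed_items and the sample response (including its error text) are made up for illustration.

```python
import json


def failed_items(response_text):
    """Return (action, _id, status) for every bulk item whose status is 400 or above."""
    response = json.loads(response_text)
    failures = []
    for item in response.get("items", []):
        # Each item is a single-key object such as {"index": {...}}
        for action, result in item.items():
            if result.get("status", 0) >= 400:
                failures.append((action, result.get("_id"), result.get("status")))
    return failures


# Hypothetical response with one failed item, for illustration only
sample = json.dumps({
    "took": 5,
    "errors": True,
    "items": [
        {"index": {"_index": "library", "_type": "book", "_id": "1", "status": 201}},
        {"index": {"_index": "library", "_type": "book", "_id": "2", "status": 400,
                   "error": "hypothetical failure"}},
    ],
})
print(failed_items(sample))  # [('index', '2', 400)]
```

For the response shown above, where errors is false and every status is 200 or 201, the function would return an empty list.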

Refresh

Change the setting back so that in-memory segments are refreshed into the file system cache every 1s:

curl -XPUT '192.168.99.100:9200/library/_settings' -d '
{
    "index":{
        "refresh_interval":"1s"
    }
}
'

Or trigger one more refresh manually:

curl -XPOST '192.168.99.100:9200/_refresh'
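The whole sequence (pause refresh, bulk load, restore the interval, refresh once) can be sketched in Python roughly as follows. This is only an outline under assumptions: the index is assumed to exist already, so the settings are changed through the _settings endpoint, and the helper names bulk_load_steps and send are invented for this example.

```python
import json
import urllib.request


def bulk_load_steps(index, bulk_body):
    """Return the ordered HTTP calls for a bulk load with refresh paused."""
    def settings(interval):
        return json.dumps({"index": {"refresh_interval": interval}})

    return [
        ("PUT", f"/{index}/_settings", settings("-1")),  # pause automatic refresh
        ("POST", "/_bulk", bulk_body),                   # send the NDJSON payload
        ("PUT", f"/{index}/_settings", settings("1s")),  # restore the 1s interval
        ("POST", f"/{index}/_refresh", ""),              # make the new documents searchable
    ]


def send(base_url, step):
    """Send one step with urllib; base_url is e.g. 'http://192.168.99.100:9200'."""
    method, path, body = step
    content_type = "application/x-ndjson" if path.endswith("/_bulk") else "application/json"
    request = urllib.request.Request(
        base_url + path,
        data=body.encode("utf-8"),
        method=method,
        headers={"Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


for method, path, _ in bulk_load_steps("library", "{}\n"):
    print(method, path)
```

Calling send for each step in order against a running node would replay the curl commands above; the loop at the end only prints the plan and does not contact any server.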

Installing the head plugin

cd /usr/share/elasticsearch
./bin/plugin install mobz/elasticsearch-head

Restart ES

cd /etc/init.d
./elasticsearch restart

Search on the title field:

{
    "query": {
        "query_string": {
            "query": "title:crime"
        }
    }
}

To return version information as well:

{
    "version": true, 
    "query": {
        "query_string": {
            "query": "title:crime"
        }
    }
}

To return only specific fields:

{
    "fields": ["title","year"], 
    "query": {
        "query_string": {
            "query": "title:crime"
        }
    }
}
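The three request bodies above differ only in the optional version and fields keys. A small Python sketch that assembles them could look like this; build_search_body is a hypothetical helper, and the key names simply mirror the bodies shown in this post.

```python
import json


def build_search_body(query, version=False, fields=None):
    """Assemble a query_string search body with optional version and fields keys."""
    body = {}
    if version:
        body["version"] = True          # ask for each hit's _version
    if fields:
        body["fields"] = list(fields)   # restrict the returned fields
    body["query"] = {"query_string": {"query": query}}
    return body


print(json.dumps(build_search_body("title:crime", fields=["title", "year"]), indent=4))
```

With no optional arguments the function produces the first body; passing version=True or fields=[...] produces the second and third.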

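About flush

A refresh only moves the in-memory segment into the file system cache (once it is there, Lucene can search the segment); nothing has reached the disk yet. While ES writes data into the in-memory buffer it also writes an entry to the translog, and a refresh leaves the translog unchanged.
A flush is what actually writes the segments to disk, updates the commit file (which records all segments in the index), and clears the translog. By default a flush runs every 30 minutes, or as soon as the translog grows beyond 512 MB.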
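The difference between refresh and flush described above can be illustrated with a toy model. This is a deliberately simplified sketch, not how Elasticsearch or Lucene is implemented: the class ToyShard and its lists are invented stand-ins for the buffer, the file system cache, the disk, and the translog.

```python
class ToyShard:
    """Toy model of the write path described above; not Elasticsearch's real implementation."""

    def __init__(self):
        self.buffer = []     # documents indexed but not yet searchable
        self.fs_cache = []   # segments that are searchable but not yet on disk
        self.disk = []       # segments recorded by the commit point
        self.translog = []   # operations logged since the last flush

    def index(self, doc):
        # A write goes to the in-memory buffer and to the translog at the same time
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Turn the buffer into a searchable segment; the translog is left untouched
        if self.buffer:
            self.fs_cache.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        # Simplification: flush first refreshes, then persists segments and clears the translog
        self.refresh()
        self.disk.extend(self.fs_cache)
        self.fs_cache.clear()
        self.translog.clear()

    def searchable(self):
        return [doc for segment in self.disk + self.fs_cache for doc in segment]


shard = ToyShard()
shard.index("book-1")
print(shard.searchable())                  # []
shard.refresh()
print(shard.searchable(), shard.translog)  # ['book-1'] ['book-1']
shard.flush()
print(shard.searchable(), shard.translog)  # ['book-1'] []
```

The point of the model is only the ordering: a document becomes searchable at refresh time, while the translog is kept until a flush has made the segments durable.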
