Using Elasticsearch as a Time Series Database

Posted by bing200767 8 years ago | Elasticsearch, search engine

Source: http://blog.csdn.net//jiao_fuyou/article/details/49663687


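This post is a loose translation of another article, "Elasticsearch as a Time Series Data Store", mixed with my own understanding of it. The notes below describe the index mapping used in the example that follows.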

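  • First, _source is disabled, so the original JSON document is not stored a second time.
  • Second, _all is disabled as well, and every field has store set to false, so no field is stored separately.
  • With all of these turned off, where does the data go? Into doc_values. During aggregations, doc_values is used to quickly look up a column's values for a batch of document IDs. On disk, doc_values is a compressed, column-oriented file, which makes it very efficient.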
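
To make the column-oriented idea concrete, here is a minimal toy model in Python. It is only a sketch of the access pattern described above (one column per field, addressed by document ID); it is not how Elasticsearch or Lucene implement doc_values, and the class name ToyDocValues and its methods are made up for illustration.

```python
class ToyDocValues:
    """Toy column store: one list per field, indexed by internal doc id.

    Illustrative only; real doc_values are compressed on-disk structures.
    """

    def __init__(self, fields):
        self.columns = {name: [] for name in fields}
        self.size = 0

    def add(self, doc):
        """Append one document and return its internal doc id."""
        for name, column in self.columns.items():
            column.append(doc[name])
        self.size += 1
        return self.size - 1

    def values(self, field, doc_ids):
        """Read one field for a batch of doc ids without touching other columns."""
        column = self.columns[field]
        return [column[i] for i in doc_ids]


store = ToyDocValues(["timestamp", "total_count"])
ids = [
    store.add({"timestamp": 53534543, "total_count": 100}),
    store.add({"timestamp": 53534543, "total_count": 50}),
    store.add({"timestamp": 53534600, "total_count": 7}),
]

# An aggregation only needs the column it works on.
total = sum(store.values("total_count", ids[:2]))
print(total)
```

Because each field lives in its own column, summing total_count never has to read timestamp or any other field, which is the property that makes this layout a good fit for aggregations.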

curl -XPOST http://172.16.18.116:9200/test -d '
{
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "index.query.default_field": "timestamp",
        "index.mapping.ignore_malformed": false,
        "index.mapping.coerce": false,
        "index.query.parse.allow_unmapped_fields": false
    },
    "mappings": {
        "test": {
            "_source": {"enabled": false},
            "_all": {"enabled": false},
            "properties": {
                "timestamp": { "type": "date", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } },
                "appid": { "type": "string", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } },
                "result": { "type": "string", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } },
                "cmdid": { "type": "string", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } },
                "optime": { "type": "integer", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } },
                "total_count": { "type": "integer", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } }
            }
        }
    }
}'

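Every field in the mapping above carries the same options and differs only in its type, so the request body can also be generated instead of written by hand. The following Python sketch builds the same JSON; the names COMMON_OPTIONS, FIELD_TYPES and build_mapping are illustrative and not part of any Elasticsearch client. It mirrors the mapping syntax used in this post (for example "string" and "index": "no"), which belongs to the older Elasticsearch release the post was written against; later releases changed these names.

```python
import copy
import json

# Options repeated on every field in the mapping from this post.
COMMON_OPTIONS = {
    "index": "no",
    "store": False,
    "dynamic": "strict",
    "doc_values": True,
    "fielddata": {"format": "doc_values"},
}

# Field name -> type, as listed in the post.
FIELD_TYPES = {
    "timestamp": "date",
    "appid": "string",
    "result": "string",
    "cmdid": "string",
    "optime": "integer",
    "total_count": "integer",
}


def build_mapping(type_name, field_types):
    """Return the "mappings" section with _source and _all disabled."""
    properties = {}
    for name, field_type in field_types.items():
        options = copy.deepcopy(COMMON_OPTIONS)  # avoid sharing nested dicts
        properties[name] = {"type": field_type, **options}
    return {
        type_name: {
            "_source": {"enabled": False},
            "_all": {"enabled": False},
            "properties": properties,
        }
    }


body = {
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "index.query.default_field": "timestamp",
        "index.mapping.ignore_malformed": False,
        "index.mapping.coerce": False,
        "index.query.parse.allow_unmapped_fields": False,
    },
    "mappings": build_mapping("test", FIELD_TYPES),
}

# The printed JSON can be saved to a file and passed to curl with -d @file.
print(json.dumps(body, indent=4))
```

Adding a new metric field then only requires one more entry in FIELD_TYPES.
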
Add a document:

curl -XPOST http://172.16.18.116:9200/test/test/1 -d '
{
    "timestamp": 53534543,
    "appid": 1,
    "result": "test",
    "cmdid": "test",
    "optime": 53534543,
    "total_count": 100 }
'

Query it:

curl -XGET http://172.16.18.116:9200/test/test/_search

Result:

{
    "took": 1,
    "timed_out": false,
    "_shards": { "total": 1, "successful": 1, "failed": 0 },
    "hits": { "total": 1, "max_score": 1, "hits": [ { "_index": "test", "_type": "test", "_id": "1", "_score": 1 } ] } }

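The document is found, but the hit contains none of the original field contents, because the fields are neither stored nor indexed and _source is disabled. Since doc_values is true, however, the values are in fact saved on disk.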

Now run an aggregation:

curl -XPOST http://172.16.18.116:9200/test/test/_search -d '
{
    "aggs": { "timestamp": { "terms": { "field": "timestamp" }, "aggs": { "total_count": {"sum": {"field": "total_count"}} } } } }'

Result:

{
    "took": 2,
    "timed_out": false,
    "_shards": { "total": 1, "successful": 1, "failed": 0 },
    "hits": { "total": 1, "max_score": 1, "hits": [ { "_index": "test", "_type": "test", "_id": "1", "_score": 1 } ] },
    "aggregations": { "timestamp": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": 53534543, "key_as_string": "1970-01-01T14:52:14.543Z", "doc_count": 1, "total_count": { "value": 100 } } ] } } }

As the result shows, the aggregation can read the total_count value even though the search hits themselves return no fields.
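
To check the numbers in that response by hand, the sketch below re-computes the same "terms on timestamp, then sum of total_count" grouping locally in Python. It is a simplified stand-in for what the aggregation does, not Elasticsearch code; the helper names key_as_string and terms_with_sum are hypothetical, and bucket ordering here is simply insertion order.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def key_as_string(epoch_millis):
    """Render epoch milliseconds as an ISO-8601 UTC string with a Z suffix."""
    moment = EPOCH + timedelta(milliseconds=epoch_millis)
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def terms_with_sum(docs, key_field, sum_field):
    """Group docs by key_field, count them, and sum sum_field per group."""
    groups = defaultdict(lambda: {"doc_count": 0, "sum": 0})
    for doc in docs:
        group = groups[doc[key_field]]
        group["doc_count"] += 1
        group["sum"] += doc[sum_field]
    return [
        {
            "key": key,
            "key_as_string": key_as_string(key),
            "doc_count": group["doc_count"],
            sum_field: {"value": group["sum"]},
        }
        for key, group in groups.items()
    ]


# The single document indexed earlier in this post.
docs = [{"timestamp": 53534543, "total_count": 100}]
buckets = terms_with_sum(docs, "timestamp", "total_count")
print(buckets)
```

The timestamp 53534543 is interpreted as milliseconds since the epoch, which is why the bucket's key_as_string falls on 1970-01-01. For real time series data, where timestamps rarely repeat exactly, grouping by a time interval (for example with a date_histogram aggregation) is usually a better fit than a terms aggregation on the raw timestamp.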

