[ElasticSearch] ES Operations: Sum Bucket Aggregation


I recently picked up a lot of new ES query tricks from a colleague, so here is a quick summary.

Sum Bucket Aggregation


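Use case: get the sum of a given metric across all buckets produced by a grouping.


For example: group by some field and get the total count across the top 1000 groups.

You could do it the clumsy way: declare a variable, loop over the buckets, read each count and add it up. That is not very elegant, though, and since ES already provides an aggregation for this, you can simply call it.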
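
For comparison, the clumsy client-side loop looks roughly like this. This is a minimal sketch in plain Java with no Elasticsearch dependency: the class name ManualBucketSum and the method sumDocCounts are made up for illustration, and the counts are the five doc_count values from the Example 1 result in this post.

```java
public class ManualBucketSum {

    /** Adds up the doc_count of every bucket, the way a client-side loop would. */
    public static long sumDocCounts(long[] docCounts) {
        long total = 0;
        for (long count : docCounts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // doc_count values of the five buckets in the Example 1 result
        long[] docCounts = {129636, 41586, 39196, 38775, 23163};
        System.out.println(sumDocCounts(docCounts)); // prints 272356
    }
}
```

It works, but the client has to fetch and walk every bucket itself, whereas sum_bucket returns the total as part of the same response.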

 

Reference: https://xiaoxiami.gitbook.io/elasticsearch/ji-chu/36aggregationsju-he-fen-679029/363guan-dao-ju-540828-pipeline-aggregations/zong-he-tong-ju-540828-sum-bucket-aggregation

 

Example 1, DSL:

"aggs": {
    "all": {
        "terms": {
            "field": "topics",
            "size": 5
        }
    },
    "sum": {
        "sum_bucket": {
            "buckets_path": "all>_count"
        }
    }
}

Result:

      "aggregations": {
            "all": {
              "doc_count_error_upper_bound": 11656,
              "sum_other_doc_count": 2575137,
              "buckets": [
                {
                  "key": "xx",
                  "doc_count": 129636
                },
                {
                  "key": "xxx",
                  "doc_count": 41586
                },
                {
                  "key": "xxxx",
                  "doc_count": 39196
                },
                {
                  "key": "xxxxx",
                  "doc_count": 38775
                },
                {
                  "key": "xxxxxx",
                  "doc_count": 23163
                }
              ]
            },
            "sum": {
              "value": 272356
            }
         }

The value of "sum" is the sum of the doc_count of the returned buckets (129636 + 41586 + 39196 + 38775 + 23163 = 272356).

 

The same query in Java with rest-high-level-client:

        // Match everything; size(0) skips the hits because only the aggregations matter.
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder()
                .query(new MatchAllQueryBuilder())
                .size(0)
                .timeout(TimeValue.timeValueMillis(120000));
        // Bucket by "topics" and keep the top 5 terms.
        TermsAggregationBuilder terms = AggregationBuilders.terms("all").field("topics").size(5);
        // Sibling pipeline aggregation: sum the doc_count of every bucket of "all".
        SumBucketPipelineAggregationBuilder sumBucket = new SumBucketPipelineAggregationBuilder("sum", "all>_count");
        sourceBuilder.aggregation(terms).aggregation(sumBucket);
        // xxIndex, xxType and esClient are placeholders for the project's own index, type and client wrapper.
        SearchRequest request = new SearchRequest(xxIndex)
                .types(xxType)
                .source(sourceBuilder);
        SearchResponse response = esClient.getClient().search(request);
        Map<String, Aggregation> map = response.getAggregations().getAsMap();
        double sum = ((ParsedSimpleValue) map.get("sum")).value();

 

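Besides the count, other numeric metrics can be summed as well. For example, sum a field within each bucket, then get the total of those sums across all buckets.

Example 2, DSL: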

      "aggs": {
            "all": {
              "terms": {
                "field": "topics",
                "size": 5
              },
              "aggs": {
                "friends_cnt": {
                  "sum": {
                    "field": "friends_cnt"
                  }
                }
              }
            },
            "sum":{
              "sum_bucket":{
                "buckets_path":"all>friends_cnt"
              }
            }
          }

Result:

           "aggregations": {
            "all": {
              "doc_count_error_upper_bound": 11656,
              "sum_other_doc_count": 2575137,
              "buckets": [
                {
                  "key": "xx",
                  "doc_count": 129636,
                  "friends_cnt": {
                    "value": 55291503
                  }
                },
                {
                  "key": "xxx",
                  "doc_count": 41586,
                  "friends_cnt": {
                    "value": 21381248
                  }
                },
                {
                  "key": "xxxx",
                  "doc_count": 39196,
                  "friends_cnt": {
                    "value": 14668921
                  }
                },
                {
                  "key": "xxxxx",
                  "doc_count": 38775,
                  "friends_cnt": {
                    "value": 19805247
                  }
                },
                {
                  "key": "xxxxxx",
                  "doc_count": 23163,
                  "friends_cnt": {
                    "value": 10268415
                  }
                }
              ]
            },
            "sum": {
              "value": 121415334
            }
          }        

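In Java: only the first aggregation needs to change (add a sub-aggregation), and the "_count" in the sum_bucket path is replaced with the name of that sub-aggregation.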

        // Match everything; size(0) skips the hits because only the aggregations matter.
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder()
                .query(new MatchAllQueryBuilder())
                .size(0)
                .timeout(TimeValue.timeValueMillis(120000));
        // Bucket by "topics" (top 5 terms) and sum "friends_cnt" inside each bucket.
        TermsAggregationBuilder terms = AggregationBuilders.terms("all").field("topics").size(5)
                .subAggregation(new SumAggregationBuilder("friends_cnt").field("friends_cnt"));
        // The path now points at the sub-aggregation instead of "_count".
        SumBucketPipelineAggregationBuilder sumBucket = new SumBucketPipelineAggregationBuilder("sum", "all>friends_cnt");
        sourceBuilder.aggregation(terms).aggregation(sumBucket);
        // xxIndex, xxType and esClient are placeholders for the project's own index, type and client wrapper.
        SearchRequest request = new SearchRequest(xxIndex)
                .types(xxType)
                .source(sourceBuilder);
        SearchResponse response = esClient.getClient().search(request);
        Map<String, Aggregation> map = response.getAggregations().getAsMap();
        double sum = ((ParsedSimpleValue) map.get("sum")).value();
        return Double.toString(sum);
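
The two buckets_path values used in this post ("all>_count" and "all>friends_cnt") differ only in the part after the ">". As a rough mental model, here is a plain-Java sketch of that lookup. It is not Elasticsearch's implementation: the class BucketsPathModel, its Bucket type and the sumBucket method are hypothetical names invented for this illustration, and the data is copied from the Example 2 result above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of how a sum_bucket path picks a value from each bucket. Not Elasticsearch code. */
public class BucketsPathModel {

    /** One bucket of a terms aggregation: its doc_count plus named metric sub-aggregations. */
    public static final class Bucket {
        final long docCount;
        final Map<String, Double> metrics = new HashMap<>();

        public Bucket(long docCount) {
            this.docCount = docCount;
        }

        public Bucket metric(String name, double value) {
            metrics.put(name, value);
            return this;
        }
    }

    /** Resolves "aggName>metric" and sums that metric over the buckets of aggName. */
    public static double sumBucket(Map<String, List<Bucket>> aggs, String bucketsPath) {
        String[] parts = bucketsPath.split(">");
        if (parts.length != 2 || !aggs.containsKey(parts[0])) {
            throw new IllegalArgumentException("unsupported path in this model: " + bucketsPath);
        }
        String metric = parts[1];
        double total = 0;
        for (Bucket bucket : aggs.get(parts[0])) {
            if ("_count".equals(metric)) {
                total += bucket.docCount;            // special name: the bucket's doc_count
            } else if (bucket.metrics.containsKey(metric)) {
                total += bucket.metrics.get(metric); // value of the named sub-aggregation
            }
            // In this model, buckets that lack the metric are simply skipped.
        }
        return total;
    }

    /** The five buckets from the Example 2 result. */
    public static Map<String, List<Bucket>> sampleAggs() {
        List<Bucket> all = new ArrayList<>();
        all.add(new Bucket(129636).metric("friends_cnt", 55291503));
        all.add(new Bucket(41586).metric("friends_cnt", 21381248));
        all.add(new Bucket(39196).metric("friends_cnt", 14668921));
        all.add(new Bucket(38775).metric("friends_cnt", 19805247));
        all.add(new Bucket(23163).metric("friends_cnt", 10268415));
        Map<String, List<Bucket>> aggs = new HashMap<>();
        aggs.put("all", all);
        return aggs;
    }

    public static void main(String[] args) {
        Map<String, List<Bucket>> aggs = sampleAggs();
        System.out.printf("%.0f%n", sumBucket(aggs, "all>_count"));      // prints 272356
        System.out.printf("%.0f%n", sumBucket(aggs, "all>friends_cnt")); // prints 121415334
    }
}
```

Both totals match the "sum" values returned by ES in Example 1 and Example 2.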

 

