Elasticsearch version: 7.8
The requirement is to page through the data in an index with duplicates removed, similar to DISTINCT in MySQL. The collapse parameter in Elasticsearch can do this.
You can use the collapse parameter to collapse search results based on field values. The collapsing is done by selecting only the top sorted document per collapse key.
Note:
The total number of hits in the response indicates the number of matching documents without collapsing. The total number of distinct groups is unknown.
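So how do you get the total count after deduplication? Use the cardinality aggregation on the same field. Keep in mind that cardinality returns an approximate distinct count, not an exact one.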
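To make these semantics concrete, here is a minimal in-memory sketch in plain Java. It does not call Elasticsearch at all; it only imitates what the text above describes: sort the hits, keep the top document per collapse key, apply from/size to the collapsed groups, and count distinct keys separately. The `Doc` class and the `collapsedPage` and `distinctCount` methods are hypothetical names made up for this illustration, and the count here is exact, whereas the real cardinality aggregation is approximate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CollapseSketch {

    /** Hypothetical document holding the two fields used in this article's query. */
    static final class Doc {
        final String appId;
        final long createTime;

        Doc(String appId, long createTime) {
            this.appId = appId;
            this.createTime = createTime;
        }
    }

    /** Sorts by createTime descending, keeps the first document per appId, then applies from/size. */
    static List<Doc> collapsedPage(List<Doc> hits, int from, int size) {
        List<Doc> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingLong((Doc d) -> d.createTime).reversed());

        // LinkedHashMap keeps insertion order, so groups stay in sort order.
        Map<String, Doc> topPerKey = new LinkedHashMap<>();
        for (Doc d : sorted) {
            topPerKey.putIfAbsent(d.appId, d);
        }

        // Pagination is applied to the collapsed groups, not to the raw hits.
        List<Doc> groups = new ArrayList<>(topPerKey.values());
        int start = Math.min(from, groups.size());
        int end = Math.min(start + size, groups.size());
        return new ArrayList<>(groups.subList(start, end));
    }

    /** Exact distinct count of appId (stand-in for the approximate cardinality aggregation). */
    static long distinctCount(List<Doc> hits) {
        return hits.stream().map(d -> d.appId).distinct().count();
    }

    public static void main(String[] args) {
        List<Doc> hits = Arrays.asList(
                new Doc("a", 10), new Doc("b", 30), new Doc("a", 40),
                new Doc("c", 20), new Doc("b", 5));

        System.out.println("total hits (not collapsed): " + hits.size());
        System.out.println("distinct app_id values: " + distinctCount(hits));
        for (Doc d : collapsedPage(hits, 0, 2)) {
            System.out.println(d.appId + " @ " + d.createTime);
        }
    }
}
```

With five raw hits spread over three app_id values, the hit total stays at five while the distinct count is three, which is why the number of pages has to be derived from the aggregation result rather than from the hit total.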
DSL example:
{
  "from": 0,
  "size": 5,
  "sort": [
    { "createTime": { "order": "desc" } }
  ],
  "collapse": {
    "field": "app_id"
  },
  "aggs": {
    "total_size": {
      "cardinality": { "field": "app_id" }
    }
  }
}
Java API example:
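// CREATE_TIME, APP_ID and TOTAL_COUNT_KEY are String constants holding the field and aggregation names.
// Sort by creation time, newest first.
SortBuilder<?> sortBuilder = SortBuilders.fieldSort(CREATE_TIME).order(SortOrder.DESC);
// Keep only the top sorted document per APP_ID value.
CollapseBuilder collapseBuilder = new CollapseBuilder(APP_ID);
// Approximate number of distinct APP_ID values, used as the deduplicated total.
AggregationBuilder aggregation = AggregationBuilders.cardinality(TOTAL_COUNT_KEY).field(APP_ID);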