Using RestHighLevelClient with Spring Boot to work with Elasticsearch

I already covered integrating Elasticsearch 6.3 in an earlier post, but looking back that approach was more complicated than it needed to be. This time I am using the official
Java High Level REST Client
See the official documentation for the full API. As usual, this post sets everything up with Docker.

Environment setup

Install Elasticsearch 7.1

Pull the image

docker pull elasticsearch:7.1.0


Run the image

The maximum log file size is capped here, because this service's log files get very large.

docker run -d --log-opt max-size=10m -e "discovery.type=single-node" --name es -p 9200:9200 -p 9300:9300 elasticsearch:7.1.0

Enter the container and install the Chinese IK analyzer

docker exec -it es /bin/bash


The IK analyzer lives at https://github.com/medcl/elasticsearch-analysis-ik/; download the version that matches your Elasticsearch. The command below is long, so paste it onto a single line before running it.

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.1.0/elasticsearch-analysis-ik-7.1.0.zip

Add password authentication (optional)

Still working inside the container, add the security settings to elasticsearch.yml under /usr/share/elasticsearch/config (a sketch of the typical setting follows).

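For Elasticsearch 7.1 the key setting to enable basic authentication is usually just the following (a typical example of the X-Pack security setting rather than the exact configuration from the original setup, so adjust it to your own cluster):

xpack.security.enabled: true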

Restart Elasticsearch

After the restart, go back into the container and set up the passwords for the built-in users

elasticsearch-setup-passwords interactive

Enter a password for each user when prompted.


Install Kibana 7.1 (optional)

Pull the image

docker pull kibana:7.1.0


Run the image

docker run -d -it   --name=kibana -p 5601:5601  kibana:7.1.0

If you set a password in the previous step, you also need to edit Kibana's configuration file

# Edit the Kibana configuration file kibana.yml
vi /usr/share/kibana/config/kibana.yml
# Add the following
elasticsearch.username: "elastic"
elasticsearch.password: "*****"

Restart Kibana

Visit http://localhost:5601 to verify the result.

Spring Boot integration
  1. First add the Maven dependencies; this post uses the high level client

     <!-- es -->
            <dependency>
                <groupId>org.elasticsearch.client</groupId>
                <artifactId>elasticsearch-rest-high-level-client</artifactId>
                <version>7.1.0</version>
            </dependency>

            <dependency>
                <groupId>org.elasticsearch</groupId>
                <artifactId>elasticsearch</artifactId>
                <version>7.1.0</version>
            </dependency>
    
  2. Configure Elasticsearch

    1. First add the common settings in application.yml

      elasticsearch:
        host: localhost
        port: 9200
        connTimeout: 3000
        socketTimeout: 5000
        connectionRequestTimeout: 500
      # index name
        index-name: contentik
        username: elastic
        password: 123456
      
      
    2. Create the Elasticsearch configuration class

      All subsequent operations go through this client bean (a short injection sketch follows the class below).

      package com.pgy.esdemo.config;
      
      import org.apache.http.HttpHost;
      import org.apache.http.auth.AuthScope;
      import org.apache.http.auth.UsernamePasswordCredentials;
      import org.apache.http.client.CredentialsProvider;
      import org.apache.http.impl.client.BasicCredentialsProvider;
      import org.apache.http.impl.nio.client.HttpAsyncClientBuilder;
      import org.elasticsearch.client.RestClient;
      import org.elasticsearch.client.RestClientBuilder;
      import org.elasticsearch.client.RestHighLevelClient;
      import org.springframework.beans.factory.annotation.Value;
      import org.springframework.context.annotation.Bean;
      import org.springframework.context.annotation.Configuration;
      
      /**
       * @Author: Kevin
       * @Description:
       * @Date: create in 2021/5/11 15:21
       */
      @Configuration
      public class ElasticsearchConfiguration {
          @Value("${elasticsearch.host}")
          private String host;
      
          @Value("${elasticsearch.port}")
          private int port;
      
          @Value("${elasticsearch.connTimeout}")
          private int connTimeout;
      
          @Value("${elasticsearch.socketTimeout}")
          private int socketTimeout;
      
          @Value("${elasticsearch.connectionRequestTimeout}")
          private int connectionRequestTimeout;
      
          @Value("${elasticsearch.username}")
          private String USERNAME;
      
          @Value("${elasticsearch.password}")
          private String PASSWORD;
      
      
          @Bean(destroyMethod = "close", name = "client")
          public RestHighLevelClient initRestClient() {
      
              // These two lines can be skipped if no password is configured
              final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
              credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(USERNAME, PASSWORD));
      
              RestClientBuilder builder = RestClient.builder(new HttpHost(host, port))
                      .setRequestConfigCallback(requestConfigBuilder -> requestConfigBuilder
                              .setConnectTimeout(connTimeout)
                              .setSocketTimeout(socketTimeout)
                              .setConnectionRequestTimeout(connectionRequestTimeout))
                      // This callback can be omitted if no password is configured
                      .setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
                          @Override
                          public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
                              httpClientBuilder.disableAuthCaching();
                              return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
                          }
                      });
      
              return new RestHighLevelClient(builder);
          }
      
      }
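
      To show how the bean is consumed, here is a minimal injection sketch (the service class and field names are illustrative, not from the original code):

      import org.elasticsearch.client.RestHighLevelClient;
      import org.springframework.beans.factory.annotation.Value;
      import org.springframework.stereotype.Service;

      @Service
      public class ContentSearchService {

          // the client bean defined in ElasticsearchConfiguration above
          private final RestHighLevelClient client;

          // index name from application.yml
          @Value("${elasticsearch.index-name}")
          private String indexName;

          public ContentSearchService(RestHighLevelClient client) {
              this.client = client;
          }
      }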
      
      
    3. Create the index

      Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-document-index.html

      // uses org.elasticsearch.client.indices.CreateIndexRequest / CreateIndexResponse
      public void createContentIk() throws IOException {
          // Elasticsearch request objects; indexName is your index name
          CreateIndexRequest request = new CreateIndexRequest(indexName);

          // The index fields and their types. getField() below builds the per-field mapping;
          // for text fields I add the IK analyzer so they are tokenized for Chinese.
          Map<String, Object> properties = new HashMap<>();
          properties.put("contentId", getField("keyword"));
          properties.put("contentName", getField("text"));
          properties.put("contentIntroduction", getField("text"));
          properties.put("coverImage", getField("keyword"));
          properties.put("contentSonType", getField("keyword"));
          properties.put("pageTemplateType", getField("keyword"));
          properties.put("sheetStatus", getField("keyword"));
          properties.put("templateId", getField("keyword"));
          properties.put("modleName", getField("keyword"));
          properties.put("content", getField("text"));

          Map<String, Object> mapping = new HashMap<>();
          mapping.put("properties", properties);

          // mapping that will be applied when the index is created
          request.mapping(mapping);

          // execute the create request
          CreateIndexResponse createIndexResponse = client.indices().create(request, RequestOptions.DEFAULT);
          // check whether the request was acknowledged
          boolean acknowledged = createIndexResponse.isAcknowledged();
          log.info("acknowledged:{}", acknowledged);
          boolean shardsAcknowledged = createIndexResponse.isShardsAcknowledged();
          log.info("shardsAcknowledged:{}", shardsAcknowledged);
      }
      
        private static Map<String, Object> getField(String type) {
            Map<String, Object> result = new HashMap<>();
            result.put("type", type);
            if ("text".equals(type)) {
                result.put("analyzer", "ik_max_word");
                result.put("search_analyzer", "ik_max_word");
                // result.put("ignore_above", "");
                // Added because my stored content is very long; without it you get
                // "Exceeding maximum length of field in elasticsearch" errors in Kibana
                result.put("term_vector", "with_positions_offsets");
            }

            return result;
        }
      

      Check the result in Kibana (for example with GET contentik/_mapping in the Dev Tools console).


    4. Store data

      https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-document-bulk.html

      Here I let Elasticsearch generate the document id itself instead of setting one manually (a single-document variant is sketched after the method).

          public Boolean save(XmyContentMianSearchVo vo) {
              BulkRequest request = new BulkRequest();
              // no id is set on the IndexRequest, so Elasticsearch generates one
              request.add(new IndexRequest(indexName)
                      .source(JSONObject.toJSONString(vo), XContentType.JSON));
              BulkResponse bulkResponse;
              try {
                  bulkResponse = client.bulk(request, RequestOptions.DEFAULT);
              } catch (IOException e) {
                  log.error("bulk save failed", e);
                  return false;
              }
              // hasFailures() is true if any item in the bulk request failed
              boolean success = !bulkResponse.hasFailures();
              log.info("bulk save succeeded: {}", success);
              return success;
          }
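
      If you only ever index one document at a time, the bulk wrapper is not strictly needed; a minimal single-document sketch (my own addition, reusing the same client and indexName) would look like this:

          public String saveOne(XmyContentMianSearchVo vo) throws IOException {
              // index a single document and let Elasticsearch generate the id
              IndexRequest request = new IndexRequest(indexName)
                      .source(JSONObject.toJSONString(vo), XContentType.JSON);
              IndexResponse response = client.index(request, RequestOptions.DEFAULT);
              // the generated document id
              return response.getId();
          }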
      
    5. Delete

      https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-document-delete.html

      My approach is to look up the Elasticsearch document id by my entity id, then delete by that document id (a delete-by-query alternative is sketched after the lookup method).

      public void delete(String id) {
          String esIdbyContentId = getEsIdbyContentId(id);
          DeleteRequest deleteRequest = new DeleteRequest(indexName, esIdbyContentId);
          DeleteResponse delete;
          try {
              delete = client.delete(deleteRequest, RequestOptions.DEFAULT);
          } catch (IOException e) {
              log.error("delete failed", e);
              return;
          }

          DocWriteResponse.Result result = delete.getResult();
          String s = checkResponse(result);
          log.info(s);
      }
      
      public String getEsIdbyContentId(String contentId) {
          String id = "";
          SearchRequest request = new SearchRequest(indexName);
          // build the query: match on the entity's contentId field
          SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
          BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
          boolQueryBuilder.should(matchQuery("contentId", contentId));
          sourceBuilder.query(boolQueryBuilder);
          request.source(sourceBuilder);
          SearchResponse response;
          try {
              response = client.search(request, RequestOptions.DEFAULT);
          } catch (IOException e) {
              log.error("lookup by contentId failed", e);
              return id;
          }
          SearchHit[] hits = response.getHits().getHits();
          // take the id of the last (typically only) matching document
          for (SearchHit hit : hits) {
              id = hit.getId();
          }
          return id;
      }
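
      The two-step lookup above works, but the same delete can also be done in one request with a delete-by-query. Here is a minimal sketch (my own addition, assuming contentId is the keyword field from the mapping; it needs org.elasticsearch.index.reindex.DeleteByQueryRequest and BulkByScrollResponse):

      public void deleteByContentId(String contentId) throws IOException {
          // delete every document whose contentId matches, in a single round trip
          DeleteByQueryRequest request = new DeleteByQueryRequest(indexName);
          request.setQuery(QueryBuilders.termQuery("contentId", contentId));
          BulkByScrollResponse response = client.deleteByQuery(request, RequestOptions.DEFAULT);
          log.info("deleted {} documents", response.getDeleted());
      }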
      
    6. Update

      https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-document-get.html

      The approach is much the same as delete: first find the Elasticsearch document id by the entity id, then update the document's content.

      // vo is added as a parameter here; in the original it came from the surrounding service code
      public String update(String id, XmyContentMianSearchVo vo) {
          String esIdbyContentId = getEsIdbyContentId(id);
          if (org.apache.commons.lang3.StringUtils.isBlank(esIdbyContentId)) {
              log.info("no Elasticsearch document found for update, id: {}", id);
              return "no Elasticsearch document found for update";
          }
          // build whatever new content you need; this is just a placeholder
          vo.setContent(StringUtils.join("your own content", ","));
          UpdateRequest updateRequest = new UpdateRequest(indexName, esIdbyContentId);
          updateRequest.doc(JSON.toJSONString(vo), XContentType.JSON);
          UpdateResponse update;
          try {
              update = client.update(updateRequest, RequestOptions.DEFAULT);
          } catch (IOException e) {
              log.error("update failed", e);
              return "update failed";
          }
          String s = checkResponse(update.getResult());
          log.info(s);
          return s;
      }
      
    7. Search

      Searching breaks down into many query types; the official API documentation covers them in detail.

      https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-search.html

      QueryBuilders provides a factory method for each of the different query types.


A closer look at the query API

If you have read this far and the query API still feels fuzzy, let's go through it in more detail.

Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/6.3/java-rest-high-query-builders.html

Full Text Queries

What is a full text query?

High-level queries such as match and query_string are full text queries. Depending on the field being queried:

  • Querying a date or integer field treats the query string as a date or integer respectively.
  • Querying a not_analyzed exact-value string field treats the whole query string as a single term.
  • Querying an analyzed full-text field first passes the query string through the appropriate analyzer, producing a list of terms to query with.

Once the term list is built, a low-level query is run for each term, the results are merged, and a final relevance score is computed for each document.


What are the steps of a match query on a single term? (A minimal code sketch follows this list.)

  1. Check the field type to see whether it is analyzed or not_analyzed.
  2. Analyze the query string; if it yields only a single term, the match query executes as a single low-level term query.
  3. Find matching documents by looking the term up in the inverted index and fetching the set of documents that contain it.
  4. Score each document.
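
To make that concrete, here is a minimal match query sketch (assuming the same client, indexName, and imports as the earlier snippets; the field name and text are illustrative):

// inside a method that declares throws IOException
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder()
        .query(QueryBuilders.matchQuery("contentName", "some search text"))
        .from(0)
        .size(10);
SearchRequest request = new SearchRequest(indexName).source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
for (SearchHit hit : response.getHits().getHits()) {
    // hit.getSourceAsString() is the original JSON document
    System.out.println(hit.getSourceAsString());
}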

Full text query API list

The full list of query types, builder classes, and factory methods is as follows:

Search Query        | QueryBuilder Class            | Method in QueryBuilders
--------------------|-------------------------------|---------------------------------------
Match               | MatchQueryBuilder             | QueryBuilders.matchQuery()
Match Phrase        | MatchPhraseQueryBuilder       | QueryBuilders.matchPhraseQuery()
Match Phrase Prefix | MatchPhrasePrefixQueryBuilder | QueryBuilders.matchPhrasePrefixQuery()
Multi Match         | MultiMatchQueryBuilder        | QueryBuilders.multiMatchQuery()
Common Terms        | CommonTermsQueryBuilder       | QueryBuilders.commonTermsQuery()
Query String        | QueryStringQueryBuilder       | QueryBuilders.queryStringQuery()
Simple Query String | SimpleQueryStringBuilder      | QueryBuilders.simpleQueryStringQuery()
Term-level queries

These queries are not analyzed. They operate on single terms, looking up the exact term in the inverted index (exact matching) and using TF/IDF to compute a relevance score _score for each document containing the term. (A combined code sketch follows the Wildcard section below.)

Term

A term query is used for exact-value matching; the exact value can be a number, a date, a boolean, or a not_analyzed string.

The corresponding QueryBuilder class is TermQueryBuilder

The concrete method is QueryBuilders.termQuery()

Terms

A terms query allows multiple values to be matched. If the field contains any one of the specified values, the document matches.

The corresponding QueryBuilder class is TermsQueryBuilder

The concrete method is QueryBuilders.termsQuery()

Wildcard

A wildcard query is a low-level, term-based query that lets you specify a pattern to match, using the standard shell wildcards:

  • ? matches any single character
  • * matches zero or more characters

A wildcard query has to scan the term list in the inverted index to find all matching terms, and then fetch the document IDs for each term in turn.

Because wildcards and regular expressions can only be evaluated at query time, they are relatively slow; use them with care where performance matters.

The corresponding QueryBuilder class is WildcardQueryBuilder

The concrete method is QueryBuilders.wildcardQuery()
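
As mentioned above, here is a small combined sketch of term-level queries (my own example; the field names reuse the keyword fields from the mapping and the values are illustrative):

// exact match on a single keyword value
QueryBuilder byId = QueryBuilders.termQuery("contentId", "1001");
// match if the field equals any of the given values
QueryBuilder byIds = QueryBuilders.termsQuery("contentId", "1001", "1002", "1003");
// shell-style wildcard on a keyword field; leading wildcards are especially slow
QueryBuilder byTemplate = QueryBuilders.wildcardQuery("modleName", "temp*");

SearchSourceBuilder source = new SearchSourceBuilder()
        .query(QueryBuilders.boolQuery().should(byId).should(byIds).should(byTemplate));
SearchResponse resp = client.search(new SearchRequest(indexName).source(source), RequestOptions.DEFAULT);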

Term-level query API list
Search Query | QueryBuilder Class   | Method in QueryBuilders
-------------|----------------------|-----------------------------
Term         | TermQueryBuilder     | QueryBuilders.termQuery()
Terms        | TermsQueryBuilder    | QueryBuilders.termsQuery()
Range        | RangeQueryBuilder    | QueryBuilders.rangeQuery()
Exists       | ExistsQueryBuilder   | QueryBuilders.existsQuery()
Prefix       | PrefixQueryBuilder   | QueryBuilders.prefixQuery()
Wildcard     | WildcardQueryBuilder | QueryBuilders.wildcardQuery()
Regexp       | RegexpQueryBuilder   | QueryBuilders.regexpQuery()
Fuzzy        | FuzzyQueryBuilder    | QueryBuilders.fuzzyQuery()
Type         | TypeQueryBuilder     | QueryBuilders.typeQuery()
Ids          | IdsQueryBuilder      | QueryBuilders.idsQuery()
Compound queries
What is a compound query?

A compound query wraps other compound queries or leaf queries, nesting them for execution, and merges the results and scores of its sub-queries. The main kinds are listed below (a minimal bool query sketch follows the list):

  • constant_score query

    Often used with filters; every matching document gets the same constant score.

  • bool query

    Combines multiple leaf and compound queries. It accepts the following clauses:

    • must : the document must match these conditions to be included
    • must_not : the document must not match these conditions to be included
    • should : matching any of these clauses increases the score; not matching has no effect
    • filter : runs in filter context; must match, but does not contribute to the score
  • dis_max query

    The "disjunction max" query returns every document that matches any sub-query, but uses only the score of the best-matching sub-query as the final score.

  • function_score query

    Applies a function to every document matched by the main query, which can modify or even replace the original score.

  • boosting query

    Used to raise or lower the weight of sub-queries within a compound query.
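
Here is the minimal bool query sketch mentioned above, combining the different clause types (my own example; the field names come from the mapping earlier and the values are illustrative):

BoolQueryBuilder bool = QueryBuilders.boolQuery()
        // must: required and contributes to the score
        .must(QueryBuilders.matchQuery("content", "spring boot"))
        // filter: required but not scored
        .filter(QueryBuilders.termQuery("sheetStatus", "1"))
        // should: optional, raises the score when it matches
        .should(QueryBuilders.matchPhraseQuery("contentName", "spring boot"))
        // must_not: excludes matching documents
        .mustNot(QueryBuilders.termQuery("contentSonType", "draft"));

SearchResponse resp = client.search(
        new SearchRequest(indexName).source(new SearchSourceBuilder().query(bool)),
        RequestOptions.DEFAULT);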

Compound query list
Search Query   | QueryBuilder Class        | Method in QueryBuilders
---------------|---------------------------|------------------------------------
Constant Score | ConstantScoreQueryBuilder | QueryBuilders.constantScoreQuery()
Bool           | BoolQueryBuilder          | QueryBuilders.boolQuery()
Dis Max        | DisMaxQueryBuilder        | QueryBuilders.disMaxQuery()
Function Score | FunctionScoreQueryBuilder | QueryBuilders.functionScoreQuery()
Boosting       | BoostingQueryBuilder      | QueryBuilders.boostingQuery()
Special queries
Wrapper Query

The important one here is the Wrapper Query, which accepts any other query as a JSON (or base64-encoded) string sub-query.

Its main use with the Java High Level REST Client is taking a JSON string as the query: build the query JSON with a library such as gson, drop it into a Wrapper Query, and run it. Very convenient.

The corresponding QueryBuilder class for Wrapper Query is WrapperQueryBuilder

The concrete method is QueryBuilders.wrapperQuery()
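
A minimal wrapper query sketch (the JSON is hand-written here for illustration; in practice it would come from a JSON library or an external caller):

// the whole query body as a plain JSON string
String json = "{\"match\": {\"content\": \"spring boot\"}}";

SearchSourceBuilder source = new SearchSourceBuilder()
        .query(QueryBuilders.wrapperQuery(json));
SearchResponse resp = client.search(new SearchRequest(indexName).source(source), RequestOptions.DEFAULT);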

Below are two searches I wrote: one paginates with the scroll API, and one queries everything from start to end in a single request. Each has its advantages; see https://www.jianshu.com/p/14aa8b09c789 for a comparison.

    /**
     * @param text  the text to search for
     * @param flag  whether to match against the content field only; true means yes
     * @param field the field names to match against
     * @return the matching documents
     */
public List<XmyContentMianSearchVo> searchES(String text, Boolean flag, String... field) {
        SearchRequest request = new SearchRequest(indexName);
        // build the query
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
        if (flag) {
            boolQueryBuilder.should(matchQuery(field[2], text))
                    .should(matchPhraseQuery(field[2], text));
        } else {
            boolQueryBuilder.should(matchQuery(field[0], text))
                    .should(matchQuery(field[1], text));
        }

        // note: from + size may not exceed index.max_result_window (10000 by default),
        // so a size this large requires raising that index setting
        sourceBuilder.from(0).size(50000);
        sourceBuilder.query(boolQueryBuilder);
        request.source(sourceBuilder);
        SearchResponse response;
        try {
            response = client.search(request, RequestOptions.DEFAULT);
        } catch (IOException e) {
            log.error("search failed", e);
            return new ArrayList<>();
        }
        long value = response.getHits().getTotalHits().value;
        List<XmyContentMianSearchVo> res = new ArrayList<>();
        if (value > 0) {
            SearchHit[] hits = response.getHits().getHits();
            for (SearchHit hit : hits) {
                res.add(JSONObject.parseObject(hit.getSourceAsString(), XmyContentMianSearchVo.class));
            }
        }
        return res;
    }


// paginated (scroll) search
/**
     * @param text       the text to search for
     * @param sid        the scroll id: blank for the first page, otherwise the id returned by the previous call
     * @param size       page size
     * @param type       query type
     * @param contentIds entity id list used to restrict the results
     * @param field      the field names to match against
     * @return the current page of results together with the scroll id for the next call
     */
public ResScrollerDto scrollers(String text, String sid, Integer size, String type, List<String> contentIds, String... field) {
        if (StringUtils.isBlank(text)) {
            text = "";
        }
        SearchRequest searchRequest = new SearchRequest(indexName);
        SearchHit[] hits = null;
        String scrollId = null;
        if (org.apache.commons.lang3.StringUtils.isNotBlank(sid)) {
            SearchScrollRequest scrollRequest = new SearchScrollRequest(sid);
            scrollRequest.scroll(TimeValue.timeValueSeconds(1));
            SearchResponse searchScrollResponse = null;
            try {
                searchScrollResponse = client.scroll(scrollRequest, RequestOptions.DEFAULT);
            } catch (Exception e) {
                log.error("scroll paging failed", e);
                return new ResScrollerDto();
            }
            scrollId = searchScrollResponse.getScrollId();
            hits = searchScrollResponse.getHits().getHits();

        } else {

            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();

            if (StringUtils.equals(type, "9")){
                boolQueryBuilder.should(matchQuery(field[0], text))
                        .should(matchQuery(field[1], text))
                        .should(matchQuery(field[2], text))
                        .should(matchPhraseQuery(field[2], text))
                ;

            }else{
                boolQueryBuilder.should(matchQuery(field[0], text))
                        .should(matchQuery(field[1], text))
                        .should(matchQuery(field[2], text))
                        .should(matchPhraseQuery(field[2], text))
                        .must(termsQuery("contentId", contentIds))
                ;
            }

            searchSourceBuilder.query(boolQueryBuilder);
            searchSourceBuilder.size(size);
            searchRequest.source(searchSourceBuilder);
            // set the scroll context keep-alive
            searchRequest.scroll(TimeValue.timeValueSeconds(1));
            SearchResponse searchResponse = null;
            try {
                searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
            } catch (IOException e) {
                log.error("scroll search failed", e);
                return new ResScrollerDto();
            }
            scrollId = searchResponse.getScrollId();
            hits = searchResponse.getHits().getHits();
        }


        List<XmyContentMianSearchVo> res = new ArrayList<>();
        for (SearchHit hit : hits) {
            XmyContentMianSearchVo xmyContentMianSearchVo = JSONObject.parseObject(hit.getSourceAsString(), XmyContentMianSearchVo.class);
            res.add(xmyContentMianSearchVo);
        }

        ResScrollerDto dto = new ResScrollerDto();
        dto.setSid(res.size() == 0 ? "" : scrollId);
//        dto.setRes(res);
        if (StringUtils.equals(type, "1")) {
            dto.setDettContentList(res);
        } else if (StringUtils.equals(type, "2")) {
            dto.setCloudContentList(res);
        }else {
            dto.setDettContentList(res);
        }
        log.info("分頁:{}", dto);
        return dto;
    }
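
One thing the scroll method above never does is release the scroll context once the caller has read the last page. Here is a small sketch of how that could be added (my own addition, reusing the same client):

public void clearScroll(String scrollId) throws IOException {
    // free the server-side scroll context once paging is finished
    ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
    clearScrollRequest.addScrollId(scrollId);
    ClearScrollResponse clearScrollResponse = client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
    log.info("scroll context cleared: {}", clearScrollResponse.isSucceeded());
}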

