Using RestHighLevelClient to connect a Spring Boot project to Elasticsearch


Querying Elasticsearch data from Spring Boot

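I recently worked on a project that has to query large volumes of data. After some research I chose Elasticsearch to store the data and wired it into a Spring Boot project, which queries it and returns the results through REST endpoints. Fetching the data and inserting it into ES is done by Python scripts, so this post covers only the query side.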

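I. Connecting to Elasticsearch

I use RestHighLevelClient, the client recommended on the official site, which wraps the CRUD operations.

Assuming ES is already deployed on the server, connecting from a Spring Boot project takes roughly three steps:

1. Add the dependencies

For brevity, the unrelated parts of the pom file have been removed. The main point is to pick a suitable ES version; here it is 7.6.0.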

    <properties>
        <java.version>1.8</java.version>
        <es.version>7.6.0</es.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>        
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${es.version}</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>${es.version}</version>
        </dependency>
    </dependencies>

2. Add the configuration to the YAML file

This mainly configures the addresses of the ES nodes and the credentials.

spring:
  elasticsearch:
    rest:
      connection-timeout: 6s
      uris: test-cluster-01:9200,test-cluster-02:9200
      read-timeout: 10s
      # If the cluster is reachable without credentials, the two fields below can be removed
      username: estest
      password: estest

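3. Create a configuration class that builds the RestHighLevelClient at startup

With the @Configuration annotation, a RestHighLevelClient bean is created when the service starts; after that, using it is just a matter of injecting it.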

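import java.util.List;

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ESConfig {

    @Value("${spring.elasticsearch.rest.uris}")
    private List<String> uris;

    // If the cluster is reachable without credentials, the userName and password fields can be removed
    @Value("${spring.elasticsearch.rest.username}")
    private String userName;

    @Value("${spring.elasticsearch.rest.password}")
    private String password;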

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        HttpHost[] httpHosts = createHosts();
        RestClientBuilder restClientBuilder = RestClient.builder(httpHosts)
                .setHttpClientConfigCallback(httpClientBuilder -> {
                    CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
                    // If the cluster is reachable without credentials, the next line can be removed
                    credentialsProvider.setCredentials(AuthScope.ANY,new UsernamePasswordCredentials(userName,password));
                    return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
                });
        return new RestHighLevelClient(restClientBuilder);
    }

    // Supports a multi-node ES cluster: one HttpHost per configured "host:port" entry
    private HttpHost[] createHosts() {
        HttpHost[] httpHosts = new HttpHost[uris.size()];
        for (int i = 0; i < uris.size(); i++) {
            String hostStr = uris.get(i);
            String[] host = hostStr.split(":");
            httpHosts[i] = new HttpHost(host[0].trim(),Integer.parseInt(host[1].trim()));
        }
        return httpHosts;
    }
}
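
The createHosts method above splits each entry on ":" and therefore expects every configured value to be exactly host:port. An entry such as "http://host:9200" or a bare "host" would fail with a NumberFormatException or an ArrayIndexOutOfBoundsException. As a minimal sketch of a more forgiving parser, the following plain-Java helper accepts an optional scheme and falls back to port 9200. EsHostParser and Node are hypothetical names introduced for illustration, not part of the original project, and the sketch does not handle IPv6 literals or trailing paths.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: parses "host", "host:port" or "scheme://host:port". */
final class EsHostParser {

    static final int DEFAULT_PORT = 9200;        // Elasticsearch's default HTTP port
    static final String DEFAULT_SCHEME = "http"; // assumed when no scheme is given

    /** Simple value holder for one parsed node address. */
    static final class Node {
        final String scheme;
        final String host;
        final int port;

        Node(String scheme, String host, int port) {
            this.scheme = scheme;
            this.host = host;
            this.port = port;
        }
    }

    /** Parses a single entry; throws NumberFormatException if the port is not a number. */
    static Node parse(String uri) {
        String rest = uri.trim();
        String scheme = DEFAULT_SCHEME;

        // Strip an optional "scheme://" prefix
        int schemeEnd = rest.indexOf("://");
        if (schemeEnd >= 0) {
            scheme = rest.substring(0, schemeEnd);
            rest = rest.substring(schemeEnd + 3);
        }

        // No colon left means no explicit port
        int colon = rest.lastIndexOf(':');
        if (colon < 0) {
            return new Node(scheme, rest, DEFAULT_PORT);
        }
        String host = rest.substring(0, colon).trim();
        int port = Integer.parseInt(rest.substring(colon + 1).trim());
        return new Node(scheme, host, port);
    }

    /** Parses every entry of the configured list, preserving order. */
    static List<Node> parseAll(List<String> uris) {
        List<Node> nodes = new ArrayList<>();
        for (String uri : uris) {
            nodes.add(parse(uri));
        }
        return nodes;
    }
}
```

Each parsed Node could then be turned into an HttpHost with the three-argument constructor, new HttpHost(node.host, node.port, node.scheme), which would also make HTTPS endpoints configurable.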

II. Queries

The data stored in ES has the structure below; all the queries that follow are based on it.

class Entity {

    private String id;

    private String summary;

    private String name;

    private String introduction;
}

1. Query by id (single index)

GetRequest is used here. It is very convenient, but it can target only one index and only one id, so it cannot fetch documents in batches.

public class EsEntityClient {

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    // Name of the index to query
    private static final String INDEX_NAME = "entity";

    public Entity queryEntityById(String id) {
        GetRequest getRequest = new GetRequest(INDEX_NAME).id(id);

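        Entity entity = null;
        try {
            GetResponse response = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
            // isExists() is false when no document has this id; the source is null in that case
            if (response.isExists()) {
                entity = JSONObject.parseObject(response.getSourceAsString(), Entity.class);
            } else {
                log.warn("can't find entity, id:{}", id);
            }
        } catch (IOException e) {
            log.warn("failed to query entity, id:{}", id, e);
        }
        return entity;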
    }

2. Query by ids (single or multiple indices)

IdsQueryBuilder is used to build the query.

    public List<Entity> queryByIds(List<String> ids) {
        // addIds registers the ids on the builder; ids() is better treated as a read-only getter
        IdsQueryBuilder idsQueryBuilder = QueryBuilders.idsQuery().addIds(ids.toArray(new String[0]));

        SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder.searchSource()
                .query(idsQueryBuilder);
        SearchRequest searchRequest = new SearchRequest().source(searchSourceBuilder)
                // Several indices can be set here
                .indices("idx1","idx2","idx3");

        List<Entity> entities = new ArrayList<>();
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            SearchHit[] hits = searchResponse.getHits().getHits();
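            // Sort by score in descending order (relevance). Float.compare is used because
            // casting the difference to int would truncate any score gap smaller than 1 to 0.
            Arrays.sort(hits, (h1, h2) -> Float.compare(h2.getScore(), h1.getScore()));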
            for (SearchHit hit : hits) {
                String jsonString = hit.getSourceAsString();
                entities.add(JSONObject.parseObject(jsonString, Entity.class));
            }
        } catch (IOException e) {
            log.warn("search by ids failed. ids:{}", ids.toString(), e);
        }
        return entities;
    }
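
One detail worth checking when sorting hits by score: relevance scores are floats, and a comparator of the form (int) (b - a) truncates every difference smaller than 1 to 0, so such hits are treated as equal and keep their incoming order. The sketch below uses plain Float values as stand-ins for SearchHit.getScore() to show the difference from Float.compare. ScoreSortDemo is a hypothetical class written for this illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical demo: compares two ways of ordering float scores in descending order. */
final class ScoreSortDemo {

    /** Descending order via a truncated difference: gaps below 1.0 collapse to 0. */
    static final Comparator<Float> TRUNCATING = (a, b) -> (int) (b - a);

    /** Descending order via Float.compare, which keeps sub-integer differences. */
    static final Comparator<Float> SAFE = (a, b) -> Float.compare(b, a);

    /** Returns a sorted copy so the input array is left untouched. */
    static Float[] sorted(Float[] scores, Comparator<Float> order) {
        Float[] copy = Arrays.copyOf(scores, scores.length);
        Arrays.sort(copy, order);
        return copy;
    }

    public static void main(String[] args) {
        Float[] scores = {0.2f, 0.9f, 0.5f};
        System.out.println(Arrays.toString(sorted(scores, TRUNCATING)));
        System.out.println(Arrays.toString(sorted(scores, SAFE)));
    }
}
```

Elasticsearch already returns hits sorted by _score in descending order when no explicit sort is requested, so the client-side sort mainly matters if results from several requests are merged.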

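3. Query by id (multiple indices)

The problem in this project was that the crawler stores data from different sources in different indices, so a single id sent by the front end may have to be looked up in several indices. The GetRequest approach above no longer works in that case. (Looping over the indices is possible, but with many indices the IO overhead is high and the endpoint responds slowly.)

So the approach I chose is...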

    public Entity queryById(String id) {
        List<Entity> result = queryByIds(Collections.singletonList(id));
        if (CollectionUtils.isEmpty(result)) {
            log.warn("can't find entity by id:{}", id);
            return null;
        }
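        // Returning result.get(0) directly should be enough, but it did not work without this
        // round trip through JSON; it looks like a bug in Alibaba's fastjson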
        return JSONObject.parseObject(JSONObject.toJSONString(result.get(0)), Entity.class);
    }

4. Exact match on name

    public List<Entity> queryEntityByName(String name) {
        BoolQueryBuilder queryBuilder = new BoolQueryBuilder();

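        // termQuery with "<field>.keyword" as its first argument gives an exact match on that field
        // (this relies on the field having a keyword sub-field, as default dynamic mapping creates for strings)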
        queryBuilder.filter(QueryBuilders.termQuery("name" + ".keyword", name));
        SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder.searchSource().query(queryBuilder).size(20);
        SearchRequest searchRequest = new SearchRequest().source(searchSourceBuilder).indices(INDEX_NAME);

        List<Entity> entities = new ArrayList<>();
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            SearchHit[] hits = searchResponse.getHits().getHits();
            Arrays.sort(hits, (h1, h2) -> Float.compare(h2.getScore(), h1.getScore()));
            for (SearchHit hit : hits) {
                String jsonString = hit.getSourceAsString();
                entities.add(JSONObject.parseObject(jsonString, Entity.class));
            }
        } catch (IOException e) {
            log.warn("search entities failed. name:{}", name, e);
        }
        return entities;
    }

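5. Multi-field fuzzy search

Each field is matched loosely against its own keyword. Note that the code uses match_phrase, which is analyzed phrase matching rather than typo-tolerant fuzzy matching.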

    public List<Entity> query(String name, String summary, String introduction) {
        BoolQueryBuilder queryBuilder = buildFuzzQueryBuilder(name, summary, introduction);
        // Hard-coded to 100 results for now
        SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder.searchSource().query(queryBuilder).size(100);
        SearchRequest searchRequest = new SearchRequest().source(searchSourceBuilder);
        // Set the indices to search
        searchRequest.indices("idx1","idx2","idx3");

        List<Entity> entities = new ArrayList<>();
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            SearchHit[] hits = searchResponse.getHits().getHits();
            Arrays.sort(hits, (h1, h2) -> Float.compare(h2.getScore(), h1.getScore()));
            for (SearchHit hit : hits) {
                String jsonString = hit.getSourceAsString();
                entities.add(JSONObject.parseObject(jsonString, Entity.class));
            }
        } catch (IOException e) {
            log.warn("search failed.", e);
        }
        return entities;
    }

    // Build the query: every non-empty argument adds one clause, and all clauses must match
    private BoolQueryBuilder buildFuzzQueryBuilder(String name, String summary, String introduction) {
        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
        if (Strings.isNotEmpty(name)) {
            // Loose match: phrase match against the analyzed "name" field
            MatchPhraseQueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("name", name);
            boolQueryBuilder.filter(queryBuilder);
        }

        if (Strings.isNotEmpty(summary)) {
            // Loose match: phrase match against the analyzed "summary" field
            MatchPhraseQueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("summary", summary);
            boolQueryBuilder.filter(queryBuilder);
        }

        if (Strings.isNotEmpty(introduction)) {
            // Loose match: phrase match against the analyzed "introduction" field
            MatchPhraseQueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("introduction", introduction);
            boolQueryBuilder.filter(queryBuilder);
        }
        return boolQueryBuilder;
    }
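
To make the behavior of matchPhraseQuery more concrete: a match_phrase query analyzes the input and, with the default slop of 0, requires the resulting terms to appear next to each other and in the same order in the field. It does not tolerate typos. The toy model below illustrates that rule with plain Java strings. It is only a sketch: PhraseMatchDemo is a hypothetical class, and its whitespace tokenizer is a crude stand-in for a real Elasticsearch analyzer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical toy model of term matching versus phrase matching. */
final class PhraseMatchDemo {

    /** Lowercases and splits on whitespace; real analyzers do considerably more. */
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    /** True if every query token occurs somewhere in the field, in any order. */
    static boolean matchesAllTerms(String field, String query) {
        return tokenize(field).containsAll(tokenize(query));
    }

    /** True if the query tokens occur contiguously and in order (phrase semantics, slop 0). */
    static boolean matchesPhrase(String field, String query) {
        List<String> fieldTokens = tokenize(field);
        List<String> queryTokens = tokenize(query);
        if (queryTokens.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(fieldTokens, queryTokens) >= 0;
    }

    public static void main(String[] args) {
        String field = "Elasticsearch in Action second edition";
        System.out.println(matchesPhrase(field, "in action"));   // same order, adjacent
        System.out.println(matchesPhrase(field, "action in"));   // wrong order
        System.out.println(matchesAllTerms(field, "action in")); // order ignored
    }
}
```

If tolerance for misspellings is what is actually needed, the fuzzy query (QueryBuilders.fuzzyQuery) or a match query with its fuzziness option may be a closer fit than match_phrase.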

A simple demo implementation has been uploaded to https://github.com/bupt-yanch/spring-elasticsearch-demo

