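I. Concepts
1.1 Basic concepts
ELK: an acronym formed from the initials of three products: ElasticSearch, LogStash and Kibana.
Lucene: Apache's full-text search engine toolkit.
Elasticsearch: a document-oriented, schema-free data store built on the Lucene full-text search engine. All configuration, monitoring and data operations go through a RESTful interface, and the data format is JSON. Out of the box it supports automatic node discovery, automatic data replication, automatic distribution and scaling, and automatic load balancing. It searches data sets of tens of millions of documents very efficiently and scales further as nodes are added. In short, Elasticsearch is Lucene plus a RESTful interface and distributed-systems machinery.
Elasticsearch: HTTP access uses port 9200 by default.
Elasticsearch: TCP (transport protocol) access uses port 9300 by default.
Four ways to operate Elasticsearch:
Kibana: uses HTTP
Native API (TransportClient): uses TCP
RestAPI: uses HTTP
SDE (SpringDataElasticsearch): uses TCP
The binary TCP transport generally has less overhead than HTTP. Note, however, that the TransportClient was deprecated in Elasticsearch 7.0 and removed in 8.0, so new projects should prefer the HTTP clients.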
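Because every operation is exposed over HTTP on port 9200, any HTTP client can talk to Elasticsearch. The following is a minimal sketch, assuming Java 11 or later and a node on 127.0.0.1. The class and method names are made up for illustration, and the index, type and ID (ahd, goods, 1) are example values. It only builds the request; sending it requires a running node.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of any Elasticsearch client library.
final class EsHttpSketch {

    static final int HTTP_PORT = 9200; // default HTTP port

    /** Builds a GET request for one document: /{index}/{type}/{id}. */
    static HttpRequest getDocument(String host, String index, String type, String id) {
        URI uri = URI.create("http://" + host + ":" + HTTP_PORT + "/" + index + "/" + type + "/" + id);
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = getDocument("127.0.0.1", "ahd", "goods", "1");
        System.out.println(request.method() + " " + request.uri());
        // To send it against a running node:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Kibana's console and the RestAPI client issue the same kind of request under the hood.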
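1.2 Elasticsearch concepts
Index: the logical area where data is stored, similar to a database in a relational system; it is the namespace for documents. In the figure below (the cyan part) the index is twitter.
Type: similar to a table in a relational database; it is JSON data containing a series of fields and groups documents with similar fields. In the figure below (the yellow part) the type is tweet. Fields with the same name in different documents must have the same data type.
Document: the stored entity, similar to a row in a relational database; a concrete record made up of a set of fields. In the figure below (the orange part) it contains three fields: user, post_date and message.
Field: equivalent to a column in a relational database; a component of a document consisting of two parts, a name and a value. In the figure below (the purple part), post_date together with its value is one field.
Mapping: stores the mapping information for fields; different document types have different mappings.
Term: an indivisible word, the smallest unit of search. Different analyzers produce different results for the same content, and therefore different terms.
Token: one occurrence of a term, carrying the term's text, its position in the document, and its type.
Node: corresponds to a database instance in a relational system.
Cluster: a group of service instances made up of multiple nodes.
Shard: has no counterpart in relational databases; it is the smallest unit on which a Lucene search runs. An index may be spread over several shards, and different shards may live on different nodes. What Elasticsearch calls a shard is one Lucene index, and an Elasticsearch index is a collection of shards. When Elasticsearch executes a search, it sends the request to every shard of the index and then gathers the results from each shard into the final result. The number of primary shards cannot be changed after the index is created!
Replica: a copy of a shard. Each shard has one primary; the other copies are called replica shards. Elasticsearch uses a push replication model: when you index a document on the primary shard, that shard forwards the document to all of its replica shards, which index it as well.
When a document is indexed, Elasticsearch hashes its routing value (by default the document ID) to decide which shard it belongs to, and the document is then indexed and stored on that shard.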
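The routing step can be sketched as follows. This is an illustration under stated assumptions, not Elasticsearch's actual code: Elasticsearch hashes the routing value with Murmur3, whereas this sketch substitutes String.hashCode(), and the class and method names are hypothetical. The point it demonstrates is that the shard number is the hash modulo the number of primary shards, which is why that number cannot change once documents have been placed.

```java
// Illustrative only: String.hashCode() stands in for Elasticsearch's Murmur3 hash.
final class ShardRoutingSketch {

    /** Picks a primary shard for a routing value (by default the document ID). */
    static int shardFor(String routing, int primaryShards) {
        // floorMod keeps the result in [0, primaryShards) even for negative hashes
        return Math.floorMod(routing.hashCode(), primaryShards);
    }

    public static void main(String[] args) {
        for (String id : new String[] {"1", "2", "3", "IkXNN2wBr0WPOOKNJpRg"}) {
            System.out.println(id + " -> shard " + shardFor(id, 5) + " of 5, shard "
                    + shardFor(id, 6) + " of 6");
        }
    }
}
```

Running it shows that the same ID generally lands on a different shard when the shard count changes from 5 to 6, so existing documents could no longer be found without reindexing.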
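Mapping to relational database concepts:
MySQL | Elasticsearch
Database | Indices (plural of index)
Table | Type (usually only one type per index)
Row (data) | Document
Constraints (e.g. which data type a column stores) | Mapping (defines each field's data type and analyzer)
Column | Field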
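II. Operating on indices with Kibana
1. Connecting
2. Operations
Create a type and specify each field's attributes (data type, whether it is stored, whether it is indexed, which analyzer it uses). The index ahd must already exist:
PUT ahd/_mapping/goods
{
  "properties": {
    "goodsname": { "type": "text", "analyzer": "ik_max_word", "index": true, "store": true },
    "price": { "type": "double", "index": true, "store": false },
    "brand": { "type": "keyword", "index": true, "store": true }
  }
}
View the mapping of an index (the type name is optional):
GET ahd/_mapping
GET ahd/_mapping/goods
Create an index with 5 shards and 1 replica:
PUT /heima
{ "settings": { "number_of_shards": 5, "number_of_replicas": 1 } }
Create a second index with default settings:
PUT ahd2
Create an index together with its fields (run this instead of the bare PUT ahd2 above, because creating an index that already exists fails):
PUT ahd2
{
  "mappings": {
    "goods": {
      "properties": {
        "goodsname": { "type": "text", "analyzer": "ik_max_word", "index": true, "store": true },
        "price": { "type": "double", "index": true, "store": true },
        "brand": { "type": "text", "index": true, "store": true }
      }
    }
  }
}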
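Add a document with an explicit ID:
POST ahd/goods/1
{ "goodsname": "華為p20手機", "brand": "華為", "price": 2299 }
Get a document by ID:
GET ahd/goods/1
Update: indexing to the same ID again replaces the whole document:
POST ahd/goods/1
{ "goodsname": "華為p20手機", "brand": "華為", "price": 2599 }
Insert a document without specifying an ID (Elasticsearch generates one):
POST ahd/goods
{ "goodsname": "小米手機6", "brand": "小米", "price": 2500 }
By convention, use POST to insert and PUT to modify.
When the ID is given in the URL, PUT and POST have the same effect: both create the document or replace an existing one.
Delete a document by ID:
DELETE ahd/goods/IkXNN2wBr0WPOOKNJpRg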
Custom (dynamic) templates
1. First create an index:
PUT ahd3
{
  "mappings": {
    "goods": {
      "properties": {
        "image": { "type": "text", "index": false, "store": true },
        "goodsname": { "type": "text", "analyzer": "ik_max_word", "index": true, "store": true },
        "price": { "type": "double", "index": true, "store": true },
        "brand": { "type": "text", "index": true, "store": true }
      }
    }
  }
}
2. Then add a dynamic template to the create statement. PUT ahd3 fails if the index already exists, so delete it first (DELETE ahd3) before running the modified statement:
PUT ahd3
{
  "mappings": {
    "goods": {
      "properties": {
        "image": { "type": "text", "index": false, "store": true },
        "goodsname": { "type": "text", "analyzer": "ik_max_word", "index": true, "store": true },
        "price": { "type": "double", "index": true, "store": true },
        "brand": { "type": "text", "index": true, "store": true }
      },
      "dynamic_templates": [
        { "mystring": { "match_mapping_type": "string", "mapping": { "type": "keyword" } } }
      ]
    }
  }
}
With this template, any new string field that is not explicitly mapped is indexed as keyword.
A document without an explicit ID can only be added with POST.
Add a new document to ahd3:
POST ahd3/goods
{ "goodsname": "小米6X手機", "price": 1199, "image": "http://image.im.com/123.jpg", "brand": "小米" }
View the mapping of the goods type:
GET ahd3/_mapping/goods
===================== Queries (key topic) =====================
1. Match all
GET ahd3/_search
{ "query": { "match_all": {} } }
2. term query: exact match on a single term
GET ahd3/_search
{ "query": { "term": { "goodsname": "小米" } } }
Note: the opening brace { must not sit on the first line. The request line (method and path) stands alone, and the JSON body starts on the next line.
Add another document for testing:
POST ahd3/goods
{ "goodsname": "大米", "brand": "吊牌", "price": 200, "image": "http://localhost:8080/a.jpg" }
Run the query again:
GET ahd3/_search
{ "query": { "term": { "goodsname": "小米" } } }
Insert one more record:
POST ahd3/goods
{ "goodsname": "大米手機", "price": 20000, "brand": "大米", "image": "http://baidu.com/a.jpg" }
3. match query (analyzed full-text search)
GET ahd3/_search
{ "query": { "match": { "brand": "米" } } }
2.4 range query
GET ahd3/_search
{ "query": { "range": { "price": { "gte": 100, "lte": 1000 } } } }
Add another document:
POST ahd3/goods
{ "goodsname": "appla", "brand": "apple", "price": 5000, "image": "http://www.baidu.com/sadf.jpg" }
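2.5 fuzzy query (typo tolerance)
A fuzziness of 1 allows one edit, so searching for apple also matches the document whose goodsname is appla:
GET ahd3/goods/_search
{ "query": { "fuzzy": { "goodsname": { "value": "apple", "fuzziness": 1 } } } }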
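What fuzziness measures can be sketched with a plain Levenshtein edit distance. This is a simplified stand-in, not Elasticsearch's implementation: by default Elasticsearch also counts swapping two adjacent characters as a single edit, which this sketch does not. The class and method names are hypothetical.

```java
// Simplified model of fuzzy matching: a term matches when its edit distance
// to the query term is at most the configured fuzziness.
final class FuzzinessSketch {

    /** Classic Levenshtein distance: insert, delete and substitute each cost 1. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    static boolean matches(String indexedTerm, String queryTerm, int fuzziness) {
        return editDistance(indexedTerm, queryTerm) <= fuzziness;
    }

    public static void main(String[] args) {
        System.out.println("appla vs apple: " + editDistance("appla", "apple"));
        System.out.println("matches with fuzziness 1: " + matches("appla", "apple", 1));
    }
}
```

The indexed term appla differs from the query term apple by one substitution, so it falls within a fuzziness of 1.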
2.6 bool query (combining clauses)
GET ahd3/goods/_search
{ "query": { "bool": { "must": { "match": { "goodsname": "大米" } } } } }
Several must clauses go in an array, all of which have to match (also a good check that the JSON is well-formed):
GET ahd3/goods/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "goodsname": "大米" } },
        { "term": { "brand": "大米" } }
      ]
    }
  }
}
Filtering the returned fields
Return only goodsname:
GET ahd3/_search
{ "_source": { "includes": ["goodsname"] } }
Exclude goodsname:
GET ahd3/_search
{ "_source": { "excludes": ["goodsname"] } }
3.2 Filtering query results
GET ahd3/_search
{
  "query": {
    "bool": {
      "must": { "term": { "goodsname": "小米" } },
      "filter": { "range": { "price": { "gte": 10, "lte": 20000 } } }
    }
  }
}
Pagination (from is the offset of the first hit, size is the number of hits per page):
GET ahd3/_search
{ "query": { "match_all": {} }, "from": 2, "size": 2 }
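The relation between a page number and the from/size pair can be sketched as below. The helper names are hypothetical; the only assumption is the usual convention that pages are numbered from zero.

```java
// Hypothetical helper that converts a zero-based page number into from/size values.
final class PagingSketch {

    /** Offset ("from") of the first hit on a zero-based page. */
    static int fromFor(int page, int size) {
        if (page < 0 || size <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and size > 0");
        }
        return Math.multiplyExact(page, size);
    }

    /** Number of pages needed to show totalHits results. */
    static long totalPages(long totalHits, int size) {
        return (totalHits + size - 1) / size;
    }

    public static void main(String[] args) {
        // Page 1 (the second page) with 2 hits per page starts at offset 2,
        // which is the from/size pair used in the pagination request.
        System.out.println("from=" + fromFor(1, 2) + ", size=2");
        System.out.println("pages for 5 hits: " + totalPages(5, 2));
    }
}
```

By default Elasticsearch rejects requests where from + size exceeds 10,000 (the index.max_result_window setting), so deep paging needs a different mechanism such as scroll.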
Sort in descending order:
GET ahd3/_search
{ "query": { "match_all": {} }, "sort": { "price": "desc" } }
Highlighting
GET ahd3/_search
{
  "query": { "term": { "goodsname": { "value": "小米" } } },
  "highlight": {
    "pre_tags": "<a href='www.baidu.com'>",
    "post_tags": "</a>",
    "fields": { "goodsname": {} }
  }
}
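Aggregation (size 0 suppresses the hits so that only the buckets are returned; the aggregation name is arbitrary):
GET /ahd3/goods/_search
{ "size": 0, "aggs": { "populor_color": { "terms": { "field": "price", "size": 10 } } } }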
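Conceptually, a terms aggregation groups documents by the distinct values of a field and counts each group. The sketch below reproduces that idea in memory with Java streams. It is an analogy, not how Elasticsearch computes buckets across shards; the class name is hypothetical, and the sample values are the brands of the bulk-inserted items used elsewhere in these notes.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// In-memory analogy of a terms aggregation: one bucket per distinct value.
final class TermsAggregationSketch {

    /** Returns up to size buckets, largest count first, ties broken by key. */
    static List<Map.Entry<String, Long>> topTerms(List<String> values, int size) {
        Map<String, Long> counts = values.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(size)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> brands = Arrays.asList("小米", "錘子", "華為", "小米", "華為");
        for (Map.Entry<String, Long> bucket : topTerms(brands, 10)) {
            System.out.println(bucket.getKey() + ": " + bucket.getValue());
        }
    }
}
```

Each entry corresponds to one bucket in the response: the key is the field value and the count plays the role of doc_count.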
III. Operating on indices with the native API (tcp:9300)
2.1 Import dependencies
<dependencies> <dependency> <groupId>com.alibaba</groupId> <artifactId>fastjson</artifactId> <version>1.2.35</version> </dependency>
2.2 Operating on indices with the native API
public class EsManager {
private TransportClient client = null;
@Before
public void init() throws Exception{
client = new PreBuiltTransportClient(Settings.EMPTY)
.addTransportAddress(new TransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
}
@After
public void end(){
client.close();
}
}
Step 3: various queries (uncomment one query builder at a time)
@Test
public void queryTest() throws Exception{
// QueryBuilder queryBuilder = QueryBuilders.matchAllQuery();
// QueryBuilder queryBuilder = QueryBuilders.matchQuery("goodsName","小米手機");
// QueryBuilder queryBuilder = QueryBuilders.termQuery("goodsName","小米");
// FuzzyQueryBuilder queryBuilder = QueryBuilders.fuzzyQuery("goodsName", "大米");
// queryBuilder.fuzziness(Fuzziness.ONE);
// QueryBuilder queryBuilder = QueryBuilders.rangeQuery("price").gte(1000).lte(2000);
BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
queryBuilder.must(QueryBuilders.rangeQuery("price").gte(1000).lte(8000));
queryBuilder.mustNot(QueryBuilders.termQuery("goodsName", "華為"));
SearchResponse searchResponse = client.prepareSearch("heima").setQuery(queryBuilder).get();
SearchHits searchHits = searchResponse.getHits();
long totalHits = searchHits.getTotalHits();
System.out.println("Total hits: "+totalHits);
SearchHit[] hits = searchHits.getHits();
for (SearchHit hit : hits) {
String sourceAsString = hit.getSourceAsString();
Goods goods = JSON.parseObject(sourceAsString, Goods.class);
System.out.println(goods);
}
}
IV. Operating on indices with the RestAPI (http:9200)
3.1 Maven coordinates
<parent> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-parent</artifactId> <version>2.1.3.RELEASE</version> </parent> <dependencies> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-test</artifactId> </dependency> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-logging</artifactId> </dependency> <dependency> <groupId>com.google.code.gson</groupId> <artifactId>gson</artifactId> <version>2.8.5</version> </dependency> <dependency> <groupId>org.apache.commons</groupId> <artifactId>commons-lang3</artifactId> <version>3.8.1</version> </dependency> <dependency> <groupId>org.elasticsearch.client</groupId> <artifactId>elasticsearch-rest-high-level-client</artifactId> <version>6.4.3</version> </dependency> </dependencies> <build> <plugins> <plugin> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-maven-plugin</artifactId> </plugin> </plugins> </build>
3.2 Operating on indices with the RestAPI
1. Initialize the client
private RestHighLevelClient client = null;
2. Prepare the POJO (using Lombok)
@Data
// Create or update: IndexRequest
Item item = new Item(1L,"大米6X手機","手機","小米",1199.0,"http.jpg");
String jsonStr = gson.toJson(item);
IndexRequest request = new IndexRequest("item","docs",item.getId().toString());
request.source(jsonStr, XContentType.JSON);
client.index(request, RequestOptions.DEFAULT);
Updating a document
Use the same index call as above; it both creates and replaces a document.
Getting a document by ID
GetRequest request = new GetRequest("item","docs","1");
GetResponse getResponse = client.get(request, RequestOptions.DEFAULT);
String sourceAsString = getResponse.getSourceAsString();
Item item = gson.fromJson(sourceAsString, Item.class);
System.out.println(item);
Deleting a document
DeleteRequest deleteRequest = new DeleteRequest("item","docs","1");
client.delete(deleteRequest,RequestOptions.DEFAULT);
Bulk-indexing documents
// Prepare the documents:
List<Item> list = new ArrayList<>();
list.add(new Item(1L, "小米手機7", "手機", "小米", 3299.00,"http://image.leyou.com/13123.jpg"));
list.add(new Item(2L, "堅果手機R1", "手機", "錘子", 3699.00,"http://image.leyou.com/13123.jpg"));
list.add(new Item(3L, "華為META10", "手機", "華為", 4499.00,"http://image.leyou.com/13123.jpg"));
list.add(new Item(4L, "小米Mix2S", "手機", "小米", 4299.00,"http://image.leyou.com/13123.jpg"));
list.add(new Item(5L, "榮耀V10", "手機", "華為", 2799.00,"http://image.leyou.com/13123.jpg"));
BulkRequest bulkRequest = new BulkRequest();
for (Item item : list) {
bulkRequest.add(new IndexRequest("item","docs",item.getId().toString()).source(gson.toJson(item),XContentType.JSON));
}
client.bulk(bulkRequest,RequestOptions.DEFAULT);
Various queries
@Test
public void testQuery() throws Exception{
SearchRequest searchRequest = new SearchRequest("item");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Each call to query() replaces the previous one, so only the last query below is executed.
// Keep the one you want to test and comment out the others.
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchSourceBuilder.query(QueryBuilders.termQuery("title","小米"));
searchSourceBuilder.query(QueryBuilders.matchQuery("title","小米手機"));
searchSourceBuilder.query(QueryBuilders.fuzzyQuery("title","大米").fuzziness(Fuzziness.ONE));
searchSourceBuilder.query(QueryBuilders.rangeQuery("price").gte(3000).lte(4000));
searchSourceBuilder.query(QueryBuilders.boolQuery().must(QueryBuilders.termQuery("title","手機"))
.must(QueryBuilders.rangeQuery("price").gte(3000).lte(3500)));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
SearchHits searchHits = searchResponse.getHits();
long total = searchHits.getTotalHits();
System.out.println("Total hits: "+total);
SearchHit[] hits = searchHits.getHits();
for (SearchHit hit : hits) {
String sourceAsString = hit.getSourceAsString();
Item item = gson.fromJson(sourceAsString, Item.class);
System.out.println(item);
}
}
Filtering
(1) Filtering the returned fields
searchSourceBuilder.fetchSource(new String[]{"title","category"},null);
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
(2) Filtering the query results
searchSourceBuilder.query(QueryBuilders.termQuery("title","手機"));
searchSourceBuilder.postFilter(QueryBuilders.termQuery("brand","小米"));
Pagination
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchSourceBuilder.from(0); // offset of the first hit
searchSourceBuilder.size(3); // hits per page
Sorting
searchSourceBuilder.sort("id", SortOrder.ASC); // argument 1: field to sort by; argument 2: sort order
Highlighting
Building the highlight request
searchSourceBuilder.query(QueryBuilders.termQuery("title","小米"));
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.preTags("<font style='color:red'>");
highlightBuilder.postTags("</font>");
highlightBuilder.field("title");
searchSourceBuilder.highlighter(highlightBuilder);
Parsing the highlighted result
for (SearchHit hit : hits) {
Map<String, HighlightField> highlightFields = hit.getHighlightFields();
HighlightField highlightField = highlightFields.get("title");
String title = highlightField.getFragments()[0].toString();
String sourceAsString = hit.getSourceAsString();
Item item = gson.fromJson(sourceAsString, Item.class);
item.setTitle(title);
System.out.println(item);
}
Aggregation
Requirement: count the documents per brand
Building the request:
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchSourceBuilder.aggregation(AggregationBuilders.terms("brandAvg").field("brand"));
Parsing the result:
Aggregations aggregations = searchResponse.getAggregations();
Terms terms = aggregations.get("brandAvg");
List<? extends Terms.Bucket> buckets = terms.getBuckets();
for (Terms.Bucket bucket : buckets) {
System.out.println(bucket.getKeyAsString()+":"+bucket.getDocCount());
}
V. Operating on indices with SpringDataElasticsearch
1. Environment setup
(1) Add the dependency
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
(2) Create the bootstrap class
@SpringBootApplication
public class EsApplication {
public static void main(String[] args) {
SpringApplication.run(EsApplication.class,args);
}
}
(3) Add the configuration file application.yml
spring:
  data:
    elasticsearch:
      cluster-name: leyou-elastic
      cluster-nodes: 127.0.0.1:9301,127.0.0.1:9302,127.0.0.1:9303
(4) Create a test class and inject the template provided by SDE
@RunWith(SpringRunner.class)
@SpringBootTest
public class SpringDataEsManager {
@Autowired
private ElasticsearchTemplate elasticsearchTemplate;
}
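Recap of the transport used by each approach:
Kibana: http
Native API: tcp
RestAPI: http
SDE: tcp
2. Operating on the index and mapping
Step 1: prepare a POJO and map it to the index with annotations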
@Data
@AllArgsConstructor
@NoArgsConstructor
@Document(indexName="leyou",type = "goods",shards = 3,replicas = 1)
public class Goods implements Serializable{
@Field(type = FieldType.Long)
private Long id;
@Field(type = FieldType.Text,analyzer = "ik_max_word",store = true)
private String title; // title
@Field(type = FieldType.Keyword,index = true,store = true)
private String category; // category
@Field(type = FieldType.Keyword,index = true,store = true)
private String brand; // brand
@Field(type = FieldType.Double,index = true,store = true)
private Double price; // price
@Field(type = FieldType.Keyword,index = false,store = true)
private String images; // image URL
}
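Step 2: create the index and the mapping
@Test
public void addIndexAndMapping(){
// elasticsearchTemplate.createIndex(Goods.class); // create the index from the annotations on the POJO
elasticsearchTemplate.putMapping(Goods.class); // create the mapping from the annotations on the POJO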
}
3. Operating on documents
// Create or update
// Goods goods = new Goods(1L,"大米6X手機","手機","小米",1199.0,"http.jpg");
// goodsRespository.save(goods); // save or update
// Find by ID
// Optional<Goods> optional = goodsRespository.findById(1L);
// Goods goods = optional.get();
// System.out.println(goods);
// Delete
// goodsRespository.deleteById(1L);
// Bulk insert
/* List<Goods> list = new ArrayList<>();
list.add(new Goods(1L, "小米手機7", "手機", "小米", 3299.00,"http://image.leyou.com/13123.jpg"));
list.add(new Goods(2L, "堅果手機R1", "手機", "錘子", 3699.00,"http://image.leyou.com/13123.jpg"));
list.add(new Goods(3L, "華為META10", "手機", "華為", 4499.00,"http://image.leyou.com/13123.jpg"));
list.add(new Goods(4L, "小米Mix2S", "手機", "小米", 4299.00,"http://image.leyou.com/13123.jpg"));
list.add(new Goods(5L, "榮耀V10", "手機", "華為", 2799.00,"http://image.leyou.com/13123.jpg"));
goodsRespository.saveAll(list);*/
4. Queries
4.1 Queries built into goodsRespository
// Iterable<Goods> goodsList = goodsRespository.findAll(); // find all
// Iterable<Goods> goodsList = goodsRespository.findAll(Sort.by(Sort.Direction.ASC,"price")); // sorted
Iterable<Goods> goodsList = goodsRespository.findAll(PageRequest.of(0,3)); // paged: the page index is zero-based (0 is the first page), page size 3
for (Goods goods : goodsList) {
System.out.println(goods);
}
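4.2 Custom query methods
Methods declared in the repository interface according to the naming rules can be used directly, without writing an implementation: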
public interface GoodsRespository extends ElasticsearchRepository<Goods,Long>{
public List<Goods> findByTitle(String title);
public List<Goods> findByBrand(String brand);
public List<Goods> findByTitleOrBrand(String title,String brand);
public List<Goods> findByPriceBetween(Double low,Double high);
public List<Goods> findByBrandAndCategoryAndPriceBetween(String brand,String category,Double low,Double high);
}
Usage:
// List<Goods> goodsList = goodsRespository.findByTitle("手機");
List<Goods> goodsList = goodsRespository.findByBrandAndCategoryAndPriceBetween("小米","手機",4000.0,5000.0);
for (Goods goods : goodsList) {
System.out.println(goods);
}
5. Combining SpringDataElasticSearch with native API queries
(1) Combining with a native query
@Test
public void testQuery(){
NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
nativeSearchQueryBuilder.withQuery(QueryBuilders.termQuery("title", "小米"));
// nativeSearchQueryBuilder.withQuery(QueryBuilders.matchAllQuery());
// nativeSearchQueryBuilder.withPageable(PageRequest.of(0,3,Sort.by(Sort.Direction.DESC,"price")));
nativeSearchQueryBuilder.addAggregation(AggregationBuilders.terms("brandAvg").field("brand"));
AggregatedPage<Goods> aggregatedPage = elasticsearchTemplate.queryForPage(nativeSearchQueryBuilder.build(), Goods.class,new GoodsHighLightResultMapper());
Aggregations aggregations = aggregatedPage.getAggregations();
Terms terms = aggregations.get("brandAvg");
List<? extends Terms.Bucket> buckets = terms.getBuckets();
for (Terms.Bucket bucket : buckets) {
System.out.println(bucket.getKeyAsString()+":"+bucket.getDocCount());
}
List<Goods> content = aggregatedPage.getContent();
for (Goods goods : content) {
System.out.println(goods);
}
}
(2) Handling highlighting yourself
You need to define an implementation class that handles the highlighting:
class GoodsHighLightResultMapper implements SearchResultMapper{
@Override
public <T> AggregatedPage<T> mapResults(SearchResponse searchResponse, Class<T> aClass, Pageable pageable) {
List<T> content = new ArrayList<>();
Aggregations aggregations = searchResponse.getAggregations();
String scrollId = searchResponse.getScrollId();
SearchHits searchHits = searchResponse.getHits();
long total = searchHits.getTotalHits();
float maxScore = searchHits.getMaxScore();
for (SearchHit searchHit : searchHits) {
String sourceAsString = searchHit.getSourceAsString();
T t = JSON.parseObject(sourceAsString, aClass);
Map<String, HighlightField> highlightFields = searchHit.getHighlightFields();
HighlightField highlightField = highlightFields.get("title");
// Only overwrite the title when the hit actually carries a highlight for it
if (highlightField != null) {
String title = highlightField.getFragments()[0].toString();
try {
BeanUtils.setProperty(t,"title",title);
} catch (Exception e) {
e.printStackTrace();
}
}
content.add(t);
}
return new AggregatedPageImpl<T>(content,pageable,total,aggregations,scrollId,maxScore);
// List<T> content, Pageable pageable, long total, Aggregations aggregations, String scrollId, float maxScore
}
}
(3) Usage