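1. Introduction
There are three ways to sync data from databases such as MySQL or Oracle to ES: calling the create, read, update and delete API that elasticsearch provides, using middleware that performs full and incremental data synchronization, or collecting logs.
Doing the work through the API is clearly more laborious, so this article covers synchronization through middleware.
2. Common sync middleware: overview and comparison
(1) elasticsearch-jdbc, a standalone third-party tool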
https://github.com/jprante/elasticsearch-jdbc
(2) elasticsearch-river-mysql https://github.com/scharron/elasticsearch-river-mysql
(3) go-mysql-elasticsearch (developed in China) https://github.com/siddontang/go-mysql-elasticsearch
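All three can perform the data sync:
elasticsearch-jdbc is the most general-purpose and is very active on GitHub;
elasticsearch-river-mysql has not been updated since 2013;
go-mysql-elasticsearch is still in an unstable development stage;
elasticsearch-jdbc (formerly elasticsearch-river-jdbc) and elasticsearch-river-mysql do not sync rows that were deleted from the database; go-mysql-elasticsearch hopes to improve on this.
Overall, elasticsearch-jdbc is the better fit. Deleted rows can be synced through the API, or the problem can be avoided altogether by never physically deleting rows in the database.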
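The API route for deleted rows can be sketched as follows. This is a minimal sketch, not part of elasticsearch-jdbc: the helper names es_doc_url and delete_docs are hypothetical, and it assumes each document's _id equals the primary key of its source row (for example because the import SQL aliases the key as _id). The index and type names in the usage line are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: remove from Elasticsearch the documents whose
# source rows were deleted in MySQL. Assumes _id == primary key.
ES_HOST="localhost:9200"

# Build the document URL from an index, a type and an id.
es_doc_url() {
    printf '%s/%s/%s/%s' "$ES_HOST" "$1" "$2" "$3"
}

# Read one id per line from stdin and issue a DELETE for each.
# With DRY_RUN=1 the curl commands are printed instead of executed.
delete_docs() {
    index="$1"
    doc_type="$2"
    while IFS= read -r id; do
        # Skip blank lines in the id list.
        [ -n "$id" ] || continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "curl -XDELETE '$(es_doc_url "$index" "$doc_type" "$id")'"
        else
            curl -s -XDELETE "$(es_doc_url "$index" "$doc_type" "$id")" >/dev/null
        fi
    done
}
```

A possible use, once the ids of the deleted rows have been collected in a file: `DRY_RUN=1; delete_docs myindex mytype < deleted_ids.txt` prints the requests for review, and unsetting DRY_RUN sends them.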
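3. Installing elasticsearch
This article uses version 2.3.2. It can be downloaded from the official website (the address is not given here), or you can download the two packages used in this article from
http://download.csdn.net/detail/carboncomputer/9648227
This gives you
elasticsearch-2.3.2.tar.gz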
[zsz@zsz ~]$ tar -zxvf elasticsearch-2.3.2.tar.gz
[zsz@zsz ~]$ mv elasticsearch-2.3.2 /usr/local/elasticsearch-2.3.2
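Start the elasticsearch service (run from the installation directory):
[zsz@zsz ~]$ ./bin/elasticsearch
Alternatively, bin/elasticsearch -d runs it in the background.
If you need to change the configuration, see /elasticsearch-2.3.2/config/elasticsearch.yml.
Check the node status: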
[zsz@zsz downloads]$ curl 'localhost:9200/_cat/nodes?v'
host ip heap.percent ram.percent load node.role master name
127.0.0.1 127.0.0.1 12 79 0.18 d * node-1
List the indices; there are none yet:
[zsz@zsz downloads]$ curl 'localhost:9200/_cat/indices?v'
health status index pri rep docs.count docs.deleted store.size pri.store.size
Create an index:
[zsz@zsz downloads]$ curl -XPUT 'localhost:9200/customer?pretty'
{
"acknowledged" : true
}
[zsz@zsz downloads]$ curl 'localhost:9200/_cat/indices?v'
health status index pri rep docs.count docs.deleted store.size pri.store.size
yellow open customer 5 1 0 0 650b 650b
Index a document and search for it:
[zsz@zsz downloads]$ curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '
> {
> "name": "John Doe"
> }'
{
"_index" : "customer",
"_type" : "external",
"_id" : "1",
"_version" : 1,
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"created" : true
}
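As shown, a new document was created in the customer index under the external type. The document has the internal id 1, which was specified when it was indexed. Now retrieve the record: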
[zsz@zsz downloads]$ curl -XGET 'localhost:9200/customer/external/1?pretty'
{
"_index" : "customer",
"_type" : "external",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "John Doe"
}
}
[zsz@zsz downloads]$ curl 'localhost:9200/customer/_search?q=John'
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.19178301,"hits":[{"_index":"customer","_type":"external","_id":"1","_score":0.19178301,"_source":
{
"name": "John Doe"
}}]}}
Update the document:
[zsz@zsz downloads]$ curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
> {
> "doc": { "name": "Jane Doe Haha" }
> }'
{
"_index" : "customer",
"_type" : "external",
"_id" : "1",
"_version" : 2,
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
}
}
[zsz@zsz downloads]$ curl -XGET 'localhost:9200/customer/external/1?pretty'
{
"_index" : "customer",
"_type" : "external",
"_id" : "1",
"_version" : 2,
"found" : true,
"_source" : {
"name" : "Jane Doe Haha"
}
}
Delete the document:
[zsz@zsz downloads]$ curl -XDELETE 'localhost:9200/customer/external/1?pretty'
{
"found" : true,
"_index" : "customer",
"_type" : "external",
"_id" : "1",
"_version" : 3,
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
}
}
[zsz@zsz downloads]$ curl -XGET 'localhost:9200/customer/external/1?pretty'
{
"_index" : "customer",
"_type" : "external",
"_id" : "1",
"found" : false
}
Essential ES plugins
Detailed installation and usage of the essential plugins, including Head, kibana, IK (Chinese word segmentation) and graph:
http://blog.csdn.net/column/details/deep-elasticsearch.html
ES external interfaces
Java API
http://www.ibm.com/developerworks/library/j-use-elasticsearch-java-apps/index.html
RESTful API
Implementation of the common create, delete, update and query operations:
http://blog.csdn.net/laoyang360/article/details/51931981
4. Installing and configuring elasticsearch-jdbc
Required package:
elasticsearch-jdbc-2.3.2.0-dist.zip, which corresponds to elasticsearch-2.3.2.tar.gz; other versions will cause errors.
[zsz@zsz downloads]$ vi /etc/profile
# Add the environment variable for the elasticsearch-jdbc importer
export JDBC_IMPORTER_HOME=/home/downloads/elasticsearch-jdbc-2.3.2.0
[zsz@zsz downloads]$ source /etc/profile
Create the sync job:
[zsz@zsz downloads]$ mkdir /odbc_es
[zsz@zsz downloads]$ cd /odbc_es
[zsz@zsz odbc_es]$ vi mysql_import_es.sh
#!/bin/sh
bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
# "elasticsearch.cluster" must match the configuration in /elasticsearch-2.3.2/config/elasticsearch.yml.
# Fill in the url, user, password and sql below according to your own project.
echo '{
    "type" : "jdbc",
    "jdbc" : {
        "elasticsearch.autodiscover" : true,
        "elasticsearch.cluster" : "elasticsearch",
        "url" : "jdbc:mysql://***:3306/**",
        "user" : "**",
        "password" : "**",
        "sql" : "select * from news",
        "elasticsearch" : {
            "host" : "127.0.0.1",
            "port" : 9300
        },
        "index" : "myindex",
        "type" : "mytype"
    }
}' | java \
    -cp "${lib}/*" \
    -Dlog4j.configurationFile=${bin}/log4j2.xml \
    org.xbib.tools.Runner \
    org.xbib.tools.JDBCImporter
Run the sync script mysql_import_es.sh (make it executable first with chmod +x mysql_import_es.sh if it is not already):
[zsz@zsz odbc_es]$ ./mysql_import_es.sh
Check whether the sync succeeded:
[zsz@zsz odbc_es]$ curl 'localhost:9200/_cat/indices?v'
health status index pri rep docs.count docs.deleted store.size pri.store.size
yellow open myindex 5 1 163 0 146.5kb 146.5kb
yellow open customer 5 1 0 0 795b 795b
[zsz@zsz odbc_es]$ curl -XGET 'http://127.0.0.1:9200/myindex/mytype/_search?pretty'
{
"took" : 9,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
}
......
This shows that the data was synced successfully.
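The script above is a one-off full import. An incremental variant might look like the sketch below. It rests on assumptions that are not in the original setup: the news table is assumed to have hypothetical id and updated_at columns, and the importer is assumed to honour the "schedule", "statefile" and "$metrics.lastexecutionstart" settings as described in the elasticsearch-jdbc README. Check these against the README of your importer version before relying on them. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch of an incremental variant of mysql_import_es.sh.
# Assumed, not taken from the article: the id and updated_at columns,
# and the schedule / statefile / $metrics.lastexecutionstart settings.

# Print the importer definition. The quoted heredoc keeps
# $metrics.lastexecutionstart literal instead of expanding it in the shell.
incremental_definition() {
    cat <<'EOF'
{
    "type" : "jdbc",
    "jdbc" : {
        "elasticsearch.autodiscover" : true,
        "elasticsearch.cluster" : "elasticsearch",
        "url" : "jdbc:mysql://***:3306/**",
        "user" : "**",
        "password" : "**",
        "schedule" : "0 0-59 0-23 ? * *",
        "statefile" : "statefile.json",
        "sql" : [ {
            "statement" : "select *, id as _id from news where updated_at > ?",
            "parameter" : [ "$metrics.lastexecutionstart" ]
        } ],
        "elasticsearch" : { "host" : "127.0.0.1", "port" : 9300 },
        "index" : "myindex",
        "type" : "mytype"
    }
}
EOF
}

# Pipe the definition into the importer, as the full-import script does.
run_incremental_import() {
    incremental_definition | java \
        -cp "${JDBC_IMPORTER_HOME}/lib/*" \
        -Dlog4j.configurationFile="${JDBC_IMPORTER_HOME}/bin/log4j2.xml" \
        org.xbib.tools.Runner \
        org.xbib.tools.JDBCImporter
}
```

Aliasing the primary key as _id is meant to make a re-imported row overwrite its existing document rather than create a second one. The cron-style schedule shown is intended to fire once a minute; rows deleted in MySQL are still not picked up by this approach.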
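Problems and solutions:
1. Error message: no cluster nodes available, check settings
Solution: check /elasticsearch-2.3.2/config/elasticsearch.yml. The error is usually caused by a mistake in that file, for example node settings configured for a single-machine setup, or a wrong cluster.name.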
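One way to narrow this down is to compare the cluster name used in the importer definition with the name the running node reports. The helpers below are a hypothetical sketch, not part of elasticsearch or elasticsearch-jdbc. They assume the node's HTTP root endpoint (localhost:9200 here) returns JSON containing a "cluster_name" field, and that an unset cluster.name means the default name "elasticsearch".

```shell
#!/bin/sh
# Print cluster.name from an elasticsearch.yml read on stdin.
# Commented-out lines are ignored; the default name is printed when unset.
yml_cluster_name() {
    name=$(sed -n 's/^[[:space:]]*cluster\.name:[[:space:]]*//p' | head -n 1)
    echo "${name:-elasticsearch}"
}

# Print the cluster_name field from the JSON of the node's root endpoint.
json_cluster_name() {
    sed -n 's/.*"cluster_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Compare the name used by the importer ($1) with the running node at $2.
check_cluster_name() {
    running=$(curl -s "$2" | json_cluster_name)
    if [ "$1" = "$running" ]; then
        echo "cluster name matches: $running"
    else
        echo "mismatch: importer=$1 node=$running"
        return 1
    fi
}
```

For example, `check_cluster_name elasticsearch localhost:9200` checks the value from the import script against the live node, and `yml_cluster_name < /elasticsearch-2.3.2/config/elasticsearch.yml` shows what the configuration file sets when the node is not running.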
Contact me if you have any questions.
Original article: http://www.cnblogs.com/zhongshengzhen/p/elasticsearch_mysql.html