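Base environment:
(2) Running the delta import as a scheduled task:
Many people use Windows Task Scheduler or cron on Linux to request the delta-import URL periodically. That works and should cause no problems.
A more convenient approach, and one more tightly integrated with Solr, is to let Solr run the scheduled delta import itself.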
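For readers who stay with the cron or Task Scheduler approach, the job only has to request the delta-import URL. The following is a minimal sketch, not the author's script: the host, port and core name are hypothetical placeholders, and the handler path /dataimport is assumed to match the request handler registered in solrconfig.xml.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def delta_import_url(host, port, core, webapp="solr"):
    """Build the DataImportHandler delta-import URL for one core."""
    query = urlencode({"command": "delta-import", "commit": "true"})
    return f"http://{host}:{port}/{webapp}/{core}/dataimport?{query}"


def trigger_delta_import(host, port, core, timeout=30):
    """Request the delta-import URL and return the HTTP status code."""
    with urlopen(delta_import_url(host, port, core), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical values; replace with your own host and core name.
    print(delta_import_url("localhost", 8983, "collection1"))
```

A cron entry or scheduled task would then run this script at the desired interval and call trigger_delta_import instead of only printing the URL.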
1. Download apache-solr-dataimportscheduler-1.0.jar and put it in the WEB-INF/lib directory of the solr application under Tomcat's webapps:
Download: http://yunpan.cn/cdIpMthFdFcgn (extraction code: 5a1c)
Because I run Solr on Jetty with ZooKeeper,
I put apache-solr-dataimportscheduler-1.0.jar in solr-4.10.4/example/solr-webapp/webapp/WEB-INF/lib instead.
2. Part of the configuration file db-data-config.xml
Location: /solr-4.10.4/example/solr/collection1/conf
<entity name="bns_sentence" pk="id"
        query="select id, uid, createname, createheadimg, wid, word, content, articlenum, state, feel, forwardnum, supportnum, updatetime, createtime from bns_sentence"
        deltaImportQuery="select id, uid, createname, createheadimg, wid, word, content, articlenum, state, feel, forwardnum, supportnum, updatetime, createtime from bns_sentence where id='${dataimporter.delta.id}'"
        deltaQuery="select id, uid, createname, createheadimg, wid, word, content, articlenum, state, feel, forwardnum, supportnum, updatetime, createtime from bns_sentence where updatetime &gt; '${dataimporter.last_index_time}'">
    <field column="id" name="id"/>
    <field column="uid" name="uid"/>
    <field column="createname" name="createname"/>
    <field column="createheadimg" name="createheadimg"/>
    <field column="wid" name="wid"/>
    <field column="word" name="word"/>
    <field column="content" name="content"/>
    <field column="articlenum" name="articlenum"/>
    <field column="state" name="state"/>
    <field column="feel" name="feel"/>
    <field column="forwardnum" name="forwardnum"/>
    <field column="supportnum" name="supportnum"/>
    <field column="updatetime" name="updatetime"/>
    <field column="createtime" name="createtime"/>
</entity>
3. Head and tail of the configuration file
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://ip:3306/database"
              user="username"
              password="password"
              batchSize="-1" /><!-- Note: with MySQL, batchSize="-1" is required; otherwise an exception is thrown -->
  <document>
    <entity name="tablename" pk="id">
      <!-- put the query, deltaImportQuery and deltaQuery attributes on the entity tag and the <field> elements here, as in step 2 -->
    </entity>
  </document>
  <!--deltaQuery="select id, content, avgfeel, state, sentencenum, articlenum,updatetime, createtime from bns_word where to_char(updatetime,'yyyy-mm-dd hh24:mi:ss')> '${dataimporter.last_index_time}'"-->
</dataConfig>
4. Edit the configuration file dataimport.properties
I put it in the /solr-4.10.4/example/solr/conf directory.
The file looks like this:
#################################################
#                                               #
#       dataimport scheduler properties         #
#                                               #
#################################################

#  to sync or not to sync
#  1 - active; anything else - inactive
syncEnabled=1

#  which cores to schedule
#  in a multi-core environment you can decide which cores you want synchronized
#  leave empty or comment it out if using single-core deployment
syncCores=game,resource

#  solr server name or IP address
#  [defaults to localhost if empty]
server=ip

#  solr server port
#  [defaults to 80 if empty]
port=8983

#  application name/context
#  [defaults to current ServletContextListener's context (app) name]
webapp=solr

#  URL params [mandatory]
#  remainder of URL
params=/dataimport?command=delta-import&clean=true&commit=true

#  schedule interval
#  number of minutes between two runs
#  [defaults to 30 if empty]
interval=1

#  interval between full index rebuilds, in minutes; default 7200, i.e. 5 days
#  empty, 0, or commented out: never rebuild the index
reBuildIndexInterval=7200

#  parameters for the index rebuild
reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true

#  start time from which the rebuild interval is counted;
#  first actual run = reBuildIndexBeginTime + reBuildIndexInterval*60*1000
#  two formats: 2012-04-11 03:10:00 or 03:10:00; the latter is completed with the date on which the service started
reBuildIndexBeginTime=03:10:00
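To check these values before starting Solr, it can help to see which URLs they produce. The sketch below is only an approximation of what the scheduler does, inferred from the "Full URL" lines it writes to the log (http://server:port/webapp/core followed by params). The function names are made up for illustration, the fallback webapp name "solr" is an assumption, and the single-core case (empty syncCores) is not handled.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")  # split at the first '=' only
        props[key.strip()] = value.strip()
    return props


def scheduled_urls(props):
    """Return the delta-import URL for each core listed in syncCores."""
    server = props.get("server") or "localhost"  # documented default
    port = props.get("port") or "80"             # documented default
    webapp = props.get("webapp") or "solr"       # assumed fallback name
    cores = [c.strip() for c in props.get("syncCores", "").split(",") if c.strip()]
    base = f"http://{server}:{port}/{webapp}"
    return [f"{base}/{core}{props['params']}" for core in cores]


SAMPLE = """
syncEnabled=1
syncCores=game,resource
server=ip
port=8983
webapp=solr
params=/dataimport?command=delta-import&clean=true&commit=true
interval=1
"""

if __name__ == "__main__":
    for url in scheduled_urls(parse_properties(SAMPLE)):
        print(url)
```

If a printed URL returns 404 when opened in a browser, the core name or handler path in the properties file does not match the running Solr instance.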
5. On the first start you will see:
sorry, no dataimport-handler defined!
Fix:
In solrconfig.xml under example/solr/collection1/conf, add:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>
6. Errors reported after startup:
- 2015-08-19 23:31:13.591; org.apache.solr.handler.dataimport.scheduler.BaseTimerTask; [game] <index update process> Response message Not Found
INFO - 2015-08-19 23:31:13.592; org.apache.solr.handler.dataimport.scheduler.BaseTimerTask; [game] <index update process> Response code 404
INFO - 2015-08-19 23:31:13.592; org.apache.solr.core.SolrResourceLoader; JNDI not configured for solr (NoInitialContextEx)
INFO - 2015-08-19 23:31:13.593; org.apache.solr.core.SolrResourceLoader; solr home defaulted to 'solr/' (could not find system property or JNDI)
INFO - 2015-08-19 23:31:13.593; org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for deduced Solr Home: 'solr/'
INFO - 2015-08-19 23:31:13.609; org.apache.solr.handler.dataimport.scheduler.SolrDataImportProperties; Instance dir = solr/
Cause: as the log shows, no solr.solr.home system property was found, so Solr home defaulted to 'solr/'.
Fix: start Solr with the home directory set explicitly:
java -Dsolr.solr.home=/home/hadoop/cloudsolr/solr-4.10.4/example -DzkHost=192.168.0.157:2181,192.168.0.158:2181,192.168.0.159:2181 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -jar start.jar
7. Error message:
1045 [main] ERROR org.apache.solr.handler.dataimport.scheduler.SolrDataImportProperties – Error locating DataImportScheduler dataimport.properties file
java.io.FileNotFoundException: /home/hadoop/cloudsolr/solr-4.10.4/example/conf/dataimport.properties (No such file or directory)
Fix: move dataimport.properties to the directory named in the exception.
8. Error message:
ter – Could not start Solr. Check solr/home property and the logs
1146 [main] ERROR org.apache.solr.core.SolrCore – null:org.apache.solr.common.SolrException: solr.xml does not exist in /home/hadoop/cloudsolr/solr-4.10.4/example/solr.xml cannot start Solr
	at org.apache.solr.core.ConfigSolr.fromFile(ConfigSolr.java:62)
Fix: copy solr.xml into the directory named in the exception.
9. Error message:
in] ERROR org.apache.solr.servlet.SolrDispatchFilter – Could not start Solr. Check solr/home property and the logs
3230 [main] ERROR org.apache.solr.core.SolrCore – null:org.apache.solr.common.SolrException: Found multiple cores with the name [collection1], with instancedirs [/home/hadoop/cloudsolr/solr-4.10.4/example/example-schemaless/solr/collection1/] and [/home/hadoop/cloudsolr/solr-4.10.4/example/solr/collection1/]
Fix: rename the example core under example-schemaless/solr/collection1 to something else, and change the name in its core.properties as well.
10. Another error during execution:
dding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-http-8.1.10.v20130312.jar' to classloader
481115 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.SolrDataImportProperties – Instance dir = /home/hadoop/cloudsolr/solr-4.10.4/example/
481116 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] <index update process> Disconnected from server ip
481117 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] <index update process> Process ended at ................ 20.08.2015 01:37:00 595
541047 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] <index update process> Process started at .............. 20.08.2015 01:38:00 525
541049 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] <index update process> Full URL http://ip:8983/solr/game/dataimport?command=delta-import&clean=true&commit=true
541057 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] <index update process> Response message Not Found
541058 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] <index update process> Response code 404
541058 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – JNDI not configured for solr (NoInitialContextEx)
541059 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – using system property solr.solr.home: /home/hadoop/cloudsolr/solr-4.10.4/example
541059 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – new SolrResourceLoader for deduced Solr Home: '/home/hadoop/cloudsolr/solr-4.10.4/example/'
541061 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-deploy-8.1.10.v20130312.jar' to classloader
541061 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-xml-8.1.10.v20130312.jar' to classloader
541062 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-servlet-8.1.10.v20130312.jar' to classloader
541062 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-io-8.1.10.v20130312.jar' to classloader
541063 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-util-8.1.10.v20130312.jar' to classloader
541063 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-security-8.1.10.v20130312.jar' to classloader
541064 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-server-8.1.10.v20130312.jar' to classloader
541065 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-continuation-8.1.10.v20130312.jar' to classloader
541065 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/ext/' to classloader
541066 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-webapp-8.1.10.v20130312.jar' to classloader
541067 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/servlet-api-3.0.jar' to classloader
541067 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-jmx-8.1.10.v20130312.jar' to classloader
541068 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-http-8.1.10.v20130312.jar' to classloader
541085 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.SolrDataImportProperties – Instance dir = /home/hadoop/cloudsolr/solr-4.10.4/example/
541085 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] <index update process> Disconnected from server ip
541086 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] <index update process> Process ended at ................ 20.08.2015 01:38:00 564
541086 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] <index update process> Process started at .............. 20.08.2015 01:38:00 564
541087 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] <index update process> Full URL http://ip:8983/solr/resource/dataimport?command=delta-import&clean=true&commit=true
541091 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] <index update process> Response message Not Found
541091 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] <index update process> Response code 404
541091 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – JNDI not configured for solr (NoInitialContextEx)
541091 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – using system property solr.solr.home: /home/hadoop/cloudsolr/solr-4.10.4/example
541091 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – new SolrResourceLoader for deduced Solr Home: '/home/hadoop/cloudsolr/solr-4.10.4/example/'
541092 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-deploy-8.1.10.v20130312.jar' to classloader
541092 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-xml-8.1.10.v20130312.jar' to classloader
541092 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-servlet-8.1.10.v20130312.jar' to classloader
541092 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-io-8.1.10.v20130312.jar' to classloader
541093 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-util-8.1.10.v20130312.jar' to classloader
541093 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-security-8.1.10.v20130312.jar' to classloader
541093 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-server-8.1.10.v20130312.jar' to classloader
541093 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-continuation-8.1.10.v20130312.jar' to classloader
541094 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/ext/' to classloader
541094 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-webapp-8.1.10.v20130312.jar' to classloader
541094 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/servlet-api-3.0.jar' to classloader
541094 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-jmx-8.1.10.v20130312.jar' to classloader
541094 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-http-8.1.10.v20130312.jar' to classloader
541106 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.SolrDataImportProperties – Instance dir = /home/hadoop/cloudsolr/solr-4.10.4/example/
541106 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] <index update process> Disconnected from server ip
541111 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] <index update process> Process ended at ................ 20.08.2015 01:38:00 589
Cause:
The scheduler jar does not support this Solr version.
Fix:
Replace the jar with version 1.1.
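Another cause of errors: comparison operators written literally in a query attribute, for example:
deltaQuery="select id, content, avgfeel, state, sentencenum, articlenum,updatetime, createtime from bns_word where updatetime >= '${dataimporter.last_index_time}'">
Writing greater-than and less-than signs (and other special characters) in XML:
Original    | <    | <=    | >    | >=    | &     | '      | "      |
Replacement | &lt; | &lt;= | &gt; | &gt;= | &amp; | &apos; | &quot; |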
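Escaping by hand is easy to get wrong in long SQL strings, so one option is to let a library do it. The sketch below uses Python's standard xml.sax.saxutils.escape; the helper name escape_for_attribute is made up for illustration, and the SQL is a shortened version of the query above. Strictly speaking, XML only forbids a literal < and & inside attribute values, but escaping > as well matches the table and does no harm.

```python
from xml.sax.saxutils import escape


def escape_for_attribute(sql):
    """Escape &, <, > and double quotes so the SQL can sit inside a "..." XML attribute."""
    # escape() handles &, < and > itself; the extra mapping covers double quotes.
    return escape(sql, {'"': "&quot;"})


if __name__ == "__main__":
    # Shortened column list, for illustration only.
    sql = "select id from bns_word where updatetime >= '${dataimporter.last_index_time}'"
    print(escape_for_attribute(sql))
```

Single quotes are left alone here because the attribute is delimited by double quotes.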
11. After an import, the console reports success but queries against Solr return no data
Cause:
In the db-data-config.xml configuration file:
<entity name="bns_sentence" pk="id"
        query="select id, uid, createname, createheadimg, wid, word, content, articlenum, state, feel, forwardnum, supportnum, updatetime, createtime from bns_sentence"
        deltaImportQuery="select id, uid, createname, createheadimg, wid, word, content, articlenum, state, feel, forwardnum, supportnum, updatetime, createtime from bns_sentence where id='${dataimporter.delta.id}'"
the variable must be written as dataimporter.delta.id, with a lowercase id (not dataimporter.delta.ID).
12. Error on startup after configuration:
48 [coreLoadExecutor-5-thread-1] ERROR org.apache.solr.core.CoreContainer – Error creating core [collection1]: RequestHandler init failure
org.apache.solr.common.SolrException: RequestHandler init failure
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:881)
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:654)
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:491)
	at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:255)
	at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: RequestHandler init failure
	at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:172)
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:800)
	... 8 more
Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
	at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:490)
	at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:421)
	at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551)
	at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:624)
	at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:158)
	... 9 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:274)
	at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:474)
	... 13 more
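Cause: the class org.apache.solr.handler.dataimport.DataImportHandler cannot be found, i.e. the DataImportHandler jars are not on the web application's classpath.
Fix:
Package download: http://yunpan.cn/cHTNPkchYSCrX (extraction code: e5ee)
From solr-4.10.4/dist, copy
solr-dataimporthandler-4.10.4.jar
solr-dataimporthandler-extras-4.10.4.jar
into the lib directory of the Solr web application, then restart. For example: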
[root@devnote ~]# cp solr-4.5.1/dist/solr-dataimporthandler-*.jar /opt/tomcat/webapps/solr/WEB-INF/lib/
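After copying, it can be worth confirming that the class named in the ClassNotFoundException really is inside one of the jars in the lib directory. The following is a small diagnostic sketch that relies only on the fact that jar files are zip archives; the function name jars_containing and the path in the usage comment are hypothetical.

```python
import zipfile
from pathlib import Path


def jars_containing(lib_dir, class_name):
    """Return the names of jars in lib_dir that contain the given class."""
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives (e.g. a truncated download).
            continue
    return hits


# Usage (hypothetical path, adjust to your deployment):
# jars_containing("/opt/tomcat/webapps/solr/WEB-INF/lib",
#                 "org.apache.solr.handler.dataimport.DataImportHandler")
```

An empty result means the handler jar is still missing from that directory or was copied to a different lib directory than the one the web application loads.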
13. Deleting all data from Solr:
http://ip:port/solr/corename/update/?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E&stream.contentType=text/xml;charset=utf-8&commit=true
Reference: http://josh-persistence.iteye.com/blog/2017155
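The percent-encoded delete URL in step 13 is hard to read and easy to mistype. The sketch below rebuilds an equivalent URL from its readable parts with the standard library; the function name and the host, port and core values are placeholders. urlencode escapes a few more characters than the hand-written URL (for example * and :), which decodes to the same request.

```python
from urllib.parse import urlencode

DELETE_ALL_BODY = "<delete><query>*:*</query></delete>"


def delete_all_url(host, port, core):
    """Build an update URL that deletes every document in the core and commits."""
    query = urlencode({
        "stream.body": DELETE_ALL_BODY,
        "stream.contentType": "text/xml;charset=utf-8",
        "commit": "true",
    })
    return f"http://{host}:{port}/solr/{core}/update?{query}"


if __name__ == "__main__":
    # Placeholder values; this only prints the URL and sends nothing.
    print(delete_all_url("ip", 8983, "corename"))
```

Because the request wipes the whole index, printing the URL and checking the core name before sending it is a reasonable precaution.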
14. If Solr is integrated with Tomcat (see http://www.aboutyun.com/thread-10496-1-1.html), this step is required:
Edit the web.xml file in Solr's WEB-INF directory and add the following child element to the <web-app> element:
<listener>
  <listener-class>
    org.apache.solr.handler.dataimport.scheduler.ApplicationListener
  </listener-class>
</listener>
15. If an "Unsupported Media Type" error appears and the delta import fails
Cause: in my Tomcat deployment, the Solr /WEB-INF/lib directory contained apache-solr-dataimportscheduler-1.0.jar.
Fix: delete apache-solr-dataimportscheduler-1.0.jar from /WEB-INF/lib and replace it with solr-dataimportscheduler-1.1.jar.
Package download: http://yunpan.cn/cHTNPkchYSCrX (extraction code: e5ee)