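I have been using Solr in projects for a little over a year, but only at the usage level: relying on Solr's existing mechanisms, adjusting parameters, then monitoring and tuning. I had never studied Solr at the source-code level. A recent project, however, convinced me that I need a solid grasp of Solr's internals and extension mechanisms to do it well. Today's topic is how Solr starts up and initializes. As you know, a Solr deployment has two parts: (1) Solr's configuration files, and (2) Solr's programs and plugins, together with the Lucene jars they depend on and the logging jars. Studying Solr can follow the same line of thought: load the configuration files, initialize each core, initialize the request handlers in each core, and so on.
To study Solr's startup, begin with the web.xml of the Solr WAR. Here is a fragment of Solr's web.xml: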
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
version="2.5"
metadata-complete="true"
>
<!-- Uncomment if you are trying to use a Resin version before 3.0.19.
Their XML implementation isn't entirely compatible with Xerces.
Below are the implementations to use with Sun's JVM.
<system-property javax.xml.xpath.XPathFactory=
"com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl"/>
<system-property javax.xml.parsers.DocumentBuilderFactory=
"com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl"/>
<system-property javax.xml.parsers.SAXParserFactory=
"com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl"/>
-->
<!-- People who want to hardcode their "Solr Home" directly into the
WAR File can set the JNDI property here...
-->
<!-- Location of the Solr home (configuration) directory, used when Solr initializes -->
<env-entry>
<env-entry-name>solr/home</env-entry-name>
<env-entry-value>R:/solrhome1/solr</env-entry-value>
<env-entry-type>java.lang.String</env-entry-type>
</env-entry>
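<!-- org.apache.solr.servlet.SolrDispatchFilter is the most important piece of Solr startup, so source analysis should begin with this filter. Its main job: load the Solr configuration files, initialize each core, and initialize each requestHandler and component -->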
<filter>
<filter-name>SolrRequestFilter</filter-name>
<filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
<!-- If you are wiring Solr into a larger web application which controls
the web context root, you will probably want to mount Solr under
a path prefix (app.war with /app/solr mounted into it, for example).
You will need to put this prefix in front of the SolrDispatchFilter
url-pattern mapping too (/solr/*), and also on any paths for
legacy Solr servlet mappings you may be using.
For the Admin UI to work properly in a path-prefixed configuration,
the admin folder containing the resources needs to be under the app context root
named to match the path-prefix. For example:
.war
  xxx
    js
      main.js
-->
<!--
<init-param>
<param-name>path-prefix</param-name>
<param-value>/xxx</param-value>
</init-param>
-->
</filter>
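SolrDispatchFilter is a Filter that extends BaseSolrFilter (you probably know what a Filter does; source analysis of web-framework-level products usually starts from a filter or a servlet). Before looking at SolrDispatchFilter, a word on BaseSolrFilter (programmers do like to get to the bottom of things). BaseSolrFilter is an abstract class that implements the Filter interface. Its job is simple: in a static initializer it checks that the logging jars have been loaded, so that logging is set up before any Solr-specific code runs. The code: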
/**
 * All Solr filters available to the user's webapp should
 * extend this class and not just implement {@link Filter}.
 * This class ensures that the logging configuration is correct
 * before any Solr specific code is executed.
 */
abstract class BaseSolrFilter implements Filter {
  static {
    CheckLoggingConfiguration.check();
  }
}
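To make the static-initializer trick concrete, here is a minimal, self-contained sketch. The class names (`StaticInitSketch`, `BaseFilterSketch`, `DispatchFilterSketch`) and the `EVENTS` list are hypothetical stand-ins, not Solr classes. The only point is that a static block in an abstract base class runs once, before any constructor of a subclass.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for BaseSolrFilter / SolrDispatchFilter; not Solr code.
class StaticInitSketch {
    // Records the order in which things happen.
    static final List<String> EVENTS = new ArrayList<>();

    abstract static class BaseFilterSketch {
        static {
            // Runs once, when the class is initialized,
            // which happens before the first subclass instance is built.
            EVENTS.add("base static check");
        }
    }

    static class DispatchFilterSketch extends BaseFilterSketch {
        DispatchFilterSketch() {
            EVENTS.add("subclass constructor");
        }
    }

    public static void main(String[] args) {
        new DispatchFilterSketch();
        new DispatchFilterSketch();
        System.out.println(EVENTS);
    }
}
```

Creating two instances still records the static check only once, which is why a base class is a convenient place for a one-time logging check.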
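For brevity I will not go into CheckLoggingConfiguration.check(). Back to SolrDispatchFilter. Because BaseSolrFilter is abstract and implements none of the Filter methods, the concrete class SolrDispatchFilter has to implement them itself. The Filter interface is: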
public interface Filter {
  // called once to initialize the filter
  public void init(FilterConfig filterConfig) throws ServletException;
  // intercepts every HTTP request mapped to this filter
  public void doFilter(ServletRequest request, ServletResponse response,
                       FilterChain chain)
      throws IOException, ServletException;
  // called when the filter is taken out of service, to release resources
  public void destroy();
}
As the comments above show, initialization happens in init. Any study of how SolrDispatchFilter initializes therefore has to go through this method. Here is SolrDispatchFilter's init method:
@Override
public void init(FilterConfig config) throws ServletException
{
  log.info("SolrDispatchFilter.init()");
  try {
    // web.xml configuration
    this.pathPrefix = config.getInitParameter( "path-prefix" );
    // this call is where all the real work happens
    this.cores = createCoreContainer();
    log.info("user.dir=" + System.getProperty("user.dir"));
  }
  catch( Throwable t ) {
    // catch this so our filter still works
    log.error( "Could not start Solr. Check solr/home property and the logs");
    SolrCore.log( t );
    if (t instanceof Error) {
      throw (Error) t;
    }
  }
  log.info("SolrDispatchFilter.init() done");
}
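One consequence of the catch block in init is worth spelling out: a failed startup does not stop the filter from being installed. It only leaves `cores` as null, and the doFilter code later in this post answers 503 in that case. The sketch below imitates that contract in plain Java. `TolerantInitSketch`, its `Supplier` argument and the bare status codes are illustrative assumptions, and it catches only RuntimeException where the real method catches Throwable and rethrows Errors.

```java
import java.util.function.Supplier;

// Hypothetical simplification of the init()/doFilter() contract; not Solr code.
class TolerantInitSketch {
    // Stands in for the CoreContainer; stays null if startup fails.
    private Object container;

    // Same shape as init(): a failure is logged and swallowed, not rethrown.
    void init(Supplier<Object> factory) {
        try {
            container = factory.get();
        } catch (RuntimeException e) {
            System.err.println("Could not start: " + e.getMessage());
        }
    }

    // Same shape as the null check at the top of doFilter(): no container means 503.
    int handle() {
        return container == null ? 503 : 200;
    }

    public static void main(String[] args) {
        TolerantInitSketch ok = new TolerantInitSketch();
        ok.init(Object::new);
        System.out.println(ok.handle());

        TolerantInitSketch broken = new TolerantInitSketch();
        broken.init(() -> { throw new IllegalStateException("solr home not found"); });
        System.out.println(broken.handle());
    }
}
```

So if every request to a freshly deployed Solr returns 503, the startup log is the first place to look.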
Following the trail, let's see what createCoreContainer actually does.
protected CoreContainer createCoreContainer() {
  // SolrResourceLoader loads the configuration files under the Solr home directory
  SolrResourceLoader loader = new SolrResourceLoader(SolrResourceLoader.locateSolrHome());
  // load the configuration
  ConfigSolr config = loadConfigSolr(loader);
  CoreContainer cores = new CoreContainer(loader, config);
  // initialize the cores
  cores.load();
  return cores;
}
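For orientation, here is a hedged sketch of the lookup order behind SolrResourceLoader.locateSolrHome() as I understand it for the Solr 4.x line: first the JNDI entry `java:comp/env/solr/home` (the env-entry from web.xml), then the `solr.solr.home` system property, then the relative default `solr/`. The `Map` that stands in for JNDI and the method name `locate` are illustrative assumptions; check the source of your own Solr version for the exact behaviour.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Illustrative only: a Map plays the role of the JNDI environment.
class SolrHomeLookupSketch {
    static String locate(Map<String, String> jndi, Properties sysProps) {
        // 1. JNDI entry, e.g. the <env-entry> named solr/home in web.xml
        String home = jndi.get("java:comp/env/solr/home");
        // 2. system property, e.g. -Dsolr.solr.home=/path/to/home
        if (home == null) {
            home = sysProps.getProperty("solr.solr.home");
        }
        // 3. fall back to a directory relative to the working directory
        if (home == null) {
            home = "solr/";
        }
        return home;
    }

    public static void main(String[] args) {
        Map<String, String> jndi = new HashMap<>();
        jndi.put("java:comp/env/solr/home", "R:/solrhome1/solr");
        System.out.println(locate(jndi, new Properties()));
        System.out.println(locate(new HashMap<>(), new Properties()));
    }
}
```

The relative fallback is a plausible reason why init logs `user.dir`: when nothing is configured, the working directory decides where Solr looks.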
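createCoreContainer is the key to understanding how Solr initializes and starts. A brief look at the classes and methods it uses:
SolrResourceLoader, as the name suggests, is Solr's resource loader.
ConfigSolr reads the information in the Solr configuration file through the SolrResourceLoader.
loadConfigSolr is the method that loads the configuration: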
private ConfigSolr loadConfigSolr(SolrResourceLoader loader) {
  // Read the solr.solrxml.location system property. It defaults to "solrhome",
  // in which case solr.xml is read from the Solr home directory (the solr/home
  // entry, the first item configured in web.xml). The other option is "zookeeper".
  String solrxmlLocation = System.getProperty("solr.solrxml.location", "solrhome");
  if (solrxmlLocation == null || "solrhome".equalsIgnoreCase(solrxmlLocation))
    return ConfigSolr.fromSolrHome(loader, loader.getInstanceDir());
  // otherwise read solr.xml from ZooKeeper; this is how Solr initializes in a SolrCloud cluster
  if ("zookeeper".equalsIgnoreCase(solrxmlLocation)) {
    String zkHost = System.getProperty("zkHost");
    log.info("Trying to read solr.xml from " + zkHost);
    if (StringUtils.isEmpty(zkHost))
      throw new SolrException(ErrorCode.SERVER_ERROR,
          "Could not load solr.xml from zookeeper: zkHost system property not set");
    SolrZkClient zkClient = new SolrZkClient(zkHost, 30000);
    try {
      if (!zkClient.exists("/solr.xml", true)) // the /solr.xml node must exist in ZooKeeper
        throw new SolrException(ErrorCode.SERVER_ERROR, "Could not load solr.xml from zookeeper: node not found");
      byte[] data = zkClient.getData("/solr.xml", null, null, true);
      // load the configuration
      return ConfigSolr.fromInputStream(loader, new ByteArrayInputStream(data));
    } catch (Exception e) {
      throw new SolrException(ErrorCode.SERVER_ERROR, "Could not load solr.xml from zookeeper", e);
    } finally {
      zkClient.close(); // close the ZooKeeper connection
    }
  }
  throw new SolrException(ErrorCode.SERVER_ERROR,
      "Bad solr.solrxml.location set: " + solrxmlLocation + " - should be 'solrhome' or 'zookeeper'");
}
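The branching in loadConfigSolr boils down to a three-way decision on one system property. Below is a small sketch of just that decision, with no Solr or ZooKeeper classes involved. `SolrXmlLocationSketch`, its `Source` enum and the exception types are my own illustrative choices; the property names and accepted values come from the method above.

```java
// Sketch of the branching in loadConfigSolr; names are illustrative, not Solr API.
class SolrXmlLocationSketch {
    enum Source { SOLR_HOME, ZOOKEEPER }

    static Source resolve(String location, String zkHost) {
        // Default case: solr.xml lives in the Solr home directory.
        if (location == null || "solrhome".equalsIgnoreCase(location)) {
            return Source.SOLR_HOME;
        }
        // SolrCloud case: solr.xml is a node in ZooKeeper, so zkHost is mandatory.
        if ("zookeeper".equalsIgnoreCase(location)) {
            if (zkHost == null || zkHost.isEmpty()) {
                throw new IllegalStateException("zkHost system property not set");
            }
            return Source.ZOOKEEPER;
        }
        // Anything else is a configuration error.
        throw new IllegalArgumentException(
            "Bad solr.solrxml.location: " + location + " - should be 'solrhome' or 'zookeeper'");
    }

    public static void main(String[] args) {
        String location = System.getProperty("solr.solrxml.location", "solrhome");
        System.out.println(resolve(location, System.getProperty("zkHost")));
    }
}
```

In practice this means the ZooKeeper branch is selected with JVM options such as `-Dsolr.solrxml.location=zookeeper -DzkHost=localhost:2181`, where the host and port are placeholders for your own ensemble.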
CoreContainer does the work of initializing the cores. The method to look at is load. It is fairly long:
public void load() {
log.info("Loading cores into CoreContainer [instanceDir={}]", loader.getInstanceDir());
// load Solr's shared jar library
// add the sharedLib to the shared resource loader before initializing cfg based plugins
String libDir = cfg.getSharedLibDirectory();
if (libDir != null) {
File f = FileUtils.resolvePath(new File(solrHome), libDir);
log.info("loading shared library: " + f.getAbsolutePath());
// worth stepping into if you are not familiar with class loaders
loader.addToClassLoader(libDir, null, false);
loader.reloadLuceneSPI();
}
// load and initialize the shard-related handlers
shardHandlerFactory = ShardHandlerFactory.newInstance(cfg.getShardHandlerFactoryPluginInfo(), loader);
updateShardHandler = new UpdateShardHandler(cfg);
solrCores.allocateLazyCores(cfg.getTransientCacheSize(), loader);
logging = LogWatcher.newRegisteredLogWatcher(cfg.getLogWatcherConfig(), loader);
hostName = cfg.getHost();
log.info("Host Name: " + hostName);
zkSys.initZooKeeper(this, solrHome, cfg);
collectionsHandler = createHandler(cfg.getCollectionsHandlerClass(), CollectionsHandler.class);
infoHandler = createHandler(cfg.getInfoHandlerClass(), InfoHandler.class);
coreAdminHandler = createHandler(cfg.getCoreAdminHandlerClass(), CoreAdminHandler.class);
// service that supplies core configuration; it is given the ZkController so configs can come from ZooKeeper
coreConfigService = cfg.createCoreConfigService(loader, zkSys.getZkController());
containerProperties = cfg.getSolrProperties("solr");
// setup executor to load cores in parallel
// do not limit the size of the executor in zk mode since cores may try and wait for each other.
// cores are initialized by multiple threads; worth a pause if you are not familiar with Java concurrency
ExecutorService coreLoadExecutor = Executors.newFixedThreadPool(
( zkSys.getZkController() == null ? cfg.getCoreLoadThreadCount() : Integer.MAX_VALUE ),
new DefaultSolrThreadFactory("coreLoadExecutor") );
try {
CompletionService<SolrCore> completionService = new ExecutorCompletionService<>(
coreLoadExecutor);
Set<Future<SolrCore>> pending = new HashSet<>();
List<CoreDescriptor> cds = coresLocator.discover(this);
checkForDuplicateCoreNames(cds);
for (final CoreDescriptor cd : cds) {
final String name = cd.getName();
try {
if (cd.isTransient() || ! cd.isLoadOnStartup()) {
// Store it away for later use. includes non-transient but not
// loaded at startup cores.
solrCores.putDynamicDescriptor(name, cd);
}
if (cd.isLoadOnStartup()) { // The normal case
Callable<SolrCore> task = new Callable<SolrCore>() {
@Override
public SolrCore call() {
SolrCore c = null;
try {
if (zkSys.getZkController() != null) { // ZooKeeper (SolrCloud) mode
preRegisterInZk(cd);
}
c = create(cd); // the ordinary way of creating a core
registerCore(cd.isTransient(), name, c, false, false);
} catch (Exception e) {
SolrException.log(log, null, e);
try {
/* if (isZooKeeperAware()) {
try {
zkSys.zkController.unregister(name, cd);
} catch (InterruptedException e2) {
Thread.currentThread().interrupt();
SolrException.log(log, null, e2);
} catch (KeeperException e3) {
SolrException.log(log, null, e3);
}
}*/
} finally {
if (c != null) {
c.close();
}
}
}
return c;
}
};
pending.add(completionService.submit(task));
}
} catch (Exception e) {
SolrException.log(log, null, e);
}
}
while (pending != null && pending.size() > 0) {
try {
// take the next core that has finished loading
Future<SolrCore> future = completionService.take();
if (future == null) return;
pending.remove(future);
try {
SolrCore c = future.get();
// track original names
if (c != null) {
solrCores.putCoreToOrigName(c, c.getName());
}
} catch (ExecutionException e) {
SolrException.log(SolrCore.log, "Error loading core", e);
}
} catch (InterruptedException e) {
throw new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE,
"interrupted while loading core", e);
}
}
// background thread that closes Solr cores queued for closing and releases their resources
// Start the background thread
backgroundCloser = new CloserThread(this, solrCores, cfg);
backgroundCloser.start();
} finally {
if (coreLoadExecutor != null) {
// initialization finished; shut down the thread pool
ExecutorUtil.shutdownNowAndAwaitTermination(coreLoadExecutor);
}
}
if (isZooKeeperAware()) { // ZooKeeper is in use, i.e. SolrCloud mode
// register in zk in background threads
Collection<SolrCore> cores = getCores();
if (cores != null) {
for (SolrCore core : cores) {
try {
// register the core's state in ZooKeeper
zkSys.registerInZk(core, true);
} catch (Throwable t) {
SolrException.log(log, "Error registering SolrCore", t);
}
}
}
//
zkSys.getZkController().checkOverseerDesignate();
}
}
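The heart of load is a standard java.util.concurrent pattern: submit one task per core to an ExecutorCompletionService, then call take() until nothing is pending. Here is a stripped-down, runnable sketch of that pattern. `ParallelLoadSketch`, `loadAll` and the `"core:"` strings are illustrative stand-ins for creating real SolrCore objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the submit/take pattern in load(); strings stand in for SolrCore objects.
class ParallelLoadSketch {
    static List<String> loadAll(List<String> names, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        List<String> loaded = new ArrayList<>();
        try {
            CompletionService<String> completion = new ExecutorCompletionService<>(executor);
            Set<Future<String>> pending = new HashSet<>();
            for (final String name : names) {
                // Each task "creates" one core; a real task would open an index, and so on.
                pending.add(completion.submit(() -> "core:" + name));
            }
            while (!pending.isEmpty()) {
                // take() blocks until some task has finished, in completion order.
                Future<String> done = completion.take();
                pending.remove(done);
                try {
                    loaded.add(done.get());
                } catch (ExecutionException e) {
                    // One failed core does not stop the others, as in load().
                    System.err.println("Error loading core: " + e.getCause());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading cores", e);
        } finally {
            // Loading is finished, so the pool is no longer needed.
            executor.shutdown();
        }
        return loaded;
    }

    public static void main(String[] args) {
        System.out.println(loadAll(Arrays.asList("collection1", "collection2", "collection3"), 2));
    }
}
```

The order of the result depends on which task finishes first, so it can differ between runs. The pool size is the tuning knob: in the real method it comes from cfg.getCoreLoadThreadCount() when ZooKeeper is not in use.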
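I have commented the key parts of this code. When you need to speed up Solr's startup, this is the code you will come back to. Next comes Solr's request filtering, and the method to focus on is doFilter (I have commented the key parts and will not go through it in detail):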
if( abortErrorMessage != null ) { // an abort message is set: respond with HTTP 500
((HttpServletResponse)response).sendError( 500, abortErrorMessage );
return;
}
if (this.cores == null) { // core initialization failed or the container has been shut down
((HttpServletResponse)response).sendError( 503, "Server is shutting down or failed to initialize" );
return;
}
CoreContainer cores = this.cores;
SolrCore core = null;
SolrQueryRequest solrReq = null;
Aliases aliases = null;
if( request instanceof HttpServletRequest) { // only HTTP requests are handled here
HttpServletRequest req = (HttpServletRequest)request;
HttpServletResponse resp = (HttpServletResponse)response;
SolrRequestHandler handler = null;
String corename = "";
String origCorename = null;
try {
// put the core container in request attribute
req.setAttribute("org.apache.solr.CoreContainer", cores);
String path = req.getServletPath();
if( req.getPathInfo() != null ) {
// this lets you handle /update/commit when /update is a servlet
path += req.getPathInfo();
}
if( pathPrefix != null && path.startsWith( pathPrefix ) ) {
path = path.substring( pathPrefix.length() );
}
// check for management path
String alternate = cores.getManagementPath();
if (alternate != null && path.startsWith(alternate)) {
path = path.substring(0, alternate.length());
}
// unused feature ?
int idx = path.indexOf( ':' );
if( idx > 0 ) {
// save the portion after the ':' for a 'handler' path parameter
path = path.substring( 0, idx );
}
// Check for the core admin page
if( path.equals( cores.getAdminPath() ) ) { // core admin request
handler = cores.getMultiCoreHandler();
solrReq = SolrRequestParsers.DEFAULT.parse(null,path, req);
handleAdminRequest(req, response, handler, solrReq);
return;
}
boolean usingAliases = false;
List<String> collectionsList = null;
// Check for the core admin collections url
if( path.equals( "/admin/collections" ) ) { // collections management
handler = cores.getCollectionsHandler();
solrReq = SolrRequestParsers.DEFAULT.parse(null,path, req);
handleAdminRequest(req, response, handler, solrReq);
return;
}
// Check for the core admin info url
if( path.startsWith( "/admin/info" ) ) { // admin info request
handler = cores.getInfoHandler();
solrReq = SolrRequestParsers.DEFAULT.parse(null,path, req);
handleAdminRequest(req, response, handler, solrReq);
return;
}
else {
//otherwise, we should find a core from the path
idx = path.indexOf( "/", 1 );
if( idx > 1 ) {
// try to get the corename as a request parameter first
corename = path.substring( 1, idx );
// look at aliases
if (cores.isZooKeeperAware()) { // SolrCloud mode
origCorename = corename;
ZkStateReader reader = cores.getZkController().getZkStateReader();
aliases = reader.getAliases();
if (aliases != null && aliases.collectionAliasSize() > 0) {
usingAliases = true;
String alias = aliases.getCollectionAlias(corename);
if (alias != null) {
collectionsList = StrUtils.splitSmart(alias, ",", true);
corename = collectionsList.get(0);
}
}
}
core = cores.getCore(corename);
if (core != null) {
path = path.substring( idx );
}
}
if (core == null) {
if (!cores.isZooKeeperAware() ) {
core = cores.getCore("");
}
}
}
if (core == null && cores.isZooKeeperAware()) {
// we couldn't find the core - lets make sure a collection was not specified instead
core = getCoreByCollection(cores, corename, path);
if (core != null) {
// we found a core, update the path
path = path.substring( idx );
}
// if we couldn't find it locally, look on other nodes
if (core == null && idx > 0) {
String coreUrl = getRemotCoreUrl(cores, corename, origCorename);
// don't proxy for internal update requests
SolrParams queryParams = SolrRequestParsers.parseQueryString(req.getQueryString());
if (coreUrl != null
&& queryParams
.get(DistributingUpdateProcessorFactory.DISTRIB_UPDATE_PARAM) == null) {
path = path.substring(idx);
remoteQuery(coreUrl + path, req, solrReq, resp);
return;
} else {
if (!retry) {
// we couldn't find a core to work with, try reloading aliases
// TODO: it would be nice if admin ui elements skipped this...
ZkStateReader reader = cores.getZkController()
.getZkStateReader();
reader.updateAliases();
doFilter(request, response, chain, true);
return;
}
}
}
// try the default core
if (core == null) {
core = cores.getCore("");
}
}
// With a valid core...
if( core != null ) { // a core was found
final SolrConfig config = core.getSolrConfig();
// get or create/cache the parser for the core
SolrRequestParsers parser = config.getRequestParsers();
// Handle /schema/* and /config/* paths via Restlet
if( path.equals("/schema") || path.startsWith("/schema/")
|| path.equals("/config") || path.startsWith("/config/")) { // entry point for the Solr REST API
solrReq = parser.parse(core, path, req);
SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, new SolrQueryResponse()));
if( path.equals(req.getServletPath()) ) {
// avoid endless loop - pass through to Restlet via webapp
chain.doFilter(request, response);
} else {
// forward rewritten URI (without path prefix and core/collection name) to Restlet
req.getRequestDispatcher(path).forward(request, response);
}
return;
}
// Determine the handler from the url path if not set
// (we might already have selected the cores handler)
if( handler == null && path.length() > 1 ) { // don't match "" or "/" as valid path
handler = core.getRequestHandler( path );
// no handler yet but allowed to handle select; let's check
if( handler == null && parser.isHandleSelect() ) {
if( "/select".equals( path ) || "/select/".equals( path ) ) { // entry point for queries: the handler is chosen by the qt parameter
solrReq = parser.parse( core, path, req );
String qt = solrReq.getParams().get( CommonParams.QT );
handler = core.getRequestHandler( qt );
if( handler == null ) {
throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "unknown handler: "+qt);
}
if( qt != null && qt.startsWith("/") && (handler instanceof ContentStreamHandlerBase)) {
//For security reasons it's a bad idea to allow a leading '/', ex: /select?qt=/update see SOLR-3161
//There was no restriction from Solr 1.4 thru 3.5 and it's not supported for update handlers.
throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "Invalid Request Handler ('qt'). Do not use /select to access: "+qt);
}
}
}
}
// With a valid handler and a valid core...
if( handler != null ) {
// if not a /select, create the request
if( solrReq == null ) {
solrReq = parser.parse( core, path, req );
}
if (usingAliases) {
processAliases(solrReq, aliases, collectionsList);
}
final Method reqMethod = Method.getMethod(req.getMethod());
HttpCacheHeaderUtil.setCacheControlHeader(config, resp, reqMethod);
// unless we have been explicitly told not to, do cache validation
// if we fail cache validation, execute the query
if (config.getHttpCachingConfig().isNever304() ||
!HttpCacheHeaderUtil.doCacheHeaderValidation(solrReq, req, reqMethod, resp)) { // Solr HTTP caching: validation and expiry are controlled through headers
SolrQueryResponse solrRsp = new SolrQueryResponse();
/* even for HEAD requests, we need to execute the handler to
* ensure we don't get an error (and to make sure the correct
* QueryResponseWriter is selected and we get the correct
* Content-Type)
*/
SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, solrRsp));
this.execute( req, handler, solrReq, solrRsp );
HttpCacheHeaderUtil.checkHttpCachingVeto(solrRsp, resp, reqMethod);
// add info to http headers
//TODO: See SOLR-232 and SOLR-267.
/*try {
NamedList solrRspHeader = solrRsp.getResponseHeader();
for (int i=0; i<solrRspHeader.size(); i++) {
((javax.servlet.http.HttpServletResponse) response).addHeader(("Solr-" + solrRspHeader.getName(i)), String.valueOf(solrRspHeader.getVal(i)));
}
} catch (ClassCastException cce) {
log.log(Level.WARNING, "exception adding response header log information", cce);
}*/
QueryResponseWriter responseWriter = core.getQueryResponseWriter(solrReq);
writeResponse(solrRsp, response, responseWriter, solrReq, reqMethod);
}
return; // we are done with a valid handler
}
}
log.debug("no handler or core retrieved for " + path + ", follow through...");
}
catch (Throwable ex) {
sendError( core, solrReq, request, (HttpServletResponse)response, ex );
if (ex instanceof Error) {
throw (Error) ex;
}
return;
} finally {
try {
if (solrReq != null) {
log.debug("Closing out SolrRequest: {}", solrReq);
solrReq.close();
}
} finally {
try {
if (core != null) {
core.close();
}
} finally {
SolrRequestInfo.clearRequestInfo();
}
}
}
}
// Otherwise let the webapp handle the request
chain.doFilter(request, response);
}
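Most of doFilter is about turning a URL path into a (core, handler path) pair. The sketch below isolates that step. It is a simplification under stated assumptions: `PathSplitSketch` and `split` are hypothetical names, and the real method only strips the core segment once `cores.getCore(corename)` has actually found a core, falling back to the default core otherwise.

```java
// Sketch of how doFilter splits a request path; not the Solr implementation.
class PathSplitSketch {
    // Returns {coreName, handlerPath}; coreName is "" when the path has no core segment.
    static String[] split(String path, String pathPrefix) {
        // Drop the optional path-prefix configured in web.xml.
        if (pathPrefix != null && path.startsWith(pathPrefix)) {
            path = path.substring(pathPrefix.length());
        }
        // Look for the second slash: everything before it is the core name.
        int idx = path.indexOf('/', 1);
        if (idx > 1) {
            // "/collection1/select" -> core "collection1", handler path "/select"
            return new String[] { path.substring(1, idx), path.substring(idx) };
        }
        // No second slash: treat the whole path as a handler on the default core.
        return new String[] { "", path };
    }

    public static void main(String[] args) {
        String[] parts = split("/collection1/select", null);
        System.out.println("core=" + parts[0] + " handler=" + parts[1]);
    }
}
```

The handler path that remains (`/select`, `/update/json`, and so on) is what core.getRequestHandler(path) receives in the real code.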
Please credit the source when reposting this article: http://www.cnblogs.com/likehua/p/4353608.html
