Solr Initialization Source Code Analysis: Solr Initialization and Startup

Reposted article. 2015-03-20 14:48. Tags: java / solr / search engine

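I have been using Solr in projects for more than a year, but only at the usage level: relying on Solr's existing mechanisms, changing parameters, then monitoring and tuning. I had never studied Solr at the source-code level. A recent project, however, convinced me that I have to understand Solr's internals and extension mechanisms thoroughly to do it well. Today's topic is how Solr starts up and initializes. As you know, a Solr deployment has two parts: first, Solr's configuration files; second, Solr's programs, plugins, the Lucene jars it depends on, and the logging jars. The study of Solr can follow the same order: loading the configuration files, initializing each core, initializing the request handlers in each core, and so on.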

To study how Solr starts, begin with the web.xml of the Solr war. Here is a fragment of Solr's web.xml:

<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
         version="2.5"
         metadata-complete="true"
>


  <!-- Uncomment if you are trying to use a Resin version before 3.0.19.
    Their XML implementation isn't entirely compatible with Xerces.
    Below are the implementations to use with Sun's JVM.
  <system-property javax.xml.xpath.XPathFactory=
             "com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl"/>
  <system-property javax.xml.parsers.DocumentBuilderFactory=
             "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl"/>
  <system-property javax.xml.parsers.SAXParserFactory=
             "com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl"/>
   -->

  <!-- People who want to hardcode their "Solr Home" directly into the
       WAR File can set the JNDI property here...
   -->
    <!-- Location of the Solr configuration files (solr home), used during Solr initialization -->
    <env-entry>
       <env-entry-name>solr/home</env-entry-name>
       <env-entry-value>R:/solrhome1/solr</env-entry-value>
       <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

   
  
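  <!-- org.apache.solr.servlet.SolrDispatchFilter is the most important piece of Solr startup, so source analysis should begin with this filter. Its main job: load the Solr configuration files, initialize each core, and initialize each requestHandler and component -->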
  <filter>
    <filter-name>SolrRequestFilter</filter-name>
    <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
    <!-- If you are wiring Solr into a larger web application which controls
         the web context root, you will probably want to mount Solr under
         a path prefix (app.war with /app/solr mounted into it, for example).
         You will need to put this prefix in front of the SolrDispatchFilter
         url-pattern mapping too (/solr/*), and also on any paths for
         legacy Solr servlet mappings you may be using.
         For the Admin UI to work properly in a path-prefixed configuration,
         the admin folder containing the resources needs to be under the app context root
         named to match the path-prefix.  For example:

            .war
               xxx
                 js
                   main.js
    -->
    <!--
    <init-param>
      <param-name>path-prefix</param-name>
      <param-value>/xxx</param-value>
    </init-param>
    -->
  </filter>

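SolrDispatchFilter is a Filter that extends BaseSolrFilter (you probably know what a Filter is for; source analysis of web-framework-level products usually starts from a filter or a servlet). Before getting to SolrDispatchFilter, a word about BaseSolrFilter (perhaps every programmer has the habit of digging to the root). BaseSolrFilter is an abstract class that implements the Filter interface. Its job is simple: check that the logging jars have been loaded. The code: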

/**
 * All Solr filters available to the user's webapp should
 * extend this class and not just implement {@link Filter}.
 * This class ensures that the logging configuration is correct
 * before any Solr specific code is executed.
 */
abstract class BaseSolrFilter implements Filter {
  
  static { // runs once, when the class is initialized
    CheckLoggingConfiguration.check();
  }
  
}
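
To see why a static block in an abstract base class is enough to make the check run first, here is a minimal sketch of Java's class-initialization order. BaseCheckedFilter, PlainFilter and the EVENTS list are hypothetical names for illustration, not Solr classes.

```java
import java.util.ArrayList;
import java.util.List;

class StaticInitDemo {
  public static void main(String[] args) {
    new PlainFilter();
    new PlainFilter(); // the static block does not run a second time
    System.out.println(BaseCheckedFilter.EVENTS);
  }
}

// Hypothetical stand-in for BaseSolrFilter: the static block runs once,
// when the class is initialized, before any subclass instance is constructed.
abstract class BaseCheckedFilter {
  static final List<String> EVENTS = new ArrayList<>();

  static {
    EVENTS.add("logging check"); // stands in for CheckLoggingConfiguration.check()
  }
}

// Hypothetical stand-in for SolrDispatchFilter.
class PlainFilter extends BaseCheckedFilter {
  PlainFilter() {
    EVENTS.add("filter constructed");
  }
}
```

However many filter instances are created, the "logging check" entry appears once and always first.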

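For reasons of space I won't go into what CheckLoggingConfiguration.check() does. OK, back to SolrDispatchFilter. BaseSolrFilter is abstract, so SolrDispatchFilter, being a concrete class, has to implement the methods of the Filter interface itself. The Filter interface looks like this: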

public interface Filter {

    // performs initialization
    public void init(FilterConfig filterConfig) throws ServletException;

    // intercepts each HTTP request mapped to this filter
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain)
            throws IOException, ServletException;

    // performs cleanup when the filter is taken out of service
    public void destroy();
}
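
The order in which a container calls these three methods can be sketched without the servlet API. MiniFilter and MiniChain below are hypothetical, simplified stand-ins for Filter and FilterChain, and the "/solr/" rule is made up for the demo; it only illustrates that a filter either answers a request itself or passes it down the chain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FilterLifecycleDemo {
  // Hypothetical, simplified stand-ins for the servlet API types.
  interface MiniChain {
    void doFilter(String path);
  }

  interface MiniFilter {
    void init();
    void doFilter(String path, MiniChain chain);
    void destroy();
  }

  // Plays the role of the container and returns the sequence of events.
  static List<String> run(List<String> paths) {
    List<String> log = new ArrayList<>();
    MiniFilter filter = new MiniFilter() {
      @Override public void init() { log.add("init"); }
      @Override public void doFilter(String path, MiniChain chain) {
        if (path.startsWith("/solr/")) {   // made-up rule for the demo
          log.add("handled " + path);      // the filter answers the request itself
        } else {
          chain.doFilter(path);            // otherwise pass it down the chain
        }
      }
      @Override public void destroy() { log.add("destroy"); }
    };
    MiniChain rest = path -> log.add("passed on " + path);

    filter.init();                         // once, when the webapp starts
    for (String p : paths) {
      filter.doFilter(p, rest);            // once per request
    }
    filter.destroy();                      // once, when the webapp stops
    return log;
  }

  public static void main(String[] args) {
    System.out.println(run(Arrays.asList("/solr/select", "/index.html")));
  }
}
```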

As the comments above indicate, initialization happens in the init method, so understanding how SolrDispatchFilter initializes means reading this method. Here is SolrDispatchFilter's init method:

  @Override
  public void init(FilterConfig config) throws ServletException
  {
    log.info("SolrDispatchFilter.init()");

    try {
      // web.xml configuration
      this.pathPrefix = config.getInitParameter( "path-prefix" );
      // the real work all happens inside this method
      this.cores = createCoreContainer();
      log.info("user.dir=" + System.getProperty("user.dir"));
    }
    catch( Throwable t ) {
      // catch this so our filter still works
      log.error( "Could not start Solr. Check solr/home property and the logs");
      SolrCore.log( t );
      if (t instanceof Error) {
        throw (Error) t;
      }
    }

    log.info("SolrDispatchFilter.init() done");
  }
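
Note what the catch block implies: a failed startup does not take the filter down, it only leaves this.cores unset, and doFilter later answers with 503 when it sees that. A minimal sketch of that pattern, with hypothetical names (FragileFilter, container) standing in for the real classes:

```java
class InitFailureDemo {
  // Hypothetical stand-in for the filter; "container" plays the role of this.cores.
  static class FragileFilter {
    private Object container; // stays null if init fails

    void init(boolean failStartup) {
      try {
        if (failStartup) {
          throw new IllegalStateException("solr home not found");
        }
        container = new Object();
      } catch (Throwable t) {
        // Same shape as the init method above: log, and rethrow only Errors.
        System.err.println("Could not start: " + t.getMessage());
        if (t instanceof Error) {
          throw (Error) t;
        }
      }
    }

    int handle() {
      // Mirrors the null check on this.cores at the top of doFilter.
      return container == null ? 503 : 200;
    }
  }

  static int statusAfterInit(boolean failStartup) {
    FragileFilter f = new FragileFilter();
    f.init(failStartup);
    return f.handle();
  }

  public static void main(String[] args) {
    System.out.println(statusAfterInit(false));
    System.out.println(statusAfterInit(true));
  }
}
```

So when every request returns 503 after a deploy, the startup log is the place to look.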

Following the trail, let's see what the createCoreContainer method actually does.

  protected CoreContainer createCoreContainer() {
    // SolrResourceLoader loads the configuration files from solr home
    SolrResourceLoader loader = new SolrResourceLoader(SolrResourceLoader.locateSolrHome());
    // load the configuration
    ConfigSolr config = loadConfigSolr(loader);
    CoreContainer cores = new CoreContainer(loader, config);
    // initialize the cores
    cores.load();
    return cores;
  }
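
createCoreContainer starts from SolrResourceLoader.locateSolrHome(). A rough sketch of how such a lookup can work: try the JNDI entry solr/home (the env-entry in web.xml), then a system property, then a default directory. The lookup order, the property name solr.solr.home and the default "solr/" reflect my understanding of Solr 4.x and are assumptions here; check the source of your version.

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

class SolrHomeSketch {
  // Assumed lookup order: JNDI env-entry, then system property, then default.
  static String locate() {
    try {
      Context ctx = new InitialContext();
      Object value = ctx.lookup("java:comp/env/solr/home"); // env-entry from web.xml
      if (value != null) {
        return value.toString();
      }
    } catch (NamingException | RuntimeException e) {
      // No JNDI environment (for example outside a servlet container): fall through.
    }
    String prop = System.getProperty("solr.solr.home"); // assumed property name
    if (prop != null && !prop.isEmpty()) {
      return prop;
    }
    return "solr/"; // assumed default, relative to the working directory
  }

  public static void main(String[] args) {
    System.out.println("solr home = " + locate());
  }
}
```

Run outside a container, the JNDI lookup fails and the method falls back to the property or the default.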

The createCoreContainer method is the key to understanding Solr initialization and startup. Let's briefly go through the classes and methods it uses:

SolrResourceLoader is what its name says: Solr's resource loader.

ConfigSolr reads the information in the Solr configuration file through SolrResourceLoader.

loadConfigSolr is the method that loads the configuration:

  private ConfigSolr loadConfigSolr(SolrResourceLoader loader) {
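    // Read the solr.solrxml.location system property first. It is typically set to "zookeeper" so that the configuration is read from ZooKeeper. If it is not set, the default "solrhome" applies and solr.xml is read from solr home (the solr/home entry at the top of web.xml)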
    String solrxmlLocation = System.getProperty("solr.solrxml.location", "solrhome");
    
    if (solrxmlLocation == null || "solrhome".equalsIgnoreCase(solrxmlLocation))
      return ConfigSolr.fromSolrHome(loader, loader.getInstanceDir());
    // otherwise read the configuration from ZooKeeper; this is how Solr initializes in a SolrCloud cluster
    if ("zookeeper".equalsIgnoreCase(solrxmlLocation)) {
      String zkHost = System.getProperty("zkHost");
      log.info("Trying to read solr.xml from " + zkHost);
      if (StringUtils.isEmpty(zkHost))
        throw new SolrException(ErrorCode.SERVER_ERROR,
            "Could not load solr.xml from zookeeper: zkHost system property not set");
      SolrZkClient zkClient = new SolrZkClient(zkHost, 30000);
      try {
        if (!zkClient.exists("/solr.xml", true)) // the /solr.xml node must exist in ZooKeeper
          throw new SolrException(ErrorCode.SERVER_ERROR, "Could not load solr.xml from zookeeper: node not found");
        byte[] data = zkClient.getData("/solr.xml", null, null, true);
        // parse the configuration
        return ConfigSolr.fromInputStream(loader, new ByteArrayInputStream(data));
      } catch (Exception e) {
        throw new SolrException(ErrorCode.SERVER_ERROR, "Could not load solr.xml from zookeeper", e);
      } finally {
        zkClient.close(); // close the ZooKeeper connection
      }
    }

    throw new SolrException(ErrorCode.SERVER_ERROR,
        "Bad solr.solrxml.location set: " + solrxmlLocation + " - should be 'solrhome' or 'zookeeper'");
  }
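
Stripped of the ZooKeeper I/O, the decision made by loadConfigSolr comes down to two system properties. A small sketch of just that branching; the Source enum and the resolve method are hypothetical names, and the exception types are simplified stand-ins for SolrException.

```java
class SolrXmlLocationSketch {
  enum Source { SOLR_HOME, ZOOKEEPER }

  // location: value of solr.solrxml.location (may be null)
  // zkHost:   value of the zkHost system property (may be null)
  static Source resolve(String location, String zkHost) {
    if (location == null || "solrhome".equalsIgnoreCase(location)) {
      return Source.SOLR_HOME;
    }
    if ("zookeeper".equalsIgnoreCase(location)) {
      if (zkHost == null || zkHost.isEmpty()) {
        throw new IllegalStateException("zkHost system property not set");
      }
      return Source.ZOOKEEPER;
    }
    throw new IllegalArgumentException(
        "Bad solr.solrxml.location set: " + location + " - should be 'solrhome' or 'zookeeper'");
  }

  public static void main(String[] args) {
    System.out.println(resolve(System.getProperty("solr.solrxml.location"),
                               System.getProperty("zkHost")));
  }
}
```

In practice both values would be passed to the JVM as -D options, for example -Dsolr.solrxml.location=zookeeper together with -DzkHost set to your ZooKeeper address.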

CoreContainer is what actually initializes the cores. Let's look mainly at its load method. It is fairly long:

public void load()  {

    log.info("Loading cores into CoreContainer [instanceDir={}]", loader.getInstanceDir());
    // load Solr's shared jar library
    // add the sharedLib to the shared resource loader before initializing cfg based plugins
    String libDir = cfg.getSharedLibDirectory();
    if (libDir != null) {
      File f = FileUtils.resolvePath(new File(solrHome), libDir);
      log.info("loading shared library: " + f.getAbsolutePath());
      // worth stepping into if you are not familiar with class loaders
      loader.addToClassLoader(libDir, null, false);
      loader.reloadLuceneSPI();
    }

    // load and initialize the shard-related handlers
    shardHandlerFactory = ShardHandlerFactory.newInstance(cfg.getShardHandlerFactoryPluginInfo(), loader);
    
    updateShardHandler = new UpdateShardHandler(cfg);

    solrCores.allocateLazyCores(cfg.getTransientCacheSize(), loader);

    logging = LogWatcher.newRegisteredLogWatcher(cfg.getLogWatcherConfig(), loader);

    hostName = cfg.getHost();
    log.info("Host Name: " + hostName);

    zkSys.initZooKeeper(this, solrHome, cfg);

    collectionsHandler = createHandler(cfg.getCollectionsHandlerClass(), CollectionsHandler.class);
    infoHandler        = createHandler(cfg.getInfoHandlerClass(), InfoHandler.class);
    coreAdminHandler   = createHandler(cfg.getCoreAdminHandlerClass(), CoreAdminHandler.class);
    // core configuration service; initializes cores from the ZooKeeper configuration when a ZkController is present
    coreConfigService = cfg.createCoreConfigService(loader, zkSys.getZkController());

    containerProperties = cfg.getSolrProperties("solr");

    // setup executor to load cores in parallel
    // do not limit the size of the executor in zk mode since cores may try and wait for each other.
    // cores are initialized by multiple threads; worth a pause if you are new to multithreading
    ExecutorService coreLoadExecutor = Executors.newFixedThreadPool(
        ( zkSys.getZkController() == null ? cfg.getCoreLoadThreadCount() : Integer.MAX_VALUE ),
        new DefaultSolrThreadFactory("coreLoadExecutor") );

    try {
      CompletionService<SolrCore> completionService = new ExecutorCompletionService<>(
          coreLoadExecutor);

      Set<Future<SolrCore>> pending = new HashSet<>();

      List<CoreDescriptor> cds = coresLocator.discover(this);
      checkForDuplicateCoreNames(cds);

      for (final CoreDescriptor cd : cds) {

        final String name = cd.getName();
        try {

          if (cd.isTransient() || ! cd.isLoadOnStartup()) {
            // Store it away for later use. includes non-transient but not
            // loaded at startup cores.
            solrCores.putDynamicDescriptor(name, cd);
          }
          if (cd.isLoadOnStartup()) { // The normal case

            Callable<SolrCore> task = new Callable<SolrCore>() {
              @Override
              public SolrCore call() {
                SolrCore c = null;
                try {
                  if (zkSys.getZkController() != null) { // ZooKeeper mode
                    preRegisterInZk(cd);
                  }
                  c = create(cd); // the regular creation path
                  registerCore(cd.isTransient(), name, c, false, false);
                } catch (Exception e) {
                  SolrException.log(log, null, e);
                  try {
              /*    if (isZooKeeperAware()) {
                    try {
                      zkSys.zkController.unregister(name, cd);
                    } catch (InterruptedException e2) {
                      Thread.currentThread().interrupt();
                      SolrException.log(log, null, e2);
                    } catch (KeeperException e3) {
                      SolrException.log(log, null, e3);
                    }
                  }*/
                  } finally {
                    if (c != null) {
                      c.close();
                    }
                  }            
                }
                return c;
              }
            };
            pending.add(completionService.submit(task));

          }
        } catch (Exception e) {
          SolrException.log(log, null, e);
        }
      }

      while (pending != null && pending.size() > 0) {
        try {
          // take the next core that has finished loading
          Future<SolrCore> future = completionService.take();
          if (future == null) return;
          pending.remove(future);

          try {
            SolrCore c = future.get();
            // track original names
            if (c != null) {
              solrCores.putCoreToOrigName(c, c.getName());
            }
          } catch (ExecutionException e) {
            SolrException.log(SolrCore.log, "Error loading core", e);
          }

        } catch (InterruptedException e) {
          throw new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE,
              "interrupted while loading core", e);
        }
      }

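      // background closer thread for the Solr cores: releases resources of cores that need closing, e.g. when the container shuts down or startup fails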
      // Start the background thread
      backgroundCloser = new CloserThread(this, solrCores, cfg);
      backgroundCloser.start();

    } finally {
      if (coreLoadExecutor != null) {
        // initialization finished; shut down the thread pool
        ExecutorUtil.shutdownNowAndAwaitTermination(coreLoadExecutor);
      }
    }
    
    if (isZooKeeperAware()) { // ZooKeeper is in use, i.e. SolrCloud mode
      // register in zk in background threads
      Collection<SolrCore> cores = getCores();
      if (cores != null) {
        for (SolrCore core : cores) {
          try {
            // register the core's state in ZooKeeper
            zkSys.registerInZk(core, true);
          } catch (Throwable t) {
            SolrException.log(log, "Error registering SolrCore", t);
          }
        }
      }
      //
      zkSys.getZkController().checkOverseerDesignate();
    }
  }
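
The core-loading loop in load() is a standard ExecutorCompletionService pattern: submit one task per core, then take() results in the order they finish. Below is a minimal, self-contained sketch of that pattern. Strings stand in for SolrCore, and a name starting with "bad" simulates a core that fails to load; both are assumptions made for the demo.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelLoadSketch {
  // Loads every named "core" on a worker thread and returns the names that loaded.
  static Set<String> loadAll(List<String> names, int threads) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    Set<String> loaded = new HashSet<>();
    try {
      CompletionService<String> completion = new ExecutorCompletionService<>(executor);
      Set<Future<String>> pending = new HashSet<>();
      for (String name : names) {
        pending.add(completion.submit(() -> {
          if (name.startsWith("bad")) {
            throw new IllegalStateException("cannot load " + name);
          }
          return name;
        }));
      }
      while (!pending.isEmpty()) {
        try {
          Future<String> done = completion.take(); // blocks until some task finishes
          pending.remove(done);
          loaded.add(done.get());
        } catch (ExecutionException e) {
          // One failed core does not stop the others.
          System.err.println("Error loading core: " + e.getCause().getMessage());
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          break;
        }
      }
    } finally {
      executor.shutdownNow(); // all tasks are done, release the worker threads
    }
    return loaded;
  }

  public static void main(String[] args) {
    System.out.println(loadAll(Arrays.asList("core1", "bad2", "core3"), 2));
  }
}
```

The threads argument corresponds to cfg.getCoreLoadThreadCount() in load(). In ZooKeeper mode the source passes Integer.MAX_VALUE instead, because, as its own comment says, cores may wait for each other there.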

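I have commented the key parts of this code. When you need to speed up Solr startup, this is the code you will come back to. Next we look at how Solr filters and handles requests, which means the doFilter method (I have commented the key parts and won't go through them in detail):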

 if( abortErrorMessage != null ) { // report a 500 error
      ((HttpServletResponse)response).sendError( 500, abortErrorMessage );
      return;
    }
    
    if (this.cores == null) { // core container failed to initialize or has already been shut down
      ((HttpServletResponse)response).sendError( 503, "Server is shutting down or failed to initialize" );
      return;
    }
    CoreContainer cores = this.cores;
    SolrCore core = null;
    SolrQueryRequest solrReq = null;
    Aliases aliases = null;
    
    if( request instanceof HttpServletRequest) { // only HTTP requests are handled here
      HttpServletRequest req = (HttpServletRequest)request;
      HttpServletResponse resp = (HttpServletResponse)response;
      SolrRequestHandler handler = null;
      String corename = "";
      String origCorename = null;
      try {
        // put the core container in request attribute
        req.setAttribute("org.apache.solr.CoreContainer", cores);
        String path = req.getServletPath();
        if( req.getPathInfo() != null ) {
          // this lets you handle /update/commit when /update is a servlet
          path += req.getPathInfo();
        }
        if( pathPrefix != null && path.startsWith( pathPrefix ) ) {
          path = path.substring( pathPrefix.length() );
        }
        // check for management path
        String alternate = cores.getManagementPath();
        if (alternate != null && path.startsWith(alternate)) {
          path = path.substring(0, alternate.length());
        }
        // unused feature ?
        int idx = path.indexOf( ':' );
        if( idx > 0 ) {
          // save the portion after the ':' for a 'handler' path parameter
          path = path.substring( 0, idx );
        }

        // Check for the core admin page
        if( path.equals( cores.getAdminPath() ) ) { // core admin request
          handler = cores.getMultiCoreHandler();
          solrReq =  SolrRequestParsers.DEFAULT.parse(null,path, req);
          handleAdminRequest(req, response, handler, solrReq);
          return;
        }
        boolean usingAliases = false;
        List<String> collectionsList = null;
        // Check for the core admin collections url
        if( path.equals( "/admin/collections" ) ) { // collections management
          handler = cores.getCollectionsHandler();
          solrReq =  SolrRequestParsers.DEFAULT.parse(null,path, req);
          handleAdminRequest(req, response, handler, solrReq);
          return;
        }
        // Check for the core admin info url
        if( path.startsWith( "/admin/info" ) ) { // admin info
          handler = cores.getInfoHandler();
          solrReq =  SolrRequestParsers.DEFAULT.parse(null,path, req);
          handleAdminRequest(req, response, handler, solrReq);
          return;
        }
        else {
          //otherwise, we should find a core from the path
          idx = path.indexOf( "/", 1 );
          if( idx > 1 ) {
            // try to get the corename as a request parameter first
            corename = path.substring( 1, idx );
            
            // look at aliases
            if (cores.isZooKeeperAware()) { // SolrCloud mode
              origCorename = corename;
              ZkStateReader reader = cores.getZkController().getZkStateReader();
              aliases = reader.getAliases();
              if (aliases != null && aliases.collectionAliasSize() > 0) {
                usingAliases = true;
                String alias = aliases.getCollectionAlias(corename);
                if (alias != null) {
                  collectionsList = StrUtils.splitSmart(alias, ",", true);
                  corename = collectionsList.get(0);
                }
              }
            }
            
            core = cores.getCore(corename);

            if (core != null) {
              path = path.substring( idx );
            }
          }
          if (core == null) {
            if (!cores.isZooKeeperAware() ) {
              core = cores.getCore("");
            }
          }
        }
        
        if (core == null && cores.isZooKeeperAware()) {
          // we couldn't find the core - lets make sure a collection was not specified instead
          core = getCoreByCollection(cores, corename, path);
          
          if (core != null) {
            // we found a core, update the path
            path = path.substring( idx );
          }
          
          // if we couldn't find it locally, look on other nodes
          if (core == null && idx > 0) {
            String coreUrl = getRemotCoreUrl(cores, corename, origCorename);
            // don't proxy for internal update requests
            SolrParams queryParams = SolrRequestParsers.parseQueryString(req.getQueryString());
            if (coreUrl != null
                && queryParams
                    .get(DistributingUpdateProcessorFactory.DISTRIB_UPDATE_PARAM) == null) {
              path = path.substring(idx);
              remoteQuery(coreUrl + path, req, solrReq, resp);
              return;
            } else {
              if (!retry) {
                // we couldn't find a core to work with, try reloading aliases
                // TODO: it would be nice if admin ui elements skipped this...
                ZkStateReader reader = cores.getZkController()
                    .getZkStateReader();
                reader.updateAliases();
                doFilter(request, response, chain, true);
                return;
              }
            }
          }
          
          // try the default core
          if (core == null) {
            core = cores.getCore("");
          }
        }

        // With a valid core...
        if( core != null ) { // a core was found
          final SolrConfig config = core.getSolrConfig();
          // get or create/cache the parser for the core
          SolrRequestParsers parser = config.getRequestParsers();

          // Handle /schema/* and /config/* paths via Restlet
          if( path.equals("/schema") || path.startsWith("/schema/")
              || path.equals("/config") || path.startsWith("/config/")) { // entry point for the Solr REST API
            solrReq = parser.parse(core, path, req);
            SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, new SolrQueryResponse()));
            if( path.equals(req.getServletPath()) ) {
              // avoid endless loop - pass through to Restlet via webapp
              chain.doFilter(request, response);
            } else {
              // forward rewritten URI (without path prefix and core/collection name) to Restlet
              req.getRequestDispatcher(path).forward(request, response);
            }
            return;
          }

          // Determine the handler from the url path if not set
          // (we might already have selected the cores handler)
          if( handler == null && path.length() > 1 ) { // don't match "" or "/" as valid path
            handler = core.getRequestHandler( path );
            // no handler yet but allowed to handle select; let's check
            if( handler == null && parser.isHandleSelect() ) {
              if( "/select".equals( path ) || "/select/".equals( path ) ) { // entry point for queries sent to /select
                solrReq = parser.parse( core, path, req );
                String qt = solrReq.getParams().get( CommonParams.QT );
                handler = core.getRequestHandler( qt );
                if( handler == null ) {
                  throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "unknown handler: "+qt);
                }
                if( qt != null && qt.startsWith("/") && (handler instanceof ContentStreamHandlerBase)) {
                  //For security reasons it's a bad idea to allow a leading '/', ex: /select?qt=/update see SOLR-3161
                  //There was no restriction from Solr 1.4 thru 3.5 and it's not supported for update handlers.
                  throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "Invalid Request Handler ('qt').  Do not use /select to access: "+qt);
                }
              }
            }
          }

          // With a valid handler and a valid core...
          if( handler != null ) {
            // if not a /select, create the request
            if( solrReq == null ) {
              solrReq = parser.parse( core, path, req );
            }

            if (usingAliases) {
              processAliases(solrReq, aliases, collectionsList);
            }
            
            final Method reqMethod = Method.getMethod(req.getMethod());
            HttpCacheHeaderUtil.setCacheControlHeader(config, resp, reqMethod);
            // unless we have been explicitly told not to, do cache validation
            // if we fail cache validation, execute the query
            if (config.getHttpCachingConfig().isNever304() ||
                !HttpCacheHeaderUtil.doCacheHeaderValidation(solrReq, req, reqMethod, resp)) { // Solr HTTP caching: validity is controlled through headers
                SolrQueryResponse solrRsp = new SolrQueryResponse();
                /* even for HEAD requests, we need to execute the handler to
                 * ensure we don't get an error (and to make sure the correct
                 * QueryResponseWriter is selected and we get the correct
                 * Content-Type)
                 */
                SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, solrRsp));
                this.execute( req, handler, solrReq, solrRsp );
                HttpCacheHeaderUtil.checkHttpCachingVeto(solrRsp, resp, reqMethod);
              // add info to http headers
              //TODO: See SOLR-232 and SOLR-267.  
                /*try {
                  NamedList solrRspHeader = solrRsp.getResponseHeader();
                 for (int i=0; i<solrRspHeader.size(); i++) {
                   ((javax.servlet.http.HttpServletResponse) response).addHeader(("Solr-" + solrRspHeader.getName(i)), String.valueOf(solrRspHeader.getVal(i)));
                 }
                } catch (ClassCastException cce) {
                  log.log(Level.WARNING, "exception adding response header log information", cce);
                }*/
               QueryResponseWriter responseWriter = core.getQueryResponseWriter(solrReq);
               writeResponse(solrRsp, response, responseWriter, solrReq, reqMethod);
            }
            return; // we are done with a valid handler
          }
        }
        log.debug("no handler or core retrieved for " + path + ", follow through...");
      } 
      catch (Throwable ex) {
        sendError( core, solrReq, request, (HttpServletResponse)response, ex );
        if (ex instanceof Error) {
          throw (Error) ex;
        }
        return;
      } finally {
        try {
          if (solrReq != null) {
            log.debug("Closing out SolrRequest: {}", solrReq);
            solrReq.close();
          }
        } finally {
          try {
            if (core != null) {
              core.close();
            }
          } finally {
            SolrRequestInfo.clearRequestInfo();
          }
        }
      }
    }

    // Otherwise let the webapp handle the request
    chain.doFilter(request, response);
  }
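
The part of doFilter that is easiest to get lost in is how a URL path becomes a core name plus a handler path. The sketch below isolates that step. It is a simplification under stated assumptions: Route and route are hypothetical names, a plain Map stands in for Solr's Aliases, and the path is always trimmed, whereas the real code trims it only once the core has actually been found.

```java
import java.util.Collections;
import java.util.Map;

class CorePathSketch {
  // Result of splitting a request path into core name and handler path.
  static final class Route {
    final String core;
    final String handlerPath;

    Route(String core, String handlerPath) {
      this.core = core;
      this.handlerPath = handlerPath;
    }
  }

  // aliases maps an alias to a comma-separated collection list (stand-in for Aliases).
  static Route route(String path, String pathPrefix, Map<String, String> aliases) {
    if (pathPrefix != null && path.startsWith(pathPrefix)) {
      path = path.substring(pathPrefix.length()); // strip the path-prefix from web.xml
    }
    String core = "";                             // "" selects the default core
    int idx = path.indexOf('/', 1);
    if (idx > 1) {
      core = path.substring(1, idx);              // text between the first two slashes
      String alias = aliases.get(core);
      if (alias != null) {
        core = alias.split(",")[0].trim();        // first collection behind the alias
      }
      path = path.substring(idx);                 // the rest is the handler path
    }
    return new Route(core, path);
  }

  public static void main(String[] args) {
    Route r = route("/collection1/select", null, Collections.<String, String>emptyMap());
    System.out.println(r.core + " -> " + r.handlerPath);
  }
}
```

A path with a single segment such as /select has no core part, so it falls through to the default core, which matches the cores.getCore("") calls in the excerpt.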

Please credit the source when reposting this article: http://www.cnblogs.com/likehua/p/4353608.html
