A Close Look at the Lucene Source (1): The Index File Lock Mechanism


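As everyone knows, in a multi-threaded or multi-process environment, access to a shared resource needs special care, especially when writing: without a lock, the consequences can be serious. A Lucene index is no exception. Lucene splits index access between IndexReader and IndexWriter; as the names suggest, one reads and one writes. Lucene lets you open multiple IndexReader objects on the same index, but only one IndexWriter. How is that enforced? Evidently with a lock, which guarantees that only one IndexWriter can be open on an index at a time. This post walks through Lucene's index file lock mechanism in detail: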

 

What happens if we open several different IndexWriter instances on the same index?

IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);

IndexWriter indexWriter = new IndexWriter(dir, indexWriterConfig);

 

IndexWriterConfig indexWriterConfig2 = new IndexWriterConfig(analyzer);

IndexWriter indexWriter2 = new IndexWriter(dir,indexWriterConfig2);

 

After running this, the console prints:

Exception in thread "main" org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@C:\Users\new\Desktop\Lucene\write.lock

    at org.apache.lucene.store.Lock.obtain(Lock.java:89)

    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)

    at test.Index.index(Index.java:51)

    at test.Index.main(Index.java:78)

 

Clearly, opening more than one IndexWriter on the same index is not allowed.

 

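[Figure: simplified class diagram of Lucene's lock classes]

The class diagram above is fairly simplified. It shows that Lucene uses the factory method pattern, which makes it easy to plug in other implementations. This post uses only SimpleFSLock as the example for explaining Lucene's lock mechanism (for the others, interested readers can consult the Lucene source).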

 

Lock is the base class for all locks. It is an abstract class, and its source is as follows:

public abstract class Lock implements Closeable {

  /** How long {@link #obtain(long)} waits, in milliseconds,
   *  in between attempts to acquire the lock. */
  public static long LOCK_POLL_INTERVAL = 1000;

  /** Pass this value to {@link #obtain(long)} to try
   *  forever to obtain the lock. */
  public static final long LOCK_OBTAIN_WAIT_FOREVER = -1;

  /** Attempts to obtain exclusive access and immediately return
   *  upon success or failure.  Use {@link #close} to
   *  release the lock.
   * @return true iff exclusive access is obtained
   */
  public abstract boolean obtain() throws IOException;

  /**
   * If a lock obtain called, this failureReason may be set
   * with the "root cause" Exception as to why the lock was
   * not obtained.
   */
  protected Throwable failureReason;

  /** Attempts to obtain an exclusive lock within amount of
   *  time given. Polls once per {@link #LOCK_POLL_INTERVAL}
   *  (currently 1000) milliseconds until lockWaitTimeout is
   *  passed.
   * @param lockWaitTimeout length of time to wait in
   *        milliseconds or {@link
   *        #LOCK_OBTAIN_WAIT_FOREVER} to retry forever
   * @return true if lock was obtained
   * @throws LockObtainFailedException if lock wait times out
   * @throws IllegalArgumentException if lockWaitTimeout is
   *         out of bounds
   * @throws IOException if obtain() throws IOException
   */
  public final boolean obtain(long lockWaitTimeout) throws IOException {
    failureReason = null;
    boolean locked = obtain();
    if (lockWaitTimeout < 0 && lockWaitTimeout != LOCK_OBTAIN_WAIT_FOREVER)
      throw new IllegalArgumentException("lockWaitTimeout should be LOCK_OBTAIN_WAIT_FOREVER or a non-negative number (got " + lockWaitTimeout + ")");

    long maxSleepCount = lockWaitTimeout / LOCK_POLL_INTERVAL;
    long sleepCount = 0;
    while (!locked) {
      if (lockWaitTimeout != LOCK_OBTAIN_WAIT_FOREVER && sleepCount++ >= maxSleepCount) {
        String reason = "Lock obtain timed out: " + this.toString();
        if (failureReason != null) {
          reason += ": " + failureReason;
        }
        throw new LockObtainFailedException(reason, failureReason);
      }
      try {
        Thread.sleep(LOCK_POLL_INTERVAL);
      } catch (InterruptedException ie) {
        throw new ThreadInterruptedException(ie);
      }
      locked = obtain();
    }
    return locked;
  }

  /** Releases exclusive access. */
  public abstract void close() throws IOException;

  /** Returns true if the resource is currently locked.  Note that one must
   * still call {@link #obtain()} before using the resource. */
  public abstract boolean isLocked() throws IOException;


  /** Utility class for executing code with exclusive access. */
  public abstract static class With {
    private Lock lock;
    private long lockWaitTimeout;


    /** Constructs an executor that will grab the named lock. */
    public With(Lock lock, long lockWaitTimeout) {
      this.lock = lock;
      this.lockWaitTimeout = lockWaitTimeout;
    }

    /** Code to execute with exclusive access. */
    protected abstract Object doBody() throws IOException;

    /** Calls {@link #doBody} while <i>lock</i> is obtained.  Blocks if lock
     * cannot be obtained immediately.  Retries to obtain lock once per second
     * until it is obtained, or until it has tried ten times. Lock is released when
     * {@link #doBody} exits.
     * @throws LockObtainFailedException if lock could not
     * be obtained
     * @throws IOException if {@link Lock#obtain} throws IOException
     */
    public Object run() throws IOException {
      boolean locked = false;
      try {
         locked = lock.obtain(lockWaitTimeout);
         return doBody();
      } finally {
        if (locked) {
          lock.close();
        }
      }
    }
  }

}

 

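The most important method here is obtain(). The no-argument obtain() makes a single attempt to acquire the lock and returns immediately. obtain(long lockWaitTimeout) builds on it: if the first attempt fails, it sleeps for LOCK_POLL_INTERVAL milliseconds (1000 by default) and tries again, repeating until the lock is acquired or roughly lockWaitTimeout milliseconds have passed, at which point it throws LockObtainFailedException (the exception in the stack trace above). Passing LOCK_OBTAIN_WAIT_FOREVER (-1) as lockWaitTimeout makes it retry indefinitely. Note that the polling only governs how long a caller waits to acquire the lock; once acquired, the lock is held until close() is called.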
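
The retry loop in obtain(long) can be tried out in isolation. The following is a minimal sketch using only the JDK; PollingObtainSketch, WAIT_FOREVER and the tryOnce parameter are hypothetical names introduced for illustration and are not part of Lucene. Unlike Lucene, which throws LockObtainFailedException on timeout, this sketch simply returns false.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration only; not a Lucene class. */
final class PollingObtainSketch {

  /** Plays the role of Lock.LOCK_OBTAIN_WAIT_FOREVER in the listing above. */
  static final long WAIT_FOREVER = -1;

  /**
   * Tries once immediately, then once per pollMillis, until tryOnce succeeds
   * or the retry budget (timeoutMillis / pollMillis sleeps) is used up.
   *
   * @return true if acquired, false on timeout
   */
  static boolean obtain(BooleanSupplier tryOnce, long timeoutMillis, long pollMillis) {
    if (timeoutMillis < 0 && timeoutMillis != WAIT_FOREVER) {
      throw new IllegalArgumentException("timeout must be WAIT_FOREVER or non-negative");
    }
    if (pollMillis <= 0) {
      throw new IllegalArgumentException("pollMillis must be positive");
    }
    long maxSleepCount = timeoutMillis / pollMillis;
    long sleepCount = 0;
    boolean locked = tryOnce.getAsBoolean();
    while (!locked) {
      if (timeoutMillis != WAIT_FOREVER && sleepCount++ >= maxSleepCount) {
        return false; // retry budget exhausted
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt(); // keep the interrupt visible to callers
        throw new IllegalStateException("interrupted while waiting for lock", ie);
      }
      locked = tryOnce.getAsBoolean();
    }
    return true;
  }

  public static void main(String[] args) {
    int[] attempts = {0};
    // Simulated lock that becomes free on the third attempt.
    boolean ok = obtain(() -> ++attempts[0] >= 3, 100, 10);
    System.out.println("acquired=" + ok + " after " + attempts[0] + " attempts");
  }
}
```

With a timeout of 20 ms and a poll interval of 10 ms, a lock that never becomes free is attempted three times in total: once up front and once after each of the two permitted sleeps.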

 

The abstract base class LockFactory defines a single abstract method, makeLock, which returns a Lock instance.

public abstract class LockFactory {

  /**
   * Return a new Lock instance identified by lockName.
   * @param lockName name of the lock to be created.
   */
  public abstract Lock makeLock(Directory dir, String lockName);

}

 

The abstract class FSLockFactory extends LockFactory:

public abstract class FSLockFactory extends LockFactory {
  
  /** Returns the default locking implementation for this platform.
   * This method currently returns always {@link NativeFSLockFactory}.
   */
  public static final FSLockFactory getDefault() {
    return NativeFSLockFactory.INSTANCE;
  }

  @Override
  public final Lock makeLock(Directory dir, String lockName) {
    if (!(dir instanceof FSDirectory)) {
      throw new UnsupportedOperationException(getClass().getSimpleName() + " can only be used with FSDirectory subclasses, got: " + dir);
    }
    return makeFSLock((FSDirectory) dir, lockName);
  }
  
  /** Implement this method to create a lock for a FSDirectory instance. */
  protected abstract Lock makeFSLock(FSDirectory dir, String lockName);

}

 

Note this method:

public static final FSLockFactory getDefault() {

return NativeFSLockFactory.INSTANCE;

}

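By default it returns NativeFSLockFactory, which, like SimpleFSLockFactory, is a concrete implementation. NativeFSLockFactory relies on the FileChannel.tryLock method from NIO. I won't go into it here; interested readers can read the JDK NIO source (Oracle no longer seems to ship the source of the FileChannel implementation class, so you may have to look in the JVM source tree).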
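
For readers who have not used FileChannel.tryLock, here is a minimal JDK-only sketch of the mutual exclusion it provides. It is not Lucene's NativeFSLockFactory code; the class name TryLockSketch and the temporary directory are assumptions made for the demo. Within a single JVM, a second overlapping lock request is rejected with OverlappingFileLockException; from another process, tryLock would return null instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo class; not Lucene code. */
final class TryLockSketch {

  /**
   * Locks a scratch file, then tries to lock it again from the same JVM.
   *
   * @return true if the second attempt was rejected while the first lock was held
   */
  static boolean secondAttemptRejected() {
    try {
      Path dir = Files.createTempDirectory("trylock-sketch");
      Path lockFile = dir.resolve("write.lock");
      boolean rejected = false;
      try (FileChannel channel = FileChannel.open(
               lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
           FileLock first = channel.tryLock()) {
        if (first != null) { // null would mean another process holds the lock
          try {
            channel.tryLock(); // same region, same JVM
          } catch (OverlappingFileLockException expected) {
            rejected = true;
          }
        }
      }
      // Clean up the scratch files once the channel is closed.
      Files.deleteIfExists(lockFile);
      Files.deleteIfExists(dir);
      return rejected;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("second attempt rejected: " + secondAttemptRejected());
  }
}
```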

 

Now for the main topic of this post, SimpleFSLockFactory:

public final class SimpleFSLockFactory extends FSLockFactory {

  /**
   * Singleton instance
   */
  public static final SimpleFSLockFactory INSTANCE = new SimpleFSLockFactory();
  
  private SimpleFSLockFactory() {}

  @Override
  protected Lock makeFSLock(FSDirectory dir, String lockName) {
    return new SimpleFSLock(dir.getDirectory(), lockName);
  }
  
  static class SimpleFSLock extends Lock {

    Path lockFile;
    Path lockDir;

    public SimpleFSLock(Path lockDir, String lockFileName) {
      this.lockDir = lockDir;
      lockFile = lockDir.resolve(lockFileName);
    }

    @Override
    public boolean obtain() throws IOException {
      try {
        Files.createDirectories(lockDir);
        Files.createFile(lockFile);
        return true;
      } catch (IOException ioe) {
        // On Windows, on concurrent createNewFile, the 2nd process gets "access denied".
        // In that case, the lock was not aquired successfully, so return false.
        // We record the failure reason here; the obtain with timeout (usually the
        // one calling us) will use this as "root cause" if it fails to get the lock.
        failureReason = ioe;
        return false;
      }
    }

    @Override
    public void close() throws LockReleaseFailedException {
      // TODO: wierd that clearLock() throws the raw IOException...
      try {
        Files.deleteIfExists(lockFile);
      } catch (Throwable cause) {
        throw new LockReleaseFailedException("failed to delete " + lockFile, cause);
      }
    }

    @Override
    public boolean isLocked() {
      return Files.exists(lockFile);
    }

    @Override
    public String toString() {
      return "SimpleFSLock@" + lockFile;
    }
  }

}

 

SimpleFSLockFactory defines a nested class, SimpleFSLock, which extends Lock. Again the method to focus on is obtain(), this time SimpleFSLock's, because this is where the file lock is actually implemented.

Files.createDirectories(lockDir);

Files.createFile(lockFile);

 

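Look at these two lines. createDirectories creates the directory that will hold write.lock, along with any missing parent directories (the file name can be something else; write.lock is Lucene's default). createFile then creates the write.lock file itself. The clever part is that createFile throws an exception (FileAlreadyExistsException) if write.lock already exists, and the existence check and the creation happen as a single atomic operation. An exception therefore means that SimpleFSLock failed to obtain the file lock, which in turn means another process is writing to the index.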
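
The create-or-fail behaviour is easy to observe with the JDK alone. The sketch below mirrors the structure of SimpleFSLock but is not the Lucene class; MarkerFileLockSketch, release() and demo() are hypothetical names, and the lock file lives in a temporary directory rather than a real index directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for SimpleFSLock, for illustration only. */
final class MarkerFileLockSketch {
  private final Path lockDir;
  private final Path lockFile;

  MarkerFileLockSketch(Path lockDir, String lockFileName) {
    this.lockDir = lockDir;
    this.lockFile = lockDir.resolve(lockFileName);
  }

  /** One non-blocking attempt: true only if this call created the lock file. */
  boolean obtain() {
    try {
      Files.createDirectories(lockDir);
      Files.createFile(lockFile); // atomic; fails if the file already exists
      return true;
    } catch (IOException e) {
      return false; // like SimpleFSLock, treat any IOException as "not obtained"
    }
  }

  /** Releases the lock by deleting the marker file. */
  void release() throws IOException {
    Files.deleteIfExists(lockFile);
  }

  /** Returns the results of three obtain() calls: first, contended, after release. */
  static boolean[] demo() {
    try {
      Path dir = Files.createTempDirectory("marker-lock");
      MarkerFileLockSketch a = new MarkerFileLockSketch(dir, "write.lock");
      MarkerFileLockSketch b = new MarkerFileLockSketch(dir, "write.lock");
      boolean first = a.obtain();  // creates write.lock
      boolean second = b.obtain(); // write.lock exists, so this fails
      a.release();                 // deletes write.lock
      boolean third = b.obtain();  // succeeds again
      b.release();
      Files.deleteIfExists(dir);
      return new boolean[] {first, second, third};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(java.util.Arrays.toString(demo())); // prints [true, false, true]
  }
}
```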

The Files.deleteIfExists(lockFile) call in close() means that every time the IndexWriter is closed, the write.lock file is deleted.

 

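To summarize, SimpleFSLockFactory's locking can be understood simply as follows: it creates a write.lock file in the index directory. If write.lock is already there, some other process is writing to this index directory, so acquiring the lock fails and nothing can be added to the index. Once the writer has finished adding and is closed, write.lock is deleted and the lock is released. In other words, if write.lock exists, a process is already writing to the index; if it does not exist, the file is created, the lock is held, and no other process can write.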

 

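This is a very neat way to implement a write lock. Some readers may wonder why, in their own demo, the write.lock file is still there after the index has been built and the writer closed. The reason is that Lucene's default implementation is now NativeFSLockFactory, which, as mentioned above, uses NIO to call into native code for locking. With it, the lock is the OS-level lock held on write.lock rather than the file's existence, so the file can remain on disk after the lock has been released.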
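
The leftover file is easy to reproduce outside Lucene. The following JDK-only sketch (the class name LeftoverLockFileSketch and the temporary directory are assumptions for the demo, not Lucene code) takes an OS-level lock on a write.lock file, releases it, and then checks that the file still exists and can be locked again.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo class; not Lucene code. */
final class LeftoverLockFileSketch {

  /** @return { file still exists after release, lock could be taken again } */
  static boolean[] demo() {
    try {
      Path dir = Files.createTempDirectory("native-lock");
      Path lockFile = dir.resolve("write.lock");
      boolean existsAfterRelease;
      boolean relockable;
      try (FileChannel channel = FileChannel.open(
               lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
        FileLock lock = channel.tryLock();
        if (lock == null) {
          throw new IllegalStateException("scratch file is locked by another process");
        }
        lock.release(); // the lock is gone, the file is not
        existsAfterRelease = Files.exists(lockFile);
        try (FileLock again = channel.tryLock()) {
          relockable = again != null;
        }
      }
      // Clean up the scratch files once the channel is closed.
      Files.deleteIfExists(lockFile);
      Files.deleteIfExists(dir);
      return new boolean[] {existsAfterRelease, relockable};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] r = demo();
    System.out.println("file exists after release: " + r[0] + ", relockable: " + r[1]);
  }
}
```

With this style of lock, a write.lock file on disk says nothing by itself about whether the index is currently locked.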

