Java I/O (3): Memory-Mapped Files


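"Thinking in Java" (《Java編程思想》) covers memory-mapped files in detail; this post is only a short record and summary. Memory-mapped files let you create and modify files that are too large to load into memory.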

1. A simple memory-mapped file example

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

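public class LargeMappedFiles {

    // 0xFFF = 4095 bytes; small enough for a quick demo
    private static final int LENGTH = 0x0000FFF;

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the file and its channel when done
        try (RandomAccessFile file = new RandomAccessFile("test.dat", "rw");
             FileChannel channel = file.getChannel()) {
            MappedByteBuffer out =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, LENGTH);
            // Fill the whole mapped region with 'x'
            for (int i = 0; i < LENGTH; i++) {
                out.put((byte) 'x');
            }
            // Read six bytes from the middle using the absolute get(index)
            for (int i = LENGTH / 2; i < LENGTH / 2 + 6; i++) {
                System.out.print((char) out.get(i));
            }
        }
    }
}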

Output:

xxxxxx
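  • The FileChannel is obtained from a RandomAccessFile opened in "rw" mode, so the channel supports both reading and writing.
  • FileChannel.map() returns a MappedByteBuffer. It takes three arguments: the MapMode, the start position, and the number of bytes to map, which means you can map a small part of a large file.
  • MappedByteBuffer is a special kind of direct buffer: changes made to it are reflected in the underlying file. It extends ByteBuffer, so all ByteBuffer methods are available.

The example first creates a MappedByteBuffer in read/write mode, then fills the buffer with the character x, and finally reads 6 characters starting from the middle of the file.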
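To illustrate the point about mapping only part of a file, here is a minimal sketch. The class name MapWindowDemo, the helper firstByteOfWindow, and the temporary sample file are all made up for this illustration; only FileChannel.open and FileChannel.map are standard API.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapWindowDemo {

    // Maps `size` bytes of `file` starting at `offset` and returns the
    // first byte of that window. Buffer index 0 is file offset `offset`.
    static byte firstByteOfWindow(Path file, long offset, int size) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer window =
                channel.map(FileChannel.MapMode.READ_ONLY, offset, size);
            return window.get(0);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample file: 1024 bytes where byte i holds i % 128
        Path file = Files.createTempFile("window", ".dat");
        file.toFile().deleteOnExit();
        byte[] data = new byte[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 128);
        }
        Files.write(file, data);

        // Map only 16 bytes starting at offset 517
        System.out.println(firstByteOfWindow(file, 517, 16)); // prints 5 (517 % 128)
    }
}
```

Indexes into the returned buffer are relative to the start of the mapped window, not to the start of the file, so a program that walks a very large file in windows has to add the window offset back itself.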

2. Memory-mapped file source code

The Javadoc for FileChannel.map() reads:

 /**
     * Maps a region of this channel's file directly into memory.
     *
     * <p> A region of a file may be mapped into memory in one of three modes:
     * </p>
     *
     * <ul>
     *
     *   <li><p> <i>Read-only:</i> Any attempt to modify the resulting buffer
     *   will cause a {@link java.nio.ReadOnlyBufferException} to be thrown.
     *   ({@link MapMode#READ_ONLY MapMode.READ_ONLY}) </p></li>
     *
     *   <li><p> <i>Read/write:</i> Changes made to the resulting buffer will
     *   eventually be propagated to the file; they may or may not be made
     *   visible to other programs that have mapped the same file.  ({@link
     *   MapMode#READ_WRITE MapMode.READ_WRITE}) </p></li>
     *
     *   <li><p> <i>Private:</i> Changes made to the resulting buffer will not
     *   be propagated to the file and will not be visible to other programs
     *   that have mapped the same file; instead, they will cause private
     *   copies of the modified portions of the buffer to be created.  ({@link
     *   MapMode#PRIVATE MapMode.PRIVATE}) </p></li>
     *
     * </ul>
     *
     * <p> For a read-only mapping, this channel must have been opened for
     * reading; for a read/write or private mapping, this channel must have
     * been opened for both reading and writing.
     *
     * <p> The {@link MappedByteBuffer <i>mapped byte buffer</i>}
     * returned by this method will have a position of zero and a limit and
     * capacity of <tt>size</tt>; its mark will be undefined.  The buffer and
     * the mapping that it represents will remain valid until the buffer itself
     * is garbage-collected.
     *
     * <p> A mapping, once established, is not dependent upon the file channel
     * that was used to create it.  Closing the channel, in particular, has no
     * effect upon the validity of the mapping.
     *
     * <p> Many of the details of memory-mapped files are inherently dependent
     * upon the underlying operating system and are therefore unspecified.  The
     * behavior of this method when the requested region is not completely
     * contained within this channel's file is unspecified.  Whether changes
     * made to the content or size of the underlying file, by this program or
     * another, are propagated to the buffer is unspecified.  The rate at which
     * changes to the buffer are propagated to the file is unspecified.
     *
     * <p> For most operating systems, mapping a file into memory is more
     * expensive than reading or writing a few tens of kilobytes of data via
     * the usual {@link #read read} and {@link #write write} methods.  From the
     * standpoint of performance it is generally only worth mapping relatively
     * large files into memory.  </p>
     *
     * @param  mode
     *         One of the constants {@link MapMode#READ_ONLY READ_ONLY}, {@link
     *         MapMode#READ_WRITE READ_WRITE}, or {@link MapMode#PRIVATE
     *         PRIVATE} defined in the {@link MapMode} class, according to
     *         whether the file is to be mapped read-only, read/write, or
     *         privately (copy-on-write), respectively
     *
     * @param  position
     *         The position within the file at which the mapped region
     *         is to start; must be non-negative
     *
     * @param  size
     *         The size of the region to be mapped; must be non-negative and
     *         no greater than {@link java.lang.Integer#MAX_VALUE}
     *
     * @return  The mapped byte buffer
     *
     * @throws NonReadableChannelException
     *         If the <tt>mode</tt> is {@link MapMode#READ_ONLY READ_ONLY} but
     *         this channel was not opened for reading
     *
     * @throws NonWritableChannelException
     *         If the <tt>mode</tt> is {@link MapMode#READ_WRITE READ_WRITE} or
     *         {@link MapMode#PRIVATE PRIVATE} but this channel was not opened
     *         for both reading and writing
     *
     * @throws IllegalArgumentException
     *         If the preconditions on the parameters do not hold
     *
     * @throws IOException
     *         If some other I/O error occurs
     *
     * @see java.nio.channels.FileChannel.MapMode
     * @see java.nio.MappedByteBuffer
     */
    public abstract MappedByteBuffer map(MapMode mode,
                                         long position, long size)
        throws IOException;
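  • The method maps a region of the channel's file directly into memory and returns a MappedByteBuffer.
  • There are three modes: READ_ONLY, READ_WRITE, and PRIVATE (copy-on-write; changes are not propagated to the file).
  • Once established, a mapping does not depend on the channel that created it; closing the channel does not invalidate the buffer.
  • Memory mapping depends on the underlying operating system. On most operating systems, mapping a file costs more than reading or writing a few tens of kilobytes with read and write, so mapping generally pays off only for relatively large files.
  • The parameters are the mapping mode (defined by the nested class FileChannel.MapMode, shown below), the start position, and the size of the region, which must not exceed Integer.MAX_VALUE.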
/**
     * A typesafe enumeration for file-mapping modes.
     *
     * @since 1.4
     *
     * @see java.nio.channels.FileChannel#map
     */
    public static class MapMode {

        /**
         * Mode for a read-only mapping.
         */
        public static final MapMode READ_ONLY
            = new MapMode("READ_ONLY");

        /**
         * Mode for a read/write mapping.
         */
        public static final MapMode READ_WRITE
            = new MapMode("READ_WRITE");

        /**
         * Mode for a private (copy-on-write) mapping.
         */
        public static final MapMode PRIVATE
            = new MapMode("PRIVATE");

        private final String name;

        private MapMode(String name) {
            this.name = name;
        }

        /**
         * Returns a string describing this file-mapping mode.
         *
         * @return  A descriptive string
         */
        public String toString() {
            return name;
        }

    }
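The difference between the three modes can be checked with a small experiment. The following is a sketch, not part of the JDK source above: the class MapModeDemo, its two helper methods, and the four-byte sample file are hypothetical names chosen for the illustration.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.ReadOnlyBufferException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapModeDemo {

    // Returns true if a write to a READ_ONLY mapping is rejected.
    static boolean readOnlyRejectsWrites(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(MapMode.READ_ONLY, 0, ch.size());
            try {
                buf.put(0, (byte) 'z');
                return false;
            } catch (ReadOnlyBufferException expected) {
                return true;
            }
        }
    }

    // Writes 'z' to byte 0 through a mapping in the given mode, then
    // returns byte 0 of the file as an ordinary read sees it.
    static byte writeThenReadFile(Path file, MapMode mode) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buf = ch.map(mode, 0, ch.size());
            buf.put(0, (byte) 'z');
            if (mode == MapMode.READ_WRITE) {
                buf.force(); // ask the OS to flush the change to the file
            }
        }
        return Files.readAllBytes(file)[0];
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("modes", ".dat");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[] {'a', 'a', 'a', 'a'});

        System.out.println(readOnlyRejectsWrites(file));                        // true
        System.out.println((char) writeThenReadFile(file, MapMode.PRIVATE));    // a
        System.out.println((char) writeThenReadFile(file, MapMode.READ_WRITE)); // z
    }
}
```

The call to force() is there because the Javadoc only promises that changes are "eventually" propagated; force() requests that the mapped content be written to the storage device.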

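3. File locking

JDK 1.4 introduced file locking, which synchronizes access to a file used as a shared resource. File locks are visible to other operating system processes, because Java file locking maps directly onto the native locking facility of the operating system.

FileChannel's tryLock() and lock() methods acquire a FileLock; the no-argument forms lock the whole file. The signatures are shown below. tryLock() is non-blocking and returns null if the lock cannot be acquired; lock() blocks until the lock is granted. SocketChannel, DatagramChannel, and ServerSocketChannel need no locking, because they are inherently single-process entities and a socket is not normally shared between two processes.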

public abstract FileLock lock(long position, long size, boolean shared)
        throws IOException;

public final FileLock lock() throws IOException {
        return lock(0L, Long.MAX_VALUE, false);
    }

public abstract FileLock tryLock(long position, long size, boolean shared)
        throws IOException;

public final FileLock tryLock() throws IOException {
        return tryLock(0L, Long.MAX_VALUE, false);
    }
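A short sketch of how the two calls behave. TryLockDemo and its methods are hypothetical names for this illustration. One detail worth knowing, since the later example in this post uses threads: file locks are held on behalf of the whole Java virtual machine, so a second overlapping request from the same JVM is neither queued nor answered with null; it fails with OverlappingFileLockException.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TryLockDemo {

    private static FileChannel open(Path file) throws IOException {
        return FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE);
    }

    // Non-blocking attempt to lock the whole file exclusively.
    // Returns false if another process already holds an overlapping lock.
    static boolean tryExclusiveLock(Path file) throws IOException {
        // A null lock is allowed in try-with-resources; close() is skipped for it
        try (FileChannel ch = open(file);
             FileLock lock = ch.tryLock()) {
            return lock != null && lock.isValid() && !lock.isShared();
        }
    }

    // Holds a blocking lock() on one channel, then tries to lock the same
    // file through a second channel in the same JVM.
    static boolean secondLockInSameJvmThrows(Path file) throws IOException {
        try (FileChannel first = open(file);
             FileChannel second = open(file);
             FileLock held = first.lock()) {
            try {
                second.tryLock();
                return false;
            } catch (OverlappingFileLockException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("lock", ".dat");
        file.toFile().deleteOnExit();
        System.out.println(tryExclusiveLock(file));
        System.out.println(secondLockInSameJvmThrows(file));
    }
}
```

FileLock implements AutoCloseable, so try-with-resources releases the lock even if the guarded code throws.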

A FileLock is a token representing a lock on a region of a file. It is created by the locking methods of FileChannel and AsynchronousFileChannel, and it has four fields:

public abstract class FileLock implements AutoCloseable {

    private final Channel channel;
    private final long position;
    private final long size;
    private final boolean shared;

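The locked region is determined by position and size: it starts at position and covers size bytes. The region is fixed and does not change when the file grows or shrinks. The shared flag distinguishes a shared lock from an exclusive lock.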
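Because the second argument of lock() is a size rather than an end offset, it is easy to lock the wrong range. The sketch below (LockRegionDemo and lockedRegionOverlaps are hypothetical names) locks a region and uses the standard FileLock.overlaps method to ask which ranges it covers.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class LockRegionDemo {

    // Takes an exclusive lock on [lockPos, lockPos + lockSize) and reports
    // whether that region overlaps [queryPos, queryPos + querySize).
    static boolean lockedRegionOverlaps(Path file, long lockPos, long lockSize,
                                        long queryPos, long querySize) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                 StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileLock lock = ch.lock(lockPos, lockSize, false)) {
            return lock.overlaps(queryPos, querySize);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("region", ".dat");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[64]);

        // Lock 20 bytes starting at offset 10, that is bytes 10..29
        System.out.println(lockedRegionOverlaps(file, 10, 20, 0, 10));  // false: bytes 0..9
        System.out.println(lockedRegionOverlaps(file, 10, 20, 29, 5));  // true:  byte 29 is locked
        System.out.println(lockedRegionOverlaps(file, 10, 20, 30, 5));  // false: bytes 30..34
    }
}
```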

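Locking part of a mapped file

File mapping is typically used for very large files. Locking only part of such a file lets other processes work on the remaining parts at the same time; a database works this way, so that many users can access it at once. The example below uses two threads that each lock a different part of the file. Note that file locks are held on behalf of the whole JVM, so they coordinate access between processes; the threads here only demonstrate locking separate regions.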

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.channels.FileLock;

/**
 * Locking parts of a mapped file.
 * Map the file:  MappedByteBuffer out = fc.map(MapMode.READ_WRITE, 0, LENGTH);
 * Lock a region: FileLock fl = fc.lock(start, end - start, false);    fl.release();
 * @author bob
 *
 */
public class LockingMappedFiles {

    private static final int LENGTH = 0x0000FFF; // 4095 bytes
    static FileChannel fc;

    public static void main(String[] args) throws IOException, InterruptedException {
        // 1. Open a read/write FileChannel
        fc = new RandomAccessFile("data.dat", "rw").getChannel();
        // 2. Map the whole file in read/write mode
        MappedByteBuffer out = fc.map(MapMode.READ_WRITE, 0, LENGTH);
        // 3. Fill it with the character x
        for (int i = 0; i < LENGTH; i++) {
            out.put((byte) 'x');
        }
        // 4. Thread 1 locks the first third of the file and modifies it through the buffer
        Thread thread1 = new LockAndModify(out, 0, LENGTH / 3);
        thread1.start();
        // 5. Thread 2 locks the last third of the file and modifies it through the buffer
        Thread thread2 = new LockAndModify(out, LENGTH * 2 / 3, LENGTH);
        thread2.start();
        // 6. Wait for both threads before closing the channel they share
        thread1.join();
        thread2.join();
        fc.close();
    }

    static class LockAndModify extends Thread {
        private final ByteBuffer byteBuffer;
        private final int start, end;

        public LockAndModify(ByteBuffer byteBuffer, int start, int end) {
            // Remember the bounds of the region to lock: [start, end)
            this.start = start;
            this.end = end;
            /*
             * 1. Set the limit and position of the MappedByteBuffer.
             * 2. slice() creates a new ByteBuffer that shares the content between
             *    position and limit; its position is 0 and its limit and capacity
             *    are the number of bytes remaining in the original buffer.
             *    The slice is direct and writable here because the mapped buffer is.
             *    Changes to the slice are visible in the original ByteBuffer.
             * 3. Set the limit before the position; otherwise the position may
             *    exceed the current limit and position() throws.
             */
            byteBuffer.limit(end);
            byteBuffer.position(start);
            this.byteBuffer = byteBuffer.slice();
        }

        @Override
        public void run() {
            // Exclusive lock; the second argument is a size, not an end offset
            try (FileLock fl = fc.lock(start, end - start, false)) {
                System.out.println("Locked: " + start + " to " + end);
                // Increment every byte of the slice in place ('x' becomes 'y')
                for (int i = 0; i < byteBuffer.limit(); i++) {
                    byteBuffer.put(i, (byte) (byteBuffer.get(i) + 1));
                }
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
            System.out.println("Released: " + start + " to " + end);
        }
    }
}

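After the program runs, the first third and the last third of the characters in data.dat have changed to y.

4. Memory-mapped files perform better than ordinary NIO

(1) The biggest difference between a memory-mapped file and standard I/O is that, although the data still ultimately comes from disk, it does not have to be copied from the OS kernel buffer into a separate user-space buffer. Instead, a region of the process's address space is mapped directly onto the file, so reading and writing the file looks like reading and writing memory, which is why it is fast.

(2) MappedByteBuffer is a special kind of direct buffer. Compared with basic I/O, it saves the cost of copying data through an intermediate buffer. It also lives outside the JVM heap, so it is not limited by the heap size.

(3) ByteBuffer.allocateDirect() creates a direct buffer backed by direct memory, which is allocated outside the heap. The total amount is capped by -XX:MaxDirectMemorySize; when that flag is not set, the cap defaults to roughly the maximum heap size (-Xmx).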
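The three kinds of buffer discussed in points (2) and (3) can be told apart with ByteBuffer.isDirect(). This is a small sketch; DirectnessDemo and mappedBufferIsDirect are hypothetical names, and the temporary file only exists to have something to map.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class DirectnessDemo {

    // Maps the whole file read-only and reports whether the buffer is direct.
    static boolean mappedBufferIsDirect(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size()).isDirect();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("direct", ".dat");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[16]);

        System.out.println(ByteBuffer.allocate(16).isDirect());       // false: backed by a heap array
        System.out.println(ByteBuffer.allocateDirect(16).isDirect()); // true:  off-heap direct memory
        System.out.println(mappedBufferIsDirect(file));               // true:  mapped buffers are direct
    }
}
```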

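For details, see the article "JAVA NIO之淺談內存映射文件原理與DirectMemory" (a brief look at how memory-mapped files work, and DirectMemory).

The article "內存映射文件原理探索" (exploring how memory-mapped files work) describes how data is copied from disk into memory.

Summary

1. Memory-mapped files are mainly used to access and manipulate very large files, and can improve performance;

2. A memory-mapped file is created through a channel; the mapping mode and the mapped region can be specified;

3. Locking a region of a file lets multiple threads or processes modify different regions of a shared file concurrently;

4. MappedByteBuffer is a special kind of direct buffer; changes made to it are reflected in the file;

5. Memory mapping performs better than I/O streams because mmap() maps the file directly into user space, which avoids the extra copy between kernel buffers and user-space buffers.

References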

《Java核心編程》

 

