QEMU's Memory Management for Virtual Machines (Part 1)

Reposted article. Originally published 2018-08-15 16:36. Tag: qemu

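After analyzing how KVM translates a guest's addresses at each level (gva->gpa->hva->hpa), I wanted to understand how qemu handles the same translations, so I went through the data structures and source code that qemu uses to manage guest memory. Quite a few data structures are involved; the gpa->hpa path alone touches MemoryRegion, AddressSpace, MemoryRegionSection, FlatView, FlatRange, RAMBlock, RAMList, and others.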

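The relationships among these structures are confusing at first, so the notes below try to sort them out by following the translation from gpa to hva.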

The qemu source version is qemu-2.8.0.

I. MemoryRegion

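QEMU manages guest memory through MemoryRegion objects. Memory is classified by characteristics such as its attributes and its guest physical address, which yields multiple MemoryRegions; these are organized as a tree hanging off a root MemoryRegion. Each MemoryRegion tree represents one kind of memory, such as the system memory space (system_memory) or the I/O space (system_io); these two are global MemoryRegions in qemu.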

struct MemoryRegion {
    Object parent_obj;

    /* All fields are private - violators will be prosecuted */

    /* The following fields should fit in a cache line */
    bool romd_mode;
    bool ram;
    bool subpage;
    bool readonly; /* For RAM regions */
    bool rom_device;
    bool flush_coalesced_mmio;
    bool global_locking;
    uint8_t dirty_log_mask;
    RAMBlock *ram_block; //points to the corresponding RAMBlock
    Object *owner;
    const MemoryRegionIOMMUOps *iommu_ops;

    const MemoryRegionOps *ops;
    void *opaque;
    MemoryRegion *container; //points to the parent MR
    Int128 size; //size of the region
    hwaddr addr; //offset within the parent MR
    void (*destructor)(MemoryRegion *mr);
    uint64_t align;
    bool terminates;
    bool ram_device;
    bool enabled;
    bool warning_printed; /* For reservations */
    uint8_t vga_logging_count;
    MemoryRegion *alias; //points to the entity (backing) MR
    hwaddr alias_offset; //offset of this alias's start within the entity MemoryRegion
    int32_t priority;
    QTAILQ_HEAD(subregions, MemoryRegion) subregions; //head of the subregion list
    QTAILQ_ENTRY(MemoryRegion) subregions_link; //node in the parent's subregion list
    QTAILQ_HEAD(coalesced_ranges, CoalescedMemoryRange) coalesced;
    const char *name;
    unsigned ioeventfd_nb;
    MemoryRegionIoeventfd *ioeventfds;
    QLIST_HEAD(, IOMMUNotifier) iommu_notify;
    IOMMUNotifierFlag iommu_notify_flags;
};

A MemoryRegion represents a range of memory in the guest memory layout. MemoryRegions can be divided into the following three types:

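Root MemoryRegion: initialized directly with memory_region_init; has no memory of its own and is used to manage subregions. Example: system_memory.
Entity MemoryRegion: initialized with memory_region_init_ram; has its own memory (allocated from the QEMU process's address space) of length size. Example: ram_memory (pc.ram). These are the regions for which host memory is actually allocated; the most important one is pc.ram, and firmware regions are backed in the same way. pci_memory (pci), by contrast, is created in pc_init1() with memory_region_init and, as far as I can tell, serves as a container for the PCI address space rather than owning memory itself. Because QEMU is user-space code, the allocation returns an hva, which is stored in the host field of the RAMBlock. The hva can therefore be managed through the RAMBlock attached to an entity MemoryRegion.
Alias MemoryRegion: initialized with memory_region_init_alias; has no memory of its own and represents part of an entity MemoryRegion (such as pc.ram). Its alias member points to the entity MemoryRegion, and alias_offset is the offset of the alias's start within the entity MemoryRegion. Examples: ram_below_4g, ram_above_4g.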

The MemoryRegion relationships commonly seen in the code are:

                   alias
ram_memory (pc.ram) - ram_below_4g(ram-below-4g)
                    - ram_above_4g(ram-above-4g)

                     sub
system_memory(system) - ram_below_4g(ram-below-4g)
                      - ram_above_4g(ram-above-4g)
                      - pcms->hotplug_memory.mr        hot-plug memory

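In practice the guest's RAM is allocated in one go as a single, complete block recorded in one MR. It is then carved up by size into subregions; each subregion's alias points back to the original MR, and alias_offset is its offset within the original RAM. For the system address space, these subregions are registered under the global MR system_memory, which becomes their parent (container); subregions_link is the list node, and addr is the subregion's offset relative to its parent. pc_memory_init() initializes both the entity MemoryRegion and the alias MemoryRegions: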

void pc_memory_init(PCMachineState *pcms,
                    MemoryRegion *system_memory,
                    MemoryRegion *rom_memory,
                    MemoryRegion **ram_memory)
{
    int linux_boot, i;
    MemoryRegion *ram, *option_rom_mr;
    MemoryRegion *ram_below_4g, *ram_above_4g;
    FWCfgState *fw_cfg;
    MachineState *machine = MACHINE(pcms);
    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);

    assert(machine->ram_size == pcms->below_4g_mem_size +
                                pcms->above_4g_mem_size);

    linux_boot = (machine->kernel_filename != NULL);

    /* Allocate RAM.  We allocate it as a single memory region and use
     * aliases to address portions of it, mostly for backwards compatibility
     * with older qemus that used qemu_ram_alloc().
     */
    ram = g_malloc(sizeof(*ram));
    memory_region_allocate_system_memory(ram, NULL, "pc.ram",
                                         machine->ram_size); //initialize the entity MR pc.ram
    *ram_memory = ram;
    ram_below_4g = g_malloc(sizeof(*ram_below_4g));
    memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram,
                             0, pcms->below_4g_mem_size); //initialize the alias MR ram_below_4g: alias points to ram, alias_offset is 0
    memory_region_add_subregion(system_memory, 0, ram_below_4g); //add the alias MR ram_below_4g as a subregion of system_memory at offset (addr) 0
    e820_add_entry(0, pcms->below_4g_mem_size, E820_RAM);
    if (pcms->above_4g_mem_size > 0) {
        ram_above_4g = g_malloc(sizeof(*ram_above_4g));
        memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
                                 pcms->below_4g_mem_size,
                                 pcms->above_4g_mem_size); //initialize the alias MR ram_above_4g: alias points to ram, alias_offset is below_4g_mem_size
        memory_region_add_subregion(system_memory, 0x100000000ULL,
                                    ram_above_4g); //as for ram_below_4g: add ram_above_4g at offset (addr) 0x100000000ULL, i.e. 4 GiB
        e820_add_entry(0x100000000ULL, pcms->above_4g_mem_size, E820_RAM);
    }
    ......
}

void memory_region_init_alias(MemoryRegion *mr,
                              Object *owner,
                              const char *name,
                              MemoryRegion *orig,
                              hwaddr offset,
                              uint64_t size)
{
    memory_region_init(mr, owner, name, size);
    mr->alias = orig; //the alias MR's alias points to the original entity MR
    mr->alias_offset = offset; //alias_offset is the offset within that MR
}
static void memory_region_add_subregion_common(MemoryRegion *mr,
                                               hwaddr offset,
                                               MemoryRegion *subregion)
{
    assert(!subregion->container);
    subregion->container = mr;
    subregion->addr = offset; //addr is set to the offset within the parent
    memory_region_update_container_subregions(subregion);
}

void memory_region_add_subregion(MemoryRegion *mr,
                                 hwaddr offset,
                                 MemoryRegion *subregion)
{
    subregion->priority = 0;
    memory_region_add_subregion_common(mr, offset, subregion);
}

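So a subregion's addr is its offset relative to the parent MR: 0 for ram_below_4g and 4 GiB (0x100000000) for ram_above_4g. alias_offset is the offset relative to the entity MR: 0 for ram_below_4g and pcms->below_4g_mem_size for ram_above_4g. The latter is the amount of RAM placed below 4 GiB, not 4 GiB itself: pc.ram is one contiguous block, while ram_above_4g is mapped at the 4 GiB mark in the guest physical layout regardless of how much RAM sits below it.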
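The distinction between addr and alias_offset can be sketched with a toy model. The following is a minimal sketch, not QEMU code: ToyAlias, toy_pc_layout and toy_gpa_to_ram_offset are hypothetical names, and the 3 GiB / 1 GiB split mentioned afterwards is only an assumed example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an alias MemoryRegion; only the fields discussed above. */
typedef struct ToyAlias {
    uint64_t addr;         /* offset inside the parent (system_memory) = GPA base */
    uint64_t alias_offset; /* offset inside the backing region (pc.ram) */
    uint64_t size;         /* length of the alias */
} ToyAlias;

/* Mirror the two memory_region_init_alias/add_subregion calls in pc_memory_init. */
static void toy_pc_layout(ToyAlias out[2], uint64_t below, uint64_t above)
{
    assert(below <= 0x100000000ULL); /* the low alias must end at or before 4 GiB */
    out[0] = (ToyAlias){ .addr = 0, .alias_offset = 0, .size = below };
    out[1] = (ToyAlias){ .addr = 0x100000000ULL, .alias_offset = below,
                         .size = above };
}

/* Translate a GPA into an offset inside pc.ram; -1 if no alias covers it. */
static int64_t toy_gpa_to_ram_offset(const ToyAlias *aliases, size_t n,
                                     uint64_t gpa)
{
    for (size_t i = 0; i < n; i++) {
        const ToyAlias *a = &aliases[i];
        if (gpa >= a->addr && gpa - a->addr < a->size) {
            return (int64_t)(a->alias_offset + (gpa - a->addr));
        }
    }
    return -1; /* e.g. an address inside the hole below 4 GiB */
}
```

With an assumed split of 0xC0000000 bytes below 4 GiB and 0x40000000 bytes above, GPA 0x100000000 maps to offset 0xC0000000 in pc.ram, while GPA 0xC0000000 lies in the hole and is not RAM at all.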

II. RAMBlock

As mentioned above, the hva of the memory qemu allocates for the guest is stored in the host field of a RAMBlock. RAMBlock is defined as follows:

struct RAMBlock {
    struct rcu_head rcu;                                        // used for RCU-deferred reclamation
    struct MemoryRegion *mr;                                    // the corresponding MemoryRegion
    uint8_t *host;                                              // the corresponding HVA
    ram_addr_t offset;                                          // offset in the ram_list (ram_addr_t) address space
    ram_addr_t used_length;                                     // length currently in use
    ram_addr_t max_length;                                      // maximum length
    void (*resized)(const char*, uint64_t length, void *host);  // resize callback
    uint32_t flags;
    /* Protected by iothread lock.  */
    char idstr[256];                                            // id
    /* RCU-enabled, writes protected by the ramlist lock */
    QLIST_ENTRY(RAMBlock) next;                                 // link to the next block in ram_list.blocks
    int fd;                                                     // file descriptor of the backing file
    size_t page_size;                                           // page size, normally the host page size
};

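A RAMBlock represents a range of host virtual memory; its host field points to the virtual address of the allocated RAM, i.e. the hva. All RAMBlocks are linked through their next field, with the list head kept in the global RAMList (ram_list); offset is the block's offset in the ram_list address space. Every RAMBlock has exactly one corresponding MemoryRegion, but note that not every MemoryRegion has a RAMBlock.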

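When pc_memory_init() allocates memory for the entity MemoryRegion, it calls memory_region_allocate_system_memory(). On a non-NUMA configuration this calls allocate_system_memory_nonnuma(), which calls memory_region_init_ram_from_file() when a backing path is given with -mem-path (and memory_region_init_ram() otherwise):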

#ifdef __linux__
void memory_region_init_ram_from_file(MemoryRegion *mr,
                                      struct Object *owner,
                                      const char *name,
                                      uint64_t size,
                                      bool share,
                                      const char *path,
                                      Error **errp)
{
    memory_region_init(mr, owner, name, size);
    mr->ram = true;
    mr->terminates = true;
    mr->destructor = memory_region_destructor_ram;
    mr->ram_block = qemu_ram_alloc_from_file(size, mr, share, path, errp); //the entity MR's ram_block is the RAMBlock returned by qemu_ram_alloc_from_file()
    mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0;
}
#endif

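qemu_ram_alloc_from_file() allocates and sets up the RAMBlock. RAMBlock->host is the return value of file_ram_alloc(), which allocates the memory from the file (or device file) at the given path by calling qemu_ram_mmap(), i.e. via mmap. RAMBlock->host is therefore the starting hva of the allocated memory.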

static void *file_ram_alloc(RAMBlock *block,
                            ram_addr_t memory,
                            const char *path,
                            Error **errp)
{
    ......
    area = qemu_ram_mmap(fd, memory, block->mr->align,
                         block->flags & RAM_SHARED); //mmap the range into qemu's process address space
    if (area == MAP_FAILED) {
        error_setg_errno(errp, errno,
                         "unable to map backing store for guest RAM");
        goto error;
    }
    ......
}

That covers allocation for the entity MemoryRegion "pc.ram". The alias MemoryRegions "ram-below-4g" and "ram-above-4g" are initialized with memory_region_init_alias(), which calls memory_region_init():

void memory_region_init(MemoryRegion *mr,
                        Object *owner,
                        const char *name,
                        uint64_t size)
{
    object_initialize(mr, sizeof(*mr), TYPE_MEMORY_REGION);
    mr->size = int128_make64(size);
    if (size == UINT64_MAX) {
        mr->size = int128_2_64();
    }
    mr->name = g_strdup(name);
    mr->owner = owner;
    mr->ram_block = NULL; //no RAMBlock here; it stays NULL for alias MRs
    .......
}

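This function sets ram_block to NULL, and memory_region_init_alias() never assigns it afterwards, whereas the ram_block of "pc.ram" points to a real block. So not every MemoryRegion has a corresponding RAMBlock. Each allocated RAMBlock is eventually inserted into the global list ram_list.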

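To summarize the analysis of MemoryRegion and RAMBlock: for system memory (ignoring I/O), an entity MemoryRegion owns actual memory, while an alias MemoryRegion refers to one segment of an entity MR through its alias pointer. The alias MRs are all subregions of the root MR system_memory, and through the RAMBlock one can find the hva of the memory behind a MemoryRegion. The relationship is roughly: system_memory -> (subregions) ram-below-4g, ram-above-4g -> (alias) pc.ram -> (ram_block) RAMBlock -> host (hva).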

III. AddressSpace

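From the point of view of GPA and hva, if MemoryRegion is taken as the central structure, a RAMBlock can be seen as tying a region to its hva, while an AddressSpace, as I see it, ties the region to its GPA. Its own comment says as much: "AddressSpace: describes a mapping of addresses to #MemoryRegion objects".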

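One question at this point: in the qemu-2.3.0 source, struct MemoryRegion has a ram_addr field, which I took to be the starting GPA of the region (strictly speaking it is a ram_addr_t, an offset in the RAM list address space rather than a GPA), whereas in qemu-2.8.0 the field is gone. My guess is that for an entity MR, addr is the starting GPA of the region. If so, then for a subregion, alias_offset plus the entity MR's addr would give the subregion's starting GPA, and together with the entity MR's RAMBlock that would already be enough to map GPA to HVA. What, then, is AddressSpace for, and why does it exist? I raise the question here and come back to it below.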

/**
 * AddressSpace: describes a mapping of addresses to #MemoryRegion objects
 */
struct AddressSpace {
    /* All fields are private. */
    struct rcu_head rcu;
    char *name;
    MemoryRegion *root; //points to the root MR
    int ref_count;
    bool malloced;

    /* Accessed via RCU.  */
    struct FlatView *current_map;                               // the FlatView currently in effect; used as the old view for comparison in address_space_update_topology

    int ioeventfd_nb;
    struct MemoryRegionIoeventfd *ioeventfds;
    struct AddressSpaceDispatch *dispatch;                      // resolves a GPA to its MemoryRegionSection (and from there to an HVA)
    struct AddressSpaceDispatch *next_dispatch;
    MemoryListener dispatch_listener;
    QTAILQ_HEAD(memory_listeners_as, MemoryListener) listeners;
    QTAILQ_ENTRY(AddressSpace) address_spaces_link;
};

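struct AddressSpace represents one address space of the virtual machine. Different devices may use different address spaces, but for x86 qemu the two that matter here are address_space_memory and address_space_io, both global AddressSpace variables onto which device addresses are mapped. root points to the root MemoryRegion: for address_space_memory it is the global system_memory, and for address_space_io it is system_io. Since a root MR may have any number of subregions, each AddressSpace generally contains a whole series of MemoryRegions forming a tree.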

current_map in AddressSpace points to the FlatView currently in effect:

/*
 * Note that signed integers are needed for negative offsetting in aliases
 * (large MemoryRegion::alias_offset).
 */
struct AddrRange {
    Int128 start; //start address
    Int128 size; //size
};

/* Range of memory in the global map.  Addresses are absolute. */
struct FlatRange {
    MemoryRegion *mr; //the MR this range belongs to
    hwaddr offset_in_region; //offset of the range's start within that MR
    AddrRange addr; //the address interval this FlatRange covers
    uint8_t dirty_log_mask;
    bool romd_mode;
    bool readonly;
};

/* Flattened global view of current active memory hierarchy.  Kept in sorted
 * order.
 */
struct FlatView {
    struct rcu_head rcu;
    unsigned ref; //reference count; the view is destroyed when it drops to 0
    FlatRange *ranges; //array of FlatRanges
    unsigned nr; //number of FlatRanges
    unsigned nr_allocated;
};

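A FlatView manages all the FlatRanges obtained by flattening an MR tree. ranges is an array recording every FlatRange in the view; each FlatRange corresponds to one interval of guest physical addresses, the FlatRanges do not overlap, and they are stored in the array in address order. The interval itself is described by an AddrRange, which gives a start address and a size. When a memory region changes, memory_region_transaction_commit() runs, followed by address_space_update_topology() and address_space_update_topology_pass(), which together bring the FlatView up to date.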
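Because the ranges are sorted and do not overlap, finding the range that covers an address is a binary search. The following is a minimal sketch under simplifying assumptions: ToyFlatRange uses plain uint64_t instead of Int128, a name string stands in for the MemoryRegion pointer, and toy_flatview_lookup and toy_offset_in_mr are hypothetical helpers, not QEMU functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified FlatRange: 64-bit fields and a name instead of MemoryRegion *. */
typedef struct ToyFlatRange {
    uint64_t start;            /* first address covered (addr.start) */
    uint64_t size;             /* length of the interval (addr.size) */
    uint64_t offset_in_region; /* where the interval begins inside its MR */
    const char *mr_name;       /* stand-in for the owning MemoryRegion */
} ToyFlatRange;

/* Binary search over a sorted, non-overlapping array; NULL if unmapped. */
static const ToyFlatRange *toy_flatview_lookup(const ToyFlatRange *ranges,
                                               size_t nr, uint64_t addr)
{
    size_t lo = 0, hi = nr;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const ToyFlatRange *fr = &ranges[mid];
        if (addr < fr->start) {
            hi = mid;                 /* target lies in the lower half */
        } else if (addr - fr->start >= fr->size) {
            lo = mid + 1;             /* target lies in the upper half */
        } else {
            return fr;                /* start <= addr < start + size */
        }
    }
    return NULL;
}

/* Offset of addr inside the MemoryRegion that owns the given range. */
static uint64_t toy_offset_in_mr(const ToyFlatRange *fr, uint64_t addr)
{
    assert(addr >= fr->start && addr - fr->start < fr->size);
    return fr->offset_in_region + (addr - fr->start);
}
```

The second helper shows what offset_in_region is for: when overlaps cut one MemoryRegion into several ranges, each range has to remember where in that region it begins.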

[Figure: structure of a FlatView. The image is not preserved; its source was credited in a watermark.]

As the figure shows, the start of each FlatRange's AddrRange is the first GPA of that interval and size is its length. So what is offset_in_region in FlatRange? Is it the FlatRange's offset relative to the MR it belongs to?

The counterpart of FlatRange is MemoryRegionSection:

/**
 * MemoryRegionSection: describes a fragment of a #MemoryRegion
 *
 * @mr: the region, or %NULL if empty
 * @address_space: the address space the region is mapped in
 * @offset_within_region: the beginning of the section, relative to @mr's start
 * @size: the size of the section; will not exceed @mr's boundaries
 * @offset_within_address_space: the address of the first byte of the section
 *     relative to the region's address space
 * @readonly: writes to this section are ignored
 */
struct MemoryRegionSection {
    MemoryRegion *mr;                           // the MemoryRegion this section belongs to
    AddressSpace *address_space;                // the AddressSpace it is mapped in
    hwaddr offset_within_region;                // offset of the section's start within the MemoryRegion
    Int128 size;
    hwaddr offset_within_address_space;         // offset within the AddressSpace; for system memory this is the starting GPA
    bool readonly;
};

A MemoryRegionSection refers to part of a MemoryRegion, [offset_within_region, offset_within_region + size), and is the basic unit registered with KVM.

Once the MemoryRegions of an AddressSpace are mapped onto a linear address space, overlaps may cut an originally contiguous region into fragments; these fragments are the MemoryRegionSections.

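offset_within_region is the section's offset inside the MR it belongs to. An address space may be made up of many MRs, so this offset is local. offset_within_address_space is the offset within the whole address space and is therefore global; if the AddressSpace is system memory, it is the section's starting GPA.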

With this in place, kvm_set_phys_mem(), which assembles a KVMSlot and passes qemu's memory layout to KVM through kvm_userspace_memory_region, shows how GPA and HVA correspond across the structures above:

static void kvm_set_phys_mem(KVMMemoryListener *kml,
                             MemoryRegionSection *section, bool add)
{
    KVMState *s = kvm_state;
    KVMSlot *mem, old;
    int err;
    MemoryRegion *mr = section->mr;
    bool writeable = !mr->readonly && !mr->rom_device;
    hwaddr start_addr = section->offset_within_address_space; //starting GPA
    ram_addr_t size = int128_get64(section->size);
    void *ram = NULL;
    unsigned delta;

    /* kvm works in page size chunks, but the function may be called
       with sub-page size and unaligned start address. Pad the start
       address to next and truncate size to previous page boundary. */
    delta = qemu_real_host_page_size - (start_addr & ~qemu_real_host_page_mask);
    delta &= ~qemu_real_host_page_mask;
    if (delta > size) {
        return;
    }
    start_addr += delta; //page-alignment fix-up
    size -= delta;
    size &= qemu_real_host_page_mask;
    if (!size || (start_addr & ~qemu_real_host_page_mask)) {
        return;
    }

    if (!memory_region_is_ram(mr)) {
        if (writeable || !kvm_readonly_mem_allowed) {
            return;
        } else if (!mr->romd_mode) {
            /* If the memory device is not in romd_mode, then we actually want
             * to remove the kvm memory slot so all accesses will trap. */
            add = false;
        }
    }

    ram = memory_region_get_ram_ptr(mr) + section->offset_within_region + delta; //starting HVA
  .......
}

GPA: the function receives a MemoryRegionSection. The section's offset within its AddressSpace, offset_within_address_space, plus the page-alignment fix-up (delta) gives the section's GPA, which is stored in start_addr.

HVA: the hva is the starting HVA of the MR that the section belongs to, plus the section's offset within that MR (offset_within_region), plus the page-alignment fix-up (delta).
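The alignment arithmetic is easier to follow with concrete numbers. The following is a minimal sketch of the same steps using plain integers: ToySlot and toy_align_section are hypothetical names, the host address is modelled as a uint64_t instead of a pointer, and page_mask plays the role of qemu_real_host_page_mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Result of the fix-up: what would be handed to KVM as one memory slot. */
typedef struct ToySlot {
    uint64_t gpa;  /* page-aligned guest physical start */
    uint64_t size; /* length, rounded down to whole pages */
    uint64_t hva;  /* host virtual address matching gpa */
} ToySlot;

/* Returns false when no whole page remains after alignment. */
static bool toy_align_section(uint64_t page_size, uint64_t mr_host_base,
                              uint64_t offset_within_address_space,
                              uint64_t offset_within_region,
                              uint64_t size, ToySlot *out)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uint64_t page_mask = ~(page_size - 1);
    uint64_t start = offset_within_address_space;

    /* Distance from start to the next page boundary; 0 if already aligned. */
    uint64_t delta = page_size - (start & ~page_mask);
    delta &= ~page_mask;
    if (delta > size) {
        return false;
    }
    start += delta;     /* pad the start up to the boundary */
    size -= delta;
    size &= page_mask;  /* truncate the tail to whole pages */
    if (size == 0) {
        return false;
    }
    out->gpa = start;
    out->size = size;
    out->hva = mr_host_base + offset_within_region + delta;
    return true;
}
```

With 4 KiB pages, a section starting at 0x1234 with size 0x3000 gives delta = 0xdcc, so the slot starts at GPA 0x2000 and covers 0x2000 bytes; the same delta is added on the host side so GPA and HVA stay in step.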

The starting HVA of the section's MR comes from memory_region_get_ram_ptr():

void *memory_region_get_ram_ptr(MemoryRegion *mr)
{
    void *ptr;
    uint64_t offset = 0;

    rcu_read_lock();
    while (mr->alias) { //walk up to the entity MR
        offset += mr->alias_offset;
        mr = mr->alias;
    }
    assert(mr->ram_block);
    ptr = qemu_map_ram_ptr(mr->ram_block, offset); //the entity MR has a RAMBlock
    rcu_read_unlock();

    return ptr;
}

void *qemu_map_ram_ptr(RAMBlock *ram_block, ram_addr_t addr)
{
    RAMBlock *block = ram_block;

    if (block == NULL) {
        block = qemu_get_ram_block(addr);
        addr -= block->offset;
    }

    if (xen_enabled() && block->host == NULL) {
        /* We need to check if the requested address is in the RAM
         * because we don't want to map the entire memory in QEMU.
         * In that case just map until the end of the page.
         */
        if (block->offset == 0) {
            return xen_map_cache(addr, 0, 0);
        }

        block->host = xen_map_cache(block->offset, block->max_length, 1);
    }
    return ramblock_ptr(block, addr);
}

static inline void *ramblock_ptr(RAMBlock *block, ram_addr_t offset)
{
    assert(offset_in_ramblock(block, offset));
    return (char *)block->host + offset; //host base plus the accumulated offset gives the final hva
}

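In memory_region_get_ram_ptr(), if the current MR is an alias of another MR, the loop follows the alias chain until it reaches a non-alias (entity) region. Summing alias_offset along the way gives the current region's offset within the entity region. Since the entity region has a RAMBlock, qemu_map_ram_ptr() then adds that total offset to the host pointer of the RAMBlock, yielding the current region's starting HVA.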
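The alias walk can be reproduced in isolation. The following is a minimal sketch with hypothetical toy types (ToyRamBlock, ToyMR, toy_get_ram_ptr); it leaves out RCU locking and the Xen path and keeps only the offset accumulation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy RAMBlock: just the host pointer and its length. */
typedef struct ToyRamBlock {
    uint8_t *host;
    uint64_t max_length;
} ToyRamBlock;

/* Toy MemoryRegion: an alias has alias != NULL and no ram_block. */
typedef struct ToyMR {
    struct ToyMR *alias;
    uint64_t alias_offset;
    ToyRamBlock *ram_block;
} ToyMR;

/* Follow the alias chain to the entity region and return the start HVA. */
static void *toy_get_ram_ptr(ToyMR *mr)
{
    uint64_t offset = 0;

    while (mr->alias) {
        offset += mr->alias_offset; /* accumulate the offset of each level */
        mr = mr->alias;
    }
    assert(mr->ram_block);                      /* entity region owns memory */
    assert(offset < mr->ram_block->max_length); /* stay inside the block */
    return mr->ram_block->host + offset;
}
```

For an alias at alias_offset 16 into a backing region, and a second alias at alias_offset 8 into the first, the function returns host + 16 and host + 24 respectively.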

In qemu_map_ram_ptr(), if the ram_block argument is NULL, addr is instead treated as a ram_addr_t in the ram_list address space, and the matching RAMBlock is looked up with qemu_get_ram_block():

static RAMBlock *qemu_get_ram_block(ram_addr_t addr)
{
    RAMBlock *block;

    block = atomic_rcu_read(&ram_list.mru_block); //check the most recently used block first
    if (block && addr - block->offset < block->max_length) { //addr is a ram_addr_t; the unsigned test holds exactly when block->offset <= addr < block->offset + block->max_length
        return block;
    }
    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { //not in the most recently used block, so walk every block in ram_list
        if (addr - block->offset < block->max_length) {
            goto found;
        }
    }

    fprintf(stderr, "Bad ram offset %" PRIx64 "\n", (uint64_t)addr);
    abort();
    .......
}

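Each RAMBlock occupies the interval [offset, offset + max_length) of the ram_list address space. Because the subtraction addr - block->offset is unsigned, it wraps to a very large value when addr is below block->offset, so the single comparison addr - block->offset < block->max_length checks both bounds at once. If it holds, addr falls inside that block; otherwise the search moves on to the next one. For example, if the first block has offset 0 and addr is smaller than its length, that block is the one containing addr.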
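The range test can be tried on a small table. The following is a minimal sketch with hypothetical names (ToyBlock, toy_find_block, toy_offset_in_block) and made-up block sizes; unlike qemu_get_ram_block(), it returns NULL rather than aborting when no block matches, and it has no most-recently-used cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy RAMBlock: its interval in the ram_list address space plus a label. */
typedef struct ToyBlock {
    uint64_t offset;     /* start of the block in ram_addr_t space */
    uint64_t max_length; /* length of the block */
    const char *idstr;   /* label, for illustration only */
} ToyBlock;

/* Linear scan; returns the block containing addr or NULL. */
static const ToyBlock *toy_find_block(const ToyBlock *blocks, size_t n,
                                      uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        /* Unsigned subtraction wraps when addr < offset, so this one
         * comparison rejects addresses on either side of the block. */
        if (addr - blocks[i].offset < blocks[i].max_length) {
            return &blocks[i];
        }
    }
    return NULL;
}

/* Offset of addr inside the block, as used to index block->host. */
static uint64_t toy_offset_in_block(const ToyBlock *b, uint64_t addr)
{
    assert(addr - b->offset < b->max_length);
    return addr - b->offset;
}
```

With blocks at [0, 0x8000), [0x8000, 0xA000) and [0xA000, 0xB000), address 0x8fff resolves to the second block at offset 0xfff, and address 0xB000 matches nothing.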

Some guesses and open questions:

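1. Guest GPAs start at 0. From the initialization of system memory (ignoring I/O), one large region "pc.ram" and its RAMBlock are allocated up front, so my guess is that addr of the MemoryRegion "pc.ram" is its starting GPA, namely 0, and that the sum of the alias_offset values from any other region down to this entity region is that region's starting GPA. Also, offset_within_address_space in MemoryRegionSection is the offset within the owning AddressSpace and, for system memory, the starting GPA. So does the sum of the alias_offset values of each level of subregion, plus the entity MR's addr, equal offset_within_address_space? I suspect it does but am not sure; this could be checked experimentally. One case worth testing is ram_above_4g, whose addr is 0x100000000 while its alias_offset is below_4g_mem_size, so the two need not coincide.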

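2. If the guess above holds, MemoryRegion and RAMBlock alone would give the GPA-to-HVA correspondence, which brings back the earlier question: what is AddressSpace for? Reading the source, an AddressSpace has listeners bound to it that fire when the memory layout changes, so its role cannot be judged from the GPA-to-HVA mapping alone. The two global AddressSpaces (address_space_io and address_space_memory) string together all MemoryRegions belonging to system memory and I/O, and the relevant listeners are triggered when memory changes. In my view, AddressSpace therefore makes it easier to manage MemoryRegions at every level without registering listeners on each MemoryRegion individually. The source also suggests that the offsets kept in MemoryRegion are used mainly to find a region's offset from the starting hva, and hence its own starting hva, whereas AddressSpace is used mainly to obtain the starting GPA. (If the entity MR's addr were the starting GPA, the summed offsets to the entity MR could also give a region's starting GPA, but the source does not do it this way, because the AddressSpace-side fields already express the starting GPA.)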

3. FlatRange.addr.start already gives the starting GPA of a FlatRange, so what is offset_in_region in the same structure? Is it the offset relative to the owning MR, and what is it for? listener_add_address_space() gives a partial answer:

static void listener_add_address_space(MemoryListener *listener,
                                       AddressSpace *as)
{
    FlatView *view;
    FlatRange *fr;
    .......

    view = address_space_get_flatview(as); //get the FlatView of this address space
    FOR_EACH_FLAT_RANGE(fr, view) { //iterate over every FlatRange in the FlatView
        MemoryRegionSection section = { //build a MemoryRegionSection from the FlatRange
            .mr = fr->mr,
            .address_space = as,
            .offset_within_region = fr->offset_in_region, 
            .size = fr->addr.size,
            .offset_within_address_space = int128_get64(fr->addr.start),
            .readonly = fr->readonly,
        };     
    ......
}

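The code above also shows how FlatRange and MemoryRegionSection correspond. offset_within_region in MemoryRegionSection is simply the FlatRange's offset_in_region, so both denote the offset within the owning MR; if the owning MR is a global MR, it is the offset within that global MR. Likewise, offset_within_address_space in MemoryRegionSection is FlatRange.addr.start, the starting GPA.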

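One addition, on how the HVA is found from a GPA on a VM exit: section 4 of https://www.anquanke.com/post/id/86412 analyzes this. The mechanism is the 6-level page table PhysPageMap in AddressSpaceDispatch, whose last level points to a MemoryRegionSection; the MemoryRegionSection gives the MR for the GPA, and from that the HVA.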

Later posts will look at what the listeners registered on an AddressSpace do, how qemu passes its memory-management information to KVM, and how the view is updated.

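The above is only a tidy-up of some of the data structures qemu uses to manage guest memory. My understanding and analysis are incomplete, some questions and guesses remain, and mistakes are likely; questions and corrections are welcome.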

References: https://www.cnblogs.com/ck1020/p/6729224.html

https://www.binss.me/blog/qemu-note-of-memory/

http://oenhan.com/qemu-memory-struct

https://blog.csdn.net/leoufung/article/details/48781205
