MIT 6.S081 2021: Lab mmap

Reposted article, 2021-11-22 23:15

mmap

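mmap maps the file behind a file descriptor fd into a region of the process's address space. Once the mapping exists, reading and writing that memory behaves like reading and writing the file. We follow the lab hints: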

Implement mmap: find an unused region in the process's address space in which to map the file, and add a VMA to the process's table of mapped regions. The VMA should contain a pointer to a struct file for the file being mapped; mmap should increase the file's reference count so that the structure doesn't disappear when the file is closed (hint: see filedup). Run mmaptest: the first mmap should succeed, but the first access to the mmap-ed memory will cause a page fault and kill mmaptest.

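That is the design of mmap. The first step is to "find an unused region" in the process's address space. The only unused region that is certainly legal to access is the range above the top of the heap, p->sz. In lab lazy, sbrk() grows and shrinks the heap by modifying p->sz, so the mmap region is always carved out relative to p->sz.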

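Next, the process needs a VMA table, which stores the information for each mapping returned by mmap(). Every process has its own VMA table, so it naturally belongs in struct proc. The VMA structure looks like this: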

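typedef struct vma{
  struct file* mmapfile;    // pointer to the file mapped by mmap()
  struct inode* ip;         // inode of that file
  uint64 mmapaddr;          // start address assigned to the mapping by mmap()
  uint64 mmapend;           // end address of the mapping (exclusive)
  uint64 mmlength;          // remaining length of the mapping; note that this value changes
  int mmprot;               // prot argument passed to mmap()
  int mmflag;               // flags argument passed to mmap()
  int valid;                // 1 if this entry is in use, 0 if it is free
}vma;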

Add the following array to struct proc:

vma map_region[16];

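Now design the sys_mmap system call. According to the hints, mmap() must not allocate pages or read the file; both have to happen on a page fault, which means that work belongs in usertrap(). The main job of mmap() is to assign addresses. The steps are: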

Read mmap()'s arguments from p->trapframe. Note that addr and offset are always 0 in this lab, so they do not need to be fetched.

Find a free slot in map_region and write the already-known arguments into it. A simple mapalloc() function does the search and returns -1 if no slot is free:

static int
mapalloc()
{
  int i;
  struct proc *p = myproc();
  for(i = 0; i < 16; i++){  // 16 is the size of map_region[], which is unrelated to NOFILE
    if(p->map_region[i].valid == 0){
      p->map_region[i].valid=1;
      return i;
    }
  }
  return -1;
}

Use filedup() to increment the reference count of the mapped file.
Pick a start address addr for the mapping, record mmapaddr and mmapend, and return addr.

The code for sys_mmap() is:

uint64 sys_mmap(void)
{
  struct proc *p = myproc();
  // Fetch the arguments (addr and offset are always 0 in this lab)
  uint64 fail=(uint64)((char*)-1);
  uint64 addr;
  uint64 length=p->trapframe->a1;
  int prot=p->trapframe->a2;
  int flags=p->trapframe->a3;
  int fd=p->trapframe->a4;

  // Reject descriptors that do not refer to an open file
  if(fd<0 || fd>=NOFILE || p->ofile[fd]==0){
    return fail;
  }
  // A read-only file must not be mapped with both MAP_SHARED and PROT_WRITE
  if((p->ofile[fd]->writable)==0 && (flags&MAP_SHARED)&&(prot&PROT_WRITE)){
    return fail;
  }

  // Find a free slot in map_region
  int idx=mapalloc();
  if(idx<0){
    return fail;  // the VMA table is full
  }
  //printf("%d idx\n",idx);
  // Initialize the slot
  p->map_region[idx].mmlength=length;
  p->map_region[idx].mmprot=prot;
  p->map_region[idx].mmflag=flags;
  p->map_region[idx].mmapfile=p->ofile[fd];
  p->map_region[idx].ip=p->ofile[fd]->ip;
  //file ref++
  filedup(p->ofile[fd]);

  // Pick an address for the mapping
  addr=PGROUNDUP(p->sz);
  p->sz+=PGROUNDUP(length);
  // Record the range covered by this mapping
  p->map_region[idx].mmapaddr=addr;
  p->map_region[idx].mmapend=addr+PGROUNDUP(length);
  //printf("mmap range %p---%p\n",p->map_region[idx].mmapaddr,p->map_region[idx].mmapend);
  return addr;
}

A few notes:

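mmaptest checks the case of mapping a read-only file. If a file is opened with O_RDONLY, enabling MAP_SHARED and PROT_WRITE together would mean that the mapped region is writable and that modified pages must be written back to the file. That conflicts with O_RDONLY, so mmap() must return the error value 0xffffffffffffffff.
How the mapping address is chosen: PGROUNDUP(p->sz) is used as the start address addr, and PGROUNDUP(length) bytes are reserved for the mapping, so the mapped range is page-aligned. p->sz must be increased by PGROUNDUP(length) right away rather than later in usertrap(). Otherwise, when a process calls mmap() several times in a row, every call would hand out the same start address and the mappings would overlap.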
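The two notes above can be modelled outside the kernel. The following is only a sketch under stated assumptions: `struct model_proc`, `model_mmap()` and `MAP_FAILED_ADDR` are hypothetical names invented for illustration, and the flag values are assumed to match the lab's kernel/fcntl.h. It shows that bumping sz inside the call gives consecutive mappings disjoint, page-aligned ranges, and that a shared writable mapping of a read-only file is refused without changing sz.

```c
#include <stdint.h>

#define PGSIZE 4096
#define PGROUNDUP(sz) (((sz) + PGSIZE - 1) & ~((uint64_t)PGSIZE - 1))

/* Flag values assumed to match the lab's kernel/fcntl.h. */
#define PROT_WRITE  0x2
#define MAP_SHARED  0x01
#define MAP_PRIVATE 0x02

#define MAP_FAILED_ADDR ((uint64_t)-1)

/* Hypothetical stand-in for the one field of struct proc that matters here. */
struct model_proc {
  uint64_t sz;  /* current top of the user address space */
};

/* Returns the start address of the new mapping, or MAP_FAILED_ADDR. */
uint64_t
model_mmap(struct model_proc *p, uint64_t length, int prot, int flags,
           int file_writable)
{
  /* A shared, writable mapping of a read-only file could never be written back. */
  if (!file_writable && (flags & MAP_SHARED) && (prot & PROT_WRITE))
    return MAP_FAILED_ADDR;

  uint64_t addr = PGROUNDUP(p->sz);
  /* Reserve the range immediately so the next call starts above it. */
  p->sz += PGROUNDUP(length);
  return addr;
}
```

Starting from sz = 5000, a 6000-byte mapping lands at 8192 and occupies two pages, and the next mapping starts at 16384, right where the first one ends.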

Now design the page-fault mechanism. It can follow the solution from lab lazy:

void
usertrap(void)
{
  int which_dev = 0;

  if((r_sstatus() & SSTATUS_SPP) != 0)
    panic("usertrap: not from user mode");

  // send interrupts and exceptions to kerneltrap(),
  // since we're now in the kernel.
  w_stvec((uint64)kernelvec);

  struct proc *p = myproc();
  
  // save user program counter.
  p->trapframe->epc = r_sepc();
  
  if(r_scause() == 8){
    // system call

    if(p->killed)
      exit(-1);

    // sepc points to the ecall instruction,
    // but we want to return to the next instruction.
    p->trapframe->epc += 4;

    // an interrupt will change sstatus &c registers,
    // so don't enable until done with those registers.
    intr_on();

    syscall();
  } else if((which_dev = devintr()) != 0){
    // ok
  } 
  else if(r_scause()==13||r_scause()==15)
  {
    uint64 stval=r_stval();
    
    // Find which mapped region contains stval
    int idx=findmap(stval);
   
    if(idx>=0)
    {
      int PTEword=PTE_U;
      int prot=(p->map_region[idx]).mmprot;
      uint64 length=(p->map_region[idx]).mmlength;
      // Build the PTE permission bits from prot
      if(prot&PROT_READ)
        PTEword|=PTE_R;
      if(prot&PROT_WRITE)
        PTEword|=PTE_W;      
      if(prot&PROT_EXEC)
        PTEword|=PTE_X;      
      
      uint64 sz=(p->map_region[idx]).mmapaddr;
      uint64 newsz=(p->map_region[idx]).mmapend;
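      if((newsz=mmapalloc(p->pagetable, sz, newsz,PTEword))==0){
        // Out of memory: kill the process rather than read into unmapped pages
        printf("usertrap(): mmapalloc failed\n");
        p->killed = 1;
      } else {
        struct inode* ip=p->map_region[idx].ip;
        ilock(ip);
        // user_dst=1: copy file bytes [0, length) to the user address mmapaddr
        readi(ip,1,(p->map_region[idx]).mmapaddr,0,length);
        iunlock(ip);
      }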
      
    } 
    else{
      p->killed = 1;	// no matching VMA: the process must be killed
    }  
  }
  else {
    printf("usertrap(): unexpected scause %p pid=%d\n", r_scause(), p->pid);
    printf("            sepc=%p stval=%p\n", r_sepc(), r_stval());
    p->killed = 1;
  }

  if(p->killed)
    exit(-1);

  // give up the CPU if this is a timer interrupt.
  if(which_dev == 2)
    yield();

  usertrapret();
}

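The first step is detecting the page fault. An r_scause() value of 13 (load page fault) or 15 (store page fault) indicates a page fault, and the faulting virtual address is available from r_stval(). When a fault occurs, findmap() searches the VMA table for the mapping that contains the address. findmap() is simple: given an address addr, it walks the VMA table and returns the index of the entry whose range contains addr, or -1 if there is none: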
```
int findmap(uint64 addr)
{
  struct proc *p = myproc();
  int i;
  for(i=0;i<16;i++)
  {
    uint64 a=p->map_region[i].mmapaddr;
    uint64 b=p->map_region[i].mmapend;
    if(addr>=a && addr<b){
      return i;
    }   
  }
  return -1;
}
```
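One detail is essential here. In this lab, a page fault is legitimate only when the process touches part of an mmap() region whose file contents have not been loaded yet. Every other cause of a page fault is an error. So if findmap() returns -1, the process has accessed an address that is not in its page table and that it has no right to access, and it should be terminated. (The kernmem test in usertests checks this case; it took me a lot of effort to track down the cause.)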

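After that, allocate the memory. In this implementation, the first page fault simply loads the whole file into memory. mmapalloc() is a lightly modified uvmalloc() that lets the caller specify the PTE permission bits of the pages. A further benefit of mmapalloc() is that it allocates one contiguous range of virtual addresses.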

uint64
mmapalloc(pagetable_t pagetable, uint64 oldsz, uint64 newsz, int prot)
{
  char *mem;
  uint64 a;

  if(newsz < oldsz)
    return oldsz;

  //oldsz = PGROUNDUP(oldsz);
  for(a = oldsz; a < newsz; a += PGSIZE){
    //printf("maphere\n");
    mem = kalloc();
    if(mem == 0){
      uvmdealloc(pagetable, a, oldsz);
      return 0;
    }
    memset(mem, 0, PGSIZE);

    if(mappages(pagetable, a, PGSIZE, (uint64)mem, prot) != 0){
      kfree(mem);
      uvmdealloc(pagetable, a, oldsz);
      return 0;
    }
  }
  return newsz;
}
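The prot argument that mmapalloc() receives comes from translating mmap()'s protection bits into PTE bits. That translation could be factored into a small helper, sketched below. `prot_to_pte()` is a hypothetical name, and the constants are assumed to match xv6's kernel/riscv.h and the lab's kernel/fcntl.h. PTE_V is deliberately left out because mappages() sets it.

```c
/* Bit values assumed to match xv6's kernel/riscv.h. */
#define PTE_R (1 << 1)
#define PTE_W (1 << 2)
#define PTE_X (1 << 3)
#define PTE_U (1 << 4)

/* Values assumed to match the lab's kernel/fcntl.h. */
#define PROT_READ  0x1
#define PROT_WRITE 0x2
#define PROT_EXEC  0x4

/* Hypothetical helper: translate mmap() prot bits into PTE permission bits.
   PTE_U is always set because the pages belong to user space. */
int
prot_to_pte(int prot)
{
  int perm = PTE_U;
  if (prot & PROT_READ)
    perm |= PTE_R;
  if (prot & PROT_WRITE)
    perm |= PTE_W;
  if (prot & PROT_EXEC)
    perm |= PTE_X;
  return perm;
}
```

For example, `prot_to_pte(PROT_READ | PROT_WRITE)` yields `PTE_U | PTE_R | PTE_W`.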

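Finally, use readi() to read the inode's data into the start address of the mapping. Remember to hold the inode lock (ilock()/iunlock()) around the read.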
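The approach above loads the entire file on the first fault. A lazier variant, which is not what this write-up implements, would load only the page that faulted. The arithmetic for that variant might look like the sketch below. `struct page_load` and `page_for_fault()` are hypothetical names, and the code assumes the mapping starts at file offset 0 (as in this lab) and that the faulting address lies inside the mapping.

```c
#include <stdint.h>

#define PGSIZE 4096
#define PGROUNDDOWN(a) ((a) & ~((uint64_t)PGSIZE - 1))

/* Hypothetical description of the single page to load for one fault. */
struct page_load {
  uint64_t va;        /* page-aligned user address to map */
  uint64_t file_off;  /* offset in the file that backs this page */
  uint64_t nbytes;    /* bytes to read; the rest of the page stays zero */
};

/* mmapaddr: page-aligned start of the mapping; length: mapped length in bytes;
   fault_va: faulting address from stval, assumed to be inside the mapping. */
struct page_load
page_for_fault(uint64_t mmapaddr, uint64_t length, uint64_t fault_va)
{
  struct page_load pl;
  pl.va = PGROUNDDOWN(fault_va);
  pl.file_off = pl.va - mmapaddr;  /* the mapping starts at file offset 0 */
  uint64_t left = length > pl.file_off ? length - pl.file_off : 0;
  pl.nbytes = left < PGSIZE ? left : PGSIZE;
  return pl;
}
```

The handler would then kalloc() one page, map it at `va`, and call readi() with `file_off` and `nbytes` instead of reading the whole file.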

munmap

munmap() has to undo a mapping created by mmap(). Again we follow the hints:

Implement munmap: find the VMA for the address range and unmap the specified pages (hint: use uvmunmap). If munmap removes all pages of a previous mmap, it should decrement the reference count of the corresponding struct file. If an unmapped page has been modified and the file is mapped MAP_SHARED, write the page back to the file. Look at filewrite for inspiration.

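Note that munmap() does not necessarily release the whole mapping. This implementation does assume, however, that munmap() calls release pages from low to high addresses, starting at the beginning of what is left of the mapping. The guarantee given by the lab is: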

An munmap call might cover only a portion of an mmap-ed region, but you can assume that it will either unmap at the start, or at the end, or the whole region (but not punch a hole in the middle of a region).

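The approach for munmap() is straightforward. First use findmap() to locate the matching VMA entry. If the mapping was created with MAP_SHARED, use filewrite() to write the modifications in the mapping back to the file. Use PGROUNDUP(length)/PGSIZE to compute the number of pages to release starting at addr.

Then call uvmunmap() to release those pages and subtract length from p->map_region[idx].mmlength. If mmlength reaches 0, the mapping has been released completely: call fileclose() on the file and clear the VMA entry with memset().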

uint64 sys_munmap(void)
{
  struct proc *p = myproc();
  uint64 addr=p->trapframe->a0;
  uint64 length=p->trapframe->a1;
  //printf("unmap %p:addr %p:length\n",addr,length);
  int idx=findmap(addr);
  if(idx<0)
  {
    return -1;
  }
  int npages=PGROUNDUP(length)/PGSIZE;
  // If MAP_SHARED is set, write the pages back (filewrite() writes at the file's current offset f->off)
  if(p->map_region[idx].mmflag & MAP_SHARED)
  {
    //printf("reach here1\n");
    filewrite(p->map_region[idx].mmapfile, addr, length);
  }
  //printf("reach here2\n");
  uvmunmap(p->pagetable,addr,npages,1);

  p->map_region[idx].mmlength-=length;
  if(p->map_region[idx].mmlength==0)
  {
    fileclose(p->map_region[idx].mmapfile);
    // Clear the VMA entry
    memset((void*)&p->map_region[idx],0,sizeof(vma));
  }

  return 0;
}
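sys_munmap() above tracks only the remaining length, which works when pages are released front to back. An alternative bookkeeping that also handles trimming from the end, as the hint permits, is to shrink the live range [start, end) directly. The sketch below is a user-space model and not the author's code; `struct range` and `range_unmap()` are hypothetical names.

```c
#include <stdint.h>

#define PGSIZE 4096
#define PGROUNDUP(sz) (((sz) + PGSIZE - 1) & ~((uint64_t)PGSIZE - 1))

/* Hypothetical reduced VMA: just the live address range [start, end). */
struct range {
  uint64_t start;
  uint64_t end;
};

/* Shrinks r by the pages covering [addr, addr + length).
   Returns 0 on success, or -1 if the request is empty, falls outside the
   range, or would punch a hole in the middle.
   Afterwards, r->start == r->end means the whole mapping is gone. */
int
range_unmap(struct range *r, uint64_t addr, uint64_t length)
{
  uint64_t stop = addr + PGROUNDUP(length);
  if (addr < r->start || stop > r->end || addr >= stop)
    return -1;
  if (addr == r->start)
    r->start = stop;  /* trim the front (or the whole range) */
  else if (stop == r->end)
    r->end = addr;    /* trim the back */
  else
    return -1;        /* hole in the middle: not supported */
  return 0;
}
```

With this scheme the kernel would drop the file reference once start equals end, instead of testing mmlength against 0.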

fork

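Modify fork() and exit() so that the child process has the parent's mapped regions.

The lab allows the child not to share physical pages with the parent; it is enough for both processes' mapped regions to map the same file.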

So this part is simple: copy the VMA table to the child in fork(). Note: Don't forget to increment the reference count for a VMA's struct file. Insert the following code into fork():

  memmove(&np->map_region, &p->map_region,sizeof(vma)*16);
  for(int idx=0;idx<16;idx++)
  {
    if(p->map_region[idx].valid!=0)  // this slot holds a live mapping
    {
      filedup(p->map_region[idx].mmapfile);
    }
  }
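The effect of that fragment on reference counts can be checked in a toy user-space model. This is only an illustration: `struct model_file`, `struct model_vma` and `vma_fork()` are hypothetical stand-ins for struct file, the VMA entry and the fork() fragment, and incrementing `ref` plays the role of filedup().

```c
#include <string.h>

#define NVMA 16

/* Hypothetical stand-in for struct file: only the reference count. */
struct model_file {
  int ref;
};

/* Hypothetical stand-in for a VMA entry. */
struct model_vma {
  struct model_file *f;
  int valid;
};

/* Copy the parent's VMA table into the child's and take one extra
   reference per live entry, mirroring memmove() followed by filedup(). */
void
vma_fork(struct model_vma *child, const struct model_vma *parent)
{
  memmove(child, parent, sizeof(struct model_vma) * NVMA);
  for (int i = 0; i < NVMA; i++) {
    if (child[i].valid)
      child[i].f->ref++;
  }
}
```

After the copy, parent and child point at the same file object, and its count has gone up by one for each live mapping, so closing the mapping in one process does not free the file under the other.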

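There is one more problem. When fork() copies the page table, it calls uvmcopy() to copy the parent's pages. With only the code above, the kernel panics with "uvmcopy: page not present". The reason is that mmap() has already enlarged sz, but until the mapped addresses are accessed they contain no valid pages, so the range below sz includes pages with PTE_V == 0. uvmcopy() copies every page from 0 to sz, so it touches pages that have not yet been loaded by a page fault and panics.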

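The fix is simple: when the copy loop finds PTE_V clear, skip the page instead of panicking. Make a lightly modified copy of uvmcopy() called mmapcopy() and call it in fork() in place of uvmcopy():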

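for(i = 0; i < sz; i += PGSIZE){
    // A page that was never faulted in may have no PTE at all
    // (its leaf page table may not exist yet), so skip it too.
    if((pte = walk(old, i, 0)) == 0)
      continue;
    // PTE exists but is not valid: nothing to copy.
    if((*pte & PTE_V) == 0)
      continue;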
    pa = PTE2PA(*pte);
    flags = PTE_FLAGS(*pte);
    if((mem = kalloc()) == 0)
      goto err;
    memmove(mem, (char*)pa, PGSIZE);
    if(mappages(new, i, PGSIZE, (uint64)mem, flags) != 0){
      kfree(mem);
      goto err;
    }
  }

exit() has to release all mappings.

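Walk the VMA table. Because a mapping may no longer have its original length, compute offset, the start of the part that is still mapped. In this lab all munmap() calls release adjacent ranges from low to high, so the distance from offset to mmapend is exactly the remaining length (rounded up to a page boundary).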

for(int idx=0;idx<16;idx++)
  {
    if(p->map_region[idx].valid!=0)  // this slot holds a live VMA
    {
      uint64 len=p->map_region[idx].mmlength;
      uint64 offset=p->map_region[idx].mmapend-PGROUNDUP(len);  // round up so offset stays page-aligned for uvmunmap()
      fileclose(p->map_region[idx].mmapfile);
      uvmunmap(p->pagetable,offset,PGROUNDUP(len)/PGSIZE,1);
      memset((void*)&p->map_region[idx],0,sizeof(vma));
    }
  }
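The start-of-remainder calculation is easy to get wrong when the mapped length is not a multiple of the page size, because uvmunmap() requires a page-aligned address. A small sketch of the arithmetic, assuming pages are only ever released from the front; `remaining_start()` and `remaining_pages()` are hypothetical helper names:

```c
#include <stdint.h>

#define PGSIZE 4096
#define PGROUNDUP(sz) (((sz) + PGSIZE - 1) & ~((uint64_t)PGSIZE - 1))

/* First still-mapped address of a VMA whose pages have only been
   unmapped from the front. mmapend is page-aligned; mmlength is the
   remaining length in bytes and may not be a multiple of PGSIZE. */
uint64_t
remaining_start(uint64_t mmapend, uint64_t mmlength)
{
  return mmapend - PGROUNDUP(mmlength);
}

/* Number of pages that are still part of the mapping. */
uint64_t
remaining_pages(uint64_t mmlength)
{
  return PGROUNDUP(mmlength) / PGSIZE;
}
```

For a mapping that ends at 16384 with 6000 bytes remaining, the remainder starts at 8192 and spans two pages. Subtracting the unrounded 6000 would give 10384, which is not page-aligned.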

uvmunmap() will also panic here, so modify it to skip pages whose PTE_V bit is clear (strictly speaking, this should be a new function, as was done above):

for(a = va; a < va + npages*PGSIZE; a += PGSIZE){
    if((pte = walk(pagetable, a, 0)) == 0)
      panic("uvmunmap: walk");
    if((*pte & PTE_V) == 0)
    {
      continue;  // never faulted in: skip this page but keep unmapping the rest
    }
      //panic("uvmunmap: not mapped");
    if(PTE_FLAGS(*pte) == PTE_V)
      panic("uvmunmap: not a leaf");
    if(do_free){
      uint64 pa = PTE2PA(*pte);
      kfree((void*)pa);
    }
    *pte = 0;
  }
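One subtlety when skipping unmapped pages in a loop like this: the loop has to keep going, because a mapped page can follow an unmapped one. Leaving the loop at the first unmapped page would leave later frames allocated. The toy model below illustrates the point with a flat array standing in for leaf PTEs; `model_unmap()` is a hypothetical name and no real page table is involved.

```c
#include <stddef.h>
#include <stdint.h>

#define PTE_V 1UL

/* Toy model: ptes[i] stands for the leaf PTE of page i.
   Clears every valid entry in [first, first + npages) and returns
   how many entries were cleared. */
size_t
model_unmap(uint64_t *ptes, size_t first, size_t npages)
{
  size_t freed = 0;
  for (size_t i = first; i < first + npages; i++) {
    if ((ptes[i] & PTE_V) == 0)
      continue;   /* never faulted in: nothing to free, keep scanning */
    ptes[i] = 0;  /* the kernel would kfree() the frame before this */
    freed++;
  }
  return freed;
}
```

With entries {valid, invalid, valid, valid}, all three valid entries are cleared even though the second page was never mapped.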
