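Preface: I still remember taking the Linux operating system principles course at university. When we reached the chapters on memory, both the lecturer's explanations and the textbook's descriptions were extremely stiff, and I came away thoroughly confused by abstract, obscure concepts such as page faults and the memory management unit. Today I was lucky enough to come across an explanation of virtual memory in modern operating systems that is concise, easy to follow and genuinely excellent, so I decided to post the original text together with a translation on this blog, in the hope that it helps those who come after me when they study or look up these concepts.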
An introduction to virtual memory and its important role in modern operating systems
Computers are complex machines designed to perform a simple task: to run programs — browsers, text editors, web servers, video games, ... — that operate on data — photos, music, text files, databases and so on.
計算機是一種復雜的機器,被設計出來執行一項簡單的任務:運行程序——瀏覽器、文本編輯器、web服務器、視頻游戲...——它們的特征是都要操作數據:照片、音樂、文本文件、數據庫等。
When not in use, such programs and data live peacefully on the hard drive, the device responsible for keeping information alive even when your computer is turned off. Running an application means asking the processor (a.k.a. the Central Processing Unit, or CPU) to read and execute the machine instructions that make up the program, along with any additional data processing.
當不使用時,這些程序和數據會安然地呆在硬盤中,硬盤的責任是確保信息不會丟失,即使你的電腦關閉了。運行應用程序意味着要求 處理器(又名中央處理單元或CPU) 讀取和執行構成計算機程序的機器指令,以及任何額外的數據處理。
Hard drives store a huge amount of information, yet they are terribly slow, far slower than the processor: a CPU that read its instructions directly from a hard drive would spend most of its time waiting, and the drive would become a serious bottleneck for the whole system. For this reason, the program and its data are first copied to the main memory (a.k.a. Random Access Memory, or RAM), another storage component that is smaller than a hard drive but much faster, so that the processor can read instructions from there without speed penalties.
硬盤存儲了大量的信息,但它們的速度卻慢得可怕。比處理器慢得多:直接從硬盤上讀取指令的CPU會成為整個系統的嚴重瓶頸。為此,程序及其數據首先被復制到 主存儲器(又名隨機存取存儲器或RAM),這是另一個比硬盤小但速度快得多的存儲硬件,這樣處理器就可以從那里讀取指令,而不會影響速度。
The main memory can be seen as a long list of cells, each one containing some binary data and marked with a number called the memory address. Memory addresses span from 0 to N, based on the amount of main memory available in the system. The range of addresses used by a program is called the address space.
主存儲器可以看成是一個長長的 單元格列表,每個單元格都包含一些二進制數據,並標有一個稱為 內存地址 的數字。內存地址的范圍從 0 到 N,是根據系統中可用的主存儲器的容量而定的。一個程序使用的地址范圍稱為 地址空間。
(1. Two programs loaded in memory. Each cell is a memory address. Space between program A and program B might be used by other programs or data. 內存加載的兩個程序。每個單元格是一個內存地址。程序A和程序B之間的空間可能被其他程序或數據使用)
Usage of the main memory in early computers
早期計算機對主存儲器的使用
At the beginning of computer history (and still today in some embedded systems), programs had access to the entire main memory and its management was left to the programmer. Writing software for those machines was challenging: part of the developer's job was to devise a good way to manage RAM accesses and make sure that the program would not overflow the available memory.
在計算機歷史的初期(現在的嵌入式系統也是如此),程序可以訪問整個主內存,且內存的管理交由程序員負責。為這些機器編寫軟件很有挑戰性:開發人員的部分工作是設計出一種好的方法來管理RAM訪問,並確保整個程序不會出現內存溢出。
Things got trickier with the advent of multitasking, when multiple programs could run on the same computer. Programmers had to face new critical issues:
隨着多任務處理的出現,事情變得更加棘手,多個程序可以在同一台計算機上同時運行了。程序員們不得不面對新的關鍵問題:
memory layout: programs loaded into RAM after the first one would have their address space offset by a certain amount, no longer in the initial range 0 to N, which was one more pain point to take care of during development;
memory fragmentation: as things are moved in and out of memory, the available space becomes fragmented into smaller and smaller chunks, making it harder to find room to load new programs and data;
security: what if program A accidentally overwrites program B's memory? Or, even worse, what if it deliberately reads sensitive data from another program, such as passwords or credit card information?
內存布局 —— 在第一個程序之后加載時RAM中的程序,其地址空間會有一定的偏移,不再是初始范圍 0 到 N,在開發過程中多了一個需要注意的痛點。
內存碎片化 —— 當東西來回在內存移動時,可用空間會被分割成越來越小的碎片。這將導致為新的程序和數據找到可用空間更加困難。
安全性 —— 如果程序A不小心覆蓋了程序B的內存怎么辦?或者更糟糕的:如果它故意從另一個程序中讀取敏感數據,比如密碼或信用卡信息等敏感數據怎么辦?
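The layout and fragmentation problems can be made concrete with a toy model. The sketch below is only an illustration under simple assumptions: physical memory is a Python list of ten cells, programs are placed with a first-fit search, and the names `first_fit`, `load` and `unload` are hypothetical helpers, not anything from a real operating system.

```python
def first_fit(memory, size):
    """Return the start index of the first run of `size` free cells, or None."""
    run_start, run_len = None, 0
    for i, cell in enumerate(memory):
        if cell is None:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == size:
                return run_start
        else:
            run_len = 0
    return None


def load(memory, name, size):
    """Place program `name` in the first hole that fits; return its start or None."""
    start = first_fit(memory, size)
    if start is not None:
        memory[start:start + size] = [name] * size
    return start


def unload(memory, name):
    """Free every cell owned by program `name`."""
    for i, cell in enumerate(memory):
        if cell == name:
            memory[i] = None


memory = [None] * 10            # ten free cells of physical memory
load(memory, "A", 3)            # A occupies cells 0-2
start_b = load(memory, "B", 3)  # B starts at offset 3, not at 0
load(memory, "C", 3)            # C occupies cells 6-8
unload(memory, "A")
unload(memory, "C")

free_cells = memory.count(None)  # 7 cells are free in total...
start_d = load(memory, "D", 5)   # ...but no hole has 5 contiguous cells
```

Program B ends up at offset 3 rather than 0, and program D cannot be loaded even though seven cells are free, because the free space is split into holes of three and four cells.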
So it was pretty obvious to hardware architects in the early 1960s that a form of automatic memory management could significantly simplify programming and fix the more critical memory protection problem. Eventually they came up with what is known today as virtual memory.
因此,在20世紀60年代初,硬件架構師們很明顯地發現,自動內存管理可以大大簡化編程,並解決更關鍵的內存保護問題。最終,他們設計出了今天所說的虛擬內存。
Virtual memory in a nutshell
虛擬內存簡述
In a virtual memory system, a program does not have direct access to physical RAM. Instead, it interacts with an illusory address space called the virtual address space. The operating system works together with the processor to provide this virtual address space and to convert it, when needed, into the physical one.
在虛擬內存中,程序不能直接訪問物理RAM。相反,它與一個稱為虛擬地址空間的地址空間進行交互。操作系統與處理器一起工作,提供這種虛擬地址空間,並在需要時將其轉換為物理地址空間。
Every memory access is performed through a virtual address that does not refer to the actual physical location in memory. A program always reads or writes virtual addresses, and it is completely unaware of what is going on in the underlying hardware.
每一次內存訪問都是通過一個虛擬地址進行的,而這個虛擬地址並不指向內存中的實際物理位置。程序總是在讀取或寫入虛擬地址,它完全不知道底層硬件中發生了什么。
(2. Two processes with their own virtual address spaces. Notice how the physical memory backing each process is not necessarily contiguous. 兩個進程都有自己的虛擬地址空間。注意物理內存對進程來說並不一定是連續的)
Benefits of the virtual memory
虛擬內存的好處
In the picture above you can see an example of virtual to physical translation in action, which reveals two main benefits of the virtual memory:
在上面的圖片中,你可以看到一個從虛擬地址轉換為物理地址的實際例子,從中可以看出虛擬內存的兩大好處:
each program has a virtual address space that starts from 0: this greatly simplifies the programmer's life, since there is no need to manually keep track of memory offsets anymore;
virtual memory is always contiguous, even if the underlying physical counterpart isn't: the operating system does the hard job of gathering the available pieces together into a single, uniform virtual memory chunk.
每個程序都有一個從0開始的虛擬地址空間 —— 這簡化了程序員的工作:無需再手動記錄內存偏移量;
虛擬內存總是毗連的,即使底層的物理內存不是,操作系統也會把可用的碎片聚集到一個統一的虛擬內存塊中。
The virtual memory mechanism also solves the problem of limited RAM: every process is given the impression that it is working with an unspecified, seemingly unlimited amount of memory, often larger than the physical one. Moreover, virtual memory provides security: program A can't read or write virtual memory assigned to program B without triggering an operating system error. We will see how all of this magic is possible in the following sections.
虛擬內存機制還解決了有限RAM的問題:每一個進程都以為自己在未定義數量的內存內工作,前者往往比物理內存更大。此外,虛擬內存還保證了安全性:程序A無法讀取或寫入分配給程序B的虛擬內存,此類違規行為將觸發操作系統錯誤。我們將在下面的段落中看到所有這些神奇的東西是如何實現的。
Pages and frames: where it all begins
Pages(頁) 和 frames(楨):一切的起點
The virtual memory mechanism needs a place to store the mapping between virtual and physical addresses. That is, given a virtual address X, the system must be able to find the corresponding physical address Y. However, you can't save such information as a 1:1 relationship: it would require a database as big as the whole RAM!
虛擬內存機制需要一個地方來存儲虛擬地址和物理地址之間的映射關系。也就是說,給定一個虛擬地址X,系統必須能夠找到相應的物理地址Y。然而,你不能把這樣的信息以1:1的關系保存下來:不然就需要一個和整個RAM一樣大的數據庫了!
Modern virtual memory implementations overcome this problem (and many others) by treating both the virtual and the physical memory as a long list of small, fixed-size chunks. The chunks of the virtual memory are called pages and the chunks of the physical one are called frames. The Memory Management Unit (MMU) is a hardware component in the CPU that looks up the mapping between pages and frames in a special data structure called the page table. A page table is like a database table where each row contains a page index and the frame index it corresponds to. Every running program has its own page table for the MMU to consult, as you can see in the picture below.
現代虛擬內存的實現克服了這個問題(以及許多其他問題),它將虛擬內存和物理內存解釋為一長串固定大小的小塊。虛擬內存的塊被稱為pages(頁),物理內存的塊被稱為frames(楨)。內存管理單元(MMU) 是CPU中的一個硬件組件,它將 pages 和 frames 之間的映射信息存儲在一個叫做 page table 的特殊數據結構中。page table就像一個數據庫表,每一行都包含一個page索引和對應的frame索引。每個運行中的程序在MMU中都有一個 page table,如下圖所示。
(3. The MMU mapping in action. Each cell is a process page or a physical memory frame. Some pages may not have a corresponding frame mapped: we will see why in the next paragraphs. MMU映射。每個單元格都是一個進程頁或物理內存幀。有些頁面可能沒有相應的幀映射:我們將在下一段中看到原因。)
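To get a feel for why mapping whole pages instead of individual addresses matters, here is a back-of-the-envelope calculation. The figures are assumptions chosen for illustration (a 32-bit address space, 4-byte table entries and 4 KiB pages); the source does not state any of them.

```python
ADDRESS_BITS = 32        # assumed size of the address space: 2**32 bytes
ENTRY_BYTES = 4          # assumed size of one mapping entry
PAGE_SIZE = 4096         # assumed page size: 4 KiB

# One entry per byte address: the table dwarfs the memory it describes.
per_address_entries = 2 ** ADDRESS_BITS
per_address_table = per_address_entries * ENTRY_BYTES

# One entry per page: the number of entries shrinks by a factor of PAGE_SIZE.
per_page_entries = 2 ** ADDRESS_BITS // PAGE_SIZE
per_page_table = per_page_entries * ENTRY_BYTES

print(per_address_table // 2**30, "GiB")   # 16 GiB
print(per_page_table // 2**20, "MiB")      # 4 MiB
```

Under these assumptions a per-address table would need 16 GiB to describe 4 GiB of addresses, while a per-page table needs about 4 MiB.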
Converting pages to frames
將 pages 轉換為 frames
A virtual address is made up of two things:
一個虛擬地址由兩個東西組成:
a page index, which identifies the page the virtual address belongs to;
a frame offset, which gives the position of the physical address inside the frame.
一個page索引,告訴虛擬地址屬於哪個page。
一個frame偏移量,告訴物理地址在frame中的位置。
This information is enough for the MMU to perform the virtual-to-physical conversion. When a program reads or writes a virtual address, the MMU takes the page index (1) and searches for the corresponding frame in the program's page table. Once the frame is found, the MMU uses the frame offset (2) to compute the exact physical memory address, and the access is carried out there. At this point the conversion is done: the program has reached a physical location in RAM through the virtual address, without ever seeing the physical address itself.
這些信息足以讓MMU進行虛擬地址到物理地址的轉換。當程序讀取或寫入一個虛擬地址時,它會喚醒MMU,MMU反過來抓取page索引(1),並在程序的page table中搜索相應的frame。一旦找到該frame,MMU利用frame的偏移量(2)找到准確的物理內存地址,並將其傳回給程序。至此,轉換工作完成:程序在RAM中擁有了一個通過虛擬地址進行讀寫的物理地址。
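The two-step conversion can be sketched in a few lines. This is a simplified model, not how any particular MMU is implemented: the 4 KiB page size and the contents of `page_table` are assumptions made up for the example.

```python
PAGE_SIZE = 4096  # assumed page/frame size in bytes (4 KiB)

# Hypothetical page table for one process: page index -> frame index.
page_table = {0: 5, 1: 2, 2: 9}


def translate(virtual_address):
    """Convert a virtual address into a physical one using the page table."""
    page_index, offset = divmod(virtual_address, PAGE_SIZE)
    frame_index = page_table[page_index]        # step (1): page -> frame
    return frame_index * PAGE_SIZE + offset     # step (2): add the offset


virtual = 1 * PAGE_SIZE + 42        # byte 42 of page 1
physical = translate(virtual)       # byte 42 of frame 2
```

The offset is carried over unchanged: only the page index is replaced by a frame index, which is why pages and frames must have the same size.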
Under the hood of virtual memory
虛擬內存的背后
While programs are provided with a contiguous, clean and tidy virtual address space, both the operating system and the hardware are allowed to do crazy things in the background with data residing in the physical memory.
雖然程序被提供了一個持續的、干凈整潔的虛擬地址空間,但操作系統和硬件仍然有能力在后台用駐留在物理內存中的數據做瘋狂的事情。
For example, the operating system often delays loading parts of a program from the hard drive until the program attempts to use them, since some of the code will only run during initialization or when a special condition occurs. As a result, a program's page table may be filled with entries that point to non-existent or not yet allocated frames. This case is depicted in figure 3 above, where the last two pages map to nowhere.
例如,操作系統經常會延遲從硬盤中加載程序的部分內容,直到程序嘗試使用時才繼續載入。有些代碼只有在初始化期間或發生特殊條件時才會運行。程序的page table中的條目可能會指向不存在或尚未分配的frame。這種情況在上面圖3中已有所示,最后兩個page的映射為空。
Tricks like this one are completely transparent to the application, which keeps reading and writing its own virtual address space unaware of the background noise. However, sooner or later the program may want to access one of the virtual addresses that don't map to the RAM: what to do?
像這樣的技巧對程序來說是完全透明的,程序會在不知道背景噪音的情況下不停地讀寫自己的虛擬地址空間。然而,程序遲早會訪問到其中一個沒有映射到物理RAM的虛擬地址,那時要怎么辦?
Page faults (缺頁錯誤)
A page fault (also known as page miss) occurs when a program accesses a virtual address on a page not currently mapped to a physical frame. More specifically, a page fault takes place when the page exists in the program's page table but points to a non-existent or not yet available frame in the physical memory.
當程序訪問沒有映射到物理frame上的虛擬地址時,就發生了 page fault(也稱為 page miss)。更具體地說,當一個page在程序的page table中存在,但卻指向了物理存儲器中不存在或尚不可用的frame時,就會發生 page fault。
The MMU detects the page fault and redirects the message to the operating system, which will do its best to find a frame in the physical memory for the mapping. Most of the time this is a straightforward operation, unless the system is running out of RAM.
MMU檢測到page fault,並將消息重定向到操作系統,操作系統將盡最大努力在物理內存中找到映射的frame。大多數情況下,這是一個直接的操作,除非系統的RAM用完了。
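The interplay between the MMU and the operating system during a page fault can be simulated roughly as follows. Everything here is a hypothetical model: the `PageFault` exception stands in for the hardware signal, `free_frames` for the pool of available frames, and `access` for the operating system's fault handler.

```python
class PageFault(Exception):
    """Raised by the simulated MMU when a page has no frame mapped."""


class Process:
    def __init__(self, num_pages):
        # Every page exists in the table, but none has a frame yet.
        self.page_table = {page: None for page in range(num_pages)}
        self.faults = 0


free_frames = [0, 1, 2, 3]   # hypothetical pool of free physical frames


def mmu_lookup(process, page):
    """Return the frame mapped to `page`, or signal a page fault."""
    frame = process.page_table[page]
    if frame is None:
        raise PageFault(page)
    return frame


def access(process, page):
    """Return the frame backing `page`, mapping one on demand after a fault."""
    try:
        return mmu_lookup(process, page)
    except PageFault:
        # The "operating system" handles the fault: take a free frame,
        # record the mapping, then retry the access.
        process.faults += 1
        process.page_table[page] = free_frames.pop(0)
        return mmu_lookup(process, page)


proc = Process(num_pages=3)
first = access(proc, 2)    # faults, then maps page 2 to frame 0
second = access(proc, 2)   # already mapped: no fault this time
```

The program only ever calls `access`; whether a fault happened along the way is invisible to it, apart from the extra time taken.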
Paging, or when the physical memory is not enough
Paging(分頁), 或當物理內存不夠用時
Paging is another memory management trick: when there is no more physical memory available, the operating system moves some pages to the hard drive to make room for other programs or data. It is sometimes also called swapping, although that is not strictly correct: swapping means moving an entire process to disk. Some operating systems do this too, when needed.
Paging 是另一種內存管理技巧:當沒有更多的物理內存可用時,操作系統會將一些pages移動到硬盤上,為其他程序或數據騰出空間。有時它也被稱為swapping,雖然不是100%正確。Swapping(交換)是指將整個進程移動到磁盤上。有些操作系統在需要的時候也會這樣做。
Paging gives programs the illusion of an unlimited amount of available RAM. The operating system optimistically allows for a virtual address space larger than the physical one, knowing that data can be moved in and out of the hard drive when needed. Some systems (e.g. Windows) use a special file called the paging file for this purpose. Others (e.g. Linux) typically use a dedicated hard drive partition called the swap area (the name is historical: modern Linux performs paging rather than swapping).
分頁給程序帶來了一種假象,以為可用RAM是無限量的。操作系統樂觀地允許虛擬內存地址空間大於物理地址空間,因為它知道數據可以在需要的時候向硬盤中移入移出。有些系統(如Windows)為此目的使用了一個叫做 paging file(分頁文件) 的特殊文件。其他系統(如Linux)有一個專門的硬盤分區,稱為 swap area(swap區)(由於歷史原因,現代Linux執行的是paging,而不是swapping)。
Unfortunately the hard drive is far slower than the main memory. So when a page fault occurs for a page that had been temporarily moved to the hard drive, the operating system has to read the data back from the sluggish medium into memory, causing a lag. All in all, less paging means a system that runs more efficiently.
不幸的是,硬盤的速度比主內存要慢得多。所以當出現page fault,臨時將page移動到硬盤上時,操作系統必須從低速介質中讀取數據,然后將數據移回內存,造成滯后。總而言之,越少的paging意味着系統的運行效率越高。
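A minimal sketch of paging in and out follows, assuming a physical memory of only two frames and a first-in, first-out eviction policy. Real operating systems use more refined policies; `ram`, `disk` and `touch` are hypothetical names for this illustration.

```python
from collections import OrderedDict

NUM_FRAMES = 2                 # assumed, deliberately tiny physical memory
ram = OrderedDict()            # page -> data, oldest resident page first
disk = {}                      # simulated paging file / swap area
stats = {"disk_reads": 0}      # how often the slow path was taken


def touch(page):
    """Access `page`, paging it in (and another page out) if necessary."""
    if page in ram:
        return ram[page]
    # Page fault: make room first if every frame is taken.
    if len(ram) >= NUM_FRAMES:
        victim, data = ram.popitem(last=False)   # evict the oldest page
        disk[victim] = data                      # write it out to disk
    if page in disk:
        ram[page] = disk.pop(page)               # slow path: read from disk
        stats["disk_reads"] += 1
    else:
        ram[page] = f"data of page {page}"       # first use: fresh page
    return ram[page]


for page in ["a", "b", "c", "a"]:
    touch(page)
```

After this sequence, page "a" has made a round trip to disk and back, and page "b" is the one currently paged out.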
Thrashing
Thrashing occurs when the system spends more time paging than running applications, triggered by a constant stream of page faults. This is an extreme corner case that happens when you are running so many programs that they fill up the entire RAM, and/or when the paging area on the hard drive is poorly tuned. The operating system tries to keep up with the flood of page faults, constantly moving data between the hard drive and the physical memory and grinding the system to a halt. Thrashing can be avoided by increasing the amount of RAM, by reducing the number of programs being run or by adjusting the size of the paging file or swap area.
當系統在paging中花費的時間多於運行應用程序本身時,就出現了 Thrashing 現象。如果你運行的程序太多,占用了整個內存,或者硬盤上的分頁區域沒有經過優化時,就會出現這種極端的情況。操作系統會努力跟上大量的page faults請求,不斷地在硬盤和物理內存之間移動數據,使系統陷入停頓。可以通過增加RAM、減少正在運行的程序數量或調整swap file的大小,來避免thrashing現象。
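The effect of having slightly too little RAM can be seen by counting page faults in a small simulation. The sketch assumes a least-recently-used (LRU) replacement policy and a program that loops over four pages; both are illustrative choices, not something stated in the text.

```python
from collections import OrderedDict


def count_page_faults(accesses, num_frames):
    """Count faults for a page access sequence under LRU replacement."""
    resident = OrderedDict()     # pages in RAM, least recently used first
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)        # mark as most recently used
            continue
        faults += 1
        if len(resident) >= num_frames:
            resident.popitem(last=False)      # evict the LRU page
        resident[page] = True
    return faults


# A program looping over 4 pages, 25 times (100 accesses in total).
accesses = [0, 1, 2, 3] * 25

enough_ram = count_page_faults(accesses, num_frames=4)   # only the first 4 accesses fault
too_little = count_page_faults(accesses, num_frames=3)   # every single access faults
```

With four frames the program faults four times and then runs undisturbed; with three frames each access evicts exactly the page that will be needed next, so all 100 accesses fault. That cliff is what thrashing looks like.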
Memory protection
內存保護
Virtual memory also provides security across running applications: your browser can't peep into your text editor's virtual memory and vice versa without triggering an error. The main purpose of memory protection is to prevent a process from accessing memory that doesn't belong to it.
虛擬內存還提供了運行中的應用程序之間的安全性:你的瀏覽器不能偷窺到你的文本編輯器的虛擬內存,反之亦然。內存保護的主要目的是防止進程訪問不屬於它的內存。
The memory protection mechanism is usually provided by the MMU and the page tables it works with, although some architectures use different hardware strategies. When a program tries to access a portion of virtual memory it doesn't own, an invalid page fault is triggered. The MMU and the operating system catch it and raise a failure condition called a segmentation fault (on Unix) or an access violation (on Windows). The operating system usually kills the program in response.
內存保護機制通常由MMU和它所管理的page tables提供,而其他架構可能使用不同的硬件策略。當程序試圖訪問不屬於自己的虛擬內存時,就會觸發 invalid page fault(無效頁故障)。MMU和操作系統會捕捉到這個信號,並發出一個稱為segmentation fault(分段故障,在Unix上)或access violation(訪問違規,在Windows上)的故障條件。作為響應,操作系統通常會殺死該程序。
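One way to picture this isolation is to give each process its own page table and reject any access to a page that table does not contain. The model below is a deliberately coarse sketch: memory is tracked per frame, and `SegmentationFault`, `Process`, `read` and `write` are hypothetical names.

```python
class SegmentationFault(Exception):
    """Simulated failure raised for an invalid memory access."""


class Process:
    def __init__(self, name, page_table):
        self.name = name
        self.page_table = page_table   # page index -> frame index


physical_memory = {}   # frame index -> contents (one value per frame)


def write(process, page, value):
    """Store `value` in the frame backing `page`, if the process owns it."""
    if page not in process.page_table:
        raise SegmentationFault(f"{process.name}: page {page} is not mapped")
    physical_memory[process.page_table[page]] = value


def read(process, page):
    """Load the contents of the frame backing `page`, if the process owns it."""
    if page not in process.page_table:
        raise SegmentationFault(f"{process.name}: page {page} is not mapped")
    return physical_memory.get(process.page_table[page])


browser = Process("browser", {0: 7, 1: 3})
editor = Process("editor", {0: 4})

write(editor, 0, "draft")          # lands in frame 4
write(browser, 0, "cookies")       # same virtual page 0, different frame 7

try:
    read(editor, 1)                # the editor owns no page 1
    outcome = "allowed"
except SegmentationFault:
    outcome = "killed"
```

Both processes use virtual page 0, yet they never touch the same frame. The editor has no virtual address that leads to the browser's frames, so there is nothing for it to peek at.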
Segmentation faults and access violations are also often caused by programming mistakes. Programming languages with manual memory management let you set aside portions of memory to store program data: the operating system will provide you with a nice chunk of free memory (a.k.a. a buffer) to read and write according to your program's needs. However, nothing prevents you from reading or writing outside the buffer boundaries, accessing memory that doesn't belong to your program or simply doesn't exist. The operating system would detect the illegal access and raise the usual violation signal.
Segmentation faults 和 access violations 也常常因疏忽而產生。執行手動內存管理的編程語言允許你預留一部分內存用於存儲程序數據:操作系統會給你提供一塊不錯的空閑內存(又名buffer(緩沖區)),讓你根據程序的需要進行讀寫。但是,沒有任何東西可以阻止你在緩沖區邊界之外進行讀寫,訪問不屬於你的或根本不存在的內存。操作系統會檢測到非法訪問,並發出通常的違規信號。
Read more (閱讀更多)
Virtual memory paves the way for many other interesting topics. For example, memory-mapped files are a powerful abstraction over the traditional way of reading and writing files. Instead of manually copying data into memory in order to operate on it, memory mapping allows a program to access a file directly from the hard drive as if it were already fully loaded in RAM. The virtual memory mechanism will take care of moving data from the hard drive to RAM as usual, when necessary. Memory-mapped files simplify the programmer's work and usually speed up file access operations. More information here.
虛擬內存為許多其他有趣的話題鋪平了道路。例如,與傳統的讀寫文件的方式相比,memory-mapped files(內存映射文件)是一種強大的抽象。內存映射不需要手動復制數據到內存中進行操作,而是允許程序直接從硬盤中訪問文件,就像訪問在RAM中已經完全加載好的文件一樣。必要時,虛擬內存機制會像往常一樣將數據從硬盤中移動到RAM中。內存映射文件簡化了程序員的工作,通常會加快文件訪問操作的速度。更多信息請看這里。
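As a small illustration of memory-mapped files, the sketch below uses Python's standard `mmap` module on a temporary scratch file. The file contents and variable names are made up for the example.

```python
import mmap
import os
import tempfile

# Create a small scratch file to map (the path is generated, not meaningful).
fd, path = tempfile.mkstemp()
os.write(fd, b"hello virtual memory")
os.close(fd)

with open(path, "r+b") as f:
    # Length 0 maps the whole file into the process's address space.
    with mmap.mmap(f.fileno(), 0) as mapped:
        first_word = mapped[:5]        # read it like a bytes object
        mapped[:5] = b"HELLO"          # write in place (same length)
        mapped.flush()                 # push the change back to the file

with open(path, "rb") as f:
    contents = f.read()

os.remove(path)
```

There is no explicit read or write call on the file while it is mapped: slicing the mapping is enough, and the operating system moves the underlying pages between disk and RAM as needed.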
Virtual memory also makes it more difficult to reason about memory consumption. Suppose one of your programs is taking up 300 megabytes of memory: is it virtual or physical? Is part of that space paged to disk? And if it is, will the paging operations be fast enough? Also, tuning the paging file/swap area is an important step if you want to keep your system in good shape. Operating systems provide many tools to measure and adjust memory: more information here and here.
虛擬內存也使得對內存消耗的推理(計算)更加困難。假設你的一個程序占用了300兆字節的內存:它是虛擬內存還是物理內存?其中的一部分空間有分頁到磁盤上嗎?如果是,分頁操作是否足夠快?另外,如果你想讓系統保持良好的狀態,調整分頁文件/交換區是一個重要的步驟。操作系統提供了許多測量和調整內存的工具:更多信息請看這里和這里。
Read the original: https://www.internalpointers.com/post/introduction-virtual-memory