Unix-like OS Development - Virtual Address Space (6)

In the previous article, we covered the basics of working in IA-32 protected mode. Today it is time to learn how to work with the virtual address space.



Table of contents



1. Build system (make, gcc, gas). Initial boot (multiboot). Launch (qemu). C library (strcpy, memcpy, strext).

2. C library (sprintf, strcpy, strcmp, strtok, va_list ...). Building the library in kernel mode and user application mode.

3. The kernel system log. Video memory. Output to the terminal (kprintf, kpanic, kassert).

4. Dynamic memory, heap (kmalloc, kfree).

5. Organization of memory and interrupt handling (GDT, IDT, PIC, syscall). Exceptions.

6. Virtual memory (page directory and page table).

7. Process. Scheduler. Multitasking. System calls (kill, exit, ps).

8. The file system of the kernel (initrd), ELF and its internals. System calls (exec).

9. Character device drivers. System calls (ioctl, fopen, fread, fwrite). C library (fopen, fclose, fprintf, fscanf).

10. Shell as a complete program for the kernel.

11. User protection mode (ring3). Task state segment (TSS).



Virtual memory



Virtual memory is needed so that each process is isolated from the others and cannot harm them. Without virtual memory, we would have to load ELF files at a different address in memory each time. But as you know, executable files may contain references to specific (absolute) addresses, so the address at which an ELF file will be loaded is already fixed when it is linked (see the linker script). Therefore, without virtual memory we cannot load two ELF files that were linked for the same address. Even with virtual memory turned on, there are dynamic libraries (such as .so files) that can be loaded at any address. This is possible because they have a relocation section. This section records every place where absolute addressing is used, and the kernel, when loading such an ELF file, must fix these addresses up by hand, i.e. add to each of them the difference between the actual load address and the address the file was linked for.
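The fix-up step described above can be sketched in a few lines of C. This is only an illustration: the function name demo_apply_relocs and its parameters are hypothetical and do not come from the article's kernel, and the list of offsets stands in for a real ELF relocation section, which has a richer record format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the article's kernel): patch every
 * 32-bit absolute address listed in `offsets` by the load delta.
 *   image      - contents of the loaded file
 *   image_size - size of the image in bytes
 *   offsets    - byte offsets of the slots that hold absolute addresses
 *   count      - number of entries in `offsets`
 *   linked_at  - address the file was linked for
 *   loaded_at  - address the file was actually loaded at
 */
static void demo_apply_relocs(uint8_t *image, size_t image_size,
                              const uint32_t *offsets, size_t count,
                              uint32_t linked_at, uint32_t loaded_at)
{
    /* unsigned arithmetic wraps, so a "negative" delta also works */
    uint32_t delta = loaded_at - linked_at;

    for (size_t i = 0; i < count; ++i) {
        uint32_t value;

        assert(offsets[i] + sizeof(value) <= image_size);
        /* memcpy keeps the access valid even if the slot is unaligned */
        memcpy(&value, image + offsets[i], sizeof(value));
        value += delta;
        memcpy(image + offsets[i], &value, sizeof(value));
    }
}
```

For instance, a slot holding 0x08048010 in a file linked for 0x08048000 and loaded at 0x00300000 ends up holding 0x00300010.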



We will work with 4 KB pages and a single page table. One page table holds 1024 entries, so it lets us address up to 1024 × 4 KB = 4 MB of RAM. That is enough for us. The address map will look like this:



0-1 MB : do not touch.

1-2 MB : kernel code and data.

2-3 MB : kernel heap.

3-4 MB : user pages of loaded ELF files.



When paging is enabled, the linear address (obtained from the flat segment model) is no longer equal to the physical one. Instead, the address is split into three parts: the offset within the page (the low 12 bits), the index of the entry in the page table (the middle 10 bits), and the index of the entry in the page directory (the high 10 bits). Each process will have its own page directory and, accordingly, its own page tables. This is what allows us to organize a virtual address space.
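As a small illustration of this split for 4 KB pages, the three parts can be extracted with shifts and masks. The demo_ names below are hypothetical helpers written for this example, not functions from the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT   12u      /* 4 KB pages: low 12 bits are the offset */
#define DEMO_DIR_SHIFT    22u      /* high 10 bits select the directory entry */
#define DEMO_INDEX_MASK   0x3FFu   /* 10 bits per index: 1024 entries */
#define DEMO_OFFSET_MASK  0xFFFu

/* index of the page directory entry (bits 31..22) */
static uint32_t demo_dir_index(uint32_t linear)
{
    return (linear >> DEMO_DIR_SHIFT) & DEMO_INDEX_MASK;
}

/* index of the page table entry (bits 21..12) */
static uint32_t demo_table_index(uint32_t linear)
{
    return (linear >> DEMO_PAGE_SHIFT) & DEMO_INDEX_MASK;
}

/* offset inside the 4 KB page (bits 11..0) */
static uint32_t demo_page_offset(uint32_t linear)
{
    return linear & DEMO_OFFSET_MASK;
}

/* inverse operation: rebuild the linear address from its three parts */
static uint32_t demo_join(uint32_t dir, uint32_t table, uint32_t offset)
{
    assert(dir <= DEMO_INDEX_MASK && table <= DEMO_INDEX_MASK);
    assert(offset <= DEMO_OFFSET_MASK);
    return (dir << DEMO_DIR_SHIFT) | (table << DEMO_PAGE_SHIFT) | offset;
}
```

For the address 0x00301ABC, which lies in the 3-4 MB user area, the directory index is 0, the table index is 0x301 and the offset is 0xABC.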



The page directory entry looks like this:



struct page_directory_entry_t {
    u8 present : 1;
    u8 read_write : 1;
    u8 user_supervisor : 1;
    u8 write_through : 1;
    u8 cache_disabled : 1;
    u8 accessed : 1;
    u8 zero : 1;
    u8 page_size : 1;
    u8 ignored : 1;
    u8 available : 3;
    u32 page_table_addr : 20; /* physical address of the page table >> 12 */
} __attribute__((packed));
      
      





The page table entry looks like this:

struct page_table_entry_t {
    u8 present : 1;
    u8 read_write : 1;
    u8 user_supervisor : 1;
    u8 write_through : 1;
    u8 cache_disabled : 1;
    u8 accessed : 1;
    u8 dirty : 1;
    u8 zero : 1;
    u8 global : 1;
    u8 available : 3;
    u32 page_phys_addr : 20; /* physical address of the page >> 12 */
} __attribute__((packed));
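Both entries are 32 bits wide: the low 12 bits carry flags and the high 20 bits carry a page-aligned physical address. As an alternative view of the same layout, here is a sketch that builds an entry as a raw 32-bit word with masks instead of bit fields. The DEMO_ constants and demo_ functions are names invented for this example; only the three flags needed for the illustration are defined.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PTE_PRESENT    (1u << 0)    /* bit 0: page is in memory */
#define DEMO_PTE_WRITABLE   (1u << 1)    /* bit 1: writes allowed */
#define DEMO_PTE_USER       (1u << 2)    /* bit 2: user-mode access allowed */
#define DEMO_PTE_ADDR_MASK  0xFFFFF000u  /* bits 31..12: physical address */

/* Combine a 4 KB aligned physical address with flag bits. */
static uint32_t demo_make_entry(uint32_t phys_addr, uint32_t flags)
{
    /* the low 12 bits of the address are not stored, so they must be zero */
    assert((phys_addr & ~DEMO_PTE_ADDR_MASK) == 0);
    return (phys_addr & DEMO_PTE_ADDR_MASK) | (flags & ~DEMO_PTE_ADDR_MASK);
}

/* Physical address stored in the entry. */
static uint32_t demo_entry_addr(uint32_t entry)
{
    return entry & DEMO_PTE_ADDR_MASK;
}

/* Non-zero if the present flag is set. */
static int demo_entry_present(uint32_t entry)
{
    return (entry & DEMO_PTE_PRESENT) != 0;
}
```

This also shows why the structures must be page-aligned: only the upper 20 bits of the address fit into an entry.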
      
      





For the kernel, we will declare the page directory and the page table as static variables. Note that both of them must be aligned on a page boundary.



/* a single directory entry: the only one consulted for addresses below 4 MB */
static struct page_directory_entry_t kpage_directory __attribute__((aligned(4096)));
static struct page_table_entry_t kpage_table[MMU_PAGE_TABLE_ENTRIES_COUNT] __attribute__((aligned(4096)));
      
      





We will take the simple route and make the whole physical address space accessible to the kernel. The kernel pages should have the supervisor privilege level so that user code cannot touch them. Lightweight processes that must run in kernel mode will share the address space with the kernel. One such process will be the deferred execution queue that handles deferred interrupts. But more about that in the lesson on character device drivers; we still have to implement multitasking before we get to that topic.



Let us create the kernel page directory and the corresponding page table. They become active when the kernel is initialized.



/*
 * Api - init kernel page directory
 * Here assumed each entry addresses 4Kb
 */
extern void mmu_init()
{
    memset(&kpage_directory, 0, sizeof(struct page_directory_entry_t));

    /* set kernel page directory */
    kpage_directory.zero = 1;
    kpage_directory.accessed = 0;
    kpage_directory.available = 0;
    kpage_directory.cache_disabled = 0;
    kpage_directory.ignored = 0;
    kpage_directory.page_size = 0; /* 4KB */
    kpage_directory.present = 1; /* kernel pages always in memory */
    kpage_directory.read_write = 1; /* read & write */
    kpage_directory.user_supervisor = 0; /* kernel mode pages (U/S = 0: supervisor only) */
    kpage_directory.write_through = 1;
    kpage_directory.page_table_addr = (size_t)kpage_table >> 12;

    /* set kernel table: identity-map the first 4 MB */
    for (int i = 0; i < MMU_PAGE_TABLE_ENTRIES_COUNT; ++i) {
        kpage_table[i].zero = 0;
        kpage_table[i].accessed = 0;
        kpage_table[i].available = 0;
        kpage_table[i].cache_disabled = 0;
        kpage_table[i].dirty = 0;
        kpage_table[i].global = 1;
        kpage_table[i].present = 1; /* kernel pages always in memory */
        kpage_table[i].read_write = 1; /* read & write */
        kpage_table[i].user_supervisor = 0; /* kernel mode pages (U/S = 0: supervisor only) */
        kpage_table[i].write_through = 1;
        kpage_table[i].page_phys_addr = (i * 4096) >> 12; /* assume 4Kb pages */
    }
}
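Because entry i of the kernel table points at physical page i, every address below 4 MB translates to itself. This can be checked with a small software model of the lookup the processor performs. The sketch below is a simulation that runs as an ordinary program, not kernel code; the demo_ names are hypothetical and the table is kept as raw 32-bit words for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ENTRIES  1024u
#define DEMO_PRESENT  1u

/* one page table covering 4 MB, filled the same way as the kernel table */
static uint32_t demo_table[DEMO_ENTRIES];

static void demo_identity_map(void)
{
    for (uint32_t i = 0; i < DEMO_ENTRIES; ++i) {
        demo_table[i] = (i << 12) | DEMO_PRESENT; /* physical page i, present */
    }
}

/*
 * Software model of the lookup through directory entry 0.
 * Returns 1 and stores the physical address on success, 0 otherwise.
 */
static int demo_translate(uint32_t linear, uint32_t *phys)
{
    uint32_t entry;

    assert(phys != 0);
    if ((linear >> 22) != 0) {
        return 0; /* outside the first 4 MB: this model has no table for it */
    }
    entry = demo_table[(linear >> 12) & 0x3FFu];
    if (!(entry & DEMO_PRESENT)) {
        return 0; /* the processor would raise a page fault here */
    }
    *phys = (entry & 0xFFFFF000u) | (linear & 0xFFFu);
    return 1;
}
```

After demo_identity_map(), translating 0x00123456 yields 0x00123456, while 0x00400000 is rejected because it lies beyond the single table.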
      
      





When we load ELF files, we will need to create a page directory for the user process. The following function does this:



/*
 * Api - Create user page directory
 */
extern struct page_directory_entry_t* mmu_create_user_page_directory(struct page_table_entry_t* page_table)
{
    struct page_directory_entry_t* upage_dir;

    /* the directory must be aligned on a page boundary */
    upage_dir = malloc_a(sizeof(struct page_directory_entry_t), 4096);

    upage_dir->zero = 1;
    upage_dir->accessed = 0;
    upage_dir->available = 0;
    upage_dir->cache_disabled = 0;
    upage_dir->ignored = 0;
    upage_dir->page_size = 0; /* 4KB */
    upage_dir->present = 1;
    upage_dir->read_write = 1; /* read & write */
    upage_dir->user_supervisor = 1; /* user mode pages (U/S = 1: user access allowed) */
    upage_dir->write_through = 1;
    upage_dir->page_table_addr = (size_t)page_table >> 12; /* assume 4Kb pages */

    return upage_dir;
}
      
      





By default, the process page table will contain the kernel pages plus empty entries for future process pages, i.e. entries with the present flag cleared and the physical page address set to 0.



/*
 * Api - Create user page table
 */
extern struct page_table_entry_t* mmu_create_user_page_table()
{
    struct page_table_entry_t* upage_table;

    /* the table must be aligned on a page boundary */
    upage_table = malloc_a(sizeof(struct page_table_entry_t) * MMU_PAGE_TABLE_ENTRIES_COUNT, 4096);

    /* share kernel pages */
    memcpy(upage_table, kpage_table, sizeof(struct page_table_entry_t) * MMU_KERNEL_PAGES_COUNT);

    /* fill user pages */
    for (int i = MMU_KERNEL_PAGES_COUNT; i < MMU_PAGE_TABLE_ENTRIES_COUNT; ++i) {
        struct page_table_entry_t* current;

        current = upage_table + i;
        current->zero = 0;
        current->accessed = 0;
        current->available = 0;
        current->cache_disabled = 0;
        current->dirty = 0;
        current->global = 1;
        current->present = 0; /* not present, since there are no user pages yet */
        current->read_write = 1; /* read & write */
        current->user_supervisor = 1; /* user mode page (U/S = 1: user access allowed) */
        current->write_through = 1;
        current->page_phys_addr = 0; /* page is not present */
    }

    return upage_table;
}
      
      





We need to learn how to add new physical pages to the process page table, since by default it has none. We will need this when loading ELF files into memory, at the point where we load the segments described in the program headers. The following function helps with this:



/*
 * Api - Occupy user page
 */
extern bool mmu_occupy_user_page(struct page_table_entry_t* upage_table, void* phys_addr)
{
    /* take the first free entry in the user part of the table */
    for (int i = MMU_KERNEL_PAGES_COUNT; i < MMU_PAGE_TABLE_ENTRIES_COUNT; ++i) {
        struct page_table_entry_t* current;

        current = upage_table + i;
        if (current->present) {
            /* page is busy */
            continue;
        }

        current->zero = 0;
        current->accessed = 0;
        current->available = 0;
        current->cache_disabled = 0;
        current->dirty = 0;
        current->global = 1;
        current->present = 1;
        current->read_write = 1; /* read & write */
        current->user_supervisor = 1; /* user mode page (U/S = 1: user access allowed) */
        current->write_through = 1;
        current->page_phys_addr = (size_t)phys_addr >> 12; /* assume 4Kb pages */

        return true;
    }

    return false; /* no free entries left */
}
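The function above reports only success or failure. The virtual address at which the page becomes visible follows from the index of the entry that was taken: with a single table behind directory entry 0, entry i maps the virtual address i × 4096. The sketch below models this on a raw 32-bit table and returns that address. It is an illustration only: the demo_ names are hypothetical, and the value 768 for the number of kernel pages is an assumption derived from the address map (the first 3 MB), not a constant quoted from the sources.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ENTRIES       1024u
#define DEMO_KERNEL_PAGES  768u   /* assumption: the first 3 MB belong to the kernel */
#define DEMO_PRESENT       1u

/*
 * Map phys_addr into the first free user entry.
 * Returns the virtual address of the mapped page, or 0 if the table is full
 * (0 is never a valid result because user entries start at 3 MB).
 */
static uint32_t demo_occupy(uint32_t *table, uint32_t phys_addr)
{
    assert((phys_addr & 0xFFFu) == 0); /* must be page-aligned */

    for (uint32_t i = DEMO_KERNEL_PAGES; i < DEMO_ENTRIES; ++i) {
        if (table[i] & DEMO_PRESENT) {
            continue; /* entry is busy */
        }
        table[i] = phys_addr | DEMO_PRESENT;
        return i << 12; /* entry index -> virtual address */
    }
    return 0;
}
```

Note that consecutive calls produce consecutive virtual pages starting at 0x00300000, regardless of which physical pages are passed in. This is what lets a segment of a loaded file occupy a contiguous virtual range even when its physical pages are scattered.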
      
      





Paging is turned on and off by the PG bit (bit 31) of the CR0 control register. The physical address of the page directory is loaded into the CR3 register.



/*
 * Enable paging
 * void asm_enable_paging(void *page_directory)
 */
asm_enable_paging:
    mov 4(%esp),%eax        # page_directory
    mov %eax,%cr3           # CR3 holds the page directory address
    mov %cr0,%eax
    or $0x80000001,%eax     # set PE & PG bits
    mov %eax,%cr0
    ret

/*
 * Disable paging
 * void asm_disable_paging()
 */
asm_disable_paging:
    mov %cr0,%eax
    and $0x7fffffff,%eax    # unset PG bit
    mov %eax,%cr0
    ret
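The bit arithmetic performed on CR0 can be expressed in C to make the masks explicit. This is only a model of the computation on an ordinary integer (the register itself can be accessed only from assembly in ring 0), and the demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CR0_PE  (1u << 0)   /* bit 0: protected mode enable */
#define DEMO_CR0_PG  (1u << 31)  /* bit 31: paging */

/* Value to write to CR0 in order to turn paging on. */
static uint32_t demo_cr0_enable_paging(uint32_t cr0)
{
    return cr0 | DEMO_CR0_PE | DEMO_CR0_PG;
}

/* Value to write to CR0 in order to turn paging off. */
static uint32_t demo_cr0_disable_paging(uint32_t cr0)
{
    return cr0 & ~DEMO_CR0_PG;
}

/* Non-zero if paging is enabled in the given CR0 value. */
static int demo_cr0_paging_enabled(uint32_t cr0)
{
    assert(DEMO_CR0_PG == 0x80000000u);
    return (cr0 & DEMO_CR0_PG) != 0;
}
```

Clearing the bit with AND and an inverted mask is idempotent: calling the disable routine twice leaves paging off. An XOR with the same mask would toggle the bit instead, so a second call would turn paging back on.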
      
      





Now that we know how to create process address spaces, we need to keep track of the physical pages: which ones are busy and which ones are free. A bitmap is used for this, with one bit per page. We will not track pages below the 3rd megabyte, because they belong to the kernel and are always busy. User pages are allocated from the range between the 3rd and the 4th megabyte.



 static u32 bitmap[MM_BITMAP_SIZE];
      
      







Physical pages are allocated and freed by the following functions. In essence, we simply find the bit in the map that corresponds to the physical address of a page, and vice versa. The drawback is that a memory cell has a limited size, so two coordinates are needed to address a bit: the number of the bitmap element and the number of the bit within it.



/*
 * Api - allocate pages
 */
extern void* mm_phys_alloc_pages(u_int count)
{
    /* find free pages; the bound keeps i + j inside the bitmap */
    for (u_int i = 0; i + count <= MM_DYNAMIC_PAGES_COUNT; ++i) {
        bool is_found = true;

        for (u_int j = 0; j < count; ++j) {
            is_found = is_found && !mm_get_bit(i + j);
        }

        if (is_found) {
            /* occupy */
            for (u_int j = 0; j < count; ++j) {
                assert(!mm_get_bit(i + j));
                mm_set_bit(i + j);
            }

            return (void *)mm_get_addr(i);
        }
    }

    return null;
}

/*
 * Api - free page
 */
extern bool mm_phys_free_pages(void* ptr, u_int count)
{
    size_t address = (size_t)ptr;

    assert(address >= MM_AREA_START);
    assert(address % MM_PAGE_SIZE == 0);

    /* find page */
    for (u_int i = 0; i < MM_DYNAMIC_PAGES_COUNT; ++i) {
        size_t addr = mm_get_addr(i);

        if (addr == address) {
            /* free pages */
            for (u_int j = 0; j < count; ++j) {
                assert(mm_get_bit(i + j));
                mm_clear_bit(i + j);
            }

            return true;
        }
    }

    return false;
}
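The helpers mm_get_bit, mm_set_bit, mm_clear_bit and mm_get_addr are used above but not listed in the article. A minimal sketch of how such helpers might look is given below, under the assumptions that the bitmap consists of 32-bit elements, that the tracked area starts at 3 MB and is 1 MB long (256 pages, hence 8 bitmap elements). All demo_ names and constants are hypothetical stand-ins, not the definitions from the repository.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE      4096u
#define DEMO_AREA_START     0x300000u  /* assumption: user pages start at 3 MB */
#define DEMO_PAGES_COUNT    256u       /* 1 MB / 4 KB */
#define DEMO_BITS_PER_WORD  32u
#define DEMO_BITMAP_SIZE    (DEMO_PAGES_COUNT / DEMO_BITS_PER_WORD)  /* 8 elements */

static uint32_t demo_bitmap[DEMO_BITMAP_SIZE];

/* page number -> (element index, bit index) -> bit value */
static bool demo_get_bit(uint32_t page)
{
    assert(page < DEMO_PAGES_COUNT);
    return (demo_bitmap[page / DEMO_BITS_PER_WORD] >> (page % DEMO_BITS_PER_WORD)) & 1u;
}

/* mark the page as busy */
static void demo_set_bit(uint32_t page)
{
    assert(page < DEMO_PAGES_COUNT);
    demo_bitmap[page / DEMO_BITS_PER_WORD] |= 1u << (page % DEMO_BITS_PER_WORD);
}

/* mark the page as free */
static void demo_clear_bit(uint32_t page)
{
    assert(page < DEMO_PAGES_COUNT);
    demo_bitmap[page / DEMO_BITS_PER_WORD] &= ~(1u << (page % DEMO_BITS_PER_WORD));
}

/* page number -> physical address */
static size_t demo_get_addr(uint32_t page)
{
    return DEMO_AREA_START + (size_t)page * DEMO_PAGE_SIZE;
}

/* physical address -> page number (inverse of demo_get_addr) */
static uint32_t demo_get_page(size_t addr)
{
    assert(addr >= DEMO_AREA_START && addr % DEMO_PAGE_SIZE == 0);
    return (uint32_t)((addr - DEMO_AREA_START) / DEMO_PAGE_SIZE);
}
```

With an inverse mapping such as demo_get_page, the search loop in the free routine could be replaced by a direct computation of the page number from the address.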
      
      





This is quite enough to introduce full support for virtual memory in your kernel.



References



Details and explanations are in the video tutorial.



The source code in the git repository (you need the lesson6 branch).



Bibliography



1. James Molloy. Roll your own toy UNIX-clone OS.

2. Zubkov. Assembler for DOS, Windows and Unix.

3. Kalashnikov. Assembler is easy!

4. Tanenbaum. Operating Systems: Design and Implementation.

5. Robert Love. Linux Kernel Development.


