In a previous article, we learned how to work with virtual address space. Today we will add multitasking support.
Table of contents
Build system (make, gcc, gas). Initial boot (multiboot). Launch (qemu). C library (strcpy, memcpy, strext).
C library (sprintf, strcpy, strcmp, strtok, va_list ...). Building the library in kernel mode and user application mode.
The kernel system log. Video memory Output to the terminal (kprintf, kpanic, kassert).
Dynamic memory, heap (kmalloc, kfree).
Organization of memory and interrupt handling (GDT, IDT, PIC, syscall). Exceptions
Virtual memory (page directory and page table).
Process. Scheduler. Multitasking. System calls (kill, exit, ps).
The file system of the kernel (initrd), elf and its internals. System calls (exec).
Character device drivers. System calls (ioctl, fopen, fread, fwrite). C library (fopen, fclose, fprintf, fscanf).
Shell as a complete program for the kernel.
User protection mode (ring3). Task state segment (tss).
Multitasking
As you remember, each process should have its own address space.
To do this, you need to count the pages used by the process.
Therefore, we must describe the memory of the process:
struct task_mem_t { void* pages; u_int pages_count; void* page_dir; void* page_table; };
Processes somehow have to exchange information with each other. To do this, we introduce the concept of a message:
struct message_t { u_short type; u_int len; u8 data[IPC_MSG_DATA_BUFF_SIZE]; };
After that, you need to describe the process itself as an element of a circular list.
The cyclic list here is convenient because in the scheduler you need to select the next from each task until the list is empty.
By this we remove the degenerate case, which means we simplify the logic.
struct task_t { struct clist_head_t list_head; u_short tid; char name[8]; struct gp_registers_t gp_registers; struct op_registers_t op_registers; struct flags_t flags; u_int time; bool reschedule; u_short status; int msg_count_in; struct message_t msg_buff[TASK_MSG_BUFF_SIZE]; void* kstack; void* ustack; struct task_mem_t task_mem; } attribute(packed);
We will allocate a quota to each process, the number of timer interruptions after which the process will be rescheduled.
#define TASK_QUOTA 3
Next, you need to write a function to create a new process.
While we do not have privilege levels, therefore we will use kernel selectors.
We will need 2 stacks for the future, one for user mode and one for the kernel.
We will set the initial status as TASK_UNINTERRUPTABLE so that the task does not have time to complete before full initialization.
extern bool task_create(u_short tid, void* address, struct task_mem_t *task_mem) { struct task_t* task; struct clist_head_t* entry; entry = clist_insert_entry_after(&task_list, task_list.head); task = (struct task_t*)entry->data; task->kstack = malloc(TASK_KSTACK_SIZE); task->ustack = malloc(TASK_USTACK_SIZE); task->tid = tid; task->name[0] = '\0'; task->status = TASK_UNINTERRUPTABLE; task->msg_count_in = 0; task->time = 0; memcpy(&task->task_mem, task_mem, sizeof(struct task_mem_t)); *(u32*)(&task->flags) = asm_get_eflags() | 0x200; memset(&task->gp_registers, 0, sizeof(struct gp_registers_t)); task->op_registers.cs = GDT_KCODE_SELECTOR; task->op_registers.ds = GDT_KDATA_SELECTOR; task->op_registers.ss = GDT_KSTACK_SELECTOR; task->op_registers.eip = (size_t)address; task->op_registers.cr3 = (size_t)task_mem->page_dir; task->op_registers.k_esp = (u32)task->kstack + TASK_KSTACK_SIZE; task->op_registers.u_esp = (u32)task->ustack + TASK_USTACK_SIZE; printf(MSG_SCHED_TID_CREATE, tid, (u_int)address, task->kstack, task->ustack); return true; }
A process will be deleted by the scheduler if the status of the process is TASK_KILLING.
When deleting, we just need to free the stacks, pages allocated for data and code sections, and also destroy the directory of process pages.
In general, for a good user stack, you can allocate through the memory manager, but for convenience, debugging while we implement it in the heap of the kernel, which is always in the page directories.
extern void task_delete(struct task_t* task) { printf(MSG_SCHED_TID_DELETE, (u_int)task->tid); assert(task != null); free(task->kstack); free(task->ustack); task->kstack = null; task->ustack = null; if (task->task_mem.pages_count > 0) { mm_phys_free_pages(task->task_mem.pages, task->task_mem.pages_count); task->task_mem.pages = null; task->task_mem.pages_count = 0; } if (task->task_mem.page_dir != null) { mmu_destroy_user_page_directory(task->task_mem.page_dir, task->task_mem.page_table); } clist_delete_entry(&task_list, (struct clist_head_t*)task); }
Now you need to write the scheduler itself.
First we need to understand if the current task is running or is this the first run.
If the task is being executed, you need to check whether its quota has been exhausted.
If so, then you need to save its state by explicitly specifying the enable interrupt flag.
After that, you need to find the next task for execution in the status TASK_RUNNING.
Next, you need to complete the current task if it is in the TASK_KILLING status.
After that, we prepare the stack frame for the next task and switch the context.
extern void sched_schedule(size_t* ret_addr, size_t* reg_addr) { struct task_t* next_task = null; if (current_task != null) { current_task->time += 1; if (current_task->time < TASK_QUOTA && !current_task->reschedule) { return; } current_task->time = 0; current_task->reschedule = false; current_task->op_registers.eip = *ret_addr; current_task->op_registers.cs = *(u16*)((size_t)ret_addr + 4); *(u32*)(¤t_task->flags) = *(u32*)((size_t)ret_addr + 6) | 0x200; current_task->op_registers.u_esp = (size_t)ret_addr + 12; current_task->gp_registers.esp = current_task->op_registers.u_esp; memcpy(¤t_task->gp_registers, (void*)reg_addr, sizeof(struct gp_registers_t)); } if (current_task) { next_task = task_get_next_by_status(TASK_RUNNING, current_task); } else { next_task = task_get_by_status(TASK_RUNNING); tss_set_kernel_stack(next_task->kstack); } assert(next_task != null); if (current_task && current_task->status == TASK_KILLING) { task_delete(current_task); } else { struct task_t* task; task = task_find_by_status(TASK_KILLING); if (task) { task_delete(task); } } next_task->op_registers.u_esp -= 4; *(u32*)(next_task->op_registers.u_esp) = (*(u16*)(&next_task->flags)) | 0x200; next_task->op_registers.u_esp -= 4; *(u32*)(next_task->op_registers.u_esp) = next_task->op_registers.cs; next_task->op_registers.u_esp -= 4; *(u32*)(next_task->op_registers.u_esp) = next_task->op_registers.eip; next_task->gp_registers.esp = next_task->op_registers.u_esp; next_task->op_registers.u_esp -= sizeof(struct gp_registers_t); memcpy((void*)next_task->op_registers.u_esp, (void*)&next_task->gp_registers, sizeof(struct gp_registers_t)); current_task = next_task; printf(MSG_SCHED_NEXT, next_task->tid, next_task->op_registers.u_esp, *ret_addr, next_task->op_registers.eip); asm_switch_ucontext(next_task->op_registers.u_esp, next_task->op_registers.cr3); }
It remains to write the task context switching function.
To do this, download the new page directory and restore all general registers, including flags.
You also need to switch to the stack of the next task.
Next, you need to make a return from the interrupt, since we formed a special stack frame for this.
For the fact that later we will need this to switch privilege levels.
/* * Switch context (to kernel ring) * void asm_switch_kcontext(u32 esp, u32 cr3) */ asm_switch_kcontext: mov 4(%esp),%ebp # ebp = esp mov 8(%esp),%eax # eax = cr3 mov %cr0,%ebx # ebx = cr0 and $0x7FFFFFFF,%ebx # unset PG bit mov %ebx,%cr0 mov %eax,%cr3 or $0x80000001,%ebx # set PE & PG bits mov %ebx,%cr0 mov %ebp,%esp popal sti iretl
We implement the simplest messaging mechanism between processes.
When we send a message to a process, it needs to be thawed if it is frozen because it is waiting for messages.
But when we receive a message, we must freeze the process if the message queue is empty.
After that, you need to transfer control to the scheduler.
extern void ksend(u_short tid, struct message_t* msg) { struct task_t* task; task = task_get_by_id(tid); task_pack_message(task, msg); if (task->status == TASK_INTERRUPTABLE) { task->status = TASK_RUNNING; } } extern void kreceive(u_short tid, struct message_t* msg) { struct task_t* task_before; struct task_t* task_after; task_before = sched_get_current_task(); assert(tid == task_before->tid); assert(task_before->status == TASK_RUNNING); if (task_before->msg_count_in == 0) { task_before->status = TASK_INTERRUPTABLE; } sched_yield(); task_after = sched_get_current_task(); assert(task_after == task_before); assert(tid == task_after->tid); assert(task_after->status == TASK_RUNNING); task_extract_message(task_after, msg); assert(msg != null); }
Now you need to write system calls to work with processes from user applications.
There we will throw system calls to send and receive messages.
extern size_t ih_syscall(u_int* function) { size_t params_addr = ((size_t)function + sizeof(u_int)); size_t result = 0; struct task_t *current = sched_get_current_task(); printf(MSG_SYSCALL, *function, current->tid); asm_lock(); switch (*function) { case SYSCALL_KSEND: { u_short tid = *(u_int*)params_addr; ksend(tid, *(struct message_t**)(params_addr + 4)); break; } case SYSCALL_KRECEIVE: { u_short tid = *(u_int*)params_addr; kreceive(tid, *(struct message_t**)(params_addr + 4)); break; } case SYSCALL_KILL: { u_short tid = *(u_int*)params_addr; struct task_t* task = task_find_by_id(tid); if (task != null) { assert(task->tid == tid); task->status = TASK_KILLING; task->reschedule = true; result = true; } else { result = false; } break; } case SYSCALL_EXIT: { int errno = *(int*)params_addr; u_int tid = current->tid; printf(MSG_TASK_FINISHED, tid, errno); current->status = TASK_KILLING; sched_yield(); break; } case SYSCALL_TASK_LIST: { result = (size_t)task_get_task_list(); break; } default: unreachable(); } printf(MSG_SYSCALL_RET, *function); asm_unlock(); return result; }
In order to use the system calls from our C library, we need to write a function that causes a kernel interrupt. For the sake of simplicity, we will not use special Intel commands since we do not have privilege levels. As an argument, we will pass the number of the system function and the arguments it needs.
/* * Call kernel * void asm_syscall(unsigned int function, ...) */ asm_syscall: push %ebx push %ebp mov %esp,%ebp mov %ebp,%ebx add $8,%ebx # skip registers add $4,%ebx # skip return address push %ebx # &function int $0x80 mov %ebp,%esp pop %ebp pop %ebx ret
This is quite enough to implement the simplest multitasking without the support of protection rings. Now open the
video tutorial and watch everything in order!
References
Details and explanations in the
video tutorial .
The source code
in the git repository (you need the lesson7 branch).
Bibliography
1. James Molloy. Roll your own toy UNIX-clone OS.
2. Teeth. Assembler for DOS, Windows, Unix
3. Kalashnikov. Assembler is easy!
4. Tanenbaum. Operating Systems. Implementation and development.
5. Robert Love. Linux kernel Description of the development process.