Unix-like OS development - Multitasking and system calls (7)

In a previous article, we learned how to work with virtual address space. Today we will add multitasking support.



Table of contents



Build system (make, gcc, gas). Initial boot (multiboot). Launch (qemu). C library (strcpy, memcpy, strext).



C library (sprintf, strcpy, strcmp, strtok, va_list ...). Building the library in kernel mode and user application mode.



The kernel system log. Video memory Output to the terminal (kprintf, kpanic, kassert).

Dynamic memory, heap (kmalloc, kfree).



Organization of memory and interrupt handling (GDT, IDT, PIC, syscall). Exceptions

Virtual memory (page directory and page table).



Process. Scheduler. Multitasking. System calls (kill, exit, ps).

The file system of the kernel (initrd), elf and its internals. System calls (exec).



Character device drivers. System calls (ioctl, fopen, fread, fwrite). C library (fopen, fclose, fprintf, fscanf).



Shell as a complete program for the kernel.



User protection mode (ring3). Task state segment (tss).



Multitasking



As you remember, each process should have its own address space.



To do this, you need to count the pages used by the process.



Therefore, we must describe the memory of the process:



/* * Process memory description */ struct task_mem_t { void* pages; /* task physical pages */ u_int pages_count; /* task physical pages count */ void* page_dir; /* page directory */ void* page_table; /* page table */ };
      
      





Processes somehow have to exchange information with each other. To do this, we introduce the concept of a message:



 struct message_t { u_short type; /* message type */ u_int len; /* data length */ u8 data[IPC_MSG_DATA_BUFF_SIZE]; /* message data */ };
      
      





After that, you need to describe the process itself as an element of a circular list.



The cyclic list here is convenient because in the scheduler you need to select the next from each task until the list is empty.



By this we remove the degenerate case, which means we simplify the logic.



 /* * Process descriptor */ struct task_t { struct clist_head_t list_head; /* should be at first */ u_short tid; /* task id */ char name[8]; /* task name */ struct gp_registers_t gp_registers; /* general purpose registers */ struct op_registers_t op_registers; /* other purpose registers */ struct flags_t flags; /* processor flags */ u_int time; /* time of task execution */ bool reschedule; /* whether task need to be rescheduled */ u_short status; /* task status */ int msg_count_in; /* count of incomming messages */ struct message_t msg_buff[TASK_MSG_BUFF_SIZE]; /* task message buffer */ void* kstack; /* kernel stack top */ void* ustack; /* user stack top */ struct task_mem_t task_mem; /* task memory */ } attribute(packed);
      
      





We will allocate a quota to each process, the number of timer interruptions after which the process will be rescheduled.



 #define TASK_QUOTA 3
      
      





Next, you need to write a function to create a new process.



While we do not have privilege levels, therefore we will use kernel selectors.



We will need 2 stacks for the future, one for user mode and one for the kernel.



We will set the initial status as TASK_UNINTERRUPTABLE so that the task does not have time to complete before full initialization.



 /* * Api - Create new task */ extern bool task_create(u_short tid, void* address, struct task_mem_t *task_mem) { struct task_t* task; struct clist_head_t* entry; /* allocate memory */ entry = clist_insert_entry_after(&task_list, task_list.head); task = (struct task_t*)entry->data; task->kstack = malloc(TASK_KSTACK_SIZE); task->ustack = malloc(TASK_USTACK_SIZE); /* fill data */ task->tid = tid; task->name[0] = '\0'; task->status = TASK_UNINTERRUPTABLE; task->msg_count_in = 0; task->time = 0; memcpy(&task->task_mem, task_mem, sizeof(struct task_mem_t)); /* set flags */ *(u32*)(&task->flags) = asm_get_eflags() | 0x200; /* set general purpose registers */ memset(&task->gp_registers, 0, sizeof(struct gp_registers_t)); /* set other purpose registers */ task->op_registers.cs = GDT_KCODE_SELECTOR; task->op_registers.ds = GDT_KDATA_SELECTOR; task->op_registers.ss = GDT_KSTACK_SELECTOR; task->op_registers.eip = (size_t)address; task->op_registers.cr3 = (size_t)task_mem->page_dir; task->op_registers.k_esp = (u32)task->kstack + TASK_KSTACK_SIZE; task->op_registers.u_esp = (u32)task->ustack + TASK_USTACK_SIZE; printf(MSG_SCHED_TID_CREATE, tid, (u_int)address, task->kstack, task->ustack); return true; }
      
      





A process will be deleted by the scheduler if the status of the process is TASK_KILLING.

When deleting, we just need to free the stacks, pages allocated for data and code sections, and also destroy the directory of process pages.



In general, for a good user stack, you can allocate through the memory manager, but for convenience, debugging while we implement it in the heap of the kernel, which is always in the page directories.



 /* * Api - Delete task by id */ extern void task_delete(struct task_t* task) { printf(MSG_SCHED_TID_DELETE, (u_int)task->tid); assert(task != null); /* free stack memory */ free(task->kstack); free(task->ustack); task->kstack = null; task->ustack = null; /* free user pages memory */ if (task->task_mem.pages_count > 0) { mm_phys_free_pages(task->task_mem.pages, task->task_mem.pages_count); task->task_mem.pages = null; task->task_mem.pages_count = 0; } /* clear resources */ if (task->task_mem.page_dir != null) { mmu_destroy_user_page_directory(task->task_mem.page_dir, task->task_mem.page_table); } clist_delete_entry(&task_list, (struct clist_head_t*)task); }
      
      





Now you need to write the scheduler itself.



First we need to understand if the current task is running or is this the first run.

If the task is being executed, you need to check whether its quota has been exhausted.

If so, then you need to save its state by explicitly specifying the enable interrupt flag.

After that, you need to find the next task for execution in the status TASK_RUNNING.

Next, you need to complete the current task if it is in the TASK_KILLING status.

After that, we prepare the stack frame for the next task and switch the context.



 /* * Api - Schedule task to run */ extern void sched_schedule(size_t* ret_addr, size_t* reg_addr) { struct task_t* next_task = null; /* finish current task */ if (current_task != null) { /* update running time */ current_task->time += 1; /* check quota exceed */ if (current_task->time < TASK_QUOTA && !current_task->reschedule) { return; /* continue task execution */ } /* reset quota */ current_task->time = 0; current_task->reschedule = false; /* save task state */ current_task->op_registers.eip = *ret_addr; current_task->op_registers.cs = *(u16*)((size_t)ret_addr + 4); *(u32*)(&current_task->flags) = *(u32*)((size_t)ret_addr + 6) | 0x200; current_task->op_registers.u_esp = (size_t)ret_addr + 12; current_task->gp_registers.esp = current_task->op_registers.u_esp; memcpy(&current_task->gp_registers, (void*)reg_addr, sizeof(struct gp_registers_t)); } /* pick next task */ if (current_task) { next_task = task_get_next_by_status(TASK_RUNNING, current_task); } else { next_task = task_get_by_status(TASK_RUNNING); tss_set_kernel_stack(next_task->kstack); } assert(next_task != null); /* whether should kill current task */ if (current_task && current_task->status == TASK_KILLING) { /* kill current task */ task_delete(current_task); } else { /* try to kill killing tasks */ struct task_t* task; task = task_find_by_status(TASK_KILLING); if (task) { task_delete(task); } } /* prepare context for the next task */ next_task->op_registers.u_esp -= 4; *(u32*)(next_task->op_registers.u_esp) = (*(u16*)(&next_task->flags)) | 0x200; next_task->op_registers.u_esp -= 4; *(u32*)(next_task->op_registers.u_esp) = next_task->op_registers.cs; next_task->op_registers.u_esp -= 4; *(u32*)(next_task->op_registers.u_esp) = next_task->op_registers.eip; next_task->gp_registers.esp = next_task->op_registers.u_esp; next_task->op_registers.u_esp -= sizeof(struct gp_registers_t); memcpy((void*)next_task->op_registers.u_esp, (void*)&next_task->gp_registers, sizeof(struct gp_registers_t)); /* update current task pointer */ current_task = next_task; /* run next task */ printf(MSG_SCHED_NEXT, next_task->tid, next_task->op_registers.u_esp, *ret_addr, next_task->op_registers.eip); asm_switch_ucontext(next_task->op_registers.u_esp, next_task->op_registers.cr3); }
      
      





It remains to write the task context switching function.



To do this, download the new page directory and restore all general registers, including flags.



You also need to switch to the stack of the next task.



Next, you need to make a return from the interrupt, since we formed a special stack frame for this.



For the fact that later we will need this to switch privilege levels.



 /* * Switch context (to kernel ring) * void asm_switch_kcontext(u32 esp, u32 cr3) */ asm_switch_kcontext: mov 4(%esp),%ebp # ebp = esp mov 8(%esp),%eax # eax = cr3 mov %cr0,%ebx # ebx = cr0 and $0x7FFFFFFF,%ebx # unset PG bit mov %ebx,%cr0 mov %eax,%cr3 or $0x80000001,%ebx # set PE & PG bits mov %ebx,%cr0 mov %ebp,%esp popal sti iretl
      
      





We implement the simplest messaging mechanism between processes.



When we send a message to a process, it needs to be thawed if it is frozen because it is waiting for messages.



But when we receive a message, we must freeze the process if the message queue is empty.



After that, you need to transfer control to the scheduler.



 /* * Api - Send message to task */ extern void ksend(u_short tid, struct message_t* msg) { struct task_t* task; /* get target task */ task = task_get_by_id(tid); /* put message to task buffer */ task_pack_message(task, msg); /* whether task has frozen */ if (task->status == TASK_INTERRUPTABLE) { /* defrost task */ task->status = TASK_RUNNING; } } /* * Api - Receive message to task * This function has blocking behaviour */ extern void kreceive(u_short tid, struct message_t* msg) { struct task_t* task_before; /* before yield */ struct task_t* task_after; /* after yield */ /* get current task */ task_before = sched_get_current_task(); assert(tid == task_before->tid); assert(task_before->status == TASK_RUNNING); /* whether task has not incomming messages */ if (task_before->msg_count_in == 0) { /* freeze task */ task_before->status = TASK_INTERRUPTABLE; } /* wait fot message */ sched_yield(); /* get current task */ task_after = sched_get_current_task(); assert(task_after == task_before); assert(tid == task_after->tid); assert(task_after->status == TASK_RUNNING); /* fetch message from buffer */ task_extract_message(task_after, msg); assert(msg != null); }
      
      





Now you need to write system calls to work with processes from user applications.



There we will throw system calls to send and receive messages.



 /* * Api - Syscall handler */ extern size_t ih_syscall(u_int* function) { size_t params_addr = ((size_t)function + sizeof(u_int)); size_t result = 0; struct task_t *current = sched_get_current_task(); printf(MSG_SYSCALL, *function, current->tid); asm_lock(); /* handle syscall */ switch (*function) { case SYSCALL_KSEND: { /* send message */ u_short tid = *(u_int*)params_addr; ksend(tid, *(struct message_t**)(params_addr + 4)); break; } case SYSCALL_KRECEIVE: { /* receive message */ u_short tid = *(u_int*)params_addr; kreceive(tid, *(struct message_t**)(params_addr + 4)); break; } case SYSCALL_KILL: { /* kill task */ u_short tid = *(u_int*)params_addr; struct task_t* task = task_find_by_id(tid); if (task != null) { assert(task->tid == tid); task->status = TASK_KILLING; task->reschedule = true; result = true; } else { result = false; } break; } case SYSCALL_EXIT: { /* exit task */ int errno = *(int*)params_addr; u_int tid = current->tid; printf(MSG_TASK_FINISHED, tid, errno); current->status = TASK_KILLING; sched_yield(); break; } case SYSCALL_TASK_LIST: { /* get tasks list */ result = (size_t)task_get_task_list(); break; } default: unreachable(); } printf(MSG_SYSCALL_RET, *function); asm_unlock(); return result; }
      
      





In order to use the system calls from our C library, we need to write a function that causes a kernel interrupt. For the sake of simplicity, we will not use special Intel commands since we do not have privilege levels. As an argument, we will pass the number of the system function and the arguments it needs.



 /* * Call kernel * void asm_syscall(unsigned int function, ...) */ asm_syscall: push %ebx push %ebp mov %esp,%ebp mov %ebp,%ebx add $8,%ebx # skip registers add $4,%ebx # skip return address push %ebx # &function int $0x80 mov %ebp,%esp pop %ebp pop %ebx ret
      
      





This is quite enough to implement the simplest multitasking without the support of protection rings. Now open the video tutorial and watch everything in order!



References



Details and explanations in the video tutorial .



The source code in the git repository (you need the lesson7 branch).



Bibliography



1. James Molloy. Roll your own toy UNIX-clone OS.

2. Teeth. Assembler for DOS, Windows, Unix

3. Kalashnikov. Assembler is easy!

4. Tanenbaum. Operating Systems. Implementation and development.

5. Robert Love. Linux kernel Description of the development process.



All Articles