🌑 🏉 💪🏿 How I came to the formal specification of a RISC-V processor in F # 🐅 🧒🏻 🕴️

On languid winter evenings, when the sun lazily passed through the veil of days - I found the strength to tackle the realization of a long-held dream: to figure out how the processors are arranged. This dream led me to write a formal specification for a RISC-V processor. Github Project

How it was

I had such a desire a long time ago, when 20 years ago I began to engage in my first projects. For the most part, these were scientific research, mathematical modeling in the framework of term papers and scientific articles. These were the days of Pascal and Delphi. However, even then, Haskell and functional programming attracted my interest. Time passed, the languages of the projects and technologies in which I was involved changed. But since then interest in functional programming languages has been a common thread, and they have become: Haskell, Idris, Agda. Recently, however, my projects have been in Rust. A deeper immersion in Rust led me to study Embedded devices.

From Rust to Embedded

Rust's capabilities are so wide, and the community is so active that Embedded development has begun to support a wide range of devices. And this was my first step into a lower-level understanding of processors.

My first board was: STM32F407VET6 . It was an immersion in the world of microcontrollers, from which at that time I was very far away, and understood approximately enough how the work was done at a low level.

Gradually, esp32 , ATmega328 boards (represented by the Ukraino UNO board) were added here. Immersion in stm32 turned out to be quite painful - the information is quite plentiful and often not the one I need. And it turned out that developing, for example, on Assembler is a rather routine and ungrateful task, with their subset of more than 1000 instructions. However, Rust handled this cheerfully, although at times there were difficulties with integration for specific Chinese boards.

The AVR architecture turned out to be noticeably simpler and more transparent. The abundant manuals gave me sufficient understanding of how to work with such a fairly limited set of instructions, and still be able to create very interesting solutions. Nevertheless, the Arduino path did not please me at all, but writing in Asm / C / Rust turned out to be much more interesting.

Where's the RISC-V?

And at that moment a logical question arises - where is the RISC-V CPU ?

The minimalistic nature of AVR and its sufficient documentation, I was returned to my previous dream to figure out how the processor works. By this time, I had an FPGA board and the first implementations for it in the form of interaction with VGA devices, graphics output, and interaction with peripherals.

Books were my guides to processor architecture:

John L. Hennessy and David A. Patterson - Computer Architecture: A Quantitative Approach (The Morgan Kaufmann Series in Computer Architecture and Design)
John L. Hennessy and David A. Patterson - Computer Organization and Design. The Hardware / Software Interface: RISC-V Edition
David M. Harris and Sarah L. Harris - Digital circuitry and computer architecture
The RISC-V Instruction Set Manual

Why is it necessary

It would seem - everything has already been written and implemented for a long time.

various implementations in HDL, and programming languages. By the way, quite an interesting implementation of RISC-V on Rust .

However, what could be more interesting than figuring it out yourself and creating your own. Your bike ? Or contribute to bike building ? In addition to personal deep interest, I had an idea - and how to try to popularize, to interest. And find your form, your approach. And that means presenting the rather boring RISC-V ISA documentation in the form of an official specification in a different form. And it seems to me that the path of formalization in this sense is quite interesting.

What do I mean by formalization? A fairly broad concept. Representation of a specific data set in a formal form. In this case, through a description of structures and a functional description. And in this sense, functional programming languages have their own charm. Also, the task is that a person who is not very immersed in programming can read the code as a specification, possibly minimizing the specifics of the language in which it is described.

A declarative approach, so to speak. There is a saying, but how exactly it works is no longer essential. The main thing is readability, visibility, and of course correctness. Correspondence of formal statements to the meaning embedded in them.

Total - I'm really curious to pass on my interest to others. There is a certain illusion that interest is the driving force for actions. Through which individuality becomes and manifests. And this is part of self-realization, the embodiment of creativity.

Ambitious and a bit of lyrics. What next?

Existing Implementations

They are and at the moment they are aggregated by the project: RISC-V Formal Verification .

List of formal specifications (including my work): https://github.com/SymbioticEDA/riscv-formal/blob/master/docs/references.md

As you can see - for the most part these are formalizations in the Haskell language. This was the starting point in choosing a different functional language. And my choice fell on F # .

Why `F#`

It so happened that I have known about F # for a long time, but somehow in the hustle and bustle of everyday life I did not have the opportunity to get to know each other better. Another factor was the .NET platform. Taking into account that I am under Linux, for a long time I was not happy with the IDE, and mono

looked raw enough. And returning to Windows just for the sake of MS Visual Studio is a rather dubious idea.

However, time does not stand still, and the stars in the sky are in no hurry to change. But by this time, Jetbrains Rider had evolved to a complete and convenient tool, and .NET Core

for Linux does not bring pain at a glance at it.

The question was - which functional language to choose. The fact that it should be just a functional language - in a somewhat pathetic form, I argued above.

Haskell, Idris, Agda

? F#

- I'm not familiar with him. A great occasion to learn new colors of the world of functional languages.

Yes, F#

not purely functional. But what prevents to adhere to " purity "? And then it turned out - that the F # documentation is quite detailed and complete. Readable, and I would even say interesting.

What has F#

become for me now? A fairly flexible language, with very convenient IDEs (Rider, Visual Studio). Fully developed types (although of course Idris

very far away). And overall pretty sweet in terms of readability. However, as it turned out, its functional " not cleanliness " - can lead the code to a completely insane form, both in terms of readability and logic. The analysis of packets in Nuget demonstrates this.

Another interesting and mysterious feature for me was the discovery that no one was interested in writing the RISC-V ISA formalization in F # before (officially or in a searchable form). And this means that I have a chance to introduce a new stream into this community, language, and “ ecosystem ”.

The pitfalls I encountered

The most difficult part was the Execution flow implementation. It often turned out that it was not entirely clear how the instruction should work. Unfortunately, I couldn’t ask a reliable comrade who could call at 3 a.m. and with a worried, aspirated voice: “Do you know, the BLTU

instruction is probably different in signextend ...” In this sense, having qualified comrades who will help with a kind word and appropriate qualified advice is very welcome.

What were the difficulties and pitfalls. I'll try thesis:

ELF - a curious task was to figure out how to work with it, read sections, instructions. Most likely this story is not finished within the framework of the current project.
unsigned instructions periodically led to errors that I detected during unit testing
the implementation of working with memory required thinking about beautiful and readable byte-layout algorithms.
there was no suitable package for working with bits under int32, int64

. It took time to write my package and test it.

Here I want to note that working with bits in F # is much more convenient than in Haskell with its Data.Bits
proper support for register bits, with the ability to support x32

and x64

at the same time. Inattention led me to use int64

in some places. Unit tests helped me identify this. But it took a while.
I did not find a simple, concise, personally convenient CLI F # package for me. A side effect was writing a minimalistic version in a functional style.
At the moment, it remains a mystery how to correctly implement System Instructions: FENCE, ECALL, BREAK
far from the whole set of extensions (ISA extensions) from the list: [A, M, C, F, D]

currently obvious. In particular, the implementation of [F,D]

not via soft float

.
at the moment, there is no clear understanding of Privileged Instructions, User Mod, work with peripherals - alas. And this is the way of research, trial and error.
There is no black belt for writing Assembler programs under RISC-V. Perhaps far from often this will be a need, given how many languages have already been ported for writing under RISC-V.
the time factor also turned out to be significant - it is quite small in the maelstrom of basic work, everyday needs and the ocean of life around. And there is a lot of work, and most of it is not so much in " coding " - writing the code itself, but in learning, mastering the materials.

How it works and what features

Now perhaps the most technical part. What are the features at the moment:

rv32i

instruction set rv32i
the ability to run the program as a RISC-V simulator - the execution of ELF files.
command line support (CLI) - selection of executable architecture, set of instructions, executable ELF files, logging mode, command line help.
the ability to display the log of executable instructions, close to objdump

view when disassembling.
the ability to run tests covering the entire implemented set of instructions.

The program is divided into the following stages and cycles:

read command line
reading instructions from an ELF file
reading a specific instruction according to the current PC (Program Counter) counter
decoding instructions
instruction execution
in the event of unforeseen situations, traps are set up, allowing you to complete the execution process, signaling a problem, and providing the necessary data
if the program is not in an infinite loop - display the status of the registers and end the simulation program

What is included in the plans:

Standard base 64i (almost complete)
Standard extension M (integer multiply / divide)
Standard extension A (atomic memory ops)
Standard extension C (Compressed 16-bit instructions)
Standard extension F (Single-precision floating point)
Standard extension D (Double-precision floating point * Privilege Level M (Machine)
Privilege Level U (User)
Privilege Level S (Supervisor)
Virtual Memory schemes SV32, SV39 and SV48
host programs
GPIO - work with peripherals

How to start

In order to run the program, you must have .NET Core

. If you haven’t installed it, then for example, under Ubuntu 16.04

you need to run the following set of commands:

 $ wget -q https://packages.microsoft.com/config/ubuntu/16.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb $ sudo dpkg -i packages-microsoft-prod.deb $ sudo apt-get update $ sudo apt-get install apt-transport-https $ sudo apt-get update $ sudo apt-get install dotnet-sdk-3.0

To verify that something in life has changed, run:

 $ dotnet --version

And life should sparkle with new colors!

Now try to run. To do this, stock up on your favorite tea or coffee, chocolate with buns, turn on your favorite music and follow this set of commands:

 $ cd some/my/dev/dir $ git clone https://github.com/mrLSD/riscv-fs $ cd riscv-fs $ dotnet build $ dotnet run -- --help

and your console should playfully wink at you with a help message.

The launch is:

 $ dotnet run

In a strict tone he will say that parameters are needed. Therefore, run:

 $ dotnet run -- -A rv32i -v myapp.elf

This is the same awkward moment when it turns out that we still need a ready-made ready for execution program for RISC-V. And there is something for me to help you with. However, you will need the GNU toolchain for RISC-V . Let it be installed homework - the description of the repository describes in sufficient detail how to do this.

Next, to obtain the coveted test ELF file, we perform the following actions:

 $ cd Tests/asm/ $ make build32

if you have a RISC-V toolchain then everything should go smoothly. And the files should show off in the directory:

 $ ls Tests/asm/build/ add32 alu32 alui32 br32 j32 mem32 sys32 ui32

and now boldly, without looking back, we try the command:

 $ dotnet run -- -A rv32i -v Tests/asm/build/ui32

It is important to note that Tests/asm

not a test program, but their main purpose is test instructions and their codes for writing tests. Therefore, if you like something larger and more heroic, then changing the world in your will is to find an independently executable 32-bit ELF file that supports only rv32i

instructions.

However, the set of instructions and extensions will be replenished, gain momentum and gain weight.

Summary and links

It turned out to be a little pathetic narration flavored by personal history. Sometimes technical, sometimes subjective. However enthusiastic and with a touch of enthusiasm.

For my part, I am interested to hear from you: reviews, impressions, good parting words. And for the most daring - help in supporting the project.

Are you interested in such a narrative format and would you like to continue?

How I came to the formal specification of a RISC-V processor in F #