No longer relying on darr_t or anything other than the C runtime and aliases. This means it should be *even easier* to target this via FFI from other languages without having to initialise my custom made structures! Furthermore, I've removed any form of allocation in the library, so FFI callers don't need to manage memory in any way. Instead we rely on the caller allocating the correct amount of memory for the functions to work, with basic error handling if that doesn't happen.

In the case of inst_read_bytecode, errors are reported through the integer return value of the function. If the integer is positive, it is the number of bytes read from the buffer. If negative, it flags a possible error, which is a member of read_err_t.

prog_read_bytecode has been split into two functions: prog_read_header and prog_read_instructions. prog_read_instructions works under the assumption that the program's header has been filled, e.g. via prog_read_header. prog_read_header returns 0 if there's not enough space in the buffer or if the start_address is greater than the count. prog_read_instructions returns a custom structure which contains a byte position as well as an error enum, allowing for finer error reporting.

In the case of inst_write_bytecode, there is no need for error reporting, given the assumption that the caller allocated the correct memory. For prog_write_bytecode if an error occurs due to In the case of inst_read_bytecode we return the number
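The two reading conventions described above (a signed integer for a single instruction, and a position-plus-error structure for a whole program) can be sketched roughly as follows. This is only an illustration: every `sketch_` name, the error members and the two-byte instruction layout are assumptions made for the example, not the library's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for read_err_t: member names and values are
 * assumptions, chosen only so that every error is negative. */
typedef enum {
  SKETCH_ERR_BUFFER_TOO_SMALL = -1,
  SKETCH_ERR_INVALID_OPCODE = -2
} sketch_read_err_t;

/* Hypothetical instruction: one opcode byte followed by one operand byte. */
typedef struct {
  uint8_t opcode;
  uint8_t operand;
} sketch_inst_t;

#define SKETCH_OPCODE_MAX 3

/* Decode one instruction into caller-provided storage (no allocation).
 * Returns the number of bytes read (> 0) or a negative sketch_read_err_t. */
int sketch_inst_read(sketch_inst_t *out, const uint8_t *bytes, size_t size)
{
  if (size < 2)
    return SKETCH_ERR_BUFFER_TOO_SMALL;
  if (bytes[0] > SKETCH_OPCODE_MAX)
    return SKETCH_ERR_INVALID_OPCODE;
  out->opcode = bytes[0];
  out->operand = bytes[1];
  return 2;
}

/* Result of reading several instructions: where reading stopped and why. */
typedef struct {
  size_t position; /* byte offset reached in the buffer */
  int err;         /* 0 on success, otherwise a negative sketch_read_err_t */
} sketch_read_result_t;

/* Decode `count` instructions into the caller-allocated array `out`. */
sketch_read_result_t sketch_read_all(sketch_inst_t *out, size_t count,
                                     const uint8_t *bytes, size_t size)
{
  sketch_read_result_t result = {0, 0};
  for (size_t i = 0; i < count; ++i) {
    int ret = sketch_inst_read(&out[i], bytes + result.position,
                               size - result.position);
    if (ret < 0) {
      /* Stop at the first failure; position marks the offending offset. */
      result.err = ret;
      return result;
    }
    result.position += (size_t)ret;
  }
  return result;
}
```

A caller checks the sign of the single-instruction result, or inspects `err` and `position` on the structure to report exactly which byte offset failed to decode.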
Aryadev's Virtual Machine (AVM)
A stack based virtual machine in C11, with a dynamic register setup which acts as variable space. Deals primarily in bytes, doesn't make assertions about typing and is very simple to target.
This repository contains both a library (lib folder) to (de)serialize bytecode and a program (vm folder) to execute bytecode.
Along with this is an assembler program which can compile an assembly-like language to bytecode.
How to build
Requires GNU make and a compliant C11 compiler. The code base has
been tested against gcc and clang, but since the project has been
written without GNU-isms (that I'm aware of), it shouldn't be an
issue to compile using something like tcc or another compiler (look
at here to change the compiler).
To build everything simply run make. This will build:
- instruction bytecode system which provides object files to target the VM
- VM executable which executes bytecode
You may also build each component individually through the corresponding recipe:
- make lib
- make vm
How to target the virtual machine
Link with the object files for base.c and
inst.c to be able to properly target the virtual
machine. The general idea is to convert parse units into instances of
inst_t. Once a collection of inst_t's has been made, it must
be wrapped in a prog_t structure, which is a flexibly allocated
structure with two components:
- A program header prog_header_t with some essential properties of the program (start address, count, etc)
- A buffer of type inst_t which should contain the ordered collection constructed
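As a rough illustration of the "header plus buffer" layout described above, here is a minimal sketch using a C11 flexible array member. The `sketch_` types, their fields and the helper are assumptions made for the example; the real inst_t, prog_header_t and prog_t are defined by the library and may well differ.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for inst_t, prog_header_t and prog_t. */
typedef struct {
  uint8_t opcode;
  uint8_t operand;
} sketch_inst_t;

typedef struct {
  uint64_t start_address; /* index of the first instruction to execute */
  uint64_t count;         /* number of instructions in the buffer */
} sketch_prog_header_t;

typedef struct {
  sketch_prog_header_t header;
  sketch_inst_t instructions[]; /* flexible array member, sized at allocation */
} sketch_prog_t;

/* Wrap an ordered collection of instructions in one caller-owned block.
 * Returns NULL if start is greater than count or if allocation fails. */
sketch_prog_t *sketch_prog_wrap(const sketch_inst_t *insts, uint64_t count,
                                uint64_t start)
{
  if (start > count)
    return NULL;
  /* One allocation covers the header and all `count` instructions. */
  sketch_prog_t *prog = malloc(sizeof *prog + count * sizeof *insts);
  if (prog == NULL)
    return NULL;
  prog->header.start_address = start;
  prog->header.count = count;
  if (count > 0)
    memcpy(prog->instructions, insts, count * sizeof *insts);
  return prog;
}
```

In this sketch the allocation happens on the caller's side and the result is released with `free`, so the header and the instruction buffer always travel together as a single block.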
There are two ways to execute this program structure: compilation or in-memory execution.
Compilation
The prog_t structure can be fed to prog_write_file with a file
pointer to write well formed AVM bytecode into a file. To execute
this bytecode, simply use the avm.out executable with the bytecode
file name.
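The overall shape of "write the program through a file pointer" might look like the sketch below. It uses a made-up byte layout (a two-byte header followed by two bytes per instruction), which is not the real AVM bytecode format, and `sketch_write_file` is a hypothetical helper rather than the signature of prog_write_file.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for inst_t. */
typedef struct {
  uint8_t opcode;
  uint8_t operand;
} sketch_inst_t;

/* Write a made-up layout: 1 byte start address, 1 byte instruction count,
 * then 2 bytes per instruction. Returns 1 on success, 0 on a short write. */
int sketch_write_file(FILE *fp, uint8_t start, const sketch_inst_t *insts,
                      uint8_t count)
{
  uint8_t header[2] = {start, count};
  if (fwrite(header, 1, sizeof header, fp) != sizeof header)
    return 0;
  for (uint8_t i = 0; i < count; ++i) {
    uint8_t bytes[2] = {insts[i].opcode, insts[i].operand};
    if (fwrite(bytes, 1, sizeof bytes, fp) != sizeof bytes)
      return 0;
  }
  return 1;
}
```

The caller opens the file in binary mode (for example with `fopen(path, "wb")`), passes the pointer in, and closes it afterwards; the helper itself neither opens nor allocates anything.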
This is the classical way I expect languages to target the virtual machine.
In memory virtual machine
This method requires linking with vm/runtime.c to be able to
construct a working vm_t structure. The steps are:
- Load the stack, heap and call stack into a vm_t structure
- Load the prog_t into the vm_t (vm_load_program)
- Execute via vm_execute or vm_execute_all
vm_execute executes the next instruction and stops, while
vm_execute_all continues execution till the program halts. Either
can be useful depending on requirements.
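The difference between single-stepping and running to completion can be illustrated with a toy stack machine. Everything here is a stand-in: the `sketch_` and `SK_` names, the three opcodes and the error codes are assumptions for the example, and the real vm_t, vm_load_program, vm_execute and vm_execute_all have their own definitions in vm/runtime.h.

```c
#include <stddef.h>
#include <stdint.h>

enum { SK_HALT = 0, SK_PUSH = 1, SK_ADD = 2 }; /* hypothetical opcodes */

typedef struct {
  uint8_t opcode;
  uint8_t operand;
} sketch_inst_t;

typedef enum { SK_OK = 0, SK_ERR_STACK = 1, SK_ERR_OPCODE = 2 } sketch_err_t;

typedef struct {
  const sketch_inst_t *program;
  size_t count;
  size_t pc;         /* index of the next instruction */
  uint8_t stack[16]; /* storage lives in the struct, no allocation */
  size_t sp;         /* number of bytes currently on the stack */
  int halted;
} sketch_vm_t;

/* Attach a program and reset the machine state. */
void sketch_vm_load(sketch_vm_t *vm, const sketch_inst_t *program, size_t count)
{
  vm->program = program;
  vm->count = count;
  vm->pc = 0;
  vm->sp = 0;
  vm->halted = (count == 0);
}

/* Execute exactly one instruction, then stop. */
sketch_err_t sketch_vm_step(sketch_vm_t *vm)
{
  if (vm->halted)
    return SK_OK;
  sketch_inst_t inst = vm->program[vm->pc];
  switch (inst.opcode) {
  case SK_HALT:
    vm->halted = 1;
    return SK_OK;
  case SK_PUSH:
    if (vm->sp >= sizeof vm->stack)
      return SK_ERR_STACK;
    vm->stack[vm->sp++] = inst.operand;
    break;
  case SK_ADD:
    if (vm->sp < 2)
      return SK_ERR_STACK;
    vm->stack[vm->sp - 2] =
        (uint8_t)(vm->stack[vm->sp - 2] + vm->stack[vm->sp - 1]);
    vm->sp--;
    break;
  default:
    return SK_ERR_OPCODE;
  }
  vm->pc++;
  if (vm->pc >= vm->count)
    vm->halted = 1; /* ran off the end of the program */
  return SK_OK;
}

/* Keep stepping until the program halts or an error occurs. */
sketch_err_t sketch_vm_run(sketch_vm_t *vm)
{
  while (!vm->halted) {
    sketch_err_t err = sketch_vm_step(vm);
    if (err != SK_OK)
      return err;
  }
  return SK_OK;
}
```

Stepping suits debuggers and REPL-style front ends that want to inspect state between instructions, while the run-to-halt loop is the natural choice when the caller only wants the final result.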
I expect this method to be used for interpreted languages such as Lisp or Python, where the flow is code -> execution rather than code -> compile unit -> execute unit, while still providing the ability to compile code to a bytecode unit.
Lines of code
| Files | Lines | Words | Bytes |
|---|---|---|---|
| ./vm/struct.h | 69 | 197 | 1534 |
| ./vm/main.c | 94 | 267 | 2266 |
| ./vm/struct.c | 262 | 767 | 6882 |
| ./vm/runtime.h | 270 | 705 | 7318 |
| ./vm/runtime.c | 792 | 2451 | 23664 |
| ./lib/darr.h | 88 | 465 | 2705 |
| ./lib/heap.c | 101 | 270 | 1910 |
| ./lib/base.h | 159 | 656 | 4180 |
| ./lib/heap.h | 42 | 111 | 803 |
| ./lib/prog.h | 173 | 243 | 2589 |
| ./lib/base.c | 107 | 306 | 2054 |
| ./lib/inst.c | 510 | 1299 | 14122 |
| ./lib/darr.c | 77 | 225 | 1767 |
| ./lib/inst.h | 113 | 461 | 4269 |
| total | 2857 | 8423 | 76063 |