OP_HALT = 1 now. This commit also adjusts the error checking in inst_read_bytecode. The main reasoning concerns other platforms or applications that target the AVM: whenever a new opcode is added before OP_HALT, the binary value of OP_HALT changes (a C enumerator without an explicit value takes the previous value plus one). Say your application targets commit alpha of AVM, where OP_HALT is, say, 98. In commit beta, AVM gains a new opcode placed before OP_HALT, so OP_HALT becomes 99. If your application builds a binary for AVM version alpha and AVM version beta is used instead, OP_HALT will be interpreted as another instruction, which can lead to undefined behaviour. This can be hard to debug, so I've decided to avoid placing new opcodes in between old ones; new ones will always be placed directly *before* NUMBER_OF_OPCODES.
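The numbering argument can be checked with a small sketch. Everything here is hypothetical: the ALPHA_/BETA_ enums and every opcode name other than OP_HALT and NUMBER_OF_OPCODES are stand-ins invented for illustration, not the real AVM opcode table. It assumes OP_HALT is pinned to 1 and that the new opcode is appended directly before the count.

```c
#include <assert.h>

/* Hypothetical opcode table at "commit alpha" (not the real AVM table). */
typedef enum {
  ALPHA_OP_NOOP = 0, /* assumed placeholder for value 0 */
  ALPHA_OP_HALT = 1, /* pinned explicitly, so it cannot drift */
  ALPHA_OP_PUSH,     /* 2 */
  ALPHA_OP_POP,      /* 3 */
  ALPHA_NUMBER_OF_OPCODES
} alpha_opcode_t;

/* "Commit beta": the new opcode goes directly before the count. */
typedef enum {
  BETA_OP_NOOP = 0,
  BETA_OP_HALT = 1,
  BETA_OP_PUSH,
  BETA_OP_POP,
  BETA_OP_DUP, /* new in beta; takes the next free value, 4 */
  BETA_NUMBER_OF_OPCODES
} beta_opcode_t;

/* Every opcode that existed in alpha keeps its value in beta. */
static_assert((int)ALPHA_OP_HALT == (int)BETA_OP_HALT, "OP_HALT moved");
static_assert((int)ALPHA_OP_PUSH == (int)BETA_OP_PUSH, "OP_PUSH moved");
static_assert((int)ALPHA_OP_POP == (int)BETA_OP_POP, "OP_POP moved");
static_assert((int)BETA_NUMBER_OF_OPCODES ==
                  (int)ALPHA_NUMBER_OF_OPCODES + 1,
              "exactly one opcode was appended");

/* A bounds check of the kind a bytecode reader might perform
 * (hypothetical helper, not the actual inst_read_bytecode logic). */
int beta_opcode_is_valid(unsigned char byte)
{
  return byte < BETA_NUMBER_OF_OPCODES;
}
```

With this layout, bytecode built against alpha still decodes to the same instructions under beta, and only bytes at or beyond the count are rejected.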
Aryadev's Virtual Machine (AVM)
A stack based virtual machine in C11, with a dynamic register setup which acts as variable space. Deals primarily in bytes, doesn't make assertions about typing and is very simple to target.
This repository contains both a library (lib folder) to (de)serialize bytecode and a program (vm folder) to execute bytecode.
Along with this is an assembler program which can compile an assembly-like language to bytecode.
How to build
Requires GNU make and a compliant C11 compiler. The code base has been
tested against gcc and clang, but since the project has been
written without GNUisms (that I'm aware of), it should compile
with tcc or another compiler without issue (see
here to change the compiler).
To build everything simply run make. This will build:
- instruction bytecode system which provides object files to target the VM
- VM executable which executes bytecode
You may also build each component individually through the corresponding recipe:
make lib
make vm
How to target the virtual machine
Link with the object files for base.c and
inst.c to be able to properly target the virtual
machine. The general idea is to convert parse units into instances of
inst_t. Once a collection of inst_t's has been made, it must
be wrapped in a prog_t structure which is a flexibly allocated
structure with two components:
- A program header prog_header_t with some essential properties of the program (start address, count, etc)
- A buffer of type inst_t which should contain the ordered collection constructed
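The two-component layout described above can be sketched with a C flexible array member. This is only an illustration under assumptions: the sketch_ types and their fields are hypothetical stand-ins, and the real prog_header_t, inst_t and prog_t definitions in the lib folder may differ.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for inst_t: an opcode plus one operand. */
typedef struct {
  uint8_t opcode;
  uint64_t operand;
} sketch_inst_t;

/* Hypothetical stand-in for prog_header_t. */
typedef struct {
  uint64_t start_address; /* index of the first instruction to run */
  uint64_t count;         /* number of instructions in the buffer */
} sketch_header_t;

/* Hypothetical stand-in for prog_t: header followed by the buffer. */
typedef struct {
  sketch_header_t header;
  sketch_inst_t instructions[]; /* flexible array member */
} sketch_prog_t;

/* Allocate the header and instruction buffer as one block and copy the
 * ordered instructions in. Returns NULL if allocation fails; the caller
 * releases the result with free(). */
sketch_prog_t *sketch_prog_new(const sketch_inst_t *insts, uint64_t count,
                               uint64_t start_address)
{
  sketch_prog_t *prog = malloc(sizeof(*prog) + count * sizeof(*insts));
  if (!prog)
    return NULL;
  prog->header.start_address = start_address;
  prog->header.count = count;
  if (count > 0)
    memcpy(prog->instructions, insts, count * sizeof(*insts));
  return prog;
}
```

A single allocation keeps the header and buffer contiguous, so one free() releases the whole program.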
There are two ways to execute this program structure: compilation or in-memory execution.
Compilation
The prog_t structure can be fed to prog_write_file with a file
pointer to write well formed AVM bytecode into a file. To execute
this bytecode, simply use the avm.out executable with the bytecode
file name.
This is the classical way I expect languages to target the virtual machine.
In memory virtual machine
This method requires linking with vm/runtime.c to be able to
construct a working vm_t structure. The steps are:
- Load the stack, heap and call stack into a vm_t structure
- Load the prog_t into the vm_t (vm_load_program)
- Execute via vm_execute or vm_execute_all
vm_execute executes the next instruction and stops, while
vm_execute_all continues execution till the program halts. Either
can be useful depending on requirements.
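The difference between single-stepping and running to completion can be sketched with a toy machine. This is not the AVM runtime: every toy_ name, the three-opcode instruction set and the error codes are invented for illustration, and the real vm_execute and vm_execute_all signatures are not shown here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_STACK_MAX 64

/* Hypothetical opcodes; only the idea of a halt opcode mirrors AVM. */
enum { TOY_OP_NOOP = 0, TOY_OP_HALT = 1, TOY_OP_PUSH, TOY_OP_ADD };

typedef struct {
  uint8_t opcode;
  int64_t operand;
} toy_inst_t;

typedef struct {
  const toy_inst_t *program;
  size_t count; /* number of instructions */
  size_t pc;    /* index of the next instruction */
  int64_t stack[TOY_STACK_MAX];
  size_t sp;    /* number of values on the stack */
  bool halted;
} toy_vm_t;

typedef enum {
  TOY_OK = 0,
  TOY_ERR_BOUNDS,
  TOY_ERR_STACK,
  TOY_ERR_OPCODE
} toy_err_t;

/* Execute exactly one instruction, then stop. */
toy_err_t toy_execute(toy_vm_t *vm)
{
  if (vm->halted)
    return TOY_OK;
  if (vm->pc >= vm->count)
    return TOY_ERR_BOUNDS;
  toy_inst_t inst = vm->program[vm->pc];
  switch (inst.opcode) {
  case TOY_OP_NOOP:
    break;
  case TOY_OP_HALT:
    vm->halted = true;
    return TOY_OK;
  case TOY_OP_PUSH:
    if (vm->sp >= TOY_STACK_MAX)
      return TOY_ERR_STACK;
    vm->stack[vm->sp++] = inst.operand;
    break;
  case TOY_OP_ADD:
    if (vm->sp < 2)
      return TOY_ERR_STACK;
    vm->stack[vm->sp - 2] += vm->stack[vm->sp - 1];
    vm->sp--;
    break;
  default:
    return TOY_ERR_OPCODE;
  }
  vm->pc++;
  return TOY_OK;
}

/* Keep stepping until the program halts or an error occurs. */
toy_err_t toy_execute_all(toy_vm_t *vm)
{
  toy_err_t err = TOY_OK;
  while (!vm->halted && err == TOY_OK)
    err = toy_execute(vm);
  return err;
}
```

Stepping suits debuggers and REPL-style hosts that want to inspect state between instructions, while the run-all loop is the simpler choice when only the final state matters.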
I expect this method to be used for interpreted languages such as Lisp or Python, where the flow is code -> execution rather than code -> compile unit -> execute unit, while still providing the ability to compile code to a bytecode unit.
Lines of code
| Files | Lines | Words | Bytes |
|---|---|---|---|
| ./vm/struct.h | 69 | 197 | 1534 |
| ./vm/main.c | 94 | 267 | 2266 |
| ./vm/struct.c | 262 | 767 | 6882 |
| ./vm/runtime.h | 270 | 705 | 7318 |
| ./vm/runtime.c | 792 | 2451 | 23664 |
| ./lib/darr.h | 88 | 465 | 2705 |
| ./lib/heap.c | 101 | 270 | 1910 |
| ./lib/base.h | 159 | 656 | 4180 |
| ./lib/heap.h | 42 | 111 | 803 |
| ./lib/prog.h | 173 | 243 | 2589 |
| ./lib/base.c | 107 | 306 | 2054 |
| ./lib/inst.c | 510 | 1299 | 14122 |
| ./lib/darr.c | 77 | 225 | 1767 |
| ./lib/inst.h | 113 | 461 | 4269 |
| total | 2857 | 8423 | 76063 |