#+title: Aryadev's Virtual Machine (AVM)
#+author: Aryadev Chavali
#+date: 2023-10-15

A stack based virtual machine in C11, with a dynamic register setup
which acts as variable space.  Deals primarily in bytes, doesn't make
assertions about typing and is very simple to target.

This repository contains both a library ([[file:lib/][lib folder]]) to
(de)serialize bytecode and a program ([[file:vm/][vm folder]]) to
execute bytecode.

Along with this is an
[[https://github.com/aryadev-software/aal][assembler]] program which
can compile an assembly-like language to bytecode.

* How to build
Requires =GNU make= and a compliant C11 compiler.  The code base has
been tested against =gcc= and =clang=, but as the project has been
written without GNU-isms (that I'm aware of), it shouldn't be an issue
to compile with something like =tcc= or another compiler (look
[[file:Makefile::CC=gcc][here]] to change the compiler).

To build everything simply run ~make~.  This will build:
+ [[file:lib/][instruction bytecode system]] which provides object
  files to target the VM
+ [[file:vm/][VM executable]] which executes bytecode

You may also build each component individually through the
corresponding recipe:
+ ~make lib~
+ ~make vm~

* How to target the virtual machine
Link with the object files for [[file:lib/base.c][base.c]] and
[[file:lib/inst.c][inst.c]] to be able to properly target the virtual
machine.  The general idea is to convert parse units into instances of
~inst_t~.  Once a collection of ~inst_t~ instances has been made, it
must be wrapped in a ~prog_t~ structure, which is a flexibly allocated
structure with two components:
1) A program header ~prog_header_t~ with some essential properties of
   the program (start address, count, etc)
2) A buffer of type ~inst_t~ which should contain the ordered
   collection constructed

There are two ways to execute this program structure: compilation or
in-memory execution.
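
As a rough illustration of the header-plus-buffer layout described
above, here is a minimal C11 sketch using a flexible array member.
Every name in it (~sketch_inst_t~, ~sketch_header_t~, ~sketch_prog_t~,
~sketch_prog_new~) and every field is a hypothetical stand-in, not the
definitions shipped in [[file:lib/][lib]]; the real ~inst_t~,
~prog_header_t~ and ~prog_t~ will differ, so treat this only as a
picture of the idea.

#+begin_src c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for inst_t: the real type has its own fields. */
typedef struct
{
  uint8_t opcode;   /* assumed: one byte identifying the operation */
  uint64_t operand; /* assumed: a single immediate operand */
} sketch_inst_t;

/* Hypothetical stand-in for prog_header_t. */
typedef struct
{
  uint64_t start_address; /* index of the first instruction to run */
  uint64_t count;         /* number of instructions in the buffer */
} sketch_header_t;

/* Hypothetical stand-in for prog_t: header followed by the buffer. */
typedef struct
{
  sketch_header_t header;
  sketch_inst_t instructions[]; /* C99/C11 flexible array member */
} sketch_prog_t;

/* Allocate header and buffer as one block and copy the ordered
   instructions in.  Returns NULL on allocation failure; the caller
   releases the result with free(). */
sketch_prog_t *sketch_prog_new(const sketch_inst_t *insts, size_t count,
                               uint64_t start_address)
{
  assert(count == 0 || insts != NULL);

  sketch_prog_t *prog = malloc(sizeof(*prog) + count * sizeof(*insts));
  if (prog == NULL)
    return NULL;

  prog->header.start_address = start_address;
  prog->header.count = count;
  if (count > 0)
    memcpy(prog->instructions, insts, count * sizeof(*insts));
  return prog;
}
#+end_src

Because the header and the buffer share one allocation in this sketch,
a single ~free~ releases the whole program.
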
** Compilation
The ~prog_t~ structure can be fed to ~prog_write_file~ with a file
pointer to write well formed =AVM= bytecode into a file.  To execute
this bytecode, simply use the ~avm.out~ executable with the bytecode
file name.

This is the classical way I expect languages to target the virtual
machine.
** In memory virtual machine
This method requires linking with [[file:vm/runtime.c]] to be able to
construct a working ~vm_t~ structure.  The steps are:
+ Load the stack, heap and call stack into a ~vm_t~ structure
+ Load the ~prog_t~ into the ~vm_t~ (~vm_load_program~)
+ Execute via ~vm_execute~ or ~vm_execute_all~

~vm_execute~ executes the next instruction and stops, while
~vm_execute_all~ continues execution until the program halts.  Either
can be useful depending on requirements.

I expect this method to be used for languages that are /interpreted/,
such as Lisp or Python, where /code/ -> /execution/ rather than /code/
-> /compile unit/ -> /execute unit/, while still providing the ability
to compile code to a bytecode unit.
* Lines of code
#+begin_src sh :results table :exports results
wc -lwc $(find -regex ".*\.[ch]\(pp\)?")
#+end_src

#+RESULTS:
| Files          | Lines | Words | Bytes |
|----------------+-------+-------+-------|
| ./vm/struct.h  |    69 |   197 |  1534 |
| ./vm/main.c    |    94 |   267 |  2266 |
| ./vm/struct.c  |   262 |   767 |  6882 |
| ./vm/runtime.h |   270 |   705 |  7318 |
| ./vm/runtime.c |   792 |  2451 | 23664 |
| ./lib/darr.h   |    88 |   465 |  2705 |
| ./lib/heap.c   |   101 |   270 |  1910 |
| ./lib/base.h   |   159 |   656 |  4180 |
| ./lib/heap.h   |    42 |   111 |   803 |
| ./lib/prog.h   |   173 |   243 |  2589 |
| ./lib/base.c   |   107 |   306 |  2054 |
| ./lib/inst.c   |   510 |  1299 | 14122 |
| ./lib/darr.c   |    77 |   225 |  1767 |
| ./lib/inst.h   |   113 |   461 |  4269 |
|----------------+-------+-------+-------|
| total          |  2857 |  8423 | 76063 |