#+title: A Virtual Machine (AVM)
#+author: Aryadev Chavali
#+date: 2023-10-15

An exercise in making a virtual machine in C11 with both a stack and
registers.

This repository contains both a shared library ([[file:lib/][lib]]) to
(de)serialise bytecode and a program ([[file:vm/][vm]]) to execute
said bytecode.

* How to build
Requires =GNU make= and a compliant C11 compiler.  Change the =CC=
variable [[file:Makefile::CC=gcc][in the Makefile]] to use a different
compiler.

To build a release version run ~make all~.  To build a debug version
with most runtime logging enabled run ~make all RELEASE=0 VERBOSE=2~.
This will build:
+ [[file:lib/][instruction bytecode system]] which provides a shared
  library for serialising and deserialising bytecode
+ [[file:vm/][VM executable]] to execute bytecode
* Targeting the virtual machine
Link with the shared library =libavm.so=.  The general idea is to
construct a ~prog_t~ structure, which consists of:
1) A program header with some essential properties of the program
   (start address, count, etc)
2) An array of ~inst_t~ holding the instructions in execution order
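
As a rough sketch of that shape, the following declares a header plus
an instruction array.  Every name here (=toy_header_t=, =toy_inst_t=,
=toy_prog_t=, the opcodes and both functions) is hypothetical and
chosen for illustration; the real ~prog_t~ and ~inst_t~ are declared
by the library and will differ in their fields.
#+begin_src c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical opcodes, not AVM's instruction set. */
typedef enum { TOY_OP_PUSH, TOY_OP_ADD, TOY_OP_HALT } toy_opcode_t;

/* One instruction: an opcode and an optional operand. */
typedef struct {
  toy_opcode_t opcode;
  uint64_t operand;
} toy_inst_t;

/* Properties a runtime needs before executing anything. */
typedef struct {
  uint64_t start_address; /* index of the first instruction to run */
  uint64_t count;         /* number of instructions that follow */
} toy_header_t;

/* Header followed by the instructions in execution order. */
typedef struct {
  toy_header_t header;
  toy_inst_t instructions[]; /* flexible array member */
} toy_prog_t;

/* Allocate a program with room for `count` instructions; caller frees. */
toy_prog_t *toy_prog_alloc(uint64_t count)
{
  toy_prog_t *prog = malloc(sizeof(*prog) + (size_t)count * sizeof(toy_inst_t));
  if (!prog)
    return NULL;
  prog->header.start_address = 0;
  prog->header.count = count;
  return prog;
}

/* Store `inst` at position `index`, refusing out-of-range writes. */
void toy_prog_set(toy_prog_t *prog, uint64_t index, toy_inst_t inst)
{
  assert(prog && index < prog->header.count);
  prog->instructions[index] = inst;
}
#+end_src

Keeping the header and instructions in one allocation is a choice made
for this sketch, not a statement about how the library stores them.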

A ~prog_t~ can be executed in two ways.
** Compilation then separate execution
Pass the ~prog_t~ structure and a sufficiently large byte buffer to
~prog_write_bytecode~, which populates the buffer with the
corresponding bytecode.  ~prog_bytecode_size~ gives the exact number
of bytes the buffer needs.
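
The size-then-write pattern might look like the following.  This is a
self-contained sketch with made-up types and a made-up byte layout (a
16 byte header, then 9 bytes per instruction).  =toy_bytecode_size=
and =toy_write_bytecode= only mirror the roles of ~prog_bytecode_size~
and ~prog_write_bytecode~; they make no claim about the real
signatures or encoding.
#+begin_src c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the library's types, kept deliberately small. */
typedef struct {
  uint8_t opcode;
  uint64_t operand;
} toy_inst_t;

typedef struct {
  uint64_t start_address;
  uint64_t count;
  const toy_inst_t *instructions;
} toy_prog_t;

/* Store a 64-bit value in little-endian order so the encoding does not
   depend on the host's byte order.  Returns the position after the value. */
static uint8_t *put_u64(uint8_t *out, uint64_t value)
{
  for (int i = 0; i < 8; ++i)
    *out++ = (uint8_t)(value >> (8 * i));
  return out;
}

/* Exact number of bytes the encoding needs: a 16 byte header plus
   9 bytes (1 opcode + 8 operand) per instruction. */
size_t toy_bytecode_size(const toy_prog_t *prog)
{
  return 16 + (size_t)prog->count * 9;
}

/* Fill `buffer` with the encoding of `prog` and return the bytes written.
   The caller must supply at least toy_bytecode_size(prog) bytes. */
size_t toy_write_bytecode(const toy_prog_t *prog, uint8_t *buffer,
                          size_t capacity)
{
  assert(capacity >= toy_bytecode_size(prog));
  uint8_t *out = put_u64(buffer, prog->start_address);
  out = put_u64(out, prog->count);
  for (uint64_t i = 0; i < prog->count; ++i) {
    *out++ = prog->instructions[i].opcode;
    out = put_u64(out, prog->instructions[i].operand);
  }
  return (size_t)(out - buffer);
}
#+end_src

A caller would ask for the size first, allocate that many bytes, then
call the write function on the fresh buffer.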

The buffer is then written to a file, which the =avm= executable
runs.  This is the classical way I expect languages to target the
virtual machine.
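
Getting the buffer onto disk needs nothing beyond the C standard
library.  A minimal sketch, where =write_bytecode_file= is a
hypothetical helper rather than part of the library:
#+begin_src c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write `size` bytes from `buffer` to `path`.
   Returns 0 on success and -1 on any failure. */
int write_bytecode_file(const char *path, const uint8_t *buffer, size_t size)
{
  assert(path && buffer);
  FILE *fp = fopen(path, "wb"); /* binary mode: no newline translation */
  if (!fp)
    return -1;
  size_t written = fwrite(buffer, 1, size, fp);
  /* fclose flushes; a failure there also means the file is incomplete. */
  int close_failed = fclose(fp) != 0;
  return (written == size && !close_failed) ? 0 : -1;
}
#+end_src

Opening the file in binary mode matters on platforms that translate
newlines in text mode, since bytecode is arbitrary bytes.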
** In memory virtual machine
This method works by embedding the virtual machine runtime in the
program that wishes to use the AVM.  After constructing a ~prog_t~
structure, load it into a ~vm_t~ structure, which also maintains
components such as the stack, heap and call stack.  The ~vm_t~ can
then be used with ~vm_execute_all~ to execute the program.

Look at [[file:vm/main.c]] to see this in practice.

Note that this skips the serialising process (i.e. the /compilation/)
by using the runtime directly.  I could see this approach being used
when writing an interpreted language such as Lisp, where code should
be executed immediately after parsing.  Embedding the runtime in the
calling program also gives much greater control over parameters such
as stack/heap size and step-by-step execution, which can be useful in
dynamic contexts.  The ~prog_t~ can still be compiled into bytecode
whenever required.
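
To illustrate the kind of control an embedded runtime gives, here is a
toy stack machine in which the host chooses the stack capacity and can
either single-step or run to completion.  It is a sketch resting on
assumptions: =toy_vm_t=, =toy_vm_step=, =toy_vm_execute_all= and the
three opcodes are invented for this example, and there are no
registers, heap or call stack.  The real ~vm_t~ and ~vm_execute_all~
should be read from [[file:vm/][vm]] instead.
#+begin_src c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction set, far smaller than a real one. */
typedef enum { TOY_OP_PUSH, TOY_OP_ADD, TOY_OP_HALT } toy_opcode_t;

typedef struct {
  toy_opcode_t opcode;
  uint64_t operand;
} toy_inst_t;

typedef enum {
  TOY_OK,
  TOY_HALTED,
  TOY_ERR_STACK,
  TOY_ERR_PC,
  TOY_ERR_OPCODE
} toy_status_t;

/* Runtime state owned by the host program. */
typedef struct {
  const toy_inst_t *program; /* instructions to run */
  size_t count;              /* number of instructions */
  size_t pc;                 /* index of the next instruction */
  uint64_t *stack;           /* caller-supplied storage */
  size_t stack_capacity;     /* chosen by the host */
  size_t stack_size;         /* values currently on the stack */
} toy_vm_t;

/* Execute exactly one instruction and report what happened. */
toy_status_t toy_vm_step(toy_vm_t *vm)
{
  assert(vm && vm->program && vm->stack);
  if (vm->pc >= vm->count)
    return TOY_ERR_PC;
  const toy_inst_t inst = vm->program[vm->pc];
  switch (inst.opcode) {
  case TOY_OP_PUSH:
    if (vm->stack_size == vm->stack_capacity)
      return TOY_ERR_STACK; /* overflow */
    vm->stack[vm->stack_size++] = inst.operand;
    break;
  case TOY_OP_ADD:
    if (vm->stack_size < 2)
      return TOY_ERR_STACK; /* underflow */
    vm->stack[vm->stack_size - 2] += vm->stack[vm->stack_size - 1];
    vm->stack_size--;
    break;
  case TOY_OP_HALT:
    return TOY_HALTED;
  default:
    return TOY_ERR_OPCODE;
  }
  vm->pc++;
  return TOY_OK;
}

/* Step until the program halts or an error occurs. */
toy_status_t toy_vm_execute_all(toy_vm_t *vm)
{
  toy_status_t status;
  while ((status = toy_vm_step(vm)) == TOY_OK)
    ;
  return status;
}
#+end_src

Because the host owns the stack storage, an interpreter could start
with a small array and swap in a larger one between steps; running to
completion is just stepping in a loop.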
* Related projects
[[https://github.com/aryadev-software/aal][Assembler]], a program which
compiles an assembly-like language to bytecode.
* Lines of code
#+begin_src sh :results table :exports results
echo 'Files Lines Words Characters'
wc -lwc $(find vm/ lib/ -regex ".*\.[ch]\(pp\)?") | awk '{print $4 "\t" $1 "\t" $2 "\t" $3}'
#+end_src

#+RESULTS:
| Files            | Lines | Words | Characters |
|------------------+-------+-------+------------|
| vm/runtime.h     |   327 |   872 |       9082 |
| vm/main.c        |   136 |   381 |       3517 |
| vm/runtime.c     |   735 |  2454 |      26742 |
| vm/struct.c      |   252 |   780 |       6805 |
| vm/struct.h      |    74 |   204 |       1564 |
| lib/inst.c       |   567 |  1369 |      14899 |
| lib/darr.h       |   149 |   709 |       4482 |
| lib/inst.h       |   277 |   547 |       5498 |
| lib/inst-macro.h |    71 |   281 |       2806 |
| lib/heap.h       |   125 |   453 |       3050 |
| lib/base.h       |   236 |   895 |       5868 |
| lib/heap.c       |    79 |   214 |       1647 |
| lib/base.c       |    82 |   288 |       2048 |
| lib/darr.c       |    76 |   219 |       1746 |
|------------------+-------+-------+------------|
| total            |  3186 |  9666 |      89754 |