Updated includes, README and TODO

2024-06-19 20:01:20 +01:00
parent e3aabff845
commit bdc6e15ae9
5 changed files with 78 additions and 71 deletions

@@ -2,92 +2,93 @@
#+author: Aryadev Chavali
#+date: 2023-10-15
A stack based virtual machine in C11, with a dynamic register setup
which acts as variable space. Deals primarily in bytes, doesn't make
assertions about typing and is very simple to target.
A virtual machine in C11, stack oriented with a dynamic register.
Deals primarily in bytes, doesn't make assertions about typing and is
very simple to target.
This repository contains both a library ([[file:lib/][lib folder]]) to
(de)serialize bytecode and a program ([[file:vm/][vm folder]]) to
execute bytecode.
Along with this is an
[[https://github.com/aryadev-software/aal][assembler]] program which
can compile an assembly-like language to bytecode.
This repository contains both a library ([[file:lib/][lib]]) to
(de)serialize bytecode and a program ([[file:vm/][vm]]) to execute
said bytecode.
* How to build
Requires =GNU make= and a compliant C11 compiler. Code base has been
tested against =gcc= and =clang=, but given how the project has been
written without use of GNU'isms (that I'm aware of) it shouldn't be an
issue to compile using something like =tcc= or another compiler (look
at [[file:Makefile::CC=gcc][here]] to change the compiler).
at [[file:Makefile::CC=gcc][here]] to change the compiler used).
To build everything simply run ~make~. This will build:
+ [[file:lib/][instruction bytecode system]] which provides object
files to target the VM
+ [[file:vm/][VM executable]] which executes bytecode
To build a release version simply run ~make all RELEASE=1~. To build
a debug version run ~make all VERBOSE=<n>~, where n is 0, 1 or 2 and
sets how verbose logging to standard output is. This will build:
+ [[file:lib/][instruction bytecode system]] which provides a shared
library for serialising and deserialising bytecode
+ [[file:vm/][VM executable]] to execute bytecode
You may also build each component individually through the
corresponding recipe:
+ ~make lib~
+ ~make vm~
* How to target the virtual machine
Link with the object files for [[file:lib/base.c][base.c]] and
[[file:lib/inst.c][inst.c]] to be able to properly target the virtual
machine. The general idea is to convert parse units into instances of
~inst_t~. Once a collection of ~inst_t~'s has been made, it must
be wrapped in a ~prog_t~ structure which is a flexibly allocated
structure with two components:
1) A program header ~prog_header_t~ with some essential properties of
the program (start address, count, etc)
2) A buffer of type ~inst_t~ which should contain the ordered
collection constructed
* Targeting the virtual machine
Link with the shared library =libavm.so= which should be located in
the =build= folder. The general idea is to construct a ~prog_t~
structure, which consists of:
1) A program header with some essential properties of the program
(start address, count, etc)
2) An array of type ~inst_t~, ordered instructions for execution
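As a rough illustration of the layout listed above, the sketch below pairs a header with an ordered instruction array using a C11 flexible array member. Every name in it (~sketch_inst_t~, ~sketch_header_t~, ~sketch_prog_t~, ~sketch_prog_new~) is a hypothetical stand-in, and the fields are assumptions; the real ~prog_t~ and ~inst_t~ in the lib folder may be laid out differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for inst_t: an opcode plus one operand. */
typedef struct
{
  uint8_t opcode;
  uint64_t operand;
} sketch_inst_t;

/* Hypothetical stand-in for the program header. */
typedef struct
{
  uint64_t start_address; /* index of the first instruction to run */
  uint64_t count;         /* number of instructions that follow */
} sketch_header_t;

/* Header followed by the instructions in execution order. */
typedef struct
{
  sketch_header_t header;
  sketch_inst_t instructions[]; /* C11 flexible array member */
} sketch_prog_t;

/* Allocate a program and copy `count` instructions into it.
   Returns NULL on allocation failure; the caller releases it with free(). */
sketch_prog_t *sketch_prog_new(uint64_t start, const sketch_inst_t *insts,
                               uint64_t count)
{
  assert(count == 0 || insts != NULL);
  sketch_prog_t *prog =
      malloc(sizeof(*prog) + count * sizeof(prog->instructions[0]));
  if (!prog)
    return NULL;
  prog->header.start_address = start;
  prog->header.count         = count;
  if (count > 0)
    memcpy(prog->instructions, insts, count * sizeof(*insts));
  return prog;
}
```

A single allocation keeps the header and the instructions contiguous, which is one reason a flexible array member suits this kind of structure.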
There are two ways to execute this program structure: compilation or
in memory execution.
** Compilation
The ~prog_t~ structure can be fed to ~prog_write_file~ with a file
pointer to write well formed =AVM= bytecode into a file. To execute
this bytecode, simply use the ~avm.out~ executable with the bytecode
file name.
This structure may be executed in two ways.
** Compilation then separate execution
The ~prog_t~ structure along with a sufficiently sized buffer of bytes
(using ~prog_bytecode_size~ to get the size necessary) can be used to
call ~prog_write_bytecode~, which will populate the buffer with the
corresponding bytecode.
This is the classical way I expect languages to target the virtual
machine.
The buffer is written to some file then executed using the =avm=
executable. This is the classical way I expect languages to target
the virtual machine.
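The flow described here (query the size, fill a buffer, write the buffer to a file) can be sketched roughly as follows. The ~toy_*~ types and functions are hypothetical stand-ins with an invented two-byte encoding; they only mirror the roles of ~prog_bytecode_size~ and ~prog_write_bytecode~, whose real signatures and bytecode format are not shown here and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins: the real prog_t and its serialisers live in
   lib/ and may look quite different. */
typedef struct
{
  uint8_t opcode;
  uint8_t operand;
} toy_inst_t;

typedef struct
{
  uint8_t start;
  uint8_t count;
  const toy_inst_t *insts;
} toy_prog_t;

/* Bytes needed: a 2-byte header plus 2 bytes per instruction. */
size_t toy_bytecode_size(const toy_prog_t *prog)
{
  return 2 + (size_t)prog->count * 2;
}

/* Fill a caller-provided buffer with the toy encoding. */
void toy_write_bytecode(const toy_prog_t *prog, uint8_t *buf, size_t size)
{
  assert(size >= toy_bytecode_size(prog));
  buf[0] = prog->start;
  buf[1] = prog->count;
  for (size_t i = 0; i < prog->count; ++i)
  {
    buf[2 + 2 * i]     = prog->insts[i].opcode;
    buf[2 + 2 * i + 1] = prog->insts[i].operand;
  }
}

/* Size the buffer, serialise into it, then write it to `path`.
   Returns 0 on success, -1 on failure. */
int toy_compile_to_file(const toy_prog_t *prog, const char *path)
{
  size_t size  = toy_bytecode_size(prog);
  uint8_t *buf = malloc(size);
  if (!buf)
    return -1;
  toy_write_bytecode(prog, buf, size);

  FILE *fp = fopen(path, "wb");
  if (!fp)
  {
    free(buf);
    return -1;
  }
  size_t written = fwrite(buf, 1, size, fp);
  int closed     = fclose(fp);
  free(buf);
  return (written == size && closed == 0) ? 0 : -1;
}
```

Asking for the size first lets the caller own the buffer, so the serialiser itself never has to allocate.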
** In memory virtual machine
This method requires linking with [[file:vm/runtime.c]] to be able to
construct a working ~vm_t~ structure. The steps are:
+ Load the stack, heap and call stack into a ~vm_t~ structure
+ Load the ~prog_t~ into the ~vm_t~ (~vm_load_program~)
+ Execute via ~vm_execute~ or ~vm_execute_all~
This method is more involved, introducing the virtual machine runtime
into the program itself. After constructing a ~prog_t~ structure, it
can be fit into a ~vm_t~ structure. This ~vm_t~ structure also must
have a stack, heap and call stack (look at [[file:vm/main.c]] to see
this in practice). This structure can then be used with
~vm_execute_all~ to execute the program.
~vm_execute~ executes the next instruction and stops, while
~vm_execute_all~ continues execution till the program halts. Either
can be useful depending on requirements.
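The difference between stepping and running to completion can be shown on a toy machine. This is only a sketch under assumptions: ~mini_vm_t~, ~mini_execute~ and ~mini_execute_all~ are hypothetical and far simpler than the real ~vm_t~, ~vm_execute~ and ~vm_execute_all~, whose signatures and error handling may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MINI_STACK_MAX 16

/* Hypothetical toy machine, not the real runtime. */
typedef enum
{
  MINI_PUSH,
  MINI_ADD,
  MINI_HALT
} mini_op_t;

typedef struct
{
  mini_op_t op;
  int64_t operand;
} mini_inst_t;

typedef struct
{
  const mini_inst_t *program;
  size_t count;
  size_t pc;
  int64_t stack[MINI_STACK_MAX];
  size_t sp;
  bool halted;
} mini_vm_t;

/* Execute exactly one instruction.  Returns true if the machine can
   keep running, false once it has halted (including on error). */
bool mini_execute(mini_vm_t *vm)
{
  if (vm->halted || vm->pc >= vm->count)
  {
    vm->halted = true;
    return false;
  }
  mini_inst_t inst = vm->program[vm->pc++];
  switch (inst.op)
  {
  case MINI_PUSH:
    if (vm->sp >= MINI_STACK_MAX)
    {
      vm->halted = true; /* stack overflow */
      return false;
    }
    vm->stack[vm->sp++] = inst.operand;
    break;
  case MINI_ADD:
    if (vm->sp < 2)
    {
      vm->halted = true; /* stack underflow */
      return false;
    }
    vm->stack[vm->sp - 2] += vm->stack[vm->sp - 1];
    vm->sp--;
    break;
  case MINI_HALT:
    vm->halted = true;
    return false;
  }
  return true;
}

/* Keep stepping until the machine halts. */
void mini_execute_all(mini_vm_t *vm)
{
  while (mini_execute(vm))
    continue;
}
```

Building the run-to-halt function on top of the single-step one keeps the two behaviours consistent, and the single-step entry point is what a debugger or REPL would call.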
I expect this method to be used for languages that are /interpreted/
such as Lisp or Python where /code/ -> /execution/ rather than /code/
-> /compile unit/ -> /execute unit/, while still providing the ability
to compile code to a byte code unit.
Note that this skips the serialising process (i.e. the /compilation/)
by utilising the runtime directly. I could see this approach being
used when writing an interpreted language such as Lisp where code
should be executed immediately after parsing. Furthermore,
introducing the runtime directly into the calling program gives much
greater control over parameters such as stack/heap size and step by
step execution, which can be useful in dynamic contexts. The ~prog_t~
can still be compiled into bytecode whenever required.
* Related projects
[[https://github.com/aryadev-software/aal][Assembler]] program which
can compile an assembly-like language to bytecode.
* Lines of code
#+begin_src sh :results table :exports results
wc -lwc $(find -regex ".*\.[ch]\(pp\)?")
wc -lwc $(find vm/ lib/ -regex ".*\.[ch]\(pp\)?")
#+end_src
#+RESULTS:
| Files | Lines | Words | Bytes |
|----------------+-------+-------+-------|
| ./vm/struct.h | 69 | 197 | 1534 |
| ./vm/main.c | 94 | 267 | 2266 |
| ./vm/struct.c | 262 | 767 | 6882 |
| ./vm/runtime.h | 270 | 705 | 7318 |
| ./vm/runtime.c | 792 | 2451 | 23664 |
| ./lib/darr.h | 88 | 465 | 2705 |
| ./lib/heap.c | 101 | 270 | 1910 |
| ./lib/base.h | 159 | 656 | 4180 |
| ./lib/heap.h | 42 | 111 | 803 |
| ./lib/prog.h | 173 | 243 | 2589 |
| ./lib/base.c | 107 | 306 | 2054 |
| ./lib/inst.c | 510 | 1299 | 14122 |
| ./lib/darr.c | 77 | 225 | 1767 |
| ./lib/inst.h | 113 | 461 | 4269 |
|----------------+-------+-------+-------|
| total | 2857 | 8423 | 76063 |
|------------------+-------+-------+------------|
| File | Lines | Words | Characters |
|------------------+-------+-------+------------|
| vm/runtime.h | 266 | 699 | 7250 |
| vm/main.c | 135 | 375 | 3448 |
| vm/runtime.c | 802 | 2441 | 23634 |
| vm/struct.c | 262 | 783 | 7050 |
| vm/struct.h | 69 | 196 | 1531 |
| lib/inst.c | 493 | 1215 | 13043 |
| lib/darr.h | 149 | 709 | 4482 |
| lib/inst.h | 248 | 519 | 4964 |
| lib/inst-macro.h | 71 | 281 | 2806 |
| lib/heap.h | 125 | 453 | 3050 |
| lib/base.h | 190 | 710 | 4633 |
| lib/heap.c | 79 | 214 | 1647 |
| lib/base.c | 61 | 226 | 1583 |
| lib/darr.c | 76 | 219 | 1746 |
|------------------+-------+-------+------------|
| total | 3026 | 9040 | 80867 |
|------------------+-------+-------+------------|

@@ -15,7 +15,6 @@
#include <lib/base.h>
#include <stdio.h>
#include <stdlib.h>
#define UNSIGNED_OPCODE_IS_TYPE(OPCODE, OP_TYPE) \
(((OPCODE) >= OP_TYPE##_BYTE) && ((OPCODE) <= OP_TYPE##_WORD))

@@ -17,7 +17,12 @@
** TODO Specification
* TODO Introduce error handling in base library :LIB:
There is a large variety of TODOs about errors. Let's fix them!
8 TODOs currently present.
#+begin_src sh :exports results :results output verbatim replace
echo "$(find -type 'f' -regex ".*\.[ch]\(pp\)?" -exec grep -nH TODO "{}" ";" | wc -l) TODOs currently"
#+end_src
#+RESULTS:
: 8 TODOs currently
* TODO Standard library :VM:
I should start considering this and how a user may use it. Should it
be an option in the VM and/or assembler binaries (i.e. a flag) or

@@ -11,6 +11,7 @@
*/
#include <stdio.h>
#include <stdlib.h>
#include <vm/runtime.h>
#include <vm/struct.h>

@@ -11,9 +11,10 @@
*/
#include <stdio.h>
#include <stdlib.h>
#include "./struct.h"
#include "lib/darr.h"
#include <lib/darr.h>
#include <vm/struct.h>
void vm_load_stack(vm_t *vm, byte_t *bytes, size_t size)
{