Updated includes, README and TODO

2024-06-19 20:01:20 +01:00
parent e3aabff845
commit bdc6e15ae9
5 changed files with 78 additions and 71 deletions

@@ -2,92 +2,93 @@
 #+author: Aryadev Chavali
 #+date: 2023-10-15
-A stack based virtual machine in C11, with a dynamic register setup
-which acts as variable space. Deals primarily in bytes, doesn't make
-assertions about typing and is very simple to target.
+A virtual machine in C11, stack oriented with a dynamic register.
+Deals primarily in bytes, doesn't make assertions about typing and is
+very simple to target.
-This repository contains both a library ([[file:lib/][lib folder]]) to
-(de)serialize bytecode and a program ([[file:vm/][vm folder]]) to
-execute bytecode.
+This repository contains both a library ([[file:lib/][lib]]) to
+(de)serialize bytecode and a program ([[file:vm/][vm]]) to execute
+said bytecode.
-Along with this is an
-[[https://github.com/aryadev-software/aal][assembler]] program which
-can compile an assembly-like language to bytecode.
 * How to build
 Requires =GNU make= and a compliant C11 compiler. Code base has been
 tested against =gcc= and =clang=, but given how the project has been
 written without use of GNU'isms (that I'm aware of) it shouldn't be an
 issue to compile using something like =tcc= or another compiler (look
-at [[file:Makefile::CC=gcc][here]] to change the compiler).
+at [[file:Makefile::CC=gcc][here]] to change the compiler used).
-To build everything simply run ~make~. This will build:
-+ [[file:lib/][instruction bytecode system]] which provides object
-files to target the VM
-+ [[file:vm/][VM executable]] which executes bytecode
+To build a release version simply run ~make all RELEASE=1~. To build
+a debug version run ~make all VERBOSE=<n>~ where n can be 0, 1 or 2
+depending on how verbose you want logs to standard output to be. This
+will build:
++ [[file:lib/][instruction bytecode system]] which provides a shared
+library for serialising and deserialising bytecode
++ [[file:vm/][VM executable]] to execute bytecode
 You may also build each component individually through the
 corresponding recipe:
 + ~make lib~
 + ~make vm~
-* How to target the virtual machine
+* Targeting the virtual machine
-Link with the object files for [[file:lib/base.c][base.c]] and
-[[file:lib/inst.c][inst.c]] to be able to properly target the virtual
-machine. The general idea is to convert parse units into instances of
-~inst_t~. Once a collection of ~inst_t~'s have been made, they must
-be wrapped in a ~prog_t~ structure which is a flexibly allocated
-structure with two components:
-1) A program header ~prog_header_t~ with some essential properties of
-the program (start address, count, etc)
-2) A buffer of type ~inst_t~ which should contain the ordered
-collection constructed
+Link with the shared library =libavm.so= which should be located in
+the =build= folder. The general idea is to construct a ~prog_t~
+structure, which consists of:
+1) A program header with some essential properties of the program
+(start address, count, etc)
+2) An array of type ~inst_t~, ordered instructions for execution
-There are two ways to utilise execute this program structure:
-compilation or in memory execution.
-** Compilation
-The ~prog_t~ structure can be fed to ~prog_write_file~ with a file
-pointer to write well formed =AVM= bytecode into a file. To execute
-this bytecode, simply use the ~avm.out~ executable with the bytecode
-file name.
-This is the classical way I expect languages to target the virtual
-machine.
+This structure may be executed in two ways.
+** Compilation then separate execution
+The ~prog_t~ structure along with a sufficiently sized buffer of bytes
+(using ~prog_bytecode_size~ to get the size necessary) can be used to
+call ~prog_write_bytecode~, which will populate the buffer with the
+corresponding bytecode.
+The buffer is written to some file then executed using the =avm=
+executable. This is the classical way I expect languages to target
+the virtual machine.
 ** In memory virtual machine
-This method requires linking with [[file:vm/runtime.c]] to be able to
-construct a working ~vm_t~ structure. The steps are:
-+ Load the stack, heap and call stack into a ~vm_t~ structure
-+ Load the ~prog_t~ into the ~vm_t~ (~vm_load_program~)
-+ Execute via ~vm_execute~ or ~vm_execute_all~
+This method is more involved, introducing the virtual machine runtime
+into the program itself. After constructing a ~prog_t~ structure, it
+can be fit into a ~vm_t~ structure. This ~vm_t~ structure also must
+have a stack, heap and call stack (look at [[file:vm/main.c]] to see
+this in practice). This structure can then be used with
+~vm_execute_all~ to execute the program.
-~vm_execute~ executes the next instruction and stops, while
-~vm_execute_all~ continues execution till the program halts. Either
-can be useful depending on requirements.
-I expect this method to be used for languages that are /interpreted/
-such as Lisp or Python where /code/ -> /execution/ rather than /code/
--> /compile unit/ -> /execute unit/, while still providing the ability
-to compile code to a byte code unit.
+Note that this skips the serialising process (i.e. the /compilation/)
+by utilising the runtime directly. I could see this approach being
+used when writing an interpreted language such as Lisp where code
+should be executed immediately after parsing. Furthermore,
+introducing the runtime directly into the calling program gives much
+greater control over parameters such as stack/heap size and step by
+step execution which can be useful in dynamic contexts. Furthermore,
+the ~prog_t~ can still be compiled into bytecode whenever required.
+* Related projects
+[[https://github.com/aryadev-software/aal][Assembler]] program which
+can compile an assembly-like language to bytecode.
 * Lines of code
 #+begin_src sh :results table :exports results
-wc -lwc $(find -regex ".*\.[ch]\(pp\)?")
+wc -lwc $(find vm/ lib/ -regex ".*\.[ch]\(pp\)?")
 #+end_src
 #+RESULTS:
-| Files          | Lines | Words | Bytes |
-|----------------+-------+-------+-------|
-| ./vm/struct.h  |    69 |   197 |  1534 |
-| ./vm/main.c    |    94 |   267 |  2266 |
-| ./vm/struct.c  |   262 |   767 |  6882 |
-| ./vm/runtime.h |   270 |   705 |  7318 |
-| ./vm/runtime.c |   792 |  2451 | 23664 |
-| ./lib/darr.h   |    88 |   465 |  2705 |
-| ./lib/heap.c   |   101 |   270 |  1910 |
-| ./lib/base.h   |   159 |   656 |  4180 |
-| ./lib/heap.h   |    42 |   111 |   803 |
-| ./lib/prog.h   |   173 |   243 |  2589 |
-| ./lib/base.c   |   107 |   306 |  2054 |
-| ./lib/inst.c   |   510 |  1299 | 14122 |
-| ./lib/darr.c   |    77 |   225 |  1767 |
-| ./lib/inst.h   |   113 |   461 |  4269 |
-|----------------+-------+-------+-------|
-| total          |  2857 |  8423 | 76063 |
+|------------------+-------+-------+------------|
+| File             | Lines | Words | Characters |
+|------------------+-------+-------+------------|
+| vm/runtime.h     |   266 |   699 |       7250 |
+| vm/main.c        |   135 |   375 |       3448 |
+| vm/runtime.c     |   802 |  2441 |      23634 |
+| vm/struct.c      |   262 |   783 |       7050 |
+| vm/struct.h      |    69 |   196 |       1531 |
+| lib/inst.c       |   493 |  1215 |      13043 |
+| lib/darr.h       |   149 |   709 |       4482 |
+| lib/inst.h       |   248 |   519 |       4964 |
+| lib/inst-macro.h |    71 |   281 |       2806 |
+| lib/heap.h       |   125 |   453 |       3050 |
+| lib/base.h       |   190 |   710 |       4633 |
+| lib/heap.c       |    79 |   214 |       1647 |
+| lib/base.c       |    61 |   226 |       1583 |
+| lib/darr.c       |    76 |   219 |       1746 |
+|------------------+-------+-------+------------|
+| total            |  3026 |  9040 |      80867 |
+|------------------+-------+-------+------------|

@@ -15,7 +15,6 @@
 #include <lib/base.h>
 #include <stdio.h>
-#include <stdlib.h>
 #define UNSIGNED_OPCODE_IS_TYPE(OPCODE, OP_TYPE) \
 (((OPCODE) >= OP_TYPE##_BYTE) && ((OPCODE) <= OP_TYPE##_WORD))

@@ -17,7 +17,12 @@
 ** TODO Specification
 * TODO Introduce error handling in base library :LIB:
 There is a large variety of TODOs about errors. Let's fix them!
-8 TODOs currently present.
+#+begin_src sh :exports results :results output verbatim replace
+echo "$(find -type 'f' -regex ".*\.[ch]\(pp\)?" -exec grep -nH TODO "{}" ";" | wc -l) TODOs currently"
+#+end_src
+#+RESULTS:
+: 8 TODOs currently
 * TODO Standard library :VM:
 I should start considering this and how a user may use it. Should it
 be an option in the VM and/or assembler binaries (i.e. a flag) or

@@ -11,6 +11,7 @@
 */
 #include <stdio.h>
+#include <stdlib.h>
 #include <vm/runtime.h>
 #include <vm/struct.h>

@@ -11,9 +11,10 @@
 */
 #include <stdio.h>
+#include <stdlib.h>
-#include "./struct.h"
-#include "lib/darr.h"
+#include <lib/darr.h>
+#include <vm/struct.h>
 void vm_load_stack(vm_t *vm, byte_t *bytes, size_t size)
 {