Update README

2024-06-25 00:41:43 +01:00
parent a24196f236
commit 62b9b230c5
3 changed files with 41 additions and 49 deletions

1
.gitignore vendored

@@ -3,3 +3,4 @@ build/*
TAGS TAGS
compile_commands.json compile_commands.json
.cache/* .cache/*
/ltximg/


@@ -1,6 +1,6 @@
CC=gcc CC=gcc
VERBOSE=0 VERBOSE=0
RELEASE=0 RELEASE=1
GENERAL-FLAGS:=-Wall -Wextra -Wswitch-enum -I$(shell pwd) -std=c11 GENERAL-FLAGS:=-Wall -Wextra -Wswitch-enum -I$(shell pwd) -std=c11
DEBUG-FLAGS=-ggdb DEBUG-FLAGS=-ggdb


@@ -1,59 +1,51 @@
#+title: Aryadev's Virtual Machine (AVM) #+title: A Virtual Machine (AVM)
#+author: Aryadev Chavali #+author: Aryadev Chavali
#+date: 2023-10-15 #+date: 2023-10-15
A virtual machine in C11, stack oriented with a dynamic register. An exercise in making a virtual machine in C11 with both a stack and
Deals primarily in bytes, doesn't make assertions about typing and is registers.
very simple to target.
This repository contains both a library ([[file:lib/][lib]]) to This repository contains both a shared library ([[file:lib/][lib]]) to
(de)serialize bytecode and a program ([[file:vm/][vm]]) to execute (de)serialise bytecode and a program ([[file:vm/][vm]]) to execute
said bytecode. said bytecode.
* How to build * How to build
Requires =GNU make= and a compliant C11 compiler. Code base has been Requires =GNU make= and a compliant C11 compiler. Look
tested against =gcc= and =clang=, but given how the project has been [[file:Makefile::CC=gcc][here]] to change the compiler used.
written without use of GNU'isms (that I'm aware of) it shouldn't be an
issue to compile using something like =tcc= or another compiler (look
at [[file:Makefile::CC=gcc][here]] to change the compiler used).
To build a release version simply run ~make all RELEASE=1~. To build To build a release version simply run ~make all~. To build a debug
a debug version run ~make all VERBOSE=<n>~ where n can be 0, 1 or 2 version run ~make all RELEASE=0 VERBOSE=2~ which has most runtime logs
depending on how verbose you want logs to standard output to be. This on. This will build:
will build:
+ [[file:lib/][instruction bytecode system]] which provides a shared + [[file:lib/][instruction bytecode system]] which provides a shared
library for serialising and deserialising bytecode library for serialising and deserialising bytecode
+ [[file:vm/][VM executable]] to execute bytecode + [[file:vm/][VM executable]] to execute bytecode
You may also build each component individually through the
corresponding recipe:
+ ~make lib~
+ ~make vm~
* Targeting the virtual machine * Targeting the virtual machine
Link with the shared library =libavm.so= which should be located in Link with the shared library =libavm.so=. The general idea is to
the =build= folder. The general idea is to construct a ~prog_t~ construct a ~prog_t~ structure, which consists of:
structure, which consists of:
1) A program header with some essential properties of the program 1) A program header with some essential properties of the program
(start address, count, etc) (start address, count, etc)
2) An array of type ~inst_t~, ordered instructions for execution 2) An array of type ~inst_t~ which is an ordered set of instructions
for execution
This structure may be executed in two ways. This structure can be executed in two ways.
** Compilation then separate execution ** Compilation then separate execution
The ~prog_t~ structure along with a sufficiently sized buffer of bytes The ~prog_t~ structure along with a sufficiently sized buffer of bytes
(using ~prog_bytecode_size~ to get the size necessary) can be used to (~prog_bytecode_size~ gives the exact number of bytes necessary) can
call ~prog_write_bytecode~, which will populate the buffer with the be used in calling ~prog_write_bytecode~, which will populate the
corresponding bytecode. buffer with the corresponding bytecode.
The buffer is written to some file then executed using the =avm= The buffer is written to some file then executed using the =avm=
executable. This is the classical way I expect languages to target executable. This is the classical way I expect languages to target
the virtual machine. the virtual machine.
** In memory virtual machine ** In memory virtual machine
This method is more involved, introducing the virtual machine runtime This method works by introducing the virtual machine runtime into
into the program itself. After constructing a ~prog_t~ structure, it the program that wishes to utilise the AVM itself. After constructing
can be fit into a ~vm_t~ structure. This ~vm_t~ structure also must a ~prog_t~ structure, it can be fit into a ~vm_t~ structure. This
have a stack, heap and call stack (look at [[file:vm/main.c]] to see structure maintains various other components such as the stack, heap
this in practice). This structure can then be used with and call stack. This structure can then be used with ~vm_execute_all~
~vm_execute_all~ to execute the program. to execute the program.
Look at [[file:vm/main.c]] to see this in practice.
Note that this skips the serialising process (i.e. the /compilation/) Note that this skips the serialising process (i.e. the /compilation/)
by utilising the runtime directly. I could see this approach being by utilising the runtime directly. I could see this approach being
@@ -68,27 +60,26 @@ the ~prog_t~ can still be compiled into bytecode whenever required.
can compile an assembly-like language to bytecode. can compile an assembly-like language to bytecode.
* Lines of code * Lines of code
#+begin_src sh :results table :exports results #+begin_src sh :results table :exports results
wc -lwc $(find vm/ lib/ -regex ".*\.[ch]\(pp\)?") echo 'Files Lines Words Characters'
wc -lwc $(find vm/ lib/ -regex ".*\.[ch]\(pp\)?") | awk '{print $4 "\t" $1 "\t" $2 "\t" $3}'
#+end_src #+end_src
#+RESULTS: #+RESULTS:
| Files | Lines | Words | Characters |
|------------------+-------+-------+------------| |------------------+-------+-------+------------|
| File | Lines | Words | Characters | | vm/runtime.h | 327 | 872 | 9082 |
|------------------+-------+-------+------------| | vm/main.c | 136 | 381 | 3517 |
| vm/runtime.h | 266 | 699 | 7250 | | vm/runtime.c | 735 | 2454 | 26742 |
| vm/main.c | 135 | 375 | 3448 | | vm/struct.c | 252 | 780 | 6805 |
| vm/runtime.c | 802 | 2441 | 23634 | | vm/struct.h | 74 | 204 | 1564 |
| vm/struct.c | 262 | 783 | 7050 | | lib/inst.c | 567 | 1369 | 14899 |
| vm/struct.h | 69 | 196 | 1531 |
| lib/inst.c | 493 | 1215 | 13043 |
| lib/darr.h | 149 | 709 | 4482 | | lib/darr.h | 149 | 709 | 4482 |
| lib/inst.h | 248 | 519 | 4964 | | lib/inst.h | 277 | 547 | 5498 |
| lib/inst-macro.h | 71 | 281 | 2806 | | lib/inst-macro.h | 71 | 281 | 2806 |
| lib/heap.h | 125 | 453 | 3050 | | lib/heap.h | 125 | 453 | 3050 |
| lib/base.h | 190 | 710 | 4633 | | lib/base.h | 236 | 895 | 5868 |
| lib/heap.c | 79 | 214 | 1647 | | lib/heap.c | 79 | 214 | 1647 |
| lib/base.c | 61 | 226 | 1583 | | lib/base.c | 82 | 288 | 2048 |
| lib/darr.c | 76 | 219 | 1746 | | lib/darr.c | 76 | 219 | 1746 |
|------------------+-------+-------+------------| |------------------+-------+-------+------------|
| total | 3026 | 9040 | 80867 | | total | 3186 | 9666 | 89754 |
|------------------+-------+-------+------------|