diff --git a/.gitignore b/.gitignore
index 59e32c1..14cc2cf 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,3 +3,4 @@ build/*
 TAGS
 compile_commands.json
 .cache/*
+/ltximg/
diff --git a/Makefile b/Makefile
index 8a38194..a035981 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 CC=gcc
 VERBOSE=0
-RELEASE=0
+RELEASE=1
 
 GENERAL-FLAGS:=-Wall -Wextra -Wswitch-enum -I$(shell pwd) -std=c11
 DEBUG-FLAGS=-ggdb
diff --git a/README.org b/README.org
index f6e968e..5b4aa82 100644
--- a/README.org
+++ b/README.org
@@ -1,59 +1,51 @@
-#+title: Aryadev's Virtual Machine (AVM)
+#+title: A Virtual Machine (AVM)
 #+author: Aryadev Chavali
 #+date: 2023-10-15
 
-A virtual machine in C11, stack oriented with a dynamic register.
-Deals primarily in bytes, doesn't make assertions about typing and is
-very simple to target.
+An exercise in making a virtual machine in C11 with both a stack and
+registers.
 
-This repository contains both a library ([[file:lib/][lib]]) to
-(de)serialize bytecode and a program ([[file:vm/][vm]]) to execute
+This repository contains both a shared library ([[file:lib/][lib]]) to
+(de)serialise bytecode and a program ([[file:vm/][vm]]) to execute
 said bytecode.
 
 * How to build
-Requires =GNU make= and a compliant C11 compiler. Code base has been
-tested against =gcc= and =clang=, but given how the project has been
-written without use of GNU'isms (that I'm aware of) it shouldn't be an
-issue to compile using something like =tcc= or another compiler (look
-at [[file:Makefile::CC=gcc][here]] to change the compiler used).
+Requires =GNU make= and a compliant C11 compiler. Look
+[[file:Makefile::CC=gcc][here]] to change the compiler used.
 
-To build a release version simply run ~make all RELEASE=1~. To build
-a debug version run ~make all VERBOSE=~ where n can be 0, 1 or 2
-depending on how verbose you want logs to standard output to be. This
-will build:
+To build a release version simply run ~make all~. To build a debug
+version run ~make all RELEASE=0 VERBOSE=2~ which has most runtime logs
+on. This will build:
 + [[file:lib/][instruction bytecode system]] which provides a shared
   library for serialising and deserialising bytecode
 + [[file:vm/][VM executable]] to execute bytecode
-
-You may also build each component individually through the
-corresponding recipe:
-+ ~make lib~
-+ ~make vm~
 * Targeting the virtual machine
-Link with the shared library =libavm.so= which should be located in
-the =build= folder. The general idea is to construct a ~prog_t~
-structure, which consists of:
+Link with the shared library =libavm.so=. The general idea is to
+construct a ~prog_t~ structure, which consists of:
 1) A program header with some essential properties of the program
    (start address, count, etc)
-2) An array of type ~inst_t~, ordered instructions for execution
+2) An array of type ~inst_t~ which is an ordered set of instructions
+   for execution
 
-This structure may be executed in two ways.
+This structure can be executed in two ways.
 ** Compilation then separate execution
 The ~prog_t~ structure along with a sufficiently sized buffer of bytes
-(using ~prog_bytecode_size~ to get the size necessary) can be used to
-call ~prog_write_bytecode~, which will populate the buffer with the
-corresponding bytecode.
+(~prog_bytecode_size~ gives the exact number of bytes necessary) can
+be used in calling ~prog_write_bytecode~, which will populate the
+buffer with the corresponding bytecode.
 
 The buffer is written to some file then executed using the =avm=
 executable. This is the classical way I expect languages to target
 the virtual machine.
 ** In memory virtual machine
-This method is more involved, introducing the virtual machine runtime
-into the program itself. After constructing a ~prog_t~ structure, it
-can be fit into a ~vm_t~ structure. This ~vm_t~ structure also must
-have a stack, heap and call stack (look at [[file:vm/main.c]] to see
-this in practice). This structure can then be used with
-~vm_execute_all~ to execute the program.
+This method is works by introducing the virtual machine runtime into
+the program that wishes to utilise the AVM itself. After constructing
+a ~prog_t~ structure, it can be fit into a ~vm_t~ structure. This
+structure maintains various other components such as the stack, heap
+and call stack. This structure can then be used with ~vm_execute_all~
+to execute the program.
+
+Look at [[file:vm/main.c]] to see this in practice.
 
 Note that this skips the serialising process (i.e. the /compilation/)
 by utilising the runtime directly. I could see this approach being
@@ -68,27 +60,26 @@ the ~prog_t~ can still be compiled into bytecode whenever required.
 can compile an assembly-like language to bytecode.
 * Lines of code
 #+begin_src sh :results table :exports results
-wc -lwc $(find vm/ lib/ -regex ".*\.[ch]\(pp\)?")
+echo 'Files Lines Words Characters'
+wc -lwc $(find vm/ lib/ -regex ".*\.[ch]\(pp\)?") | awk '{print $4 "\t" $1 "\t" $2 "\t" $3}'
 #+end_src
 
 #+RESULTS:
+| Files            | Lines | Words | Characters |
 |------------------+-------+-------+------------|
-| File             | Lines | Words | Characters |
-|------------------+-------+-------+------------|
-| vm/runtime.h     |   266 |   699 |       7250 |
-| vm/main.c        |   135 |   375 |       3448 |
-| vm/runtime.c     |   802 |  2441 |      23634 |
-| vm/struct.c      |   262 |   783 |       7050 |
-| vm/struct.h      |    69 |   196 |       1531 |
-| lib/inst.c       |   493 |  1215 |      13043 |
+| vm/runtime.h     |   327 |   872 |       9082 |
+| vm/main.c        |   136 |   381 |       3517 |
+| vm/runtime.c     |   735 |  2454 |      26742 |
+| vm/struct.c      |   252 |   780 |       6805 |
+| vm/struct.h      |    74 |   204 |       1564 |
+| lib/inst.c       |   567 |  1369 |      14899 |
 | lib/darr.h       |   149 |   709 |       4482 |
-| lib/inst.h       |   248 |   519 |       4964 |
+| lib/inst.h       |   277 |   547 |       5498 |
 | lib/inst-macro.h |    71 |   281 |       2806 |
 | lib/heap.h       |   125 |   453 |       3050 |
-| lib/base.h       |   190 |   710 |       4633 |
+| lib/base.h       |   236 |   895 |       5868 |
 | lib/heap.c       |    79 |   214 |       1647 |
-| lib/base.c       |    61 |   226 |       1583 |
+| lib/base.c       |    82 |   288 |       2048 |
 | lib/darr.c       |    76 |   219 |       1746 |
 |------------------+-------+-------+------------|
-| total            |  3026 |  9040 |      80867 |
-|------------------+-------+-------+------------|
+| total            |  3186 |  9666 |      89754 |