Clean up work tree for making assembler
116
README.org
@@ -1,110 +1,44 @@
-#+title: Oreo's Virtual Machine (OVM)
+#+title: Aryadev's Assembly Language (AAL)
 #+author: Aryadev Chavali
 #+date: 2023-10-15
 
-A stack based virtual machine in C11, with a dynamic register setup
-which acts as variable space. Deals primarily in bytes, doesn't make
-assertions about typing and is very simple to target.
+A compiler for Aryadev's Assembly Language, an assembly-like
+programming language, which targets the
+[[https://github.com/aryadev-software/avm/][AVM]].
 
-2024-04-16: Project will now be split into two components
-1) The runtime + base library
-2) The assembler
-
-This will focus each repository on separate issues and make it easier
-to organize. They will both derive from the same repositories
-i.e. I'm not making fresh repositories and just sticking the folders
-in but rather branching this repository into two different versions.
-
-The two versions will be hosted at:
-1) [[https://github.com/aryadev-software/avm]]
-1) [[https://github.com/aryadev-software/aal]]
 * How to build
-Requires =GNU make= and a compliant C11 compiler. Code base has been
-tested against =gcc= and =clang=, but given how the project has been
-written without use of GNU'isms (that I'm aware of) it shouldn't be an
-issue to compile using something like =tcc= or another compiler (look
-at [[file:Makefile::CC=gcc][here]] to change the compiler).
+Requires =GNU make= and a compliant C++17 compiler. Code base has
+been tested against =g++= and =clang=, but given how the project has
+been written without use of GNU'isms (that I'm aware of) it shouldn't
+be an issue to compile using something like =tcc= or another compiler
+(look at [[file:Makefile::CPP=g++][here]] to change the compiler).
 
 To build everything simply run ~make~. This will build:
-+ [[file:lib/inst.c][instruction bytecode system]] which provides
-  object files to target the VM
-+ [[file:vm/main.c][VM executable]] which executes bytecode
-+ [[file:asm/main.c][Assembler executable]] which assembles compliant
-  assembly code to VM bytecode
++ [[file:asm/main.cpp][Assembler executable]] which assembles
+  compliant assembly code to VM bytecode
 + [[file:examples/][Assembly examples]] which provide some source code
   examples on common programs one may write. Use this to figure out
-  how to write compliant assembly. Also a good test of both the VM
-  and assembler.
+  how to write compliant AAL. Also a good test of both the VM and
+  assembler.
 
 You may also build each component individually through the
 corresponding recipe:
-+ ~make lib~
-+ ~make vm~
 + ~make asm~
 + ~make examples~
-* Instructions to target the virtual machine
-You need to link with the object files for
-[[file:lib/base.c][base.c]], [[file:lib/darr.c][darr.c]] and
-[[file:lib/inst.c][inst.c]] to be able to properly target the OVM.
-The basic idea is to create some instructions via ~inst_t~,
-instantiating a ~prog_t~ structure which wraps those instructions
-(includes a header and other useful things for the runtime), then
-using ~prog_write_file~ to serialise and write bytecode to a file
-pointer.
-
-To execute directly compiled bytecode use the ~ovm.out~ executable on
-the bytecode file.
-
-For clarity, one may build ~lib~ (~make lib~) then use the resulting
-object files to link and create bytecode for the virtual machine.
-** In memory virtual machine
-Instead of serialising and writing bytecode to a file, one may instead
-serialise bytecode in memory using ~prog_write_bytecode~ which writes
-bytecode to a dynamic byte buffer, so called *in memory compilation*.
-To execute this bytecode, deserialise the bytecode into a program then
-load it into a complete ~vm_t~ structure (linking with
-[[file:vm/runtime.c][runtime.c]]).
-
-In fact, you may skip the process of serialising entirely. You can
-emit a ~prog_t~ structure corresponding to source code, load it
-directly into the ~vm_t~ structure, then execute. To do so is a bit
-involved, so I recommend looking at [[file:vm/main.c]]. In rough
-steps:
-+ Create a virtual machine "from scratch" (load the necessary
-  components (the stack, heap and call stack) by hand)
-+ Load program into VM (~vm_load_program~)
-+ Run ~vm_execute_all~
-
-This is recommended if writing an interpreted language such as a Lisp,
-where on demand execution of code is more suitable.
 * Lines of code
 #+begin_src sh :results table :exports results
-wc -lwc $(find -regex ".*\.[ch]\(pp\)?")
+wc -lwc $(find -regex ".*\.[ch]\(pp\)?" -maxdepth 2)
 #+end_src
 
 #+RESULTS:
-| Files                  | Lines | Words |  Bytes |
-|------------------------+-------+-------+--------|
-| ./lib/heap.h           |    42 |   111 |    801 |
-| ./lib/inst.c           |   516 |  1315 |  13982 |
-| ./lib/darr.c           |    77 |   225 |   1757 |
-| ./lib/base.c           |   107 |   306 |   2002 |
-| ./lib/inst.h           |   108 |   426 |   4067 |
-| ./lib/prog.h           |   176 |   247 |   2616 |
-| ./lib/base.h           |   148 |   626 |   3915 |
-| ./lib/darr.h           |    88 |   465 |   2697 |
-| ./lib/heap.c           |   101 |   270 |   1910 |
-| ./vm/runtime.h         |   301 |   780 |   7965 |
-| ./vm/runtime.c         |  1070 |  3097 |  30010 |
-| ./vm/main.c            |    92 |   265 |   2243 |
-| ./asm/base.hpp         |    21 |    68 |    472 |
-| ./asm/lexer.cpp        |   565 |  1448 |  14067 |
-| ./asm/base.cpp         |    33 |    89 |    705 |
-| ./asm/parser.hpp       |    82 |   199 |   1656 |
-| ./asm/parser.cpp       |    42 |   129 |   1294 |
-| ./asm/lexer.hpp        |   106 |   204 |   1757 |
-| ./asm/preprocesser.cpp |   218 |   574 |   5800 |
-| ./asm/preprocesser.hpp |    62 |   147 |   1360 |
-| ./asm/main.cpp         |   148 |   414 |   3791 |
-|------------------------+-------+-------+--------|
-| total                  |  4103 | 11405 | 104867 |
+| Files                  | Lines | Words | Bytes |
+|------------------------+-------+-------+-------|
+| ./asm/base.hpp         |    21 |    68 |   472 |
+| ./asm/lexer.cpp        |   565 |  1448 | 14067 |
+| ./asm/base.cpp         |    33 |    89 |   705 |
+| ./asm/lexer.hpp        |   106 |   204 |  1757 |
+| ./asm/preprocesser.cpp |   218 |   574 |  5800 |
+| ./asm/preprocesser.hpp |    62 |   147 |  1360 |
+| ./asm/main.cpp         |   148 |   414 |  3791 |
+|------------------------+-------+-------+-------|
+| total                  |  1153 |  2944 | 27952 |
107
lib/base.c
@@ -1,107 +0,0 @@
/* Copyright (C) 2023 Aryadev Chavali

 * You may distribute and modify this code under the terms of the
 * GPLv2 license. You should have received a copy of the GPLv2
 * license with this file. If not, please write to:
 * aryadev@aryadevchavali.com.

 * Created: 2023-10-26
 * Author: Aryadev Chavali
 * Description: Implementation of basic library functions
 */

#include "./base.h"

#include <string.h>

union hword_pun
{
  hword h;
  byte bytes[HWORD_SIZE];
};

union word_pun
{
  word h;
  byte bytes[WORD_SIZE];
};

hword hword_htobc(hword w)
{
#if __LITTLE_ENDIAN__
  return w;
#else
  union hword_pun x = {w};
  union hword_pun y = {0};
  for (size_t i = 0, j = HWORD_SIZE; i < HWORD_SIZE; ++i, --j)
    y.bytes[j - 1] = x.bytes[i];
  return y.h;
#endif
}

hword hword_bctoh(hword w)
{
#if __LITTLE_ENDIAN__
  return w;
#else
  union hword_pun x = {w};
  union hword_pun y = {0};
  for (size_t i = 0, j = HWORD_SIZE; i < HWORD_SIZE; ++i, --j)
    y.bytes[j - 1] = x.bytes[i];
  return y.h;
#endif
}

word word_htobc(word w)
{
#if __LITTLE_ENDIAN__
  return w;
#else
  union word_pun x = {w};
  union word_pun y = {0};
  for (size_t i = 0, j = WORD_SIZE; i < WORD_SIZE; ++i, --j)
    y.bytes[j - 1] = x.bytes[i];
  return y.h;
#endif
}

word word_bctoh(word w)
{
#if __LITTLE_ENDIAN__
  return w;
#else
  union word_pun x = {w};
  union word_pun y = {0};
  for (size_t i = 0, j = WORD_SIZE; i < WORD_SIZE; ++i, --j)
    y.bytes[j - 1] = x.bytes[i];
  return y.h;
#endif
}

hword convert_bytes_to_hword(byte *bytes)
{
  hword be_h = 0;
  memcpy(&be_h, bytes, HWORD_SIZE);
  hword h = hword_bctoh(be_h);
  return h;
}

void convert_hword_to_bytes(hword w, byte *bytes)
{
  hword be_h = hword_htobc(w);
  memcpy(bytes, &be_h, HWORD_SIZE);
}

void convert_word_to_bytes(word w, byte *bytes)
{
  word be_w = word_htobc(w);
  memcpy(bytes, &be_w, WORD_SIZE);
}

word convert_bytes_to_word(byte *bytes)
{
  word be_w = 0;
  memcpy(&be_w, bytes, WORD_SIZE);
  word w = word_bctoh(be_w);
  return w;
}
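The conversion routines in lib/base.c choose their branch from the `__LITTLE_ENDIAN__` macro, which is not part of standard C and is not predefined by every compiler. For comparison, here is a minimal sketch of a shift-based encoding that writes the least significant byte first on any host. It is not the project's implementation: `word_to_le_bytes` and `le_bytes_to_word` are hypothetical names, and the `byte` and `word` typedefs are re-declared locally so the block stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local re-declarations so the sketch compiles on its own. */
typedef uint8_t byte;
typedef uint64_t word;

/* Write w into buffer, least significant byte first (little endian).
 * The result does not depend on the byte order of the host.
 */
void word_to_le_bytes(word w, byte *buffer)
{
  assert(buffer != NULL);
  for (size_t i = 0; i < sizeof(word); ++i)
    buffer[i] = (byte)((w >> (i * 8)) & 0xFF);
}

/* Read sizeof(word) little endian bytes from buffer into a host word. */
word le_bytes_to_word(const byte *buffer)
{
  assert(buffer != NULL);
  word w = 0;
  for (size_t i = 0; i < sizeof(word); ++i)
    w |= ((word)buffer[i]) << (i * 8);
  return w;
}
```

Because the shifts operate on values rather than on memory layout, no preprocessor check or union is needed, at the cost of a short loop per conversion.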
148
lib/base.h
@@ -1,148 +0,0 @@
/* Copyright (C) 2023 Aryadev Chavali

 * You may distribute and modify this code under the terms of the
 * GPLv2 license. You should have received a copy of the GPLv2
 * license with this file. If not, please write to:
 * aryadev@aryadevchavali.com.

 * Created: 2023-10-15
 * Author: Aryadev Chavali
 * Description: Basic types and routines
 */

#ifndef BASE_H
#define BASE_H

#include <stdint.h>

/* Basic macros for a variety of uses. Quite self explanatory. */
#define ARR_SIZE(xs) (sizeof(xs) / sizeof(xs[0]))
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define MIN(a, b) ((a) > (b) ? (b) : (a))
#define TERM_GREEN "\e[0;32m"
#define TERM_YELLOW "\e[0;33m"
#define TERM_RED "\e[0;31m"
#define TERM_RESET "\e[0;0m"

// Flags for program behaviour (usually related to printing)
#ifndef VERBOSE
#define VERBOSE 0
#endif
#ifndef PRINT_HEX
#define PRINT_HEX 0
#endif

/* Ease of use aliases for numeric types */
typedef uint8_t u8;
typedef int8_t i8;
typedef uint32_t u32;
typedef int32_t i32;
typedef uint64_t u64;
typedef int64_t i64;

typedef float f32;
typedef double f64;

typedef u8 byte;
typedef i8 s_byte;
typedef u32 hword;
typedef i32 s_hword;
typedef u64 word;
typedef i64 s_word;

/* Macros for the sizes of common base data types. */
#define HWORD_SIZE sizeof(hword)
#define SHWORD_SIZE sizeof(s_hword)
#define WORD_SIZE sizeof(word)
#define SWORD_SIZE sizeof(s_word)

/** Union for all basic data types in the virtual machine.
 */
typedef union
{
  byte as_byte;
  s_byte as_char;
  hword as_hword;
  s_hword as_int;
  word as_word;
  s_word as_long;
} data_t;

/** Enum of type tags for the data_t structure to provide context.
 */
typedef enum
{
  DATA_TYPE_NIL = 0,
  DATA_TYPE_BYTE,
  DATA_TYPE_HWORD,
  DATA_TYPE_WORD,
} data_type_t;

/* Some macros for constructing data_t instances quickly. */
#define DBYTE(BYTE) ((data_t){.as_byte = (BYTE)})
#define DHWORD(HWORD) ((data_t){.as_hword = (HWORD)})
#define DWORD(WORD) ((data_t){.as_word = (WORD)})

/** Safely subtract SUB from W, where both are words (64 bit integers).
 *
 * In case of underflow (i.e. where W - SUB < 0) returns 0 instead of
 * the underflowed result.
 */
#define WORD_SAFE_SUB(W, SUB) ((W) > (SUB) ? ((W) - (SUB)) : 0)

/** Return the Nth byte of WORD
 * N should range from 0 to 7 as there are 8 bytes in a word.
 */
#define WORD_NTH_BYTE(WORD, N) (((WORD) >> ((N) * 8)) & 0xFF)

/** Return the Nth half word of WORD
 * N should range from 0 to 1 as there are 2 half words in a word.
 * A half word is 32 bits wide, so the shift is N * 32 bits.
 */
#define WORD_NTH_HWORD(WORD, N) (((WORD) >> ((N) * 32)) & 0xFFFFFFFF)

/** Convert a buffer of bytes to a half word
 * We assume the buffer of bytes are in virtual machine byte code
 * format (little endian) and that they are at least HWORD_SIZE in
 * size.
 */
hword convert_bytes_to_hword(byte *buffer);

/** Convert a half word into a VM byte code format bytes (little endian)
 * @param h: Half word to convert
 * @param buffer: Buffer to store into. We assume the buffer has at
 * least HWORD_SIZE space.
 */
void convert_hword_to_bytes(hword h, byte *buffer);

/** Convert a buffer of bytes to a word
 * We assume the buffer of bytes are in virtual machine byte code
 * format (little endian) and that they are at least WORD_SIZE in
 * size.
 */
word convert_bytes_to_word(byte *);

/** Convert a word into a VM byte code format bytes (little endian)
 * @param w: Word to convert
 * @param buffer: Buffer to store into. We assume the buffer has at
 * least WORD_SIZE space.
 */
void convert_word_to_bytes(word w, byte *buffer);

/** Convert a half word into bytecode format (little endian)
 */
hword hword_htobc(hword);

/** Convert a half word in bytecode format (little endian) to host
 * format
 */
hword hword_bctoh(hword);

/** Convert a word into bytecode format (little endian)
 */
word word_htobc(word);

/** Convert a word in bytecode format (little endian) to host format
 */
word word_bctoh(word);

#endif
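A quick way to sanity-check the word accessors declared in lib/base.h is to apply them to a value whose bytes are all distinct. The sketch below re-declares the macros locally and assumes a half word is 32 bits wide, so the Nth half word sits N * 32 bits from the bottom of the word. The `nth_byte` and `nth_hword` wrappers are hypothetical helpers that exist only to add range checks for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local re-declaration so the sketch compiles on its own. */
typedef uint64_t word;

#define WORD_SAFE_SUB(W, SUB) ((W) > (SUB) ? ((W) - (SUB)) : 0)
#define WORD_NTH_BYTE(WORD, N) (((WORD) >> ((N) * 8)) & 0xFF)
/* Assumption: half words are 32 bits, hence the shift by N * 32. */
#define WORD_NTH_HWORD(WORD, N) (((WORD) >> ((N) * 32)) & 0xFFFFFFFF)

/* Byte n of w, counting from the least significant byte (n in 0..7). */
word nth_byte(word w, unsigned n)
{
  assert(n < 8);
  return WORD_NTH_BYTE(w, n);
}

/* Half word n of w, counting from the least significant half (n in 0..1). */
word nth_hword(word w, unsigned n)
{
  assert(n < 2);
  return WORD_NTH_HWORD(w, n);
}
```

With `0x1122334455667788` as input, byte 0 is `0x88`, byte 7 is `0x11`, and the two half words are `0x55667788` and `0x11223344`.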
77
lib/darr.c
@@ -1,77 +0,0 @@
/* Copyright (C) 2023 Aryadev Chavali

 * You may distribute and modify this code under the terms of the
 * GPLv2 license. You should have received a copy of the GPLv2
 * license with this file. If not, please write to:
 * aryadev@aryadevchavali.com.

 * Created: 2023-10-15
 * Author: Aryadev Chavali
 * Description: Dynamically sized byte array
 */

#include <assert.h>
#include <stdlib.h>
#include <string.h>

#include "./darr.h"

void darr_init(darr_t *darr, size_t size)
{
  if (size == 0)
    size = DARR_DEFAULT_SIZE;
  *darr = (darr_t){
      .data = calloc(size, 1),
      .used = 0,
      .available = size,
  };
}

void darr_ensure_capacity(darr_t *darr, size_t requested)
{
  if (darr->used + requested >= darr->available)
  {
    darr->available =
        MAX(darr->used + requested, darr->available * DARR_REALLOC_MULT);
    darr->data = realloc(darr->data, darr->available);
    memset(darr->data + darr->used, 0, darr->available - darr->used);
  }
}

void darr_append_byte(darr_t *darr, byte byte)
{
  darr_ensure_capacity(darr, 1);
  darr->data[darr->used++] = byte;
}

void darr_append_bytes(darr_t *darr, byte *bytes, size_t n)
{
  darr_ensure_capacity(darr, n);
  memcpy(darr->data + darr->used, bytes, n);
  darr->used += n;
}

byte darr_at(darr_t *darr, size_t index)
{
  if (index >= darr->used)
    // TODO: Error (index is out of bounds)
    return 0;
  return darr->data[index];
}

void darr_write_file(darr_t *bytes, FILE *fp)
{
  size_t size = fwrite(bytes->data, bytes->used, 1, fp);
  assert(size == 1);
}

darr_t darr_read_file(FILE *fp)
{
  darr_t darr = {0};
  fseek(fp, 0, SEEK_END);
  long size = ftell(fp);
  darr_init(&darr, size);
  fseek(fp, 0, SEEK_SET);
  fread(darr.data, size, 1, fp);
  return darr;
}
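The growth rule in `darr_ensure_capacity` (take the larger of the exact requirement and 1.5 times the current capacity) can be sketched with explicit failure handling. This is an illustration under assumptions, not the project's code: `buffer_t`, `buffer_reserve` and `buffer_append` are hypothetical names, the 1.5 factor is computed in integer arithmetic, and unlike the original the buffer only grows when the request does not already fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for darr_t so the sketch compiles on its own. */
typedef struct
{
  uint8_t *data;
  size_t used, available;
} buffer_t;

/* Make room for `requested` more bytes.
 * Returns 0 on success and -1 if the allocation fails, in which case
 * the buffer is left untouched.
 */
int buffer_reserve(buffer_t *buf, size_t requested)
{
  assert(buf != NULL);
  if (buf->used + requested <= buf->available)
    return 0;
  size_t grown = buf->available + buf->available / 2; // 1.5x growth
  size_t needed = buf->used + requested;
  size_t capacity = grown > needed ? grown : needed;
  uint8_t *data = realloc(buf->data, capacity);
  if (!data)
    return -1;
  // Zero the unused tail, as the original does after reallocating
  memset(data + buf->used, 0, capacity - buf->used);
  buf->data = data;
  buf->available = capacity;
  return 0;
}

/* Append n bytes, growing the buffer if required. */
int buffer_append(buffer_t *buf, const uint8_t *bytes, size_t n)
{
  if (n == 0)
    return 0;
  if (buffer_reserve(buf, n) != 0)
    return -1;
  memcpy(buf->data + buf->used, bytes, n);
  buf->used += n;
  return 0;
}
```

Starting from a zero-initialised `buffer_t`, appending 10 bytes allocates exactly 10; appending 2 more grows the capacity to 15 rather than 12, which is the amortising effect of the multiplier.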
88
lib/darr.h
@@ -1,88 +0,0 @@
/* Copyright (C) 2023 Aryadev Chavali

 * You may distribute and modify this code under the terms of the
 * GPLv2 license. You should have received a copy of the GPLv2
 * license with this file. If not, please write to:
 * aryadev@aryadevchavali.com.

 * Created: 2023-10-15
 * Author: Aryadev Chavali
 * Description: Dynamically sized byte array
 */

#ifndef DARR_H
#define DARR_H

#include <stdio.h>
#include <stdlib.h>

#include "./base.h"

/**
 * A dynamically sized buffer of bytes which may be used for a
 * variety of purposes.
 * @prop data: Buffer of bytes (may be reallocated)
 * @prop used: Number of bytes currently used
 * @prop available: Number of bytes currently allocated
 */
typedef struct
{
  byte *data;
  size_t used, available;
} darr_t;

/* Some useful constants for dynamic array work. */
#define DARR_DEFAULT_SIZE 8
#define DARR_REALLOC_MULT 1.5

/** Get the INDth item in a darr, where the buffer of bytes is
 * considered an array of type TYPE.
 * Unsafe operation as safety checks are not done (in particular if
 * the dynamic array has IND items or is big enough to store an
 * element of TYPE) so it is presumed the caller will.
 */
#define DARR_AT(TYPE, DARR_DATA, IND) ((TYPE *)(DARR_DATA))[(IND)]

/** Initialise a dynamic array (darr) with n elements.
 * If n == 0 then initialise with DARR_DEFAULT_SIZE elements.
 */
void darr_init(darr_t *darr, size_t n);

/** Ensure the dynamic array (darr) has at least n elements free.
 * If the dynamic array has less than n elements free it will
 * reallocate.
 */
void darr_ensure_capacity(darr_t *darr, size_t n);

/** Append a byte (b) to the dynamic array (darr).
 * If the dynamic array doesn't have enough space it will reallocate
 * to ensure it can fit it in.
 */
void darr_append_byte(darr_t *darr, byte b);

/** Append an array of n bytes (b) to the dynamic array (darr).
 * If the dynamic array doesn't have enough space to fit all n bytes
 * it will reallocate to ensure it can fit it in.
 */
void darr_append_bytes(darr_t *darr, byte *b, size_t n);

/** Safely get the nth byte of the dynamic array (darr)
 * If the dynamic array has less than n bytes used, it will return 0
 * as a default value.
 */
byte darr_at(darr_t *darr, size_t n);

/** Write the dynamic array (darr) to the file pointer (fp) as a
 * buffer of bytes.
 * Assumes fp is a valid file pointer and in write mode.
 */
void darr_write_file(darr_t *, FILE *);

/** Read a file pointer (fp) in its entirety, converting the bytes
 * into a tightly fitted dynamic array.
 * Say the file pointer is a file of n bytes. Then the dynamic array
 * returned will have available set to n and used set to 0.
 */
darr_t darr_read_file(FILE *);

#endif
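The comment on `DARR_AT` leaves bounds checking to the caller. One way a caller might do that is sketched below: a typed read that verifies the element lies within the used region and copies it out with `memcpy`, which also avoids reading through a possibly misaligned cast pointer. The function name `bytes_read_u32` and its signature are assumptions made for this illustration and do not appear in the project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Treat `data` (holding `used` valid bytes) as an array of 32 bit
 * values in host byte order and copy element `index` into *out.
 * Returns false, leaving *out untouched, if the element would extend
 * past the used region.
 */
bool bytes_read_u32(const uint8_t *data, size_t used, size_t index,
                    uint32_t *out)
{
  assert(data != NULL && out != NULL);
  if (index >= used / sizeof(*out))
    return false;
  memcpy(out, data + index * sizeof(*out), sizeof(*out));
  return true;
}
```

A buffer with 9 used bytes holds two complete 32 bit elements, so indices 0 and 1 succeed and index 2 is rejected.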
101
lib/heap.c
@@ -1,101 +0,0 @@
/* Copyright (C) 2023 Aryadev Chavali

 * You may distribute and modify this code under the terms of the
 * GPLv2 license. You should have received a copy of the GPLv2
 * license with this file. If not, please write to:
 * aryadev@aryadevchavali.com.

 * Created: 2023-11-01
 * Author: Aryadev Chavali
 * Description: Arena allocator
 */

#include "./heap.h"

#include <stdio.h>

page_t *page_create(size_t max, page_t *next)
{
  page_t *page = calloc(1, sizeof(*page) + max);
  page->available = max;
  page->next = next;
  return page;
}

void page_delete(page_t *page)
{
  free(page);
}

void heap_create(heap_t *heap)
{
  heap->beg = heap->end = NULL;
  heap->pages = 0;
}

bool heap_free_page(heap_t *heap, page_t *page)
{
  if (!page || !heap)
    return false;

  if (page == heap->beg)
  {
    heap->beg = heap->beg->next;
    page_delete(page);
    --heap->pages;
    if (heap->pages == 0)
      heap->end = NULL;
    return true;
  }

  page_t *prev = NULL, *next = NULL, *cur = NULL;
  for (cur = heap->beg; cur; cur = cur->next)
  {
    next = cur->next;
    if (cur == page)
      break;
    prev = cur;
  }

  if (!cur)
    // Couldn't find the page
    return false;
  // Page was found
  prev->next = next;
  if (!next)
    // This means page == heap->end
    heap->end = prev;
  page_delete(page);
  --heap->pages;
  if (heap->pages == 0)
    heap->beg = NULL;

  return true;
}

page_t *heap_allocate(heap_t *heap, size_t requested)
{
  page_t *cur = page_create(requested, NULL);
  if (heap->end)
    heap->end->next = cur;
  else
    heap->beg = cur;
  heap->end = cur;
  heap->pages++;
  return cur;
}

void heap_stop(heap_t *heap)
{
  page_t *ptr = heap->beg;
  for (size_t i = 0; i < heap->pages; ++i)
  {
    page_t *cur = ptr;
    page_t *next = ptr->next;
    page_delete(cur);
    ptr = next;
  }
  heap->beg = NULL;
  heap->end = NULL;
  heap->pages = 0;
}
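`heap_free_page` handles the head of the list separately from the rest. An alternative worth comparing is to walk a pointer to the link itself, so the head, middle and tail cases share one code path. The sketch below is an assumption-laden illustration rather than the project's code: `heap_allocate_checked` and `heap_unlink_page` are hypothetical names, the structures are re-declared locally, and the allocation result is checked, which the original does not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local re-declarations mirroring lib/heap.h so the sketch stands alone. */
typedef struct Page
{
  struct Page *next;
  size_t available;
  uint8_t data[]; // flexible array member, sized at allocation time
} page_t;

typedef struct
{
  page_t *beg, *end;
  size_t pages;
} heap_t;

/* Append a zeroed page of `requested` bytes; returns NULL on failure. */
page_t *heap_allocate_checked(heap_t *heap, size_t requested)
{
  assert(heap != NULL);
  page_t *page = calloc(1, sizeof(*page) + requested);
  if (!page)
    return NULL;
  page->available = requested;
  if (heap->end)
    heap->end->next = page;
  else
    heap->beg = page;
  heap->end = page;
  ++heap->pages;
  return page;
}

/* Unlink and free `page`; returns false if it is not in the heap. */
bool heap_unlink_page(heap_t *heap, page_t *page)
{
  if (!heap || !page)
    return false;
  page_t *prev = NULL;
  // `link` points at whichever pointer currently refers to the node
  for (page_t **link = &heap->beg; *link; prev = *link, link = &(*link)->next)
  {
    if (*link != page)
      continue;
    *link = page->next;
    if (heap->end == page)
      heap->end = prev;
    free(page);
    --heap->pages;
    return true;
  }
  return false;
}
```

Because `*link` is `heap->beg` on the first iteration, removing the head updates `beg` through the same assignment that splices out any other node.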
42
lib/heap.h
@@ -1,42 +0,0 @@
/* Copyright (C) 2023 Aryadev Chavali

 * You may distribute and modify this code under the terms of the
 * GPLv2 license. You should have received a copy of the GPLv2
 * license with this file. If not, please write to:
 * aryadev@aryadevchavali.com.

 * Created: 2023-11-01
 * Author: Aryadev Chavali
 * Description: Arena allocator
 */

#ifndef HEAP_H
#define HEAP_H

#include "./base.h"

#include <stdbool.h>
#include <stdlib.h>

typedef struct Page
{
  struct Page *next;
  size_t available;
  byte data[];
} page_t;

page_t *page_create(size_t, page_t *);
void page_delete(page_t *);

typedef struct
{
  page_t *beg, *end;
  size_t pages;
} heap_t;

void heap_create(heap_t *);
bool heap_free_page(heap_t *, page_t *);
page_t *heap_allocate(heap_t *, size_t);
void heap_stop(heap_t *);

#endif
516
lib/inst.c
@@ -1,516 +0,0 @@
/* Copyright (C) 2023 Aryadev Chavali

 * You may distribute and modify this code under the terms of the
 * GPLv2 license. You should have received a copy of the GPLv2
 * license with this file. If not, please write to:
 * aryadev@aryadevchavali.com.

 * Created: 2023-10-15
 * Author: Aryadev Chavali
 * Description: Implementation of bytecode for instructions
 */

#include "./inst.h"

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

const char *opcode_as_cstr(opcode_t code)
{
  switch (code)
  {
  case OP_NOOP:
    return "NOOP";
  case OP_PUSH_BYTE:
    return "PUSH_BYTE";
  case OP_PUSH_WORD:
    return "PUSH_WORD";
  case OP_PUSH_HWORD:
    return "PUSH_HWORD";
  case OP_PUSH_REGISTER_BYTE:
    return "PUSH_REGISTER_BYTE";
  case OP_PUSH_REGISTER_WORD:
    return "PUSH_REGISTER_WORD";
  case OP_PUSH_REGISTER_HWORD:
    return "PUSH_REGISTER_HWORD";
  case OP_POP_BYTE:
    return "POP_BYTE";
  case OP_POP_WORD:
    return "POP_WORD";
  case OP_POP_HWORD:
    return "POP_HWORD";
  case OP_MOV_BYTE:
    return "MOV_BYTE";
  case OP_MOV_WORD:
    return "MOV_WORD";
  case OP_MOV_HWORD:
    return "MOV_HWORD";
  case OP_DUP_BYTE:
    return "DUP_BYTE";
  case OP_DUP_HWORD:
    return "DUP_HWORD";
  case OP_DUP_WORD:
    return "DUP_WORD";
  case OP_MALLOC_BYTE:
    return "MALLOC_BYTE";
  case OP_MALLOC_HWORD:
    return "MALLOC_HWORD";
  case OP_MALLOC_WORD:
    return "MALLOC_WORD";
  case OP_MALLOC_STACK_BYTE:
    return "MALLOC_STACK_BYTE";
  case OP_MALLOC_STACK_HWORD:
    return "MALLOC_STACK_HWORD";
  case OP_MALLOC_STACK_WORD:
    return "MALLOC_STACK_WORD";
  case OP_MSET_BYTE:
    return "MSET_BYTE";
  case OP_MSET_HWORD:
    return "MSET_HWORD";
  case OP_MSET_WORD:
    return "MSET_WORD";
  case OP_MSET_STACK_BYTE:
    return "MSET_STACK_BYTE";
  case OP_MSET_STACK_HWORD:
    return "MSET_STACK_HWORD";
  case OP_MSET_STACK_WORD:
    return "MSET_STACK_WORD";
  case OP_MGET_BYTE:
    return "MGET_BYTE";
  case OP_MGET_HWORD:
    return "MGET_HWORD";
  case OP_MGET_WORD:
    return "MGET_WORD";
  case OP_MGET_STACK_BYTE:
    return "MGET_STACK_BYTE";
  case OP_MGET_STACK_HWORD:
    return "MGET_STACK_HWORD";
  case OP_MGET_STACK_WORD:
    return "MGET_STACK_WORD";
  case OP_MDELETE:
    return "MDELETE";
  case OP_MSIZE:
    return "MSIZE";
  case OP_NOT_BYTE:
    return "NOT_BYTE";
  case OP_NOT_HWORD:
    return "NOT_HWORD";
  case OP_NOT_WORD:
    return "NOT_WORD";
  case OP_OR_BYTE:
    return "OR_BYTE";
  case OP_OR_HWORD:
    return "OR_HWORD";
  case OP_OR_WORD:
    return "OR_WORD";
  case OP_AND_BYTE:
    return "AND_BYTE";
  case OP_AND_HWORD:
    return "AND_HWORD";
  case OP_AND_WORD:
    return "AND_WORD";
  case OP_XOR_BYTE:
    return "XOR_BYTE";
  case OP_XOR_HWORD:
    return "XOR_HWORD";
  case OP_XOR_WORD:
    return "XOR_WORD";
  case OP_EQ_BYTE:
    return "EQ_BYTE";
  case OP_EQ_HWORD:
    return "EQ_HWORD";
  case OP_EQ_WORD:
    return "EQ_WORD";
  case OP_LT_BYTE:
    return "LT_BYTE";
  case OP_LT_CHAR:
    return "LT_CHAR";
  case OP_LT_HWORD:
    return "LT_HWORD";
  case OP_LT_INT:
    return "LT_INT";
  case OP_LT_LONG:
    return "LT_LONG";
  case OP_LT_WORD:
    return "LT_WORD";
  case OP_LTE_BYTE:
    return "LTE_BYTE";
  case OP_LTE_CHAR:
    return "LTE_CHAR";
  case OP_LTE_HWORD:
    return "LTE_HWORD";
  case OP_LTE_INT:
    return "LTE_INT";
  case OP_LTE_LONG:
    return "LTE_LONG";
  case OP_LTE_WORD:
    return "LTE_WORD";
  case OP_GT_BYTE:
    return "GT_BYTE";
  case OP_GT_CHAR:
    return "GT_CHAR";
  case OP_GT_HWORD:
    return "GT_HWORD";
  case OP_GT_INT:
    return "GT_INT";
  case OP_GT_LONG:
    return "GT_LONG";
  case OP_GT_WORD:
    return "GT_WORD";
  case OP_GTE_BYTE:
    return "GTE_BYTE";
  case OP_GTE_CHAR:
    return "GTE_CHAR";
  case OP_GTE_HWORD:
    return "GTE_HWORD";
  case OP_GTE_INT:
    return "GTE_INT";
  case OP_GTE_LONG:
    return "GTE_LONG";
  case OP_GTE_WORD:
    return "GTE_WORD";
  case OP_PLUS_BYTE:
    return "PLUS_BYTE";
  case OP_PLUS_HWORD:
    return "PLUS_HWORD";
  case OP_PLUS_WORD:
    return "PLUS_WORD";
  case OP_SUB_BYTE:
    return "SUB_BYTE";
  case OP_SUB_HWORD:
    return "SUB_HWORD";
  case OP_SUB_WORD:
    return "SUB_WORD";
  case OP_MULT_BYTE:
    return "MULT_BYTE";
  case OP_MULT_HWORD:
    return "MULT_HWORD";
  case OP_MULT_WORD:
    return "MULT_WORD";
  case OP_JUMP_ABS:
    return "JUMP_ABS";
  case OP_JUMP_STACK:
    return "JUMP_STACK";
  case OP_JUMP_IF_BYTE:
    return "JUMP_IF_BYTE";
  case OP_JUMP_IF_HWORD:
    return "JUMP_IF_HWORD";
  case OP_JUMP_IF_WORD:
    return "JUMP_IF_WORD";
  case OP_CALL:
    return "CALL";
  case OP_CALL_STACK:
    return "CALL_STACK";
  case OP_RET:
    return "RET";
  case OP_PRINT_CHAR:
    return "PRINT_CHAR";
  case OP_PRINT_BYTE:
    return "PRINT_BYTE";
  case OP_PRINT_INT:
    return "PRINT_INT";
  case OP_PRINT_HWORD:
    return "PRINT_HWORD";
  case OP_PRINT_LONG:
    return "PRINT_LONG";
  case OP_PRINT_WORD:
    return "PRINT_WORD";
  case OP_HALT:
    return "HALT";
  case NUMBER_OF_OPCODES:
    return "";
  }
  return "";
}
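A hand-written switch like `opcode_as_cstr` has to be kept in step with the opcode enum by hand, which makes copy-paste slips in the returned strings easy to introduce. One common alternative is to generate both the enum and the name table from a single list with an X-macro. The sketch below is only an illustration: `OPCODE_LIST` holds a small made-up subset of opcodes (the real enum lives in lib/inst.h, which is not shown here), and `opcode_from_cstr` is a hypothetical reverse lookup added to show the table being reused.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Single source of truth. Assumption: a reduced opcode list for brevity. */
#define OPCODE_LIST(X) \
  X(NOOP)              \
  X(MDELETE)           \
  X(MSIZE)             \
  X(HALT)

typedef enum
{
#define X(NAME) OP_##NAME,
  OPCODE_LIST(X)
#undef X
  NUMBER_OF_OPCODES
} opcode_t;

/* Name of an opcode, or "" if the value is out of range. */
const char *opcode_as_cstr(opcode_t code)
{
  static const char *const names[NUMBER_OF_OPCODES] = {
#define X(NAME) [OP_##NAME] = #NAME,
      OPCODE_LIST(X)
#undef X
  };
  if ((size_t)code >= (size_t)NUMBER_OF_OPCODES)
    return "";
  return names[code];
}

/* Reverse lookup; returns NUMBER_OF_OPCODES when the name is unknown. */
opcode_t opcode_from_cstr(const char *name)
{
  assert(name != NULL);
  for (size_t i = 0; i < (size_t)NUMBER_OF_OPCODES; ++i)
    if (strcmp(name, opcode_as_cstr((opcode_t)i)) == 0)
      return (opcode_t)i;
  return NUMBER_OF_OPCODES;
}
```

Adding an opcode then means adding one `X(...)` entry; the enum value and its printable name cannot drift apart because both come from the same token.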

void data_print(data_t datum, data_type_t type, FILE *fp)
{
  switch (type)
  {
  case DATA_TYPE_NIL:
    break;
  case DATA_TYPE_BYTE:
    fprintf(fp, "%X", datum.as_byte);
    break;
  case DATA_TYPE_HWORD:
    fprintf(fp, "%X", datum.as_hword);
    break;
  case DATA_TYPE_WORD:
    fprintf(fp, "%lX", datum.as_word);
    break;
  }
}

void inst_print(inst_t instruction, FILE *fp)
{
  static_assert(NUMBER_OF_OPCODES == 98, "inst_print: Out of date");
  fprintf(fp, "%s(", opcode_as_cstr(instruction.opcode));
  if (OPCODE_IS_TYPE(instruction.opcode, OP_PUSH))
  {
    data_type_t type = (data_type_t)instruction.opcode;
    fprintf(fp, "datum=0x");
    data_print(instruction.operand, type, fp);
  }
  else if (OPCODE_IS_TYPE(instruction.opcode, OP_PUSH_REGISTER) ||
           OPCODE_IS_TYPE(instruction.opcode, OP_MOV))
  {
    fprintf(fp, "reg=0x");
    data_print(instruction.operand, DATA_TYPE_BYTE, fp);
  }
  else if (OPCODE_IS_TYPE(instruction.opcode, OP_DUP) ||
           OPCODE_IS_TYPE(instruction.opcode, OP_MALLOC) ||
           OPCODE_IS_TYPE(instruction.opcode, OP_MSET) ||
           OPCODE_IS_TYPE(instruction.opcode, OP_MGET))
  {
    fprintf(fp, "n=%lu", instruction.operand.as_word);
  }
  else if (instruction.opcode == OP_JUMP_ABS ||
           OPCODE_IS_TYPE(instruction.opcode, OP_JUMP_IF) ||
           instruction.opcode == OP_CALL)
  {
    fprintf(fp, "address=0x");
    data_print(instruction.operand, DATA_TYPE_WORD, fp);
  }
  fprintf(fp, ")");
}

size_t inst_bytecode_size(inst_t inst)
{
  static_assert(NUMBER_OF_OPCODES == 98, "inst_bytecode_size: Out of date");
  size_t size = 1; // for opcode
  if (OPCODE_IS_TYPE(inst.opcode, OP_PUSH))
  {
    if (inst.opcode == OP_PUSH_BYTE)
      ++size;
    else if (inst.opcode == OP_PUSH_HWORD)
      size += HWORD_SIZE;
    else if (inst.opcode == OP_PUSH_WORD)
      size += WORD_SIZE;
  }
  else if (OPCODE_IS_TYPE(inst.opcode, OP_PUSH_REGISTER) ||
           OPCODE_IS_TYPE(inst.opcode, OP_MOV) ||
           OPCODE_IS_TYPE(inst.opcode, OP_DUP) ||
           OPCODE_IS_TYPE(inst.opcode, OP_MALLOC) ||
           OPCODE_IS_TYPE(inst.opcode, OP_MSET) ||
           OPCODE_IS_TYPE(inst.opcode, OP_MGET) || inst.opcode == OP_JUMP_ABS ||
           OPCODE_IS_TYPE(inst.opcode, OP_JUMP_IF) || inst.opcode == OP_CALL)
    size += WORD_SIZE;
  return size;
}

void inst_write_bytecode(inst_t inst, darr_t *darr)
{
  static_assert(NUMBER_OF_OPCODES == 98, "inst_write_bytecode: Out of date");
  // Append opcode
  darr_append_byte(darr, inst.opcode);
  // Then append 0 or more operands
  data_type_t to_append = DATA_TYPE_NIL;
  if (OPCODE_IS_TYPE(inst.opcode, OP_PUSH))
    to_append = (data_type_t)inst.opcode;
  else if (OPCODE_IS_TYPE(inst.opcode, OP_PUSH_REGISTER) ||
           OPCODE_IS_TYPE(inst.opcode, OP_MOV) ||
           OPCODE_IS_TYPE(inst.opcode, OP_DUP) ||
           OPCODE_IS_TYPE(inst.opcode, OP_MALLOC) ||
           OPCODE_IS_TYPE(inst.opcode, OP_MSET) ||
           OPCODE_IS_TYPE(inst.opcode, OP_MGET) || inst.opcode == OP_JUMP_ABS ||
           OPCODE_IS_TYPE(inst.opcode, OP_JUMP_IF) || inst.opcode == OP_CALL)
    to_append = DATA_TYPE_WORD;

  switch (to_append)
  {
  case DATA_TYPE_NIL:
    break;
  case DATA_TYPE_BYTE:
    darr_append_byte(darr, inst.operand.as_byte);
    break;
  case DATA_TYPE_HWORD:
    darr_ensure_capacity(darr, HWORD_SIZE);
    convert_hword_to_bytes(inst.operand.as_hword, darr->data + darr->used);
    darr->used += HWORD_SIZE;
    break;
  case DATA_TYPE_WORD:
    darr_ensure_capacity(darr, WORD_SIZE);
    convert_word_to_bytes(inst.operand.as_word, darr->data + darr->used);
    darr->used += WORD_SIZE;
    break;
  }
}

void insts_write_bytecode(inst_t *insts, size_t size, darr_t *darr)
{
  for (size_t i = 0; i < size; ++i)
    inst_write_bytecode(insts[i], darr);
}

data_t read_type_from_darr(darr_t *darr, data_type_t type)
{
  switch (type)
  {
  case DATA_TYPE_NIL:
|
||||
break;
|
||||
case DATA_TYPE_BYTE:
|
||||
if (darr->used >= darr->available)
|
||||
// TODO: Error (darr has no space left)
|
||||
return DBYTE(0);
|
||||
return DBYTE(darr->data[darr->used++]);
|
||||
break;
|
||||
case DATA_TYPE_HWORD:
|
||||
if (darr->used + HWORD_SIZE > darr->available)
|
||||
// TODO: Error (darr has no space left)
|
||||
return DHWORD(0);
|
||||
hword u = convert_bytes_to_hword(darr->data + darr->used);
|
||||
darr->used += HWORD_SIZE;
|
||||
return DHWORD(u);
|
||||
break;
|
||||
case DATA_TYPE_WORD:
|
||||
if (darr->used + WORD_SIZE > darr->available)
|
||||
// TODO: Error (darr has no space left)
|
||||
return DWORD(0);
|
||||
word w = convert_bytes_to_word(darr->data + darr->used);
|
||||
darr->used += WORD_SIZE;
|
||||
return DWORD(w);
|
||||
break;
|
||||
}
|
||||
// TODO: Error (unrecognised type)
|
||||
return DBYTE(0);
|
||||
}
|
||||
|
||||
inst_t inst_read_bytecode(darr_t *darr)
|
||||
{
|
||||
static_assert(NUMBER_OF_OPCODES == 98, "inst_read_bytecode: Out of date");
|
||||
if (darr->used >= darr->available)
|
||||
return (inst_t){0};
|
||||
inst_t inst = {0};
|
||||
opcode_t opcode = darr->data[darr->used++];
|
||||
if (opcode > OP_HALT || opcode == NUMBER_OF_OPCODES || opcode < OP_NOOP)
|
||||
return INST_NOOP;
|
||||
// Read operands
|
||||
if (OPCODE_IS_TYPE(opcode, OP_PUSH))
|
||||
inst.operand = read_type_from_darr(darr, (data_type_t)opcode);
|
||||
// Read a word-sized operand (register, count or address)
|
||||
else if (OPCODE_IS_TYPE(opcode, OP_PUSH_REGISTER) ||
|
||||
OPCODE_IS_TYPE(opcode, OP_MOV) || OPCODE_IS_TYPE(opcode, OP_DUP) ||
|
||||
OPCODE_IS_TYPE(opcode, OP_MALLOC) ||
|
||||
OPCODE_IS_TYPE(opcode, OP_MSET) || OPCODE_IS_TYPE(opcode, OP_MGET) ||
|
||||
opcode == OP_JUMP_ABS || OPCODE_IS_TYPE(opcode, OP_JUMP_IF) ||
|
||||
opcode == OP_CALL)
|
||||
inst.operand = read_type_from_darr(darr, DATA_TYPE_WORD);
|
||||
// Otherwise opcode doesn't take operands
|
||||
|
||||
inst.opcode = opcode;
|
||||
|
||||
return inst;
|
||||
}
|
||||
|
||||
inst_t *insts_read_bytecode(darr_t *bytes, size_t *ret_size)
|
||||
{
|
||||
*ret_size = 0;
|
||||
// NOTE: Here we use the darr as a dynamic array of inst_t.
|
||||
darr_t instructions = {0};
|
||||
darr_init(&instructions, sizeof(inst_t));
|
||||
while (bytes->used < bytes->available)
|
||||
{
|
||||
inst_t instruction = inst_read_bytecode(bytes);
|
||||
darr_append_bytes(&instructions, (byte *)&instruction, sizeof(instruction));
|
||||
}
|
||||
*ret_size = instructions.used / sizeof(inst_t);
|
||||
return (inst_t *)instructions.data;
|
||||
}
|
||||
|
||||
inst_t *insts_read_bytecode_file(FILE *fp, size_t *ret)
|
||||
{
|
||||
darr_t darr = darr_read_file(fp);
|
||||
inst_t *instructions = insts_read_bytecode(&darr, ret);
|
||||
free(darr.data);
|
||||
return instructions;
|
||||
}
|
||||
|
||||
void insts_write_bytecode_file(inst_t *instructions, size_t size, FILE *fp)
|
||||
{
|
||||
darr_t darr = {0};
|
||||
darr_init(&darr, 0);
|
||||
insts_write_bytecode(instructions, size, &darr);
|
||||
darr_write_file(&darr, fp);
|
||||
free(darr.data);
|
||||
}
|
||||
|
||||
void prog_header_write_bytecode(prog_header_t header, darr_t *buffer)
|
||||
{
|
||||
word start = word_htobc(header.start_address);
|
||||
darr_append_bytes(buffer, (byte *)&start, sizeof(start));
|
||||
}
|
||||
|
||||
void prog_write_bytecode(prog_t *program, darr_t *buffer)
|
||||
{
|
||||
// Write program header
|
||||
prog_header_write_bytecode(program->header, buffer);
|
||||
// Write instruction count
|
||||
word pcount = word_htobc(program->count);
|
||||
darr_append_bytes(buffer, (byte *)&pcount, sizeof(pcount));
|
||||
// Write instructions
|
||||
insts_write_bytecode(program->instructions, program->count, buffer);
|
||||
}
|
||||
|
||||
void prog_append_bytecode(prog_t *program, darr_t *buffer)
|
||||
{
|
||||
insts_write_bytecode(program->instructions, program->count, buffer);
|
||||
}
|
||||
|
||||
prog_header_t prog_header_read_bytecode(darr_t *buffer)
|
||||
{
|
||||
prog_header_t header = {0};
|
||||
header.start_address = convert_bytes_to_word(buffer->data + buffer->used);
|
||||
buffer->used += sizeof(header.start_address);
|
||||
return header;
|
||||
}
|
||||
|
||||
prog_t *prog_read_bytecode(darr_t *buffer)
|
||||
{
|
||||
// TODO: Error (not enough space for program header)
|
||||
if ((buffer->available - buffer->used) < sizeof(prog_header_t))
|
||||
return NULL;
|
||||
// Read program header
|
||||
prog_header_t header = prog_header_read_bytecode(buffer);
|
||||
// TODO: Error (not enough space for program instruction count)
|
||||
if ((buffer->available - buffer->used) < WORD_SIZE)
|
||||
return NULL;
|
||||
|
||||
// Read instruction count
|
||||
word count = convert_bytes_to_word(buffer->data + buffer->used);
|
||||
buffer->used += sizeof(count);
|
||||
|
||||
prog_t *program = malloc(sizeof(*program) + (sizeof(inst_t) * count));
|
||||
size_t i;
|
||||
for (i = 0; i < count && (buffer->used < buffer->available); ++i)
|
||||
program->instructions[i] = inst_read_bytecode(buffer);
|
||||
|
||||
// TODO: Error (Expected more instructions)
|
||||
if (i < count)
|
||||
{
|
||||
free(program);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
program->header = header;
|
||||
program->count = count;
|
||||
|
||||
return program;
|
||||
}
|
||||
|
||||
void prog_write_file(prog_t *program, FILE *fp)
|
||||
{
|
||||
darr_t bytecode = {0};
|
||||
prog_write_bytecode(program, &bytecode);
|
||||
fwrite(bytecode.data, bytecode.used, 1, fp);
|
||||
free(bytecode.data);
|
||||
}
|
||||
|
||||
prog_t *prog_read_file(FILE *fp)
|
||||
{
|
||||
darr_t buffer = darr_read_file(fp);
|
||||
prog_t *p = prog_read_bytecode(&buffer);
|
||||
free(buffer.data);
|
||||
return p;
|
||||
}
|
||||
108
lib/inst.h
@@ -1,108 +0,0 @@
|
||||
/* Copyright (C) 2023 Aryadev Chavali
|
||||
|
||||
* You may distribute and modify this code under the terms of the
|
||||
* GPLv2 license. You should have received a copy of the GPLv2
|
||||
* license with this file. If not, please write to:
|
||||
* aryadev@aryadevchavali.com.
|
||||
|
||||
* Created: 2023-10-15
|
||||
* Author: Aryadev Chavali
|
||||
* Description: Instructions and opcodes
|
||||
*/
|
||||
|
||||
#ifndef INST_H
|
||||
#define INST_H
|
||||
|
||||
#include <lib/darr.h>
|
||||
#include <lib/prog.h>
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
|
||||
const char *opcode_as_cstr(opcode_t);
|
||||
|
||||
#define OPCODE_IS_TYPE(OPCODE, OP_TYPE) \
|
||||
(((OPCODE) >= OP_TYPE##_BYTE) && ((OPCODE) <= OP_TYPE##_WORD))
|
||||
|
||||
#define OPCODE_DATA_TYPE(OPCODE, OP_TYPE) \
|
||||
((OPCODE) == OP_TYPE##_BYTE ? DATA_TYPE_BYTE \
|
||||
: ((OPCODE) == OP_TYPE##_HWORD) ? DATA_TYPE_HWORD \
|
||||
: DATA_TYPE_WORD)
|
||||
|
||||
void inst_print(inst_t, FILE *);
|
||||
|
||||
size_t inst_bytecode_size(inst_t);
|
||||
void inst_write_bytecode(inst_t, darr_t *);
|
||||
void insts_write_bytecode(inst_t *, size_t, darr_t *);
|
||||
// Here the dynamic array is a preloaded buffer of bytes, where
|
||||
// darr.available is the number of overall bytes and used is the
|
||||
// cursor (where we are in the buffer).
|
||||
inst_t inst_read_bytecode(darr_t *);
|
||||
inst_t *insts_read_bytecode(darr_t *, size_t *);
|
||||
|
||||
void insts_write_bytecode_file(inst_t *, size_t, FILE *);
|
||||
inst_t *insts_read_bytecode_file(FILE *, size_t *);
|
||||
|
||||
// Write the entire program as bytecode
|
||||
void prog_write_bytecode(prog_t *, darr_t *);
|
||||
// Only append the instructions as bytecode
|
||||
void prog_append_bytecode(prog_t *, darr_t *);
|
||||
// Read an entire program as bytecode
|
||||
prog_t *prog_read_bytecode(darr_t *);
|
||||
|
||||
void prog_write_file(prog_t *, FILE *);
|
||||
prog_t *prog_read_file(FILE *);
|
||||
|
||||
#define INST_NOOP ((inst_t){0})
|
||||
#define INST_HALT ((inst_t){.opcode = OP_HALT})
|
||||
|
||||
#define INST_PUSH(TYPE, OP) \
|
||||
((inst_t){.opcode = OP_PUSH_##TYPE, .operand = D##TYPE(OP)})
|
||||
|
||||
#define INST_MOV(TYPE, OP) \
|
||||
((inst_t){.opcode = OP_MOV_##TYPE, .operand = D##TYPE(OP)})
|
||||
|
||||
#define INST_POP(TYPE) ((inst_t){.opcode = OP_POP_##TYPE})
|
||||
|
||||
#define INST_PUSH_REG(TYPE, REG) \
|
||||
((inst_t){.opcode = OP_PUSH_REGISTER_##TYPE, .operand = D##TYPE(REG)})
|
||||
|
||||
#define INST_DUP(TYPE, OP) \
|
||||
((inst_t){.opcode = OP_DUP_##TYPE, .operand = DWORD(OP)})
|
||||
|
||||
#define INST_MALLOC(TYPE, OP) \
|
||||
((inst_t){.opcode = OP_MALLOC_##TYPE, .operand = DWORD(OP)})
|
||||
#define INST_MALLOC_STACK(TYPE) ((inst_t){.opcode = OP_MALLOC_STACK_##TYPE})
|
||||
#define INST_MSET(TYPE, OP) \
|
||||
((inst_t){.opcode = OP_MSET_##TYPE, .operand = DWORD(OP)})
|
||||
#define INST_MSET_STACK(TYPE) ((inst_t){.opcode = OP_MSET_STACK_##TYPE})
|
||||
#define INST_MGET(TYPE, OP) \
|
||||
((inst_t){.opcode = OP_MGET_##TYPE, .operand = DWORD(OP)})
|
||||
#define INST_MGET_STACK(TYPE) ((inst_t){.opcode = OP_MGET_STACK_##TYPE})
|
||||
#define INST_MDELETE ((inst_t){.opcode = OP_MDELETE})
|
||||
#define INST_MSIZE ((inst_t){.opcode = OP_MSIZE})
|
||||
|
||||
#define INST_NOT(TYPE) ((inst_t){.opcode = OP_NOT_##TYPE})
|
||||
#define INST_OR(TYPE) ((inst_t){.opcode = OP_OR_##TYPE})
|
||||
#define INST_AND(TYPE) ((inst_t){.opcode = OP_AND_##TYPE})
|
||||
#define INST_XOR(TYPE) ((inst_t){.opcode = OP_XOR_##TYPE})
|
||||
#define INST_EQ(TYPE) ((inst_t){.opcode = OP_EQ_##TYPE})
|
||||
#define INST_LT(TYPE) ((inst_t){.opcode = OP_LT_##TYPE})
|
||||
#define INST_LTE(TYPE) ((inst_t){.opcode = OP_LTE_##TYPE})
|
||||
#define INST_GT(TYPE) ((inst_t){.opcode = OP_GT_##TYPE})
|
||||
#define INST_GTE(TYPE) ((inst_t){.opcode = OP_GTE_##TYPE})
|
||||
#define INST_PLUS(TYPE) ((inst_t){.opcode = OP_PLUS_##TYPE})
|
||||
#define INST_SUB(TYPE) ((inst_t){.opcode = OP_SUB_##TYPE})
|
||||
#define INST_MULT(TYPE) ((inst_t){.opcode = OP_MULT_##TYPE})
|
||||
|
||||
#define INST_JUMP_ABS(OP) \
|
||||
((inst_t){.opcode = OP_JUMP_ABS, .operand = DWORD(OP)})
|
||||
#define INST_JUMP_STACK ((inst_t){.opcode = OP_JUMP_STACK})
|
||||
#define INST_JUMP_IF(TYPE, OP) \
|
||||
((inst_t){.opcode = OP_JUMP_IF_##TYPE, .operand = DWORD(OP)})
|
||||
#define INST_CALL(OP) ((inst_t){.opcode = OP_CALL, .operand = DWORD(OP)})
|
||||
#define INST_CALL_STACK ((inst_t){.opcode = OP_CALL_STACK})
|
||||
#define INST_RET ((inst_t){.opcode = OP_RET})
|
||||
|
||||
#define INST_PRINT(TYPE) ((inst_t){.opcode = OP_PRINT_##TYPE})
|
||||
#endif
|
||||
176
lib/prog.h
@@ -1,176 +0,0 @@
|
||||
/* Copyright (C) 2024 Aryadev Chavali
|
||||
|
||||
* You may distribute and modify this code under the terms of the
|
||||
* GPLv2 license. You should have received a copy of the GPLv2
|
||||
* license with this file. If not, please write to:
|
||||
* aryadev@aryadevchavali.com.
|
||||
|
||||
* Created: 2024-04-14
|
||||
* Author: Aryadev Chavali
|
||||
* Description: Structures for both instructions and programs for the
|
||||
* virtual machine
|
||||
*/
|
||||
|
||||
#ifndef PROG_H
|
||||
#define PROG_H
|
||||
|
||||
#include <lib/base.h>
|
||||
|
||||
typedef enum
|
||||
{
|
||||
OP_NOOP = 0,
|
||||
|
||||
// Dealing with data and registers
|
||||
OP_PUSH_BYTE,
|
||||
OP_PUSH_HWORD,
|
||||
OP_PUSH_WORD,
|
||||
|
||||
OP_POP_BYTE,
|
||||
OP_POP_HWORD,
|
||||
OP_POP_WORD,
|
||||
|
||||
OP_PUSH_REGISTER_BYTE,
|
||||
OP_PUSH_REGISTER_HWORD,
|
||||
OP_PUSH_REGISTER_WORD,
|
||||
|
||||
OP_MOV_BYTE,
|
||||
OP_MOV_HWORD,
|
||||
OP_MOV_WORD,
|
||||
|
||||
OP_DUP_BYTE,
|
||||
OP_DUP_HWORD,
|
||||
OP_DUP_WORD,
|
||||
|
||||
// Dealing with the heap
|
||||
OP_MALLOC_BYTE,
|
||||
OP_MALLOC_HWORD,
|
||||
OP_MALLOC_WORD,
|
||||
|
||||
OP_MALLOC_STACK_BYTE,
|
||||
OP_MALLOC_STACK_HWORD,
|
||||
OP_MALLOC_STACK_WORD,
|
||||
|
||||
OP_MSET_BYTE,
|
||||
OP_MSET_HWORD,
|
||||
OP_MSET_WORD,
|
||||
|
||||
OP_MSET_STACK_BYTE,
|
||||
OP_MSET_STACK_HWORD,
|
||||
OP_MSET_STACK_WORD,
|
||||
|
||||
OP_MGET_BYTE,
|
||||
OP_MGET_HWORD,
|
||||
OP_MGET_WORD,
|
||||
|
||||
OP_MGET_STACK_BYTE,
|
||||
OP_MGET_STACK_HWORD,
|
||||
OP_MGET_STACK_WORD,
|
||||
|
||||
OP_MDELETE,
|
||||
OP_MSIZE,
|
||||
|
||||
// Boolean operations
|
||||
OP_NOT_BYTE,
|
||||
OP_NOT_HWORD,
|
||||
OP_NOT_WORD,
|
||||
|
||||
OP_OR_BYTE,
|
||||
OP_OR_HWORD,
|
||||
OP_OR_WORD,
|
||||
|
||||
OP_AND_BYTE,
|
||||
OP_AND_HWORD,
|
||||
OP_AND_WORD,
|
||||
|
||||
OP_XOR_BYTE,
|
||||
OP_XOR_HWORD,
|
||||
OP_XOR_WORD,
|
||||
|
||||
OP_EQ_BYTE,
|
||||
OP_EQ_HWORD,
|
||||
OP_EQ_WORD,
|
||||
|
||||
// Mathematical operations
|
||||
OP_LT_BYTE,
|
||||
OP_LT_CHAR,
|
||||
OP_LT_HWORD,
|
||||
OP_LT_INT,
|
||||
OP_LT_LONG,
|
||||
OP_LT_WORD,
|
||||
|
||||
OP_LTE_BYTE,
|
||||
OP_LTE_CHAR,
|
||||
OP_LTE_HWORD,
|
||||
OP_LTE_INT,
|
||||
OP_LTE_LONG,
|
||||
OP_LTE_WORD,
|
||||
|
||||
OP_GT_BYTE,
|
||||
OP_GT_CHAR,
|
||||
OP_GT_HWORD,
|
||||
OP_GT_INT,
|
||||
OP_GT_LONG,
|
||||
OP_GT_WORD,
|
||||
|
||||
OP_GTE_BYTE,
|
||||
OP_GTE_CHAR,
|
||||
OP_GTE_HWORD,
|
||||
OP_GTE_INT,
|
||||
OP_GTE_LONG,
|
||||
OP_GTE_WORD,
|
||||
|
||||
OP_PLUS_BYTE,
|
||||
OP_PLUS_HWORD,
|
||||
OP_PLUS_WORD,
|
||||
|
||||
OP_SUB_BYTE,
|
||||
OP_SUB_HWORD,
|
||||
OP_SUB_WORD,
|
||||
|
||||
OP_MULT_BYTE,
|
||||
OP_MULT_HWORD,
|
||||
OP_MULT_WORD,
|
||||
|
||||
// Simple I/O
|
||||
OP_PRINT_BYTE,
|
||||
OP_PRINT_CHAR,
|
||||
OP_PRINT_HWORD,
|
||||
OP_PRINT_INT,
|
||||
OP_PRINT_LONG,
|
||||
OP_PRINT_WORD,
|
||||
|
||||
// Program control flow
|
||||
OP_JUMP_ABS,
|
||||
OP_JUMP_STACK,
|
||||
OP_JUMP_IF_BYTE,
|
||||
OP_JUMP_IF_HWORD,
|
||||
OP_JUMP_IF_WORD,
|
||||
// Subroutines
|
||||
OP_CALL,
|
||||
OP_CALL_STACK,
|
||||
OP_RET,
|
||||
|
||||
// Should not be an opcode
|
||||
NUMBER_OF_OPCODES,
|
||||
OP_HALT = 0b11111111, // top of the byte is a HALT
|
||||
} opcode_t;
|
||||
|
||||
typedef struct
|
||||
{
|
||||
opcode_t opcode;
|
||||
data_t operand;
|
||||
} inst_t;
|
||||
|
||||
typedef struct
|
||||
{
|
||||
word start_address;
|
||||
} prog_header_t;
|
||||
|
||||
typedef struct
|
||||
{
|
||||
prog_header_t header;
|
||||
word count;
|
||||
inst_t instructions[];
|
||||
} prog_t;
|
||||
|
||||
#endif
|
||||
88
spec.org
@@ -1,88 +0,0 @@
|
||||
#+title: VM Specification
|
||||
#+author: Aryadev Chavali
|
||||
#+description: A specification of instructions for the virtual machine
|
||||
#+date: 2023-11-02
|
||||
|
||||
* WIP Data types
|
||||
There are 3 main data types of the virtual machine.  They are all
unsigned.  There exist signed versions of these data types, though
there is no difference (internally) between them.  For an unsigned
type <T> the signed version is simply S_<T>.
|-------+------|
| Name  | Bits |
|-------+------|
| Byte  |    8 |
| HWord |   32 |
| Word  |   64 |
|-------+------|

Generally, the abbreviations B, H and W are used for Byte, HWord and
Word respectively.  The following table shows a comparison between the
data types where an entry (row and column) $A\times{B}$ refers to "How
many of A can I fit in B".
|-------+------+-------+------|
|       | Byte | HWord | Word |
|-------+------+-------+------|
| Byte  |    1 |     4 |    8 |
| HWord |  1/4 |     1 |    2 |
| Word  |  1/8 |   1/2 |    1 |
|-------+------+-------+------|
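As a rough illustration of the sizes above, the three types might be
declared in C along these lines.  The fixed-width definitions, the
lower-case =s_= names for the signed versions and the helper =fits_in=
are assumptions made for this sketch; the specification itself only
fixes the bit widths.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed definitions: the specification only fixes the bit widths. */
typedef uint8_t byte;   /* 8 bits */
typedef uint32_t hword; /* 32 bits */
typedef uint64_t word;  /* 64 bits */

/* Signed counterparts: same width, same underlying bits. */
typedef int8_t s_byte;
typedef int32_t s_hword;
typedef int64_t s_word;

#define BYTE_SIZE sizeof(byte)
#define HWORD_SIZE sizeof(hword)
#define WORD_SIZE sizeof(word)

/* Hypothetical helper: how many whole values of size `a` fit in a
   value of size `b` (the integer entries of the comparison table). */
static size_t fits_in(size_t a, size_t b)
{
  return b / a;
}
```

For example, =fits_in(BYTE_SIZE, WORD_SIZE)= corresponds to the
Byte row, Word column entry of the table.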
* WIP Instructions
|
||||
An instruction for the virtual machine is composed of an *opcode* and,
potentially, an *operand*.  The /opcode/ represents the behaviour of
the instruction i.e. what _is_ the instruction.  The /operand/ is an
element of one of the /data types/ described previously.

Some instructions have /operands/ while others do not.  Instructions
without an operand are called *UNIT* instructions, while those which
take an operand are called *MULTI* instructions[fn:1].

All /opcodes/ (with very few exceptions[fn:2]) have two components:
the *root* and the *type specifier*.  The /root/ represents the
general behaviour of the instruction: ~PUSH~, ~POP~, ~MOV~, etc.  The
/type specifier/ specifies what /data type/ it manipulates.  A
complete opcode will be a combination of these two e.g. ~PUSH_BYTE~,
~POP_WORD~, etc.  Some /opcodes/ may have more /type specifiers/ than
others.
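The root and type specifier split can be sketched with a reduced
opcode set.  The two macros follow the ones declared in =lib/inst.h=;
the cut-down enumerations (and the numbering of =data_type_t=) are
assumptions for illustration, not the full definitions.

```c
/* Reduced, illustrative opcode set: the real enumeration is larger. */
typedef enum
{
  OP_NOOP = 0,
  OP_PUSH_BYTE,
  OP_PUSH_HWORD,
  OP_PUSH_WORD,
  OP_POP_BYTE,
  OP_POP_HWORD,
  OP_POP_WORD,
} opcode_t;

/* Assumed numbering of the data types. */
typedef enum
{
  DATA_TYPE_NIL = 0,
  DATA_TYPE_BYTE,
  DATA_TYPE_HWORD,
  DATA_TYPE_WORD,
} data_type_t;

/* True when OPCODE belongs to the family named by the root OP_TYPE.
   Relies on each family being laid out contiguously, BYTE to WORD. */
#define OPCODE_IS_TYPE(OPCODE, OP_TYPE) \
  (((OPCODE) >= OP_TYPE##_BYTE) && ((OPCODE) <= OP_TYPE##_WORD))

/* Recover the type specifier of an opcode known to be in OP_TYPE. */
#define OPCODE_DATA_TYPE(OPCODE, OP_TYPE)          \
  ((OPCODE) == OP_TYPE##_BYTE    ? DATA_TYPE_BYTE  \
   : (OPCODE) == OP_TYPE##_HWORD ? DATA_TYPE_HWORD \
                                 : DATA_TYPE_WORD)
```

Here =OPCODE_IS_TYPE(op, OP_PUSH)= tests the root, and
=OPCODE_DATA_TYPE(op, OP_PUSH)= then yields the type specifier.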
* TODO Bytecode format
|
||||
Bytecode files are byte sequences which encode instructions for the
virtual machine.  Any instruction (even with an operand) has one and
only one byte sequence associated with it.
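A minimal sketch of what such an encoding could look like: one opcode
byte followed by the operand bytes, if any.  The opcode values follow
the enumeration order in =lib/prog.h=, but the little-endian operand
order and the function names here are assumptions for illustration;
the actual byte order is whatever the base library's conversion
routines use.

```c
#include <stddef.h>
#include <stdint.h>

/* Values taken from the enumeration order in lib/prog.h. */
enum { OP_PUSH_WORD = 3, OP_POP_WORD = 6 };

/* Write a 64-bit operand in little-endian order (assumed convention). */
static void write_word_le(uint64_t w, uint8_t *out)
{
  for (size_t i = 0; i < 8; ++i)
    out[i] = (uint8_t)(w >> (8 * i));
}

/* PUSH_WORD: 1 opcode byte followed by 8 operand bytes.
   Returns the number of bytes written. */
static size_t encode_push_word(uint64_t operand, uint8_t *out)
{
  out[0] = OP_PUSH_WORD;
  write_word_le(operand, out + 1);
  return 9;
}

/* POP_WORD: no operand, so the opcode byte is the whole encoding. */
static size_t encode_pop_word(uint8_t *out)
{
  out[0] = OP_POP_WORD;
  return 1;
}
```

Because the encoding is a pure function of the opcode and operand, two
equal instructions always produce the same bytes.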
* TODO Storage
|
||||
Two types of storage:
+ Data stack which all core VM routines manipulate and work on (FILO)
  + ~DS~ in shorthand, with indexing from 0 (referring to the top of the
    stack) up to n (referring to the bottom of the stack).  B(DS)
    refers to the bytes in the stack (the default).
+ Register space which is generally reserved for user space code
  i.e. other than ~mov~ no other core VM routine manipulates the
  registers
  + ~R~ in shorthand, with indexing from 0 to $\infty$.
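The ~DS~ indexing convention (0 is the top of the stack) can be
sketched as below.  The =data_stack_t= type, its fixed capacity and the
=ds_*= function names are hypothetical and only serve to show the
indexing; they are not the runtime's actual stack.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-capacity byte stack. */
typedef struct
{
  uint8_t data[256];
  size_t ptr; /* number of bytes currently on the stack */
} data_stack_t;

/* Push one byte; returns 0 on success, -1 on overflow. */
static int ds_push(data_stack_t *ds, uint8_t b)
{
  if (ds->ptr >= sizeof(ds->data))
    return -1;
  ds->data[ds->ptr++] = b;
  return 0;
}

/* Pop the top byte into *out; returns 0 on success, -1 on underflow. */
static int ds_pop(data_stack_t *ds, uint8_t *out)
{
  if (ds->ptr == 0)
    return -1;
  *out = ds->data[--ds->ptr];
  return 0;
}

/* B(DS)[n]: the n-th byte counted from the top (n = 0 is the top).
   The caller must ensure n < ds->ptr. */
static uint8_t ds_nth(const data_stack_t *ds, size_t n)
{
  return ds->data[ds->ptr - 1 - n];
}
```

After pushing 1, 2 and 3 in that order, B(DS)[0] is 3 and B(DS)[2]
is 1.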
* TODO Standard library
|
||||
Standard library subroutines reserve the first 16 words (128 bytes) of
register space (W(R)[0] to W(R)[15]).  The first 8 words (W(R)[0] to
W(R)[7]) are generally considered "arguments" to the subroutine while
the remaining 8 words (W(R)[8] to W(R)[15]) are considered additional
space that the subroutine may access and mutate for internal purposes.

The stack may have additional bytes pushed, which act as the "return
value" of the subroutine, but no bytes will be popped off (*Stack
Preservation*).

If a subroutine requires more than 8 words for its arguments, then it
will use the stack.  This is the only case where bytes are popped off
the stack due to a subroutine call, as those arguments will always be
popped.

Subroutines must always end in ~RET~.  Therefore, they must always be
called via ~CALL~, never by ~JUMP~ (which will always cause erroneous
behaviour).
* Footnotes
|
||||
[fn:2] ~NOOP~, ~HALT~, ~MDELETE~, ~MSIZE~, ~JUMP_*~

[fn:1] /UNIT/ refers to the fact that the internal representation of
these instructions is singular: two instances of the same /UNIT/
instruction will be identical in terms of their binary.  On the other
hand, two instances of the same /MULTI/ instruction may not be
equivalent due to the operand they take.  Crucially, most if not all
/MULTI/ instructions have different versions for each /data type/.
87
todo.org
@@ -5,17 +5,9 @@
|
||||
|
||||
* TODO Better documentation [0%] :DOC:
|
||||
** TODO Comment coverage [0%]
|
||||
*** WIP Lib [50%]
|
||||
**** DONE lib/base.h
|
||||
**** DONE lib/darr.h
|
||||
**** TODO lib/heap.h
|
||||
**** TODO lib/inst.h
|
||||
*** TODO ASM [0%]
|
||||
**** TODO asm/lexer.h
|
||||
**** TODO asm/parser.h
|
||||
*** TODO VM [0%]
|
||||
**** TODO vm/runtime.h
|
||||
** TODO Specification
|
||||
* TODO Preprocessing directives :ASM:
|
||||
Like in FASM or NASM where we can give certain helpful instructions to
|
||||
the assembler. I'd use the ~%~ symbol to designate preprocessor
|
||||
@@ -200,85 +192,6 @@ process_const(V: Vector[Unit]) ->
|
||||
v = v_x[0]
|
||||
for v_x in V]
|
||||
#+end_src
|
||||
* TODO Introduce error handling in base library :LIB:
|
||||
There is a large variety of TODOs about errors. Let's fix them!
|
||||
8 TODOs currently present.
|
||||
* TODO Standard library :ASM:VM:
|
||||
I should start considering this and how a user may use it. Should it
|
||||
be an option in the VM and/or assembler binaries (i.e. a flag) or
|
||||
something the user has to specify in their source files?
|
||||
|
||||
Something to consider is /static/ and /dynamic/ "linking" i.e.:
|
||||
+ Static linking: assembler inserts all used library definitions into
|
||||
the bytecode output directly
|
||||
+ We could insert all of it at the start of the bytecode file, and
|
||||
with [[*Start points][Start points]] this won't interfere with
|
||||
user code
|
||||
+ 2023-11-03: Finishing the Start point feature has made these
|
||||
features more tenable. A program header which is compiled and
|
||||
interpreted in bytecode works wonders.
|
||||
+ Furthermore library code will have fixed program addresses (always
|
||||
at the start) so we'll know at start of assembler runtime where to
|
||||
resolve standard library subroutine calls
|
||||
+ Virtual machine needs no changes to do this
|
||||
** TODO Consider dynamic Linking
|
||||
+ Dynamic linking: virtual machine has fixed program storage for
|
||||
library code (a ROM), and assembler makes jump references
|
||||
specifically for this program storage
|
||||
+ When assembling subroutine calls, just need to put references to
|
||||
this library storage (some kind of shared state between VM and
|
||||
assembler to know what these references are)
|
||||
+ VM needs to manage a ROM of some kind for library code
|
||||
+ How do we ensure assembled links to subroutine calls don't
|
||||
conflict with user code jumps?
|
||||
|
||||
What follows is a possible dynamic linking strategy. It requires
|
||||
quite a few moving parts:
|
||||
|
||||
The address operand of every program control instruction (~CALL~,
|
||||
~JUMP~, ~JUMP.IF~) has a specific encoding if the standard library is
|
||||
dynamically linked:
|
||||
+ If the most significant bit is 0, the remaining 63 bits encode an
|
||||
absolute address within the program
|
||||
+ Otherwise, the address encodes a standard library subroutine. The
|
||||
bits within the address follow this schema:
|
||||
+ The next 30 bits represent the specific module where the
|
||||
subroutine is defined (over 1.07 *billion* possible library values)
|
||||
+ The remaining 33 bits (4 bytes + 1 bit) encode the absolute
|
||||
program address in the bytecode of that specific module for the
|
||||
start of the subroutine (over 8.58 *billion* values)
|
||||
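To make the bit layout above concrete, here is a sketch of how such an
address operand could be packed and unpacked.  Nothing like this exists
in the code base yet; the =link_address_t= type and the =link_*=
function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of a control-flow address operand. */
typedef struct
{
  bool is_library;  /* most significant bit */
  uint32_t module;  /* next 30 bits (library addresses only) */
  uint64_t address; /* low 33 bits, or low 63 bits for program code */
} link_address_t;

#define LIBRARY_FLAG (UINT64_C(1) << 63)
#define MODULE_BITS 30
#define OFFSET_BITS 33
#define MODULE_MASK ((UINT64_C(1) << MODULE_BITS) - 1)
#define OFFSET_MASK ((UINT64_C(1) << OFFSET_BITS) - 1)

/* Pack a module number and an address within that module. */
static uint64_t link_encode_library(uint32_t module, uint64_t address)
{
  return LIBRARY_FLAG | (((uint64_t)module & MODULE_MASK) << OFFSET_BITS) |
         (address & OFFSET_MASK);
}

/* Split an operand back into its components. */
static link_address_t link_decode(uint64_t operand)
{
  link_address_t out = {0};
  out.is_library = (operand & LIBRARY_FLAG) != 0;
  if (out.is_library)
  {
    out.module = (uint32_t)((operand >> OFFSET_BITS) & MODULE_MASK);
    out.address = operand & OFFSET_MASK;
  }
  else
    out.address = operand & ~LIBRARY_FLAG;
  return out;
}
```

With this layout an ordinary program address below 2^63 decodes to
itself, so existing bytecode would not need to change.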
|
||||
The assembler will automatically encode this based on "%USE" calls and
|
||||
the name of the subroutines called. On the virtual machine, there is
|
||||
a storage location (similar to the ROM of real machines) which stores
|
||||
the bytecode for modules of the standard library, indexed by the
|
||||
module number. This means, on deserialising the address into the
|
||||
proper components, the VM can refer to the module bytecode then jump
|
||||
to the correct address.
|
||||
|
||||
2023-11-09: I'll need a way to run library code in the current program
|
||||
system in the runtime. It currently doesn't support jumps or work in
|
||||
programs outside of the main one unfortunately. Any proper work done
|
||||
in this area requires some proper refactoring.
|
||||
|
||||
2023-11-09: Constants or inline macros need to be reconfigured for
|
||||
this to work: at parse time, we work out the inlines directly which
|
||||
means compiling bytecode with "standard library" macros will not work
|
||||
as they won't be in the token stream. Either we don't allow
|
||||
preprocessor work in the standard library at all (which is bad cos we
|
||||
can't then set standard limits or other useful things) or we insert
|
||||
them into the registries at parse time for use in program parsing
|
||||
(which not only requires assembler refactoring to figure out what
|
||||
libraries are used (to pull definitions from) but also requires making
|
||||
macros "recognisable" in bytecode because they're essentially
|
||||
invisible).
|
||||
|
||||
2024-04-15: Perhaps we could insert the linking information into the
|
||||
program header?
|
||||
1) A table which states the load order of certain modules would allow
|
||||
the runtime to selectively spin up and properly delegate module
|
||||
jumps to the right bytecode
|
||||
2)
|
||||
* Completed
|
||||
** DONE Write a label/jump system :ASM:
|
||||
Essentially a user should be able to write arbitrary labels (maybe
|
||||
|
||||
92
vm/main.c
@@ -1,92 +0,0 @@
|
||||
/* Copyright (C) 2023 Aryadev Chavali
|
||||
|
||||
* You may distribute and modify this code under the terms of the
|
||||
* GPLv2 license. You should have received a copy of the GPLv2
|
||||
* license with this file. If not, please write to:
|
||||
* aryadev@aryadevchavali.com.
|
||||
|
||||
* Created: 2023-10-15
|
||||
* Author: Aryadev Chavali
|
||||
* Description: Entrypoint to program
|
||||
*/
|
||||
|
||||
#include <assert.h>
|
||||
#include <stdio.h>
|
||||
#include <string.h>
|
||||
|
||||
#include "./runtime.h"
|
||||
#include <lib/inst.h>
|
||||
|
||||
void usage(const char *program_name, FILE *out)
|
||||
{
|
||||
fprintf(out,
|
||||
"Usage: %s [OPTIONS] FILE\n"
|
||||
"\t FILE: Bytecode file to execute\n"
|
||||
"\tOptions:\n"
|
||||
"\t\t To be developed...\n",
|
||||
program_name);
|
||||
}
|
||||
|
||||
int main(int argc, char *argv[])
|
||||
{
|
||||
if (argc == 1)
|
||||
{
|
||||
usage(argv[0], stderr);
|
||||
return 1;
|
||||
}
|
||||
const char *filename = argv[1];
|
||||
|
||||
#if VERBOSE >= 1
|
||||
printf("[" TERM_YELLOW "INTERPRETER" TERM_RESET "]: `%s`\n", filename);
|
||||
#endif
|
||||
|
||||
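FILE *fp = fopen(filename, "rb");
if (!fp)
{
  fprintf(stderr, "[ERROR]: Could not open `%s`\n", filename);
  return 1;
}
prog_t *program = prog_read_file(fp);
fclose(fp);
if (!program)
{
  fprintf(stderr, "[ERROR]: Could not read a program from `%s`\n", filename);
  return 1;
}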
|
||||
#if VERBOSE >= 1
|
||||
printf("\t[" TERM_GREEN "SETUP" TERM_RESET "]: Read %lu instructions\n",
|
||||
program->count);
|
||||
#endif
|
||||
|
||||
size_t stack_size = 256;
|
||||
byte *stack = calloc(stack_size, 1);
|
||||
registers_t registers = {0};
|
||||
darr_init(&registers, 8 * WORD_SIZE);
|
||||
heap_t heap = {0};
|
||||
heap_create(&heap);
|
||||
size_t call_stack_size = 256;
|
||||
word *call_stack = calloc(call_stack_size, sizeof(*call_stack));
|
||||
|
||||
vm_t vm = {0};
|
||||
vm_load_stack(&vm, stack, stack_size);
|
||||
vm_load_program(&vm, program);
|
||||
vm_load_registers(&vm, registers);
|
||||
vm_load_heap(&vm, heap);
|
||||
vm_load_call_stack(&vm, call_stack, call_stack_size);
|
||||
|
||||
#if VERBOSE >= 1
|
||||
printf("\t[" TERM_GREEN "SETUP" TERM_RESET
|
||||
"]: Loaded stack and program into VM\n");
|
||||
#endif
|
||||
#if VERBOSE >= 1
|
||||
printf("[" TERM_YELLOW "INTERPRETER" TERM_RESET "]: Beginning execution\n");
|
||||
#endif
|
||||
err_t err = vm_execute_all(&vm);
|
||||
|
||||
int ret = 0;
|
||||
if (err)
|
||||
{
|
||||
const char *error_str = err_as_cstr(err);
|
||||
fprintf(stderr, "[ERROR]: %s\n", error_str);
|
||||
vm_print_all(&vm, stderr);
|
||||
ret = 255 - err;
|
||||
}
|
||||
|
||||
vm_stop(&vm);
|
||||
|
||||
#if VERBOSE >= 1
|
||||
printf("[%sINTERPRETER%s]: Finished execution\n", TERM_GREEN, TERM_RESET);
|
||||
#endif
|
||||
return ret;
|
||||
}
|
||||
1070
vm/runtime.c
File diff suppressed because it is too large
301
vm/runtime.h
@@ -1,301 +0,0 @@
|
||||
/* Copyright (C) 2023 Aryadev Chavali
|
||||
|
||||
* You may distribute and modify this code under the terms of the
|
||||
* GPLv2 license. You should have received a copy of the GPLv2
|
||||
* license with this file. If not, please write to:
|
||||
* aryadev@aryadevchavali.com.
|
||||
|
||||
* Created: 2023-10-15
|
||||
* Author: Aryadev Chavali
|
||||
* Description: Virtual machine implementation
|
||||
*/
|
||||
|
||||
#ifndef RUNTIME_H
|
||||
#define RUNTIME_H
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
|
||||
#include <lib/heap.h>
|
||||
#include <lib/inst.h>
|
||||
|
||||
typedef enum
|
||||
{
|
||||
ERR_OK = 0,
|
||||
ERR_STACK_UNDERFLOW,
|
||||
ERR_STACK_OVERFLOW,
|
||||
ERR_CALL_STACK_UNDERFLOW,
|
||||
ERR_CALL_STACK_OVERFLOW,
|
||||
ERR_INVALID_OPCODE,
|
||||
ERR_INVALID_REGISTER_BYTE,
|
||||
ERR_INVALID_REGISTER_HWORD,
|
||||
ERR_INVALID_REGISTER_WORD,
|
||||
ERR_INVALID_PROGRAM_ADDRESS,
|
||||
ERR_INVALID_PAGE_ADDRESS,
|
||||
ERR_OUT_OF_BOUNDS,
|
||||
ERR_END_OF_PROGRAM,
|
||||
} err_t;
|
||||
|
||||
const char *err_as_cstr(err_t);
|
||||
|
||||
typedef darr_t registers_t;
|
||||
#define VM_NTH_REGISTER(REGISTERS, N) (((word *)((REGISTERS).data))[N])
|
||||
#define VM_REGISTERS_AVAILABLE(REGISTERS) (((REGISTERS).available) / WORD_SIZE)
|
||||
|
||||
typedef struct
|
||||
{
|
||||
registers_t registers;
|
||||
struct Stack
|
||||
{
|
||||
byte *data;
|
||||
size_t ptr, max;
|
||||
} stack;
|
||||
heap_t heap;
|
||||
struct Program
|
||||
{
|
||||
prog_t *data;
|
||||
word ptr;
|
||||
} program;
|
||||
struct CallStack
|
||||
{
|
||||
word *address_pointers;
|
||||
size_t ptr, max;
|
||||
} call_stack;
|
||||
} vm_t;
|
||||
|
||||
err_t vm_execute(vm_t *);
|
||||
err_t vm_execute_all(vm_t *);
|
||||
|
||||
void vm_load_stack(vm_t *, byte *, size_t);
|
||||
void vm_load_registers(vm_t *, registers_t);
|
||||
void vm_load_heap(vm_t *, heap_t);
|
||||
void vm_load_program(vm_t *, prog_t *);
|
||||
void vm_load_call_stack(vm_t *, word *, size_t);
|
||||
void vm_stop(vm_t *);
|
||||
|
||||
// Print routines
|
||||
#define VM_PRINT_PROGRAM_EXCERPT 5
|
||||
void vm_print_registers(vm_t *, FILE *);
|
||||
void vm_print_stack(vm_t *, FILE *);
|
||||
void vm_print_program(vm_t *, FILE *);
|
||||
void vm_print_heap(vm_t *, FILE *);
|
||||
void vm_print_call_stack(vm_t *, FILE *);
|
||||
void vm_print_all(vm_t *, FILE *);
|
||||
|
||||
// Execution routines
|
||||
err_t vm_jump(vm_t *, word);
|
||||
|
||||
err_t vm_pop_byte(vm_t *, data_t *);
|
||||
err_t vm_pop_hword(vm_t *, data_t *);
|
||||
err_t vm_pop_word(vm_t *, data_t *);
|
||||
|
||||
err_t vm_push_byte(vm_t *, data_t);
|
||||
err_t vm_push_hword(vm_t *, data_t);
|
||||
err_t vm_push_word(vm_t *, data_t);
|
||||
|
||||
typedef err_t (*push_f)(vm_t *, data_t);
|
||||
static const push_f PUSH_ROUTINES[] = {
|
||||
[OP_PUSH_BYTE] = vm_push_byte,
|
||||
[OP_PUSH_HWORD] = vm_push_hword,
|
||||
[OP_PUSH_WORD] = vm_push_word,
|
||||
};
|
||||
|
||||
err_t vm_push_byte_register(vm_t *, word);
|
||||
err_t vm_push_hword_register(vm_t *, word);
|
||||
err_t vm_push_word_register(vm_t *, word);
|
||||
|
||||
err_t vm_mov_byte(vm_t *, word);
|
||||
err_t vm_mov_hword(vm_t *, word);
|
||||
err_t vm_mov_word(vm_t *, word);
|
||||
|
||||
err_t vm_dup_byte(vm_t *, word);
|
||||
err_t vm_dup_hword(vm_t *, word);
|
||||
err_t vm_dup_word(vm_t *, word);
|
||||
|
||||
err_t vm_malloc_byte(vm_t *, word);
|
||||
err_t vm_malloc_hword(vm_t *, word);
|
||||
err_t vm_malloc_word(vm_t *, word);
|
||||
|
||||
err_t vm_mset_byte(vm_t *, word);
|
||||
err_t vm_mset_hword(vm_t *, word);
|
||||
err_t vm_mset_word(vm_t *, word);
|
||||
|
||||
err_t vm_mget_byte(vm_t *, word);
|
||||
err_t vm_mget_hword(vm_t *, word);
|
||||
err_t vm_mget_word(vm_t *, word);
|
||||
|
||||
// Handlers for opcodes that take a single word operand, indexed by opcode.
typedef err_t (*word_f)(vm_t *, word);
|
||||
static const word_f WORD_ROUTINES[] = {
|
||||
[OP_PUSH_REGISTER_BYTE] = vm_push_byte_register,
|
||||
[OP_PUSH_REGISTER_HWORD] = vm_push_hword_register,
|
||||
[OP_PUSH_REGISTER_WORD] = vm_push_word_register,
|
||||
[OP_MOV_BYTE] = vm_mov_byte,
|
||||
[OP_MOV_HWORD] = vm_mov_hword,
|
||||
[OP_MOV_WORD] = vm_mov_word,
|
||||
[OP_DUP_BYTE] = vm_dup_byte,
|
||||
[OP_DUP_HWORD] = vm_dup_hword,
|
||||
[OP_DUP_WORD] = vm_dup_word,
|
||||
[OP_MALLOC_BYTE] = vm_malloc_byte,
|
||||
[OP_MALLOC_HWORD] = vm_malloc_hword,
|
||||
[OP_MALLOC_WORD] = vm_malloc_word,
|
||||
[OP_MGET_BYTE] = vm_mget_byte,
|
||||
[OP_MGET_HWORD] = vm_mget_hword,
|
||||
[OP_MGET_WORD] = vm_mget_word,
|
||||
[OP_MSET_BYTE] = vm_mset_byte,
|
||||
[OP_MSET_HWORD] = vm_mset_hword,
|
||||
[OP_MSET_WORD] = vm_mset_word,
|
||||
};
|
||||
|
||||
err_t vm_malloc_stack_byte(vm_t *);
|
||||
err_t vm_malloc_stack_hword(vm_t *);
|
||||
err_t vm_malloc_stack_word(vm_t *);
|
||||
|
||||
err_t vm_mset_stack_byte(vm_t *);
|
||||
err_t vm_mset_stack_hword(vm_t *);
|
||||
err_t vm_mset_stack_word(vm_t *);
|
||||
|
||||
err_t vm_mget_stack_byte(vm_t *);
|
||||
err_t vm_mget_stack_hword(vm_t *);
|
||||
err_t vm_mget_stack_word(vm_t *);
|
||||
|
||||
err_t vm_mdelete(vm_t *);
|
||||
err_t vm_msize(vm_t *);
|
||||
|
||||
err_t vm_not_byte(vm_t *);
|
||||
err_t vm_not_hword(vm_t *);
|
||||
err_t vm_not_word(vm_t *);
|
||||
|
||||
err_t vm_or_byte(vm_t *);
|
||||
err_t vm_or_hword(vm_t *);
|
||||
err_t vm_or_word(vm_t *);
|
||||
|
||||
err_t vm_and_byte(vm_t *);
|
||||
err_t vm_and_hword(vm_t *);
|
||||
err_t vm_and_word(vm_t *);
|
||||
|
||||
err_t vm_xor_byte(vm_t *);
|
||||
err_t vm_xor_hword(vm_t *);
|
||||
err_t vm_xor_word(vm_t *);
|
||||
|
||||
err_t vm_eq_byte(vm_t *);
|
||||
err_t vm_eq_char(vm_t *);
|
||||
err_t vm_eq_int(vm_t *);
|
||||
err_t vm_eq_hword(vm_t *);
|
||||
err_t vm_eq_long(vm_t *);
|
||||
err_t vm_eq_word(vm_t *);
|
||||
|
||||
err_t vm_lt_byte(vm_t *);
|
||||
err_t vm_lt_char(vm_t *);
|
||||
err_t vm_lt_int(vm_t *);
|
||||
err_t vm_lt_hword(vm_t *);
|
||||
err_t vm_lt_long(vm_t *);
|
||||
err_t vm_lt_word(vm_t *);
|
||||
|
||||
err_t vm_lte_byte(vm_t *);
|
||||
err_t vm_lte_char(vm_t *);
|
||||
err_t vm_lte_int(vm_t *);
|
||||
err_t vm_lte_hword(vm_t *);
|
||||
err_t vm_lte_long(vm_t *);
|
||||
err_t vm_lte_word(vm_t *);
|
||||
|
||||
err_t vm_gt_byte(vm_t *);
|
||||
err_t vm_gt_char(vm_t *);
|
||||
err_t vm_gt_int(vm_t *);
|
||||
err_t vm_gt_hword(vm_t *);
|
||||
err_t vm_gt_long(vm_t *);
|
||||
err_t vm_gt_word(vm_t *);
|
||||
|
||||
err_t vm_gte_byte(vm_t *);
|
||||
err_t vm_gte_char(vm_t *);
|
||||
err_t vm_gte_int(vm_t *);
|
||||
err_t vm_gte_hword(vm_t *);
|
||||
err_t vm_gte_long(vm_t *);
|
||||
err_t vm_gte_word(vm_t *);
|
||||
|
||||
err_t vm_plus_byte(vm_t *);
|
||||
err_t vm_plus_hword(vm_t *);
|
||||
err_t vm_plus_word(vm_t *);
|
||||
|
||||
err_t vm_sub_byte(vm_t *);
|
||||
err_t vm_sub_hword(vm_t *);
|
||||
err_t vm_sub_word(vm_t *);
|
||||
|
||||
err_t vm_mult_byte(vm_t *);
|
||||
err_t vm_mult_hword(vm_t *);
|
||||
err_t vm_mult_word(vm_t *);
|
||||
|
||||
// Handlers for opcodes that take no operand and work on the stack, indexed by opcode.
typedef err_t (*stack_f)(vm_t *);
|
||||
static const stack_f STACK_ROUTINES[] = {
|
||||
[OP_MALLOC_STACK_BYTE] = vm_malloc_stack_byte,
|
||||
[OP_MALLOC_STACK_HWORD] = vm_malloc_stack_hword,
|
||||
[OP_MALLOC_STACK_WORD] = vm_malloc_stack_word,
|
||||
[OP_MGET_STACK_BYTE] = vm_mget_stack_byte,
|
||||
[OP_MGET_STACK_HWORD] = vm_mget_stack_hword,
|
||||
[OP_MGET_STACK_WORD] = vm_mget_stack_word,
|
||||
[OP_MSET_STACK_BYTE] = vm_mset_stack_byte,
|
||||
[OP_MSET_STACK_HWORD] = vm_mset_stack_hword,
|
||||
[OP_MSET_STACK_WORD] = vm_mset_stack_word,
|
||||
[OP_MDELETE] = vm_mdelete,
|
||||
[OP_MSIZE] = vm_msize,
|
||||
|
||||
[OP_NOT_BYTE] = vm_not_byte,
|
||||
[OP_NOT_HWORD] = vm_not_hword,
|
||||
[OP_NOT_WORD] = vm_not_word,
|
||||
|
||||
[OP_OR_BYTE] = vm_or_byte,
|
||||
[OP_OR_HWORD] = vm_or_hword,
|
||||
[OP_OR_WORD] = vm_or_word,
|
||||
|
||||
[OP_AND_BYTE] = vm_and_byte,
|
||||
[OP_AND_HWORD] = vm_and_hword,
|
||||
[OP_AND_WORD] = vm_and_word,
|
||||
|
||||
[OP_XOR_BYTE] = vm_xor_byte,
|
||||
[OP_XOR_HWORD] = vm_xor_hword,
|
||||
[OP_XOR_WORD] = vm_xor_word,
|
||||
|
||||
// Note: vm_eq_char, vm_eq_int and vm_eq_long are declared but not registered here.
[OP_EQ_BYTE] = vm_eq_byte,
|
||||
[OP_EQ_HWORD] = vm_eq_hword,
|
||||
[OP_EQ_WORD] = vm_eq_word,
|
||||
|
||||
[OP_LT_BYTE] = vm_lt_byte,
|
||||
[OP_LT_CHAR] = vm_lt_char,
|
||||
[OP_LT_INT] = vm_lt_int,
|
||||
[OP_LT_HWORD] = vm_lt_hword,
|
||||
[OP_LT_LONG] = vm_lt_long,
|
||||
[OP_LT_WORD] = vm_lt_word,
|
||||
|
||||
[OP_LTE_BYTE] = vm_lte_byte,
|
||||
[OP_LTE_CHAR] = vm_lte_char,
|
||||
[OP_LTE_INT] = vm_lte_int,
|
||||
[OP_LTE_HWORD] = vm_lte_hword,
|
||||
[OP_LTE_LONG] = vm_lte_long,
|
||||
[OP_LTE_WORD] = vm_lte_word,
|
||||
|
||||
[OP_GT_BYTE] = vm_gt_byte,
|
||||
[OP_GT_CHAR] = vm_gt_char,
|
||||
[OP_GT_INT] = vm_gt_int,
|
||||
[OP_GT_HWORD] = vm_gt_hword,
|
||||
[OP_GT_LONG] = vm_gt_long,
|
||||
[OP_GT_WORD] = vm_gt_word,
|
||||
|
||||
[OP_GTE_BYTE] = vm_gte_byte,
|
||||
[OP_GTE_CHAR] = vm_gte_char,
|
||||
[OP_GTE_INT] = vm_gte_int,
|
||||
[OP_GTE_HWORD] = vm_gte_hword,
|
||||
[OP_GTE_LONG] = vm_gte_long,
|
||||
[OP_GTE_WORD] = vm_gte_word,
|
||||
|
||||
[OP_PLUS_BYTE] = vm_plus_byte,
|
||||
[OP_PLUS_HWORD] = vm_plus_hword,
|
||||
[OP_PLUS_WORD] = vm_plus_word,
|
||||
[OP_SUB_BYTE] = vm_sub_byte,
|
||||
[OP_SUB_HWORD] = vm_sub_hword,
|
||||
[OP_SUB_WORD] = vm_sub_word,
|
||||
|
||||
[OP_MULT_BYTE] = vm_mult_byte,
|
||||
[OP_MULT_HWORD] = vm_mult_hword,
|
||||
[OP_MULT_WORD] = vm_mult_word,
|
||||
};
|
||||
|
||||
#endif
|
||||
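The three routine tables in this header (=PUSH_ROUTINES=, =WORD_ROUTINES=, =STACK_ROUTINES=) rely on C99 designated initializers to map opcodes to handlers. The sketch below shows that pattern in isolation, under stated assumptions: every =demo_= name is hypothetical and is not part of this code base, and the NULL check is one possible way to guard sparse tables, not necessarily what the runtime does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, error codes and VM state, for illustration only. */
typedef enum
{
  DEMO_OP_INC,
  DEMO_OP_DOUBLE,
  DEMO_OP_NEGATE,
  DEMO_OP_COUNT
} demo_op_t;

typedef enum
{
  DEMO_ERR_OK,
  DEMO_ERR_BAD_OPCODE
} demo_err_t;

typedef struct
{
  int64_t acc;
} demo_vm_t;

static demo_err_t demo_inc(demo_vm_t *vm)
{
  vm->acc += 1;
  return DEMO_ERR_OK;
}

static demo_err_t demo_double(demo_vm_t *vm)
{
  vm->acc *= 2;
  return DEMO_ERR_OK;
}

typedef demo_err_t (*demo_f)(demo_vm_t *);

/* Slots without an initializer are zero-initialized, i.e. NULL.
   DEMO_OP_NEGATE is deliberately left without a handler. */
static const demo_f DEMO_ROUTINES[DEMO_OP_COUNT] = {
    [DEMO_OP_INC]    = demo_inc,
    [DEMO_OP_DOUBLE] = demo_double,
};

/* Look up the handler for op; reject out-of-range or unregistered opcodes
   instead of calling through a NULL pointer. */
demo_err_t demo_dispatch(demo_vm_t *vm, demo_op_t op)
{
  assert(vm != NULL);
  if ((size_t)op >= DEMO_OP_COUNT || DEMO_ROUTINES[op] == NULL)
    return DEMO_ERR_BAD_OPCODE;
  return DEMO_ROUTINES[op](vm);
}
```

Calling =demo_dispatch= with =DEMO_OP_INC= or =DEMO_OP_DOUBLE= runs the matching handler, while =DEMO_OP_NEGATE= returns =DEMO_ERR_BAD_OPCODE= and leaves the state untouched. Giving the array an explicit size keeps the range check independent of which opcode happens to be initialized last.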