I've had a revelation: the virtual machine shouldn't cater to any specific dynamic or trend. What I need the VM to do is provide a basis where very useful features such as arithmetic on arbitrarily sized integers, basic control flow, memory management and file I/O are a bit nicer than doing it all myself in C. Past that it can be as barebones as necessary. Previous development was done in tandem with the assembler, which influenced how I designed the system; I will no longer do this. Here I describe the creation of more generic opcodes and instructions. These complicate the virtual machine's data model, but they are also extensible, generalise to a wider array of use cases and can be optimised by the virtual machine. The assembler, in turn, can take these generic instructions and build porcelains over them to provide a nicer environment. Take the new PUSH opcode: since it can take an arbitrary payload of bytes and push them all onto the stack, the assembler can provide porcelains for shorts, half words and words, as well as more interesting macros.

#+title: TODOs
#+author: Aryadev Chavali
#+date: 2023-11-02
#+startup: noindent

* TODO Completely rework opcodes
Instead of having an opcode per type, we can implement a generic
opcode that takes two operands and works purely on bytes.

Instead of ~PUSH_(BYTE|SHORT|HWORD|WORD) n~, where n is the data to
push of the precomposed type, we make a generic ~PUSH m, {n}~ where m
is the number of bytes and {n} is the set of bytes to push.

In bytecode that would look like ~<OP_PUSH>|m|n1|n2|n3...|nm~.
Opcodes are already variably sized, so we may as well allow this, and
it reduces the number of opcodes by 3.

Each opcode can be encoded this way, but we need to describe the
semantics clearly.
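A minimal sketch of how the runtime might decode and execute this
generic ~PUSH~, assuming a flat byte-array stack and reading m as a
single byte for brevity (the ~vm_t~ layout and names here are
illustrative, not the actual runtime structures):
#+begin_src c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct
{
  uint8_t *stack;   /* fixed size stack buffer */
  size_t sp;        /* stack pointer: next free byte */
  size_t stack_max; /* capacity of the stack */
} vm_t;

/* Execute <OP_PUSH>|m|n1|...|nm, where `operands` points just past
   the opcode byte.  Returns the number of operand bytes consumed, or
   0 on stack overflow. */
size_t vm_push_generic(vm_t *vm, const uint8_t *operands)
{
  size_t m = operands[0]; /* number of payload bytes */
  if (vm->sp + m > vm->stack_max)
    return 0; /* would overflow the stack */
  memcpy(vm->stack + vm->sp, operands + 1, m);
  vm->sp += m;
  return 1 + m; /* m itself plus the payload */
}
#+end_src
The assembler-side porcelains (~push.byte~, ~push.word~, etc.) would
then just emit this one opcode with the right m.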
** Register encoding
Firstly, registers are now only encoded by byte pointer: no short,
half word or word pointers.

Since they're so easy to translate between anyway, why should the
virtual machine do the work to handle that?

So a register r is the byte register at index r.
** PUSH
=PUSH m {n}= pushes m bytes of data, encoded by {n}.
** POP
=POP m= pops m bytes of data.
** PUSH_REGISTER
=PUSH_REGISTER m r= pushes the m bytes from register space starting
at index r.
** MOV
=MOV m r= moves m bytes of data off the stack into the register
space, starting at index r.  This is easy to error check in one go as
well.
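As a rough illustration of byte-indexed registers plus =MOV=, here is
a sketch assuming register space is a flat byte array (field names
like ~registers~ and ~registers_max~ are made up for the example):
#+begin_src c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct
{
  uint8_t *stack, *registers;
  size_t sp, registers_max;
} vm_t;

/* MOV m r: move m bytes off the top of the stack into register
   space, starting at byte index r.  One bounds check covers the
   whole transfer, which is the "error check in one go" above. */
int vm_mov(vm_t *vm, size_t m, size_t r)
{
  if (m > vm->sp || r + m > vm->registers_max)
    return 0; /* stack underflow or out-of-range register */
  vm->sp -= m;
  memcpy(vm->registers + r, vm->stack + vm->sp, m);
  return 1;
}
#+end_src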
** DUP
=DUP m= duplicates the last m bytes, pushing them onto the top of the
stack.
** NOT
=NOT m= takes the last m bytes and pushes the ~NOT~ of each byte onto
the stack in order.

Say the top of the stack has the m bytes {n_i} where i is from 1 to
m.  Then =NOT m= would pop those bytes, then push {!n_i} onto the
stack in the exact same order.
** Binary boolean operators
=<OP> m= pops the last 2m bytes off the stack and performs the
operation byte by byte, pushing the result onto the stack.

Say the top of the stack has m bytes {a_i} followed by m bytes {b_i}.
Both would be popped off, and {<OP>(a_i, b_i)} would be pushed onto
the stack in order.
** Mathematical and comparison operations
PLUS, SUB and MULT will now have two versions: U<OP> and <OP> for
unsigned <OP> and signed <OP> respectively.  This allows us to deal
with the edge cases of 2's complement arithmetic.

=<OP> m= pops the last 2m bytes off the stack, then applies the
operation to the two portions of bytes, considering them as signed or
unsigned based on the OP.  It then pushes the result back onto the
stack.

NOTE: We can still optimise by checking whether m falls within some
bound of the known types we already have (i.e. is it the size of a
short, or a word?) and using those known types to do the computation
faster.  What this provides is a generic algorithm for =m= byte
arithmetic, which is what all cool programming languages do.

Comparison operations can be done in basically the same way.
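To make the generic algorithm for =m= byte arithmetic concrete, here
is a sketch of unsigned addition done byte by byte with a rippling
carry.  It assumes the least significant byte comes first in each
operand, which is an assumption since the on-stack byte order hasn't
been pinned down yet:
#+begin_src c
#include <stddef.h>
#include <stdint.h>

/* r = a + b (mod 2^(8m)), one byte at a time with a rippling carry.
   The fast path mentioned above would check m against the known
   sizes (1, 2, 4, 8) and use native integer arithmetic instead. */
void add_bytes(uint8_t *r, const uint8_t *a, const uint8_t *b, size_t m)
{
  unsigned carry = 0;
  for (size_t i = 0; i < m; ++i)
  {
    unsigned sum = (unsigned)a[i] + b[i] + carry;
    r[i]  = (uint8_t)(sum & 0xFF);
    carry = sum >> 8;
  }
}
#+end_src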
** JUMP_IF
JUMP_IF can check the truthiness of some m bytes of memory, which we
can optimise if m falls within some known bound already.

=JUMP_IF m= pops m bytes off the stack and examines them: if they are
all zero it doesn't perform the jump, otherwise it does.
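A sketch of the truthiness test, assuming the same flat byte stack as
before (names illustrative): pop m bytes and jump if any is nonzero.
#+begin_src c
#include <stddef.h>
#include <stdint.h>

/* JUMP_IF m: pop m bytes and report whether any of them is nonzero.
   For m equal to 1, 2, 4 or 8 this could reduce to a single integer
   comparison, which is the optimisation mentioned above. */
int pop_truthy(uint8_t *stack, size_t *sp, size_t m)
{
  if (m > *sp)
    return -1; /* underflow: caller should trap */
  *sp -= m;
  for (size_t i = 0; i < m; ++i)
    if (stack[*sp + i] != 0)
      return 1; /* perform the jump */
  return 0;     /* all zero: fall through */
}
#+end_src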
** Shifting
I really want to work on shifting operators.  These move the stack
pointer without manipulating the actual data on the stack, which can
be useful when performing an operation that pops some resource over
and over again (i.e. =MSET='ing data from some heap allocation
requires popping the pointer and data off the stack).  Since all
operations go through the stack pointer when manipulating the stack
(even ~POP~), shifting the stack pointer doesn't change their
behaviour a whole lot, but it may require some extra mental work from
the developer.
+ =SHIFT_DOWN m= moves the stack pointer down m bytes.  An error may
  occur if the pointer is shifted below 0.
+ =SHIFT_UP m= moves the stack pointer up m bytes.  An error may
  occur if the pointer shifts past ~STACK_MAX~.
** Memory model
Something different will have to happen here.  I have a few ideas
around pages: reserving "space" in a generic sense, allowing the
virtual machine to use that space in a variety of ways regardless of
what storage backs it.

Essentially I want a better model which will allow me to use the
stack as generic memory space: pointers to the stack.  A tentative
API would be the following (a sketch of the page layout follows the
list):
+ A page is a reserved space in some storage, whether that be the
  heap or the stack.  It is represented by a word which is a pointer
  to its start.  In memory, a page consists of a word representing
  the size of the page followed by that number of bytes.
+ =RESERVE_STACK m= reserves a page of m bytes on the stack.  The
  stack pointer is shifted up m+8 bytes and a pointer to the page is
  pushed onto the stack.
+ =RESERVE_HEAP m= reserves a page of m bytes in the heap, which is a
  VM managed resource that cannot be directly accessed by the user.
  A pointer to the page is pushed onto the stack.
+ =PAGE_WRITE m= writes m bytes of memory, stored on the stack, to a
  page.  The data to write and the page pointer are popped off the
  stack in that order.
+ =PAGE_READ a b= pushes the bytes of a page between indexes [a, b)
  onto the stack.  The page pointer is popped off the stack.
+ =PAGE_REALLOC m= reallocates the page to the new size of m bytes,
  allowing for dynamic memory management.  The page pointer is popped
  off the stack and a new page pointer is pushed onto the stack.
  + If the page is a stack page, this errors out because that stack
    space would be forcibly leaked.
+ =PAGE_FREE= returns ownership of a page back to the runtime.  The
  page pointer is popped off the stack.
  + In the case of a stack page, this does nothing but zero the space
    originally reserved on the stack (including the first 8 bytes for
    the size of the page), which means the user must shift down
    and/or pop data to use the space effectively and avoid stack
    leaks.
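Here is a sketch of the page layout described above, assuming a
64-bit word for the size header (hence the m+8 in =RESERVE_STACK=)
and ignoring how the backing storage is actually managed; all names
are illustrative:
#+begin_src c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t word_t;

/* A page in storage: a word holding the size, followed by that many
   usable bytes.  The "page pointer" pushed onto the stack points at
   the size word. */
uint8_t *reserve_heap(word_t m)
{
  uint8_t *page = malloc(sizeof(word_t) + m);
  if (!page)
    return NULL;
  memcpy(page, &m, sizeof(word_t));   /* size header */
  memset(page + sizeof(word_t), 0, m);
  return page;
}

/* The usable bytes start just past the size word. */
uint8_t *page_data(uint8_t *page) { return page + sizeof(word_t); }

word_t page_size(const uint8_t *page)
{
  word_t m;
  memcpy(&m, page, sizeof(word_t));
  return m;
}
#+end_src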
** I/O
Something better needs to happen here.  Perhaps writing a better
wrapper over C file I/O such that users can open file handles and
deal with them.  Tentative API:
+ A file handle is a word representing a pointer to the underlying
  file.  This can either be the raw C pointer or an index into some
  abstraction such as a dynamic array of file pointers.
+ =FILE_OPEN m t= interprets the top m bytes of the stack as the file
  name to open.  t is a byte encoding the file mode (one possible
  mapping onto C's stdio modes is sketched after this list).  The
  file handle is pushed onto the stack.
  + 0 -> Read
  + 1 -> Write
  + 2 -> Append
  + 3 -> Read+
  + 4 -> Write+
  + 5 -> Append+
+ =FILE_READ m= reads m bytes from a file handle, pushing them onto
  the stack.  The file handle is popped off the stack.
+ =FILE_WRITE m= writes the m bytes on the top of the stack to the
  file handle given.  Both the bytes to write and the handle are
  stored on the stack, first the bytes then the handle.
+ =FILE_STATUS= pushes the current position of the file onto the
  stack.  The file handle is popped off the stack.
+ =FILE_CLOSE= closes and frees the file handle.  The file handle is
  popped off the stack.
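For the mode byte of =FILE_OPEN=, one possible mapping onto C's stdio
mode strings (assuming the wrapper sits directly on top of ~fopen~)
would be:
#+begin_src c
#include <stdio.h>

/* Map the FILE_OPEN mode byte onto a stdio mode string.  Returns
   NULL for an unknown mode so the caller can raise an error. */
const char *file_mode_string(unsigned char mode)
{
  switch (mode)
  {
  case 0: return "r";  /* Read    */
  case 1: return "w";  /* Write   */
  case 2: return "a";  /* Append  */
  case 3: return "r+"; /* Read+   */
  case 4: return "w+"; /* Write+  */
  case 5: return "a+"; /* Append+ */
  default: return NULL;
  }
}
#+end_src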
* TODO Rework heap to use one allocation
The current approach for the heap is as follows:
+ Per call to ~malloc~, allocate a new ~page_t~ structure by
  requesting memory from the operating system
+ Append the pointer to the ~page_t~ to a dynamic array of pointers

In the worst case, the runtime must request memory from the operating
system /twice/ per user allocation.  For small scale allocations of a
few bytes this is especially wasteful.  Furthermore, the actual heap
usage of a program can seem unpredictable to a user of the virtual
machine, particularly in cases where the dynamic array of pointers
must resize to append a new allocation.

I propose that the runtime make one massive allocation at init time
for a sufficiently large buffer of bytes (call it =B=) which we use
as the underlying memory for the heap.
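A minimal sketch of the idea: =B= is requested once at init time and
allocations are carved out of it with a bump pointer.  Reuse of freed
holes would need a real free list on top of this; the names here are
illustrative and not the existing ~heap.c~ API.
#+begin_src c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct
{
  uint8_t *buffer; /* the single big allocation B */
  size_t used, capacity;
} heap_t;

/* One request to the operating system, made once at init time. */
int heap_init(heap_t *h, size_t capacity)
{
  h->buffer   = calloc(capacity, 1);
  h->used     = 0;
  h->capacity = capacity;
  return h->buffer != NULL;
}

/* Hand out `size` bytes from B, or NULL if B is exhausted: no
   further operating system requests per user allocation. */
uint8_t *heap_alloc(heap_t *h, size_t size)
{
  if (h->used + size > h->capacity)
    return NULL;
  uint8_t *ptr = h->buffer + h->used;
  h->used += size;
  return ptr;
}
#+end_src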
* TODO Deal with TODOs
There is a large variety of TODOs about errors.  Let's fix them!
#+begin_src sh :exports results :results output verbatim replace
find -type 'f' -regex ".*\.[ch]\(pp\)?" -exec grep -nH TODO "{}" ";"
#+end_src

#+RESULTS:
: ./vm/runtime.c:228: // TODO: Figure out a way to ensure the ordering of OP_PRINT_* is
: ./vm/runtime.c:578:// TODO: rename this to something more appropriate
: ./vm/runtime.c:625:// TODO: rename this to something more appropriate
: ./vm/runtime.c:641:// TODO: rename this to something more appropriate
: ./vm/runtime.c:655:// TODO: rename this to something more appropriate
: ./lib/heap.c:59: // TODO: When does this fragmentation become a performance
: ./lib/base.c:19: // TODO: is there a faster way of doing this?
: ./lib/base.c:25: // TODO: is there a faster way of doing this?
: ./lib/base.c:32: // TODO: is there a faster way of doing this?
* WAIT Better documentation [0%] :DOC:
** TODO Comment coverage [0%]
*** WIP Lib [75%]
**** DONE lib/base.h
**** DONE lib/darr.h
**** DONE lib/heap.h
**** TODO lib/inst.h
*** TODO VM [0%]
**** TODO vm/runtime.h
**** TODO vm/struct.h
**** TODO vm/main.c
** TODO Specification
* WAIT Standard library :VM:
I should start considering this and how a user may use it.  Should it
be an option in the VM and/or assembler binaries (i.e. a flag) or
something the user has to specify in their source files?

Something to consider is /static/ and /dynamic/ "linking", i.e.:
+ Static linking: the assembler inserts all used library definitions
  into the bytecode output directly
  + We could insert all of it at the start of the bytecode file, and
    with [[*Start points][Start points]] this won't interfere with
    user code
  + 2023-11-03: Finishing the Start point feature has made these
    features more tenable.  A program header which is compiled and
    interpreted in bytecode works wonders.
  + Furthermore, library code will have fixed program addresses
    (always at the start), so we'll know at the start of the
    assembler's runtime where to resolve standard library subroutine
    calls
  + The virtual machine needs no changes to do this
** WAIT Consider dynamic linking
+ Dynamic linking: the virtual machine has fixed program storage for
  library code (a ROM), and the assembler makes jump references
  specifically into this program storage
  + When assembling subroutine calls, we just need to put references
    to this library storage (some kind of shared state between VM and
    assembler to know what these references are)
  + The VM needs to manage a ROM of some kind for library code
  + How do we ensure assembled links to subroutine calls don't
    conflict with user code jumps?

What follows is a possible dynamic linking strategy.  It requires
quite a few moving parts.

The address operand of every program control instruction (~CALL~,
~JUMP~, ~JUMP.IF~) has a specific encoding if the standard library is
dynamically linked:
+ If the most significant bit is 0, the remaining 63 bits encode an
  absolute address within the program
+ Otherwise, the address encodes a standard library subroutine.  The
  bits within the address follow this schema (a bit-packing sketch is
  given below):
  + The next 30 bits represent the specific module where the
    subroutine is defined (over 1.07 *billion* possible library
    values)
  + The remaining 33 bits (4 bytes + 1 bit) encode the absolute
    program address in the bytecode of that specific module for the
    start of the subroutine (about 8.59 *billion* values)
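A sketch of the bit packing this schema implies; the function names
are made up, but the masks follow directly from the 1/30/33 split
described above:
#+begin_src c
#include <stdint.h>

/* Pack a standard library reference: top bit set, then 30 bits of
   module index, then 33 bits of address within that module. */
uint64_t encode_lib_address(uint32_t module, uint64_t addr)
{
  return (UINT64_C(1) << 63)
       | ((uint64_t)(module & 0x3FFFFFFFu) << 33)
       | (addr & UINT64_C(0x1FFFFFFFF));
}

/* Split an operand back into its components.  Returns 1 for a
   library reference, 0 for a plain 63-bit absolute address. */
int decode_address(uint64_t operand, uint32_t *module, uint64_t *addr)
{
  if (!(operand >> 63))
  {
    *addr = operand; /* absolute address within the program */
    return 0;
  }
  *module = (uint32_t)((operand >> 33) & 0x3FFFFFFFu);
  *addr   = operand & UINT64_C(0x1FFFFFFFF);
  return 1;
}
#+end_src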
The assembler will automatically encode this based on "%USE" calls
and the names of the subroutines called.  On the virtual machine,
there is a storage location (similar to the ROM of real machines)
which stores the bytecode for modules of the standard library,
indexed by module number.  This means that, on deserialising the
address into its proper components, the VM can refer to the module's
bytecode and then jump to the correct address.

2023-11-09: I'll need a way to run library code in the current
program system in the runtime.  Unfortunately, it currently doesn't
support jumps into, or work in, programs outside of the main one.
Any proper work done in this area requires some proper refactoring.

2023-11-09: Constants or inline macros need to be reconfigured for
this to work: at parse time, we work out the inlines directly, which
means compiling bytecode with "standard library" macros will not work
as they won't be in the token stream.  Either we don't allow
preprocessor work in the standard library at all (which is bad
because we can't then set standard limits or other useful things) or
we insert them into the registries at parse time for use in program
parsing (which not only requires assembler refactoring to figure out
what libraries are used (to pull definitions from) but also requires
making macros "recognisable" in bytecode because they're essentially
invisible).

2024-04-15: Perhaps we could insert the linking information into the
program header?
1) A table which states the load order of certain modules would allow
   the runtime to selectively spin up and properly delegate module
   jumps to the right bytecode
2)
* Completed
** DONE Write a label/jump system :ASM:
Essentially a user should be able to write arbitrary labels (maybe
through ~label x~ or ~x:~ syntax) which can be referred to by ~jump~.

It'll be purely an assembler-side processing step, where the emitted
bytecode refers only to absolute addresses; the VM should just be
dealing with absolute addresses here.
** DONE Allow relative addresses in jumps :ASM:
As requested, a special syntax for relative address jumps.  Sometimes
it's a bit nicer than a label.
** DONE Calling and returning control flow :VM:ASM:
When writing library code we won't know the addresses of where
callers are jumping from.  However, most library functions want to
return control flow back to where the user called them: we want the
code to act almost like an inline function.

There are two ways I can think of achieving this:
+ Some extra syntax around labels (something like ~@inline <label>:~)
  which tells the assembly processor to inline the label when a
  "jump" to that label is given
  + This requires no changes to the VM, which keeps it simple, but a
    major change to the assembler to be able to inline code.
    However, the work on writing a label system and relative
    addresses should provide some insight into how this could be
    possible.
+ A /call stack/ and two new syntactic constructs ~call~ and ~ret~
  which work like so:
  + When ~call <label>~ is encountered, the next program address is
    pushed onto the call stack and control flow is set to the label
  + During execution of ~<label>~, when a ~ret~ is encountered, pop
    an address off the call stack and set control flow to that
    address
  + This simulates the notion of "calling" and "returning from" a
    function in classical languages, but requires more machinery on
    the VM side.

2024-04-15: The latter option was chosen, though the former has been
implemented through [[*Constants][Constants]].
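For reference, the chosen ~call~/~ret~ scheme boils down to a stack
of return addresses on the VM side.  A sketch, assuming the program
counter indexes whole instructions and using an arbitrary fixed
depth:
#+begin_src c
#include <stddef.h>
#include <stdint.h>

#define CALL_STACK_DEPTH 256 /* illustrative fixed depth */

typedef struct
{
  uint64_t addresses[CALL_STACK_DEPTH];
  size_t top;
} call_stack_t;

/* call <label>: remember the next program address, then jump. */
int do_call(call_stack_t *cs, uint64_t *pc, uint64_t target)
{
  if (cs->top >= CALL_STACK_DEPTH)
    return 0; /* call stack overflow */
  cs->addresses[cs->top++] = *pc + 1;
  *pc = target;
  return 1;
}

/* ret: restore control flow to the most recent caller. */
int do_ret(call_stack_t *cs, uint64_t *pc)
{
  if (cs->top == 0)
    return 0; /* ret without a matching call */
  *pc = cs->addresses[--cs->top];
  return 1;
}
#+end_src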
** DONE Start points :ASM:VM:
In standard assembly you can write
#+begin_src asm
global _start
_start:
...
#+end_src
and that means the label ~_start~ is the point the program should
start from.  This means the user can define other code anywhere in
the program and specify something similar to "main" in C programs.

Proposed syntax:
#+begin_src asm
init <label>
#+end_src

2024-04-15: Used the same syntax as standard assembly, with the
caveat that multiple ~global~'s may be present but only the last one
has an effect.
** DONE Constants
Essentially a directive which assigns some literal to a symbol as a
constant.  Something like
#+begin_src asm
%const(n) 20 %end
#+end_src

Then, during my program, I could use it like so
#+begin_src asm
...
push.word $n
print.word
#+end_src

The preprocessor should convert this to the equivalent code of
#+begin_src asm
...
push.word 20
print.word
#+end_src

2023-11-04: You could potentially even put full program instructions
in a constant
#+begin_src asm
%const(print-1)
push.word 1
print.word
%end
#+end_src
which, when referred to (by ~$print-1~), would insert the given
bytecode inline.
** DONE Rigid endian :LIB:
Say a program is compiled on a little endian machine.  The resultant
bytecode file, as a result of using C's internal functions, will use
little endian.

This file, when distributed to other computers, will not work on
those that use big endian.

This is a massive problem; I would like bytecode compiled on one
computer to work on any other one.  Therefore we have to enforce big
endian.  This refactor is limited to LIB because only the ~convert_*~
functions are used in the runtime to convert between byte buffers
(usually read from the bytecode file directly, or from memory for use
on the stack).

2024-04-09: Found the ~hto_e~ functions under =endian.h= that provide
both-way conversion between the host and a specific endianness for
shorts, half words and words.  This will make it super simple to just
convert.

2024-04-15: Found it better to implement the functions myself as
=endian.h= is not particularly portable.
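The hand rolled conversions can be done portably by (de)serialising
byte by byte, which never has to ask what the host's endianness is.
A sketch for words (the real ~convert_*~ functions may differ):
#+begin_src c
#include <stdint.h>

/* Serialise a word as big endian, most significant byte first. */
void word_to_bytes(uint64_t w, uint8_t out[8])
{
  for (int i = 0; i < 8; ++i)
    out[i] = (uint8_t)(w >> (8 * (7 - i)));
}

/* Deserialise a big endian word from a byte buffer. */
uint64_t word_from_bytes(const uint8_t in[8])
{
  uint64_t w = 0;
  for (int i = 0; i < 8; ++i)
    w = (w << 8) | in[i];
  return w;
}
#+end_src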
** DONE Import another file
Say I have two "asm" files: /a.asm/ and /b.asm/.

#+CAPTION: a.asm
#+begin_src asm
global main
main:
push.word 1
push.word 1
push.word 1
sub.word
sub.word
call b-println
halt
#+end_src

#+CAPTION: b.asm
#+begin_src asm
b-println:
print.word
push.byte '\n'
print.char
ret
#+end_src

How would one assemble this?  We've got two files, with /a.asm/
depending on /b.asm/ for the symbol ~b-println~.  It's obvious they
need to be assembled "together" to make something that could work.  A
possible "correct" approach would be having the file /b.asm/
completely included into /a.asm/, such that compiling /a.asm/ would
lead to classical symbol resolution without much hassle.  As a
feature, this would be best placed in the preprocessor, as symbol
resolution occurs in the third stage of parsing (~process_presults~)
whereas the preprocessor is always the first stage.

That would be a very simple way of solving the static vs dynamic
linking problem: just include the files you actually need.  Even the
standard library would be fine and not require any additional work.
Let's see how this would work.
** DONE Do not request more memory for registers
The stack is a fixed size object, allocated at the start of a program
and handed to the VM.  The VM cannot request more memory for the
stack if it runs out, but this also ensures a very strict upper bound
on stack memory usage which can be profiled easily.  Furthermore, the
code that interacts with the stack can use the strict sizing as an
invariant to simplify the implementation (e.g. pushing to the stack
when the stack is full will trap the program).  The stack also cannot
be used to OOM-attack the virtual machine.

Registers are currently dynamic arrays.  Say 8 word registers are
allocated at init time.  If a user requests a 9th word register,
memory is requested from the operating system to increase register
space.  This is unacceptable from both a profiling and an attack
point of view; it would be trivial to write a program which forced
the runtime to request ridiculous amounts of memory from the
operating system (for example, by ~mov.word <very large number>~).

Registers should not be infinite; a standardised size (with a compile
time option to alter it) ensures the same benefits stated above for
the stack.