author Aryadev Chavali <aryadev@aryadevchavali.com> 2024-04-15 16:35:44 +0630
committer Aryadev Chavali <aryadev@aryadevchavali.com> 2024-04-15 16:35:44 +0630
commit d594c0c531dc04224d6909cbc99d3798bcf04060
tree fbb7ac633fd14c19b7cf566f23acb0168ec1ec8b
parent e960af2904463039290a1dd9a7c8b259ee5d76a9
Annotate some completed todos in todo.org
-rw-r--r-- todo.org 52
1 files changed, 36 insertions, 16 deletions
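
Note on the address schema in this change: the dynamic linking entry in todo.org splits a 64-bit address operand into a 1-bit library flag, a 30-bit module field and a 33-bit offset. A minimal sketch of that packing follows, assuming the flag sits in the most significant bit and the module field comes directly after it. All names here (~LIB_FLAG~, ~encode_library_address~, ~module_of~ and so on) are hypothetical and do not come from the ovm codebase.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout, following the schema described in todo.org:
//   bit  63     : 0 = user code address, 1 = library subroutine
//   bits 62..33 : module identifier (30 bits)
//   bits 32..0  : absolute address inside that module (33 bits)
constexpr unsigned MODULE_BITS = 30;
constexpr unsigned OFFSET_BITS = 33;
static_assert(1 + MODULE_BITS + OFFSET_BITS == 64, "fields must fill a word");

constexpr std::uint64_t LIB_FLAG    = 1ULL << 63;
constexpr std::uint64_t MODULE_MASK = (1ULL << MODULE_BITS) - 1;
constexpr std::uint64_t OFFSET_MASK = (1ULL << OFFSET_BITS) - 1;

// Pack a module identifier and an in-module offset into one operand.
// Values wider than their field are truncated by the masks.
constexpr std::uint64_t encode_library_address(std::uint64_t module,
                                               std::uint64_t offset)
{
  return LIB_FLAG | ((module & MODULE_MASK) << OFFSET_BITS) |
         (offset & OFFSET_MASK);
}

// True when the operand refers to library code rather than user code.
constexpr bool is_library_address(std::uint64_t addr)
{
  return (addr & LIB_FLAG) != 0;
}

// Extract the module identifier from a library address.
constexpr std::uint64_t module_of(std::uint64_t addr)
{
  return (addr >> OFFSET_BITS) & MODULE_MASK;
}

// Extract the in-module offset from a library address.
constexpr std::uint64_t offset_of(std::uint64_t addr)
{
  return addr & OFFSET_MASK;
}
```

Under this sketch a ~CALL~ handler would test ~is_library_address~ first and only then look up ~module_of~ in whatever module table the runtime keeps. User addresses pass through unchanged because their top bit is 0.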
diff --git a/todo.org b/todo.org
index 41da77d..1c09ae0 100644
--- a/todo.org
+++ b/todo.org
@@ -33,7 +33,7 @@ A call should look something like this:
$name 1 2 3
#+end_src
and those tokens will be substituted literally in the macro body.
-* TODO Write assembler in a different language :ASM:
+* WIP Write assembler in a different language :ASM:
While the runtime and base library needs to deal with only
binary, the assembler has to deal with string inputs and a larger
variety of bugs. As the base library is written in C, and is all that
@@ -45,8 +45,14 @@ Languages in the competition:
+ C++
+ Rust
+ Python
+
+2024-04-14: Chose C++ cos it will require the least effort to rewrite
+the currently existing codebase while still leveraging some less
+efficient but incredibly useful features.
* TODO Introduce error handling in base library :LIB:
-There is a large variety of TODOs about errors
+There is a large variety of TODOs about errors. Let's fix them!
+
+8 TODOs currently present.
* TODO Standard library :ASM:VM:
I should start considering this and how a user may use it. Should it
be an option in the VM and/or assembler binaries (i.e. a flag) or
@@ -65,6 +71,7 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
at the start) so we'll know at start of assembler runtime where to
resolve standard library subroutine calls
+ Virtual machine needs no changes to do this
+** TODO Consider dynamic Linking
+ Dynamic linking: virtual machine has fixed program storage for
library code (a ROM), and assembler makes jump references
specifically for this program storage
@@ -74,13 +81,10 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
+ VM needs to manage a ROM of some kind for library code
+ How do we ensure assembled links to subroutine calls don't
conflict with user code jumps?
- + Possibility: most significant bit of a program address is
- reserved such that if 0 it refers to user code and if 1 it
- refers to library code
- + 63 bit references user code (not a lot of loss in precision)
- + Easy to check if a reference is a library reference or a user
- code reference by checking "sign bit" (negativity)
-** TODO Dynamic Linking
+
+What follows is a possible dynamic linking strategy. It requires
+quite a few moving parts:
+
The address operand of every program control instruction (~CALL~,
~JUMP~, ~JUMP.IF~) has a specific encoding if the standard library is
dynamically linked:
@@ -88,12 +92,11 @@ dynamically linked:
absolute address within the program
+ Otherwise, the address encodes a standard library subroutine. The
bits within the address follow this schema:
- + The next 15 bits (7 from the most significant byte, then 8 from
- the next byte) represent the specific module where the subroutine
- is defined (over 32767 possible library values)
- + The remaining 48 bits (6 bytes) encode the absolute program
- address in the bytecode of that specific module for the start of
- the subroutine (over 281 *trillion* values)
+ + The next 30 bits represent the specific module where the
+ subroutine is defined (over 1.07 *billion* possible library values)
+ + The remaining 33 bits (4 bytes + 1 bit) encode the absolute
+ program address in the bytecode of that specific module for the
+ start of the subroutine (over 8.58 *billion* values)
The assembler will automatically encode this based on "%USE" calls and
the name of the subroutines called. On the virtual machine, there is
@@ -119,6 +122,13 @@ them into the registries at parse time for use in program parsing
libraries are used (to pull definitions from) but also requires making
macros "recognisable" in bytecode because they're essentially
invisible).
+
+2024-04-15: Perhaps we could insert the linking information into the
+program header?
+1) A table which states the load order of certain modules would allow
+ the runtime to selectively spin up and properly delegate module
+ jumps to the right bytecode
+2)
* Completed
** DONE Write a label/jump system :ASM:
Essentially a user should be able to write arbitrary labels (maybe
@@ -154,8 +164,11 @@ There are two ways I can think of achieving this:
+ This simulates the notion of "calling" and "returning from" a
function in classical languages, but requires more machinery on
the VM side.
+
+2024-04-15: The latter option was chosen, though the former has been
+implemented through [[*Constants][Constants]].
** DONE Start points :ASM:VM:
-You know how in standard assembly you can write
+In standard assembly you can write
#+begin_src asm
global _start
_start:
@@ -169,6 +182,10 @@ Proposed syntax:
#+begin_src asm
init <label>
#+end_src
+
+2024-04-15: Used the same syntax as standard assembly, with the
+conceit that multiple ~global~'s may be present but only the last one
+has an effect.
** DONE Constants
Essentially a directive which assigns some literal to a symbol as a
constant. Something like
@@ -218,6 +235,9 @@ memory to use in the stack).
2024-04-09: Found the ~hto_e~ functions under =endian.h= that provide
both way host to specific endian conversion of shorts, half words and
words. This will make it super simple to just convert.
+
+2024-04-15: Found it better to implement the functions myself as
+=endian.h= is not particularly portable.
** DONE Import another file
Say I have two "asm" files: /a.asm/ and /b.asm/.