Annotate some completed todos in todo.org

2024-04-15 16:35:44 +06:30
parent e960af2904
commit d594c0c531


@@ -33,7 +33,7 @@ A call should look something like this:
$name 1 2 3
#+end_src
and those tokens will be substituted literally in the macro body.
* TODO Write assembler in a different language :ASM:
* WIP Write assembler in a different language :ASM:
While the runtime and base library need to deal only with
binary, the assembler has to deal with string inputs and a larger
variety of bugs. As the base library is written in C, and is all that
@@ -45,8 +45,14 @@ Languages in the competition:
+ C++
+ Rust
+ Python
2024-04-14: Chose C++ because it will require the least effort to rewrite
the currently existing codebase while still leveraging some less
efficient but incredibly useful features.
* TODO Introduce error handling in base library :LIB:
There is a large variety of TODOs about errors
There is a large variety of TODOs about errors. Let's fix them!
8 TODOs currently present.
* TODO Standard library :ASM:VM:
I should start considering this and how a user may use it. Should it
be an option in the VM and/or assembler binaries (i.e. a flag) or
@@ -65,6 +71,7 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
at the start) so we'll know at start of assembler runtime where to
resolve standard library subroutine calls
+ Virtual machine needs no changes to do this
** TODO Consider dynamic linking
+ Dynamic linking: virtual machine has fixed program storage for
library code (a ROM), and assembler makes jump references
specifically for this program storage
@@ -74,13 +81,10 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
+ VM needs to manage a ROM of some kind for library code
+ How do we ensure assembled links to subroutine calls don't
conflict with user code jumps?
+ Possibility: most significant bit of a program address is
reserved such that if 0 it refers to user code and if 1 it
refers to library code
+ 63 bit references user code (not a lot of loss in precision)
+ Easy to check if a reference is a library reference or a user
code reference by checking "sign bit" (negativity)
** TODO Dynamic Linking
What follows is a possible dynamic linking strategy. It requires
quite a few moving parts:
The address operand of every program control instruction (~CALL~,
~JUMP~, ~JUMP.IF~) has a specific encoding if the standard library is
dynamically linked:
@@ -88,12 +92,11 @@ dynamically linked:
absolute address within the program
+ Otherwise, the address encodes a standard library subroutine. The
bits within the address follow this schema:
+ The next 15 bits (7 from the most significant byte, then 8 from
the next byte) represent the specific module where the subroutine
is defined (over 32767 possible library values)
+ The remaining 48 bits (6 bytes) encode the absolute program
address in the bytecode of that specific module for the start of
the subroutine (over 281 *trillion* values)
+ The next 30 bits represent the specific module where the
subroutine is defined (over 1.07 *billion* possible library values)
+ The remaining 33 bits (4 bytes + 1 bit) encode the absolute
program address in the bytecode of that specific module for the
start of the subroutine (about 8.59 *billion* values)
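The bit layout above can be sketched in C as follows. This is only an illustration, not the implemented encoding: it assumes the most significant bit set to 1 marks a library reference (as in the earlier sign-bit idea), and every name here (~LIB_FLAG~, ~encode_lib_address~, ~lib_module~, ~lib_offset~) is hypothetical.

#+begin_src c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of a 64 bit address operand, most significant bit first:
   [1 bit library flag][30 bit module id][33 bit address within module] */
#define MODULE_BITS 30
#define OFFSET_BITS 33
#define LIB_FLAG (UINT64_C(1) << 63)
#define MODULE_MASK ((UINT64_C(1) << MODULE_BITS) - 1)
#define OFFSET_MASK ((UINT64_C(1) << OFFSET_BITS) - 1)

/* Pack a module id and an address within that module into one operand.
   Values that do not fit their field are truncated by the masks. */
uint64_t encode_lib_address(uint32_t module, uint64_t offset)
{
  return LIB_FLAG | (((uint64_t)module & MODULE_MASK) << OFFSET_BITS) |
         (offset & OFFSET_MASK);
}

/* True if the operand refers to library code rather than user code. */
bool is_lib_address(uint64_t address)
{
  return (address & LIB_FLAG) != 0;
}

/* Extract the 30 bit module id from a library reference. */
uint32_t lib_module(uint64_t address)
{
  return (uint32_t)((address >> OFFSET_BITS) & MODULE_MASK);
}

/* Extract the 33 bit address within the module. */
uint64_t lib_offset(uint64_t address)
{
  return address & OFFSET_MASK;
}
#+end_src

With this sketch, ~encode_lib_address(3, 42)~ sets the top bit, and ~lib_module~ and ~lib_offset~ recover 3 and 42 respectively. A user code address below 2^63 fails the ~is_lib_address~ check.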
The assembler will automatically encode this based on "%USE" calls and
the name of the subroutines called. On the virtual machine, there is
@@ -119,6 +122,13 @@ them into the registries at parse time for use in program parsing
libraries are used (to pull definitions from) but also requires making
macros "recognisable" in bytecode because they're essentially
invisible).
2024-04-15: Perhaps we could insert the linking information into the
program header?
1) A table which states the load order of certain modules would allow
the runtime to selectively spin up and properly delegate module
jumps to the right bytecode
2)
* Completed
** DONE Write a label/jump system :ASM:
Essentially a user should be able to write arbitrary labels (maybe
@@ -154,8 +164,11 @@ There are two ways I can think of achieving this:
+ This simulates the notion of "calling" and "returning from" a
function in classical languages, but requires more machinery on
the VM side.
2024-04-15: The latter option was chosen, though the former has been
implemented through [[*Constants][Constants]].
** DONE Start points :ASM:VM:
You know how in standard assembly you can write
In standard assembly you can write
#+begin_src asm
global _start
_start:
@@ -169,6 +182,10 @@ Proposed syntax:
#+begin_src asm
init <label>
#+end_src
2024-04-15: Used the same syntax as standard assembly, with the
caveat that multiple ~global~ directives may be present but only the last one
has an effect.
** DONE Constants
Essentially a directive which assigns some literal to a symbol as a
constant. Something like
@@ -218,6 +235,9 @@ memory to use in the stack).
2024-04-09: Found the ~hto_e~ functions under =endian.h= that provide
two-way conversion between host byte order and a specific endianness
for shorts, half words and words. This will make conversion super simple.
2024-04-15: Found it better to implement the functions myself as
=endian.h= is not particularly portable.
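A minimal sketch of what such hand-rolled conversion could look like in C, shown for a 64 bit word only. The function names are hypothetical and the little endian target is an assumption made for illustration, not necessarily what the library uses. The code only uses shifts and ~memcpy~, so it never needs to detect the host's byte order.

#+begin_src c
#include <stdint.h>
#include <string.h>

/* Write a word into a buffer, least significant byte first. The shifts
   operate on the value, so the result is the same on any host. */
void word_to_le_bytes(uint64_t w, uint8_t out[8])
{
  for (int i = 0; i < 8; ++i)
    out[i] = (uint8_t)(w >> (8 * i));
}

/* Rebuild a word from a little endian byte buffer. */
uint64_t word_from_le_bytes(const uint8_t in[8])
{
  uint64_t w = 0;
  for (int i = 0; i < 8; ++i)
    w |= (uint64_t)in[i] << (8 * i);
  return w;
}

/* Host to little endian: the returned word's in-memory bytes are in
   little endian order (a no-op on little endian hosts). */
uint64_t word_htole(uint64_t host)
{
  uint8_t bytes[8];
  uint64_t le;
  word_to_le_bytes(host, bytes);
  memcpy(&le, bytes, sizeof le);
  return le;
}

/* Little endian to host: the inverse of word_htole. */
uint64_t word_letoh(uint64_t le)
{
  uint8_t bytes[8];
  memcpy(bytes, &le, sizeof bytes);
  return word_from_le_bytes(bytes);
}
#+end_src

The same pattern would apply to shorts and half words by changing the loop bound and the integer type.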
** DONE Import another file
Say I have two "asm" files: /a.asm/ and /b.asm/.