Annotate some completed todos in todo.org

2024-04-15 16:35:44 +06:30
parent e960af2904
commit d594c0c531
1 changed files with 36 additions and 16 deletions
--- a/todo.org
+++ b/todo.org
@@ -33,7 +33,7 @@ A call should look something like this:
  $name 1 2 3
 #+end_src
 and those tokens will be substituted literally in the macro body.
-* TODO Write assembler in a different language :ASM:
+* WIP Write assembler in a different language :ASM:
 While the runtime and base library need to deal with only
 binary, the assembler has to deal with string inputs and a larger
 variety of bugs.  As the base library is written in C, and is all that
@@ -45,8 +45,14 @@ Languages in the competition:
 + C++
 + Rust
 + Python
 2024-04-14: Chose C++ cos it will require the least effort to rewrite
 the currently existing codebase while still leveraging some less
 efficient but incredibly useful features.
 * TODO Introduce error handling in base library :LIB:
-There is a large variety of TODOs about errors
+There is a large variety of TODOs about errors.  Let's fix them!
 8 TODOs currently present.
 * TODO Standard library :ASM:VM:
 I should start considering this and how a user may use it.  Should it
 be an option in the VM and/or assembler binaries (i.e. a flag) or
@@ -65,6 +71,7 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
    at the start) so we'll know at start of assembler runtime where to
    resolve standard library subroutine calls
  + Virtual machine needs no changes to do this
 ** TODO Consider dynamic Linking
 + Dynamic linking: virtual machine has fixed program storage for
  library code (a ROM), and assembler makes jump references
  specifically for this program storage
@@ -74,13 +81,10 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
  + VM needs to manage a ROM of some kind for library code
  + How do we ensure assembled links to subroutine calls don't
    conflict with user code jumps?
-    + Possibility: most significant bit of a program address is
-      reserved such that if 0 it refers to user code and if 1 it
-      refers to library code
-    + 63 bit references user code (not a lot of loss in precision)
+
+What follows is a possible dynamic linking strategy.  It requires
+quite a few moving parts:
+
    + Easy to check if a reference is a library reference or a user
      code reference by checking "sign bit" (negativity)
 ** TODO Dynamic Linking
 The address operand of every program control instruction (~CALL~,
 ~JUMP~, ~JUMP.IF~) has a specific encoding if the standard library is
 dynamically linked:
@@ -88,12 +92,11 @@ dynamically linked:
  absolute address within the program
 + Otherwise, the address encodes a standard library subroutine.  The
  bits within the address follow this schema:
-  + The next 15 bits (7 from the most significant byte, then 8 from
-    the next byte) represent the specific module where the subroutine
-    is defined (over 32767 possible library values)
-  + The remaining 48 bits (6 bytes) encode the absolute program
-    address in the bytecode of that specific module for the start of
+  + The next 30 bits represent the specific module where the
+    subroutine is defined (over 1.07 *billion* possible library values)
+  + The remaining 33 bits (4 bytes + 1 bit) encode the absolute
+    program address in the bytecode of that specific module for the
+    start of the subroutine (about 8.59 *billion* values)
    the subroutine (over 281 *trillion* values)
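 The 1 + 30 + 33 bit layout above could be packed and unpacked roughly
 as follows.  This is only a sketch: every name in it is hypothetical,
 and it assumes the most significant bit is set for library addresses
 and clear for user code, as the earlier "sign bit" note suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field widths from the schema: 1 flag bit + 30 module bits + 33 offset bits = 64. */
#define LIB_FLAG    (UINT64_C(1) << 63)
#define MODULE_BITS 30
#define OFFSET_BITS 33
#define MODULE_MASK ((UINT64_C(1) << MODULE_BITS) - 1)
#define OFFSET_MASK ((UINT64_C(1) << OFFSET_BITS) - 1)

/* Hypothetical helper: pack a module index and an in-module address into
 * one 64-bit operand.  Out-of-range inputs are silently truncated. */
static inline uint64_t lib_address_encode(uint32_t module, uint64_t offset)
{
  return LIB_FLAG
         | (((uint64_t)module & MODULE_MASK) << OFFSET_BITS)
         | (offset & OFFSET_MASK);
}

/* Assumption: a set most significant bit marks a library reference. */
static inline bool address_is_library(uint64_t address)
{
  return (address & LIB_FLAG) != 0;
}

/* Extract the 30-bit module index from a library address. */
static inline uint32_t lib_address_module(uint64_t address)
{
  return (uint32_t)((address >> OFFSET_BITS) & MODULE_MASK);
}

/* Extract the 33-bit address within the module's bytecode. */
static inline uint64_t lib_address_offset(uint64_t address)
{
  return address & OFFSET_MASK;
}
```

 Under these assumptions, ~lib_address_encode(5, 0x1234)~ yields
 ~0x8000000A00001234~, and a plain user address such as ~0x1234~ is
 left untouched because its top bit is already clear.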
 The assembler will automatically encode this based on "%USE" calls and
 the name of the subroutines called.  On the virtual machine, there is
@@ -119,6 +122,13 @@ them into the registries at parse time for use in program parsing
 libraries are used (to pull definitions from) but also requires making
 macros "recognisable" in bytecode because they're essentially
 invisible).
 2024-04-15: Perhaps we could insert the linking information into the
 program header?
 1) A table which states the load order of certain modules would allow
   the runtime to selectively spin up and properly delegate module
   jumps to the right bytecode
 2)
 * Completed
 ** DONE Write a label/jump system :ASM:
 Essentially a user should be able to write arbitrary labels (maybe
@@ -154,8 +164,11 @@ There are two ways I can think of achieving this:
  + This simulates the notion of "calling" and "returning from" a
    function in classical languages, but requires more machinery on
    the VM side.
 2024-04-15: The latter option was chosen, though the former has been
 implemented through [[*Constants][Constants]].
 ** DONE Start points :ASM:VM:
-You know how in standard assembly you can write
+In standard assembly you can write
 #+begin_src asm
  global _start
 _start:
@@ -169,6 +182,10 @@ Proposed syntax:
 #+begin_src asm
  init <label>
 #+end_src
 2024-04-15: Used the same syntax as standard assembly, with the
 caveat that multiple ~global~ directives may be present but only the
 last one takes effect.
 ** DONE Constants
 Essentially a directive which assigns some literal to a symbol as a
 constant.  Something like
@@ -218,6 +235,9 @@ memory to use in the stack).
 2024-04-09: Found the ~hto_e~ functions under =endian.h= that convert
 shorts, half words and words between host byte order and a specific
 endianness, in both directions.  This will make conversion super simple.
 2024-04-15: Found it better to implement the functions myself as
 =endian.h= is not particularly portable.
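 A portable replacement for the =endian.h= helpers can be written with
 shifts alone, so it never needs to know the host byte order.  The
 sketch below is not the code from this repository: the function names
 are hypothetical, and it assumes a little-endian byte layout and a
 64-bit word purely for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: store a 64-bit word into buf, least significant
 * byte first.  Shifting works on values, so host endianness is irrelevant. */
static inline void word_to_le_bytes(uint64_t word, uint8_t buf[8])
{
  for (int i = 0; i < 8; ++i)
    buf[i] = (uint8_t)(word >> (8 * i));
}

/* Hypothetical helper: rebuild a 64-bit word from 8 little-endian bytes. */
static inline uint64_t word_from_le_bytes(const uint8_t buf[8])
{
  uint64_t word = 0;
  for (int i = 0; i < 8; ++i)
    word |= (uint64_t)buf[i] << (8 * i);
  return word;
}
```

 The same pattern shrinks to 2 or 4 iterations for the narrower types,
 and reversing the index gives a big-endian variant.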
 ** DONE Import another file
 Say I have two "asm" files: /a.asm/ and /b.asm/.