Annotate some completed todos in todo.org

2024-04-15 16:35:44 +06:30
parent e960af2904
commit d594c0c531


@@ -33,7 +33,7 @@ A call should look something like this:
 $name 1 2 3
 #+end_src
 and those tokens will be substituted literally in the macro body.
-* TODO Write assembler in a different language :ASM:
+* WIP Write assembler in a different language :ASM:
 While the runtime and base library needs to deal with only
 binary, the assembler has to deal with string inputs and a larger
 variety of bugs. As the base library is written in C, and is all that
@@ -45,8 +45,14 @@ Languages in the competition:
 + C++
 + Rust
 + Python
+2024-04-14: Chose C++ cos it will require the least effort to rewrite
+the currently existing codebase while still leveraging some less
+efficient but incredibly useful features.
 * TODO Introduce error handling in base library :LIB:
-There is a large variety of TODOs about errors
+There is a large variety of TODOs about errors. Let's fix them!
+8 TODOs currently present.
 * TODO Standard library :ASM:VM:
 I should start considering this and how a user may use it. Should it
 be an option in the VM and/or assembler binaries (i.e. a flag) or
@@ -65,6 +71,7 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
 at the start) so we'll know at start of assembler runtime where to
 resolve standard library subroutine calls
 + Virtual machine needs no changes to do this
+** TODO Consider dynamic Linking
 + Dynamic linking: virtual machine has fixed program storage for
 library code (a ROM), and assembler makes jump references
 specifically for this program storage
@@ -74,13 +81,10 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
 + VM needs to manage a ROM of some kind for library code
 + How do we ensure assembled links to subroutine calls don't
 conflict with user code jumps?
-+ Possibility: most significant bit of a program address is
-reserved such that if 0 it refers to user code and if 1 it
-refers to library code
-+ 63 bit references user code (not a lot of loss in precision)
-+ Easy to check if a reference is a library reference or a user
-code reference by checking "sign bit" (negativity)
-** TODO Dynamic Linking
+What follows is a possible dynamic linking strategy. It requires
+quite a few moving parts:
 The address operand of every program control instruction (~CALL~,
 ~JUMP~, ~JUMP.IF~) has a specific encoding if the standard library is
 dynamically linked:
@@ -88,12 +92,11 @@ dynamically linked:
 absolute address within the program
 + Otherwise, the address encodes a standard library subroutine. The
 bits within the address follow this schema:
-+ The next 15 bits (7 from the most significant byte, then 8 from
-the next byte) represent the specific module where the subroutine
-is defined (over 32767 possible library values)
-+ The remaining 48 bits (6 bytes) encode the absolute program
-address in the bytecode of that specific module for the start of
-the subroutine (over 281 *trillion* values)
++ The next 30 bits represent the specific module where the
+subroutine is defined (over 1.07 *billion* possible library values)
++ The remaining 33 bits (4 bytes + 1 bit) encode the absolute
+program address in the bytecode of that specific module for the
+start of the subroutine (over 8.60 *billion* values)
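The new 1 + 30 + 33 bit layout in this hunk can be sketched in C, the language of the base library. This is only an illustration under stated assumptions: it takes bit 63 as the flag that marks a library reference (as the removed "Possibility" note suggested), and every macro and function name below is hypothetical rather than taken from the codebase. For reference, 2^30 = 1,073,741,824 and 2^33 = 8,589,934,592.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of a 64-bit address operand (names are hypothetical):
 *   bit  63      : 1 = standard library reference, 0 = user program address
 *   bits 62..33  : module identifier (30 bits)
 *   bits 32..0   : absolute address inside that module's bytecode (33 bits)
 */
#define LIB_FLAG_BIT (UINT64_C(1) << 63)
#define MODULE_BITS  30
#define OFFSET_BITS  33
#define MODULE_MASK  ((UINT64_C(1) << MODULE_BITS) - 1)
#define OFFSET_MASK  ((UINT64_C(1) << OFFSET_BITS) - 1)

/* Pack a module identifier and an in-module address into one operand.
 * Values wider than their field are truncated by the masks. */
static uint64_t encode_lib_address(uint32_t module, uint64_t offset)
{
  return LIB_FLAG_BIT
         | (((uint64_t)module & MODULE_MASK) << OFFSET_BITS)
         | (offset & OFFSET_MASK);
}

/* True if the operand refers to library code rather than user code. */
static bool is_lib_address(uint64_t address)
{
  return (address & LIB_FLAG_BIT) != 0;
}

/* Extract the 30-bit module identifier from a library operand. */
static uint32_t lib_module(uint64_t address)
{
  return (uint32_t)((address >> OFFSET_BITS) & MODULE_MASK);
}

/* Extract the 33-bit in-module address from a library operand. */
static uint64_t lib_offset(uint64_t address)
{
  return address & OFFSET_MASK;
}
```

With these helpers, `encode_lib_address(3, 0x10)` produces an operand for which `is_lib_address` is true, `lib_module` gives 3 and `lib_offset` gives 0x10, while any operand with bit 63 clear is treated as a plain program address.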
 The assembler will automatically encode this based on "%USE" calls and
 the name of the subroutines called. On the virtual machine, there is
@@ -119,6 +122,13 @@ them into the registries at parse time for use in program parsing
 libraries are used (to pull definitions from) but also requires making
 macros "recognisable" in bytecode because they're essentially
 invisible).
+2024-04-15: Perhaps we could insert the linking information into the
+program header?
+1) A table which states the load order of certain modules would allow
+the runtime to selectively spin up and properly delegate module
+jumps to the right bytecode
+2)
 * Completed
 ** DONE Write a label/jump system :ASM:
 Essentially a user should be able to write arbitrary labels (maybe
@@ -154,8 +164,11 @@ There are two ways I can think of achieving this:
 + This simulates the notion of "calling" and "returning from" a
 function in classical languages, but requires more machinery on
 the VM side.
+2024-04-15: The latter option was chosen, though the former has been
+implemented through [[*Constants][Constants]].
 ** DONE Start points :ASM:VM:
-You know how in standard assembly you can write
+In standard assembly you can write
 #+begin_src asm
 global _start
 _start:
@@ -169,6 +182,10 @@ Proposed syntax:
 #+begin_src asm
 init <label>
 #+end_src
+2024-04-15: Used the same syntax as standard assembly, with the
+conceit that multiple ~global~'s may be present but only the last one
+has an effect.
 ** DONE Constants
 Essentially a directive which assigns some literal to a symbol as a
 constant. Something like
@@ -218,6 +235,9 @@ memory to use in the stack).
 2024-04-09: Found the ~hto_e~ functions under =endian.h= that provide
 both way host to specific endian conversion of shorts, half words and
 words. This will make it super simple to just convert.
+2024-04-15: Found it better to implement the functions myself as
+=endian.h= is not particularly portable.
 ** DONE Import another file
 Say I have two "asm" files: /a.asm/ and /b.asm/.