-rw-r--r-- | todo.org | 52 |
1 files changed, 36 insertions, 16 deletions
@@ -33,7 +33,7 @@ A call should look something like this:
 $name 1 2 3
 #+end_src
 and those tokens will be substituted literally in the macro body.
-* TODO Write assembler in a different language :ASM:
+* WIP Write assembler in a different language :ASM:
 While the runtime and base library needs to deal with only binary, the
 assembler has to deal with string inputs and a larger variety of bugs.
 As the base library is written in C, and is all that
@@ -45,8 +45,14 @@ Languages in the competition:
 + C++
 + Rust
 + Python
+
+2024-04-14: Chose C++ cos it will require the least effort to rewrite
+the currently existing codebase while still leveraging some less
+efficient but incredibly useful features.
 * TODO Introduce error handling in base library :LIB:
-There is a large variety of TODOs about errors
+There is a large variety of TODOs about errors. Let's fix them!
+
+8 TODOs currently present.
 * TODO Standard library :ASM:VM:
 I should start considering this and how a user may use it. Should it
 be an option in the VM and/or assembler binaries (i.e. a flag) or
@@ -65,6 +71,7 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
 at the start) so we'll know at start of assembler runtime where to
 resolve standard library subroutine calls
 + Virtual machine needs no changes to do this
+** TODO Consider dynamic Linking
 + Dynamic linking: virtual machine has fixed program storage for
 library code (a ROM), and assembler makes jump references
 specifically for this program storage
@@ -74,13 +81,10 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
 + VM needs to manage a ROM of some kind for library code
 + How do we ensure assembled links to subroutine calls don't
 conflict with user code jumps?
- + Possibility: most significant bit of a program address is
- reserved such that if 0 it refers to user code and if 1 it
- refers to library code
- + 63 bit references user code (not a lot of loss in precision)
- + Easy to check if a reference is a library reference or a user
- code reference by checking "sign bit" (negativity)
-** TODO Dynamic Linking
+
+What follows is a possible dynamic linking strategy. It requires
+quite a few moving parts:
+
 The address operand of every program control instruction (~CALL~,
 ~JUMP~, ~JUMP.IF~) has a specific encoding if the standard library is
 dynamically linked:
@@ -88,12 +92,11 @@ dynamically linked:
 absolute address within the program
 + Otherwise, the address encodes a standard library subroutine. The
 bits within the address follow this schema:
- + The next 15 bits (7 from the most significant byte, then 8 from
- the next byte) represent the specific module where the subroutine
- is defined (over 32767 possible library values)
- + The remaining 48 bits (6 bytes) encode the absolute program
- address in the bytecode of that specific module for the start of
- the subroutine (over 281 *trillion* values)
+ + The next 30 bits represent the specific module where the
+ subroutine is defined (over 1.07 *billion* possible library values)
+ + The remaining 33 bits (4 bytes + 1 bit) encode the absolute
+ program address in the bytecode of that specific module for the
+ start of the subroutine (over 8.60 *billion* values)
 
 The assembler will automatically encode this based on "%USE" calls and
 the name of the subroutines called. On the virtual machine, there is
@@ -119,6 +122,13 @@ them into the registries at parse time for use in program parsing
 libraries are used (to pull definitions from) but also requires
 making macros "recognisable" in bytecode because they're essentially
 invisible).
+
+2024-04-15: Perhaps we could insert the linking information into the
+program header?
+1) A table which states the load order of certain modules would allow
+ the runtime to selectively spin up and properly delegate module
+ jumps to the right bytecode
+2)
 * Completed
 ** DONE Write a label/jump system :ASM:
 Essentially a user should be able to write arbitrary labels (maybe
@@ -154,8 +164,11 @@ There are two ways I can think of achieving this:
 + This simulates the notion of "calling" and "returning from" a
 function in classical languages, but requires more machinery on the
 VM side.
+
+2024-04-15: The latter option was chosen, though the former has been
+implemented through [[*Constants][Constants]].
 ** DONE Start points :ASM:VM:
-You know how in standard assembly you can write
+In standard assembly you can write
 #+begin_src asm
 global _start
 _start:
@@ -169,6 +182,10 @@ Proposed syntax:
 #+begin_src asm
 init <label>
 #+end_src
+
+2024-04-15: Used the same syntax as standard assembly, with the
+conceit that multiple ~global~'s may be present but only the last one
+has an effect.
 ** DONE Constants
 Essentially a directive which assigns some literal to a symbol as a
 constant. Something like
@@ -218,6 +235,9 @@ memory to use in the stack).
 2024-04-09: Found the ~hto_e~ functions under =endian.h= that provide
 both way host to specific endian conversion of shorts, half words and
 words. This will make it super simple to just convert.
+
+2024-04-15: Found it better to implement the functions myself as
+=endian.h= is not particularly portable.
 ** DONE Import another file
 Say I have two "asm" files: /a.asm/ and /b.asm/.
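
Note on the dynamic linking hunk: the new 64-bit operand layout (30 module bits, then 33 address bits) can be sketched in C++ as below. This is only an illustration, not code from the repository. It assumes the most significant bit is the "library reference" flag, as in the removed "sign bit" bullet, and every name in it (~LIB_FLAG~, ~encode_lib_ref~, ~is_lib_ref~, ~lib_module~, ~lib_address~) is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout of a 64-bit address operand (hypothetical names):
//   bit 63      : 1 = standard library reference, 0 = user code address
//   bits 62..33 : module index (30 bits, 2^30 = 1,073,741,824 values)
//   bits 32..0  : address inside that module (33 bits, 2^33 = 8,589,934,592 values)
constexpr unsigned MODULE_BITS = 30;
constexpr unsigned ADDR_BITS = 33;
static_assert(1 + MODULE_BITS + ADDR_BITS == 64, "fields must fill the operand");

constexpr std::uint64_t LIB_FLAG = 1ULL << 63;
constexpr std::uint64_t MODULE_MASK = (1ULL << MODULE_BITS) - 1;
constexpr std::uint64_t ADDR_MASK = (1ULL << ADDR_BITS) - 1;

// Pack a module index and an in-module address into one operand.
// Out-of-range inputs are truncated to their field width.
constexpr std::uint64_t encode_lib_ref(std::uint64_t module, std::uint64_t addr)
{
  return LIB_FLAG | ((module & MODULE_MASK) << ADDR_BITS) | (addr & ADDR_MASK);
}

// True if the operand refers to library code rather than user code.
constexpr bool is_lib_ref(std::uint64_t operand)
{
  return (operand & LIB_FLAG) != 0;
}

// Extract the module index from a library reference.
constexpr std::uint64_t lib_module(std::uint64_t operand)
{
  return (operand >> ADDR_BITS) & MODULE_MASK;
}

// Extract the in-module address from a library reference.
constexpr std::uint64_t lib_address(std::uint64_t operand)
{
  return operand & ADDR_MASK;
}
```

Under these assumptions the assembler would call something like ~encode_lib_ref~ when resolving a "%USE"d subroutine, and the VM would check ~is_lib_ref~ on each ~CALL~, ~JUMP~ or ~JUMP.IF~ operand before deciding which bytecode to jump into.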