Annotate some completed todos in todo.org
52
todo.org
@@ -33,7 +33,7 @@ A call should look something like this:
 $name 1 2 3
 #+end_src
 and those tokens will be substituted literally in the macro body.
-* TODO Write assembler in a different language :ASM:
+* WIP Write assembler in a different language :ASM:
 While the runtime and base library needs to deal with only
 binary, the assembler has to deal with string inputs and a larger
 variety of bugs. As the base library is written in C, and is all that
@@ -45,8 +45,14 @@ Languages in the competition:
 + C++
 + Rust
 + Python
+
+2024-04-14: Chose C++ cos it will require the least effort to rewrite
+the currently existing codebase while still leveraging some less
+efficient but incredibly useful features.
 * TODO Introduce error handling in base library :LIB:
-There is a large variety of TODOs about errors
+There is a large variety of TODOs about errors. Let's fix them!
+
+8 TODOs currently present.
 * TODO Standard library :ASM:VM:
 I should start considering this and how a user may use it. Should it
 be an option in the VM and/or assembler binaries (i.e. a flag) or
@@ -65,6 +71,7 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
 at the start) so we'll know at start of assembler runtime where to
 resolve standard library subroutine calls
 + Virtual machine needs no changes to do this
+** TODO Consider dynamic Linking
 + Dynamic linking: virtual machine has fixed program storage for
 library code (a ROM), and assembler makes jump references
 specifically for this program storage
@@ -74,13 +81,10 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
 + VM needs to manage a ROM of some kind for library code
 + How do we ensure assembled links to subroutine calls don't
 conflict with user code jumps?
-+ Possibility: most significant bit of a program address is
-reserved such that if 0 it refers to user code and if 1 it
-refers to library code
-+ 63 bit references user code (not a lot of loss in precision)
-+ Easy to check if a reference is a library reference or a user
-code reference by checking "sign bit" (negativity)
-** TODO Dynamic Linking
+
+What follows is a possible dynamic linking strategy. It requires
+quite a few moving parts:
+
 The address operand of every program control instruction (~CALL~,
 ~JUMP~, ~JUMP.IF~) has a specific encoding if the standard library is
 dynamically linked:
@@ -88,12 +92,11 @@ dynamically linked:
 absolute address within the program
 + Otherwise, the address encodes a standard library subroutine. The
 bits within the address follow this schema:
-+ The next 15 bits (7 from the most significant byte, then 8 from
-the next byte) represent the specific module where the subroutine
-is defined (over 32767 possible library values)
-+ The remaining 48 bits (6 bytes) encode the absolute program
-address in the bytecode of that specific module for the start of
-the subroutine (over 281 *trillion* values)
++ The next 30 bits represent the specific module where the
+subroutine is defined (over 1.07 *billion* possible library values)
++ The remaining 33 bits (4 bytes + 1 bit) encode the absolute
+program address in the bytecode of that specific module for the
+start of the subroutine (over 8.60 *billion* values)
 
 The assembler will automatically encode this based on "%USE" calls and
 the name of the subroutines called. On the virtual machine, there is
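The 1 + 30 + 33 bit schema in the hunk above could be packed and unpacked roughly as follows. This is only a sketch: every name in it is hypothetical, and treating a set most significant bit as the "library" flag is an assumption taken from the earlier "if 0 it refers to user code and if 1 it refers to library code" bullet. For reference, 2^30 = 1,073,741,824 and 2^33 = 8,589,934,592 (about 8.59 billion).

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout (names are illustrative, not from the codebase):
//   bit 63        : 1 = standard library reference, 0 = user code address
//   bits 62..33   : module identifier (30 bits)
//   bits 32..0    : absolute address inside that module's bytecode (33 bits)
const int OFFSET_BITS = 33;
const int MODULE_BITS = 30;
const std::uint64_t LIBRARY_FLAG = static_cast<std::uint64_t>(1) << 63;
const std::uint64_t OFFSET_MASK =
    (static_cast<std::uint64_t>(1) << OFFSET_BITS) - 1;
const std::uint64_t MODULE_MASK =
    (static_cast<std::uint64_t>(1) << MODULE_BITS) - 1;

struct LibraryRef {
  std::uint32_t module;  // which library module defines the subroutine
  std::uint64_t offset;  // start of the subroutine within that module
};

// Pack a module id and an in-module address into one 64-bit operand.
// Values that do not fit their field are truncated by the masks.
inline std::uint64_t encode_library_ref(std::uint32_t module,
                                        std::uint64_t offset) {
  return LIBRARY_FLAG |
         ((static_cast<std::uint64_t>(module) & MODULE_MASK) << OFFSET_BITS) |
         (offset & OFFSET_MASK);
}

// True when the operand refers to library code rather than user code.
inline bool is_library_ref(std::uint64_t address) {
  return (address & LIBRARY_FLAG) != 0;
}

// Split a library operand back into its module id and in-module address.
inline LibraryRef decode_library_ref(std::uint64_t address) {
  LibraryRef ref;
  ref.module =
      static_cast<std::uint32_t>((address >> OFFSET_BITS) & MODULE_MASK);
  ref.offset = address & OFFSET_MASK;
  return ref;
}
```

With this layout a plain user address such as 1234 passes through untouched, and the VM only needs ~is_library_ref~ to decide whether to dispatch into a module.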
@@ -119,6 +122,13 @@ them into the registries at parse time for use in program parsing
 libraries are used (to pull definitions from) but also requires making
 macros "recognisable" in bytecode because they're essentially
 invisible).
+
+2024-04-15: Perhaps we could insert the linking information into the
+program header?
+1) A table which states the load order of certain modules would allow
+the runtime to selectively spin up and properly delegate module
+jumps to the right bytecode
+2)
 * Completed
 ** DONE Write a label/jump system :ASM:
 Essentially a user should be able to write arbitrary labels (maybe
@@ -154,8 +164,11 @@ There are two ways I can think of achieving this:
 + This simulates the notion of "calling" and "returning from" a
 function in classical languages, but requires more machinery on
 the VM side.
+
+2024-04-15: The latter option was chosen, though the former has been
+implemented through [[*Constants][Constants]].
 ** DONE Start points :ASM:VM:
-You know how in standard assembly you can write
+In standard assembly you can write
 #+begin_src asm
 global _start
 _start:
@@ -169,6 +182,10 @@ Proposed syntax:
 #+begin_src asm
 init <label>
 #+end_src
+
+2024-04-15: Used the same syntax as standard assembly, with the
+conceit that multiple ~global~'s may be present but only the last one
+has an effect.
 ** DONE Constants
 Essentially a directive which assigns some literal to a symbol as a
 constant. Something like
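The "only the last ~global~ has an effect" rule from the 2024-04-15 note in the hunk above might be resolved along these lines. The ~Directive~ type and ~resolve_start_label~ function are hypothetical stand-ins for illustration, not taken from the assembler.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical parsed directive: a keyword plus its single operand.
struct Directive {
  std::string name;     // e.g. "global"
  std::string operand;  // e.g. "_start"
};

// Return the label named by the last `global` directive in source order,
// or an empty string when the program declares no start point.
inline std::string resolve_start_label(
    const std::vector<Directive>& directives) {
  std::string start;
  for (const Directive& d : directives) {
    if (d.name == "global") {
      start = d.operand;  // a later `global` overrides any earlier one
    }
  }
  return start;
}
```

A single forward pass is enough because each later ~global~ simply overwrites the earlier choice.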
@@ -218,6 +235,9 @@ memory to use in the stack).
 2024-04-09: Found the ~hto_e~ functions under =endian.h= that provide
 both way host to specific endian conversion of shorts, half words and
 words. This will make it super simple to just convert.
+
+2024-04-15: Found it better to implement the functions myself as
+=endian.h= is not particularly portable.
 ** DONE Import another file
 Say I have two "asm" files: /a.asm/ and /b.asm/.
 
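For the 2024-04-15 endianness note in the hunk above, one common way to avoid =endian.h= is to build the byte order with shifts, which behaves the same on any host. The sketch below assumes a little-endian target format and 64-bit words purely for illustration; the function names are hypothetical and the diff does not state which byte order the bytecode uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Store a 64-bit word into `out` as 8 little-endian bytes.
// Shifts operate on values, not memory, so host byte order is irrelevant.
inline void word_to_le_bytes(std::uint64_t word, unsigned char out[8]) {
  for (std::size_t i = 0; i < 8; ++i) {
    out[i] = static_cast<unsigned char>((word >> (8 * i)) & 0xFF);
  }
}

// Rebuild a 64-bit word from 8 little-endian bytes.
inline std::uint64_t word_from_le_bytes(const unsigned char in[8]) {
  std::uint64_t word = 0;
  for (std::size_t i = 0; i < 8; ++i) {
    word |= static_cast<std::uint64_t>(in[i]) << (8 * i);
  }
  return word;
}
```

The same pattern shrinks to 2 or 4 iterations for shorts and half words.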