Annotate some completed todos in todo.org
52
todo.org
@@ -33,7 +33,7 @@ A call should look something like this:
|
|||||||
$name 1 2 3
|
$name 1 2 3
|
||||||
#+end_src
|
#+end_src
|
||||||
and those tokens will be substituted literally in the macro body.
|
and those tokens will be substituted literally in the macro body.
|
||||||
* TODO Write assembler in a different language :ASM:
|
* WIP Write assembler in a different language :ASM:
|
||||||
While the runtime and base library needs to deal with only
|
While the runtime and base library needs to deal with only
|
||||||
binary, the assembler has to deal with string inputs and a larger
|
binary, the assembler has to deal with string inputs and a larger
|
||||||
variety of bugs. As the base library is written in C, and is all that
|
variety of bugs. As the base library is written in C, and is all that
|
||||||
@@ -45,8 +45,14 @@ Languages in the competition:
|
|||||||
+ C++
|
+ C++
|
||||||
+ Rust
|
+ Rust
|
||||||
+ Python
|
+ Python
|
||||||
|
|
||||||
|
2024-04-14: Chose C++ cos it will require the least effort to rewrite
|
||||||
|
the currently existing codebase while still leveraging some less
|
||||||
|
efficient but incredibly useful features.
|
||||||
* TODO Introduce error handling in base library :LIB:
|
* TODO Introduce error handling in base library :LIB:
|
||||||
There is a large variety of TODOs about errors
|
There is a large variety of TODOs about errors. Let's fix them!
|
||||||
|
|
||||||
|
8 TODOs currently present.
|
||||||
* TODO Standard library :ASM:VM:
|
* TODO Standard library :ASM:VM:
|
||||||
I should start considering this and how a user may use it. Should it
|
I should start considering this and how a user may use it. Should it
|
||||||
be an option in the VM and/or assembler binaries (i.e. a flag) or
|
be an option in the VM and/or assembler binaries (i.e. a flag) or
|
||||||
@@ -65,6 +71,7 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
|
|||||||
at the start) so we'll know at start of assembler runtime where to
|
at the start) so we'll know at start of assembler runtime where to
|
||||||
resolve standard library subroutine calls
|
resolve standard library subroutine calls
|
||||||
+ Virtual machine needs no changes to do this
|
+ Virtual machine needs no changes to do this
|
||||||
|
** TODO Consider dynamic Linking
|
||||||
+ Dynamic linking: virtual machine has fixed program storage for
|
+ Dynamic linking: virtual machine has fixed program storage for
|
||||||
library code (a ROM), and assembler makes jump references
|
library code (a ROM), and assembler makes jump references
|
||||||
specifically for this program storage
|
specifically for this program storage
|
||||||
@@ -74,13 +81,10 @@ Something to consider is /static/ and /dynamic/ "linking" i.e.:
|
|||||||
+ VM needs to manage a ROM of some kind for library code
|
+ VM needs to manage a ROM of some kind for library code
|
||||||
+ How do we ensure assembled links to subroutine calls don't
|
+ How do we ensure assembled links to subroutine calls don't
|
||||||
conflict with user code jumps?
|
conflict with user code jumps?
|
||||||
+ Possibility: most significant bit of a program address is
|
|
||||||
reserved such that if 0 it refers to user code and if 1 it
|
What follows is a possible dynamic linking strategy. It requires
|
||||||
refers to library code
|
quite a few moving parts:
|
||||||
+ 63 bit references user code (not a lot of loss in precision)
|
|
||||||
+ Easy to check if a reference is a library reference or a user
|
|
||||||
code reference by checking "sign bit" (negativity)
|
|
||||||
** TODO Dynamic Linking
|
|
||||||
The address operand of every program control instruction (~CALL~,
|
The address operand of every program control instruction (~CALL~,
|
||||||
~JUMP~, ~JUMP.IF~) has a specific encoding if the standard library is
|
~JUMP~, ~JUMP.IF~) has a specific encoding if the standard library is
|
||||||
dynamically linked:
|
dynamically linked:
|
||||||
@@ -88,12 +92,11 @@ dynamically linked:
|
|||||||
absolute address within the program
|
absolute address within the program
|
||||||
+ Otherwise, the address encodes a standard library subroutine. The
|
+ Otherwise, the address encodes a standard library subroutine. The
|
||||||
bits within the address follow this schema:
|
bits within the address follow this schema:
|
||||||
+ The next 15 bits (7 from the most significant byte, then 8 from
|
+ The next 30 bits represent the specific module where the
|
||||||
the next byte) represent the specific module where the subroutine
|
subroutine is defined (over 1.07 *billion* possible library values)
|
||||||
is defined (over 32767 possible library values)
|
+ The remaining 33 bits (4 bytes + 1 bit) encode the absolute
|
||||||
+ The remaining 48 bits (6 bytes) encode the absolute program
|
program address in the bytecode of that specific module for the
|
||||||
address in the bytecode of that specific module for the start of
|
start of the subroutine (about 8.59 *billion* values)
|
||||||
the subroutine (over 281 *trillion* values)
|
|
||||||
|
|
||||||
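A minimal sketch of the revised schema (30-bit module index, 33-bit address), in C++. It assumes the most significant bit set to 1 marks a library reference, as in the earlier "sign bit" note. All names here (~LIBRARY_FLAG~, ~LibraryRef~, ~encode_library_ref~, ~decode_library_ref~, ~is_library_ref~) are hypothetical and not taken from the codebase.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout of a 64-bit address operand (hypothetical names):
//   bit 63       : 1 = standard library reference, 0 = user code address
//   bits 62..33  : module index (30 bits)
//   bits 32..0   : absolute address within that module's bytecode (33 bits)
constexpr unsigned MODULE_BITS = 30;
constexpr unsigned OFFSET_BITS = 33;
constexpr std::uint64_t LIBRARY_FLAG = 1ULL << 63;
constexpr std::uint64_t MODULE_MASK = (1ULL << MODULE_BITS) - 1;
constexpr std::uint64_t OFFSET_MASK = (1ULL << OFFSET_BITS) - 1;

static_assert(1 + MODULE_BITS + OFFSET_BITS == 64,
              "flag, module and offset must fill one 64-bit word");

struct LibraryRef {
  std::uint32_t module;  // which library module the subroutine lives in
  std::uint64_t offset;  // start of the subroutine inside that module
};

// True if the operand refers to library code rather than user code.
inline bool is_library_ref(std::uint64_t operand) {
  return (operand & LIBRARY_FLAG) != 0;
}

// Pack a module index and an in-module address into one operand.
inline std::uint64_t encode_library_ref(std::uint32_t module,
                                        std::uint64_t offset) {
  assert(module <= MODULE_MASK);
  assert(offset <= OFFSET_MASK);
  return LIBRARY_FLAG |
         (static_cast<std::uint64_t>(module) << OFFSET_BITS) | offset;
}

// Unpack an operand that is already known to be a library reference.
inline LibraryRef decode_library_ref(std::uint64_t operand) {
  assert(is_library_ref(operand));
  return LibraryRef{
      static_cast<std::uint32_t>((operand >> OFFSET_BITS) & MODULE_MASK),
      operand & OFFSET_MASK};
}
```

With this layout the VM could branch on ~is_library_ref~ before dispatching a ~CALL~ or ~JUMP~, and user code addresses pass through unchanged as long as they stay below 2^63.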
The assembler will automatically encode this based on "%USE" calls and
|
The assembler will automatically encode this based on "%USE" calls and
|
||||||
the name of the subroutines called. On the virtual machine, there is
|
the name of the subroutines called. On the virtual machine, there is
|
||||||
@@ -119,6 +122,13 @@ them into the registries at parse time for use in program parsing
|
|||||||
libraries are used (to pull definitions from) but also requires making
|
libraries are used (to pull definitions from) but also requires making
|
||||||
macros "recognisable" in bytecode because they're essentially
|
macros "recognisable" in bytecode because they're essentially
|
||||||
invisible).
|
invisible).
|
||||||
|
|
||||||
|
2024-04-15: Perhaps we could insert the linking information into the
|
||||||
|
program header?
|
||||||
|
1) A table which states the load order of certain modules would allow
|
||||||
|
the runtime to selectively spin up and properly delegate module
|
||||||
|
jumps to the right bytecode
|
||||||
|
2)
|
||||||
* Completed
|
* Completed
|
||||||
** DONE Write a label/jump system :ASM:
|
** DONE Write a label/jump system :ASM:
|
||||||
Essentially a user should be able to write arbitrary labels (maybe
|
Essentially a user should be able to write arbitrary labels (maybe
|
||||||
@@ -154,8 +164,11 @@ There are two ways I can think of achieving this:
|
|||||||
+ This simulates the notion of "calling" and "returning from" a
|
+ This simulates the notion of "calling" and "returning from" a
|
||||||
function in classical languages, but requires more machinery on
|
function in classical languages, but requires more machinery on
|
||||||
the VM side.
|
the VM side.
|
||||||
|
|
||||||
|
2024-04-15: The latter option was chosen, though the former has been
|
||||||
|
implemented through [[*Constants][Constants]].
|
||||||
** DONE Start points :ASM:VM:
|
** DONE Start points :ASM:VM:
|
||||||
You know how in standard assembly you can write
|
In standard assembly you can write
|
||||||
#+begin_src asm
|
#+begin_src asm
|
||||||
global _start
|
global _start
|
||||||
_start:
|
_start:
|
||||||
@@ -169,6 +182,10 @@ Proposed syntax:
|
|||||||
#+begin_src asm
|
#+begin_src asm
|
||||||
init <label>
|
init <label>
|
||||||
#+end_src
|
#+end_src
|
||||||
|
|
||||||
|
2024-04-15: Used the same syntax as standard assembly, with the
|
||||||
|
conceit that multiple ~global~'s may be present but only the last one
|
||||||
|
has an effect.
|
||||||
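The "last ~global~ wins" rule could be resolved along these lines. This is only a sketch: the ~Directive~ record and ~resolve_start_label~ are hypothetical and do not reflect how the assembler actually stores its parse results.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical parsed directive: a name (e.g. "global") and one argument.
struct Directive {
  std::string name;
  std::string argument;
};

// Returns the label named by the last `global` directive, or an empty
// string if the program contains none. Earlier `global`s are overridden.
inline std::string resolve_start_label(
    const std::vector<Directive>& directives) {
  std::string start;
  for (const Directive& d : directives) {
    if (d.name == "global") {
      assert(!d.argument.empty());  // a `global` must name a label
      start = d.argument;           // later directives replace earlier ones
    }
  }
  return start;
}
```

Scanning the whole list rather than stopping at the first match is what gives the last directive precedence.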
** DONE Constants
|
** DONE Constants
|
||||||
Essentially a directive which assigns some literal to a symbol as a
|
Essentially a directive which assigns some literal to a symbol as a
|
||||||
constant. Something like
|
constant. Something like
|
||||||
@@ -218,6 +235,9 @@ memory to use in the stack).
|
|||||||
2024-04-09: Found the ~hto_e~ functions under =endian.h= that provide
|
2024-04-09: Found the ~hto_e~ functions under =endian.h= that provide
|
||||||
both way host to specific endian conversion of shorts, half words and
|
both way host to specific endian conversion of shorts, half words and
|
||||||
words. This will make it super simple to just convert.
|
words. This will make it super simple to just convert.
|
||||||
|
|
||||||
|
2024-04-15: Found it better to implement the functions myself as
|
||||||
|
=endian.h= is not particularly portable.
|
||||||
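One portable way to write such conversions by hand is to build the bytes with shifts, which gives the same result on any host. The sketch below assumes a little-endian byte order for the bytecode and a 64-bit word; neither is confirmed here, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Store a 64-bit word into `out` as 8 bytes, least significant byte first.
// Shifts operate on values, not memory, so host endianness does not matter.
inline void word_to_le_bytes(std::uint64_t word, std::uint8_t* out) {
  assert(out != nullptr);
  for (std::size_t i = 0; i < 8; ++i) {
    out[i] = static_cast<std::uint8_t>(word >> (8 * i));
  }
}

// Rebuild a 64-bit word from 8 bytes stored least significant byte first.
inline std::uint64_t word_from_le_bytes(const std::uint8_t* in) {
  assert(in != nullptr);
  std::uint64_t word = 0;
  for (std::size_t i = 0; i < 8; ++i) {
    word |= static_cast<std::uint64_t>(in[i]) << (8 * i);
  }
  return word;
}
```

The same pattern with 2 or 4 iterations would cover shorts and half words, and reversing the index gives a big-endian variant.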
** DONE Import another file
|
** DONE Import another file
|
||||||
Say I have two "asm" files: /a.asm/ and /b.asm/.
|
Say I have two "asm" files: /a.asm/ and /b.asm/.
|
||||||
|
|
||||||
|
|||||||