#+title: TODOs
#+author: Aryadev Chavali
#+date: 2023-11-02

* TODO Standard library :ASM:VM:
I should start considering this and how a user may use it. Should it
be an option in the VM and/or assembler binaries (i.e. a flag) or
something the user has to specify in their source files?

Something to consider is /static/ and /dynamic/ "linking", i.e.:
+ Static linking: assembler inserts all used library definitions into
  the bytecode output directly
  + We could insert all of it at the start of the bytecode file, and
    with [[*Start points][Start points]] this won't interfere with
    user code
    + 2023-11-03: Finishing the Start point feature has made these
      features more tenable. A program header which is compiled and
      interpreted in bytecode works wonders.
  + Furthermore library code will have fixed program addresses (always
    at the start) so we'll know at start of assembler runtime where to
    resolve standard library subroutine calls
  + Virtual machine needs no changes to do this
+ Dynamic linking: virtual machine has fixed program storage for
  library code, and assembler makes jump references specifically for
  this program storage
  + When assembling subroutine calls, just need to put references to
    this library storage (some kind of shared state between VM and
    assembler to know what these references are)
  + VM needs to manage a ROM of some kind for library code
  + How do we ensure assembled links to subroutine calls don't
    conflict with user code jumps?
    + Possibility: most significant bit of a program address is
      reserved such that if 0 it refers to user code and if 1 it
      refers to library code
      + 63 bits remain for user code references (not a lot of loss
        in address range)
      + Easy to check if a reference is a library reference or a user
        code reference by checking "sign bit" (negativity)
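A minimal sketch of the reserved-bit idea above, assuming 64-bit
program addresses. All names here (~LIBRARY_BIT~, ~library_ref~,
~resolve~ and so on) are hypothetical and only illustrate the
encoding; they are not taken from the VM or the assembler.

#+begin_src python
WORD_BITS = 64
LIBRARY_BIT = 1 << (WORD_BITS - 1)  # most significant bit of an address
OFFSET_MASK = LIBRARY_BIT - 1       # the remaining 63 bits


def user_ref(offset):
    """Encode an offset into user code (top bit clear)."""
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset does not fit in 63 bits")
    return offset


def library_ref(offset):
    """Encode an offset into library storage (top bit set)."""
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset does not fit in 63 bits")
    return LIBRARY_BIT | offset


def as_signed(address):
    """Reinterpret a 64-bit address as a two's complement integer."""
    return address - (1 << WORD_BITS) if address & LIBRARY_BIT else address


def resolve(address):
    """Split an address into (space, offset) by testing the top bit."""
    space = "library" if address & LIBRARY_BIT else "user"
    return space, address & OFFSET_MASK
#+end_src

With this encoding ~as_signed(library_ref(n))~ is always negative,
which is the "sign bit" check mentioned above.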
* TODO Inlining subroutines :ASM:
Essentially consider the following situation:
#+begin_src asm
global _start
a:
  push.word 1
  push.word 1
  plus.word
  print.word
  ret
_start:
  call a
  halt
#+end_src
What I'd like to be able to do is:
#+begin_src asm
global _start
%inline:a:
  push.word 1
  push.word 1
  plus.word
  print.word
  ret
%end
_start:
  %inline a
  halt
#+end_src
This is about the same in LOC, but in terms of the instructions
actually emitted the second is smaller, doesn't use any jumps, and
doesn't touch the call stack. As a result it's a linear program.
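A rough sketch of how an assembler pre-pass could expand the proposed
~%inline~ syntax, working on source lines as plain text. It makes
three assumptions that are not settled anywhere in these notes: an
inline block must be defined before it is used, a trailing ~ret~ in
the block is dropped on expansion (otherwise the expanded code would
still pop the call stack), and nested ~%inline~ uses inside a block
are not expanded. The function name ~expand_inlines~ is hypothetical.

#+begin_src python
def expand_inlines(lines):
    """Replace each `%inline NAME` with the body of `%inline:NAME:`."""
    bodies = {}      # inline block name -> list of body lines
    output = []
    current = None   # name of the block being collected, if any
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if current is not None:
            if line == "%end":
                # Assumption: drop a trailing `ret`, since the expanded
                # code is never entered through `call`.
                if bodies[current] and bodies[current][-1] == "ret":
                    bodies[current].pop()
                current = None
            else:
                bodies[current].append(line)
        elif line.startswith("%inline:") and line.endswith(":"):
            current = line[len("%inline:"):-1]
            bodies[current] = []
        elif line.startswith("%inline "):
            name = line.split(None, 1)[1]
            if name not in bodies:
                raise KeyError("inline block not defined: " + name)
            output.extend(bodies[name])
        else:
            output.append(line)
    return output


program = [
    "global _start",
    "%inline:a:",
    "  push.word 1",
    "  push.word 1",
    "  plus.word",
    "  print.word",
    "  ret",
    "%end",
    "_start:",
    "  %inline a",
    "  halt",
]
expanded = expand_inlines(program)
#+end_src

For the example program, ~expanded~ holds the ~global~ line, the
~_start:~ label, the four body instructions and ~halt~, with no
~call~ or ~ret~ left.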
* Completed
** DONE Write a label/jump system :ASM:
Essentially a user should be able to write arbitrary labels (maybe
through ~label x~ or ~x:~ syntax) which can be referred to by ~jump~.

It'll purely be on the assembler side as a processing step: the
emitted bytecode only refers to absolute addresses, so the VM should
just be dealing with absolute addresses here.
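The processing step described above can be sketched as a two-pass
resolver. This is only an illustration: it assumes the ~x:~ label
syntax, treats every instruction as occupying exactly one address
(real bytecode sizes will differ), and the name ~resolve_labels~ is
hypothetical.

#+begin_src python
def resolve_labels(lines):
    """Rewrite `jump LABEL` into `jump ADDRESS` using two passes."""
    labels = {}
    instructions = []
    # Pass 1: strip label definitions and record the address of the
    # instruction that follows each one.
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        else:
            instructions.append(line)
    # Pass 2: substitute absolute addresses for label operands.
    resolved = []
    for inst in instructions:
        op, *args = inst.split()
        if op == "jump" and args and args[0] in labels:
            args[0] = str(labels[args[0]])
        resolved.append(" ".join([op] + args))
    return resolved


program = [
    "loop:",
    "  push.word 1",
    "  print.word",
    "  jump end",
    "  jump loop",
    "end:",
    "  halt",
]
resolved = resolve_labels(program)
#+end_src

Collecting every label before rewriting any jump is what lets
~jump end~ refer to a label defined later in the file.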
** DONE Allow relative addresses in jumps :ASM:
As requested, a special syntax for relative address jumps. Sometimes
it's a bit nicer than a label.
** DONE Calling and returning control flow :VM:ASM:
When writing library code we won't know the addresses callers are
jumping from. However, most library functions want to return control
flow back to where the user had called them: we want the code to act
almost like an inline function.

There are two ways I can think of achieving this:
+ Some extra syntax around labels (something like ~@inline <label>:~)
  which tells the assembly processor to inline the label when a "jump"
  to that label is given
  + This requires no changes to the VM, which keeps it simple, but a
    major change to the assembler to be able to inline code. However,
    the work on writing a label system and relative addresses should
    provide some insight into how this could be possible.
+ A /call stack/ and two new syntactic constructs ~call~ and ~ret~
  which work like so:
  + When ~call <label>~ is encountered, the next program address is
    pushed onto the call stack and control flow is set to the label
  + During execution of the ~<label>~, when a ~ret~ is encountered,
    pop an address off the call stack and set control flow to that
    address
  + This simulates the notion of "calling" and "returning from" a
    function in classical languages, but requires more machinery on
    the VM side.
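The ~call~ / ~ret~ behaviour described in the second option can be
sketched with a toy interpreter loop. This is not the VM's actual
implementation: instructions are modelled as ~(opcode, operand)~
pairs, labels are assumed to be resolved to addresses already, and
~nop~ is a hypothetical placeholder for ordinary work.

#+begin_src python
def run(program, start=0):
    """Execute the program and return the addresses visited in order."""
    call_stack = []
    trace = []
    pc = start
    while True:
        op, arg = program[pc]
        trace.append(pc)
        if op == "call":
            call_stack.append(pc + 1)  # return to the next instruction
            pc = arg                   # control flow moves to the label
        elif op == "ret":
            pc = call_stack.pop()      # resume after the matching call
        elif op == "halt":
            return trace
        else:
            pc += 1                    # any other opcode falls through


program = [
    ("nop", None),   # 0: body of subroutine a
    ("ret", None),   # 1: return to the caller
    ("call", 0),     # 2: _start calls a
    ("halt", None),  # 3
]
trace = run(program, start=2)
#+end_src

Starting at address 2, the trace is 2, 0, 1, 3: the subroutine does
not need to know where it was called from, only the call stack does.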
** DONE Start points :ASM:VM:
You know how in standard assembly you can write
#+begin_src asm
global _start
_start:
  ...
#+end_src
and that means the label ~_start~ is the point the program should
start from. This means the user can define other code anywhere in the
program and specify something similar to "main" in C programs.

Proposed syntax:
#+begin_src asm
init <label>
#+end_src
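One way a start point could be carried in the bytecode is a small
program header holding the start address, which the VM reads before
executing anything. The sketch below is an assumption about layout
only (a single little-endian 64-bit address in front of the
instruction bytes); ~assemble~, ~load~ and ~HEADER~ are hypothetical
names, not the real file format.

#+begin_src python
import struct

# Assumed header: one unsigned little-endian 64-bit start address.
HEADER = struct.Struct("<Q")


def assemble(instruction_bytes, labels, start_label):
    """Prefix the instruction bytes with the resolved start address."""
    return HEADER.pack(labels[start_label]) + bytes(instruction_bytes)


def load(bytecode):
    """Return (start address, instruction bytes) from a bytecode blob."""
    (start,) = HEADER.unpack_from(bytecode, 0)
    return start, bytecode[HEADER.size:]


labels = {"a": 0, "_start": 2}
bytecode = assemble([10, 11, 12, 13], labels, "_start")
start, body = load(bytecode)
#+end_src

Here ~start~ is 2, so execution would begin at ~_start~ even though
other code sits before it in the file.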