ovm/todo.org

#+title: TODOs
#+author: Aryadev Chavali
#+date: 2023-11-02

* TODO Standard library :ASM:VM:
I should start considering this and how a user may use it.  Should it
be an option in the VM and/or assembler binaries (i.e. a flag) or
something the user has to specify in their source files?

Something to consider is /static/ and /dynamic/ "linking" i.e.:
+ Static linking: assembler inserts all used library definitions into
  the bytecode output directly
  + We could insert all of it at the start of the bytecode file, and
    with [[*Start points][Start points]] this won't interfere with
    user code
    + 2023-11-03: Finishing the Start point feature has made these
      features more tenable.  A program header which is compiled and
      interpreted in bytecode works wonders.
  + Furthermore library code will have fixed program addresses (always
    at the start) so we'll know at start of assembler runtime where to
    resolve standard library subroutine calls
  + Virtual machine needs no changes to do this
+ Virtual machine has fixed program storage for library code, and
  assembler makes jump references specifically for this program
  storage (dynamic linking)
  + When assembling subroutine calls, just need to put references to
    this library storage (some kind of shared state between VM and
    assembler to know what these references are)
  + VM needs to manage a ROM of some kind for library code
  + How do we ensure assembled links to subroutine calls don't
    conflict with user code jumps?
    + Possibility: most significant bit of a program address is
      reserved such that if 0 it refers to user code and if 1 it
      refers to library code
    + 63 bit references user code (not a lot of loss in precision)
    + Easy to check if a reference is a library reference or a user
      code reference by checking "sign bit" (negativity)
* TODO Inlining subroutines :ASM:
Essentially consider the following situation
#+begin_src asm
  global _start
a:
  push.word 1
  push.word 1
  plus.word
  print.word
  ret
_start:
  call a
  halt
#+end_src
What I'd like to be able to do is:
#+begin_src asm
  global _start
%inline:a:
  push.word 1
  push.word 1
  plus.word
  print.word
  ret
%end
_start:
  %inline a
  halt
#+end_src
This is equivalent in LOC, but in terms of the actually output
instructions the second is smaller in size, doesn't use any jumps nor
does it use the call stack.  As a result it's a linear program.
* Completed
** DONE Write a label/jump system :ASM:
Essentially a user should be able to write arbitrary labels (maybe
through ~label x~ or ~x:~ syntax) which can be referred to by ~jump~.

It'll purely be on the assembler side as a processing step, where the
emitted bytecode purely refers to absolute addresses; the VM should
just be dealing with absolute addresses here.
** DONE Allow relative addresses in jumps :ASM:
As requested, a special syntax for relative address jumps.  Sometimes
it's a bit nicer than a label.
** DONE Calling and returning control flow :VM: :ASM:
When writing library code we won't know the addresses of where
callers are jumping from.  However, most library functions want to
return control flow back to where the user had called them: we want
the code to act almost like an inline function.

There are two ways I can think of achieving this:
+ Some extra syntax around labels (something like ~@inline <label>:~)
  which tells the assembly processor to inline the label when a "jump"
  to that label is given
  + This requires no changes to the VM, which keeps it simple, but a
    major change to the assembler to be able to inline code.  However,
    the work on writing a label system and relative addresses should
    provide some insight into how this could be possible.
+ A /call stack/ and two new syntactic constructs ~call~ and ~ret~
  which work like so:
  + When ~call <label>~ is encountered, the next program address is
    pushed onto the call stack and control flow is set to the label
  + During execution of the ~<label>~, when a ~ret~ is encountered,
    pop an address off the call stack and set control flow to that
    address
  + This simulates the notion of "calling" and "returning from" a
    function in classical languages, but requires more machinery on
    the VM side.
** DONE Start points :ASM:VM:
You know how in standard assembly you can write
#+begin_src asm
  global _start
_start:
  ...
#+end_src
and that means the label ~_start~ is the point the program should
start from.  This means the user can define other code anywhere in the
program and specify something similar to "main" in C programs.

Proposed syntax:
#+begin_src asm
  init <label>
#+end_src