Instead of using endian.h that is not portable AND doesn't work with
C++, I'll just write my own using a forced union based type punning
trick.
I've decided to use little endian for the format as well: it seems to
be used by most desktop computers so it should make these functions
faster to run for most CPUs.
This means if I write the new assembler in another language I only
need to FFI this header rather than all the functions as well which
may not be as useful.
Have to define _DEFAULT_SOURCE before you can use the endian
conversion functions. As most standard library headers use
features.h, and _DEFAULT_SOURCE must be defined before features.h is
included, we have to include base.h before other headers.
Essentially a "program header", followed by a count, followed by
instructions.
Provides a stronger format for bytecode files and allows for better
bounds checking on instructions.
Essentially you may "call" an absolute program address, which pushes
the current address onto the call stack. CALL_STACK does the same
thing but the absolute program address is taken from the data stack.
RET pops an address off the call stack then jumps to that address.
Essentially they use the stack for their one and only operand. This
allows user level control, in particular it allows for loops to work
correctly while using these operands.
This will allow for more library level code to be written. For
example, say you wanted to write a generic byte level reversal
algorithm for dynamically sized allocations. Getting the size of the
allocation would be fundamental to this operation.
One may allocate any number of (bytes|hwords|words), set or get some
index from allocated memory, and delete heap memory.
The idea is that all the relevant datums will be on the stack, so no
register usage. This means no instructions should use register space
at all (other than POP, which I'm debating about currently). Register
space is purely for users.
Instead of having each page be an area of memory, where multiple
pointers to differing data may lie, we instead have each page being
one allocation. This ensures that a deletion algorithm, as provided,
would actually work without destroying older pointers which may have
been allocated. Great!
A page is a flexibly allocated structure of bytes, with a count of the
number of bytes already allocated (used) and number of bytes available
overall (available), with a pointer to the next page, if any.
heap_t is a linked list of pages. One may allocate a requested size
off the heap which causes one of two things:
1) Either a page already exists with enough space for the requested
size, in which case that page's pointer is used as the base for the
requested pointer
2) No pages satisfy the requested size, so a new page is allocated
which is the new end of the heap.
Thankfully multiplication, like addition, is the same under 2s
complement as it is for unsigned numbers. So I just need to implement
those versions to be fine.
A negative number under 2s complement can never be equal to its
positive as the top bit *must* be on. If two numbers are equivalent
bit-by-bit then they are equal for both signed and unsigned numbers.
Anything other than char (which can just use print.byte to print the
hex) and byte (which prints hexes anyway), all other types may be
forced to print a hex rather than a number if PRINT_HEX is 1.
These new members are just signed versions of the previous members.
This makes type punning and usage for signed versions easier than
before (no need for memcpy).
As it has no dependencies on vm specifically, and it's more necessary
for any vendors who wish to target the virtual machine, it makes more
sense for inst to be a lib module rather than a vm module.