Age | Commit message | Author |
|
PUSH_REGISTER and MOV share the same signature (both take a word as
input), so DUP may as well be grouped with them.
This leads to a larger discussion about how the signatures of
functions matter: I may need to do a cleanup at some point.
|
|
|
|
Rather than each page being an area of memory in which multiple
pointers to differing data may lie, each page is now exactly one
allocation. This ensures that a deletion algorithm, as provided,
actually works without destroying older pointers which may have been
allocated. Great!
|
|
Now I need to create some instructions which manage the heap.
|
|
A page is a flexibly allocated structure of bytes, with a count of the
number of bytes already allocated (used) and the number of bytes
available overall (available), with a pointer to the next page, if any.
heap_t is a linked list of pages. One may allocate a requested size
off the heap, which causes one of two things:
1) A page already exists with enough space for the requested size, in
which case that page's pointer is used as the base for the requested
pointer.
2) No page satisfies the requested size, so a new page is allocated,
which becomes the new end of the heap.
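
The two cases above can be sketched roughly as follows. This is a
minimal illustration, not the code in the repository: the field names,
`PAGE_DEFAULT_SIZE`, `page_create`, `heap_alloc` and `heap_free_all`
are all assumptions made for the example, and "flexibly allocated" is
read here as a C99 flexible array member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_DEFAULT_SIZE 64 /* hypothetical default capacity in bytes */

typedef struct page {
  struct page *next;      /* next page in the heap, if any */
  size_t used, available; /* bytes handed out / bytes in total */
  uint8_t data[];         /* flexible array member holding the bytes */
} page_t;

typedef struct {
  page_t *beg, *end;
} heap_t;

page_t *page_create(size_t available)
{
  /* calloc zeroes `next` and `used` for us */
  page_t *page = calloc(1, sizeof(*page) + available);
  assert(page != NULL);
  page->available = available;
  return page;
}

uint8_t *heap_alloc(heap_t *heap, size_t size)
{
  /* Case 1: some existing page has enough room left */
  for (page_t *p = heap->beg; p; p = p->next)
    if (p->available - p->used >= size) {
      uint8_t *ptr = p->data + p->used;
      p->used += size;
      return ptr;
    }
  /* Case 2: no page fits, so append a new page to the end of the heap */
  page_t *page =
      page_create(size > PAGE_DEFAULT_SIZE ? size : PAGE_DEFAULT_SIZE);
  if (heap->end)
    heap->end->next = page;
  else
    heap->beg = page;
  heap->end = page;
  page->used = size;
  return page->data;
}

void heap_free_all(heap_t *heap)
{
  for (page_t *p = heap->beg, *next; p; p = next) {
    next = p->next;
    free(p);
  }
  heap->beg = heap->end = NULL;
}
```

Note that in this sketch several allocations can share one page, so
freeing a single allocation is not possible without affecting its
neighbours; only the whole heap can be released.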
|
|
|
|
|
|
|
|
|
|
word register 0 refers to the first 8 bytes of the dynamic array.
Hence the used counter should be at least 8 bytes. This deals with
those issues. Also print more useful information in
vm_print_registers (how many byte|hword|word registers are currently
in use, how many are available).
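
A rough sketch of the bookkeeping described here, assuming the
registers live in one growable byte array and that word register i
occupies bytes [8i, 8i + 8). The names `registers_t`,
`registers_set_word`, `registers_get_word` and
`registers_words_in_use` are invented for the illustration and may not
match the VM.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define WORD_SIZE 8

typedef struct {
  uint8_t *data;          /* dynamic array backing every register */
  size_t used, available; /* both measured in bytes */
} registers_t;

void registers_set_word(registers_t *regs, size_t index, uint64_t value)
{
  size_t end = (index + 1) * WORD_SIZE; /* one past the last byte touched */
  if (end > regs->available) {
    size_t new_available = regs->available ? regs->available : WORD_SIZE;
    while (new_available < end)
      new_available *= 2;
    uint8_t *data = realloc(regs->data, new_available);
    assert(data != NULL);
    /* zero only the newly allocated tail */
    memset(data + regs->available, 0, new_available - regs->available);
    regs->data = data;
    regs->available = new_available;
  }
  memcpy(regs->data + index * WORD_SIZE, &value, WORD_SIZE);
  /* writing word register 0 must leave `used` at 8 bytes or more */
  if (end > regs->used)
    regs->used = end;
}

uint64_t registers_get_word(const registers_t *regs, size_t index)
{
  uint64_t value = 0;
  assert((index + 1) * WORD_SIZE <= regs->used);
  memcpy(&value, regs->data + index * WORD_SIZE, WORD_SIZE);
  return value;
}

size_t registers_words_in_use(const registers_t *regs)
{
  return regs->used / WORD_SIZE;
}
```

With this layout, the "in use" figures for bytes and hwords would
follow from the same `used` counter divided by the respective width.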
|
|
Dependencies are just ASM_OUT binary and the corresponding assembly
program for the bytecode output file. Actually works very well, with
changes triggering a recompilation. Also an `exec` recipe is
introduced to do the task of compiling an assembly program and
executing the corresponding bytecode all at once.
|
|
|
|
Very cool, easy, and reads well
|
|
|
|
Lucky surprise: OP_PLUS follows the same principle as the bitwise
operators in that it returns the same type as its inputs. Therefore I
can simply use the same macro to implement it, and MULT, as I do for
those. Very nice.
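
To illustrate the idea (not the actual macro in the source, which is
not shown here), a single macro can stamp out any binary operator
whose result has the same type as its operands. Everything named
below (`vstack_t`, `STACK_ACCESSORS`, `BINARY_OP` and the generated
functions) is hypothetical, and hwords are assumed to be 4 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
  uint8_t data[256];
  size_t ptr; /* number of bytes currently on the stack */
} vstack_t;

/* Generates push_NAME and pop_NAME for a datum of the given type */
#define STACK_ACCESSORS(NAME, TYPE)                     \
  void push_##NAME(vstack_t *s, TYPE v)                 \
  {                                                     \
    assert(s->ptr + sizeof(v) <= sizeof(s->data));      \
    memcpy(s->data + s->ptr, &v, sizeof(v));            \
    s->ptr += sizeof(v);                                \
  }                                                     \
  TYPE pop_##NAME(vstack_t *s)                          \
  {                                                     \
    TYPE v;                                             \
    assert(s->ptr >= sizeof(v));                        \
    s->ptr -= sizeof(v);                                \
    memcpy(&v, s->data + s->ptr, sizeof(v));            \
    return v;                                           \
  }

/* Pops two TYPEs and pushes one TYPE: same type in, same type out */
#define BINARY_OP(OP, NAME, TYPE, OPERATOR)             \
  void OP##_##NAME(vstack_t *s)                         \
  {                                                     \
    TYPE a = pop_##NAME(s);                             \
    TYPE b = pop_##NAME(s);                             \
    push_##NAME(s, (TYPE)(b OPERATOR a));               \
  }

STACK_ACCESSORS(byte, uint8_t)
STACK_ACCESSORS(hword, uint32_t)
STACK_ACCESSORS(word, uint64_t)

BINARY_OP(or, hword, uint32_t, |)   /* bitwise operator */
BINARY_OP(plus, byte, uint8_t, +)   /* arithmetic reuses the macro */
BINARY_OP(mult, word, uint64_t, *)
```

Results wrap around at the width of the type, which is what makes one
definition per width sufficient in this sketch.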
|
|
|
|
Thankfully multiplication, like addition, is the same under 2s
complement as it is for unsigned numbers. So I only need to implement
the unsigned versions.
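
A quick way to convince oneself of this is to check every pair of
bytes exhaustively. The sketch below is only a demonstration;
`mult_byte` and `check_all_byte_products` are names made up for it.
It holds for the truncated (same-width) product only, not for a
double-width result.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned byte multiplication, keeping only the low 8 bits */
uint8_t mult_byte(uint8_t a, uint8_t b)
{
  return (uint8_t)(a * b);
}

/* For every pair of signed bytes, the low 8 bits of the exact signed
   product must equal the unsigned routine applied to the same bit
   patterns. Returns the number of pairs checked. */
unsigned check_all_byte_products(void)
{
  unsigned checked = 0;
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b) {
      int signed_product = a * b;                 /* exact, done in int */
      uint8_t expected = (uint8_t)signed_product; /* keep the low 8 bits */
      assert(mult_byte((uint8_t)a, (uint8_t)b) == expected);
      ++checked;
    }
  return checked;
}
```

For example, -3 * 7 is computed as 253 * 7 = 1771, whose low byte is
235, the bit pattern of -21.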
|
|
By default I initialise the registers with 8 words, though this may
not be necessary for your purposes.
|
|
This is because comparators may apply to signed types, so I need to
use the right parsing function.
|
|
This is using the comparators and a jump-if
|
|
As registers may be theoretically infinite in number, we should use
the largest size possible when referring to them in bytecode (a word).
|
|
This is because: say we have {a, b} where a is on top of the stack. A
comparator C applies in the order C(b, a) i.e. b `C` a. The previous
version did a `C` b which was wrong.
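
As a small sketch of the corrected order, using "less than" as the
comparator C (`vstack_t`, `stack_push`, `stack_pop` and `op_lt` are
illustrative names, not the VM's own):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint64_t items[64];
  size_t size;
} vstack_t;

void stack_push(vstack_t *s, uint64_t v)
{
  assert(s->size < 64);
  s->items[s->size++] = v;
}

uint64_t stack_pop(vstack_t *s)
{
  assert(s->size > 0);
  return s->items[--s->size];
}

/* With {a, b} on the stack and a on top, push the result of b < a */
void op_lt(vstack_t *s)
{
  uint64_t a = stack_pop(s); /* top of the stack */
  uint64_t b = stack_pop(s); /* the element pushed before it */
  stack_push(s, b < a);
}
```

Pushing 1, then 2, then applying the comparator therefore reads
naturally as 1 < 2.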
|
|
This means the stack should be heap allocated, which makes sense as
beyond 1KB one should really be using the heap rather than the stack.
|
|
Stack based machines generally need "variable space". This may be
provided via a symbol-to-word association list, a hashmap, or some
other system. Here I decide to go for the simplest: extending the
register system to a dynamic/infinite number of them. This means, in
practice, that we may use a theoretically infinite number of indexed
words, hwords and bytes to act as variable space. The onus is
therefore on those who are targeting this virtual machine to create
their own association system for syntactic variables: all the
machinery is technically installed within the VM, without the veneer
that causes extra cruft.
|
|
This is only new data allocated, so it's a very careful procedure.
|
|
|
|
|
|
|
|
|
|
Much simpler: it uses a switch case, which is a much faster method of
doing the parsing. Though roughly equivalent in terms of LOC, I feel
that this is more extensible.
|
|
More useful tokens, in particular one for each possible opcode. This
makes parsing a simpler task to reason about, as now we're just
checking against an enum rather than doing a string check in linear
time.
It makes more sense to do this at the tokeniser, as the local data
from the buffer will most likely be in the cache, the buffer being
contiguously allocated. While it will always be slow to do linear
time checks on strings, when doing it at the parser we're having to
check strings that may be allocated in a variety of different places.
This makes caching harder, whereas with this approach we're less
likely to have cache misses as long as the buffer stays there.
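
The split might look something like the following sketch: the
tokeniser does the one string comparison while the text is still in
the buffer, and the parser then only switches on an enum. The token
names, the lowercase mnemonics and the operand counts are assumptions
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
  TOKEN_SYMBOL, /* anything that is not a known opcode */
  TOKEN_PUSH,
  TOKEN_MOV,
  TOKEN_DUP,
  TOKEN_HALT
} token_type_t;

static const struct {
  const char *name;
  token_type_t type;
} KEYWORDS[] = {
    {"push", TOKEN_PUSH},
    {"mov", TOKEN_MOV},
    {"dup", TOKEN_DUP},
    {"halt", TOKEN_HALT},
};

/* Tokeniser side: `word` points into the source buffer and need not
   be NUL-terminated, so the length is passed explicitly. */
token_type_t classify_word(const char *word, size_t length)
{
  assert(word != NULL);
  for (size_t i = 0; i < sizeof(KEYWORDS) / sizeof(KEYWORDS[0]); ++i)
    if (strlen(KEYWORDS[i].name) == length &&
        memcmp(word, KEYWORDS[i].name, length) == 0)
      return KEYWORDS[i].type;
  return TOKEN_SYMBOL;
}

/* Parser side: no string comparisons, just a switch on the enum */
int operand_count(token_type_t type)
{
  switch (type) {
  case TOKEN_PUSH:
  case TOKEN_MOV:
  case TOKEN_DUP:
    return 1;
  case TOKEN_HALT:
    return 0;
  default:
    return -1; /* not an opcode */
  }
}
```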
|
|
A negative number under 2s complement can never be equal to its
positive counterpart as the top bit *must* be on. If two numbers are
equivalent bit-by-bit then they are equal for both signed and unsigned
numbers.
|
|
This pushes a datum of the same type as the operands, which is why it
cannot use the comparator macro as that always pushes bytes.
|
|
With the exception of char (which can just use print.byte to print
the hex) and byte (which prints hexes anyway), all types may be forced
to print a hex rather than a number if PRINT_HEX is 1.
|
|
As strto(ul|ll) allow the parsing of hex literals of the form `0x`, we
allow lexing of hex literals which start with `x`.
They're lexed into C hex literals which work for strtol.
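
A minimal sketch of that rewrite, assuming the literal arrives as a
NUL-terminated string such as `xFF`. `lex_hex_literal` and
`parse_hex_literal` are hypothetical helpers; only `snprintf` and
`strtoull` are real library calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Rewrites "xFF" into the C hex literal "0xFF" inside `out`.
   Returns 0 on success, -1 if `src` does not start with 'x', has no
   digits after it, or does not fit into `out`. */
int lex_hex_literal(const char *src, char *out, size_t out_size)
{
  if (src[0] != 'x' || src[1] == '\0')
    return -1;
  int n = snprintf(out, out_size, "0%s", src);
  return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Parses a C hex literal; with base 16, strtoull accepts an optional
   "0x" prefix. The assert checks the whole string was consumed. */
unsigned long long parse_hex_literal(const char *c_literal)
{
  char *end = NULL;
  unsigned long long value = strtoull(c_literal, &end, 16);
  assert(end != c_literal && *end == '\0');
  return value;
}
```

A real lexer would presumably validate the digits before handing the
string over rather than relying on the assertion.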
|
|
I've made a single macro which defines a function through some common
metric, removing code duplication. Not particularly readable per se,
but using a macro expansion in your IDE allows one to inspect the code.
|
|
These new members are just signed versions of the previous members.
This makes type punning and usage for signed versions easier than
before (no need for memcpy).
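
As an illustration of why signed members help, here is a guess at
what such a union could look like. The typedefs follow the s_T naming
used in this log, but the member names, the widths (hword taken as 4
bytes) and `word_to_signed` are assumptions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t byte;
typedef int8_t s_byte;
typedef uint32_t hword;
typedef int32_t s_hword;
typedef uint64_t word;
typedef int64_t s_word;

typedef union {
  byte as_byte;
  s_byte as_char;
  hword as_hword;
  s_hword as_int;
  word as_word;
  s_word as_long;
} data_t;

static_assert(sizeof(data_t) == sizeof(word),
              "data_t should be exactly one word wide");

/* Reinterpret the bits of a word as a signed value by writing one
   member and reading another, with no memcpy involved. */
s_word word_to_signed(word w)
{
  data_t datum;
  datum.as_word = w;
  return datum.as_long;
}
```

Reading a union member other than the one last written is permitted
in C (the bytes are reinterpreted), though not in C++.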
|
|
|
|
So much reused code, I definitely need to find a way to make this cleaner.
|
|
|
|
For each type T there is the signed version s_T
|
|
|
|
As it has no dependencies on vm specifically, and it's more necessary
for any vendors who wish to target the virtual machine, it makes more
sense for inst to be a lib module rather than a vm module.
|
|
Just need to call their unsigned versions.
All comparators should push bytes as it makes return types uniform.
|
|
otherwise
Changed VERBOSE checks to ensure a degree of information.
|
|
Will cause error if used currently, which is fine.
|
|
Comparing signed and unsigned versions of numbers. Same for EQ as
well.
Notice the irregular pattern of BYTE, CHAR, INT, HWORD, LONG, WORD as
OPCODE_IS_TYPE requires the subcodes to be surrounded by BYTE and
WORD.
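
The definition of OPCODE_IS_TYPE is not shown in this log, so the
following is only a guess at why the ordering matters: if the check is
a range test between the BYTE and WORD subcodes, every other subcode
has to sit between them in the enum. The opcode names below are made
up for the example.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
  OP_NOOP = 0,
  /* each family starts at _BYTE and ends at _WORD */
  OP_LT_BYTE, OP_LT_CHAR, OP_LT_INT, OP_LT_HWORD, OP_LT_LONG, OP_LT_WORD,
  OP_EQ_BYTE, OP_EQ_CHAR, OP_EQ_INT, OP_EQ_HWORD, OP_EQ_LONG, OP_EQ_WORD,
  OP_HALT
} opcode_t;

/* Hypothetical definition: a range check over one opcode family */
#define OPCODE_IS_TYPE(OPCODE, OP_TYPE) \
  (((OPCODE) >= OP_TYPE##_BYTE) && ((OPCODE) <= OP_TYPE##_WORD))

/* The range check only works if BYTE really comes first */
static_assert(OP_LT_BYTE < OP_LT_WORD, "BYTE must precede WORD");
static_assert(OP_EQ_BYTE < OP_EQ_WORD, "BYTE must precede WORD");

bool is_less_than(opcode_t op)
{
  return OPCODE_IS_TYPE(op, OP_LT);
}
```

Under this reading, placing a signed subcode after WORD would silently
drop it from the family.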
|
|
|
|
Currently only for invalid character literals, but still a possible
problem.
|
|
Just takes the character literally as a number.
|