#+title: VM Specification
#+author: Aryadev Chavali
#+description: A specification of instructions for the virtual machine
#+date: 2023-11-02

* Data types
The virtual machine has three main data types, all of which are unsigned.  Signed versions of these data types exist, though there is no difference between them in terms of bytecode.  For an unsigned type ~<T>~ the signed version is simply ~S_<T>~.

|-------+------|
| Name  | Bits |
|-------+------|
| Byte  |    8 |
| HWord |   32 |
| Word  |   64 |
|-------+------|

* Instructions
An instruction for the virtual machine is composed of an *opcode* and, potentially, an *operand*.  An /opcode/ represents the behaviour of the instruction i.e. what the instruction _is_.  The /operand/ is a datum of one of the /data types/ described previously.

Some instructions have /operands/ while others do not.  Instructions without an operand are called *UNIT* instructions, while instructions with an operand are called *MULTI* instructions[fn:1].

All /opcodes/ (with very few exceptions[fn:2]) have two components: the *root* and the *type specifier*.  The /root/ represents the general behaviour of the instruction: ~PUSH~, ~POP~, ~MOV~, etc.  The /type specifier/ specifies what /data type/ it manipulates.  A complete opcode is a combination of these two e.g. ~PUSH_BYTE~, ~POP_WORD~, etc.  Some /opcodes/ may have more /type specifiers/ than others.

* Bytecode format
Bytecode files are byte sequences which encode instructions for the virtual machine.  Any instruction (even one with an operand) has one and only one byte sequence associated with it.

* Footnotes
[fn:2] ~NOOP~, ~HALT~, ~MDELETE~, ~MSIZE~, ~JUMP_*~

[fn:1] /UNIT/ refers to the fact that the internal representation of these instructions is singular: two instances of the same /UNIT/ instruction will be identical in terms of their binary.  On the other hand, two instances of the same /MULTI/ instruction may not be equivalent due to the operand they take.
Crucially, most if not all /MULTI/ instructions have different versions for each /data type/.
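
* Example sketch
The data types and the root/type-specifier structure of opcodes described in the sections above can be illustrated in C.  This is only a sketch under stated assumptions: the type names (~byte_t~, ~hword_t~, ~word_t~ and their ~s_~ counterparts), the enum names, and in particular the numeric encoding used by ~make_opcode~ are hypothetical and are not defined by this specification.

#+begin_src c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The three unsigned data types from the table in "Data types". */
typedef uint8_t byte_t;   /* Byte:  8 bits  */
typedef uint32_t hword_t; /* HWord: 32 bits */
typedef uint64_t word_t;  /* Word:  64 bits */

/* Signed versions: same bits, only the interpretation differs. */
typedef int8_t s_byte_t;
typedef int32_t s_hword_t;
typedef int64_t s_word_t;

static_assert(sizeof(byte_t) == 1, "Byte must be 8 bits");
static_assert(sizeof(hword_t) == 4, "HWord must be 32 bits");
static_assert(sizeof(word_t) == 8, "Word must be 64 bits");

/* Hypothetical type specifiers (names and values are assumptions). */
enum type_spec { TYPE_BYTE, TYPE_HWORD, TYPE_WORD, NUMBER_OF_TYPES };

/* Hypothetical roots; only a few are listed for illustration. */
enum root { ROOT_PUSH, ROOT_POP, ROOT_MOV };

/* Combine a root and a type specifier into a complete opcode, e.g.
 * (ROOT_PUSH, TYPE_BYTE) plays the role of PUSH_BYTE.  The layout,
 * one consecutive opcode per type for each root, is an assumption. */
static inline uint8_t make_opcode(enum root root, enum type_spec type)
{
  return (uint8_t)(root * NUMBER_OF_TYPES + type);
}

/* Size in bytes of the operand that a type specifier refers to. */
static inline size_t operand_size(enum type_spec type)
{
  switch (type)
  {
  case TYPE_BYTE:
    return sizeof(byte_t);
  case TYPE_HWORD:
    return sizeof(hword_t);
  case TYPE_WORD:
    return sizeof(word_t);
  default:
    return 0; /* not a valid type specifier */
  }
}

/* Reinterpret an unsigned byte as its signed version without
 * changing any bits. */
static inline s_byte_t byte_as_signed(byte_t b)
{
  s_byte_t s;
  memcpy(&s, &b, sizeof(s));
  return s;
}
#+end_src

Under these assumptions, ~make_opcode(ROOT_POP, TYPE_WORD)~ stands in for ~POP_WORD~, and ~byte_as_signed(0xFF)~ yields -1 from the same 8 bits that represent 255 as an unsigned Byte.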