-rw-r--r-- spec.org 46
1 files changed, 46 insertions, 0 deletions
diff --git a/spec.org b/spec.org
new file mode 100644
index 0000000..8878161
--- /dev/null
+++ b/spec.org
@@ -0,0 +1,46 @@
+#+title: VM Specification
+#+author: Aryadev Chavali
+#+description: A specification of instructions for the virtual machine
+#+date: 2023-11-02
+
+* Data types
+The virtual machine has 3 main data types, all of them unsigned.
+Signed versions of these data types exist, though there is no
+difference between them in terms of bytecode. For an unsigned type
+<T> the signed version is simply S_<T>.
+|-------+------|
+| Name  | Bits |
+|-------+------|
+| Byte  |    8 |
+| HWord |   32 |
+| Word  |   64 |
+|-------+------|
+* Instructions
+An instruction for the virtual machine is composed of an *opcode* and,
+potentially, an *operand*. An /opcode/ represents the behaviour of
+the instruction, i.e. what the instruction _is_. The /operand/ is a
+datum of one of the /data types/ described previously.
+
+Some instructions have /operands/ while others do not. The former
+are called *MULTI* instructions while the latter are called *UNIT*
+instructions[fn:1].
+
+All /opcodes/ (with very few exceptions[fn:2]) have two components:
+the *root* and the *type specifier*. The /root/ represents the
+general behaviour of the instruction: ~PUSH~, ~POP~, ~MOV~, etc. The
+/type specifier/ specifies what /data type/ it manipulates. A
+complete opcode will be a combination of these two e.g. ~PUSH_BYTE~,
+~POP_WORD~, etc. Some /opcodes/ may have more /type specifiers/ than
+others.
+* Bytecode format
+Bytecode files are byte sequences which encode instructions for the
+virtual machine. Any instruction (even with an operand) has one and
+only one byte sequence associated with it.
+* Footnotes
+[fn:2] ~NOOP~, ~HALT~, ~MDELETE~, ~MSIZE~, ~JUMP_*~
+[fn:1] /UNIT/ refers to the fact that the internal representation of
+these instructions is singular: two instances of the same /UNIT/
+instruction will be identical in terms of their binary. On the other
+hand, two instances of the same /MULTI/ instruction may not be
+equivalent due to the operand they take. Crucially, most if not all
+/MULTI/ instructions have different versions for each /data type/.
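
A note on the spec above: a minimal sketch of two of its ideas, namely that a signed type S_<T> shares its bit pattern with the unsigned <T>, and that a complete opcode is a root combined with a type specifier. It is not part of the commit. The helper names `to_signed` and `opcode` are hypothetical, and two's complement is an assumption, since the spec does not state how signed values are represented.

```python
# Bit widths taken from the spec's data type table.
BITS = {"BYTE": 8, "HWORD": 32, "WORD": 64}


def to_signed(value: int, type_name: str) -> int:
    """Read an unsigned <T> value as S_<T> (two's complement is assumed)."""
    bits = BITS[type_name]
    value &= (1 << bits) - 1          # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def opcode(root: str, type_name: str) -> str:
    """Combine a root and a type specifier into a complete opcode name."""
    if type_name not in BITS:
        raise ValueError(f"unknown type specifier: {type_name}")
    return f"{root}_{type_name}"


print(to_signed(0xFF, "BYTE"))   # -1: same 8 bits as the unsigned value 255
print(to_signed(0x7F, "BYTE"))   # 127: sign bit clear, value unchanged
print(opcode("PUSH", "BYTE"))    # PUSH_BYTE
```

Under these assumptions the same stored bits serve both the signed and unsigned reading, which would explain why the bytecode needs no separate encoding for them.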