This repository has been archived on 2025-11-10. You can view files and clone it. You cannot open issues or pull requests or push a commit.
avm/spec.org

VM Specification

WIP Data types

The virtual machine has three main data types, all of them unsigned. Each also has a signed version, which is indistinguishable from the unsigned one at the bytecode level. For an unsigned type <T>, the signed version is named S_<T>.

| Name  | Bits |
|-------+------|
| Byte  |    8 |
| HWord |   32 |
| Word  |   64 |
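
Since the signed and unsigned variants share one bit pattern, the difference lies only in how the bits are interpreted. The following is a minimal Python sketch of that idea, not the VM's own code. It assumes two's complement for the signed variants, which the spec does not state, and the helper names `as_unsigned` and `as_signed` are hypothetical.

```python
# Bit widths taken from the table of data types.
BITS = {"BYTE": 8, "HWORD": 32, "WORD": 64}


def as_unsigned(value: int, type_name: str) -> int:
    """Wrap an integer into the unsigned range of the given type."""
    return value & ((1 << BITS[type_name]) - 1)


def as_signed(value: int, type_name: str) -> int:
    """Reinterpret the same bits as S_<T> (assumption: two's complement)."""
    bits = BITS[type_name]
    raw = as_unsigned(value, type_name)
    if raw >= (1 << (bits - 1)):
        # Top bit set: the value is negative under two's complement.
        return raw - (1 << bits)
    return raw
```

Under that assumption, the byte 0xFF reads as 255 when treated as a Byte and as -1 when treated as an S_Byte, while the stored bits stay the same.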

WIP Instructions

An instruction for the virtual machine is composed of an opcode and, optionally, an operand. The opcode identifies the instruction, i.e. what behaviour it performs. The operand is a datum of one of the data types described previously.

Some instructions take an operand while others do not. Instructions without an operand are called UNIT instructions, while those with an operand are called MULTI instructions [1].

All opcodes (with very few exceptions [2]) have two components: the root and the type specifier. The root represents the general behaviour of the instruction: PUSH, POP, MOV, etc. The type specifier names the data type the instruction manipulates. A complete opcode is a combination of the two, e.g. PUSH_BYTE, POP_WORD, etc. Some opcodes may have more type specifiers than others.
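
The root/type-specifier naming scheme can be sketched in Python as below. This is an illustration only: it assumes every root takes exactly one specifier from the three data types, which the spec does not guarantee, and the function names `opcode_name` and `split_opcode` are hypothetical.

```python
ROOTS = ("PUSH", "POP", "MOV")          # roots named in the text
SPECIFIERS = ("BYTE", "HWORD", "WORD")  # one specifier per data type


def opcode_name(root: str, specifier: str) -> str:
    """Combine a root and a type specifier into a complete opcode name."""
    return f"{root}_{specifier}"


def split_opcode(name: str) -> tuple:
    """Split a complete opcode name at its first underscore."""
    root, _, specifier = name.partition("_")
    return (root, specifier)


# Every root/specifier combination under the one-specifier assumption.
ALL_OPCODES = [opcode_name(r, s) for r in ROOTS for s in SPECIFIERS]
```

Opcodes without a type specifier, such as NOOP or HALT, would need separate handling in a real table.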

TODO Bytecode format

Bytecode files are byte sequences that encode instructions for the virtual machine. Any instruction (even one with an operand) has one and only one byte sequence associated with it.
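
As the format itself is still to be specified, the following Python sketch shows only one possible encoding that satisfies the one-sequence-per-instruction property. Everything concrete in it is an assumption: the opcode numbers, the single-byte opcode, the little-endian operand layout, and the `encode` helper are all hypothetical.

```python
import struct
from typing import Optional

# Hypothetical opcode numbers; the spec does not assign any.
OPCODES = {"HALT": 0, "PUSH_BYTE": 1, "PUSH_HWORD": 2, "PUSH_WORD": 3}

# Operand layout per opcode; "<" selects little-endian (an assumption).
OPERAND_FORMAT = {"PUSH_BYTE": "<B", "PUSH_HWORD": "<I", "PUSH_WORD": "<Q"}


def encode(name: str, operand: Optional[int] = None) -> bytes:
    """Encode one instruction as an opcode byte plus an optional operand."""
    encoded = bytes([OPCODES[name]])
    fmt = OPERAND_FORMAT.get(name)
    if fmt is None:
        # No operand layout registered: treat as an operand-free instruction.
        if operand is not None:
            raise ValueError(f"{name} takes no operand")
        return encoded
    if operand is None:
        raise ValueError(f"{name} requires an operand")
    return encoded + struct.pack(fmt, operand)
```

Because the function is deterministic, encoding the same instruction twice yields the same bytes, and an operand-free instruction such as HALT always maps to a single fixed byte in this scheme.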

TODO Storage

Two types of storage:

  • Data stack, which all core VM routines manipulate and work on (LIFO)
  • Register space, which is generally reserved for user space code, i.e. apart from MOV, no core VM routine manipulates the registers
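
A rough Python model of the two storage areas is sketched below. It assumes the stack behaves last-in, first-out, as is usual for a stack, and that register space is a flat byte array. The `Storage` class, its default size, and in particular the pop-into-register behaviour of `mov_byte` are hypothetical, since the spec does not yet define how MOV works.

```python
class Storage:
    """Toy model of the VM's data stack and register space."""

    def __init__(self, register_bytes: int = 128):
        self.stack = bytearray()                    # data stack, top at the end
        self.registers = bytearray(register_bytes)  # flat register space

    def push_byte(self, value: int) -> None:
        """Push one byte onto the data stack."""
        self.stack.append(value & 0xFF)

    def pop_byte(self) -> int:
        """Pop the most recently pushed byte."""
        return self.stack.pop()

    def mov_byte(self, register_index: int) -> None:
        """Hypothetical MOV: pop a byte from the stack into a register slot."""
        self.registers[register_index] = self.pop_byte()
```

In this model only `mov_byte` touches the registers, mirroring the rule that other core routines work on the stack alone.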

TODO Assembler conventions

The standard library is allowed to use the first 8 words (64 bytes) of register space without regard to user applications. If the standard library is to be used, user applications should therefore use register space from the 9th word (byte offset 64) onwards.
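
The arithmetic behind this convention can be checked with a short Python sketch. It assumes register space is byte-addressed with words packed contiguously from offset 0, and the constant and function names are hypothetical.

```python
WORD_BYTES = 8             # a Word is 64 bits
STDLIB_RESERVED_WORDS = 8  # words the standard library may overwrite


def word_byte_offset(word_index: int) -> int:
    """Byte offset of a zero-based word index within register space."""
    return word_index * WORD_BYTES


# The 9th word has zero-based index 8.
FIRST_USER_WORD = STDLIB_RESERVED_WORDS
FIRST_USER_BYTE = word_byte_offset(FIRST_USER_WORD)
```

With these values the reserved region covers byte offsets 0 to 63, and the first word that is safe for user code starts at byte offset 64.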

Footnotes


[1] UNIT refers to the fact that the internal representation of these instructions is singular: two instances of the same UNIT instruction are identical in their binary form. In contrast, two instances of the same MULTI instruction may differ because of the operand they take. Crucially, most if not all MULTI instructions have a different version for each data type.

[2] NOOP, HALT, MDELETE, MSIZE, JUMP_*