Previously, if an included file only defined constants, the root unit
set would still contain an empty PP_USE unit. A PP_USE unit should
only exist if the file has something left after preprocessing. Hence,
do not add empty bodies to the unit list.

A constant can only be redefined by a file that is closer to the root.
If a constant is defined at include depth n, it can only be redefined
at depths lower than n.
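
A minimal sketch of the depth rule above. The names const_def_t,
const_table_t and define_constant are hypothetical, tokens are
simplified to strings, and the root file is assumed to sit at depth 0;
none of this is taken from the actual implementation.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical record of where a constant was defined.
struct const_def_t {
  std::size_t depth;              // include depth of the defining file
  std::vector<std::string> body;  // tokens of the constant (simplified)
};

using const_table_t = std::unordered_map<std::string, const_def_t>;

// Tries to define `name` at `depth`. Returns true if the definition was
// stored, false if an existing definition at the same or a shallower
// depth takes precedence.
bool define_constant(const_table_t &table, const std::string &name,
                     std::size_t depth, std::vector<std::string> body) {
  auto it = table.find(name);
  if (it != table.end() && depth >= it->second.depth)
    return false;  // only a file closer to the root may redefine
  table[name] = const_def_t{depth, std::move(body)};
  return true;
}
```

With this table, a definition at depth 1 is replaced by a later one at
depth 0, but never the other way round.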

The preprocessor exposes a single function: preprocess. It takes
tokens and returns units.

Units form a tree of tokens, where each unit is a node in that tree. A
unit has a "root" token (the value of the node) and an "expansion"
(the children of the node). The root is some preprocessor token, such
as a reference or a USE call, and the expansion is the tokens it
yields. For a USE call these are the tokens of the file it includes;
for a reference they are the tokens of the constant it refers to.
This means that the leaves of the tree of units are the completely
preprocessed/expanded form of the source code.
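
A rough sketch of the tree described above, to make the "leaves are
the expanded source" point concrete. token_t, unit_t and
collect_leaves are illustrative names and the token is reduced to its
text; the real types are not shown in this log. It assumes C++17 (a
vector of the enclosing struct) and that every preprocessor unit has a
non-empty expansion.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified token: just its text.
struct token_t {
  std::string content;
};

// A unit is a node: `root` is its value, `expansion` holds its children.
// A plain token is modelled here as a unit with an empty expansion.
struct unit_t {
  token_t root;
  std::vector<unit_t> expansion;
};

// Appends the leaves of `unit`, left to right, to `out`. Roots of
// expanded units are skipped, so only fully expanded tokens remain.
void collect_leaves(const unit_t &unit, std::vector<token_t> &out) {
  if (unit.expansion.empty()) {
    out.push_back(unit.root);
    return;
  }
  for (const unit_t &child : unit.expansion)
    collect_leaves(child, out);
}
```

Walking the root units in order with collect_leaves would then yield
the token stream that the later stages consume.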

The function has many working components, which may need to be
extracted. In particular, it uses a hash map to ensure a source is
never included twice, and it ensures that constants are not redefined
in inner include scopes if they are already defined in outer scopes
(e.g. if we compile a.asm, which defines constant N and then includes
b.asm, which also defines constant N, then N uses the definition from
a.asm rather than b.asm).
I need to write a spec for this.
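
A small sketch of the include-once check mentioned above. It uses a
std::unordered_set rather than a hash map, since only membership
matters here; include_guard_t and should_include are made-up names,
not the project's.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical tracker for sources that have already been included.
struct include_guard_t {
  std::unordered_set<std::string> seen;

  // Returns true the first time `source_name` is offered and false on
  // every later call, so a source is expanded at most once.
  bool should_include(const std::string &source_name) {
    return seen.insert(source_name).second;
  }
};
```

The preprocessor would call should_include before expanding a USE
call and skip the expansion when it returns false.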

Instead of the previous ad hoc format, we now use the format
<source_name>:<line>:<column>: <TYPE>, which also lets me jump
straight to that token via Emacs' compile command.
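
For illustration, a sketch of producing that format. token_info_t and
format_token are hypothetical names; only the output layout comes from
this message.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical view of the fields needed to print a token.
struct token_info_t {
  std::string source_name;
  std::size_t line;
  std::size_t column;
  std::string type;
};

// Builds "<source_name>:<line>:<column>: <TYPE>".
std::string format_token(const token_info_t &token) {
  std::ostringstream out;
  out << token.source_name << ':' << token.line << ':' << token.column
      << ": " << token.type;
  return out.str();
}
```

A PP_USE token at line 3, column 7 of a.asm would print as
a.asm:3:7: PP_USE.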

I've decided to split the project into two repositories: the assembler
and the runtime. The runtime will contain both the executable and
lib/, while the assembler will have the runtime as a git submodule and
use it to build. I think this is a clean solution, a lot cleaner than
keeping everything in one project where the Makefile would have to
grow massively.

Instead of %const(<name>) ... %end, the syntax is now %const <name>
... %end, i.e. the first symbol after %const is taken as the name of
the constant, similar to %use.
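
A sketch of how the new form could be read, assuming tokens are plain
strings. const_block_t and parse_const are invented for this example
and say nothing about how the real lexer or preprocessor is written.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result of reading one %const block.
struct const_block_t {
  std::string name;
  std::vector<std::string> body;
};

// Reads a block starting at tokens[i], which must be "%const". On
// success fills `out`, moves `i` just past "%end" and returns true.
// On failure `i` and `out` are left unchanged.
bool parse_const(const std::vector<std::string> &tokens, std::size_t &i,
                 const_block_t &out) {
  if (i >= tokens.size() || tokens[i] != "%const")
    return false;
  if (i + 1 >= tokens.size() || tokens[i + 1] == "%end")
    return false;  // the symbol after %const must be the name
  const_block_t block;
  block.name = tokens[i + 1];
  for (std::size_t j = i + 2; j < tokens.size(); ++j) {
    if (tokens[j] == "%end") {
      out = block;
      i = j + 1;
      return true;
    }
    block.body.push_back(tokens[j]);
  }
  return false;  // no closing %end
}
```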

The preprocess_* functions are now private to the implementation
file, where they serve as helpers for the outer preprocessor function.
Furthermore, I've simplified the API of the preprocess_* functions:
they now return only a pp_err_t and store their results in a vector
parameter taken by reference.
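
A sketch of that calling convention. The pp_err_t below is a stand-in
enum, preprocess_copy is an invented helper, and tokens and units are
both simplified to strings; only the shape (error code returned,
result written through a reference, helpers hidden in the
implementation file) reflects this message.

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-in for the project's pp_err_t; the real definition is not
// shown here.
enum class pp_err_t { OK, EMPTY_TOKEN };

namespace {
// Hypothetical private helper: copies tokens into `out` and fails on
// the first empty token. The result travels through `out`; the return
// value only carries the error.
pp_err_t preprocess_copy(const std::vector<std::string> &tokens,
                         std::vector<std::string> &out) {
  for (const std::string &token : tokens) {
    if (token.empty())
      return pp_err_t::EMPTY_TOKEN;
    out.push_back(token);
  }
  return pp_err_t::OK;
}
}  // namespace

// Outer function: the only symbol visible outside this file.
pp_err_t preprocess(const std::vector<std::string> &tokens,
                    std::vector<std::string> &units) {
  std::vector<std::string> staged;
  pp_err_t err = preprocess_copy(tokens, staged);
  if (err != pp_err_t::OK)
    return err;  // leave `units` untouched on failure
  units = std::move(staged);
  return pp_err_t::OK;
}
```

Keeping the helpers in an unnamed namespace is one way to make them
private to the implementation file; static functions would work too.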