- Now take a single command line argument for the filename to read and
compile.
- If the filename is "--", read stdin until EOF using a different
read handler (buffered reading into a ~vec_t~).
- ~read_file~ now returns an error code and takes the ~sv_t~ (which
contains the file contents) by pointer. We can now deal with the
error in ~main~ directly.
- Make the return code of ~main~ a variable which error branches can
set. The error and normal branches now share the same exit code, so
error handling follows a single pattern.
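A minimal sketch of the shape described above, not the actual code: the ~sv_t~ layout, the names ~read_stream~, ~read_input~ and ~run~, and the specific error codes are all assumptions made for illustration. It also uses one buffered reader for both files and stdin, whereas the commit uses a separate handler for stdin.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the project's sv_t: a sized view of bytes. */
typedef struct {
  char *data;
  size_t size;
} sv_t;

/* Read `fp` until EOF in fixed-size chunks, growing a heap buffer as needed.
   Returns 0 on success, non-zero on failure; contents are stored in *out. */
int read_stream(FILE *fp, sv_t *out)
{
  assert(fp && out);
  char chunk[4096];
  char *data = NULL;
  size_t size = 0, capacity = 0, n = 0;

  while ((n = fread(chunk, 1, sizeof(chunk), fp)) > 0)
  {
    if (size + n > capacity)
    {
      size_t new_capacity = capacity ? capacity * 2 : sizeof(chunk);
      while (new_capacity < size + n)
        new_capacity *= 2;
      char *grown = realloc(data, new_capacity);
      if (!grown)
      {
        free(data);
        return 1; /* out of memory */
      }
      data = grown;
      capacity = new_capacity;
    }
    memcpy(data + size, chunk, n);
    size += n;
  }
  if (ferror(fp))
  {
    free(data);
    return 2; /* read error */
  }
  out->data = data;
  out->size = size;
  return 0;
}

/* Open `filename` ("--" means stdin) and read all of it into *out. */
int read_input(const char *filename, sv_t *out)
{
  if (strcmp(filename, "--") == 0)
    return read_stream(stdin, out);
  FILE *fp = fopen(filename, "rb");
  if (!fp)
    return 3; /* could not open file */
  int err = read_stream(fp, out);
  fclose(fp);
  return err;
}

/* Single-exit driver: error branches only set `ret`, cleanup is shared. */
int run(const char *filename)
{
  int ret = 0;
  sv_t contents = {0};
  int err = read_input(filename, &contents);
  if (err)
  {
    fprintf(stderr, "ERROR: could not read `%s` (code %d)\n", filename, err);
    ret = 1;
    goto end;
  }
  printf("read %zu bytes\n", contents.size);
end:
  free(contents.data); /* free(NULL) is a no-op, so this is safe on error */
  return ret;
}
```

Because every failure path only sets ~ret~ and jumps to the shared label, cleanup is written once and the caller sees a single return point.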
LOG, LOG_ERR. LOG_ERR will always compile to a /stderr/ print. LOG,
on the other hand, may not actually do anything if VERBOSE_LOGS is
not 1. By default it is 0, so it must be set to 1 when compiling to
enable logging - hence the adjustment of the Makefile.
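One way such macros could be written, as a sketch only: the macro bodies are assumptions, and sending LOG to stdout is a guess since the message only states where LOG_ERR prints.

```c
#include <stdio.h>

/* Off by default; enable with e.g. -DVERBOSE_LOGS=1 on the compiler line. */
#ifndef VERBOSE_LOGS
#define VERBOSE_LOGS 0
#endif

/* LOG_ERR always compiles to a stderr print. */
#define LOG_ERR(...) fprintf(stderr, __VA_ARGS__)

/* LOG only prints when VERBOSE_LOGS is 1; otherwise it expands to a no-op
   and its arguments are never evaluated. */
#if VERBOSE_LOGS == 1
#define LOG(...) fprintf(stdout, __VA_ARGS__)
#else
#define LOG(...) ((void)0)
#endif
```

With this shape, arguments to a disabled LOG are dropped by the preprocessor, so they should not carry side effects the program relies on.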
Primitive is a bit of a word conflict here; primitives are what we'd
expect our callables to be named eventually. However, these parser
"primitives" are just well known symbols that we want to optimise the
representation of for later stages. Thus, KNOWN is a bit better for
signalling intent than PRIMITIVE is.
Main reason is so we don't have that stupid arl prefix directory in
our source code. Now our source code is flat, and we can still
reference headers by linking from root.
commit 1588e7b46d
Author: Aryadev Chavali <aryadev@aryadevchavali.com>
Date: Sat Jan 24 02:55:12 2026 +0000
parser/parser: parse_symbol now supports primitives
parse_symbol now checks whether the parsed symbol data is actually
just a primitive (linear search through all primitives). If it is,
return a primitive node. Otherwise, generate a symbol as the routine
did previously.
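The linear search could look roughly like the sketch below. The enumeration members, the name table, and ~lookup_primitive~ are hypothetical; the real set of primitives is not listed in this log.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enumeration of primitives, for illustration only. */
typedef enum {
  PRIM_PLUS = 0,
  PRIM_MINUS,
  PRIM_PRINT,
  NUM_PRIMS,
} prim_t;

/* Spelling of each primitive, indexed by its enumeration value. */
static const char *const PRIM_NAMES[NUM_PRIMS] = {
  [PRIM_PLUS]  = "+",
  [PRIM_MINUS] = "-",
  [PRIM_PRINT] = "print",
};

/* Linear search over all primitives. `text` need not be NUL-terminated.
   Returns the matching prim_t value, or -1 if `text` is a plain symbol. */
int lookup_primitive(const char *text, size_t size)
{
  for (int i = 0; i < NUM_PRIMS; ++i)
  {
    if (strlen(PRIM_NAMES[i]) == size &&
        memcmp(PRIM_NAMES[i], text, size) == 0)
      return i;
  }
  return -1;
}
```

A caller in the style of parse_symbol would try this lookup first and fall back to building an ordinary symbol node when it returns -1.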
commit 62c91990c4
Author: Aryadev Chavali <aryadev@aryadevchavali.com>
Date: Sat Jan 24 02:40:26 2026 +0000
parser/ast: Added support for node level primitives
These are just an enumeration of primitives we already expect to be
present within a program. Instead of leaving everything as a symbol,
we can compile certain symbols into the enumeration ahead of time to
make later stages easier.
My previous idea was to generate a list of all the headers, and add it
as a dependency for all object files. This way, any changes in a
header would trigger a rebuild of all object files, which would in
turn trigger a build of the binary.
This will be a bit of an issue later on when we have stuff that's
independent of others; a change in parser code won't necessarily
affect code generation, but a change in AST will. We don't want to
re-trigger builds for everything.
This setup forces gcc to generate a clear set of dependencies in the
build folder (in a syntax recognisable by Make), then include that in
the Makefile itself. These dependencies are specific to each code
unit and so only concern the headers that code unit uses.
Much faster than dealing with the line and column as we go. In the
vast majority of cases this data is completely unnecessary, so
tracking it is wasted effort. At the point where we need accurate line/column
information, we can compute it - in an error state, it really doesn't
matter that we're spending that extra time to compute it.
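As an illustration of computing the position on demand, here is a sketch that assumes each object records only a byte offset into the source. ~position_t~ and ~compute_position~ are hypothetical names, and 1-based lines and columns are an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: 1-based line and column. */
typedef struct {
  size_t line;
  size_t column;
} position_t;

/* Walk the source from the start up to `byte_offset`, counting newlines.
   This is O(offset), which is acceptable when only done on error paths. */
position_t compute_position(const char *src, size_t size, size_t byte_offset)
{
  assert(src && byte_offset <= size);
  position_t pos = {1, 1};
  for (size_t i = 0; i < byte_offset; ++i)
  {
    if (src[i] == '\n')
    {
      pos.line++;
      pos.column = 1;
    }
    else
    {
      pos.column++;
    }
  }
  return pos;
}
```

The parser then only has to store one integer per object, and the scan is paid for only when an error message actually needs the position.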
I've made prototypes for them, put at the top, and moved their
implementations to the bottom. They're not exposed to anything
outside this code unit. Now, when reading the code, the parsing
routines (which are the main reason to be here) are at the top and
clear to read.
We now have a primitive and not fully tested parser for strings and
symbol sequences. We record the lines and columns of each object on
the object for better compile time error handling.
I've also structured the code base in a slightly weirder fashion,
which makes my includes look nicer. I've split up stuff quite a bit
to ensure code units are a bit more focused.