Age | Commit message (Collapse) | Author |
|
|
|
Pretty simple fix, stupid error in hindsight.
|
|
Ensures that iteration over vec_out by caller doesn't occur (such as
in a loop to free the memory).
|
|
Name assigned to %CONST is the next symbol in stream, not the symbol
attached to it.
|
|
Instead of %const(<name>) ... %end it will now be %const <name>
... %end i.e. the first symbol after %const will be considered the
name of the constant similar to %use.
|
|
The preprocess_* functions are now privately contained within the
implementation file to help the preprocesser outer function.
Furthermore I've simplified the API of the preprocess_* functions by
making them only return pp_err_t and store their results in a vector
parameter taken by reference.
|
|
|
|
|
|
|
|
We leave the parameter tokens alone, considering it constant, while
the parameter vec_out is used to hold the new stream of tokens. This
allows the caller to have a before and after view on the token stream
and reduces the worry of double frees.
|
|
|
|
|
|
|
|
This is actually an improvement on the older lexer.
|
|
Similar principle to pp_err_t in that a structure provides the
opportunity for more information about the error such as location.
|
|
Not much of an actual performance change, more semantic meaning for me.
|
|
|
|
Don't need to make a directory every time I compile some code.
|
|
|
|
|
|
|
|
|
|
vec.clear() doesn't delete pointers (unless they're smart) so I need
to do it myself.
|
|
|
|
|
|
They copy and construct new token vectors and just read the token
inputs.
|
|
|
|
Once again quite similar to preprocess_macro_blocks but shorter,
easier to use and easier to read. (76 vs 109)
|
|
Another great thing for C++: the ability to tell it how to print
structures the way I want. In C it's either:
1) Write a function to print the structure out (preferably to a file
pointer)
2) Write a function to return a string (allocated on the heap) which
represents it
Both are not fun to write, whereas it's much easier to write this.
|
|
|
|
While being very similar in style to the C version, it takes 27 lines
of code less to implement it due to the niceties of C++ (41 lines vs
68).
|
|
|
|
|
|
This removes the problem of possibly expensive copies occurring due to
working with tokens produced from the lexer (that C++ just... does):
now we hold pointers where the copy operator is a lot easier to use.
I want expensive stuff to be done by me and for a reason: I want to
be holding the shotgun.
|
|
This C++ rewrite allows me to rewrite the actual API of the system.
In particular, I'm no longer restricting myself to just using enums
then figuring out a way to get proper error logging later down the
line (through tracking tokens in the buffer internally, for example).
Instead I can now design error structures which hold references to the
token they occurred on as well as possible lexical errors (if they're
a FILE_LEXICAL_ERROR which occurs due to the ~%USE~ macro). This
means it's a lot easier to write error logging now at the top level.
|
|
I've decided to split the module parsing into two modules, one for the
preprocessing stage which only deals with tokens and the parsing stage
which generates bytecode.
|
|
This makes enum elements scoped which is actually quite useful as I
prefer the namespacing that enum's give in C++.
|
|
|
|
With error checking!
|
|
Uses std::optional in case file doesn't exist.
|
|
|
|
|
|
|
|
Note that this is basically the same as the previous version,
excluding the fact that it uses C++ idioms more and does a bit better
in error checking.
|
|
One thing I've realised is that even methods such as this require
error tracking. I won't implement it in the tokenise method as it's
not related to consuming the string per se but instead in the main method.
|
|
I made the escape sequence parsing occur here instead of leaving it to
the main tokenise_buffer function as I think it's better suited here.
|
|
Note the overall size of this function in comparison to the C version,
as well as its clarity.
Of course, it is doing allocations in the background through
std::string which requires more profiling if I want to make this super
efficientâ„¢ but honestly the assembler just needs to work, whereas the
runtime needs to be fast.
|
|
|
|
The implementation for tokenise_symbol is already a lot nicer to look
at and add to due to string/string_view operator overloading of ==.
Furthermore, error handling through pair<> instead of making some
custom structure which essentially does the same thing is already
making me happy for this rewrite.
|
|
Essentially a refactor of the C formed lexer into C++ style. I can
already see some benefits from doing this, in particular speed of
prototyping.
|