360 Commits

Author SHA1 Message Date
Aryadev Chavali
2a1d006a88 Updated README about change to project
2024-04-16 15:49:26 +06:30
Aryadev Chavali
8f75241bcb Halting work on preprocesser units and rewrite as a whole
I've decided to split the project into 2 repositories: the assembler
and the runtime.  The runtime will contain both the executable and
lib/ while the assembler will have the runtime as a git submodule and
use it to build.  I think this is a clean solution, a lot cleaner than
having them all in one project where the Makefile has to massively
expand.
2024-04-16 15:42:59 +06:30
Aryadev Chavali
d5c43b1c3f Wrote up some notes on how preprocesser language may work
Bit formal and really excessively written but I needed my thoughts
down.
2024-04-16 15:42:34 +06:30
Aryadev Chavali
715facf015 Updated README lines of code 2024-04-16 15:42:22 +06:30
Aryadev Chavali
4ecd184759 lerr_type_t::UNKNOWN_CHAR -> UNKNOWN_LEXEME 2024-04-16 15:41:01 +06:30
Aryadev Chavali
27d6a47320 Clean up error message from preprocesser 2024-04-16 15:40:49 +06:30
Aryadev Chavali
3fc1f08134 Fix bug where CONST table didn't actually store symbol names
Pretty simple fix, stupid error in hindsight.
2024-04-16 15:40:00 +06:30
Aryadev Chavali
4b3e9b3567 Clear vector after deleting all tokens
Ensures that iteration over vec_out by caller doesn't occur (such as
in a loop to free the memory).
2024-04-16 15:39:20 +06:30
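The delete-then-clear pattern from the commit above could look roughly like the sketch below. It assumes tokens are held as raw owning pointers in a std::vector; token_t and free_tokens are hypothetical names used for illustration, not taken from the repository.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the lexer's token type.
struct token_t
{
  std::string content;
};

// Delete every heap-allocated token, then clear the vector so the caller
// is left with an empty vector instead of a list of dangling pointers.
void free_tokens(std::vector<token_t *> &vec_out)
{
  for (token_t *token : vec_out)
    delete token;
  vec_out.clear();
}
```

Because the vector is empty afterwards, a caller that loops over it to free the tokens again simply does nothing, which avoids a double free.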
Aryadev Chavali
05136fdd25 Fixed examples for changes in lexer
Name assigned to %CONST is the next symbol in stream, not the symbol
attached to it.
2024-04-16 15:38:24 +06:30
Aryadev Chavali
1e7f1bdee9 Changed %const format in preprocesser now
Instead of %const(<name>) ... %end it will now be %const <name>
... %end i.e. the first symbol after %const will be considered the
name of the constant similar to %use.
2024-04-15 18:39:37 +06:30
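To illustrate the new %const <name> ... %end shape, here is a minimal sketch that reads the name as the first symbol after %const. It works on plain strings rather than real tokens, matches the directive in lower case only, and const_block_t and parse_const are hypothetical names; the actual preprocesser may differ on all of these points.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical result type: the constant's name and the symbols in its body.
struct const_block_t
{
  std::string name;
  std::vector<std::string> body;
};

// Parse a block starting at tokens[start].  Returns std::nullopt if the
// block does not start with %const, has no name, or is missing its %end.
std::optional<const_block_t>
parse_const(const std::vector<std::string> &tokens, std::size_t start)
{
  if (start >= tokens.size() || tokens[start] != "%const")
    return std::nullopt;
  if (start + 1 >= tokens.size())
    return std::nullopt;

  // The first symbol after %const is taken as the name.
  const_block_t block{tokens[start + 1], {}};
  for (std::size_t i = start + 2; i < tokens.size(); ++i)
  {
    if (tokens[i] == "%end")
      return block;
    block.body.push_back(tokens[i]);
  }
  return std::nullopt;
}
```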
Aryadev Chavali
ba3525d533 preprocesser publicly exposes only preprocesser function
The preprocess_* functions are now privately contained within the
implementation file to help the preprocesser outer function.

Furthermore I've simplified the API of the preprocess_* functions by
making them only return pp_err_t and store their results in a vector
parameter taken by reference.
2024-04-15 18:37:45 +06:30
Aryadev Chavali
d594c0c531 Annotate some completed todos in todo.org 2024-04-15 16:35:44 +06:30
Aryadev Chavali
e960af2904 Added some VERBOSE checked messages into asm/main 2024-04-15 16:33:22 +06:30
Aryadev Chavali
4a7341e26c Propagate changes to lerr_t into preprocesser 2024-04-15 16:33:02 +06:30
Aryadev Chavali
b83bdd0d45 preprocesser function now only returns a pp_err_t
We leave the parameter tokens alone, considering it constant, while
the parameter vec_out is used to hold the new stream of tokens.  This
allows the caller to have a before and after view on the token stream
and reduces the worry of double frees.
2024-04-15 16:31:45 +06:30
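The calling convention described above (input tokens left constant, results written to vec_out, only a pp_err_t returned) might be sketched as follows. The names preprocesser, pp_err_t and vec_out come from the commit messages; token_t, pp_err_type_t, its enumerators and the null-token check are assumptions made so the example is self-contained.

```cpp
#include <string>
#include <vector>

// Hypothetical token type.
struct token_t
{
  std::string content;
};

// Hypothetical error kinds; the real set is not shown in this log.
enum class pp_err_type_t
{
  OK,
  NULL_TOKEN
};

struct pp_err_t
{
  pp_err_type_t type;
  const token_t *reference; // token the error occurred on, if any

  pp_err_t() : type(pp_err_type_t::OK), reference(nullptr) {}
  pp_err_t(pp_err_type_t type, const token_t *reference)
      : type(type), reference(reference) {}
};

// Reads `tokens` without modifying it and writes fresh copies to vec_out.
// On failure the partial output is freed and cleared.
pp_err_t preprocesser(const std::vector<token_t *> &tokens,
                      std::vector<token_t *> &vec_out)
{
  for (const token_t *token : tokens)
  {
    if (token == nullptr)
    {
      for (token_t *copied : vec_out)
        delete copied;
      vec_out.clear();
      return pp_err_t(pp_err_type_t::NULL_TOKEN, nullptr);
    }
    vec_out.push_back(new token_t(*token));
  }
  return pp_err_t();
}
```

Since the output holds its own copies, the caller can free the input and output vectors independently.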
Aryadev Chavali
c748ed8386 Changed output schema for printing tokens 2024-04-15 16:31:07 +06:30
Aryadev Chavali
83faf86312 Fix error where lexer would loop infinitely if unknown character found 2024-04-15 16:30:44 +06:30
Aryadev Chavali
0f430e399c Changed hex format from x<digits> -> 0x<digits> 2024-04-15 16:30:30 +06:30
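A check for the new 0x<digits> form could be sketched like this. is_hex_literal is a hypothetical helper, and the sketch assumes a lower-case x prefix and at least one digit, neither of which the commit message confirms.

```cpp
#include <cctype>
#include <string_view>

// True if `text` is "0x" followed by one or more hexadecimal digits.
bool is_hex_literal(std::string_view text)
{
  if (text.size() < 3 || text[0] != '0' || text[1] != 'x')
    return false;
  for (char c : text.substr(2))
  {
    // Cast first: std::isxdigit is undefined for negative char values.
    if (!std::isxdigit(static_cast<unsigned char>(c)))
      return false;
  }
  return true;
}
```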
Aryadev Chavali
175138f570 Lexical error on char literal being too big or too small
This is actually an improvement on the older lexer.
2024-04-15 16:29:42 +06:30
Aryadev Chavali
a2d98142d5 lerr_t is now a struct with constructors
Similar principle to pp_err_t in that a structure provides the
opportunity for more information about the error such as location.
2024-04-15 16:29:37 +06:30
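As a rough sketch of an error structure with constructors that carries a location: lerr_t, lerr_type_t and UNKNOWN_LEXEME appear in this log, while the OK enumerator, the line and column fields and the ok() helper are assumptions for illustration.

```cpp
#include <cstddef>

enum class lerr_type_t
{
  OK, // assumed "no error" value
  UNKNOWN_LEXEME
};

struct lerr_t
{
  lerr_type_t type;
  std::size_t line, column; // assumed location fields

  // Default construction means "no error".
  lerr_t() : type(lerr_type_t::OK), line(0), column(0) {}

  lerr_t(lerr_type_t type, std::size_t line, std::size_t column)
      : type(type), line(line), column(column) {}

  bool ok() const { return type == lerr_type_t::OK; }
};
```

Compared with returning a bare enum, the caller gets the position of the failure alongside its kind, which is what makes top-level error messages easy to write.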
Aryadev Chavali
ae3794c33e constexpr -> const in lexer.cpp
Not much of an actual performance change, more semantic meaning for me.
2024-04-15 16:27:05 +06:30
Aryadev Chavali
a70dcf2e3f ~extern "C"~ when including lib/inst.h 2024-04-15 16:26:44 +06:30
Aryadev Chavali
5319fd815d Dependencies stricter and prerequisites for directories in Makefile
Don't need to make a directory every time I compile some code.
2024-04-15 16:25:35 +06:30
Aryadev Chavali
58069e083d Auto filled copyright notice in asm/base 2024-04-15 16:25:14 +06:30
Aryadev Chavali
958868714e main.cpp now preprocesses tokens and prints the output 2024-04-15 05:56:01 +06:30
Aryadev Chavali
e22ed450ac Fix some off by one errors 2024-04-15 05:34:52 +06:30
Aryadev Chavali
940dd2021e Fix issue with use_blocks not being preprocessed 2024-04-15 05:34:41 +06:30
Aryadev Chavali
7a6275c0a1 fix memory leak through vec.clear
vec.clear() doesn't delete pointers (unless they're smart) so I need
to do it myself.
2024-04-15 05:34:02 +06:30
Aryadev Chavali
8d3951a871 Implemented preprocesser function. 2024-04-15 05:08:55 +06:30
Aryadev Chavali
1e1a13e741 Default constructor for pp_err_t 2024-04-15 05:08:40 +06:30
Aryadev Chavali
0e5c934072 preprocess_* now uses const references to tokens
They copy and construct new token vectors and just read the token
inputs.
2024-04-15 05:08:07 +06:30
Aryadev Chavali
9ca93786af Updated main.cpp for changes to lexer 2024-04-15 05:07:16 +06:30
Aryadev Chavali
ec87245724 Implemented preprocess_const_blocks
Once again quite similar to preprocess_macro_blocks but shorter,
easier to use and easier to read. (76 vs 109)
2024-04-15 04:55:51 +06:30
Aryadev Chavali
81efc9006d Implement printing of pp_err_t
Another great thing for C++: the ability to tell it how to print
structures the way I want.  In C it's either:
1) Write a function to print the structure out (preferably to a file
pointer)
2) Write a function to return a string (allocated on the heap) which
represents it

Both are not fun to write, whereas it's much easier to write this.
2024-04-15 04:55:51 +06:30
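The C++ facility referred to above is an operator<< overload for std::ostream. A minimal sketch, assuming a simplified pp_err_t with hypothetical fields and enumerators (the real structure holds more than this), and with render as a hypothetical helper that shows the overload working with any stream:

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical error kinds and fields, for illustration only.
enum class pp_err_type_t
{
  OK,
  UNKNOWN_NAME
};

struct pp_err_t
{
  pp_err_type_t type;
  std::string name;
};

// Teach ostream how to print a pp_err_t.
std::ostream &operator<<(std::ostream &os, const pp_err_t &err)
{
  switch (err.type)
  {
  case pp_err_type_t::OK:
    return os << "OK";
  case pp_err_type_t::UNKNOWN_NAME:
    return os << "UNKNOWN_NAME(" << err.name << ")";
  }
  return os;
}

// The same overload works for std::cout, files and string streams alike.
std::string render(const pp_err_t &err)
{
  std::ostringstream stream;
  stream << err;
  return stream.str();
}
```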
Aryadev Chavali
929e5a3d0d Implement constructors for pp_err_t 2024-04-15 04:55:51 +06:30
Aryadev Chavali
0a93ad5a8a Implement preprocess_use_blocks
While being very similar in style to the C version, it takes 27 lines
of code less to implement it due to the niceties of C++ (41 lines vs
68).
2024-04-15 04:55:51 +06:30
Aryadev Chavali
f661438c93 Moved read_file to a general base library 2024-04-15 04:55:51 +06:30
Aryadev Chavali
0385d4bb8d Fix some off by one errors in lexer 2024-04-15 04:43:58 +06:30
Aryadev Chavali
f01d64b5f4 lexer now produces a vector of heap allocated tokens
This removes the problem of possibly expensive copies occurring due to
working with tokens produced from the lexer (that C++ just... does):
now we hold pointers, which are cheap to copy.

I want expensive stuff to be done by me and for a reason: I want to
be holding the shotgun.
2024-04-15 04:42:24 +06:30
Aryadev Chavali
062ed12278 Rewrote preprocesser API
This C++ rewrite allows me to rewrite the actual API of the system.
In particular, I'm no longer restricting myself to just using enums
then figuring out a way to get proper error logging later down the
line (through tracking tokens in the buffer internally, for example).

Instead I can now design error structures which hold references to the
token they occurred on as well as possible lexical errors (if they're
a FILE_LEXICAL_ERROR which occurs due to the ~%USE~ macro).  This
means it's a lot easier to write error logging now at the top level.
2024-04-15 04:37:43 +06:30
Aryadev Chavali
72ef40e671 parser -> preprocesser + parser
I've decided to split the module parsing into two modules, one for the
preprocessing stage which only deals with tokens and the parsing stage
which generates bytecode.
2024-04-14 17:25:28 +06:30
Aryadev Chavali
86e9d51ab0 enum -> enum class in lexer
This makes enum elements scoped which is actually quite useful as I
prefer the namespacing that enums give in C++.
2024-04-14 17:17:51 +06:30
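A small sketch of what scoping buys: with enum class, two enumerations can reuse an enumerator name and every use must be qualified. The enumerators and is_error below are hypothetical, chosen only to show the name reuse.

```cpp
// Both enums declare UNKNOWN; with plain `enum` this would be a
// redefinition error because the names would share one scope.
enum class token_type_t
{
  SYMBOL,
  UNKNOWN
};

enum class lerr_type_t
{
  OK,
  UNKNOWN
};

// Enumerators must be written with their enum's name in front.
bool is_error(lerr_type_t type)
{
  return type != lerr_type_t::OK;
}
```

Scoped enumerators also do not convert implicitly to int, so mixing the two types up is a compile error rather than a silent bug.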
Aryadev Chavali
86aca9a596 Added static assert to lexer in case of opcode changes 2024-04-14 17:12:24 +06:30
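One common way to guard a lexer against opcode changes is to assert on a sentinel enumerator at compile time. The sketch below uses a hypothetical opcode list and name table; the project's real opcodes and the exact condition it asserts are not shown in this log.

```cpp
#include <cstddef>
#include <string>

// Hypothetical opcode list with a sentinel that counts the entries.
enum opcode_t
{
  OP_NOOP = 0,
  OP_PUSH,
  OP_POP,
  OP_HALT,
  NUMBER_OF_OPCODES // keep last
};

// Table the lexer would consult; it must stay in sync with opcode_t.
static const char *const OPCODE_NAMES[] = {"NOOP", "PUSH", "POP", "HALT"};

// Compilation fails here if an opcode is added or removed without
// updating the table, instead of failing at runtime.
static_assert(NUMBER_OF_OPCODES == 4,
              "opcode list changed: update the lexer");
static_assert(sizeof(OPCODE_NAMES) / sizeof(OPCODE_NAMES[0]) ==
                  static_cast<std::size_t>(NUMBER_OF_OPCODES),
              "OPCODE_NAMES is out of sync with opcode_t");

std::string opcode_name(opcode_t op)
{
  return OPCODE_NAMES[op];
}
```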
Aryadev Chavali
e5ef0292e7 asm/main now tokenises and prints the tokens of a given file
With error checking!
2024-04-14 17:11:48 +06:30
Aryadev Chavali
d368a49f56 Implemented a function to read a file in full
Uses std::optional in case file doesn't exist.
2024-04-14 17:10:23 +06:30
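Reading a whole file and signalling a missing file through std::optional could be sketched as below. The name read_file appears later in this log; the signature and the stream-based implementation are assumptions (C++17 or later).

```cpp
#include <fstream>
#include <optional>
#include <sstream>
#include <string>

// Returns the full contents of the file at `path`, or std::nullopt if
// the file cannot be opened (for example because it does not exist).
std::optional<std::string> read_file(const char *path)
{
  std::ifstream input(path);
  if (!input.is_open())
    return std::nullopt;

  std::stringstream buffer;
  buffer << input.rdbuf(); // copy the entire stream in one go
  return buffer.str();
}
```

The caller checks has_value() before using the contents, so a missing file cannot be mistaken for an empty one.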
Aryadev Chavali
98d4f73134 asm/main now prints usage 2024-04-14 17:09:49 +06:30
Aryadev Chavali
44305138b0 Implemented cstr functions. 2024-04-14 17:05:56 +06:30
Aryadev Chavali
e55871195a Implemented overload for ostream and token as well as constructors for token 2024-04-14 17:05:52 +06:30
Aryadev Chavali
a8f605c89b Implemented tokenise_buffer
Note that this is basically the same as the previous version,
excluding the fact that it uses C++ idioms more and does a bit better
in error checking.
2024-04-14 17:04:15 +06:30
Aryadev Chavali
7a9e646d39 Implemented tokenise_literal_string
One thing I've realised is that even methods such as this require
error tracking.  I won't implement it in the tokenise method as it's
not related to consuming the string per se but instead in the main method.
2024-04-14 17:02:45 +06:30