Commit Graph

385 Commits

Author SHA1 Message Date
Aryadev Chavali
6ae0bbedc5 Plug preprocesser into main 2024-07-07 19:06:56 +01:00
Aryadev Chavali
a422c7d1dc A reworked preprocesser with focus on stopping recursive errors
Preprocesser requires one function to use: preprocess.  Takes Tokens
and gives back Units.

A unit is a tree of tokens, where each unit is a node in that tree.  A
unit has a "root" token (value of node) and an "expansion" (children
of node) where the root is some preprocesser token (such as a
reference or USE call) and the expansion is the tokens it yields.  In
the case of a USE call this is the tokens of the file it includes, in
the case of a reference it's the tokens of the constant it refers to.
This means that the leaves of the tree of units are the completely
preprocessed/expanded form of the source code.
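
As a rough illustration of the unit tree described above (not the code from this commit), a unit can be modelled as a root token plus a list of child units. The names `Token`, `Unit`, `expansion` and `flatten` are assumptions made for this sketch, and it treats any unit with an empty expansion as a leaf.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the lexer's token; only the content matters here.
struct Token
{
  std::string content;
};

// A unit is a node in the tree: a root token plus the units it expands to.
// In this sketch a unit with no expansion is a leaf, i.e. an ordinary token.
struct Unit
{
  Token root;
  std::vector<Unit> expansion;
};

// Collect the leaves from left to right; together they form the fully
// expanded token stream.
void flatten(const Unit &unit, std::vector<Token> &out)
{
  if (unit.expansion.empty())
  {
    out.push_back(unit.root);
    return;
  }
  for (const Unit &child : unit.expansion)
    flatten(child, out);
}
```

Walking the tree this way drops the preprocesser tokens themselves (the roots of inner nodes) and keeps only what they yield.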

The function has many working components, which may need to be
extracted.  In particular, the function ensures we don't include a
source twice through a hash map and that constants are not redefined
in inner include scopes if they're already defined in outer
scopes (i.e. if compiling a.asm, which defines constant N and then
includes b.asm, which also defines constant N, then N uses the
definition from a.asm rather than b.asm).
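
The two guarantees above can be sketched with a pair of hash containers. This is a minimal sketch, not the commit's implementation: `Context`, `mark_included` and `define_constant` are hypothetical names, a hash set stands in for the hash map mentioned above, and it assumes the outer definition is seen before the inner include is processed.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical bookkeeping for one preprocessing run.
struct Context
{
  std::unordered_set<std::string> included;               // sources already expanded
  std::unordered_map<std::string, std::string> constants; // constant name -> defining source
};

// Returns true only the first time a source is seen, so a second
// inclusion of the same file can be skipped.
bool mark_included(Context &ctx, const std::string &source)
{
  return ctx.included.insert(source).second;
}

// Returns true if this definition was accepted. If the name already exists
// (defined in an outer scope), the earlier definition is kept untouched.
bool define_constant(Context &ctx, const std::string &name, const std::string &source)
{
  return ctx.constants.emplace(name, source).second;
}
```

With a.asm processed first, a later attempt by b.asm to define N is rejected and N keeps pointing at a.asm.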

I need to make a spec for this.
2024-07-06 17:38:02 +01:00
Aryadev Chavali
1145b97c4c Token to_string now includes source name and is printed error style
So instead of the previous weird format, we have the format
<source_name>:<line>:<column>: <TYPE> which also allows me to quickly
go to that token via Emacs' (compile).
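
A minimal sketch of that format, assuming a simplified `Token` whose field names (`source_name`, `line`, `column`, `type`) are invented for illustration. The real token presumably stores a `Token::Type` rather than a string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, simplified token; the type is kept as a string for brevity.
struct Token
{
  std::string source_name;
  std::size_t line;
  std::size_t column;
  std::string type;
};

// Builds "<source_name>:<line>:<column>: <TYPE>", the same shape compilers
// use for diagnostics, which is what lets editors jump to the location.
std::string to_string(const Token &token)
{
  return token.source_name + ":" + std::to_string(token.line) + ":" +
         std::to_string(token.column) + ": " + token.type;
}
```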
2024-07-06 17:36:58 +01:00
Aryadev Chavali
f9acb23671 Lexer errors contain the source name and tokenise_symbol refactor 2024-07-05 22:47:25 +01:00
Aryadev Chavali
036ac03176 Lexer tokens now include source name as part of the token 2024-07-05 18:18:11 +01:00
Aryadev Chavali
65ce50f620 Fix copyright notices and includes 2024-07-03 16:55:19 +01:00
Aryadev Chavali
15d39dcfe7 Reworked lexer to deal with invalid type suffixes
Now ~push.magic~ will result in an error about it being an invalid
type suffix.
2024-07-03 16:55:19 +01:00
Aryadev Chavali
42dbf515f0 Deleted preprocesser
Will be reworking it later
2024-07-03 16:55:19 +01:00
Aryadev Chavali
683636c66d Rewriting lexer TODO 2024-06-01 14:37:06 +01:00
Aryadev Chavali
76bb5ec7d9 (Lexer)+to_string functions for Err, Err::Type 2024-06-01 13:53:54 +01:00
Aryadev Chavali
4625b3b7a5 (Lexer)+to_string functions for Token, Token::Type 2024-06-01 13:51:10 +01:00
Aryadev Chavali
a4689f9dd0 Lexer call pattern is now Err return with reference to token 2024-06-01 13:40:17 +01:00
Aryadev Chavali
7e9af309e3 lerr_t and lerr_type_t -> Lexer::Err and Lexer::Err::Type 2024-06-01 13:40:17 +01:00
Aryadev Chavali
4b85f90a52 Namespace the lexer module
Future proofing any name collisions.
2024-06-01 01:52:17 +01:00
Aryadev Chavali
83ad8b832b token_type_t -> Token::Type
Implicit namespacing using the struct
2024-06-01 01:49:24 +01:00
Aryadev Chavali
f5d8777b7a token_t -> Token
Use C++'s implicit typedef
2024-06-01 01:48:11 +01:00
Aryadev Chavali
f3f7578811 Update lexer trivially
HALT is now an opcode, which we deal with already.
2024-06-01 01:47:16 +01:00
Aryadev Chavali
bbb8ed1337 Merge remote-tracking branch 'github/master' 2024-06-01 01:20:16 +01:00
Aryadev Chavali
2d5d8c7904 Update AVM and clean dir-locals 2024-06-01 01:18:56 +01:00
Aryadev Chavali
1ce5bf556e Update AVM 2024-04-16 20:52:32 +06:30
Aryadev Chavali
52843e2e14 Removed workflow as it doesn't work with submodules 2024-04-16 20:43:56 +06:30
Aryadev Chavali
f060a856d3 Fixed Makefile so it tracks dependencies better
It now tracks main.cpp's dependencies and rebuilds them as needed.
2024-04-16 20:42:51 +06:30
Aryadev Chavali
190bb766cb Added a TODO to write a specification for the assembly language 2024-04-16 19:18:35 +06:30
Aryadev Chavali
3b9e573c4a Made AVM a git submodule, updated the Makefile to build assembler
Also updated dir-locals to make include path resolution accurate.
2024-04-16 19:18:32 +06:30
Aryadev Chavali
9d72c9177d Clean up work tree for making assembler 2024-04-16 19:14:24 +06:30
Aryadev Chavali
2a1d006a88 Updated README about change to project 2024-04-16 15:49:26 +06:30
Aryadev Chavali
8f75241bcb Halting work on preprocesser units and rewrite as a whole
I've decided to split the project into 2 repositories: the assembler
and the runtime.  The runtime will contain both the executable and
lib/ while the assembler will have the runtime as a git submodule and
use it to build.  I think this is a clean solution, a lot cleaner than
having them all in one project where the Makefile has to massively
expand.
2024-04-16 15:42:59 +06:30
Aryadev Chavali
d5c43b1c3f Wrote up some notes on how preprocesser language may work
Bit formal and really excessively written but I needed my thoughts
down.
2024-04-16 15:42:34 +06:30
Aryadev Chavali
715facf015 Updated README lines of code 2024-04-16 15:42:22 +06:30
Aryadev Chavali
4ecd184759 lerr_type_t::UNKNOWN_CHAR -> UNKNOWN_LEXEME 2024-04-16 15:41:01 +06:30
Aryadev Chavali
27d6a47320 Clean up error message from preprocesser 2024-04-16 15:40:49 +06:30
Aryadev Chavali
3fc1f08134 Fix bug where CONST table didn't actually store symbol names
Pretty simple fix, stupid error in hindsight.
2024-04-16 15:40:00 +06:30
Aryadev Chavali
4b3e9b3567 Clear vector after deleting all tokens
Ensures that iteration over vec_out by caller doesn't occur (such as
in a loop to free the memory).
2024-04-16 15:39:20 +06:30
Aryadev Chavali
05136fdd25 Fixed examples for changes in lexer
Name assigned to %CONST is the next symbol in stream, not the symbol
attached to it.
2024-04-16 15:38:24 +06:30
Aryadev Chavali
1e7f1bdee9 Changed %const format in preprocesser now
Instead of %const(<name>) ... %end it will now be %const <name>
... %end i.e. the first symbol after %const will be considered the
name of the constant similar to %use.
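
One way to read the new format, sketched over a plain list of strings rather than the real token type. `Constant` and `parse_const` are hypothetical names, and rejecting `%end` as a constant name is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result of parsing one constant definition.
struct Constant
{
  std::string name;
  std::vector<std::string> body;
};

// Parses "%const <name> ... %end" starting at tokens[i].
// On success, fills `out`, moves `i` past "%end" and returns true.
// On failure (missing name or missing %end), `i` and `out` are left unchanged.
bool parse_const(const std::vector<std::string> &tokens, std::size_t &i, Constant &out)
{
  if (i + 1 >= tokens.size() || tokens[i] != "%const" || tokens[i + 1] == "%end")
    return false;

  Constant parsed{tokens[i + 1], {}}; // first symbol after %const is the name
  std::size_t j = i + 2;
  for (; j < tokens.size() && tokens[j] != "%end"; ++j)
    parsed.body.push_back(tokens[j]);

  if (j == tokens.size())
    return false; // ran out of tokens before finding %end

  out = parsed;
  i = j + 1;
  return true;
}
```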
2024-04-15 18:39:37 +06:30
Aryadev Chavali
ba3525d533 preprocesser publicly exposes only preprocesser function
The preprocess_* functions are now privately contained within the
implementation file to help the preprocesser outer function.

Furthermore I've simplified the API of the preprocess_* functions by
making them only return pp_err_t and store their results in a vector
parameter taken by reference.
2024-04-15 18:37:45 +06:30
Aryadev Chavali
d594c0c531 Annotate some completed todos in todo.org 2024-04-15 16:35:44 +06:30
Aryadev Chavali
e960af2904 Added some VERBOSE checked messages into asm/main 2024-04-15 16:33:22 +06:30
Aryadev Chavali
4a7341e26c Propagate changes to lerr_t into preprocesser 2024-04-15 16:33:02 +06:30
Aryadev Chavali
b83bdd0d45 preprocesser function now only returns a pp_err_t
We leave the parameter tokens alone, considering it constant, while
the parameter vec_out is used to hold the new stream of tokens.  This
allows the caller to have a before and after view on the token stream
and reduces the worry of double frees.
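
The calling convention might look roughly like the sketch below. The `Token` struct, the `pp_err_t` enumerators and the `%bogus` failure trigger are all placeholders, not the project's definitions. The clear-on-failure step follows the "Clear vector after deleting all tokens" commit listed above.

```cpp
#include <string>
#include <vector>

struct Token
{
  std::string content;
};

// Hypothetical error type; the real pp_err_t likely carries more detail.
enum class pp_err_t
{
  OK,
  UNKNOWN_DIRECTIVE
};

// `tokens` is read only. Every token placed in `vec_out` is a fresh copy
// owned by the caller, so freeing one stream never touches the other.
pp_err_t preprocesser(const std::vector<Token *> &tokens, std::vector<Token *> &vec_out)
{
  for (const Token *token : tokens)
  {
    if (token->content == "%bogus") // stand-in for any preprocessing failure
    {
      for (Token *made : vec_out)
        delete made;
      vec_out.clear(); // leave nothing for the caller to iterate over or free
      return pp_err_t::UNKNOWN_DIRECTIVE;
    }
    vec_out.push_back(new Token(*token));
  }
  return pp_err_t::OK;
}
```

Because the input vector is never modified, the caller can compare the streams before and after, and each stream is freed exactly once by whoever owns it.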
2024-04-15 16:31:45 +06:30
Aryadev Chavali
c748ed8386 Changed output schema for printing tokens 2024-04-15 16:31:07 +06:30
Aryadev Chavali
83faf86312 Fix error where lexer would loop infinitely if unknown character found 2024-04-15 16:30:44 +06:30
Aryadev Chavali
0f430e399c Changed hex format from x<digits> -> 0x<digits> 2024-04-15 16:30:30 +06:30
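
For illustration only, a recogniser for the new 0x<digits> form could look like the sketch below. `parse_hex` is a hypothetical helper, not the lexer's function. Accepting upper-case digits, rejecting an upper-case `X` prefix, and ignoring overflow are assumptions of this sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Returns true and stores the value if `lexeme` is "0x" followed by at least
// one hex digit. The old "x<digits>" form is rejected. `out` is only written
// on success. Overflow past 64 bits is not detected in this sketch.
bool parse_hex(const std::string &lexeme, std::uint64_t &out)
{
  if (lexeme.size() < 3 || lexeme[0] != '0' || lexeme[1] != 'x')
    return false;

  std::uint64_t value = 0;
  for (std::size_t i = 2; i < lexeme.size(); ++i)
  {
    const unsigned char c = static_cast<unsigned char>(lexeme[i]);
    if (!std::isxdigit(c))
      return false;
    const int digit = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
    value = value * 16 + static_cast<std::uint64_t>(digit);
  }
  out = value;
  return true;
}
```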
Aryadev Chavali
175138f570 Lexical error on char literal being too big or too small
This is actually an improvement on the older lexer.
2024-04-15 16:29:42 +06:30
Aryadev Chavali
a2d98142d5 lerr_t is now a struct with constructors
Similar principle to pp_err_t in that a structure provides the
opportunity for more information about the error such as location.
2024-04-15 16:29:37 +06:30
Aryadev Chavali
ae3794c33e constexpr -> const in lexer.cpp
Not much of an actual performance change, more semantic meaning for me.
2024-04-15 16:27:05 +06:30
Aryadev Chavali
a70dcf2e3f ~extern "C"~ when including lib/inst.h 2024-04-15 16:26:44 +06:30
Aryadev Chavali
5319fd815d Dependencies stricter and prerequisites for directories in Makefile
Don't need to make a directory every time I compile some code.
2024-04-15 16:25:35 +06:30
Aryadev Chavali
58069e083d Auto filled copyright notice in asm/base 2024-04-15 16:25:14 +06:30
Aryadev Chavali
958868714e main.cpp now preprocesses tokens and prints the output 2024-04-15 05:56:01 +06:30