Best language to use, as it's already compatible with the headers I'm
using, can slot pretty neatly into the build system, and can reuse the
functions I've built for converting to and from bytecode!
A token_stream being constructed on the spot has different
`used`/`available` semantics from a fully constructed one: a fully
constructed token stream uses `available` to hold the total number of
tokens and `used` as an internal iterator, while one that is still being
constructed uses the semantics of a standard darr.
Furthermore, some loops didn't divide by ~sizeof(token_t)~, which led
to out-of-bounds iteration errors.
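A rough sketch of the two behaviours, assuming a darr-style struct with
`data`, `used` and `available` members (the names and layout here are my
guesses, not necessarily the real ones):

```c
#include <stddef.h>

/* Stand-in for the real token; the actual struct has more members. */
typedef struct
{
  int type;
  char *str;
  size_t str_len;
  size_t line, column;
} token_t;

typedef struct
{
  token_t *data;
  size_t used, available;
} token_stream_t;

/* While being built, `used`/`available` are byte counts as in a standard
   darr, so the number of tokens gathered so far needs a division -- the
   missing division was exactly the cause of the out-of-bounds loops. */
size_t tokens_so_far(const token_stream_t *ts)
{
  return ts->used / sizeof(token_t);
}

/* Once fully constructed, `available` holds the total number of tokens
   and `used` acts as the internal iterator. */
token_t *token_stream_next(token_stream_t *ts)
{
  if (ts->used >= ts->available)
    return NULL;
  return ts->data + ts->used++;
}
```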
We have distinct functions for the use blocks and the macro blocks,
which each generate wholesale new token streams via `token_copy` so we
don't run into weird errors around ownership of the internal strings
of each token.
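The reason for copying: each token owns its internal string, so both
expanders build brand new streams out of deep copies. A minimal sketch of
the assumed behaviour, reusing the stand-in `token_t` from the sketch
above:

```c
#include <stdlib.h>
#include <string.h>

/* Assumed behaviour: deep-copy the token and its internal string so the
   new stream owns all of its memory, and freeing the old stream can't
   leave dangling pointers behind. */
token_t token_copy(token_t t)
{
  token_t copy = t;
  copy.str = malloc(t.str_len + 1);
  memcpy(copy.str, t.str, t.str_len + 1);
  return copy;
}
```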
Furthermore, process_presults now uses the stream index in each
presult to report errors when stuff goes wrong.
So when a presult_t is constructed, it holds the index of the point in
the token stream at which it was constructed. This will be useful when
implementing an error checker in the preprocessing or result parsing
stages.
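Roughly how that reporting could look, building on the sketches above
(the function name and message format are my assumptions):

```c
#include <stdio.h>

/* The presult carries a stream index; to report an error, map it back to
   the token it came from and print that token's position. */
void report_presult_error(const token_stream_t *stream, size_t stream_index,
                          const char *msg)
{
  const token_t *tok = stream->data + stream_index;
  fprintf(stderr, "%zu:%zu: error: %s\n", tok->line, tok->column, msg);
}
```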
%USE <STRING> is the expected call pattern, so there's an error if
there isn't a string after %USE.
The other two errors are file I/O errors, i.e. nonexistent files or
errors in parsing the other file. We don't report specifics about the
other file; that should be up to the user to check themselves.
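A rough sketch of the directive check, with illustrative error and token
type names (the real enum values will differ):

```c
typedef enum
{
  PERR_OK = 0,
  PERR_EXPECTED_STRING,  /* %USE not followed by a string literal */
  PERR_FILE_NONEXISTENT, /* the named file couldn't be opened */
  PERR_FILE_PARSE_ERROR, /* the named file failed to lex/parse */
} perr_t;

/* %USE <STRING>: the token after %USE must be a string literal naming
   the file whose tokens get spliced in. */
perr_t preprocess_use(const token_stream_t *stream, size_t i)
{
  if (i + 1 >= stream->available ||
      stream->data[i + 1].type != TOKEN_LITERAL_STRING)
    return PERR_EXPECTED_STRING;
  /* Lexing and parsing the named file happens here; any failure inside
     that file is collapsed into one of the two file errors above rather
     than forwarding its specifics. */
  return PERR_OK;
}
```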
Preprocessor handles macros and macro blocks by working at the token
level, not doing any high level parsing or instruction making.
Essentially every macro is recorded in a registry, which stores its name
and the tokens assigned to it. Then for every caller it just inserts
the tokens inline, creating a new stream and freeing the old one. It
leaves actual high level parsing to `parse_next` and
`process_presults`.
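Something along these lines, with the registry layout and the
`registry_find`/`stream_push` helpers being my guesses rather than the
actual code:

```c
/* Hypothetical registry entry: a macro is just a name plus its tokens. */
typedef struct
{
  char *name;
  token_t *tokens;
  size_t n_tokens;
} macro_entry_t;

/* Purely token-level expansion: every reference to a registered macro
   gets its tokens copied inline into a fresh stream; the caller then
   frees the old stream.  No instruction-level work happens here. */
token_stream_t expand_macros(const token_stream_t *old,
                             const macro_entry_t *registry, size_t n_macros)
{
  token_stream_t fresh = {0};
  for (size_t i = 0; i < old->available; ++i)
  {
    const macro_entry_t *m =
        registry_find(registry, n_macros, old->data[i].str);
    if (m)
      for (size_t j = 0; j < m->n_tokens; ++j)
        stream_push(&fresh, token_copy(m->tokens[j]));
    else
      stream_push(&fresh, token_copy(old->data[i]));
  }
  return fresh;
}
```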
The lexer will now immediately attempt to eat up any type or trailing
portions of an opcode rather than leaving everything but the root.
This means checking for a type in the parser is a direct check against
the token name rather than handling a dot-prefixed suffix.
The checks are also a bit stronger, so more tokens go straight to symbol
rather than being caught one routine deep on the parser side.
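As a sketch of the first point (the buffer handling and helper name are
assumptions, and "push.byte" is only an illustrative spelling):

```c
#include <ctype.h>
#include <stddef.h>

/* Consume an opcode in one go: the root plus any '.'-separated type or
   later portions, so "push.byte" reaches the parser as a single token
   rather than a root with a dot-prefixed remainder. */
token_t lex_opcode(const char *buf, size_t *offset)
{
  size_t start = *offset;
  while (isalpha((unsigned char)buf[*offset]) || buf[*offset] == '.')
    ++*offset;
  return make_token_from_span(buf, start, *offset); /* assumed helper */
}
```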
Essentially a presult_t contains one of these (a rough sketch of the
struct follows the list):
1) A label construction, which stores the label symbol into
`label` (PRES_LABEL)
2) An instruction that calls upon a label, storing the instruction
in `instruction` and the label name in `label` (PRES_LABEL_ADDRESS)
3) An instruction that uses a relative address offset, storing the
instruction in `instruction` and the offset wanted into
`relative_address` (PRES_RELATIVE_ADDRESS)
4) An instruction that requires no further processing, storing the
instruction into `instruction` (PRES_COMPLETE_INSTRUCTION)
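Put together, a hedged guess at the shape of presult_t (the member names
come from the text above; the layout and the `inst_t` type are
assumptions):

```c
typedef enum
{
  PRES_LABEL,                /* label construction */
  PRES_LABEL_ADDRESS,        /* instruction referring to a label */
  PRES_RELATIVE_ADDRESS,     /* instruction using a relative offset */
  PRES_COMPLETE_INSTRUCTION, /* nothing left to resolve */
} pres_type_t;

typedef struct
{
  pres_type_t type;
  size_t stream_index;   /* token stream index at construction */
  char *label;           /* PRES_LABEL, PRES_LABEL_ADDRESS */
  inst_t instruction;    /* everything except PRES_LABEL */
  long relative_address; /* PRES_RELATIVE_ADDRESS */
} presult_t;
```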
In the processing stage, we resolve all calls by iterating over the
presults one by one and maintaining an absolute instruction address.
Pretty nice, though there's a lot more machinery involved in parsing now.
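A sketch of that pass; how relative offsets combine with the absolute
address, the `register_label` helper and the `operand` member are all my
assumptions:

```c
/* One pass over the presults, tracking the absolute address each emitted
   instruction will land at. */
void resolve_addresses(presult_t *presults, size_t n_presults)
{
  size_t addr = 0;
  for (size_t i = 0; i < n_presults; ++i)
  {
    presult_t *p = presults + i;
    switch (p->type)
    {
    case PRES_LABEL:
      register_label(p->label, addr); /* labels emit nothing themselves */
      break;
    case PRES_RELATIVE_ADDRESS:
      p->instruction.operand = addr + p->relative_address;
      ++addr;
      break;
    case PRES_LABEL_ADDRESS:
    case PRES_COMPLETE_INSTRUCTION:
      ++addr; /* label references get patched once all labels are known */
      break;
    }
  }
}
```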
Much simpler: uses a switch case, which is a much faster way of doing the
parsing. Though roughly equivalent in terms of LOC, I feel that this is
more extensible.
More useful tokens, in particular one for each possible opcode. This
makes parsing a simpler task to reason about, as now we're just checking
against an enum rather than doing a string check in linear time.
It makes more sense to do this at the tokeniser, as the local data from
the buffer will most likely be in the cache since the buffer is
contiguously allocated. While linear-time checks on strings will always
be slow, doing them at the parser means checking strings that may be
allocated in a variety of different places. That makes caching a harder
task, whereas with this approach we're less likely to get cache misses
as long as the buffer stays resident.
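A sketch of what that buys us in `parse_next`; the signature, the token
type names and the `parse_*` helpers are my guesses:

```c
/* Dispatch on the token's enum value instead of comparing strings; each
   opcode (including the typed variants lexed above) gets its own case. */
perr_t parse_next(token_stream_t *stream, presult_t *out)
{
  const token_t *tok = token_stream_next(stream);
  switch (tok->type)
  {
  case TOKEN_NOOP:
    return parse_no_operand(stream, out);
  case TOKEN_PUSH_BYTE:
    return parse_push_byte(stream, out);
  case TOKEN_SYMBOL:
    return parse_label_or_reference(stream, out);
  default:
    return PERR_UNEXPECTED_TOKEN;
  }
}
```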
As strto(ul|ll) allows parsing of hex literals of the form `0x...`, we
allow lexing of hex literals which start with `x`. They're lexed into C
hex literals which work for strtol.
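A sketch of the conversion, assuming the lexeme arrives as something like
`x1F` (the function name is mine):

```c
#include <stdlib.h>
#include <string.h>

/* Rewrite an assembly-style hex literal such as "x1F" into the C-style
   "0x1F" that strtoull (with base 0) understands. */
unsigned long long parse_hex_literal(const char *lexeme)
{
  size_t len = strlen(lexeme);
  char *c_literal = malloc(len + 2);
  c_literal[0] = '0';
  memcpy(c_literal + 1, lexeme, len + 1); /* copies "x..." plus the NUL */
  unsigned long long value = strtoull(c_literal, NULL, 0);
  free(c_literal);
  return value;
}
```

So `parse_hex_literal("x1F")` would give 31.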
As it has no dependencies on vm specifically, and it's more necessary
for any vendors who wish to target the virtual machine, it makes more
sense for inst to be a lib module rather than a vm module.
Introduced some functions to parse differing types of opcodes. They use
the same a.b.c... style for namespacing or type specification for
certain opcodes. A bit hacky and not tested, but it does work.
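A rough idea of splitting such a dotted name; the helper is hypothetical
and "push.byte" is only an example spelling:

```c
#include <stddef.h>
#include <string.h>

/* Split a dotted opcode name into its root and the rest of the a.b.c...
   chain; returns NULL when there is no namespace/type portion. */
const char *opcode_type_portion(const char *name, char *root,
                                size_t root_cap)
{
  const char *dot = strchr(name, '.');
  size_t n = dot ? (size_t)(dot - name) : strlen(name);
  if (n >= root_cap)
    n = root_cap - 1;
  memcpy(root, name, n);
  root[n] = '\0';
  return dot ? dot + 1 : NULL; /* "push.byte" -> root "push", rest "byte" */
}
```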
Parse errors can be reported with an exact location using the token's
line and column.