ovm - ARCHIVED - A stack based virtual machine to act as a target for other programming languages

Age	Commit message	Author
2024-04-16	Clear vector after deleting all tokens	Aryadev Chavali
	Ensures that the caller does not iterate over the deleted tokens in vec_out (such as in a loop to free the memory).
2024-04-16	Fixed examples for changes in lexer	Aryadev Chavali
	Name assigned to %CONST is the next symbol in stream, not the symbol attached to it.
2024-04-15	Changed %const format in preprocesser now	Aryadev Chavali
	Instead of %const(<name>) ... %end, the format is now %const <name> ... %end, i.e. the first symbol after %const is treated as the name of the constant, similar to %use.
2024-04-15	preprocesser publicly exposes only preprocesser function	Aryadev Chavali
	The preprocess_* functions are now privately contained within the implementation file to help the preprocesser outer function. Furthermore, I've simplified the API of the preprocess_* functions by making them only return pp_err_t and store their results in a vector parameter taken by reference.
2024-04-15	Annotate some completed todos in todo.org	Aryadev Chavali

2024-04-15	Added some VERBOSE checked messages into asm/main	Aryadev Chavali

2024-04-15	Propagate changes to lerr_t into preprocesser	Aryadev Chavali

2024-04-15	preprocesser function now only returns a pp_err_t	Aryadev Chavali
	We leave the parameter tokens alone, considering it constant, while the parameter vec_out is used to hold the new stream of tokens. This allows the caller to have a before and after view on the token stream and reduces the worry of double frees.
2024-04-15	Changed output schema for printing tokens	Aryadev Chavali

2024-04-15	Fix error where lexer would loop infinitely if unknown character found	Aryadev Chavali

2024-04-15	Changed hex format from x<digits> -> 0x<digits>	Aryadev Chavali

2024-04-15	Lexical error on char literal being too big or too small	Aryadev Chavali
	This is actually an improvement on the older lexer.
2024-04-15	lerr_t is now a struct with constructors	Aryadev Chavali
	Similar principle to pp_err_t in that a structure provides the opportunity for more information about the error such as location.
2024-04-15	constexpr -> const in lexer.cpp	Aryadev Chavali
	Not much of an actual performance change, more semantic meaning for me.
2024-04-15	~extern "C"~ when including lib/inst.h	Aryadev Chavali

2024-04-15	Dependencies stricter and prerequisites for directories in Makefile	Aryadev Chavali
	Don't need to make a directory every time I compile some code.
2024-04-15	Auto filled copyright notice in asm/base	Aryadev Chavali

2024-04-15	main.cpp now preprocesses tokens and prints the output	Aryadev Chavali

2024-04-15	Fix some off by one errors	Aryadev Chavali

2024-04-15	Fix issue with use_blocks not being preprocessed	Aryadev Chavali

2024-04-15	fix memory leak through vec.clear	Aryadev Chavali
	vec.clear() doesn't delete pointers (unless they're smart) so I need to do it myself.
2024-04-15	Implemented preprocesser function.	Aryadev Chavali

2024-04-15	Default constructor for pp_err_t	Aryadev Chavali

2024-04-15	preprocess_* now uses const references to tokens	Aryadev Chavali
	They copy and construct new token vectors and just read the token inputs.
2024-04-15	Updated main.cpp for changes to lexer	Aryadev Chavali

2024-04-15	Implemented preprocess_const_blocks	Aryadev Chavali
	Once again quite similar to preprocess_macro_blocks but shorter, easier to use and easier to read. (76 lines vs 109)
2024-04-15	Implement printing of pp_err_t	Aryadev Chavali
	Another great thing for C++: the ability to tell it how to print structures the way I want. In C it's either: 1) write a function to print the structure out (preferably to a file pointer), or 2) write a function to return a string (allocated on the heap) which represents it. Both are not fun to write, whereas it's much easier to write this.
2024-04-15	Implement constructors for pp_err_t	Aryadev Chavali

2024-04-15	Implement preprocess_use_blocks	Aryadev Chavali
	While being very similar in style to the C version, it takes 27 fewer lines of code to implement due to the niceties of C++ (41 lines vs 68).
2024-04-15	Moved read_file to a general base library	Aryadev Chavali

2024-04-15	Fix some off by one errors in lexer	Aryadev Chavali

2024-04-15	lexer now produces a vector of heap allocated tokens	Aryadev Chavali
	This removes the problem of possibly expensive copies occurring due to working with tokens produced from the lexer (that C++ just... does): now we hold pointers where the copy operator is a lot easier to use. I want expensive stuff to be done by me and for a reason: I want to be holding the shotgun.
2024-04-15	Rewrote preprocesser API	Aryadev Chavali
	This C++ rewrite allows me to rewrite the actual API of the system. In particular, I'm no longer restricting myself to just using enums then figuring out a way to get proper error logging later down the line (through tracking tokens in the buffer internally, for example). Instead I can now design error structures which hold references to the token they occurred on as well as possible lexical errors (if they're a FILE_LEXICAL_ERROR which occurs due to the ~%USE~ macro). This means it's a lot easier to write error logging now at the top level.
2024-04-14	parser -> preprocesser + parser	Aryadev Chavali
	I've decided to split the parsing module into two modules: one for the preprocessing stage, which only deals with tokens, and one for the parsing stage, which generates bytecode.
2024-04-14	enum -> enum class in lexer	Aryadev Chavali
	This makes enum elements scoped, which is actually quite useful as I prefer the namespacing that enums give in C++.
2024-04-14	Added static assert to lexer in case of opcode changes	Aryadev Chavali

2024-04-14	asm/main now tokenises and prints the tokens of a given file	Aryadev Chavali
	With error checking!
2024-04-14	Implemented a function to read a file in full	Aryadev Chavali
	Uses std::optional in case file doesn't exist.
2024-04-14	asm/main now prints usage	Aryadev Chavali

2024-04-14	Implemented cstr functions.	Aryadev Chavali

2024-04-14	Implemented overload for ostream and token as well as constructors for token	Aryadev Chavali

2024-04-14	Implemented tokenise_buffer	Aryadev Chavali
	Note that this is basically the same as the previous version, excluding the fact that it uses C++ idioms more and does a bit better in error checking.
2024-04-14	Implemented tokenise_literal_string	Aryadev Chavali
	One thing I've realised is that even methods such as this require error tracking. I won't implement it in the tokenise method, as it's not related to consuming the string per se; instead it will go in the main method.
2024-04-14	Implemented tokenise_literal_char (tokenise_char_literal)	Aryadev Chavali
	I made the escape sequence parsing occur here instead of leaving it to the main tokenise_buffer function as I think it's better suited here.
2024-04-14	Implemented tokenise_literal_hex	Aryadev Chavali
	Note the overall size of this function in comparison to the C version, as well as its clarity. Of course, it is doing allocations in the background through std::string which requires more profiling if I want to make this super efficient™ but honestly the assembler just needs to work, whereas the runtime needs to be fast.
2024-04-14	Implemented tokenise_literal_number (tokenise_number)	Aryadev Chavali

2024-04-14	Started implementing lexer in lexer.cpp	Aryadev Chavali
	The implementation for tokenise_symbol is already a lot nicer to look at and add to due to string/string_view operator overloading of ==. Furthermore, error handling through pair<> instead of making some custom structure which essentially does the same thing is already making me happy for this rewrite.
2024-04-14	Wrote a new lexer API in C++	Aryadev Chavali
	Essentially a refactor of the C formed lexer into C++ style. I can already see some benefits from doing this, in particular speed of prototyping.
2024-04-14	Added C++ dir locals	Aryadev Chavali

2024-04-14	Created custom functions to convert (h)words to and from bytecode format	Aryadev Chavali
	Instead of using endian.h, which is not portable AND doesn't work with C++, I'll just write my own using a forced union based type punning trick. I've decided to use little endian for the format as well: it seems to be used by most desktop computers, so it should make these functions faster to run on most CPUs.
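
The union-based conversion described in the last entry above might look roughly like the sketch below. It is an illustration, not the repository's code: the type widths (64-bit word, 32-bit hword) and every function name here are assumptions. The idea is to probe the host byte order by punning an integer through a union, then reverse the bytes only on big-endian hosts so the bytecode is always little endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed widths; the log only says "(h)words". */
typedef uint64_t word_t;
typedef uint32_t hword_t;

/* Probe host byte order by punning an integer through a union. */
static int host_is_little_endian(void)
{
  union { uint16_t value; uint8_t bytes[2]; } probe = {.value = 0x0001};
  return probe.bytes[0] == 0x01;
}

/* Reverse n bytes in place. */
static void reverse_bytes(uint8_t *bytes, size_t n)
{
  for (size_t i = 0; i < n / 2; ++i) {
    uint8_t tmp = bytes[i];
    bytes[i] = bytes[n - 1 - i];
    bytes[n - 1 - i] = tmp;
  }
}

/* Hypothetical name: write w into out in little endian order. */
void word_to_bytecode(word_t w, uint8_t *out)
{
  union { word_t value; uint8_t bytes[sizeof(word_t)]; } pun = {.value = w};
  if (!host_is_little_endian())
    reverse_bytes(pun.bytes, sizeof(word_t));
  memcpy(out, pun.bytes, sizeof(word_t));
}

/* Hypothetical name: read a little endian word back into host order. */
word_t bytecode_to_word(const uint8_t *in)
{
  union { word_t value; uint8_t bytes[sizeof(word_t)]; } pun;
  memcpy(pun.bytes, in, sizeof(word_t));
  if (!host_is_little_endian())
    reverse_bytes(pun.bytes, sizeof(word_t));
  return pun.value;
}

/* The half-word versions follow the same pattern. */
void hword_to_bytecode(hword_t h, uint8_t *out)
{
  union { hword_t value; uint8_t bytes[sizeof(hword_t)]; } pun = {.value = h};
  if (!host_is_little_endian())
    reverse_bytes(pun.bytes, sizeof(hword_t));
  memcpy(out, pun.bytes, sizeof(hword_t));
}

hword_t bytecode_to_hword(const uint8_t *in)
{
  union { hword_t value; uint8_t bytes[sizeof(hword_t)]; } pun;
  memcpy(pun.bytes, in, sizeof(hword_t));
  if (!host_is_little_endian())
    reverse_bytes(pun.bytes, sizeof(hword_t));
  return pun.value;
}

/* Small self-check: a value survives a round trip through bytecode. */
void roundtrip_selfcheck(void)
{
  uint8_t buf[sizeof(word_t)];
  word_to_bytecode(0x0102030405060708ULL, buf);
  assert(buf[0] == 0x08); /* least significant byte comes first */
  assert(bytecode_to_word(buf) == 0x0102030405060708ULL);
}
```

On a little endian host the reversal is skipped entirely, which matches the commit's reasoning that the format should be cheap on most desktop CPUs. Reading a union member other than the one last written is well defined in C but not in C++, so a sketch like this would need to live on the C side of the codebase.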