A reworked preprocesser with a focus on stopping recursive errors

The preprocesser exposes a single function: preprocess.  It takes
Tokens and gives back Units.

Units form a tree, where each unit is a node in that tree.  A unit
has a "root" token (the value of the node) and an "expansion" (the
children of the node), where the root is some preprocesser token (such
as a reference or USE call) and the expansion is the tokens it yields.
For a USE call these are the tokens of the file it includes; for a
reference they are the tokens of the constant it refers to.  This
means that the leaves of the tree of units are the completely
preprocessed/expanded form of the source code.
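
To illustrate the last point, here is a minimal sketch of reading the
expanded program off the leaves of a unit tree.  It is not code from
this commit: Token is a hypothetical stand-in for Lexer::Token,
collect_leaves is an invented helper, and a unit with an empty
expansion is assumed to be a leaf.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for Lexer::Token; the real type lives in
// src/lexer.hpp and is not reproduced here.
struct Token
{
  std::string text;
};

// Mirrors the shape of Preprocesser::Unit: a root token plus the units
// it expands into.
struct Unit
{
  const Token *root;
  std::vector<Unit> expansion;
};

// Depth-first, left-to-right walk that appends the root token of every
// leaf unit to `out`.  Assumption: an empty expansion marks a leaf.
void collect_leaves(const Unit &unit, std::vector<const Token *> &out)
{
  if (unit.expansion.empty())
  {
    out.push_back(unit.root);
    return;
  }
  for (const Unit &child : unit.expansion)
    collect_leaves(child, out);
}
```

Calling collect_leaves on each top-level unit in order would give the
fully expanded token stream under these assumptions.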

The function has many working components, which may need to be
extracted.  In particular, it uses a hash map to ensure we don't
include a source twice, and it ensures constants are not redefined in
inner include scopes if they're already defined in outer scopes (e.g.
if we compile a.asm, which defines constant N, and it includes b.asm,
which also defines constant N, then N uses the definition from a.asm
rather than b.asm).
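
The two rules above can be sketched as follows.  This is a simplified
illustration, not the code in this commit: ConstMap, define_constant
and mark_included are hypothetical names, constant bodies are plain
strings rather than tokens, and a set stands in for the file map.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical simplification of the constant map: name -> body.
using ConstMap = std::unordered_map<std::string, std::vector<std::string>>;

// Stores a constant unless one with the same name already exists, so a
// definition made in an outer scope is never overwritten by an inner
// one.  Returns true if the definition was stored.
bool define_constant(ConstMap &consts, const std::string &name,
                     std::vector<std::string> body)
{
  // emplace leaves an existing entry untouched and reports the result
  // in the bool of the returned pair.
  return consts.emplace(name, std::move(body)).second;
}

// Returns true the first time a path is seen and false afterwards, so
// a caller can skip sources that were already included.
bool mark_included(std::unordered_set<std::string> &seen,
                   const std::string &path)
{
  return seen.insert(path).second;
}
```

With the example above, defining N while processing a.asm succeeds,
and the later attempt from b.asm is rejected, leaving the a.asm
definition in place.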

I need to make a spec for this.
2024-07-06 17:38:02 +01:00
parent 1145b97c4c
commit a422c7d1dc
3 changed files with 354 additions and 1 deletions

src/preprocesser.hpp Normal file

@@ -0,0 +1,80 @@
/* Copyright (C) 2024 Aryadev Chavali
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
* FOR A PARTICULAR PURPOSE. See the GNU General Public License Version 2 for
* details.
* You may distribute and modify this code under the terms of the GNU General
* Public License Version 2, which you should have received a copy of along with
* this program. If not, please go to <https://www.gnu.org/licenses/>.
* Created: 2024-07-03
* Author: Aryadev Chavali
* Description:
*/
#ifndef PREPROCESSER_HPP
#define PREPROCESSER_HPP
#include <ostream>
#include <string>
#include <unordered_map>
#include <vector>
#include <src/lexer.hpp>
namespace Preprocesser
{
#define PREPROCESSER_MAX_DEPTH 16
  struct Block
  {
    Lexer::Token *root;
    std::vector<Lexer::Token *> body;
  };
  typedef std::unordered_map<std::string, Block> Map;
  struct Unit
  {
    Lexer::Token *const root;
    std::vector<Unit> expansion;
  };
  struct Err
  {
    Lexer::Token *token;
    Err *child_error;
    Lexer::Err lexer_error;
    enum class Type
    {
      EXPECTED_END,
      NO_CONST_AROUND,
      EMPTY_CONST,
      EXPECTED_SYMBOL_FOR_NAME,
      DIRECTIVES_IN_CONST_BODY,
      UNKNOWN_NAME_IN_REFERENCE,
      EXPECTED_FILE_NAME_AS_STRING,
      FILE_NON_EXISTENT,
      IN_FILE_LEXING,
      SELF_RECURSIVE_USE_CALL,
      IN_ERROR,
      EXCEEDED_PREPROCESSER_DEPTH,
    } type;
    Err();
    Err(Err::Type, Lexer::Token *, Err *child = nullptr, Lexer::Err err = {});
    ~Err(void);
  };
  std::string to_string(const Unit &, int depth = 0);
  std::string to_string(const Err::Type &);
  std::string to_string(const Err &);
  std::ostream &operator<<(std::ostream &, const Unit &);
  std::ostream &operator<<(std::ostream &, const Err &);
  Err *preprocess(std::vector<Lexer::Token *> tokens, std::vector<Unit> &units,
                  std::vector<Lexer::Token *> &new_token_bag, Map &const_map,
                  Map &file_map, int depth = 0);
}; // namespace Preprocesser
#endif