1. Execution Flow
After the interpreter is launched with a LiteScript file, it follows a pipeline that transforms the raw source code into a structure that the interpreter understands and later runs.
1.1 Lexing
For each LiteScript file, this stage is in charge of reading all the relevant contents so that the next stage, the parsing stage, is able to cleanly parse everything.
Most programming languages transform the source into tokens at the lexing stage and then build an Abstract Syntax Tree (AST) for later execution. The LiteScript interpreter takes a different approach: it adopts a fixed-form strategy for defining statements, which simplifies both lexing and parsing. Because of this, all the lexer does is bundle the lines together, excluding comments and any lines that contain no text (only spaces and/or tabs).
1.1.1 Comments
LiteScript only supports single-line comments, which start with a dollar character '$'. Every character from the '$' to the end of the line is part of the comment.
$ This is a comment.
$ This is also a comment. $ This part too.
A comment can be preceded by tabs or spaces, but comments placed on the same line as a statement are not supported.
So these comments are valid:
 $ This comment has a space before it.
	$ This comment has a tab before it.
 	$ This comment has a mix of both before it.
But this, for example, is not:
x = 1 $ Setting x to 1.
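The filtering described in this section can be sketched as follows. This is only an illustration in Python, not the interpreter's actual C++ code; the function name lex and the choice to strip surrounding whitespace from kept lines are assumptions made for the example.

```python
COMMENT_MARKER = "$"  # comment character described in section 1.1.1


def lex(source: str) -> list[str]:
    """Return the lines that should reach the parser (hypothetical helper).

    Blank lines (only spaces and/or tabs) and full-line comments are dropped.
    """
    kept = []
    for line in source.splitlines():
        stripped = line.strip(" \t")
        if not stripped:
            continue  # nothing but spaces and/or tabs
        if stripped.startswith(COMMENT_MARKER):
            continue  # full-line comment, possibly indented
        kept.append(stripped)  # assumption: surrounding whitespace is not kept
    return kept


source = "$ header comment\n\n \t \nx = 1\n\t$ indented comment\n"
print(lex(source))
```

Note that a line such as "x = 1 $ Setting x to 1." is passed through unchanged by this sketch, because only lines that begin with '$' are treated as comments.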
1.2 Parsing
Once the lexing stage has finished, the parsing stage begins. Here the code is transformed into a single object that the runtime can understand and execute.
This is also where any modules and other external LiteScript files that are referenced get bundled together. External source files are copied into the same object, so all the code that needs to run is available at runtime.
Dependencies that themselves include modules or reference other dependencies are included and integrated as well. Transitive and circular dependencies are taken into account, so they do not result in redundant or cyclic inclusions.
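One common way to avoid redundant and cyclic inclusions is to remember which files have already been visited. The sketch below illustrates that idea in Python; it is not the interpreter's actual code. The files mapping, the bundle function, and the choice to place dependencies before the file that references them are all assumptions made for the example.

```python
def bundle(entry: str, files: dict[str, tuple[list[str], list[str]]]) -> list[str]:
    """Flatten `entry` and everything it references into one list of lines.

    `files` is a hypothetical mapping: name -> (names it references, its own lines).
    """
    bundled: list[str] = []
    seen: set[str] = set()

    def include(name: str) -> None:
        if name in seen:
            return  # already included or in progress: skips repeats and cycles
        seen.add(name)
        dependencies, lines = files[name]
        for dependency in dependencies:
            include(dependency)  # assumption: dependencies come first
        bundled.extend(lines)

    include(entry)
    return bundled


files = {
    "main": (["a", "b"], ["main line"]),
    "a": (["b"], ["a line"]),
    "b": (["a"], ["b line"]),  # circular reference back to "a"
}
print(bundle("main", files))  # ['b line', 'a line', 'main line']
```

Each file's lines appear exactly once, even though "b" is referenced twice and "a" and "b" reference each other.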
1.2.1 Statement Parsing
It is worth briefly describing how the interpreter parses statements. As mentioned before, LiteScript adopts a fixed-form philosophy for statements, inherited from assembly. Each assembly instruction has a fixed form with zero or more operands:
OPERATION [OPERAND1] [OPERAND2] ...
So, for a statement that assigns a value, in assembly we might write:
OPERATION SRC_OPERAND DEST_OPERAND
Here, SRC_OPERAND can be a register, an immediate, or a value stored in memory. LiteScript inherits this form and adds the abstraction of variables and mathematical expressions, so for assignment operations the assembly instruction becomes:
VAR OPERATION EXPRESSION
For operations that do not assign a value to a variable, the form is simply:
OPERATION [EXPRESSION]
This implies that expressions are essentially squashed during parsing: whitespace inside an expression is ignored, so an expression can be written either squashed or with spaces. The same does not apply to assigners, which must be separated by whitespace from both the variable and the expression. So writing this is valid:
x = 1+2*(3/4)
x = 1 + 2 * (3 / 4)
But these are not:
x =1+2*(3/4)
x =1 + 2 * (3 / 4)
x=1+2*(3/4)
x=1 + 2 * (3 / 4)
Restricting statements to a fixed form simplifies parsing considerably: the parser only has to split the lines provided by the lexer into the tokens that each fixed-form statement needs. This is what gives LiteScript its distinctive character as a mix of assembly and scripting.
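The splitting step can be sketched in a few lines. This is an illustrative Python sketch, not the interpreter's actual code: the function name parse_statement, the dictionary layout, and the ASSIGNERS set (only "=" appears in this section) are assumptions. It also assumes blank lines have already been removed by the lexer.

```python
ASSIGNERS = {"="}  # assumption: only "=" is shown in the text; others may exist


def parse_statement(line: str) -> dict:
    """Split one non-empty lexed line into fixed-form tokens (hypothetical helper)."""
    tokens = line.split()  # split on any run of spaces and/or tabs
    if len(tokens) >= 3 and tokens[1] in ASSIGNERS:
        # VAR OPERATION EXPRESSION: the rest of the line is squashed together
        return {"var": tokens[0], "operation": tokens[1],
                "expression": "".join(tokens[2:])}
    # OPERATION [EXPRESSION]
    return {"var": None, "operation": tokens[0],
            "expression": "".join(tokens[1:])}


print(parse_statement("x = 1 + 2 * (3 / 4)"))
print(parse_statement("x=1+2*(3/4)"))
```

In this sketch both valid spellings of the assignment yield the same expression, 1+2*(3/4). A line such as x=1+2*(3/4) is not split at all, so the whole text ends up as the operation name, which no interpreter would recognize as an instruction.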
1.3 Runtime
The parser provides an object that essentially contains lines of tokens the interpreter can understand. With it, the runtime begins and, like any other interpreted language, executes statements line by line until the program finishes with a successful exit code.
Note that the previous stages do not handle syntax errors, so any syntax error is only detected at runtime, where it is reported as "Unrecognized instruction".
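A line-by-line execution loop of this kind can be sketched as a lookup from operation name to handler. The Python below is only an illustration, not the interpreter's C++ implementation: the statement tuples, the HANDLERS table, the UnrecognizedInstruction class, and the integer-only assign handler are all assumptions made for the example.

```python
class UnrecognizedInstruction(Exception):
    """Hypothetical exception raised when no handler matches a statement."""


def assign(var, expression, variables):
    # Stand-in for real expression evaluation: only integer literals work here.
    variables[var] = int(expression)


HANDLERS = {"=": assign}  # assumption: operation name -> handler function


def run(statements, variables):
    """Execute (var, operation, expression) statements one by one."""
    for var, operation, expression in statements:
        handler = HANDLERS.get(operation)
        if handler is None:
            # Nothing validated the syntax earlier, so the problem surfaces here.
            raise UnrecognizedInstruction("Unrecognized instruction")
        handler(var, expression, variables)
    return 0  # successful exit code


variables = {}
print(run([("x", "=", "1")], variables), variables)
```

Because the check happens inside the loop, every statement before a malformed line has already run by the time the error is raised.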
1.3.1 Runtime Errors
Several errors can occur at runtime. In each case the interpreter prints an error message to stderr and exits with an exit code of 1. The interpreter is implemented in C++, so an exception message included in an error may contain text that originates from the C++ standard library ("std::").
- "Unknown symbol, use of undefined variable or function.": This error mainly occurs when an expression references a variable that has not been declared yet, or a function that does not exist (or is not defined correctly).
- "Type mismatch, wrong type provided.": This error only occurs in function calls and expressions that expect an operand of a specific type but receive an incompatible one.
- "Unknown/unexpected interpreter exception, ...": This error can occur when an exception that the interpreter does not usually expect is thrown at runtime, normally by one of the imported modules. It always includes an exception message that provides more information.
- Module errors: For function calls into imported modules, an error is shown with the name of the module that reports the error, along with a message indicating its cause.
- Miscellaneous errors: In specific cases, a unique error is printed that does not match any of the errors above, usually in alternate scenarios of specific operations.
- "Fatal interpreter error.": Final fail-safe, indicates that something completely unexpected has happened. This error message should never be shown and likely indicates an error in the implementation.
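The general pattern of printing a message to stderr and exiting with code 1 can be sketched as a single top-level handler. This Python sketch is an assumption about the overall shape, not the interpreter's actual C++ code: InterpreterError and run_program are hypothetical names, and only two tiers (expected errors and unexpected exceptions) are modelled; the final fail-safe is left out.

```python
import sys


class InterpreterError(Exception):
    """Hypothetical base class for errors the interpreter expects."""


def run_program(program, stderr=sys.stderr) -> int:
    """Call `program` and turn any failure into a message plus exit code 1."""
    try:
        program()
        return 0
    except InterpreterError as error:
        # Expected error: the message is printed as-is.
        print(error, file=stderr)
    except Exception as error:
        # Unexpected exception: fixed prefix plus the underlying message.
        print(f"Unknown/unexpected interpreter exception, {error}", file=stderr)
    return 1


def failing_program():
    raise InterpreterError("Type mismatch, wrong type provided.")


exit_code = run_program(failing_program)
print(exit_code)
```

Routing every failure through one place keeps the exit code and output stream consistent regardless of where the error was raised.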