July | 2012 | The Developer's Blog

2012-0728

Branching rules

Comments Off[Author: Jukka Lavonen Category: Game Creation Tools ]

Last time I talked about synthesis and IL generation. Currently, half of the statements already write valid byte code. I have now completed the branching infrastructure that I mentioned in my previous blog post, and most conditional structures and loops are now prepared – including the If, While, and Until structures. The For-Next loop and the Select-Case structure are next. But before going there I decided to wrap up a quick summary about branching rules and what they mean in terms of code compilation.

First of all, there’s a mechanic in place that eliminates unreachable code. For example, if you had an If condition that has a constant expression that always evaluates to false, the compiler knows this during compile time and will not write the byte code for that block. Similarly, if you had a While loop that is known to be always true the compiler doesn’t write the byte code for the expression at all. Code elimination also applies to nested code blocks, meaning that everything inside an unreachable code block is ignored when the byte code output is written. Next, let’s look at the specs:

The If structure

The If and ElseIf statements must know the next ElseIf/Else statement in order to branch when the expression is false
The If, ElseIf and Else statements must know the EndIf statement that is associated with the condition chain; The If statement needs it in order to branch when the expression is false, and the ElseIf and Else statements need it in order to issue an unconditional jump to indicate the end of the previous block
The ElseIf and Else statements must add an unconditional jump to the corresponding EndIf statement before any other byte code unless the statement in question is and ElseIf with a constant value expression of false
If the If/ElseIf expression is a constant value, don’t write expression byte code
If any of the If or ElseIf blocks evaluated to a constant expression of true, don’t write byte code for any of the subsequent blocks in the same chain
If an If block or an ElseIf block has a constant expression of false, don’t write its byte code. This includes nested code blocks

In addition to the rules listed above there are some minor tweaks that accomplish cleaner byte code output; I’ve eliminated some branching instructions that are not needed, for example.

The Repeat-Until structure

The Until and Forever statements must know the corresponding Repeat statement in order to branch to the correct location
If the Until statement’s expression is a constant value of true, do nothing
If the Until statement’s expression is a constant value of false, add an unconditional jump
The Forever statement just adds an unconditional jump

The While-EndWhile structure

The While statement needs to know the corresponding EndWhile statement in order to branch when the expression is false
The EndWhile statement needs to know the corresponding While statement in order to issue an unconditional jump
If the While statement’s expression is a constant value of false, don’t write byte code for the expression or code block
If the While statement’s expression is a constant value of true, don’t write the expression’s byte code
The EndWhile statement only adds an unconditional jump to the While statement

Controlling loops

The old “Exit” statement is now renamed as “Break”. It exits the Repeat, While, For, and Foreach loops, and continues execution from after the loop.

Similarly, a new loop control statement has been added. The “Continue” statement continues the encapsulating loop from its next iteration pass, but doesn’t exit the loop unless the associated condition dictates so.

The implementation of these is quite straight-forward; The Break statement needs to know the loop’s ending statement, and the Continue statement needs to know the loop’s starting statement. Both issue an unconditional jump.

I haven’t yet implemented these, and will retain doing so until I get the For-Next structure done. I might write a similar blog post about the For-Next structure and Select-Case structure next time…

Tags: Compiler

2012-0720

Moving from analysis to synthesis

Comments Off[Author: Jukka Lavonen Category: Game Creation Tools ]

Now that I’m on my summer vacation, I’ve had some time to return to the CoolBasic Classic compiler again (after a month or two of slacking, pardon me), and I’m happy to say that the analysis phase of the compilation process is now basically done. After code analysis the compiler has all the information it needs in order to produce the final byte code output. This phase is called “synthesis”, and it executes by recursively iterating through all statements based on their scope. This approach enables the compiler to omit byte code generation for unreachable code blocks such as If/ElseIf/While/Case blocks that have a constant expression that evaluates to false. Moreover, we could have those user-defined Functions and Subs who aren’t used anywhere in code not to be included to the output IL at all. However, it’s important to process all those code blocks fully even though they wouldn’t get written to IL because we want the compiler to validate the entire source code.

Some of the statements are already written into the output byte code stream such as the assignment statement and a Sub call. Similarly to function calls, the compiler inserts any omitted optional parameters and injects any type conversion operators that might be needed. There are still dozens of statement types to process, but all of them should be quite simple to implement. I decided to tackle the If/ElseIf/Else structure next because it requires some branching infrastructure I need to develop first. Once that’s sorted out it’s easy to implement other program flow control statements such as the While and Until loops.

Branching i.e. jumping is an interesting topic on its own because it’s a common key element and therefore needs to be very efficient. Labels are local to their enclosing Function or Sub symbol, and also the Root (the main program) has its own label “scope”. This means that the programmer can have identically named labels as long as they exist in a different “stack fragment”. The Root/Function/Sub symbols encompass a dictionary of labels, and they’re populated already in the parsing phase so that all labels are known during synthesis. The compiler can therefore match forward jumps without expensive scanning. On the other hand, this technique only applies to GoTo statements – many other statement types need to generate some jump targets.

Statements are processed in the natural program flow order during synthesis. For example, an If statement is processed before the corresponding EndIf statement, but the If statement in question still needs to know where the EndIf statement is in order to jump to the proper location if the expression is false. The compiler needs to link all cognate If/ElseIf/Else/EndIf statements together so that they can access the next member in the chain as well as the final EndIf. For branching to work, the compiler stores the byte code pointer of each statement (labels are considered statements as well even though they don’t generate any IL) when they get iterated by the synthesizer. Of course the If statement can’t know the final address of the EndIf statement upon processing because the EndIf doesn’t yet have the calculated offset, i.e. forward jumps cannot be determined fully at this point.

To solve this problem, all jumps (apart from GoTo statements) are cached in a special list that consist of tuples of an IL jump instruction and the target statement. Just before the IL is physically written to a file the compiler simply iterates this list and fills in the correct jump instruction addresses with the calculated code offsets found in the completely processed statements.

We’re quite close from having a working compiler prototype for internal testing 🙂

Tags: Compiler

Branching rules

The If structure

The Repeat-Until structure

The While-EndWhile structure

Controlling loops

Moving from analysis to synthesis

Recent Posts

Archives

Categories

Meta