New website scheduled for the Alpha release

Let’s face it, the current CoolBasic website is pretty horrible. Not only is it completely outdated, it’s also ugly!

The placeholder CoolBasic website of failure

To be honest, the current website was only ever meant to be a short-term placeholder. Then again, CoolBasic was supposed to be released years ago, too. But not everything in life turns out as planned. The project went through a long lull, and in the meanwhile everything around it moved on. Tech evolved. The world changed. The game industry changed. Game tooling changed. And now we’re in a situation where once-great products such as BlitzBasic and Dark Basic have been unable to change and improve with the world. Newer, more modern tools entered the market and took over (such as Unity, which has gained a lot of traction recently), thanks to their competitive technological advantage. Ironically, this is also true for CoolBasic. So when we come back, we’ll have to offer modern solutions! The story of BlitzBasic has taught me a lot.

Along with the rest of CoolBasic, the website needs a complete overhaul and a better design. However, as development of the new CoolBasic is still ongoing (and not quite ready for a public alpha just yet), I don’t want to waste time on a temporary website facelift. So let’s launch the new site with the actual CoolBasic alpha and go all-out with it!

CoolBasic is an interesting software project because it involves so many different sub-projects: We have some tech-heavy stuff like the compiler and the runtime virtual machine. Then we have some UI coding for the IDE and additional tools. Then there’s the game engine. And there’s also the whole set of services on the web, like the website and the upcoming online documentation system. If I temporarily grow tired of one aspect, I can switch to something else. And recently, I’ve been active with the new website.

Although the site isn’t planned for immediate launch, I thought that now would be a good time to make some preparations and establish the design and foundation for it. How should it look and feel? What kind of pages and structure should it have so that I can deliver the intended message as effectively as possible? What is its purpose? What information should it contain, and which topics are better handled on the forums and merely linked to? And then there are the technical questions to consider: should I embrace HTML5 and CSS3 exclusively, or still retain compatibility with older browsers?

Developer as I am, I decided to go with full HTML5/CSS3 responsive design (mainly because it’s new and fun, but also because I believe that by the time we go fully public, the majority of our target audience will have these things covered).

Responsive design makes the website’s layout adjust to the available viewport dimensions. For desktop use, this is the size of the browser window; for mobile devices, it’s the size of the screen. Responsive design has a couple of flavors. The common approach applies to all devices and treats them evenhandedly: content is laid out so that, from the browser’s point of view, the question is “as the viewport shrinks, where should all this content be rearranged to?”

Then there’s the mobile-first approach, which assumes that the site is mainly viewed on a phone or tablet. Mobile-first content is usually designed to appear in a specific order. From the browser’s point of view it’s more like “as the viewport grows, everything scales up.” For the CoolBasic website, I’m going with the generic model.

With these goals in mind, off I went and came up with the basic content structure and a strategy for how to split the information across the pages. The main purpose of the new website is to introduce the CoolBasic alpha release and make all the necessary information available to those who want to test it. The site is designed to last through alpha and beta, and perhaps even carry on to the final release.

Having worked on it for a few days now, it’s starting to shape up. I’m happy with how the responsive layout re-arranges content blocks as the viewport size changes. I’m happy with the color scheme too, as it gives a more “professional” impression while still keeping that CoolBasic-ish vibe. Coherent design: a single theme color, an accent color, clear typography, image arrangement, symbol icons, CSS3 transitions, and proper HTML5 markup.

It wasn’t easy, though. Not having done much web design for a couple of years, I definitely felt a bit out of touch with how fast web technologies have moved forward. Especially all that responsiveness – it’s not an easy task to implement for pages with multiple columns, images, and more complex widgets. CSS media queries are rather simple as a concept, but in practice they can prove difficult to pull off when you have, say, the site logo and the site menu occupying the same “row.” I’m using Twitter Bootstrap as the basis of the layout and Font Awesome for glyphs.

At the moment, I consider the layout close to final, if not final. I have all the headers in place and know what I’m going to write about in each planned paragraph of the website. Currently I’m using placeholder images, though, and the paragraphs contain only lorem ipsum.

In short: yes, a new website is coming, but not until we publish the alpha. I’m not going to reveal it just yet – maybe a bit closer to the release 😉

Extending CoolBasic with your own DLLs

One of the shortcomings of the old CoolBasic is its limited extensibility. The original design didn’t really support utilizing external code. If you wanted to create a “library”, it had to be written in CoolBasic, and you had to give away the source code. The end user would then include those .cb files at the beginning of their programs.

As CoolBasic is interpreted, the need to execute code at native speed soon became apparent. To relieve this limitation, CallDll was introduced: you’d allocate memory for the in-parameters and the return value. It’s not a very convenient way to support or use external DLLs, and I would have liked to implement proper syntax like Declare Function GetTickCount Lib "kernel32" Alias "GetTickCount" () As Integer. I never did, though.

Something like that may still come to CoolBasic Classic. It’s on the TODO list, and both the compiler and the runtime already have preliminary support for it in place. But before diving deeper into that, there’s an alternative way to integrate external libraries so that you can use them directly in code. You call them just like any other user-defined function, and you don’t have to declare them before use.

Functions owned by the Host

The CoolBasic Classic language itself doesn’t include functions such as LoadImage or PlaySound. They’re provided by the engine that interprets the compiled program. The design philosophy here is that CoolBasic Classic is just a language, and there can be any number of game engines that run CoolBasic code. One engine may offer a completely different set of commands than another (this is also where project types come into play; more about that later).

The engine basically provides a list of the commands available in it. This list is fed both to the code editor (so it can syntax-highlight them) and to the CoolBasic Classic compiler (so it recognizes them). Within the executing engine, CoolBasic functions are called in one way, and functions provided by the hosting engine are called differently (since they’re native code).

A host function defines an ID that is presented to the CBC compiler. Let’s say a function “PlaySound” has an ID of 15. When the compiler emits the CallHost instruction, it also attaches the number 15 to it. At runtime, the virtual machine calls the delegate that was registered with ID 15.

Nice, so that’s how we call native code from a CoolBasic program. But there’s a little design flaw here… What if we wanted to import functions from another DLL and they had conflicting IDs?

Introducing function signatures

A better way to refer to a function is by signature. The signature comprises the function’s name and the number and types of its parameters (but not its return type). Based on the supplied arguments, the compiler determines exactly which function to call (ambiguous function candidates generate a compile-time error). Therefore, it’s sufficient to issue “call the method with this signature” at runtime, because the compiler has already enforced that there will be no ambiguity.

So I spent some time refactoring the CoolBasic Classic compiler and the Cool VES virtual machine to operate with signatures instead of arbitrary function IDs. When the engine loads, it inspects the available host functions and gathers their signatures. The signatures, along with their execute delegates, are stored in a dictionary; if there are multiple functions with the same signature, an error is thrown. This ensures that the call will be directed to exactly the function the compiler determined. When the interpreter calls a host function, it peeks into the signature dictionary and executes the appropriate delegate.
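
Conceptually, the registration step could look something like the sketch below. The type and member names here are mine, not the actual CoolVES internals:

[csharp]// Sketch only: hypothetical names, not the actual CoolVES source.
var hostFunctions = new Dictionary<string, CommandAction>();

foreach (var function in DiscoverHostFunctions(loadedAssemblies))
{
    // A signature is the name plus parameter types, e.g. "Sleep(Int32)";
    // the return type is deliberately not part of it.
    string signature = BuildSignature(function);

    if (hostFunctions.ContainsKey(signature))
        throw new InvalidOperationException("Signature collision: " + signature);

    hostFunctions.Add(signature, function.Delegate);
}

// Later, when the interpreter hits a CallHost instruction:
hostFunctions[instruction.Signature](stack, ref stackIndex);[/csharp]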

This approach makes it possible to have multiple sources of host functions (as long as the provided functions’ signatures don’t collide). The primary source is the game engine itself, but we can also dynamically inspect other DLLs in the same directory for valid host functions and import those! Think of it like this: the linking process is done at runtime (rather than after compilation), when the CoolBasic program is loaded into the engine’s memory.

All you have to do is drop those DLLs in the same directory as the executing engine. They’ll run at native speed when executed by the CoolBasic program.

Implementing host functions

It’s reasonably easy to implement a host function that’s callable from a CoolBasic program. All you need to do is create a managed DLL in any CLR-compliant language, such as C# or VB.NET, containing methods that satisfy this delegate:

[csharp]delegate void CommandAction(StackEntry[] stack, ref int stackIndex);[/csharp]

Let’s create a DLL that introduces a Sleep command you can then use in your CoolBasic programs. It takes one integer parameter: the number of milliseconds the program will wait before continuing execution.

I’ll create a new Class Library project in Visual Studio, name it “MyCoolExtension”, and add a single class “MyCommands” to it. I then add a reference to CoolVES.dll and write this code:

[csharp]namespace MyCoolExtension
{
    using System.Threading;

    using CoolVES;
    using CoolVES.Integration;

    public class MyCommands
    {
        /// <summary>
        /// Halts the program until the specified amount of milliseconds have elapsed.
        /// </summary>
        /// <param name="stack">The stack.</param>
        /// <param name="stackIndex">Index of the stack.</param>
        /// <remarks>Sub Sleep(time As Integer)</remarks>
        [Command(Id = 1, Name = "Sleep", ReturnType = DataType.Void)]
        [CommandParameter(Index = 0, Name = "time", DataType = DataType.Int32)]
        public static void Sleep(StackEntry[] stack, ref int stackIndex)
        {
            var time = stack[stackIndex--].AsInteger;

            Thread.Sleep(time);
        }
    }
}[/csharp]

Build this and drop the resulting DLL into the same directory as the final game.

Note that you’ll have to handle stack manipulation manually; this is a speed optimization. It’s nothing too complicated, though: you simply access the stack array and decrease the pointer once for each parameter your function takes. If you specify something other than Void as the return value type, you’ll also have to assign a value onto the stack at the end.
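
For example, a command that returns a value might look roughly like this. This is a sketch only: the exact way a StackEntry is written back may differ in the real CoolVES, and the “Add” command is made up for illustration:

[csharp]// Sketch only: a hypothetical "Add" command returning Int32.
// The exact StackEntry write API is an assumption on my part.
[Command(Id = 2, Name = "Add", ReturnType = DataType.Int32)]
[CommandParameter(Index = 0, Name = "a", DataType = DataType.Int32)]
[CommandParameter(Index = 1, Name = "b", DataType = DataType.Int32)]
public static void Add(StackEntry[] stack, ref int stackIndex)
{
    // Pop the two arguments (one stack slot per parameter)...
    var b = stack[stackIndex--].AsInteger;
    var a = stack[stackIndex--].AsInteger;

    // ...and push the return value back on top of the stack.
    stack[++stackIndex] = new StackEntry(a + b);
}[/csharp]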

There’s one additional step needed to make the compiler support this new command. There will be a tool with which you can generate a “definition file” out of this DLL. The file contains metadata that describes the functions and constants within your library. You only have to do this once. This file is then provided to the CoolBasic Classic compiler on the command line.
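
Since the commands are described by attributes, such a tool could in principle be little more than a reflection sweep. A sketch, assuming the attribute classes from the sample above are named CommandAttribute and CommandParameterAttribute, and with a made-up output format and file extension:

[csharp]// Sketch only: the attribute class names and the output format are assumptions.
using System.IO;
using System.Linq;
using System.Reflection;

var assembly = Assembly.LoadFrom("MyCoolExtension.dll");

using (var writer = new StreamWriter("MyCoolExtension.def"))
{
    foreach (var method in assembly.GetTypes()
        .SelectMany(t => t.GetMethods(BindingFlags.Public | BindingFlags.Static)))
    {
        var command = method.GetCustomAttribute<CommandAttribute>();
        if (command == null)
            continue;

        var parameters = method.GetCustomAttributes<CommandParameterAttribute>()
                               .OrderBy(p => p.Index)
                               .Select(p => p.Name + " As " + p.DataType);

        // One metadata line per host function, e.g. "Sleep(time As Int32) As Void"
        writer.WriteLine(command.Name + "(" + string.Join(", ", parameters) +
                         ") As " + command.ReturnType);
    }
}[/csharp]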

Definition file handling will be done automatically behind the scenes by the code editor. Also note that you only need the file at development time; the metadata doesn’t need to (and shouldn’t) be present in the final game’s output directory.

As a library developer, when you want to distribute your work, you simply provide the compiled DLL along with the generated metadata file. To use the library, the end user references the metadata file in their code editor. This makes all the new functions and constants available for syntax highlighting and compilation.

If the metadata can be generated programmatically from a compiled DLL, why do we need a separate metadata file? Couldn’t we just inspect the referenced DLLs for this information on the fly when compiling? Well, no… that would create a strong dependency between the CoolBasic Classic compiler and the CoolVES virtual machine. The idea is to decouple CoolBasic Classic as a language from any game engine (or execution engine, rather). The metadata format has been designed to be quite flexible and can describe much more complicated type information than what CoolVES currently provides. In other words, the metadata can span current and future technologies.

All in all, I think this design addresses the requirements in a neat way:

  1. You can achieve machine code speed
  2. It’s easy to create these DLLs
  3. You can harness the full power of the .NET and Mono frameworks
  4. It’s easy to consume these DLLs. They integrate into the editor, compiler, and runtime automatically, with no need to declare the functions in code before use

It doesn’t allow you to call unmanaged DLLs without creating a managed wrapper, though. Perhaps that’s better handled with a Declare Function statement…

VERY TEMPORARY

Today I’m going to talk about some old stuff. Don’t worry, though: most of you haven’t seen it yet. A year ago, at our traditional summer meeting, I demoed some very early and experimental CoolBasic builds. The reason I want to show code and screens this old is that it makes it easier to explain, in upcoming blog posts, how things have changed since.

Back in 2013 I actually had a working environment that consisted of a code editor, the CoolBasic compiler, and a debugging runtime. You could write CoolBasic code, pass it to the compiler, and finally execute it. It couldn’t render game graphics or play sounds, but very basic text input and output was in place. At that time I focused on code execution rather than game libraries: control structures, strings, arrays, types, operators, all that kind of stuff. As a result, the demo was probably very boring to watch, but hey, at least it was executing something!

The Compiler

The compiler was naturally the number one priority to get done. As a refresher: it’s written in C#, runs on the .NET CLR and Mono, and is a standalone console application, so it can be called from anywhere. It wasn’t feature-complete back then (for example, Select…Case was missing), but it could handle most control structures and generate the final bytecode.

Compilers aren’t very exciting to look at, though. All you need to know is that it’s now faster, more feature-rich, and no longer has a function limit.

A VERY TEMPORARY Editor

The VERY TEMPORARY editor

Before you go to the forums and start complaining about how awful it looks, mind the window title. Guess why it’s called “VERY TEMPORARY EDITOR”. The caps are intentional. This will *not* be the editor that ships – don’t worry. I promise.

In short, I just wanted to test a) how easy it is to integrate the compiler into an IDE, and b) whether I could use AvalonEdit as the editing control instead of the commercial Actipro SyntaxEditor. I’m already quite familiar with the Actipro component (I had a chance to use it in a work project) and I know it’s the state-of-the-art option, but perhaps it would be a little overkill for my purposes.

As it turns out, AvalonEdit is just perfect, at least for starters; I can always upgrade to Actipro later. AvalonEdit offers configurable syntax highlighting out of the box, supports code completion popups, and is generally fairly extensible. The syntax highlighting definition is loaded from an external XML file, and the list of commands provided by the game engine is imported from a special “framework definition file” that I can generate automatically from a compiled executable or DLL (via reflection).
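
Loading a custom highlighting definition into AvalonEdit boils down to a few lines. HighlightingLoader is AvalonEdit’s real API; the .xshd file name and the textEditor control instance are placeholders here:

[csharp]using System.Xml;
using ICSharpCode.AvalonEdit.Highlighting;
using ICSharpCode.AvalonEdit.Highlighting.Xshd;

// Load an external .xshd highlighting definition into an AvalonEdit TextEditor.
// "CoolBasic.xshd" is a placeholder name for the definition file.
using (var reader = XmlReader.Create("CoolBasic.xshd"))
{
    textEditor.SyntaxHighlighting =
        HighlightingLoader.Load(reader, HighlightingManager.Instance);
}[/csharp]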

It was pretty easy to invoke the compiler, write its output to a textbox, and parse out any errors it reported. All in all, a successful little test editor.

A VERY TEMPORARY Runtime

The VERY TEMPORARY runtime

That’s not a real game engine. It’s actually just a normal WPF application powered by the new Cool VES virtual machine. In fact, the only available commands are Print, Input, and Timer (for benchmarking). The intention was to establish a simple “console” providing basic input and output so that I could verify that the virtual machine doesn’t corrupt the virtual stack or leak memory at any point.

For this reason there are some debugging features available. At any time, I can click this cute pause button:

The pause button

This will halt the virtual machine that’s executing the code. While paused, this UI becomes available:

Metadata: symbols

Debug info: metadata

This view lists all functions and variables and their types. Symbol information is needed for a number of reasons. Firstly, the debugger can emit more meaningful call stacks when the functions’ names are known. Secondly, the runtime can perform proper clean-up when returning from a function, as it knows which resources are stored on the heap.

Bytecode

Debug info: disassembled program

This listing represents the current program in its “disassembled” form. Here I can see that the program was decoded properly and matches what the compiler spat out.

Managed resources

Debug info: managed resources

Remember how LoadImage would return a handle that you’d store in a variable for later use? These handles are called “resources” internally in Cool VES. For more efficient memory management, Cool VES keeps a list of what has been loaded. A handle is not a real (unmanaged) memory pointer, but a reference to an internal object that also contains metadata about the loaded resource.

Interestingly, strings are managed resources too, and they likewise have handles that are manipulated every time a string is stored on or consumed off the stack.
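
Conceptually, such a resource table can be as simple as a handle-keyed dictionary. The types below are hypothetical – a sketch of the idea, not the CoolVES source:

[csharp]using System.Collections.Generic;

// Sketch only: a conceptual handle -> resource table with metadata.
public sealed class Resource
{
    public object Native;    // the actual image, sound, or string data
    public string TypeName;  // metadata describing what the handle refers to
}

public sealed class ResourceTable
{
    private readonly Dictionary<int, Resource> resources = new Dictionary<int, Resource>();
    private int nextHandle = 1;

    public int Register(object native, string typeName)
    {
        resources[nextHandle] = new Resource { Native = native, TypeName = typeName };
        return nextHandle++;  // this is the "handle" the CoolBasic program sees
    }

    public Resource Resolve(int handle) { return resources[handle]; }

    public void Release(int handle) { resources.Remove(handle); }
}[/csharp]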

Call stack and locals

Debug info: virtual stack

If I want to see the low-level state of the executing program, this is the view I’m interested in. I can inspect the values of each variable, for each function within the call stack. This information would, of course, be presented in a more intuitive way in a real debugger.

So that’s how things were a year ago. Nowadays the compiler is pretty much feature-complete, and the real code editor is in the works. I also did some engine experiments based on the DirectX 10 interface (initially on DX11, but for whatever reason DirectWrite, which I use for text rendering, isn’t easily usable with it). More on these topics in upcoming posts.

Branching rules

Last time I talked about synthesis and IL generation. Currently, half of the statements already write valid byte code. I have now completed the branching infrastructure that I mentioned in my previous blog post, and most conditional structures and loops are now in place – including the If, While, and Until structures. The For-Next loop and the Select-Case structure are next. But before going there, I decided to write a quick summary of the branching rules and what they mean in terms of code compilation.

First of all, there’s a mechanism in place that eliminates unreachable code. For example, if an If condition is a constant expression that always evaluates to false, the compiler knows this at compile time and will not write the byte code for that block. Similarly, if a While loop’s condition is known to be always true, the compiler doesn’t write the byte code for the expression at all. Code elimination also applies to nested code blocks, meaning that everything inside an unreachable code block is skipped when the byte code output is written. Next, let’s look at the specs:

The If structure

  1. The If and ElseIf statements must know the next ElseIf/Else statement in order to branch when the expression is false
  2. The If, ElseIf and Else statements must know the EndIf statement that is associated with the condition chain; The If statement needs it in order to branch when the expression is false, and the ElseIf and Else statements need it in order to issue an unconditional jump to indicate the end of the previous block
  3. The ElseIf and Else statements must add an unconditional jump to the corresponding EndIf statement before any other byte code, unless the statement in question is an ElseIf with a constant value expression of false
  4. If the If/ElseIf expression is a constant value, don’t write expression byte code
  5. If any of the If or ElseIf blocks evaluated to a constant expression of true, don’t write byte code for any of the subsequent blocks in the same chain
  6. If an If block or an ElseIf block has a constant expression of false, don’t write its byte code. This includes nested code blocks

In addition to the rules listed above, there are some minor tweaks that produce cleaner byte code output; I’ve eliminated some branch instructions that aren’t needed, for example. A sketch of how these rules could translate into an emitter follows below.
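
In pseudo-implementation form, the If chain processing might look like this. The emitter API and statement types are hypothetical, not the actual CBC compiler code:

[csharp]// Sketch only: hypothetical emitter API illustrating the rules above.
private void EmitIfChain(IfStatement first)
{
    for (var statement = first; statement != null; statement = statement.NextInChain)
    {
        if (statement.HasCondition)               // If/ElseIf; a plain Else has none
        {
            EmitExpression(statement.Condition);  // rule 4: skipped for constant values

            // Rules 1 and 2: branch to the next ElseIf/Else when false,
            // or straight to EndIf if this is the last block in the chain.
            Statement target = statement.NextInChain != null
                ? (Statement)statement.NextInChain
                : statement.EndIf;
            EmitBranchIfFalse(target);
        }

        EmitBlock(statement.Body);                // rules 5 and 6 may skip this entirely

        if (statement.NextInChain != null)
            EmitJump(statement.EndIf);            // rule 3: a finished block skips the rest
    }
}[/csharp]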

The Repeat-Until structure

  1. The Until and Forever statements must know the corresponding Repeat statement in order to branch to the correct location
  2. If the Until statement’s expression is a constant value of true, do nothing
  3. If the Until statement’s expression is a constant value of false, add an unconditional jump
  4. The Forever statement just adds an unconditional jump

The While-EndWhile structure

  1. The While statement needs to know the corresponding EndWhile statement in order to branch when the expression is false
  2. The EndWhile statement needs to know the corresponding While statement in order to issue an unconditional jump
  3. If the While statement’s expression is a constant value of false, don’t write byte code for the expression or code block
  4. If the While statement’s expression is a constant value of true, don’t write the expression’s byte code
  5. The EndWhile statement only adds an unconditional jump to the While statement

Controlling loops

The old “Exit” statement has been renamed “Break”. It exits the Repeat, While, For, and Foreach loops, and continues execution from after the loop.

Similarly, a new loop control statement has been added. The “Continue” statement jumps to the enclosing loop’s next iteration pass; it doesn’t exit the loop unless the loop’s condition then dictates so.

The implementation of these is quite straightforward: the Break statement needs to know the loop’s ending statement, and the Continue statement needs to know the loop’s starting statement. Both issue an unconditional jump.

I haven’t implemented these yet, and will hold off on doing so until I get the For-Next structure done. I might write a similar blog post about the For-Next and Select-Case structures next time…

Moving from analysis to synthesis

Now that I’m on my summer vacation, I’ve had some time to return to the CoolBasic Classic compiler (after a month or two of slacking, pardon me), and I’m happy to say that the analysis phase of the compilation process is now basically done. After code analysis, the compiler has all the information it needs to produce the final byte code output. This phase is called “synthesis”, and it executes by recursively iterating through all statements based on their scope. This approach enables the compiler to omit byte code generation for unreachable code blocks, such as If/ElseIf/While/Case blocks whose constant expression evaluates to false. Moreover, user-defined Functions and Subs that aren’t used anywhere in the code could be left out of the output IL entirely. However, it’s important to process all those code blocks fully even if they don’t get written to IL, because we want the compiler to validate the entire source code.

Some of the statements are already written to the output byte code stream, such as the assignment statement and the Sub call. As with function calls, the compiler inserts any omitted optional parameters and injects any type conversion operators that might be needed. There are still dozens of statement types to process, but all of them should be quite simple to implement. I decided to tackle the If/ElseIf/Else structure next because it requires some branching infrastructure I need to develop first. Once that’s sorted out, it will be easy to implement other program flow control statements such as the While and Until loops.

Branching, i.e. jumping, is an interesting topic on its own because it’s a common key element and therefore needs to be very efficient. Labels are local to their enclosing Function or Sub symbol, and the Root (the main program) has its own label “scope” as well. This means that the programmer can have identically named labels as long as they exist in different “stack fragments”. The Root/Function/Sub symbols each encompass a dictionary of labels, populated already in the parsing phase so that all labels are known during synthesis. The compiler can therefore match forward jumps without expensive scanning. On the other hand, this technique only applies to GoTo statements – many other statement types need to generate their own jump targets.

Statements are processed in natural program flow order during synthesis. For example, an If statement is processed before the corresponding EndIf statement, but the If statement still needs to know where the EndIf statement is in order to jump to the proper location when the expression is false. The compiler needs to link all cognate If/ElseIf/Else/EndIf statements together so that each can access the next member in the chain as well as the final EndIf. For branching to work, the compiler stores the byte code pointer of each statement (labels are considered statements as well, even though they don’t generate any IL) as they get iterated by the synthesizer. Of course, the If statement can’t know the final address of the EndIf statement when it’s processed, because the EndIf doesn’t yet have a calculated offset; in other words, forward jumps cannot be fully resolved at this point.

To solve this problem, all jumps (apart from GoTo statements) are cached in a special list consisting of tuples of an IL jump instruction and its target statement. Just before the IL is physically written to a file, the compiler simply iterates this list and fills in the correct jump instruction addresses using the calculated code offsets found in the now fully processed statements.
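
In sketch form, the fix-up pass could look like this (hypothetical types; the actual compiler classes may differ):

[csharp]// Sketch only: hypothetical types illustrating the jump fix-up list.
var pendingJumps = new List<Tuple<JumpInstruction, Statement>>();

// During synthesis, e.g. while processing an If statement, the EndIf's final
// offset isn't known yet, so the jump is recorded with a placeholder address:
var jump = EmitBranchIfFalse(targetAddress: -1);
pendingJumps.Add(Tuple.Create(jump, endIfStatement));

// Just before the IL is written to file, every statement's byte code offset
// has been recorded, so the placeholders can be filled in:
foreach (var pending in pendingJumps)
    pending.Item1.TargetAddress = pending.Item2.ByteCodeOffset;[/csharp]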

We’re quite close to having a working compiler prototype for internal testing 🙂

New web server

Just a quick update. As you can see, the blog has a new look. This is part of a larger reconfiguration of our web services. There’s now a new domain, coolbasic.fi, which currently redirects to coolbasic.com. We now have a virtual web server instead of standard shared hosting, meaning that we can execute whatever we want on it. We recently migrated the forums to this new server, and everyone’s DNS should be up to date by now. There’s still some setting up to do, such as a custom user account database and a complete CMS solution for the future website; the plan is that everything under coolbasic.fi will feature Finnish content only, and coolbasic.com will host the international site. There will also be separate discussion forums based on localization.

As a bonus note, 11 of us from the Finnish CoolBasic community just got back from a summer cabin holiday. Back there, they managed to forge some motivation in me: an interesting community-driven project exists that basically implements a custom runtime and game engine for the old CoolBasic. It’s called “CB enchanted” (a very cool name, I might add) and is a reverse-engineered CoolBasic byte code interpreter packed together with a fully hardware-accelerated graphics system. Quite impressive! The real WTF is that these people managed to reverse-engineer the entire byte code structure, inject the engine into the standard CoolBasic compilation process, and deliver overall better performance than the good old CB – and all this before CoolBasic Classic 😀

Yeah, I opened the compiler solution a few times and thought: “It’s time to finish this.”

We’ve got some real talent.

Compiler news

1,900 word wall-of-text incoming…

I returned to the CoolBasic project after three months of learning XNA 3D game programming, and I’ve been continuing to develop the CoolBasic Classic compiler for the last two weeks. Although the main focus has been on the compiler, I also designed a new look & feel for the web portal, and it’s being reviewed by the rest of the team. This entry, however, will be all about the compiler and what’s been done recently.

Can it produce byte code yet? Yes, to some extent. The underlying mechanics are in place, but nothing is written to a file yet. Basically, it can process expressions. This includes (but is not limited to) calculating the result value of a constant expression, resolving names (i.e. mapping identifier names to their declared symbols), function overload resolution, automatic type conversions, and filling in the missing arguments of optional function parameters. A lot of thought has been put into optimizing performance here. In fact, all of the above is done in a single iteration over the postfix presentation of the expression. I had to inspect how the .NET framework internally works with arrays, stacks, lists, and type conversion to make sure my algorithms were efficient enough. I actually ended up writing a few implementations of my own for the highly specialized cases I needed (where the framework equivalent would, for example, allocate memory inefficiently or do more work than needed).

I’m about to go technical…

Expression pre-evaluation
Where I left off before my XNA experimentations was constant expression pre-evaluation. This is where the actual expression processor needed to be implemented: you first convert the expression into postfix notation and then evaluate it. Naturally, constant expressions can contain constant symbols as well, so a circular reference safety check exists. Identifier names are resolved during evaluation because we need to know the data type of each value in order to determine the final data type – and, in the case of a constant expression, the ultimate value.
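
The core of such an evaluator is a simple stack loop over the postfix token stream. In sketch form, with hypothetical token and value types rather than the actual compiler code:

[csharp]// Sketch only: constant folding over a postfix (RPN) token stream.
var calculation = new Stack<Value>();

foreach (var token in postfixTokens)
{
    if (token.IsLiteral)
    {
        calculation.Push(Value.FromLiteral(token));
    }
    else if (token.IsBinaryOperator)
    {
        var right = calculation.Pop();
        var left = calculation.Pop();

        // If both operands are compile-time constants, fold the operation now;
        // otherwise just track the resulting data type for later validation.
        calculation.Push(left.IsConstant && right.IsConstant
            ? EvaluateConstant(token.Operator, left, right)
            : Value.NonConstant(ResolveResultType(token.Operator, left, right)));
    }
}

// A constant expression leaves a single constant value on the stack.
var result = calculation.Pop();[/csharp]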

The expression processor is a single method that any statement or symbol processor can call. Therefore, we don’t know in advance whether the expression is a constant expression or not (i.e. whether its value can be fully pre-calculated). When a symbol is encountered for the first time, it gets “processed” (as in: a constant symbol’s value needs to be known before it can be used in other expressions). Since the expression processor can be called by a function symbol’s own processor, in order to pre-evaluate optional parameters’ values, a circular reference safety check must exist for functions as well.

The constant symbol and parameter symbol processors simply check whether the result value is a constant and raise an error if it isn’t. In addition, before the pre-evaluated value is assigned to a constant symbol or parameter symbol, it’s converted to the destination data type. I had to write my own optimized routines for this because the .NET framework is clearly a bit too slow with anything<-->string conversions. And I learned how much of a pain it can be to convert a double value to a string (just have a look at dtoa.c to get the idea) – I ended up implementing a much simpler algorithm for that conversion.

Even though constant value expressions are fully pre-calculated, I plan to eventually add an expression simplifier for non-constant expressions as well. It would basically turn expressions like a+b+2*20-c into a+b-c+40. However, this is a very difficult topic that needs much research in terms of grouping and ordering analysis, and I simply don’t want to be held back by it (even though I already have a partial implementation of it in the V3 compiler). I don’t see it as that important at the moment, so in order to get this thing out someday, I’ll leave it for now.

Name resolution
The expression is processed once, token by token. Each encountered identifier is resolved by its name; this process is called name resolution. I have three specialized resolvers: the type resolver, the symbol resolver, and the overload resolver. The type resolver is called by a symbol processor (each symbol is validated by its own processor). A symbol processor exists for every symbol type. Most symbol processors resolve the associated data type, but some do additional work, such as the constant symbol processor, which calls the expression processor in order to cache the pre-calculated value. The type resolver is the simplest one: since all type symbols must be declared in the root scope, we can direct the search there from the start and only consider symbols classified as types.

The identifier resolver is a tad more complex. It is told the context symbol and/or scope, and whether the search should be locked to that context only or extended to upper levels if the name is not found immediately (one example of a locked context is the dot notation path “a.b.c”). An unlocked search is recursive. In CoolBasic Classic, the main program exists in the root scope, but it is not the root symbol; the Global symbol is the ultimate root node of the symbol tree, and all imported framework/engine functions such as LoadImage or MoveObject live there. This means that you can override them by defining your own functions with the same names. The reason we tell the resolver both the context symbol and the scope is that, for example, functions’ parameters have no scope during optional expression evaluation; in order to access constant symbols defined in the main program’s context, the search needs to go one symbol level up instead of one scope level. Furthermore, functions’ local scopes are isolated from the main program (for obvious reasons), so local identifier name resolution – for a variable or a constant – cannot access the main program’s base scope’s local variables and constants.

Overload resolution is an interesting one. First of all, name resolution has already succeeded on a function group symbol. Yes, we now have an extra container for all functions of the same name: it stores all overloads of that function. All we need to do is pick the right overload and map the identifier to its symbol. So we pass a snapshot of the calculation stack to the overload resolver, and based on the data types of those values, the most appropriate overload is chosen. This means that you could have two functions of the same name, say Abs (as in “absolute value”): one that takes and returns an integer, and one that takes and returns a float. The compiler will then pick the right one based on the context in the expression, avoiding unnecessary casting and loss of precision from choosing the wrong one. Should there be more than one equally qualified overload candidate, an error is generated. Should there be no qualified overloads at all, an error is generated as well.
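
The selection itself can be thought of as a lowest-conversion-cost search. A sketch with hypothetical types and helpers – the real resolver is more involved:

[csharp]// Sketch only: choose the overload needing the fewest implicit conversions.
private FunctionSymbol ResolveOverload(FunctionGroupSymbol group, DataType[] argumentTypes)
{
    FunctionSymbol best = null;
    int bestCost = int.MaxValue;
    bool ambiguous = false;

    foreach (var overload in group.Overloads)
    {
        // Hypothetical helper: negative = not callable with these arguments,
        // 0 = exact match, otherwise the number of implicit conversions needed.
        int cost = ConversionCost(overload, argumentTypes);
        if (cost < 0)
            continue;

        if (cost < bestCost) { best = overload; bestCost = cost; ambiguous = false; }
        else if (cost == bestCost) { ambiguous = true; }
    }

    if (best == null)
        throw new CompileException("No qualified overload candidates.");
    if (ambiguous)
        throw new CompileException("More than one equally qualified overload candidate.");

    return best;
}[/csharp]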

Completing expressions
Now, what’s really new here is the injection system. It allows adding tokens in the middle of the token list without causing “memory move” operations like List.Insert() does (and no, we’re not operating with a linked list here – it’s not efficient enough in .NET, and I want to avoid the memory overhead it generates). Omitted arguments for function parameters classified as “optional” are a good example of injected tokens. But the feature also has one other very important use, for we’re also injecting type conversion tokens. This works with the intrinsic data types (integer, float, and string). Whenever a value of the wrong type is provided, CoolBasic Classic will try to convert the value to the correct destination type. For example, if you provide array indices as floats, they’ll be automatically converted to integers. Similarly, the function overload resolver reports back which values need to be converted, and to which intrinsic data type.

We’ll probably add explicit type conversion operators to the language at some point as well; we’re just not sure about their naming yet. Just don’t be disappointed if you don’t see them in the first alpha or beta.

Another cool feature I’ve mentioned before is the presence of short-circuiting And and Or operators. They are now fully implemented. They’re different from other operators in that, while pre-calculation occurs the same way as for any binary operator in a postfix calculation stack, they actually produce byte code with conditional jumps instead of a single operator instruction. The hardest part was inferring the offset of the jump, because type conversion instructions and loads of omitted optional parameters get added to the instruction list all the time (so the offset cannot be determined during the conversion from infix to postfix). However, I came up with a clever mechanic that allows reliable offset calculation without giving up the idea of a single processing pass.
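
For “a And b”, the emitted shape is roughly the following. This is a sketch with a hypothetical emitter API, not the compiler’s actual code:

[csharp]// Sketch only: the rough branch shape emitted for "a And b".
EmitExpression(a);                    // leaves a's truth value on the stack
var toFalse = EmitBranchIfFalse();    // a was false: don't evaluate b at all
EmitExpression(b);                    // may grow by injected conversion and
                                      // optional-parameter instructions, which
                                      // is why the jump offsets are resolved late
var toEnd = EmitJump();
toFalse.Target = MarkLabel();
EmitLoadConstant(0);                  // the whole expression is false
toEnd.Target = MarkLabel();[/csharp]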

Byte code generation
I mentioned at the start of this blog post that I’m already able to produce byte code that calculates an expression’s value in a real Cool VES environment. A lot of small parts had to be implemented before this goal was reached, but I think it’s now working quite well. The expression processor, in addition to calculating the result value, generates the full instruction set, including conversion instructions, short-circuit magic, and injected parameters.

One particularly tricky part was implementing dot notation path processing. While simple paths such as “a.b.c” are quite easy to pull off (just lock the name resolution context to the data type of the previous member’s value), it gets a little more complicated when assignment and arrays come into play. I hate “exceptions to processing rules”, so I had to come up with a unified model that supports normal values, dot fields, dot array fields, and normal arrays alike. Array variables are always pointers to their actual buffers, but the value indicated by the index (or indices) on top of the stack must be read by another instruction. And since the context must be locked for name resolution to succeed, the array field loader instruction, unlike with functions, cannot come after the argument values (and you cannot access elements that aren’t on top of the stack – I could have worked around that by creating my own custom stack structure, but it would still be fundamentally the wrong thing to do). So an array access needs two instructions: one before and one after the arguments. Just like how the C# and VB.NET compilers do it. It works now. Actually, the byte code generated by the CoolBasic Classic compiler is VERY similar to the Microsoft CIL (Common Intermediate Language) generated by .NET compilers. For example, there’s no single instruction for the operator “<=”: it’s expressed with the combination of cgt, ldc.i4 0, and ceq instead – i.e. “not greater than”.

A few operators still lack pre-evaluation support, but those should be painless to write. One of them is the assignment operator (which, again, is a bit of a “special” case), but perhaps I’ll get to that in the next blog post. Now that this huge central piece is mostly done, the next big goal is to start iterating over the actual statements: If, While, For, and so on – pretty much branching in general.

TL;DR
Expressions are now processed and refined into real byte code that can execute on the Cool VES platform. The next phase is to process all statements and, with the help of the expression processor, create the final byte code output we can execute!

Compiler part 1/2 is done

Another big part of the CoolBasic Classic Compiler is now complete. The CB Classic compiler is a standard 2-pass compiler, meaning that it makes two major sweeps over the code in its transformation from textual representation into byte code (the binary form is then consumed by the Cool VES runtime and game engine). The first pass has but one purpose: it parses all code lines and creates lists of symbols, like functions and variables. It also collects the tokens that form expressions. Expressions are used in statements such as If, While, and assignment. We chose the 2-pass approach so that, for example, functions don’t need to be declared “above” the statements that use them. The first pass is thus essentially a gathering phase, and it is now complete. The current status is illustrated in the image below:

CoolBasic Classic Compiler - status 3/2011

Green: done
Yellow: in progress

Personally speaking, writing parsers for 40 different kinds of statements was somewhat repetitive and mechanical work (yes, we use specialized parsers instead of a state table and stack), but the boring part is now done. I’d say we’re definitely getting somewhere. There are already 20,431 lines of C# code in 270 files, so the compiler project alone is pretty huge.

CB Comic #4

CB Comic #4

Introducing KilledWhale (the smartass).

Experimental file diff

Before we begin, I’d like to say that the feature demonstrated herein will NOT be part of the first release of Cool Developer and CoolBasic Classic; I’m merely playing with a cool idea.

I’ve been interested in versioning and source control technologies in general lately. One part of version control is the ability to handle conflicts when two or more developers check out a file for editing, make their changes, and then check them back in. Two simultaneous check-outs will often result in a conflict within the source file, and these conflicts have to be resolved through merging. Merging means that both developers’ modifications and fixes are applied to the source file without either one’s changes getting lost in the process. Merging is one thing, but analyzing the differences is another. I decided to try out some difference-tracking techniques.

First of all, there are a number of algorithms available. Some are easier to understand than others, and some support more features (such as the ability to detect moved lines and code blocks). The most popular method I found is based on the longest common subsequence problem, and it is, in fact, used in most version control software. It’s a good method for producing a script that can “patch” the destination file with tiny changes so that it transforms into the source file. Basically you get a set of “add this to place X” and “remove this from place Y” operations. Unfortunately, while this technique indeed gets the job done nicely, visualizing these kinds of changes can be confusing. When a conflict occurs, for example, determining whether the merge would break something can be tricky.

In addition, due to how LCS works, comparing two very big files (10,000 lines or more) quickly starts to eat giant chunks of memory, because the algorithm operates on an n*m matrix, where n is the number of code lines in the first file and m is the number of code lines in the second. There are a number of ways to optimize memory consumption and matrix size, but they are out of the scope of this blog post.

Ideally, when a developer encounters a conflict (and a merge needs to be done), he or she is presented with the two source files side by side: one from the server, the other being the local file. This view should visually highlight the parts that differ between the two versions. If the developer spots a problem that would break the code upon auto-merge, there should be a way to edit the file before committing. LCS by itself cannot provide a good enough visual presentation because it can only express insertions and deletions, not modifications. For example, editing a single line produces both a “delete” and an “addition” changeset.

Then I came across the Patience Diff algorithm. All in all, it seems to fit the purpose perfectly – it offers a very nice and clear presentation of the changesets between two files and doesn’t take an awful lot of memory either. I spent hours and hours trying to find a .NET implementation of it, but as of yet there apparently just isn’t one (the number of “good” C# implementations of the other algorithms is substantially small as well). So I started working on a proof of concept. I spent a little less than two days on this challenge and finally came up with a working solution. As I write this, my work is the only C# implementation of the Patience Diff algorithm that I know of 🙂 . As a side note, there’s seemingly only one version control product that uses Patience Diff as its main tool, Bazaar, but apparently one can optionally enable it for Git as well.
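
The heart of the algorithm is easy to state: take the lines that occur exactly once in both files, keep the longest increasing subsequence of their position pairs (this is the “patience sorting” step the algorithm is named after), and use those pairs as anchors, recursing between them. A condensed sketch of that core – my own variable names, not the full proof-of-concept:

[csharp]// Sketch only: the anchor-finding core of Patience Diff.
// a and b are the two files split into lines.
using System.Collections.Generic;

// 1) Count occurrences and remember each line's position in both files.
var countA = new Dictionary<string, int>();
var indexA = new Dictionary<string, int>();
for (int i = 0; i < a.Length; i++)
{
    countA[a[i]] = countA.ContainsKey(a[i]) ? countA[a[i]] + 1 : 1;
    indexA[a[i]] = i;
}
var countB = new Dictionary<string, int>();
var indexB = new Dictionary<string, int>();
for (int j = 0; j < b.Length; j++)
{
    countB[b[j]] = countB.ContainsKey(b[j]) ? countB[b[j]] + 1 : 1;
    indexB[b[j]] = j;
}

// 2) Lines unique to both files become candidate match pairs, ordered by A.
var pairs = new List<KeyValuePair<int, int>>();
for (int i = 0; i < a.Length; i++)
    if (countA[a[i]] == 1 && countB.ContainsKey(a[i]) && countB[a[i]] == 1)
        pairs.Add(new KeyValuePair<int, int>(i, indexB[a[i]]));

// 3) The longest increasing subsequence of the B indices keeps only the
// matches that appear in the same relative order in both files.
var tails = new List<int>();          // pair index with the smallest tail per pile
var previous = new int[pairs.Count];  // back-pointers for rebuilding the chain
for (int k = 0; k < pairs.Count; k++)
{
    int lo = 0, hi = tails.Count;
    while (lo < hi)                   // binary search for the pile to place this pair on
    {
        int mid = (lo + hi) / 2;
        if (pairs[tails[mid]].Value < pairs[k].Value) lo = mid + 1; else hi = mid;
    }
    previous[k] = lo > 0 ? tails[lo - 1] : -1;
    if (lo == tails.Count) tails.Add(k); else tails[lo] = k;
}

var anchors = new List<KeyValuePair<int, int>>();
for (int k = tails.Count > 0 ? tails[tails.Count - 1] : -1; k >= 0; k = previous[k])
    anchors.Add(pairs[k]);
anchors.Reverse();  // matched (lineInA, lineInB) anchors; recurse between them[/csharp]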

Here’s a sample of two slightly different files: the left side represents the original document, whereas the edited version is on the right:

Different files

The following difference graph can be generated from them (green = insert, red = delete, yellow = modify):

Difference analysis

So what does this mean for CoolBasic? I don’t really know yet, but I have some ideas 🙂
