CoolBasic Program Skeleton

The leap from a procedural language to an object oriented language is huge for both the designer (me) and the user. I have established the guidelines and practices for how CoolBasic programs need to be built, and I’m already seeing light at the end of the tunnel (which doesn’t, however, mean that I’m nearing the end). From the user’s point of view the change is so big that CoolBasic V3 can be considered a completely different language from what it used to be. And it is. You are going to be writing slightly more code to achieve the same results compared to the current CoolBasic (that never got past its Beta stage haha).

In OOP languages, all executing code must be written inside a procedure. I say procedure, because it’s a uniting term for Functions, Properties and Operators. This means that the main loop of a game must also be inside a Function. Since functions can only appear within the module context, you must define the containing module or class. You can’t call the starting code from outside, so the concept of Main Function needed to be implemented. What this means, is that you must always write Function Main() somewhere in your root level Modules. It operates as an entry point to your application. Only one Main Function can be defined. The concept is similar to .NET languages and C/C++.

It may seem like much, but in the end it isn’t. Sure, you have to build all programs in a certain way based on a skeleton, but after that things get simpler. The following code demonstrates a simple HelloWorld program. It requires 11 lines of code whereas the old CoolBasic only required four lines. But you really can’t simplify things any more than that in the OOP world. This can be confusing to newcomer programmers, so perhaps when you create a new project in the CoolBasic code editor, this skeleton could be automatically typed in – for reference and an easy start.

Simple HelloWorld Example

Please note that the example program above is subject to change, and is most likely not final. Most of the example’s code lines belong to the frame you need to define. The main loop is still quite simple. First we open a windowed game screen of 800×600 dimensions. Then we loop two lines of code (text drawing and screen updating) until the Screen is closed. The biggest challenge is to memorize all new classes and their methods. A brand new, comprehensive and user-friendly manual will help with this, and perhaps I’ll implement some kind of an intellisense-prompt to the editor at some point.

A short update on CoolBasic V3 compiler progress: The above example fully passes the lexer, and soon the parser (Pass#1) as well. All in all, I need to implement the parsers in the same order as one would write code in a program: containers first and then the statements. It’s slow in the beginning due to the complex structure of, say, Function statements and such, but the rate should accelerate when I progress through the simpler statements in the end. I recently finished the Function statement parser, which is one of the largest and most complicated parsers, and I feel good about it; solid ground is the key to success. I expect to implement the Class and Interface parsers in the near future.

Parsing in Segments

Pass#1 is responsible for so many things that it needs to be designed very carefully. I have come up with a plan for the remaining compiling phases: There will be 3 or 4 passes in total, of which Pass#1 is the parser. Its main job is to compile the symbol table for the source code, and generate variants of the generic classes and functions. In short, it will prepare the source code for Pass#2 where the actual compilation of each procedure takes place. Of course, this involves making the code as clean as possible by removing modifiers and certain other structural “particles”.

Parsing in general can be quite a difficult task, but this time I’ve taken an entirely different approach. As we begin Pass#1 (for each line separately) some basic syntax checks have already been done. Therefore we can safely assume that the line always begins with one of the following:

  • Pointer literal
  • Identifier
  • Label
  • Full-line special data segment
  • Keyword

Each of these launches a mini-parser of its own. These include ParseKeywords(), ParseStatementValue(), ParseAssignment(), ParseParamList(), ParseDeclaration() and ParseModifiers(). They scan the terminals of the current line by segments, i.e. they group terminals based on which class they belong to. All this is done recursively, so in case of a failure the entire call stack rolls back, ultimately terminating the entire compiling process. The advantage of recursive parsing is that it allows flexible identification of expressions which belong to a multi-part statement such as the For-statement.
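The dispatch idea described above can be sketched roughly as follows. This is Python rather than CoolBasic, and everything in it is an assumption made for illustration: the token representation, the handler names and the stubbed handler bodies are not taken from the actual compiler.

```python
# Hypothetical sketch: pick a mini-parser based on the class of the first
# terminal on the line. Handler bodies are stubs; names are illustrative.

class ParseError(Exception):
    """Raised by any mini-parser; propagates up the whole call stack."""

def parse_pointer_literal(tokens):
    return ("pointer", tokens)

def parse_identifier(tokens):
    # In this sketch, an identifier followed by "=" counts as an assignment.
    if len(tokens) > 1 and tokens[1] == ("operator", "="):
        return ("assignment", tokens)
    return ("statement_value", tokens)

def parse_label(tokens):
    return ("label", tokens)

def parse_data_segment(tokens):
    return ("data", tokens)

def parse_keyword(tokens):
    return ("keyword", tokens)

DISPATCH = {
    "pointer": parse_pointer_literal,
    "identifier": parse_identifier,
    "label": parse_label,
    "data": parse_data_segment,
    "keyword": parse_keyword,
}

def parse_line(tokens):
    """tokens is a list of (token_class, text) pairs for one source line."""
    if not tokens:
        raise ParseError("empty line")
    first_class = tokens[0][0]
    handler = DISPATCH.get(first_class)
    if handler is None:
        raise ParseError(f"a line cannot start with a {first_class}")
    return handler(tokens)
```

Because a failure is raised as an exception, any nested mini-parser can abort the whole line without every caller having to check return codes, which is one way to get the "entire call stack rolls back" behaviour.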

In addition, all in-built statements have their own parsers that make sure the keyword has been used within a proper declaration context, and that those statements that define a code block, follow a valid hierarchical structure. As many syntax checks will be made as possible, but the final compilation in Pass#2 will detect the rest of them.

CoolBasic V3 follows very sophisticated and strict syntactical rules similar to VB.NET. This means that in order to parse a simple Break-statement I must first establish the base for parsing modifiers, Modules, Functions, and Repeat…Until. In other words, I must first set up a lot of functions just to get the bare minimum running. Lots of code is required before I can even run the basic tests (for example on the scope list and the symbol table). Pass#1 is coming along nicely, though.

Compilation: Pass#1

Since I decided to add cross-reference support for CoolBasic (see blog entry “Highly Modular”), some major changes to the entire process were needed. Therefore, I’ve rewritten the lexer in preparation for Pass#1. Due to the nature of these changes, the way inner lists work also needed a review. I’m quite happy with the results. Lexing algorithms improved, for example the assignment operator is now better distinguished from the comparison operator. The assignment is now automatically cast for statements like Optional in function declaration, variable declaration, or the For…Next assignment. This will ease further parsing in Pass#1 and Pass#2.

Pass#1 compiles the symbol table, i.e. the list of all identifiers. Basically, the compiler will scan for declaration statements and then create a collection of their names. Some secondary information such as path, relationship, block entry/end and signature is also formulated. Pass#1 can also check the integrity of the hierarchy as well as perform some initial checks on the syntax. We can’t create exact pointers to parent entities yet because program structures can be declared in random order within the source. Other tasks include generation of enums and trimming entity modifiers from the code. In general, Pass#1 is just a fast sweep and it doesn’t analyse the syntax in great depth. Deep scanning belongs to Pass#2 and perhaps Pass#3.

Constants are interesting because they can be used to define each other, for example Const a = 1 : Const b = a + 3. It’s, however, possible to create a circular reference which makes evaluating the constant’s value impossible. You can’t make exceptions to the general reference resolver (requiring Constants to always be declared before use) either. Being unable to use constants as part of another constant expression would be stupid, too. This means I have to implement a recursive search in later passes when the constants’ true values are pre-calculated. Pass#1 doesn’t yet do anything about the constants’ values anyway. In VB.NET constants aren’t really replacing values within the source; they’re constant variables, i.e. Static ReadOnly. They are, however, pre-evaluated at compile time to determine constant expressions and thus discard code blocks whose condition is false. This is where CoolBasic will differ – constants will be transformed into literal values, which means a speed increase at runtime.
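A recursive search with circular-reference detection could look something like the sketch below. It is written in Python, not CoolBasic, and the data layout is an assumption: each constant is reduced to a list of terms that are summed, where a term is either an integer literal or the name of another constant. A real expression evaluator would be more involved.

```python
# Hypothetical sketch: resolve constants recursively and detect cycles.
# Expressions are limited to sums of integer literals and constant names.

class CircularConstantError(Exception):
    pass

def resolve_constants(definitions):
    """definitions maps a constant name to a list of terms to be summed.
    A term is either an int literal or the name of another constant."""
    resolved = {}
    in_progress = set()

    def value_of(name):
        if name in resolved:
            return resolved[name]
        if name in in_progress:
            # We came back to a constant that is still being evaluated.
            raise CircularConstantError(f"circular reference via {name}")
        if name not in definitions:
            raise KeyError(f"unknown constant {name}")
        in_progress.add(name)
        total = 0
        for term in definitions[name]:
            total += term if isinstance(term, int) else value_of(term)
        in_progress.discard(name)
        resolved[name] = total
        return total

    for name in definitions:
        value_of(name)
    return resolved

# Const b = a + 3 is allowed to appear before Const a = 1.
values = resolve_constants({"b": ["a", 3], "a": [1]})
```

The `in_progress` set is what catches a definition like Const a = b : Const b = a, while the `resolved` cache makes sure each constant is evaluated only once regardless of declaration order.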

More about Pass#2 when I’m done with this…

Highly Modular

I spent most of yesterday sprawling on the couch thinking about the fundamentals of programming languages and how, for example, VB.NET has most of its modern features implemented. I came to the conclusion that it’s absolutely necessary to discard the idea of one-pass top-down parsing when the symbol table is created. Most of the old-fashioned procedural languages (which includes the majority of BASICs) scan the source in two or three passes, but always so that when a new identifier is encountered it must have already been defined; otherwise the compiler reports an “unknown identifier”, and the process terminates.

Since all code in OOP-languages must be written within a container (Module, Class, or an Interface), there’s a problem when you must declare identifiers with each other’s type. There are no “prototypes” to introduce identifiers before actual declaration. The following code illustrates the problem:

Class Class1
Public c2 As Class2

Class Class2
Public c1 As Class1

The old-fashioned compiler would stop at the 2nd line of Class1 because the abstract data type Class2 is not yet declared anywhere within the preceding code lines. Therefore we need two passes at the minimum when we compile the symbol table of the entire source code. Practically speaking, CoolBasic will still compile the symbol table during the first pass, but some information about each identifier, such as its data type, doesn’t yet point to the structure of the identified data type. Instead, there’s merely a string-based reference, which will be taken care of in later compile passes.
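The string-based reference idea can be sketched in a few lines. This is a Python illustration, not the CoolBasic compiler: the `ClassSymbol` type and both function names are made up, and the "source" is already reduced to a list of class declarations.

```python
# Hypothetical sketch of two-pass type resolution.
# Pass 1 records each field's type as a plain string; a later pass replaces
# that string with a reference to the declared class.

class ClassSymbol:
    def __init__(self, name):
        self.name = name
        self.fields = {}  # field name -> type name (str) or ClassSymbol

def build_symbol_table(declarations):
    """declarations: list of (class_name, [(field_name, type_name), ...])."""
    table = {}
    for class_name, fields in declarations:
        if class_name in table:
            raise ValueError(f"duplicate class {class_name}")
        symbol = ClassSymbol(class_name)
        for field_name, type_name in fields:
            symbol.fields[field_name] = type_name  # string only, for now
        table[class_name] = symbol
    return table

def resolve_types(table):
    """Later pass: every class is known now, so the strings can be linked."""
    for symbol in table.values():
        for field_name, type_name in symbol.fields.items():
            if type_name not in table:
                raise NameError(f"unknown type {type_name} in {symbol.name}")
            symbol.fields[field_name] = table[type_name]

source = [
    ("Class1", [("c2", "Class2")]),
    ("Class2", [("c1", "Class1")]),
]
table = build_symbol_table(source)
resolve_types(table)
```

The "unknown type" error is only raised in the second function, after all declarations have been collected, so the mutual reference between Class1 and Class2 resolves without any prototypes.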

Structure of a CoolBasic program
The ROOT level, i.e. the source file context, can only contain Modules, Classes, Interfaces and the Imports statement. The other declaration contexts can contain the following:

  • Modules: Classes, Structures, Procedures, Enums, and variables
  • Classes: Classes, Structures, Procedures, Properties, Operators, Enums, and variables
  • Structures: Structures, Procedures, Properties, Operators, Enums, and variables
  • Procedures and Blocks: only executable code and variable definitions

Variables defined in Procedures and Blocks are considered Public, but local. A block can’t define a variable that shadows a local identifier of a parent level. The “variable” here also refers to any Constant. The identifier match resolver operates by standard dot notation rules, but each Imports declares a special namespace. If an identifier reference can’t be resolved, the compiler will attempt to match it with these namespaces as a prefix of the identifier.
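The Imports fallback described above might work roughly like this. It is a Python sketch under stated assumptions: the symbol table is flattened to a set of fully qualified names, the namespace names (`Graphics`, `Input`) are invented, and treating two matching namespaces as an error is a guess at a sensible rule, not something the text specifies.

```python
# Hypothetical sketch: try the identifier as written, then retry with each
# imported namespace as a prefix.

def resolve_identifier(name, symbols, imports):
    """symbols: set of fully qualified names; imports: list of namespaces."""
    if name in symbols:
        return name
    matches = [f"{ns}.{name}" for ns in imports if f"{ns}.{name}" in symbols]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise NameError(f"unknown identifier {name}")
    # Assumption: more than one candidate is reported instead of guessed.
    raise NameError(f"ambiguous identifier {name}: {matches}")

symbols = {"Graphics.Screen", "Input.Keyboard", "Main"}
resolved = resolve_identifier("Screen", symbols, ["Graphics", "Input"])
```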

Minimum language feature requirements
Object Oriented Programming unlocks some very powerful tools which I can take advantage of in creating more sophisticated structures. As I was lying on the couch this huge idea came to my mind… could this be possible?! What if you actually wrote some of the very basic features of any programming language by hand… in CoolBasic itself. Things such as strings, arrays or linked lists. So I spent some time mapping out the bare minimum the language absolutely needs to support, so that one could write all the other features that are considered “standard” in today’s programming languages. These are my findings:

The language must support at least:

Basic program flow, Modules, Classes (with full Inheritance support), Procedures, Declares, Structures, Variables, Constants, Properties, Operators, Generics, certain in-built Operators, The ROOT data type

Can be implemented later:

Imports, Interfaces, ForEach

Need to be implemented before starting to write the essential classes in CoolBasic itself:

Memory, Console

Can be written in CoolBasic itself:

Integers, Floats, Strings, Arrays, System services

Sophisticated features written in CoolBasic itself:

List, LinkedList, Stack, Queue

Game related:

Graphics System, Object System, Input System, Sound System, etc.

“Imported Packages”
I have this idea of “Open Source” where all these modules written in CoolBasic itself, come with the package. You can see how they’re done code-wise, and you can even modify or optimize them. These modules would be located in an “Imports”-folder or something like that. The compiler would automatically import the standard modules, but nothing will stop you from adding in your own modules and by using the Imports statement, include them into the compilation. There’s one down-side to that, though: It means that *everything* will be compiled *every time*, slowing the process a bit. One way of solving this problem, would be a support for pre-compiled libraries (CoolBasic-made of course), but I need to do some more research about that.


As a new feature, CoolBasic V3 will support class properties. You may have already seen a hint of property support when I discussed the String implementation earlier in this blog. Properties work just like any other Public variable, i.e. you can access them by using the dot notation, but with an added twist. Normally, when you either assign or read a value from a Public variable nothing else will happen. But with properties you are able to capture this event, and execute your custom code to get the job done. Basically, we’re talking about the old-fashioned way of creating getter and setter functions. Without properties you’d have to write these two functions in order to access, say, Private variables:


With properties you could do the same in a more decent way:

MyClass.Name = "hello"

You still define the getting and setting code within the property structure, but in such a way that they’re inaccessible from outside. So you have a nice little black box whose workings not even the class members know about. Using the property name is different from assigning the value to the private variable in question by hand – even when all the setter does is the assignment. There’s always code involved when accessing property values.

In addition you can define the property as ReadOnly or WriteOnly. By doing this, you protect data that must not be modified from outside. One example of a ReadOnly property would be the length of a string. The property returns the value of a Private variable; however, this information is related to the actual data within the String class, and should only be updated when this inner data changes. It’s the only way of keeping such information in sync in a secure way. Now if you come to think about it, constants are ReadOnly, too (except that the CoolBasic compiler doesn’t work like .NET compilers in this regard).
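For readers who haven’t used properties before, here is roughly what the concept looks like in a language that already has it. This is a Python analogue, not CoolBasic V3 syntax; the `Person` class and its members are invented for the example.

```python
# Python analogue (not CoolBasic syntax) of a property guarding a private
# field, plus a read-only property that has no setter at all.

class Person:
    def __init__(self, name):
        self._name = name   # "private" by convention
        self.writes = 0     # counts how often the setter code has run

    @property
    def name(self):
        # Getter: runs whenever person.name is read.
        return self._name

    @name.setter
    def name(self, value):
        # Setter: custom code runs on every assignment to person.name.
        self.writes += 1
        self._name = value.strip()

    @property
    def length(self):
        # Read-only: no setter is defined, so assigning to it fails.
        return len(self._name)

p = Person("hello")
p.name = "  world "   # looks like a plain assignment, but runs the setter
```

From the caller’s side `p.name` reads like a Public variable, while `p.length` can only ever be changed indirectly, by changing the data it is derived from.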

Since properties are really a set of methods, they can be inherited. This also means that things like overriding are possible. However, at this point in time, I’m still a bit unsure about one thing called the Default property. When a property is defined as Default, you can access it through the object without providing further dot notation. Sounds like a normal object reference? Well, there’s a catch. In VB.NET Default properties always need at least one parameter, which means you must include a set of brackets. A typical example would be the myVar.Item(index) property which you could refer to as just myVar(index). I don’t like this because it’s easy to mix this up with a method call.

On the other hand, I would still have to allow parameters for properties because it’s the only way to implement indexes to them. Like above. If I ever wanted to implement LinkedLists or Lists or any variant, I, in fact, must allow this. There are still lots of questions floating I need to chew on.
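As a point of comparison, an indexed default accessor can be sketched like this in Python. It is only an analogue: `ItemList` and its methods are invented, and Python happens to use square brackets for indexing, which sidesteps the confusion with method calls that round brackets cause in VB.NET.

```python
# Python analogue of a Default indexed property. my_list[index] forwards to
# the explicit item(index) accessor, much like myVar(index) and
# myVar.Item(index) in VB.NET.

class ItemList:
    def __init__(self):
        self._items = []

    def add(self, value):
        self._items.append(value)

    def item(self, index):
        """Explicit indexed accessor, comparable to myVar.Item(index)."""
        return self._items[index]

    def __getitem__(self, index):
        # The "default" accessor: no member name is needed at the call site.
        return self.item(index)

my_list = ItemList()
my_list.add("first")
my_list.add("second")
```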

Scope rules in CoolBasic V3

Warning, wall of text incoming!

“In computer programming, scope is an enclosing context where values and expressions are associated. The type of scope determines what kind of entities it can contain and how it affects them. Typically, scope is used to define the visibility and reach of information hiding.”

The scope of an entity’s name is the set of all declaration spaces within which it’s possible to refer to that name without qualification. In general, the scope of an entity’s name is its entire declaration context; however, an entity’s declaration may contain nested declarations of entities with the same name. In that case, the nested entity shadows, or hides, the outer entity, and access to the shadowed entity is only possible through qualification.
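The shadowing rule in that definition boils down to "the nearest enclosing declaration wins". A minimal Python sketch of such a lookup, with an invented `Scope` class and invented entity names, could be:

```python
# Hypothetical sketch: unqualified lookup walks outwards through the scopes,
# so an inner declaration shadows an outer one with the same name.

class Scope:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.symbols = {}

    def declare(self, identifier, entity):
        self.symbols[identifier] = entity

    def lookup(self, identifier):
        """Unqualified lookup: the nearest enclosing declaration wins."""
        scope = self
        while scope is not None:
            if identifier in scope.symbols:
                return scope.symbols[identifier]
            scope = scope.parent
        raise NameError(identifier)

outer = Scope("Class1")
outer.declare("x", "Class1.x")
inner = Scope("Nested", parent=outer)
inner.declare("x", "Class1.Nested.x")
```

Looking up `x` from `inner` finds the nested entity; reaching the shadowed one requires going through the parent explicitly, which corresponds to qualification in the source code.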

So much for theory (the quoted text was from Wikipedia). Now, what does that mean for CoolBasic…

In VB.NET and C# there are mainly two kinds of scopes: We have a root/namespace/class/structure context, and then we have a method/block context. The main difference between the two is how data can be referenced in Object Oriented Programming. Class instance variables (objects) store their values in the heap because they are reference types and their life span is not determined beforehand by the compiler. Procedure variables, however, store their values in the stack. It’s a static space in memory where procedure variables temporarily reside, and from which the space is automatically freed when the life span of all those variables ends, i.e. when the procedure exits. Because of the stack, recursive calls are possible.

It’s important to realize that the program structure is highly hierarchical in Object Oriented Programming. It’s all about references to other objects and their parent-child relationship. Gone are the days of procedural programming and direct function calls where the only way to share data between two functions was to introduce a global variable to hold that data. Now, global variables are quite a bad concept and frowned upon by most programmers today. That’s because it’s often difficult to keep track of the value of a global variable plus all the entities that can modify it. In the end, there are more cons than pros when we’re dealing with large and complex projects.

I will not talk about global variables (or the lack thereof) just yet. Nor will I define the concepts of Inheritance, Shadowing or Overriding because in all probability I’m going to do so at some point in future anyway. For quick reference, “inheritance” means reference-based access to a parent class from a derived class. “Shadowing” means identifier matching to the nearest level upwards within the program tree. “Overriding” means priority of methods, operators and properties through inheritance for identifiers with a matching signature. “Implementing” means copying functionalities from an interface-template.

In VB.NET and C# there are some restrictions on what you can and can’t declare within each type of entity. The basic concept is that all code is written inside a class. The only exception would be the root context which itself behaves like a class. However, the root level should have the main object created as soon as possible so the program can be redirected to true class-based execution. Classes can have methods (functions), operators (kind-of functions, but executed as an operator instead of a value), properties (a public attribute with a functional purpose) and even inner classes. In addition, there can be structures (similar to classes), enums, constants, and attributes (variables). I’ll describe some restrictions that were mentioned in the MSDN documentation without any explanation or justification of why you can’t do these things. I had to figure these out myself, and there might still be some gaps in my logic. However, trying to understand the problem by oneself tends to be quite an enriching and valuable experience.

A function cannot be declared within the Block or Method context
A Block context does not have an identifying name that you could use to access it with the dot notation, so declaring a function there is already logically wrong. In addition, the member access operator (dot) only works with references, and neither a block nor a function is a reference. This means that functions don’t get unique memory addresses that are visible to the programmer. Even if you had a pointer to the function, it’d still be pointing to the code entry point and not to the actual member variables. Function members can never be Public. They can’t be Protected either, because functions cannot inherit from other functions. So function members are Private, which means you couldn’t access them from the outside anyway. Function variables and their memory addresses will be determined at runtime because they reside in the stack. This offset can’t be calculated at compile time. References, on the other hand, point directly to member variable addresses so the reference is always direct and valid.

This, however, doesn’t explain why the compiler couldn’t just enforce Private functions within the Block/Method context (all identifiers are local in these context levels). Firstly, there’s the logical issue: Why on earth would anyone want to define private functions inside other functions? It’s like restricting yourself from data you’re eligible to use anyway – for no gain. Secondly, there’s the technical issue. While I *could* allow this kind of declaration in CoolBasic (even though it’s forbidden in .NET languages), I think that it would be better for the compiler’s sake to prevent mixing two separate scoping mechanisms. Private/Public/Protected are intended to define reference context accessibility, the inheritance thing, whereas the keyword Dim is meant to define identifiers within nested local scopes. There’s no equivalent to Public in the Method/Block context (if there was, you’d be able to call conditional code from outside, which could potentially break the program flow integrity), as explained above.

So let’s just assume that we have a function inside a local scope (another function or a block) with no separate access modifier in the declaration. The declaration is now legal as long as there are no inheritance-related modifiers or flags. We call this inner function only from inside of the hosting scope. What’s wrong with this? Well… nothing really, besides the odd visibility and reach. Can we implement this? Probably. Should we allow this? I don’t know. It just happens that when you enter a method scope you can, in fact, declare an identically named inner function. In this case, which function does a recursive call direct to? In normal rules of shadowing, it would be the same scope of the calling code unless qualified with Me or MyBase. The problem is, these two variables always refer to a class so you can’t point to the inner function. Either you use Me.myFunc() for the outer function or just myFunc() for the inner function. Also, it matters whether you make the recursive call from inside the inner function or from outside of it. Add different function signatures and you have a nice soup of failure.

When it comes to declaring classes within functions, the same rules kind of apply. You can’t use a function name as part of the dot notation to point to a class or nested function, even though the issue would not be reference-based at runtime (procedures get their fixed location within ASM at compile time). It’s about the syntax and whether or not it’s allowed to do inconsistent/logically wrong things. Function references within the source code should always contain a set of brackets – which itself satisfies the member access operator (dot) because such a token will be identified as value evaluation.

A structure or enum cannot be declared within the Block or Method context
First of all, being unable to declare structures within procedures, but still being able to do so outside (which the function could use), made no sense to me. There shouldn’t be any restrictions because you can declare variables normally; the new structure/enum would simply be visible only locally within the method scope. Some rules that were explained above still apply, but I think I have come up with a better explanation. Picture this: You have a structure called “myStruct” within a class context. Then you have a method that takes a typed variable (myStruct) in as a parameter. Because the scope has now changed, you can declare another structure that has the same name within the method. When the source code gets parsed for the first time, all identifier definitions are scanned. While it wouldn’t normally matter where you have the structures defined, since you can use them from anywhere within their scope, it just happens that variables are identifiers, too. So they get scanned in the same way. You *could* do multiple passes for the source code to scan in layers, but in OOP this can be a bit problematic and slow. So in this scenario, you can define a variable as “myStruct” before and after the inner Structure is declared, which means we have two different types for those variables even though they were typed using the same name. Confusing, eh?

When you think about it, the answer to all these problems could be the simplest one: They don’t want you to write structural definitions (such as structures, enums, functions or classes) in a code block which has an executable code part (and thus no separate declaration part). The basic principle is that everything belongs in its own little box, and only access modifiers, inheritance and shadowing govern information hiding and reach because they work both ways in the program hierarchy. I think there’s enough reason already to conclude this think-tank by following the rules the wise guys at Microsoft have agreed upon, and not allowing nested functions in CoolBasic V3.

Implementing Strings

Every time I have started refining CoolBasic there’s a number of issues that keep haunting me about how to implement them in the “proper” way. One of these topics is handling strings. Strings are not values because they have dynamic length and memory consumption and thus can’t be stored in the stack where all integers and floats reside. Strings work as references instead, meaning that the string variable is really an integer pointing to the position in memory which holds the actual string data. Every time a string is modified in some way (when its length changes), a new memory block will be allocated for the new string data, and the old string will be freed. Basically, the string pointer changes in order to keep the reference up to date.

Because of this behaviour it’s problematic to free the string with rest of the values when, say, the containing procedure, block or class instance ceases to exist. You’d just lose the pointer, but the string itself remains in memory with no way to access it again! You could require the programmer to manually destroy all reference objects when needed, but it’d be kind of stupid to make strings part of that since they are intrinsic data types (i.e those literal values that come in-built with CoolBasic). That’s why you don’t want to force the use of the “New” keyword when declaring string values and variables. We’re after VB.NETish syntax, after all.

I was first thinking of implementing strings as a true CoolBasic-written class. However, it would mean that the full class source was included at the beginning of every CoolBasic program, increasing line count and thus processing time. It would also require the standard New-assignment for each string. But that was not the main problem since it could easily be overridden by a pre-compiler that transforms the syntax during the compilation. On the other hand, if CoolBasic had an automatic Garbage Collection system, the strings would get deleted automatically with the rest of the unreferenced objects. However, CoolBasic does not have such mechanics (the programmer needs to take care of freeing objects by calling their destructors).

The following code will illustrate the loss-of-pointer problem. Consider:

Dim a As String = "A"
Dim b As String = "B"
a = b.Trim()

First, the pointer of string “B” will be evaluated and pushed onto the stack, followed by the function call Trim(). A new string with leading and trailing white-spaces removed will be created and pushed onto the stack, replacing the original pointer. Now, variable “a” contains a pointer to the existing string “A” in memory. But the pointer will be replaced in the assignment, and since strings are always unique, i.e. two variables cannot have a reference to the same string in memory, the pointer to string “A” will be lost for good! It will remain in memory with no way to access or free it in the program. This is called a memory leak, and is considered bad programming. It can lead to memory starvation.

Luckily, there are only three things that need to be taken into account to prevent this. Firstly, every time a string literal occurs, it will be copied and then its (new) pointer gets pushed onto the stack. The template never changes, and we now have two identical strings at different memory locations. Secondly, every time there is an assignment to a string type variable, a .Finalize property will be added by the compiler. And finally, we will take advantage of one of the new features of CoolBasic V3, class properties! Every time a scope (be it a procedure, class destructor or just a code block) ends, .Finalize = Nothing calls will automatically be added for all string variables, including arrays. Finalize is a WriteOnly property of the intrinsic String class which acts as a delegate function. This means that we can inject some program code before the actual value assignment happens. Basically, we’ll just deallocate any existing string at the current pointer and then assign a new pointer.

The compiler will transform string assignments like this:

a.Finalize = b.Trim()

Of course this behaviour differs from normal instance variable handling since there can be several references to same object. So far, this is by far the most sophisticated and proper way of handling strings any CoolBasic generation has ever implemented, and I’m very happy about it right now. For the moment, strings are not yet implemented, but at least the concept is now thought through.
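To make the scheme concrete, here is a toy model of it in Python. It is only a sketch under assumptions: the `Heap` and `StringVar` classes, the integer pointers and the use of 0 for Nothing are all invented for illustration, and the string “B” is padded with spaces so that Trim() has something to do.

```python
# Hypothetical model of the scheme: a toy heap plus a Finalize-style setter
# that frees the old string before taking ownership of the new one.

class Heap:
    def __init__(self):
        self.blocks = {}   # pointer -> string data
        self._next = 1

    def alloc(self, text):
        pointer = self._next
        self._next += 1
        self.blocks[pointer] = text
        return pointer

    def free(self, pointer):
        del self.blocks[pointer]

class StringVar:
    """Holds a pointer into the heap; 0 stands for Nothing."""
    def __init__(self, heap):
        self.heap = heap
        self.pointer = 0

    def finalize(self, new_pointer):
        # Deallocate any existing string first, then assign the new pointer.
        if self.pointer:
            self.heap.free(self.pointer)
        self.pointer = new_pointer

heap = Heap()
a, b = StringVar(heap), StringVar(heap)
a.finalize(heap.alloc("A"))      # Dim a As String = "A"
b.finalize(heap.alloc(" B "))    # Dim b As String = "B" (padded here)
trimmed = heap.alloc(heap.blocks[b.pointer].strip())
a.finalize(trimmed)              # a.Finalize = b.Trim()
live_after_assignment = sorted(heap.blocks.values())
a.finalize(0)                    # end of scope: a.Finalize = Nothing
b.finalize(0)                    # end of scope: b.Finalize = Nothing
```

After the assignment only two strings are alive (the original of `b` and the trimmed copy owned by `a`); the old “A” has been freed instead of leaked, and once the scope ends the heap is empty again.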

Happy New Year 2009!

Oh snap, how long has it been? Like 6 or so years since CoolBasic was born! Let’s take a few moments to look back. We had a slow compiler struggling with tons of syntactical exceptions that led to all kinds of work-arounds and a tangled compiling process. What the compiler ended up producing was unencrypted intermediate code (aka pseudo-code). There was even an option in the editor to not include the program code in the executable, but to produce a small separate file that could then be “loaded” into the virtual machine. The VM tried to match files in the same folder with a name similar to the VM executable itself. Now I laugh at this concept… it was sooo insecure it’s not even funny. Both the compiler and the editor were written in Visual Basic 6.0 and the Runtime was a BlitzBasic interpreter.

Back then, the interpreter was just a renamed executable file which during the compiling process got copied into the new location with its file extension changed to “.exe”, and with the pseudo-code appended at the end of it. The interpreter automatically tried to open itself and to read the intermediate code based on a recorded offset from which it could calculate the code’s exact position within the executable. Of course, little things such as, say, modifying the icon resource of the executable were not supported because it was possible that the intermediate code offset within the file might not remain in sync with the recorded offset. In short, the first CoolBasic was quite a mess.

This is only known by those who have been with us since the beginning, but CoolBasic was accused of copying BlitzBasic’s functionality – effectively creating a “wrapper”. This eventually led to the CoolBasic redesign where the Object System was born! I also wanted to get rid of VB6 (it’s simply crap), so I wrote a new editor. Had I been smart I would have written the compiler from scratch at this point, too. But I didn’t. It’s still a VB6-piece-of-sh–.

The basic way of creating games with CoolBasic was now centered around the Object System, but of course some command sets, like the sound engine, were still similar to Blitz’s. Unfortunately, getting rid of those would require a complete rewrite of the Runtime, and CoolBasic not being created in BlitzBasic at all because of its limitations. Blitz was becoming a large problem for me, and I really wanted to dump it and have an independent CoolBasic framework!

That CoolBasic is the one you’re using at the moment. Yeah, I know how limited and bad it is. That’s why I started writing it all over about 2 years ago. There was supposed to be a new compiler, new editor, new tools, new runtime. No VB6, no BlitzBasic. An independent platform, and limitless opportunities to plug in new things. Needless to say, I was a student at that time so the project kind of froze until I started to write this blog. I reopened the CoolBasic project, but this time I really wanted to get things done right straight from the start (so I don’t need to rewrite CB again next year). The result is CoolBasic V3, which is now under development. It’s progressing slowly, but as I said, I want to get things done properly. Careful design takes time.

I want CoolBasic to become more like VisualBasic.NET (when it comes to the syntax, at least). It will be fully Object Oriented, and the ways of Procedural Programming will be gone. I can’t emphasize enough how big a step this will be. If I wish to implement new programming trends and get CoolBasic fully competitive with other languages, OOP is the way. If CoolBasic V3 was just a rewrite of the original CB, i.e. a procedural language, then it would probably be already done. However, getting the system set up in a way that will support things like classes, polymorphism, inheritance, overloading, properties, unlimited nesting, protected code or even class templates will take very careful planning, and lots of iteration.

So I’ll be around for year 2009, developing CoolBasic V3! I hope it gets published at some time this year 😉
