Sneak Peek: CoolBasic V3 manual

In parallel with compiler development, I have been designing the upcoming CoolBasic user manual for the past few weeks. I still spend nearly 100% of my development time on the compiler, but the technical solution for the user manual is something I should keep an eye on as early as possible. Nothing too fancy, but I think it’s decided now how I’ll implement the manual (I have a working prototype already). Just like in the current version, the new manual will be based on web pages. The exact same manual will be accessible both online and offline. You will be able to choose which version is the default when you access the manual from the editor. This way the end user can always see the up-to-date online version if he/she so chooses, although the offline content will be updated regularly as well. It’s still uncertain whether users who have a forum account will be able to write comments on the online manual pages.

Since the exact same manual is also available online, I must ensure that it’ll render the same way on all of the popular web browsers. This has given me some headaches. The offline manual relies on the Internet Explorer WebBrowser control anyway, but as is well known, IE doesn’t really follow the standards very well – especially IE6! Fresh statistics on StatCounter.com indicate that 1/4 of all Internet users worldwide still use IE6 as their primary web browser. Basically this means that the manual must support at least IE6. This imposes some limits and workarounds I have to consider when I design the CSS and the dynamic functionality for the manual. I test the manual with IE6, IE7, IE8, Opera 9.6, Opera 10, Firefox 3.0 and Safari 4. For some reason, I couldn’t be arsed to install Google Chrome yet. Also, BrowserShots.org comes in handy.

There’s always at least one browser that fails to function like the rest. IE defines its own standards, naming attributes differently in JavaScript. Some browsers just behave differently in some JavaScript cases, for example the “name” or “id” attributes combined with “getElementsByName”. Frames render differently no matter how you define the borders. And then there’s Firefox that refuses to disable escaping in XSLT despite all the complaints that have been going around for several years now. Due to all this, I had to write some “hacks”. But at least it works now. Why can’t all browsers just follow the darn standards! Why must they define their own rules that only apply to them and not to any of the rest!

Enough whining, let’s talk about something else… like the looks of the V3 manual! There will be a treeview. The manual will be hierarchical. As opposed to the current manual, you can now see the entire path to the currently open page, as well as easily switch to other articles that belong to the same context. The treeview operates in an intelligent manner, making sure the expanded listing will not grow too big; only the relevant contents will be shown, and the rest of the pages always remain collapsed. Some users, however, like to browse the treeview with several separate parent nodes open at the same time. This isn’t possible because of the intelligent collapsing (and that’s intended), so to help with browsing back, there is a “breadcrumb”-like collection of recent pages at the top of the manual. This collection updates dynamically, reorganizing itself so that duplicate elements don’t appear.
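
A minimal sketch of how such a self-reorganizing list of recent pages could behave, written in Python purely for illustration. The class name, the `limit` parameter and the page titles are assumptions; the real manual is web-based and may work quite differently.

```python
class RecentPages:
    """Keeps recently visited pages, newest last, without duplicates."""

    def __init__(self, limit=5):
        self.limit = limit  # hypothetical cap on how many entries are shown
        self.pages = []

    def visit(self, page):
        # Revisiting a page moves it to the end instead of adding a second entry.
        if page in self.pages:
            self.pages.remove(page)
        self.pages.append(page)
        # Drop the oldest entry once the cap is exceeded.
        if len(self.pages) > self.limit:
            self.pages.pop(0)
        return list(self.pages)
```

Visiting "Intro", "Functions" and then "Intro" again leaves two entries, with "Intro" moved to the newest position.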

The color scheme will probably be the same as the current coolbasic.com placeholder website (it looks so cool) with a black background. I’ve made some minor improvements to the style sheet so that the layout fits manual pages better. Captions will be aligned to the left, and their colour is now orange. Links will remain blue. Text is still light silver. Emphasized text is now bright white and bold as opposed to the current orange colour. In other words, every colour has its own, unique meaning. The test pages I have look very good at the moment. I also tried pastel colours with a white background, but somehow it doesn’t look as good. The code editor will have a white background, but this doesn’t fit the dark theme of the manual. Therefore I’ll have to reconsider the source code colours for manual pages. More about that later.

Although the prototype is there, I won’t provide a screenshot just yet. There’s still tweaking to be done. I’m probably going back to spending 100% of my development time on the compiler.

Compile time error messages

It’s been two and a half weeks since the last entry. It may seem like a long time without any updates, but I’ve been working very hard on Pass#2 during this whole time. I finally managed to establish the architecture for data type resolving and name resolution. I just didn’t want to express any thoughts on this highly experimental phase until a solution that actually works on all test cases had been found. That being said, I’ve spent the last few days on small improvements and bug fixing. Also, this is a good spot to clean up the code and fine-tune the work done so far before I get started with the next big thing (which I won’t elaborate on at this point). One thing that needs improving, and which has been haunting me for some time now, is the way the compiler reports compile time errors to the user. The coding has been so hectic from the start that even though the compiler is able to report all the needed errors, some rephrasing is surely needed. This is especially true for Pass#1, which features a lot of syntax checks, and thus a lot of similarly formatted error messages.

Current V3 compiler error messages
For reference, compilation stops at the first error that occurs – to prevent further (possibly misleading) errors from a chain reaction the first error might have caused. This is identical to how the current CoolBasic and many other Basic language compilers report errors. If you have programmed in C and related languages, you might be familiar with the error flood you get when you try to build a project that had maybe just a few small mistakes in the source code. In most cases, the tail of that list consisted of completely mysterious errors, and only because the preceding errors let a flawed compiling process proceed until the compiler got too confused to continue. Luckily, compilers have become more intelligent in this respect, and for example, Visual Studio is able to construct a list with only the essential errors in it. CoolBasic V3, however, will continue to terminate the entire compilation on the first error for the sake of safety and stability. The sooner you get back to fixing it, the better.

Back to the current V3 implementation. To save time, errors are returned as strings when they occur. I didn’t want to gather the definitive list of Pass#1 errors beforehand because it would have changed anyway. This was fully intended at that point in development. However, now that Pass#1 is essentially finished, I know all the errors that can occur within it. And now there’s a need for rephrasing, which means I would have to Find & Replace them all. Instead, it’s better to standardize the messages by binding them to constants; rephrasing them would then require a change in one place only. This need was also foreseen, and now it’s time to implement it. I would have changed the error message generation anyway, but this is also a great opportunity to think over better ways to phrase errors; currently there are a few vague messages that don’t provide enough information. For example, what does “Statement out of context” tell the programmer?

The ideal error message
A good error message is uniform, informative and precise. It should also provide a hint on how to correct the error. That way the programmer doesn’t have to visit the manual or spend time working out what the error could mean in that particular case – which improves productivity and overall enjoyment. However, being able to tell the error line, code module, the cause and the possible error code for further reference is not enough. What if you had separate statements written on the same line (separated with : of course), and you got an error saying “Error at line bla bla in code module bla bla: This statement cannot appear here.” More time wasted trying to figure out which statement caused the error. From my point of view it wouldn’t be too much work to add this information to the error message, but the cost of omitting it would surely add up when hundreds of end users write programs in CoolBasic every day. Picture this: “Declaration context of ‘Const’ is invalid.” Sounds better? Sure, but something is still missing.

Normally I would add a hint on how to correct this error by specifying where constants are allowed or where they cannot appear. In this particular case it’s not that simple, because even though constants can appear within procedures and blocks, they can’t appear within Property or Select statement bodies. Rephrasing this error to cover the exact definition of where a constant can be declared would result in an overly complicated error message. Uniformity also prevents me from listing the legal declaration contexts because their exact contents can vary depending on the statement. Compromises do happen.

Precise error messages also tell meta information about the error. For example, a vague error message like “Identifier is already defined.” doesn’t really tell which identifier is causing the conflict. A better version could be “‘myVariable’ is already declared in its context”, but it can be improved even further, providing more narrowing information: “Variable ‘myVariable’ is already declared in this Class.” – or even by providing the full declaration information, including the data type definition.

Whereas “precise” refers to “contextual detail”, informative errors specify more exactly what’s syntactically or semantically wrong. For example, instead of telling “Syntax error. Invalid statement.” you could specify “End of statement expected.”

Detailed error messages tend to multiply. That generates variance which easily gets out of hand. Thus, even though you have practically different error messages for telling how different kinds of identifiers conflict in context, or what token a statement expects next, they should follow the same basic formatting. This will not only give the end user an impression of quality, but also make the developer’s life easier in the future. Uniformity also emerges within the error codes. Even though variables and constants generate slightly different error messages when, say, their data type could not be resolved, they still share the same error code because the error arose from the same cause. In other words: error codes are not tied to the actual (unique) description. Something I need to keep in mind…
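
The idea of binding messages to constants, with one error code shared by several phrasings, could be sketched roughly like this. It is Python rather than CoolBasic, and the error codes, the “could not be resolved” wordings, the file name and the `module(line): error code: text` layout are all assumptions made for the example; only the “already declared” wording comes from this entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorMessage:
    code: int       # shared by every phrasing of the same underlying cause
    template: str   # rephrasing happens here, in one place only


# Hypothetical codes and wordings, for illustration only.
UNRESOLVED_TYPE_VARIABLE = ErrorMessage(
    2001, "Data type of variable '{name}' could not be resolved.")
UNRESOLVED_TYPE_CONSTANT = ErrorMessage(
    2001, "Data type of constant '{name}' could not be resolved.")
ALREADY_DECLARED = ErrorMessage(
    2002, "{kind} '{name}' is already declared in this {scope}.")


def render(message, module, line, **details):
    """Fill in the contextual details and prepend module, line and code."""
    text = message.template.format(**details)
    return f"{module}({line}): error {message.code}: {text}"
```

Because every call site refers to a constant such as `ALREADY_DECLARED`, changing the wording later touches a single line, and the two “unresolved type” messages keep one code despite their different descriptions.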

Showing the error to the user
The current CoolBasic version shows errors inside a modal window with the code and description in it. Because of its modal nature, you must close the window in order to edit your code. And not only that, you can’t do anything while the window is there so you *need* to close it. Does anyone else sometimes forget what the actual error message was, while you’re thinking and fixing? Well, I do, and it’s irritating when it happens because I will then probably compile again just to see it. Also, I don’t want to hear the Windows exclamation sound every time the compilation fails. In addition, there’s a known issue of Scintilla sometimes invalidating its rendering area when a modal window pops up, basically making the source code invisible for quick inspection while the error window is present. So I’m experimenting with the idea of leaving the latest error message visible in the bottom section of the editor, similar to how Visual Studio does it. Also, when you click on that section, you could open the manual page for more details. You could even filter that section to show the full error history or the latest error only. The error message should also be easy to read, meaning that the error module and line number should appear in, say, their own respective columns.

The principle is that errors should be silent, but noticeable. And they should not create any unnecessary stir or require actions from the user.

Localize the compiler too?
I have mentioned several times that CoolBasic V3 will be fully localized into both English and Finnish. This includes the manual, website, forums, editor, tools, examples, tutorials and all of the content in general. There has been one point of hesitation, however. Normally compilers don’t get localized. It’s just the way it is, and nobody really questions it. Due to the absence of real-life examples and “significant” products pioneering this concept (to be honest, I don’t know about Visual Studio. I’m too lazy to check it out, but I haven’t heard of such a thing), combined with the amount of required maintenance and the infrastructural problems within the compiler’s architecture, the idea hasn’t really caught on. Parts of the error messages would be in English anyway because common base class libraries are mostly coded in English, so their identifier names get mixed into the localized language, making it look funny.

With all this rephrasing going on, it’s possible for me to build support for both English and Finnish error messages. And I have decided to give it a try. You can always disable it, right? However, I will probably use English as the default compiler language even for localized CoolBasic unless separately changed from the editor’s settings.

The changes brought up in this blog entry will keep me busy for a few days…

Name Resolution

So I thought that I’d get an easy start to Pass#2 by first developing the constant value pre-calculator. It seemed like a logical thing to do for starters because those pre-evaluated values will be needed in the actual compilation of executing code. So I started writing a procedure that iterates all constants from the symbol table, and then calculates their values. I had all these circular reference algorithms planned out and so on. But of course surprises do happen (actually I should have seen this coming from a mile away), and I figured that before I can implement this fancy stuff, there are a few other things to take care of.

Wait, isn’t that precisely the same thing I experienced in the beginning of Pass#1… oh, umm.. Yes it is! Back in Pass#1, I was hoping to get an easy start with simple statements such as Repeat…Until. But there’s no such thing as a free lunch; I had to write the containing elements first (d’oh, of course!), those being classes, functions and so on. On top of that, those statements just happened to be the most complicated ones, so plenty of general-purpose parsers needed to be written, too. So Pass#1 was quite a frontloaded burden to chew on rather than a backloaded one. As I’ve written in previous blog entries, the ending of Pass#1 went relatively quickly, and was just a copy-paste fest for the most part.

So what’s the frontload here in Pass#2? Well, even though you’ve now got your pretty symbol table and the source code on a silver platter, you will still need to resolve the identifier name references correctly because constant expressions can have constant identifiers in them. These references couldn’t be made in Pass#1 because identifiers can be declared in random order. Sooo… now you find yourself together with Mr. Name Resolution. Oh, and also, please meet his brother Mr. Overload Resolver. Oh, and don’t forget their sister, Ms. Template Expander. Whoa, so I suddenly have these 3 mean-looking procedures in my face – right from the start. I’m not kidding, these are probably THE 3 most difficult entities in Pass#2. And again, they must be almost completely implemented before proceeding to further tasks. Okay, the Template Expander can wait a little bit. I can probably get this pre-calculator to work correctly on *simple code* with just the Name Resolution algorithm implemented, but the 2 remaining villains need to be taken care of soon after.

Name Resolution is essentially the process of matching source code identifier names to their actual declarations. An identifier can be located in one of the upper levels or in a completely different branch within the program hierarchy. When the identifier reference is being resolved, the access modifiers are also taken into account. All this follows a specific rule set of OOP member access. Basically, an entity always has access to upper levels, but requires Public access for branches. Inherited class-based references to base classes can also have Protected access. Combine these with overloading, shadowing (hiding by name) and procedure overriding (hiding by name and signature), and you’ve got yourself a nice little soup. But these are my problems anyway 🙂
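
A heavily simplified sketch of the two basic rules (upper levels are always accessible, other branches need Public access), written in Python for illustration. The `Scope` class, the function names and the sample hierarchy are assumptions; Protected access, overloading, shadowing and overriding are left out entirely.

```python
class Scope:
    """One level of the program hierarchy (module, class, procedure...)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.symbols = {}    # identifier -> access modifier
        self.children = {}   # child scope name -> Scope
        if parent is not None:
            parent.children[name] = self

    def declare(self, identifier, access="Private"):
        self.symbols[identifier] = access


def resolve(scope, identifier):
    """Walk outward through the enclosing scopes; upper levels are always accessible."""
    current = scope
    while current is not None:
        if identifier in current.symbols:
            return current
        current = current.parent
    return None


def resolve_member(scope, branch_name, identifier):
    """Look into another branch (e.g. Util.Clamp); only Public members are visible."""
    current = scope
    while current is not None:
        branch = current.children.get(branch_name)
        if branch is not None and branch.symbols.get(identifier) == "Public":
            return branch
        current = current.parent
    return None
```

Both functions return the scope that holds the declaration, or `None` when nothing accessible matches, which is where a real compiler would report an error.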

I have formulated algorithms for both constant expression calculation and name resolution, and I expect to carry them out during the upcoming weeks. The pre-calculator will be a big chunk of code. At this point in development, things get really interesting (and challenging).

Pass#2 is looming just around the corner

All those 64 parsers of Pass#1 have now been written and tested. And that’s a lot of code! Virtually every statement has now had its parser implemented, even though some of those statements will probably be disabled for the first few alpha releases. I’m kind of relieved as I’ve now reached a “check point” in the development process. Yet there’s a lot more to come. All in all, I think it’s safe to say that I’m closing in on the halfway point of the entire compiler now that Pass#1 is ready. The compiler made it through the baptism of fire by properly parsing the complete source code of a test class a few hundred lines long.

This means that given the source code of a random test program, the compiler has now performed syntax checks for all elements of it, and that there’s now a complete symbol table for Pass#2 to play with. All metadata has now been gathered… Pass#2 has all the info it needs in order to complete the transformation into the Intermediate Language. If Pass#1 was essentially the Parser, then you can consider Pass#2 the actual Compiler; Its main job is to process the executable code within procedures – and to order it in a meaningful way. Pass#2 is a very, VERY complex process, and it has a lot more tasks than Pass#1 had. I have already assembled a list of those tasks, but I’m not going to share it just yet. I’d rather divide them into smaller topics and discuss them separately in future blog entries.

There are a few preceding steps before Pass#2 can start crunching the procedure code. One of them is Interface merging and the other is the pre-evaluation of constant value expressions. The latter is probably more challenging with all this circular reference thing going on. Other “difficult” parts of Pass#2 include Name Resolution, Overload Resolver and the actual Expression Transformer. More about those later. All I wanted to say was that there’s much work to be done, and it’s getting more difficult now. Bleh, I’m probably going to be spending even more time just thinking things through before implementation.
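
To show what the circular reference problem looks like in practice, here is a small Python sketch of pre-evaluating constants that may refer to each other in any declaration order. It assumes, purely for brevity, that every constant is a sum of numbers and other constant names; the function name and data layout are hypothetical and not taken from the compiler.

```python
def evaluate_constants(definitions):
    """definitions maps a constant name to a list of terms to be summed.
    A term is either a number or the name of another constant."""
    values = {}
    in_progress = set()  # constants whose evaluation has started but not finished

    def evaluate(name):
        if name in values:
            return values[name]
        if name in in_progress:
            # We came back to a constant we are still evaluating: a cycle.
            raise ValueError(f"Circular reference involving constant '{name}'")
        if name not in definitions:
            raise NameError(f"Constant '{name}' is not declared")
        in_progress.add(name)
        total = 0
        for term in definitions[name]:
            total += evaluate(term) if isinstance(term, str) else term
        in_progress.discard(name)
        values[name] = total
        return total

    for name in definitions:
        evaluate(name)
    return values
```

Evaluation is depth-first and memoized, so a constant declared after its first use is simply evaluated on demand, while something like `A = B` together with `B = A` is rejected.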

ValueExpressionParser()

Pass#1 consists of dozens of parsers. All the “container” statements are already implemented and those missing are the “regular” statements, i.e. statements that can only be written inside a procedure and which carry the actual program flow. I had no need to parse value expressions until now, as I got started with statements such as If, Until, While, etc. Most of these standard statements include parts where a value is expected. And a value is evaluated from an expression. Expressions can contain operators and operands, and together they form a mathematical formula. The result of an expression can be a single value of any data type defined within a CoolBasic program.

The constant value expression parser was made long ago when I was writing the Function statement parser. It converts the expression to postfix notation and then attempts to evaluate it. It was perfectly sufficient for the job, so I decided to try out the same technique for the value expression parser as well. Value expressions allow practically all expression elements whereas the constant parser implementation was merely partial due to the number of “illegal” elements for a constant expression. The value expression parsing algorithm soon grew in size and wasn’t exactly the most enjoyable thing to debug. Also, the more complicated expression elements such as jagged arrays caused all kinds of nasty problems that would have required ugly exceptions to be made in the algorithm. In order to debug the algorithm properly I needed long and complex test expressions. And since you’d need full debug info of what’s currently in the output queue and the operator stack, and what the current state of the expression is, it quickly led to a nice little flood in my Compiler debug window.
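
For readers unfamiliar with the technique, the textbook way to convert an expression to postfix with an output queue and an operator stack is the shunting-yard algorithm. The Python sketch below shows that general algorithm for four arithmetic operators and parentheses; it is not the CoolBasic parser, and the token format (a list of strings) is an assumption.

```python
import operator

# symbol -> (precedence, function); all four are left-associative
OPERATORS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}


def to_postfix(tokens):
    output, stack = [], []  # the output queue and the operator stack
    for token in tokens:
        if token in OPERATORS:
            # Pop operators of equal or higher precedence before pushing.
            while (stack and stack[-1] in OPERATORS
                   and OPERATORS[stack[-1]][0] >= OPERATORS[token][0]):
                output.append(stack.pop())
            stack.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise SyntaxError("Unbalanced ')'")
            stack.pop()  # discard the matching "("
        else:
            output.append(float(token))
    while stack:
        top = stack.pop()
        if top == "(":
            raise SyntaxError("Unbalanced '('")
        output.append(top)
    return output


def evaluate_postfix(postfix):
    stack = []
    for item in postfix:
        if isinstance(item, str):  # an operator symbol
            right, left = stack.pop(), stack.pop()
            stack.append(OPERATORS[item][1](left, right))
        else:
            stack.append(item)
    return stack.pop()
```

Even in this toy form, debugging means watching two data structures at once, which gives a feel for how quickly the debug output grows once arrays, calls and member access are added.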

So I discarded the entire parser code, and started to think of new ways to check the validity of a value expression. After sprawling on the couch (once again) for a couple of hours, I came up with a solution. I’d continue doing things just like I had been doing the entire time for Pass#1. Even expressions can be broken into smaller pieces, and each piece can be validated separately. One error, and the entire expression fails. Sounds like yet another great use of recursion, huh?

If you read compiler literature you’d notice that there are mainly two kinds of commonly used algorithms (not including their sub-versions): the LL and LR parsers. They both scan expression terminals from left to right, and try to satisfy the requirements of an expression. The LL parsers do this by forming the entire expression as a product of the terminals and operators they come across, whereas the LR parsers try to simplify the expression down to a single product. These algorithms are also known as “Top-down” and “Bottom-up” parsers. Since recursion is basically like a stack, the easiest way for me was to use the Bottom-up parsing technique. It’s not by the book, but the basics are the same: every time a value is expected, a new recursive call is initiated. I’m not going to delve into details, but it appears to be working like a charm. Later on I also replaced the constant value expression parser.
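
The “new recursive call whenever a value is expected” idea can be illustrated with a tiny validity checker in Python. This is a guess at the general shape only, with a made-up operator set and token format. What the sketch shows is what compiler texts usually call recursive descent, which they file under top-down parsing; whether that label also fits the actual CoolBasic parser cannot be told from this entry.

```python
BINARY_OPERATORS = {"+", "-", "*", "/", "<", ">", "="}


def validate_expression(tokens):
    """Return True if tokens form: value (operator value)*, parentheses allowed."""
    position = 0

    def expect_value():
        nonlocal position
        if position >= len(tokens):
            raise SyntaxError("Value expected")
        token = tokens[position]
        position += 1
        if token == "(":
            expect_expression()  # a nested expression is validated recursively
            if position >= len(tokens) or tokens[position] != ")":
                raise SyntaxError("')' expected")
            position += 1
        elif token == "-":
            expect_value()       # unary minus applies to the value that follows
        elif not (token.replace(".", "", 1).isdigit() or token.isidentifier()):
            raise SyntaxError(f"Unexpected '{token}'")

    def expect_expression():
        nonlocal position
        expect_value()
        while position < len(tokens) and tokens[position] in BINARY_OPERATORS:
            position += 1
            expect_value()

    try:
        expect_expression()
    except SyntaxError:
        return False             # one error, and the entire expression fails
    return position == len(tokens)
```

The checker only answers valid or invalid; it does not build a tree or deal with operator precedence.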

So what this means is that unless I hit unexpected problems, Pass#1 is finally coming toward its end. I still have lots of parsers to implement, but most of them are very simple. I should now have almost all the bits of a statement implemented as parsers. What’s left is just a puzzle I need to put together from ready-to-use pieces.

Assigning stuff

C-languages (and the like) make a distinction between the “assign” operator and the “comparison: equal” operator, as opposed to BASIC-languages where both share the same symbol. The other assign operators such as “+=”, “-=”, “*=” and “/=”, in comparison, operate the same way in both languages – almost. In C-languages you can use assign operators pretty much anywhere within expressions because, unlike in BASIC-languages, they return the value that was just assigned. That’s why you can use “a=b=c=1” in C, and have all three variables assigned a value of 1. In BASIC-languages, where the symbol “=” defines the comparison operator, the same statement would equal the C-expression “a=(b==c==1)” which would result in a boolean value of True or False being assigned to variable a, depending on the values of b and c.

The problem in BASIC-languages is that the compiler can’t tell whether the user wanted to express an assignment or a comparison with the symbol “=”, except for the first one on the line; the rest are considered “comparison: equal”. BASIC-language parsers typically distinguish these two operators (that share the same symbol) by context: are we expecting a value here or not? Therefore, by design, BASIC-language assign operators don’t return a value. Unfortunately, for the sake of consistency, neither do any of the other assign operators: “+=”, “-=”, etc. Thus you can’t do loop auto increments like:

While ((myVar += 1) < 10)

In addition, since no value is produced by the assignment, you can’t have the operation applied before/after evaluation, like the “++” and “--” operators in C do (they can appear before or after a variable). Furthermore, “--” as a prefix has a double meaning: it can either be a decrement before evaluation, or a double unary minus. Too much noise here. And that’s why those operators are missing from most BASIC-languages including VB.NET – and they won’t make it to CoolBasic either. You will be able to use “i += 1”, but not within a value expression.
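
The “first one is an assignment, the rest are comparisons” rule can be sketched in a few lines of Python. The sketch assumes the input is the token list of a single assignment statement (so not an If or While line, where no assignment is expected at all); the tag names and the function are made up for the example.

```python
ASSIGN_OPERATORS = {"=", "+=", "-=", "*=", "/=", "<<=", ">>="}


def classify_operators(tokens):
    """Tag the first assign operator as ASSIGN; any later '=' is a comparison."""
    result = []
    assigned = False
    for token in tokens:
        if token in ASSIGN_OPERATORS and not assigned:
            result.append((token, "ASSIGN"))
            assigned = True
        elif token == "=":
            # A value is expected here, so '=' can only mean "comparison: equal".
            result.append((token, "EQUALS"))
        elif token in ASSIGN_OPERATORS:
            # Compound assign operators return no value, so they cannot nest.
            raise SyntaxError(f"'{token}' cannot appear within a value expression")
        else:
            result.append((token, "OTHER"))
    return result
```

For `a = b = c = 1` the three “=” tokens come out as ASSIGN, EQUALS, EQUALS, matching the `a=(b==c==1)` reading described above.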

In summary, these are the assign operators that will be implemented in CoolBasic V3:
"=", "+=", "-=", "*=", "/=", "<<=" and ">>=".

Edit 15.5.2009: Fixed typos.

Annoying syntactical exceptions

So I have been writing these Pass#1 statement parsers for a while now (a bit more than 20 more to go). Recently, I came across the Declare Function statement. It’s responsible for declaring a function from an outside resource – for example, a DLL file. The general guideline is that all statements should look the same as the corresponding statement in VB.NET. So I began the normal procedure I follow with every new statement parser, i.e. typing some sample code into a source file which will then be passed to the compiler. Now, the first thing to do is to enable the parser function in question and then create the parser skeleton for a quick OK check. I was surprised to notice that the compilation failed due to a syntax error before the code line even got passed to the main parser function. The VB.NET syntax is as follows:

Declare Function MyFunc Lib "System.DLL" Alias "_RealFunctionName" ( [params] ) As [type]

Now the general syntax rules define that a literal cannot appear just before an opening parenthesis. And well well, isn’t that a string literal with a parameter list right after it. At first I was like, WTF, how can I “fix” this in a non-ugly way without breaking the general syntax rule set? I had three choices, none of which seemed like a good way to go:

  1. Allow a string literal to precede an opening parenthesis
  2. Allow a string literal to precede an opening parenthesis only in a Declare statement
  3. Alter the syntax of the Declare statement

Definitely not Option 1. Option 2 seemed to be the best choice because I still didn’t want to alter the syntax of the statement. But that would have been an exception to the syntax rules. I don’t like exceptions in any program structure, because in the end, they make code maintenance a mess. I thought about the pros and cons of options 2 & 3 for a long time. I could have added a small keyword between the string literal and the parameter list. The resulting statement syntax could look something like this:

Declare Function MyFunc Lib "System.DLL" Alias "_RealFunctionName" Ansi ( [params] ) As [type]

…where “Ansi” could be, for example, a charset modifier (Ansi/Unicode/Auto). On the other hand, charset modifier is already an optional modifier in the VB.NET syntax; its place is right after the Declare keyword. CoolBasic won’t provide support for charset modifiers just yet (since it’s optional I can add it later on without breaking all existing CoolBasic source codes which would make users angry).

Another solution for Option 3 would be to define Alias as identifier instead of a string. The syntax would look something like this:

Declare Function MyFunc Lib "System.DLL" Alias _RealFunctionName ( [params] ) As [type]

It seemed like a good idea at first, but come to think of it, the solution would be logically wrong. By definition, all identifiers should be usable by the programmer. In this case, the Alias identifier would not be – or it would be an unintended identifier reserving a name without any real operational use. At this point I ran out of options. This isn’t best practice, but I decided to go for Option 2. Luckily I’ve designed the lexer to be quite flexible so I only needed a small change in one place, basically overriding a syntax error based on a simple condition on a few preceding line tokens.
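
What such an override might look like, as a rough Python sketch: the general rule rejects a string literal directly before “(”, unless the line is a Declare statement and the string follows Lib or Alias. The token representation, the function name and the exact condition (including allowing Lib, since Alias is optional in VB.NET) are assumptions, not the actual lexer code.

```python
def string_before_paren_ok(tokens, index):
    """tokens is a list of (kind, text) pairs for one line; tokens[index] is the '('."""
    if index == 0 or tokens[index - 1][0] != "STRING":
        return True  # no string literal before '(', so the rule is not involved
    # General rule: a literal may not directly precede '('.
    # Exception: the string belongs to a Lib/Alias clause of a Declare statement.
    in_declare = ("KEYWORD", "Declare") in tokens
    after_clause = index >= 2 and tokens[index - 2] in (
        ("KEYWORD", "Lib"),
        ("KEYWORD", "Alias"),
    )
    return in_declare and after_clause
```

Everything outside a Declare line still gets the ordinary syntax error, so the exception stays confined to one small condition.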

As a result, the syntax will be exactly like in VB.NET, as intended. Internally, the fix didn’t end up being as ugly as I had feared. Overall, I’m content with the outcome. Pass#1 continues…

CoolBasic Program Skeleton

The leap from a procedural language to an object oriented language is huge for both the designer (me) and the user. I have established the guidelines and practices for how CoolBasic programs need to be built, and I’m already seeing light at the end of the tunnel (which doesn’t, however, mean that I’m nearing the end). From the user’s point of view the change is so big that CoolBasic V3 can be considered a completely different language from what it used to be. And it is. You are going to be writing slightly more code to achieve the same results compared to the current CoolBasic (that never got past its Beta stage haha).

In OOP languages, all executing code must be written inside a procedure. I say procedure, because it’s a uniting term for Functions, Properties and Operators. This means that the main loop of a game must also be inside a Function. Since functions can only appear within the module context, you must define the containing module or class. You can’t call the starting code from outside, so the concept of Main Function needed to be implemented. What this means, is that you must always write Function Main() somewhere in your root level Modules. It operates as an entry point to your application. Only one Main Function can be defined. The concept is similar to .NET languages and C/C++.

It may seem like much, but in the end it isn’t. Sure you have to build all programs in a certain way based on a skeleton, but after that things get simpler. The following code demonstrates a simple HelloWorld program. It will require 11 lines of code whereas the old CoolBasic only required four lines. But you really can’t simplify things any more than that in the OOP world. This can be confusing to newcomer programmers, so perhaps when you create a new project in the CoolBasic code editor, this skeleton could be automatically typed in – for reference and an easy start.

Simple HelloWorld Example

Please note that the example program above is subject to change, and is most likely not final. Most of the example’s code lines belong to the frame you need to define. The main loop is still quite simple. First we open a windowed game screen of 800×600 dimensions. Then we loop two lines of code (text drawing and screen updating) until the Screen is closed. The biggest challenge is to memorize all new classes and their methods. A brand new, comprehensive and user-friendly manual will help with this, and perhaps I’ll implement some kind of an intellisense-prompt to the editor at some point.

A short update on CoolBasic V3 compiler progress: The above example fully passes the lexer, and soon the parser (Pass#1) as well. All in all, I need to implement the parsers in the same order as one would write code in a program; containers first and then the statements. It’s slow in the beginning due to the complex structure of, say, Function statements and such, but the rate should accelerate as I progress through the simpler statements towards the end. I recently finished the Function statement parser which is one of the largest and most complicated parsers, and I feel good about it; solid ground is the key to success. I expect to implement the Class and Interface parsers in the near future.

Parsing in Segments

Pass#1 is responsible for so many things it needs to be designed very carefully. I have come up with a plan about the remaining compiling phases: There will be 3 or 4 passes in total of which Pass#1 is the parser. Its main job is to compile the symbol table for the source code, and generate variances of the generic classes and functions. In short, it will prepare the source code for Pass#2 where the actual compilation of each procedure takes place. Of course, this involves making the code as clean as possible by removing modifiers and certain other structural “particles”.

Parsing in general can be quite a difficult task, but this time I’ve taken an entirely different approach. As we begin Pass#1 (for each line separately), some basic syntax checks have already been done. Therefore we can safely assume that the line always begins with one of the following:

  • Pointer literal
  • Identifier
  • Label
  • Full-line special data segment
  • Keyword

Each of these launches a mini-parser of its own. The mini-parsers include ParseKeywords(), ParseStatementValue(), ParseAssignment(), ParseParamList(), ParseDeclaration() and ParseModifiers(). They scan the terminals of the current line by segments, i.e. they group terminals based on which class they belong to. All this is done recursively, so in case of a failure the entire call stack rolls back, ultimately terminating the entire compiling process. The advantage of recursive parsing is that it allows flexible identification of the expressions that belong to a multi-part statement such as the For-statement.
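To make the idea more concrete, here is a minimal sketch in Python (not CoolBasic's actual implementation language or code). It only shows the dispatch on the first terminal, the grouping of the remaining terminals into segments, and how a single failure unwinds the whole call stack. The Terminal type, the kind names and the parse_* functions are all hypothetical stand-ins for the real mini-parsers named above.

```python
from itertools import groupby
from typing import NamedTuple


class Terminal(NamedTuple):
    kind: str  # hypothetical class names: "keyword", "identifier", "label", ...
    text: str


class ParseError(Exception):
    """Raised by any mini-parser; it propagates up through every caller."""


def segments(terminals):
    """Group consecutive terminals of the same class into segments."""
    return [(kind, [t.text for t in group])
            for kind, group in groupby(terminals, key=lambda t: t.kind)]


def parse_keyword(terminals):
    # A real parser would pick a statement-specific parser here.
    return ("statement", terminals[0].text, segments(terminals[1:]))


def parse_identifier(terminals):
    return ("assignment_or_call", terminals[0].text, segments(terminals[1:]))


def parse_label(terminals):
    return ("label", terminals[0].text, [])


# Which mini-parser to launch, based on how the line begins.
DISPATCH = {
    "keyword": parse_keyword,
    "identifier": parse_identifier,
    "label": parse_label,
}


def parse_line(terminals):
    if not terminals:
        raise ParseError("empty line")
    handler = DISPATCH.get(terminals[0].kind)
    if handler is None:
        raise ParseError(f"a line cannot begin with a {terminals[0].kind}")
    return handler(terminals)
```

Because every failure is an exception, no mini-parser needs its own error plumbing: whoever started the compilation catches ParseError once and stops.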

In addition, all built-in statements have their own parsers, which make sure that the keyword has been used within a proper declaration context and that statements defining a code block follow a valid hierarchical structure. Pass#1 makes as many syntax checks as possible, and the final compilation in Pass#2 will catch the rest.
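A context check of this kind can be sketched with a simple stack of open blocks. The Python below is only an illustration under my own assumptions: it looks at the first word of each line, knows just the For…Next and Repeat…Until pairs, and treats Break as valid only inside one of them. The function name check_blocks and the error messages are made up for the example.

```python
# Closing keyword -> the opening keyword it must match.
CLOSERS = {"Until": "Repeat", "Next": "For"}
OPENERS = set(CLOSERS.values())


def check_blocks(lines):
    """Raise SyntaxError if blocks are badly nested or Break is misplaced."""
    open_blocks = []  # stack of (opening keyword, line number)
    for lineno, line in enumerate(lines, start=1):
        words = line.split()
        if not words:
            continue
        word = words[0]
        if word in OPENERS:
            open_blocks.append((word, lineno))
        elif word in CLOSERS:
            # The innermost open block must be the matching opener.
            if not open_blocks or open_blocks[-1][0] != CLOSERS[word]:
                raise SyntaxError(
                    f"line {lineno}: {word} without matching {CLOSERS[word]}")
            open_blocks.pop()
        elif word == "Break" and not open_blocks:
            raise SyntaxError(f"line {lineno}: Break outside of a loop")
    if open_blocks:
        keyword, lineno = open_blocks[-1]
        raise SyntaxError(f"line {lineno}: {keyword} is never closed")
```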

CoolBasic V3 follows very sophisticated and strict syntactical rules similar to VB.NET. This means that in order to parse a simple Break-statement I must first establish the base for parsing modifiers, Modules, Functions, and Repeat…Until. In other words, I must first set up a lot of functions just to get the bare minimum running. Lots of code is required before I can even run the basic tests (for example on the scope list and the symbol table). Pass#1 is coming along nicely, though.

Compilation: Pass#1

Since I decided to add cross-reference support for CoolBasic (see blog entry “Highly Modular”), some major changes to the entire process were needed. Therefore, I’ve rewritten the lexer in preparation for Pass#1. Due to the nature of these changes, the way inner lists work also needed a review. I’m quite happy with the results. The lexing algorithms improved; for example, the assignment operator is now better distinguished from the comparison operator. The operator is now automatically cast to an assignment in contexts such as Optional parameters in a function declaration, variable declarations, and the For…Next assignment. This will ease further parsing in Pass#1 and Pass#2.

Pass#1 compiles the symbol table, i.e. a list of all identifiers. Basically, the compiler will scan for declaration statements and then create a collection of their names. Some secondary information such as path, relationship, block entry/end and signature is also gathered. Pass#1 can also check the integrity of the hierarchy as well as perform some initial syntax checks. We can’t create exact pointers to parent entities yet, because program structures can be declared in any order within the source. Other tasks include generating enums and stripping entity modifiers from the code. In general, Pass#1 is just a fast sweep and it doesn’t analyse the syntax in great depth. The deep scan belongs to Pass#2 and perhaps Pass#3.
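As a rough illustration of such a sweep, here is a small Python sketch. The source syntax it scans (Module/Class/Function blocks closed with "End ...") is an assumption made for the example, not the final CoolBasic V3 syntax, and build_symbol_table is a made-up name. Note that parents are recorded as path strings rather than pointers, which sidesteps the declaration-order problem.

```python
DECL_KEYWORDS = {"Module", "Class", "Function"}  # assumed container keywords


def build_symbol_table(lines):
    """One fast sweep: record name, kind, parent path and block entry/end."""
    table = {}
    open_paths = []  # paths of the blocks we are currently inside
    for lineno, line in enumerate(lines, start=1):
        words = line.split()
        if len(words) < 2:
            continue
        if words[0] in DECL_KEYWORDS:
            name = words[1].split("(")[0]  # drop a parameter list, if any
            parent = open_paths[-1] if open_paths else None
            path = f"{parent}.{name}" if parent else name
            table[path] = {"kind": words[0], "parent": parent,
                           "entry": lineno, "end": None}
            open_paths.append(path)
        elif words[0] == "End" and words[1] in DECL_KEYWORDS:
            # Hierarchy integrity: the closer must match the innermost block.
            if not open_paths or table[open_paths[-1]]["kind"] != words[1]:
                raise SyntaxError(f"line {lineno}: unexpected End {words[1]}")
            table[open_paths.pop()]["end"] = lineno
    if open_paths:
        raise SyntaxError(f"{open_paths[-1]} is never closed")
    return table
```

Statements inside the blocks are ignored entirely here; in this sketch, as in the plan above, anything deeper is left to the later passes.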

Constants are interesting because they can be used to define each other, for example Const a = 1 : Const b = a + 3. It is, however, possible to create a circular reference, which makes it impossible to evaluate the constant’s value. I can’t make an exception to the general reference resolver (requiring constants to always be declared before use) either. And being unable to use constants as part of another constant expression would be stupid, too. This means I have to implement a recursive search in the later passes, when the constants’ true values are pre-calculated. Pass#1 doesn’t yet do anything about the constants’ values anyway. In VB.NET constants aren’t really replacing values within the source; they’re constant variables, i.e. Static ReadOnly. They are, however, pre-evaluated at compile time to determine constant expressions and thus discard code blocks whose condition is false. This is where CoolBasic will differ: constants will be transformed into literal values, which means a speed increase at runtime.
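The recursive search could look roughly like the following Python sketch. It is not the CoolBasic implementation: constants are given as a name-to-expression dictionary, Python's own ast module stands in for the expression parser, and only +, - and * on numbers are supported. The names resolve_constants and CircularConstantError are invented for the example.

```python
import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class CircularConstantError(Exception):
    """A constant depends, directly or indirectly, on itself."""


def resolve_constants(definitions):
    """Turn {name: expression} into {name: literal value}."""
    values = {}
    in_progress = set()  # constants whose evaluation has started

    def resolve(name):
        if name in values:
            return values[name]
        if name not in definitions:
            raise NameError(f"undefined constant {name}")
        if name in in_progress:
            # We came back to a constant we are still evaluating.
            raise CircularConstantError(name)
        in_progress.add(name)
        tree = ast.parse(definitions[name], mode="eval")
        values[name] = evaluate(tree.body)
        in_progress.discard(name)
        return values[name]

    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return resolve(node.id)  # recurse into the referenced constant
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported constant expression")

    for name in definitions:
        resolve(name)
    return values
```

With this approach declaration order does not matter: resolve_constants({"b": "a + 3", "a": "1"}) finds a while evaluating b, whereas {"a": "b", "b": "a"} raises CircularConstantError instead of recursing forever.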

More about Pass#2 when I’m done with this…
