Bismuth reference manual

Version 0.1 draft
by Benedikt Bitterli

Introduction

Bismuth has been designed to be an easy-to-learn yet powerful programming language. Taking inspiration from existing languages in the BASIC family, the core principle of Bismuth is to relieve programmers of needless and unintuitive constraints while still allowing for as much control as possible.

This manual should serve as an introduction and a hopefully complete language reference for Bismuth.
Like most reference manuals, this document may be dry in some places and outdated in others. As the language is still in development, certain parts of this document may even undergo heavy changes in the future. Still, we try to keep this manual reliable and up-to-date as best as we can.

Note that this manual is not a programming tutorial. While the use of BNF and similar grammar description languages has been avoided, this document is still addressed to programmers experienced with other languages.

As of now, Bismuth does not directly compile to native code, but instead outputs C++ code as an intermediate step. This allows us to take advantage of the capabilities of today's C++ compilers for producing fast and reliable binaries as well as allowing the user to easily interface with existing C/++ code and libraries.

The Bismuth compiler is distributed as free software and may be used, modified and redistributed without restriction. Specific license terms are included with each Bismuth distribution.

Basic concepts

Identifiers

Identifiers are used to name variables, functions and classes inside a Bismuth program. Identifiers are chosen by the programmer and serve as a handle by which language constructs can be referred to.

Identifiers in Bismuth can be an arbitrarily long sequence of letters, digits, and underscores not beginning with a digit.

The following examples are valid identifiers:

i_am_an_identifier
FooBar
_superDerby1998

These are not:

99problems
I am a potato

The following keywords are used by the language itself and may not be used as identifiers:

and abstract bool byte case const continue class default delete do double eachin
else elseif end endextern endfunction endif endmethod endselect endclass exit
extends extern false field final float for forever function global if import int
local long method mod new next not null or object private protected ptr public
repeat return select self shl short shr step string super then to true try ubyte
uint ulong until ushort varptr void wend while

Note that Bismuth is case insensitive. The identifier foobar is equivalent to FooBar, FoObAr and so forth.
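
The identifier rules above can be condensed into a small checker. The following is a minimal sketch in Python and not part of Bismuth or its compiler. The names is_valid_identifier and same_identifier are made up for illustration, and the sketch assumes that "letters" means the ASCII letters A-Z and a-z, which this manual does not state explicitly.

```python
import re

# Reserved words, copied from the keyword list above and stored in lowercase.
KEYWORDS = frozenset("""
and abstract bool byte case const continue class default delete do double eachin
else elseif end endextern endfunction endif endmethod endselect endclass exit
extends extern false field final float for forever function global if import int
local long method mod new next not null or object private protected ptr public
repeat return select self shl short shr step string super then to true try ubyte
uint ulong until ushort varptr void wend while
""".split())

# Letters, digits and underscores, not beginning with a digit.
# Assumption: only ASCII letters count as letters.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_identifier(name):
    """Return True if name is well formed and not a reserved keyword."""
    if _IDENTIFIER.fullmatch(name) is None:
        return False
    # Keywords are compared in lowercase because Bismuth is case insensitive.
    return name.lower() not in KEYWORDS


def same_identifier(first, second):
    """Return True if both spellings refer to the same identifier."""
    return first.lower() == second.lower()
```

Under these assumptions, the three valid examples above are accepted, while 99problems, I am a potato and While (a keyword in different case) are rejected.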

Datatypes

When defining variables or functions, Bismuth requires you to specify a datatype for the construct in question. Datatypes denote the type of values which are contained in a variable or returned by a function.

There are ten primitive datatypes in Bismuth. For integers:

Byte:   8 bit signed integer
Short: 16 bit signed integer
Int:   32 bit signed integer
Long:  64 bit signed integer

UByte:   8 bit unsigned integer
UShort: 16 bit unsigned integer
UInt:   32 bit unsigned integer
ULong:  64 bit unsigned integer

For floating point numbers:

Float:  32 bit floating point
Double: 64 bit floating point

Datatypes may also consist of an identifier instead of one of the above keywords to refer to classes. Strings, which are used to store text, also fall into this category, as String is a special class provided by Bismuth and not a primitive datatype as such.

Additionally, each of these types may be augmented by a sequence of [] brackets and Ptr keywords, denoting the datatype as an array of or a pointer to a value, respectively.

Datatype declarations are read from right to left. Consider the following snippets:

Float Ptr         Pointer to a float
Float Ptr Ptr     Pointer to a pointer to a float
String[]          Array of strings
Float[][]         Array of arrays of floats
Long Ptr[]        Array of pointers to longs
Long[] Ptr        Pointer to an array of longs
Int Ptr[][,,]     Three dimensional array containing arrays of pointers to ints
Byte[] Ptr[]      Array of pointers to arrays of bytes

Don't worry if you don't fully understand these right now - complex datatypes such as the last two rarely occur in the wild.
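
To make the right-to-left reading rule concrete, here is a small sketch in Python that turns a datatype into an English description. It is an illustration only: describe_datatype is a hypothetical helper, not something the Bismuth compiler provides, and the wording of its output is deliberately mechanical.

```python
import re

# A base type name, followed by any number of "Ptr" or "[...]" suffixes.
_BASE = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*)")
_SUFFIX = re.compile(r"\s*(ptr|\[,*\])", re.IGNORECASE)


def describe_datatype(text):
    """Describe a datatype by reading its suffixes from right to left."""
    base = _BASE.match(text)
    if base is None:
        raise ValueError("expected a type name in %r" % text)

    # Collect the suffixes in the order they are written (left to right).
    suffixes = []
    position = base.end()
    while True:
        match = _SUFFIX.match(text, position)
        if match is None:
            break
        suffixes.append(match.group(1))
        position = match.end()
    if text[position:].strip():
        raise ValueError("unexpected text %r" % text[position:])

    # The rightmost suffix is the outermost type, so describe it first.
    parts = []
    for suffix in reversed(suffixes):
        if suffix.lower() == "ptr":
            parts.append("pointer to")
        else:
            dimensions = suffix.count(",") + 1
            if dimensions == 1:
                parts.append("array of")
            else:
                parts.append("%d-dimensional array of" % dimensions)
    parts.append(base.group(1))
    return " ".join(parts)
```

For example, describe_datatype("Long Ptr[]") gives "array of pointer to Long", whereas describe_datatype("Long[] Ptr") gives "pointer to array of Long".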

There is a last primitive datatype in Bismuth - Void. Void can only be used either as a function return type, in which case it denotes that the function does not return a value, or in the form of Void Ptr, denoting a pointer to data of any kind.

Variables

Variables are a program's way to store data during runtime. There are two basic kinds of variables in Bismuth: variables of local scope and variables of global scope.

Local and global variables are defined using the Local and Global keyword respectively:

Local Identifier1:Datatype
/* or */
Global Identifier2:Datatype

Where Identifier is the name of the variable and Datatype the corresponding datatype of the variable as described in the last section.

Variables may be assigned directly upon definition with

Local Identifier:Datatype = Expression
/* or */
Global Identifier:Datatype = Expression

Multiple variables can be defined in one statement by separating them with commas:

Local Identifier1:Datatype1 = Expression1, Identifier2:Datatype2 = Expression2, Identifier3:Datatype3 = ....

The assignment is, of course, optional. The syntax for the corresponding Global declaration is exactly the same - only replace the keyword.

The following are all valid variable definitions:

Local Foo:Byte
Local Bar:Int = 42
Local X:Int = 5, Y:Int = 24

Global Raspberry:Int, Pi:Float = 3.14159265

It should be noted that variables which aren't assigned upon definition will instead be initialized to a sensible default value. This is 0 for numerical types, "" for strings, the null reference for classes and the null pointer for pointers.
To be more specific, not specifying an initial value is equivalent to assigning Null to it; see the type conversion chapter for more information about the different meanings of Null.

Local variables may be defined anywhere in the program and only exist inside the block in which they have been defined. Local variables may only be used after they have been declared.

Global variables on the other hand may only be defined outside of functions. Unlike local variables however, global variables may be used anywhere in the program - even in code preceding definition!

Let's see some examples:

Local Foo:Int = 5
Print(Foo) //Prints 5
Print(Bar) //Error! Bar is undefined at this point
Local Bar:Float = 3.0

Print(GlBar) //Prints 0
Global GlBar:Int

Careful though! Even though globals may be used in code preceding their definition, they will only be assigned once execution reaches the definition. Consider this snippet:

Print(GlBar) //Valid code - but prints 0!
Global GlBar:Int = 42
Print(GlBar) //Prints 42

There is a construct that looks very similar to a variable declaration, namely the constant. As the name suggests, a constant stores a value that cannot be changed at runtime. Constants can (and have to) be assigned once and only once, namely in their definition. Similar to global variables, they can be used in the entire program, even in code preceding their definition. Unlike global variables however, constants may be defined anywhere in the program.
The only restriction is that the value assigned to a constant has to be a constant expression. Examples:

Const Pi:Float = 3.14159
Print(Pi) //Prints 3.14159

Print(TheAnswer) //Perfectly valid code! Prints 42
Const TheAnswer:Int = 42

TheAnswer = 43 // Error: Assigning to a constant is not allowed.

Note that the strange behaviour of constants and globals (being available before their definition) is for legacy reasons and is likely to change in future versions of Bismuth.

Functions

A function represents a block of code that can be called from anywhere in your program.

Functions are defined as follows:

Function Identifier:ReturnType(Parameters)
    /* Function statements go here */
End Function

ReturnType can be omitted, in which case it will default to Void, i.e. the function does not return a value.

Parameters has to be a possibly empty list of comma-separated variable declarations in the form of

Identifier1:Datatype, Identifier2:Datatype, ....

Calling and returning values from functions will be covered in later chapters.

Expressions

Expressions in Bismuth consist of literals and/or variables combined with unary or binary operators. Expressions always return a value and cannot stand on their own - they always have to be used within statements or declarations.

Arithmetic operators

As the name suggests, these operators combine their operands using standard arithmetic rules. These operators may only be used with numeric types, with the exception of +, which may be used with strings to perform concatenation.

Unary:

-  Unary minus

Binary:

^    Power-of operator
*    Multiplication
/    Division
+    Addition
-    Subtraction
Mod  Modulo (remainder of division)

Logical operators

Logical operators evaluate their operands to a boolean using standard conversion rules and then compare them according to specific rules.

Unary:

Not

Negates the operand, i.e. returns true if the operand is false and vice-versa.

Binary:

And

Logical and. Returns true only if both operands are true; returns false otherwise.

Or

Logical or. Returns true if at least one of the operands is true; returns false otherwise.

And and Or in Bismuth are short-circuited. This means that they don't evaluate their second operand if the result is already determined after evaluating the first. That is, And will skip evaluation of the second operand if the first operand is false; Or works similarly by ignoring the second operand if the first one results in true.

This is useful if the second operand has side-effects. Consider this example:

A <> Null And A.Foo = 5

If And always evaluated both operands, the program would crash if A was Null. Since this is not the case, And will not evaluate A.Foo if A <> Null is false, thus preventing a crash if A is indeed Null.
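
The short-circuit behaviour can be modelled in a few lines of Python. This is only a sketch of the semantics described above, not the code the Bismuth compiler generates; the helpers logical_and, logical_or, operand and the class Thing are hypothetical names, and the null reference is modelled as None.

```python
evaluated = []  # records which operands were actually evaluated


def operand(name, value):
    """Return value, noting that this operand has been evaluated."""
    evaluated.append(name)
    return value


def logical_and(first, second):
    """Model of `first And second`; both arguments are zero-argument callables."""
    if not first():
        return False  # result already determined, second is never called
    return bool(second())


def logical_or(first, second):
    """Model of `first Or second`; both arguments are zero-argument callables."""
    if first():
        return True  # result already determined, second is never called
    return bool(second())


class Thing:
    def __init__(self, foo):
        self.foo = foo


def foo_is_five(a):
    # Model of: A <> Null And A.Foo = 5
    # The field access only happens if the null check succeeded.
    return logical_and(lambda: a is not None, lambda: a.foo == 5)
```

Calling foo_is_five(None) returns False without ever touching the foo field, which mirrors how the Bismuth expression avoids a crash when A is Null.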

Relational / equality operators

These operators compare their operands and always return a boolean (True or False).

<   Less than
>   Greater than
<=  Less than or equal to
>=  Greater than or equal to
=   Equal to
<>  Not equal to

<> and = will work on all types, whereas the other operators require their operands to be of numeric type.

Bitwise operators

Bitwise operators access the bit patterns of their operands. They may only be used on integers.

Unary:

~  Negation; flips all bits of the target expression.

Binary:

&    Bitwise and
|    Bitwise or
~    Bitwise xor
Shl  Shift left
Shr  Shift right (logical for unsigned types, arithmetic for signed types)
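
The difference between a logical and an arithmetic right shift is easiest to see on a concrete bit pattern. The following Python sketch models Shr for a given bit width. It assumes two's complement representation for signed types and a shift amount between 0 and the bit width minus one, neither of which this manual spells out; the function name shr is chosen for illustration only.

```python
def shr(value, amount, bits, signed):
    """Model of `value Shr amount` for an integer type that is `bits` wide."""
    if not 0 <= amount < bits:
        raise ValueError("shift amount out of range for this sketch")

    mask = (1 << bits) - 1
    raw = value & mask  # bit pattern of the value, as an unsigned number

    sign_bit_set = (raw >> (bits - 1)) == 1
    if signed and sign_bit_set:
        # Arithmetic shift: the vacated high bits are filled with the sign bit.
        high_bits = mask & ~(mask >> amount)
        raw = (raw >> amount) | high_bits
        return raw - (1 << bits)  # convert the pattern back to a negative value

    # Logical shift: the vacated high bits are filled with zeros.
    return raw >> amount
```

With this model, shifting the Byte value -8 right by one yields -4, while shifting the UByte value 248 (the same bit pattern) right by one yields 124.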

Function calls

Function calls are of the form Expression(Parameters), where Expression evaluates to a function datatype and Parameters is a (possibly empty), comma-separated list of expressions to pass to the function.

If the function return datatype is Void, then it may not be called inside an expression, since expressions have to return values.

In case the function defines optional parameters, these may be omitted in the function call.

Assuming the following declarations:

Function A:Int(Foo:Float)
End Function

Function B:Float(Foo:Double, Bar:Int = 5)
End Function

these are correct function calls:

A(4.5)
B(1.5, 6)
B(2.5) //Bar defaults to 5

Index operator

The index operator [] allows access to values inside an array or values referenced by a pointer. Index expressions are of the form Expression[ExpressionList], where Expression evaluates to a pointer or an array and ExpressionList is a non-empty, comma-separated list of integer expressions.

For arrays, the number of elements of ExpressionList must match the dimension of the Expression operand. For pointers, ExpressionList must contain exactly one expression.

Assuming the following declarations:

Local A:Int[], B:Float[,,], C:Double Ptr

these are valid index expressions:

A[5]
B[2, 3, 4]
C[67]

VarPtr operator

VarPtr retrieves the address in memory of a particular expression. It may only be used with operands that are actually stored in memory, i.e. variables or array locations.

Assuming the following declarations:

Local A:Int, B:Float[,,], C:Foo
Class Foo
    Field Bar:Int
End Class

these are valid applications of VarPtr:

VarPtr A          //Returns an Int Ptr
VarPtr B[2, 5, 1] //Returns a Float Ptr
VarPtr B          //Returns a Float[,,] Ptr
VarPtr C.Bar      //Returns an Int Ptr

New operator

The New operator will instantiate a class or an array - that is, allocate memory for the object to inhabit, potentially run initialization code and return the newly created object.

For classes, the syntax of New is as follows:

New Identifier
/* or */
New Identifier(ParameterList)

Where Identifier is the identifier of a class and ParameterList is a (potentially empty), comma-separated list of expressions. New Identifier is shorthand syntax for New Identifier().

New will call the constructor of the class (if it has one) and pass ParameterList as arguments to it. As such, the number and type of the arguments must match the ones defined by the class constructor.
If the class does not define a constructor, New treats it as a class with a constructor with no arguments (that is, only New Identifier or New Identifier() are valid).

For arrays, the syntax of New is as follows:

New ArrayType[SizeList]

Where ArrayType is the datatype of the array's elements and SizeList is a non-empty, comma-separated list of expressions of integer type, one for each dimension.

Examples:

New Float[5]       //Float array with 5 entries
New Int[20, 10]    //2D Int array with 20x10 entries
New String[][][42] //An array of String[][] with 42 entries

Type cast operator

The type cast operator is of the form

Datatype(Expression)

where Datatype is a valid datatype as described in an earlier chapter and Expression has to be explicitly castable to Datatype.

The conversion follows standard type conversion rules. If it is a valid conversion, it will return a value of type Datatype.

Statements

Statements are a program's only way to actually execute code. A statement is the smallest standalone element of any programming language. A program consists of a sequence of one or more statements.

A statement is one of the following:

Assignment
Function call
Control statement
Execution redirection

A statement list in Bismuth consists of a sequence of statements separated by line breaks. Multiple statements may be written on one line by separating them with a semicolon (;). One statement may be spread over multiple lines by ending each line except the last with two periods (..).

The following statement lists are equivalent:

Print("Hello ")
Print("World!")

Print("Hello "); Print("World!")

Print( ..
    "Hello " ..
)
Print( ..
    "World!" ..
)

Writing multiple statements on one line may be necessary in some cases - for example, it's the only way to execute multiple statements in the ForStatements of a For-While-Do loop or inside a one-line If - but usually, this kind of code style should be avoided in favour of readability.

Assignments

Assignments in Bismuth are of the form

Expression1 = Expression2

They are used to assign values to a writable location.

Expressions on the left of the assignment operator (=) have to be addressable and writable.
For example, assigning to 5 + 3 is not allowed - it may be an expression, but the result is not addressable (i.e. it does not resolve to some location in memory). Similarly, assigning to a constant outside its declaration won't work either, since constants are not writable.

Function calls

Function call statements follow the exact same syntax and behaviour as function call expressions.

The only difference is that any value returned by the function is discarded. As such, functions of type Void can be called as well, in contrast to function call expressions, which do not allow this.

Control statements

Control statements alter the order in which statements are executed. They are extremely important tools of a programming language, and it is important to know all of them in order to write good code.

Control statements in Bismuth each create their own scope; that is, a new space for variables to live in. Variables defined inside a scope "vanish" once the corresponding code block is left; also, variables from an outer scope may be overwritten by an inner scope ("shadowing").

See an example:

Local A:Int = 6
If A <> 0 Then
    Local A:Int = 3 //Perfectly legal - "overwrites" the old A inside this scope
    Local B:Int = 7

    Print(A) //Prints 3
    A = 23
    Print(A) //Prints 23
EndIf
Print(A) //Prints 6
Print(B) //Error! B does not exist anymore

If statement

The If statement is used to conditionally execute blocks of code based on an expression. An If statement looks as follows:

If Expression1 Then
    StatementList
ElseIf Expression2 Then
    StatementList
ElseIf Expression3 Then
    StatementList
.....
Else
    StatementList
EndIf

The expressions will be evaluated in order until one is found that evaluates to True, in which case the matching list of statements will be executed. If none evaluate to true, the statements inside the Else block will be executed (if there is one).

The list of ElseIf blocks may be arbitrarily long and even be omitted completely. The Else block is optional.

To support readability and compact code in case of short expressions and statement lists, Bismuth supports a one line equivalent of a full If statement:

If Expression1 Then StatementList ElseIf Expression2 Then StatementList .... Else StatementList

The same rules as for the multiline version apply.

Select statement

The Select statement is used to select a statement list to execute from a labeled set. The syntax is as follows:

Select Expression
    Case Expression1
        StatementList
    Case Expression2
        StatementList
    ....
    Default
        StatementList
End Select

This is equivalent to the following If statement:

If Expression = Expression1 Then
    StatementList
ElseIf Expression = Expression2 Then
    StatementList
....
Else
    StatementList
EndIf

with the only difference that Expression is only evaluated once.

The Default case is optional.
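
The single evaluation of Expression is the one place where Select differs from the If chain, so it is worth modelling. The following Python sketch is an illustration under assumptions, not the compiler's implementation: select is a hypothetical helper, expressions and statement lists are passed as zero-argument callables, and case expressions are assumed to be evaluated in order until one matches, as the equivalent If statement suggests.

```python
def select(expression, cases, default=None):
    """Model of a Select statement.

    expression: callable producing the value to select on
    cases:      list of (case_expression, statements) pairs of callables
    default:    optional callable for the Default block
    """
    value = expression()  # the Select expression is evaluated exactly once
    for case_expression, statements in cases:
        if value == case_expression():
            return statements()  # only the first matching case runs
    if default is not None:
        return default()
    return None  # no case matched and there is no Default block
```

If the expression has side effects (for example a function call that reads input), the If chain would repeat them for every comparison, whereas this model performs them only once.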

For loop

The For loop is used to execute a code block a specific number of times. The syntax is as follows:

For Variable = FirstValue To LastValue Step StepValue
    StatementList
Next

The For loop first sets Variable to FirstValue. It then checks whether Variable is smaller than or equal to LastValue. If it is, the loop executes the statement list, increments Variable by StepValue and jumps back to the beginning of the loop (performing the check again). If the check fails, the statement list is skipped and execution resumes with the statement following Next. For a negative StepValue the comparison has to be reversed (greater than or equal to), as the countdown example below relies on.

Variable has to be an existing variable and be of numeric type.
StepValue has to be constant. It may be omitted, in which case it defaults to 1.

Alternative forms of the For loop are available:

Variable may be replaced by Local Variable:Datatype, in which case a new local variable will be introduced instead of having to use an existing one. To may be replaced by Until, in which case the last iteration of the loop will be skipped.

Consider the following examples:

//Prints all numbers from 0 to 10 inclusive
For Local I:Int = 0 To 10
    Print(I)
Next

//Prints all numbers from 0 to 9 (10 is excluded)
For Local I:Int = 0 Until 10
    Print(I)
Next

//Prints the numbers 0, 3, 6, 9
For Local I:Int = 0 To 10 Step 3
    Print(I)
Next

//Prints all numbers from 10 to 0
For Local I:Int = 10 To 0 Step -1
    Print(I)
Next
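
The loop rules can be summarised as a small Python generator that yields the values Variable takes. This is a sketch under two assumptions inferred from the examples above rather than stated by the manual: a negative StepValue reverses the direction of the comparison, and Until replaces the inclusive comparison with a strict one. The name for_loop_values is hypothetical.

```python
def for_loop_values(first, last, step=1, until=False):
    """Yield the values a Bismuth For loop variable would take."""
    if step == 0:
        raise ValueError("a step of 0 would never terminate")

    value = first
    while True:
        # The check is performed before every iteration, including the first.
        if step > 0:
            keep_going = value < last if until else value <= last
        else:
            # Assumption: counting down reverses the comparison.
            keep_going = value > last if until else value >= last
        if not keep_going:
            break
        yield value
        value += step
```

For instance, list(for_loop_values(0, 10, 3)) produces the values 0, 3, 6 and 9, matching the third example.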

While loop

The While loop will execute a code block while a certain condition holds true. It is of the form:

While Expression
    StatementList
Wend

Expression will be evaluated before executing the statement list. If it evaluates to True, the statement list will be executed; otherwise, it will be skipped and execution resumes with the next statement following Wend.

Repeat loop

The Repeat loop will execute a code block until a certain condition is true. It is of the form:

Repeat
    StatementList
Until Expression

Expression will be evaluated after executing the statement list, meaning that the statement list will be executed at least once. If it evaluates to True, execution leaves the loop and continues with the next statement following Until. Otherwise, the statement list will be executed again.

An additional version of Repeat is available:

Repeat
    StatementList
Forever

This will continue executing the statement list until the loop is broken manually with Exit or Return.

For-While-Do loop

The For-While-Do loop is an enhanced version of the For loop resembling for loops in other languages (such as C or Java). It is of the form

For ForStatements While WhileExpression Do DoStatement
    StatementList
Next

This is equivalent to the following construct:

ForStatements
While WhileExpression
    DoStatement
    StatementList
Wend

The DoStatement is optional and may be omitted completely (the Do however is mandatory).

Let's see an example:

//Computes and prints all Fibonacci numbers smaller than 1000
For Local A:Int, B:Int = 1 While B < 1000 Do Local C:Int = A + B
    A = B
    B = C
    Print(A)
Next

Currently the DoStatement is not that useful. Future versions of Bismuth may change the behaviour of the For-While-Do loop to act more like a For loop, executing the DoStatement after the statement list and adding special cases for Exit and Continue.
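
To see why the DoStatement currently runs before the statement list, it helps to write out the equivalent While construct by hand. The following Python sketch mirrors the Fibonacci example step by step; it collects the values in a list instead of printing them, and the function name fibonacci_below is made up for illustration.

```python
def fibonacci_below(limit):
    """Mirror of the For-While-Do Fibonacci example, returning the printed values."""
    printed = []

    # ForStatements: Local A:Int, B:Int = 1 (A is initialized to 0 by default)
    a, b = 0, 1

    # WhileExpression: B < limit
    while b < limit:
        # DoStatement: executed at the start of each iteration
        c = a + b

        # StatementList
        a = b
        b = c
        printed.append(a)  # stands in for Print(A)

    return printed
```

Calling fibonacci_below(1000) returns the sixteen values from 1 up to 987.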

Control redirection

Return

Return may be used inside a function to leave it prematurely. Additionally, it is also used to return values from a function, in which case it is used as Return Expression.

If the function is of type Void, specifying a return value is not allowed. Similarly, if the function is non-void, a return value has to be specified when Return is used.

Note that Return ignores all enclosing control statements and always leaves the function directly.

Exit/Continue

Exit and Continue may be used inside loops to change the loop behaviour.

Exit will leave the loop immediately and redirect execution to the next statement following the loop block.
Continue, on the other hand, will skip the rest of the statement list and jump directly to the end of the loop block, execute all potential loop checks and start a new iteration if the checks still hold.

Note that Exit and Continue act on the innermost enclosing loop block. Other enclosing control statements, such as If or Select, will be ignored.

Program structure

Bismuth programs are stored as plain text in *.bi files and consist of a list of statements and declarations.

Unlike languages like C, Bismuth allows statements to stand on their own, outside any function or class declaration. These are the statements that will be executed when the program is compiled and run, i.e. a program consisting of only declarations won't actually do anything.

Typically a program starts out with a list of imports followed by the program statements and, after that, all function and class declarations. This ordering is not mandatory, but recommended to support readability.

Note that the program statements form their own scope, so local variables declared on the program level are not accessible inside functions or classes:

Local Message:String = "Hello, world!"

Print(Message) // Prints "Hello, world!"
PrintMessage()

Function PrintMessage()
    Print(Message) // Error! Message not accessible from here
End Function

Type conversion

Every expression and variable in Bismuth has an associated datatype, which has been discussed in earlier chapters. As such, it is not unusual for operands, assignments or parameters to be of different datatypes. For these reasons, Bismuth provides a set of rules for implicit and explicit type conversion.

Implicit conversion rules

If the source expression of an assignment or a function parameter is of a different type than the target datatype, Bismuth will first try to do an implicit conversion of the source datatype. This means that in most cases, the source expression will silently be converted into the target datatype. If the implicit conversion succeeds, it behaves the same as performing an explicit conversion using the type cast operator.

Implicit conversion works if:

The source type is numeric and the target type is numeric (truncation may occur)
The source type is numeric and the target type is a string
The source type is a class and the target type is a parent of the class
The source type is a pointer and the target type is a void pointer
The target type is a boolean. The result of the conversion is equal to performing SourceExpression <> Null.
The source expression is Null. If the target type is
    numeric, the result is 0
    a pointer, the result is the null pointer
    a class, the result is the null reference
    a string, the result is ""
    an array, the result is []
The source type is a string and the target type is a Byte Ptr or a UByte Ptr, in which case the pointer to the first character of the string will be passed instead. This is out of convenience when calling C/++ functions.

In any other case, the implicit conversion will fail and the compiler will issue an error.
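
The Null and boolean rules interact: converting to a boolean compares the source against whatever Null means for the source type. The following Python sketch models just these two rules. The kind labels ("numeric", "pointer" and so on) and the function names are hypothetical, and the null pointer and null reference are both modelled as None.

```python
def null_value(kind):
    """Return the value Null converts to for the given kind of target type."""
    defaults = {
        "numeric": 0,
        "pointer": None,  # stands in for the null pointer
        "class": None,    # stands in for the null reference
        "string": "",
        "array": [],
    }
    if kind not in defaults:
        raise ValueError("unknown kind of type: %r" % kind)
    return defaults[kind]


def to_boolean(value, kind):
    """Model of converting value to a boolean: SourceExpression <> Null."""
    return value != null_value(kind)
```

Under this model, 0, "" and an empty array all convert to False, while any other value of the same kind converts to True.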

Explicit conversion rules

Explicit conversion can be used to override the type system used by the compiler. Explicit conversion is done using the type cast expression introduced earlier.

An explicit conversion will succeed if either the corresponding implicit conversion would succeed, or if:

The source type is a pointer or an integer and the target type is a pointer
The source type is a pointer and the target type is an integer
The source type is a class and the target type is a class. If the class instance in the source expression has a matching polymorphic attachment, it will be returned; otherwise, the result of the operation is Null.
The source type is a string and the target type is numeric.

In any other case, an error will be issued by the compiler.

Type coercion rules

When using operands of different types inside an expression, Bismuth employs a set of rules to determine the datatype of the result. In such a case, the following list of rules will be checked from top to bottom. The first matching rule determines the datatype of the result:

If either operand is a String, the result type will be a String
If either operand is a Double, the result type will be a Double
If either operand is a Float, the result type will be a Float
If either operand is a Long, the result type will be a Long
The result type will be an Int

The only exception to this rule is the power-of operator (^), the result of which will always be a double.

In any case, the result of the operation will be the same as if both operands had been explicitly converted to the result datatype before the operation.
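
Because the coercion rules are checked strictly from top to bottom, they translate directly into a lookup. The Python sketch below is one way to express them; result_type is a hypothetical helper, type names are passed as strings, and the operator is only inspected to handle the power-of exception.

```python
# Checked in this order; the first type found in either operand wins.
_PRIORITY = ["String", "Double", "Float", "Long"]


def result_type(left, right, operator="+"):
    """Return the datatype of `left operator right` under the rules above."""
    if operator == "^":
        return "Double"  # the power-of operator always yields a Double

    # Type names are compared in lowercase because Bismuth is case insensitive.
    operands = (left.lower(), right.lower())
    for candidate in _PRIORITY:
        if candidate.lower() in operands:
            return candidate
    return "Int"  # fallback when none of the rules above match
```

Note one consequence of the rules as written: combining two small integer types such as Byte and Short yields an Int, not the larger of the two operand types.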

Classes

Classes represent a collection of data and functions to operate on said data.

Classes are defined with the Class keyword. This will define only a class, which is a blueprint for creating instances of the class. Only instances will occupy memory and can actually be referenced.

Variables of an instance holding data are called fields, whereas functions of an instance are called methods. Both fields and methods of an instance are referred to as members.

The most basic class definition is of the form:

Class Identifier
    Field Identifier1:Datatype1
    Field Identifier2:Datatype2
    ...
End Class

This class can now be instantiated using the New operator and its fields accessed with the . operator. Take a look at these examples:

Class Foo
    Field A:Int
    Field Message:String
End Class

Local Instance1:Foo = New Foo //Instantiates Foo and assigns a reference to Instance1

Instance1.A = 25
Instance1.Message = "Hello World!"

Print(Instance1.Message) //Prints "Hello World!"

Foo.A = 34 //Error! Foo is a class and not an instance

As you've seen in the example above, the variable Instance1 has the datatype Foo. As explained in an earlier chapter, class names can be used as a datatype, denoting that the variable can contain a reference to an instance of the class.

Note that specifying a class type does not mean that the variable actually contains an instance. Bismuth passes instances around by reference. This means that multiple variables can reference the same instance, or that one variable may reference no instance at all. Variables of class type are initialized to the null reference, meaning that no instance is referenced by the variable. The reference can be changed through assignment.

Let's see a few examples:

Local Instance2:Foo

Print(Instance2.Message) //Runtime error! Instance2 does not reference any object - accessing Message will crash

Local Instance3:Foo = Instance1 //Instance3 and Instance1 now reference the same class instance
Print(Instance3.A) //Prints 25
Instance3.A = 10
Print(Instance1.A) //Prints 10

As the last line shows, references and instances are two different things. When the instance is changed through one reference, the change will be seen through all other references.
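
Readers who know a language with the same reference semantics may find a side-by-side comparison helpful. The Python sketch below reproduces the example above; it is an analogy only, with None standing in for the null reference and an AttributeError standing in for the Bismuth runtime error.

```python
class Foo:
    def __init__(self):
        self.a = 0         # numeric fields default to 0
        self.message = ""  # string fields default to ""


def reference_demo():
    """Two variables referencing one instance observe the same changes."""
    instance1 = Foo()
    instance1.a = 25

    instance3 = instance1  # copies the reference, not the instance
    instance3.a = 10       # changes the one shared instance

    return instance1.a, instance1 is instance3


def null_access_fails():
    """Accessing a member through a null reference is an error."""
    instance2 = None  # like Local Instance2:Foo without an assignment
    try:
        instance2.message
    except AttributeError:
        return True
    return False
```

Here reference_demo() returns (10, True): the change made through instance3 is visible through instance1 because both names refer to the same object.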

Methods

Next to defining fields, it is also possible to define methods in a class definition. A method definition looks the same as a function definition - just replace the Function keyword by Method and place the definition inside a Class-EndClass block.

Methods can be accessed using the . operator and called with the standard function call syntax. Consider the following example:

Class Bar
    Field Value:Float

    Method ChangeAndPrint(NewValue:Float)
        Value = NewValue
        Print("The value is: " + Value)
    End Method
End Class

Local B:Bar = New Bar
B.ChangeAndPrint(1.37)

The useful thing about Methods is that they have full access to all fields as if they were local variables. Inside a method, you will also have access to an implicitly defined Self, which always holds a reference to the current instance the method acts upon.

Constructors

There is a special kind of method called the constructor. It follows the same syntax as other methods, only that it has the same name as the class. Constructors are provided by the programmer and serve to initialize a class with sensible data. They also provide a bit of extra security: A class can only be instantiated by calling the constructor, therefore ensuring classes are always initialized when instantiated.

Constructors couple tightly with the New expression - let's see some examples:

Class ChocolateBar
    Field Price:Float

    Method ChocolateBar(Price:Float)
        //Another useful purpose of Self: access fields shadowed by parameters
        Self.Price = Price
    End Method
End Class

Class VendingMachine
    Field Contents:TList

    Method VendingMachine()
        //Ensure that Contents references a valid instance
        Contents = New TList
    End Method
End Class

Local Machine:VendingMachine = New VendingMachine()
//Machine.Contents now references a valid instance!

Local Bar:ChocolateBar = New ChocolateBar(3.50)

Local Foo:ChocolateBar = New ChocolateBar //Error! Must provide constructor with parameters

The exact syntax and semantics of the constructor are explained in more detail in the section on the New operator.

Static members

So far we have only looked at instance members, which belong to the instances of a class. Each instance has its own copy of members, which are independent of other instances - changing a field in one instance will leave the other instances untouched.

However, classes in Bismuth can also provide static members, which belong to the class rather than an instance of it. Regardless of how many instances of a class exist, there always exists only one set of static members.

Static fields are introduced using the Global keyword, static methods with Function. The syntax is exactly as you'd expect from the previous use of Global and Function - just the meaning is different.

What's useful about static members as opposed to just using Global and Function outside the class definition is that they are accessible (and modifiable) from all instances just like normal fields. Additionally, they allow for encapsulating all data and functionality required by a class right inside the class definition, instead of scattering required globals and functions inside the file.

Static members can be accessed from outside the class using the . operator just like you would with an instance member - only using the class name instead of the instance reference on the left hand side.

Let's see an example:

Class Apple
    Global AppleCount:Int

    Method Apple() //Apple constructor
        AppleCount = AppleCount + 1
    End Method

    Function PrintApples()
        Print("There are " + AppleCount + " apples!")
    End Function
End Class

Local A:Apple = New Apple
Local B:Apple = New Apple
Local C:Apple = New Apple

Apple.PrintApples()

If Apple.AppleCount > 2 Then Print("That's a lot of apples!")

Inheritance

Inheritance allows classes to extend the functionality of an existing class. It allows for code reuse and so-called subtype polymorphism, which will be explained later.

Classes can define a parent class (or "superclass") by using the Extends keyword in the class definition:

Class Polygon
End Class

Class Triangle Extends Polygon
End Class

In this case, Polygon would be the superclass and Triangle the subclass. The superclass may itself be a subclass of another class and so forth. Classes which don't define a superclass implicitly extend the Object class - thus, every class in Bismuth is either directly or indirectly a subclass of Object.

Subclasses inherit all members (that is, fields and methods) from their parent class, meaning subclasses can use members of the parent class as if they were defined in the subclass.

This is extremely useful if you have a handful of classes that share common attributes and code - just define a base class with all the common members and extend it! Continuing the example with polygons from before:

Class Polygon
    Field Kind:String
    Field CenterX:Float, CenterY:Float

    Method PrintCenter()
        Print("This polygon is a " + Kind + " and sits at " + CenterX + ", " + CenterY)
    End Method

    Method Polygon(X:Float, Y:Float)
        CenterX = X
        CenterY = Y
    End Method
End Class

Class Triangle Extends Polygon
    Method Triangle(X:Float, Y:Float)
        Super(X, Y)
        Kind = "Triangle"
    End Method
End Class

Class Square Extends Polygon
    Method Square(X:Float, Y:Float)
        Super(X, Y)
        Kind = "Square"
    End Method
End Class

Local T:Triangle = New Triangle(5, 3.5)
Local S:Square = New Square(1.3, 2)

T.PrintCenter() //Prints "This polygon is a Triangle and sits at 5, 3.5"
S.PrintCenter() //Prints "This polygon is a Square and sits at 1.3, 2"

One could continue this with pentagons and hexagons, but you get the idea.

One interesting thing to note is that constructors are not inherited: If Triangle for example didn't define a constructor, the expression New Triangle(5, 3.5) would be invalid, even though Polygon defines a constructor with two parameters.
You can use Super to explicitly call the constructor of the superclass from inside a constructor of the subclass to ensure the inherited members are properly initialized.
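
To make the point about constructors concrete, here is a small sketch. The Pentagon class is a made-up addition to the polygon example and is not part of the original code:

Class Pentagon Extends Polygon
    //No constructor defined here - Polygon's constructor is not inherited
End Class

Local P:Pentagon = New Pentagon(5, 3.5) //Error! Pentagon has no constructor taking two parameters

Adding a constructor that forwards to Super(X, Y), as Triangle and Square do, makes the expression valid.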

Subtype polymorphism

Don't be scared by the buzzwords in the title - subtype polymorphism is actually really easy! It is one of the most useful properties of class inheritance, so it's important to understand the concept.

Polymorphism in this context means that an instance of a subclass may be given everywhere an instance of the superclass is expected. Example (continued from above):

Local Poly1:Polygon = T //Assigning a Triangle to a Polygon - no problem!
Local Poly2:Polygon = S //Same!

Poly1.PrintCenter()
Poly2.PrintCenter()

Print AddPolyCenterX(T, S) //Parameters expect a Polygon, but we give Triangles/Squares instead!

Function AddPolyCenterX:Float(A:Polygon, B:Polygon)
    Return A.CenterX + B.CenterX
End Function

This makes inheritance even more useful! In addition to being able to reuse code from the superclass in all subclasses, we can now also reuse all code dealing with the superclass.

Note that casting to a superclass is not a one-way street. With an explicit typecast, we can get our original subclass back:

Local Tri:Triangle = Triangle(Poly1) //Works, because Poly1 was originally a Triangle
Local Sq:Square = Square(Poly1) //Doesn't work - Poly1 is not a Square. Typecast returns Null

Local Tri2:Triangle = Poly1 //Error! Need explicit typecast

A reference can only be cast to a class type if it has the corresponding "polymorphic attachment". The polymorphic attachments of a reference are the class of the original instance and all of its superclasses. In the case of Poly1, those would be Triangle, Polygon and Object.
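
As a sketch of how this plays out, the following continues the example from above (T is the Triangle created earlier). It assumes that references can be compared against Null using =, which is not shown elsewhere in this chapter:

Local Obj:Object = T //Fine - every class is a subclass of Object

Local Tri3:Triangle = Triangle(Obj) //Works - Triangle is among the polymorphic attachments of Obj
Local Sq2:Square = Square(Obj) //Square is not - the typecast returns Null

If Sq2 = Null Then Print("Obj is not a Square!")

Checking the result of a typecast against Null like this is one simple way to find out what kind of instance a reference points to.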

Method overriding

Overriding methods is another feature to make class inheritance more powerful. It allows subclasses to override methods of the superclass by redefining them. Combined with subtype polymorphism, this allows for really neat code.

Extending the examples from above:

Class Polygon
    Method Area:Float()
        Return 0.0
    End Method
End Class

Class Square Extends Polygon
    Field SideLength:Float

    Method Area:Float()
        Return SideLength^2
    End Method

    Method Square(SideLength:Float)
        Self.SideLength = SideLength
    End Method
End Class

Class Triangle Extends Polygon
    /* ... etc .... */
End Class

Local P1:Polygon = New Square(5.0) //Works because of polymorphism
Local P2:Polygon = New Polygon

Print "Area of polygon 1: " + P1.Area() //Prints 25.0!
Print "Area of polygon 2: " + P2.Area() //Prints 0.0!

As you can see, a method call on a Square instance will always call the overriding method - even if the reference seemingly points to a Polygon instance!

Sometimes, you need to call the original implementation of a method from a subclass. Just calling the method by its name will only ever call the overriding version - what to do? Luckily, the Super keyword allows us to access members of the superclass directly, disregarding the overriding versions. Extending the above example:

Class Square Extends Polygon
    /* ... same as above ... */

    Method PrintArea()
        Print(Area()) //Prints 25.0
        Print(Super.Area()) //Prints 0.0!
    End Method
End Class

Overriding methods allows us to define behaviour for all subclasses in the superclass by specifying the methods we want and a generic implementation that works for most subclasses, and then overriding that implementation in the subclass where appropriate.

When a generic implementation doesn't make much sense, but you still want to provide methods for subclasses to override, you can use the Abstract keyword:

Class Polygon
    Method Area:Float() Abstract
End Class

This is similar to making an empty method (i.e. with no statements inside), with the difference that subclasses have to override the method with an implementation of their own.
That is, instantiating a class or any of its subclasses will fail if there is at least one abstract method which has not been overridden by an actual implementation (thus, New Polygon would not work anymore). This ensures that methods provided by a superclass actually have a meaningful implementation.
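
A short sketch of these rules, reusing the abstract Polygon class from above. The SideLength field and the Blob class are made up for illustration:

Class Square Extends Polygon
    Field SideLength:Float

    Method Area:Float() //Overrides the abstract method
        Return SideLength*SideLength
    End Method
End Class

Class Blob Extends Polygon
    //Does not override Area
End Class

Local P1:Polygon = New Polygon //Error! Area is abstract in Polygon
Local P2:Polygon = New Blob //Error! Blob doesn't override Area either
Local P3:Polygon = New Square //Works - Square provides an implementation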

If you want to disallow instantiation of a class, you can also use Abstract on the class itself:

Class Polygon Abstract
End Class

Now, only subclasses of Polygon may be instantiated, but not Polygon itself.

There is also the Final keyword, which may be used on methods and classes. When it is used on a method, it forbids any subclass from overriding that method. When used on a class, it prevents other classes from extending it.
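
As an illustration, here is a sketch of Final on a method and on a class. It assumes that Final is written after the method signature, in the same position as Abstract - the exact placement for methods is not shown in this chapter. All class and method names are made up:

Class Shape
    Method Describe:String() Final
        Return "shape"
    End Method
End Class

Class Circle Extends Shape
    Method Describe:String() //Error! Describe is Final in Shape
        Return "circle"
    End Method
End Class

Class Sealed Final
End Class

Class Broken Extends Sealed //Error! Sealed is Final
End Class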

Using Abstract and Final together on a class makes sense for utility classes which only contain static members:

Class Math Abstract Final
    Function Sin(Angle:Float)
        ....
    End Function

    ....
End Class

Instantiating such classes does not make sense, so the Abstract keyword is used. To prevent evasion of this restriction by subclassing, Final is used as well.

Visibility

As a last feature, classes can control the visibility of their members. So far, members of a class were accessible from all parts of the program - inside the class itself, in subclasses and in code outside of the class. Sometimes, this is useful - most of the time, it isn't.

For this reason, Private, Protected and Public can be used to control who has access to which members of a class.

They are used like this:

Class Foo
    ....
Private
    .....
Protected
    .....
Public
    ....
End Class

The visibility mode in classes starts out as public and can then be changed with the corresponding keywords. Member declarations receive the current visibility mode. Visibility can be changed as many times as you want and in any order.

Members with Public visibility are accessible everywhere and act the way described previously in this chapter - which makes sense, since visibility of members is public unless you explicitly change it!
Members with Protected visibility are only accessible within the class and subclasses of it.
Members with Private visibility are only accessible within the class that defined them - subclasses won't be able to see them! This also affects inheritance - private methods cannot be overridden.

Examples:

Class Polygon
    Method PrintInfo()
        //Accessing CenterX/Y is fine here - they are defined in this class
        Print("This is a " + Name + " at " + CenterX + ", " + CenterY + " and is " + Area() + " big!")
    End Method
Private
    Field CenterX:Float, CenterY:Float
Protected
    Field Name:String

    Method Area:Float() Abstract
Public //A second public section - perfectly fine!
    Method Polygon(X:Float, Y:Float)
        CenterX = X
        CenterY = Y
    End Method
End Class

Class Square Extends Polygon
    Method Square(X:Float, Y:Float, Size:Float)
        Super(X, Y) //Have to use constructor! Can't access CenterX/Y from here
        Self.Size = Size
        Name = "square"
    End Method
Protected
    Method Area:Float()
        Return Size*Size
    End Method
Private
    Field Size:Float
End Class

Local S:Square = New Square(10.0, 10.0, 3.0)
S.PrintInfo()

Print(S.Area()) //Error! Can't access Area from here - it's Protected
Print(S.CenterX) //Error! CenterX is Private!

If member visibility is new to you, you might be wondering why this is useful. The main reason visibility was introduced is to protect class integrity and therefore prevent bugs. Even if you program on your own and only use classes you yourself designed, it's important nonetheless to sit down and think about which functionality should be exposed to other code and which shouldn't.

When to use which visibility? There are a handful of rules for each:

Use Private for members that are irrelevant for users of the class (e.g. helper methods, internal variables etc.).
Use Private for fields that need either validation or some extra processing. Provide public Set/Get methods to allow read/write access through a safe environment.
Use Protected when Private would be appropriate, but the member is also relevant for subclasses. For example, shared functionality in the base class that is irrelevant for a user of the class, but important for implementations of subclasses (e.g. Area in the previous example).
Use Public only for members that are useful to a user of the class, and only for fields that don't need validation. Setting everything as public is convenient, but error-prone - if a member is public, it will be used sooner or later, and usually in a way that endangers class integrity (by avoiding validation checks and accidentally setting a reference to Null, for example).
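
The second rule might look like this in practice. This is only a sketch: the SetPrice/GetPrice names and the validation rule are made up for illustration, and it assumes that an assignment may follow a single-line If ... Then:

Class ChocolateBar
    Method ChocolateBar(Price:Float)
        SetPrice(Price)
    End Method

    Method SetPrice(NewPrice:Float)
        //Validation: ignore prices that aren't positive
        If NewPrice > 0.0 Then Price = NewPrice
    End Method

    Method GetPrice:Float()
        Return Price
    End Method
Private
    Field Price:Float
End Class

Local Bar:ChocolateBar = New ChocolateBar(3.50)
Bar.SetPrice(-1.0) //Rejected - the price stays at 3.50
Bar.Price = -1.0 //Error! Price is Private

Since all writes go through SetPrice, the check cannot be bypassed from outside the class.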

Garbage collection

As seen in previous chapters, Bismuth programs can create new instances using New. These instances occupy memory, naturally, so over time, a program will use more and more memory. But how to free memory that isn't needed anymore?

Bismuth has two ways to deal with this problem: Automatic garbage collection and manual memory management.

As you should know by now, a program can hold references to instances it created. Access to an instance is done using a reference to it. If there is no reference left to an instance, this means that it is unreachable for the program - without a reference, there's no way the program will ever be able to access it again!
For this reason, the Bismuth compiler includes a garbage collector as part of your program. It will lie dormant most of the time, but wake up periodically and scan the program for instances that don't have any references left pointing to them, removing them from memory automatically.
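
For example, an instance becomes unreachable as soon as the last reference to it is overwritten. A minimal sketch, using the TList class from earlier:

Local List:TList = New TList //First instance, referenced by List
List = New TList //List now references a second instance

//Nothing references the first TList anymore - the garbage collector
//will remove it at some point in the future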

As a programmer, you usually won't have to worry about the garbage collector - most of the time you won't even notice it's there! However, it should be noted that unreachable instances are not removed right away - the garbage collector will only kick in every so often, and even then it may not remove all unreachable instances at once. Sometimes, that is not what you want. Say you allocated a large instance, but only need it temporarily and want to get rid of it as soon as you are done with it to save memory. In that case, you may choose to manually remove that instance from memory using the Delete statement:

Function HeavyLifting()
    Local Temp1:ReallyLargeClass = New ReallyLargeClass
    Local Temp2:Float[,,] = New Float[1024, 1024, 1024]

    /*  do something with it */

    /* And now get rid of it immediately */
    Delete Temp1
    Delete Temp2
End Function

A fair word of warning: Using Delete is dangerous if you don't know what you're doing. It should really only be used when it is absolutely necessary.
The reason is that Delete does not remove references to the instance being deleted - in our previous example, Temp1 and Temp2 would not be equal to Null, but point to a now unused chunk of memory. There is no way of telling whether the instance a reference points to has been removed with Delete, so you could end up accidentally modifying a deleted instance, leading to memory corruption or random crashes.
Therefore: Only use Delete if you're absolutely sure that no other code could end up accidentally trying to use the instance you're about to delete.
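
The following sketch shows how easily this can go wrong. ReallyLargeClass is the placeholder class from the example above, and Value is a made-up field:

Local Big:ReallyLargeClass = New ReallyLargeClass
Local Alias:ReallyLargeClass = Big //Second reference to the same instance

Delete Big
Big = Null //Clears this one reference...

Alias.Value = 5 //...but Alias still points to the deleted instance - memory corruption!

Setting a reference to Null right after deleting it helps a little, but it does nothing for other references to the same instance.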

Import

Once a program gets longer and longer, it's quite useful to split it up into multiple files to keep an overview and to make compilation faster.

The Import statement allows you to import other code files into your code. It is used like this:

Import "Path/To/File.bi"

Semantically, there is no difference between importing a code file and copy-pasting it at the end of the file that imports it. However, compilation times will be vastly different. If a code file didn't change since the last compilation, the compiler will use a compiled version cached on disk, ensuring short compilation times even in large projects - useful!
Note that imported code files may themselves import other files, leading to arbitrarily nested import chains. The compiler is robust enough to deal with cyclic imports; any file will be imported at most once.
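
As a sketch, consider two made-up files that import each other:

//A.bi
Import "B.bi"

Function Hello()
    Print("Hello!")
End Function

//B.bi
Import "A.bi"

Function Greet()
    Hello() //Defined in A.bi
End Function

When compiling A.bi, the compiler imports B.bi, which in turn asks for A.bi again. Since any file is imported at most once, this does not lead to an endless loop, and both files end up in the program exactly once.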

Import does not only work with *.bi files - the compiler recognizes most other file types that are useful when building a program. It will check the file extension, match it against a rule in the list below and take appropriate action. If the code being compiled is called Foo.bi, then, depending on the file extension of the import:

*.c, *.cpp, *.cxx: The file will be compiled separately and linked statically with Foo.o
*.h, *.hpp, *.hxx: The file will be included in Foo.h
*.o: The file will be linked statically with Foo.o
*.a, *.so: The file will be linked with the final executable
No extension: The file will be treated as a library import (e.g. Import "opengl32" would link the final executable with the OpenGL library).
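
A few examples of what this looks like in practice. All file names except opengl32 are made up:

Import "Helpers.bi" //Bismuth code
Import "fastmath.cpp" //Compiled separately, linked statically with Foo.o
Import "fastmath.h" //Included in Foo.h
Import "libpng.a" //Linked with the final executable
Import "opengl32" //No extension - treated as a library import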

Import can also be used to import modules, which will be explained in the next chapter.

Modules

Bismuth is a modular language, allowing you to encapsulate often-used functionality in a unit called a module. A module is just a code file - possibly importing other modules or code files - which, upon import, will make its globals, functions, classes and constants available to the code that uses it.

Modules live inside the /mod/ folder inside the root directory of the compiler. The Bismuth compiler will scan this folder and its subfolders at startup. If a folder contains a *.bi file with the same name as the folder, it will be recognized as a module and loaded into the module tree.
Modules may contain modules themselves (so-called submodules), nesting arbitrarily deep. This allows grouping modules with similar functionality together.

Modules can be imported into Bismuth code using the Import statement. For example, Import MyModule will look for a folder called MyModule inside the main module folder and, if it exists, look for a file "MyModule.bi" inside it and load it.
You can import submodules by separating module names with . - e.g. Import MyModule.MySubmodule. Additionally, using the wildcard * will import all submodules of a module (Import MyModule.*, for example).
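
As a sketch, assume the module folder contains the following made-up modules:

mod/
    MyModule/
        MyModule.bi
        MySubmodule/
            MySubmodule.bi
        OtherSubmodule/
            OtherSubmodule.bi

They could then be imported like this:

Import MyModule //Loads mod/MyModule/MyModule.bi
Import MyModule.MySubmodule //Loads mod/MyModule/MySubmodule/MySubmodule.bi
Import MyModule.* //Imports all submodules of MyModule

The folder layout for submodules is an assumption based on the description above: each submodule sits in its own folder inside the folder of its parent module.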

Importing a module is very similar to importing a code file. One of the advantages is that modules live in a directory the compiler knows about - there's no need to specify a full path to a code file or even copy it close to the code that uses it just to get shorter imports. Modules also encourage code reuse and make it easy to share code with others.
One big difference is that modules are precompiled. If the compiler stumbles upon an outdated build of a module (or an uncompiled module), it will warn you about it - but not compile it! This has to be done by the user beforehand.

All of the important language features - Strings, Arrays, Object, the garbage collector etc. - live in a module called Bismuth. For convenience, this module and its submodules are automatically imported in every Bismuth file (this is equivalent to adding Import Bismuth.* at the beginning of each code file). This module should not be changed or extended by new submodules - if you have suggestions or think you found a bug, notify the developers instead so everyone can benefit.

Interfacing with C/++: Extern

Although Bismuth can do a lot of things C/++ can, you may find yourself in a situation where you'd rather use an existing library written in C/++ than write your own implementation. For this reason, the Extern statement has been introduced to make the compiler recognize functions, classes and variables declared in other languages.

The syntax of Extern is simple - just write normal Function/Class/Global declarations inside an Extern-End Extern block as you normally would, except omitting any implementation.

This example:

Extern
    Function Foo:Float()
    Function Bar:Int(A:Int)

    Global Scootaloo:Foobar

    Class Bloomberg
        Method Apple()
        Method Bloom:Int()
    End Class
End Extern

Would describe the following .hpp header:

float Foo();
int Bar(int A);

class Bloomberg {
public:
    void Apple();
    int Bloom();
};

extern Foobar* Scootaloo;

Note that declarations inside an Extern block don't have to be complete - for example, specifying only the members of a class you need and not all members there are is absolutely fine. Also, defining C/++ macros as a Function in Bismuth is no problem.
Just remember to Import the header that contains all the constructs you declared inside the Extern block and the *.c/*.cpp/*.cxx file that actually implements them!
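
Putting it together, here is a sketch of how part of the example above might be used. The file names Ponies.hpp and Ponies.cpp are made up, and it is assumed that they declare and implement Foo and Bar:

Import "Ponies.hpp" //Declarations
Import "Ponies.cpp" //Implementation

Extern
    Function Foo:Float()
    Function Bar:Int(A:Int)
End Extern

Print(Foo())
Print(Bar(3))

Only the two functions that are actually used are declared here, which is fine since Extern blocks don't have to be complete.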

It is important to know that Bismuth doesn't actually have a way of determining whether the Extern block is correct, that is, whether the declarations match the ones found in the corresponding C/++ source. This means that if they are incorrect, you will have to decipher the compiler error given by the C/++ compiler, which may involve having to take a look at code generated by the Bismuth compiler.
This is a bit of a hassle, but there is no way around it without having to implement a full C/++ compiler inside the Bismuth compiler, which is not an easy task given the syntax of the C/++ language.