Version 0.1 draft
by Benedikt Bitterli
Bismuth has been designed with the purpose of being an easy to learn, yet powerful programming language. Taking inspiration from existing languages in the BASIC family, the core principle of Bismuth is to relieve programmers of needless and unintuitive constraints while still allowing for as much control as possible.
This manual should serve as an introduction and a hopefully complete language reference for Bismuth.
Like most reference manuals, this document may be dry in some places and outdated in others. As the language is still in development, certain parts of this document may even undergo heavy changes in the future. Still, we try to keep this manual reliable and up-to-date as best as we can.
Note that this manual is not a programming tutorial. While the use of BNF and similar grammar description languages has been avoided, this document is still addressed to programmers experienced with other languages.
As of now, Bismuth does not directly compile to native code, but instead outputs C++ code as an intermediate step. This allows us to take advantage of the capabilities of today's C++ compilers for producing fast and reliable binaries as well as allowing the user to easily interface with existing C/++ code and libraries.
The Bismuth compiler is distributed as free software and may be used, modified and redistributed without restriction. Specific license terms are included with each Bismuth distribution.
Identifiers are used to name variables, functions and classes inside a Bismuth program. Identifiers are chosen by the programmer and serve as a handle by which language constructs can be referred to.
Identifiers in Bismuth can be an arbitrarily long sequence of letters, digits, and underscores not beginning with a digit.
The following examples are valid identifiers:
i_am_an_identifier
FooBar
_superDerby1998
These are not:
99problems
I am a potato
The following keywords are used by the language itself and may not be used as identifiers:
and abstract bool byte case const continue class default delete do double eachin
else elseif end endextern endfunction endif endmethod endselect endclass exit
extends extern false field final float for forever function global if import int
local long method mod new next not null or object private protected ptr public
repeat return select self shl short shr step string super then to true try ubyte
uint ulong until ushort varptr void wend while
Note that Bismuth is case insensitive. The identifier foobar is equivalent to FooBar, FoObAr and so forth.
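A minimal sketch of what case insensitivity means in practice. It only uses constructs described elsewhere in this manual; the variable name Counter is made up for illustration:

```
Local Counter:Int = 5
COUNTER = counter + 1 //All three spellings refer to the same variable
Print(Counter) //Prints 6
```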
When defining variables or functions, Bismuth requires you to specify a datatype for the construct in question. Datatypes denote the type of values which are contained in a variable or returned by a function.
There are ten primitive datatypes in Bismuth. For integers:
Byte: 8 bit signed integer
Short: 16 bit signed integer
Int: 32 bit signed integer
Long: 64 bit signed integer
UByte: 8 bit unsigned integer
UShort: 16 bit unsigned integer
UInt: 32 bit unsigned integer
ULong: 64 bit unsigned integer
For floating point numbers:
Float: 32 bit floating point
Double: 64 bit floating point
Datatypes may also consist of an identifier instead of one of the above keywords to refer to classes. Strings, which are used to store text, also fall into this category, as String is a special class provided by Bismuth and not a primitive datatype as such.
Additionally, each of these types may be augmented by a sequence of [] brackets and Ptr keywords, denoting the datatype as an array of or a pointer to a value, respectively.
Datatype declarations are read from right to left. Consider the following snippets:
Float Ptr          Pointer to a float
Float Ptr Ptr      Pointer to a pointer to a float
String[]           Array of strings
Float[][]          Array of arrays of floats
Long Ptr[]         Array of pointers to longs
Long[] Ptr         Pointer to an array of longs
Int Ptr[][,,]      Three dimensional array containing arrays of pointers to ints
Byte[] Ptr[]       Array of pointers to arrays of bytes
Don't worry if you don't fully understand these right now - complex datatypes such as the last two rarely occur in the wild.
There is one last primitive datatype in Bismuth: Void. Void can only be used either as a function return type, in which case it denotes that the function does not return a value, or in the form of Void Ptr, denoting a pointer to data of any kind.
Variables are a program's way to store data during runtime. There are two basic kinds of variables in Bismuth: variables of local scope and variables of global scope.
Local and global variables are defined using the Local and Global keywords respectively:
Local Identifier1:Datatype
/* or */
Global Identifier2:Datatype
Where Identifier1 and Identifier2 are the names of the variables and Datatype is the corresponding datatype of each variable as described in the last section.
Variables may be assigned directly upon definition with
Local Identifier:Datatype = Expression
/* or */
Global Identifier:Datatype = Expression
Multiple variables can be defined in one statement by separating them with commas:
Local Identifier1:Datatype1 = Expression1, Identifier2:Datatype2 = Expression2, Identifier3:Datatype3 = ....
The assignment is, of course, optional. The syntax for the corresponding Global declaration is exactly the same - only replace the keyword.
The following are all valid variable definitions:
Local Foo:Byte
Local Bar:Int = 42
Local X:Int = 5, Y:Int = 24
Global Raspberry:Int, Pi:Float = 3.14159265
It should be noted that variables which aren't assigned upon definition will instead be initialized to a sensible default value. This is 0 for numerical types, "" for strings, the null reference for classes and the null pointer for pointers.
To be more specific, not specifying an initial value is equivalent to assigning Null to it; see the type conversion chapter for more information about the different meanings of Null.
Local variables may be defined anywhere in the program and only exist inside the block in which they have been defined. Local variables may only be used after they have been declared.
Global variables on the other hand may only be defined outside of functions. Unlike local variables however, global variables may be used anywhere in the program - even in code preceding definition!
Let's see some examples:
Local Foo:Int = 5
Print(Foo) //Prints 5
Print(Bar) //Error! Bar is undefined at this point
Local Bar:Float = 3.0
Print(GlBar) //Prints 0
Global GlBar:Int
Careful though! Even though globals may be used in code preceding their definition, they will only be assigned once execution reaches the definition. Consider this snippet:
Print(GlBar) //Valid code - but prints 0!
Global GlBar:Int = 42
Print(GlBar) //Prints 42
There is a construct that looks very similar to a variable declaration, namely the constant. As the name suggests, a constant stores a value that cannot be changed at runtime. Constants can (and have to) be assigned once and only once, namely in their definition. Similar to global variables, they can be used in the entire program, even in code preceding their definition. Unlike global variables however, constants may be defined anywhere in the program.
The only restriction is that the value assigned to a constant has to be a constant expression. Examples:
Const Pi:Float = 3.14159
Print(Pi) //Prints 3.14159
Print(TheAnswer) //Perfectly valid code! Prints 42
Const TheAnswer:Int = 42
TheAnswer = 43 // Error: Assigning to a constant is not allowed.
Note that the strange behaviour of constants and globals (being available before their definition) is for legacy reasons and is likely to change in future versions of Bismuth.
A function represents a block of code that can be called from anywhere in your program.
Functions are defined as follows:
Function Identifier:ReturnType(Parameters)
/* Function statements go here */
End Function
ReturnType can be omitted, in which case it will default to Void, i.e. the function does not return a value.
Parameters has to be a possibly empty list of comma-separated variable declarations in the form of
Identifier1:Datatype, Identifier2:Datatype, ....
Calling and returning values from functions will be covered in later chapters.
Expressions in Bismuth consist of literals and/or variables combined with unary or binary operators. Expressions always return a value and cannot stand on their own - they always have to be used within statements or declarations.
Arithmetic operators combine their operands using standard arithmetic rules. These operators may only be used with numeric types, with the exception of +, which may be used with strings to perform concatenation.
Unary:
- Unary minus
Binary:
^ Power-of operator
* Multiplication
/ Division
+ Addition
- Subtraction
Mod Modulo (remainder of division)
Logical operators evaluate their operands to a boolean using standard conversion rules and then compare them according to specific rules.
Unary:
Not    Negates the operand, i.e. returns true if the operand is false and vice versa.
Binary:
And    Logical and. Returns true only if both operands are true; returns false otherwise.
Or     Logical or. Returns true if at least one of the operands is true; returns false otherwise.
And and Or in Bismuth are short-circuited. This means that they don't evaluate their second operand if the result is already determined after evaluating the first. That is, And will skip evaluation of the second operand if the first operand is false; Or works similarly by ignoring the second operand if the first one results in true.
This is useful if the second operand has side-effects. Consider this example:
A <> Null And A.Foo = 5
If And always evaluated both operands, the program would crash if A was Null. Since And is short-circuited, it will not evaluate A.Foo if A <> Null is false, thus preventing a crash if A is indeed Null.
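The guard above can be sketched as a complete snippet. The class Foo and its field Bar are hypothetical and only serve to illustrate the evaluation order:

```
Class Foo
    Field Bar:Int
End Class

Local A:Foo //Not assigned, so A holds the null reference
If A <> Null And A.Bar = 5 Then
    Print("A.Bar is 5")
Else
    Print("A is Null or A.Bar is not 5") //This branch runs; A.Bar is never evaluated
EndIf
```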
Comparison operators compare their operands and always return a boolean (True or False).
< Less than
> Greater than
<= Less than or equal to
>= Greater than or equal to
= Equal to
<> Not equal to
<> and = will work on all types, whereas the other operators require their operands to be of numeric type.
Bitwise operators access the bit patterns of their operands. They may only be used on integers.
Unary:
~ Negation; flips all bits of the target expression.
Binary:
& Bitwise and
| Bitwise or
~ Bitwise xor
Shl Shift left
Shr Shift right (logical for unsigned types, arithmetic for signed types)
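A few worked values may help to see the operators in action. This is a sketch using small integers; the variable names are made up for illustration:

```
Local Flags:Int = 6 //Binary 0110
Print(Flags & 3) //Prints 2 (0110 and 0011 = 0010)
Print(Flags | 3) //Prints 7 (0110 or 0011 = 0111)
Print(Flags ~ 3) //Prints 5 (0110 xor 0011 = 0101)
Print(1 Shl 4) //Prints 16

Local N:Int = -8
Print(N Shr 1) //Prints -4, since Int is signed and the shift is arithmetic
```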
Function calls are of the form Expression(Parameters), where Expression evaluates to a function datatype and Parameters is a (possibly empty) comma-separated list of expressions to pass to the function.
If the function return datatype is Void, then it may not be called inside an expression, since expressions have to return values.
In case the function defines optional parameters, these may be omitted in the function call.
Assuming the following declarations:
Function A:Int(Foo:Float)
End Function
Function B:Float(Foo:Double, Bar:Int = 5)
End Function
these are correct function calls:
A(4.5)
B(1.5, 6)
B(2.5) //Bar defaults to 5
The index operator [] allows access to values inside an array or values referenced by a pointer. Index expressions are of the form Expression[ExpressionList], where Expression evaluates to a pointer or an array and ExpressionList is a non-empty, comma-separated list of integer expressions.
For arrays, the number of elements of ExpressionList must match the number of dimensions of the Expression operand. For pointers, ExpressionList must contain exactly one expression.
Assuming the following declarations:
Local A:Int[], B:Float[,,], C:Double Ptr
these are valid index expressions:
A[5]
B[2, 3, 4]
C[67]
VarPtr retrieves the address in memory of a particular expression. It may only be used with operands that are actually stored in memory, i.e. variables or array locations.
Assuming the following declarations:
Local A:Int, B:Float[,,], C:Foo
Class Foo
Field Bar:Int
End Class
these are valid applications of VarPtr:
VarPtr A //Returns an Int Ptr
VarPtr B[2, 5, 1] //Returns a Float Ptr
VarPtr B //Returns a Float[,,] Ptr
VarPtr C.Bar //Returns an Int Ptr
The New operator will instantiate a class or an array - that is, allocate memory for the object to inhabit, potentially run initialization code and return the newly created object.
For classes, the syntax of New is as follows:
New Identifier
/* or */
New Identifier(ParameterList)
Where Identifier is the identifier of a class and ParameterList is a (potentially empty) comma-separated list of expressions. New Identifier is shorthand syntax for New Identifier().
New will call the constructor of the class (if it has one) and pass ParameterList as arguments to it. As such, the number and type of the arguments must match the ones defined by the class constructor.
If the class does not define a constructor, New treats it as a class with a constructor with no arguments (that is, only New Identifier or New Identifier() are valid).
For arrays, the syntax of New is as follows:
New ArrayType[SizeList]
Where ArrayType is the datatype of the array's elements and SizeList is a non-empty, comma-separated list of expressions of integer type.
Examples:
New Float[5] //Float array with 5 entries
New Int[20, 10] //2D Int array with 20x10 entries
New String[][][42] //An array of String[][] with 42 entries
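The following sketch combines New with the index operator. The variable names are made up for illustration, and the type Int[,] for a two dimensional array is an assumption derived from the Float[,,] notation used for three dimensions:

```
Local Values:Float[] = New Float[5]
Values[1] = 1.5
Values[2] = Values[1] * 2

Local Grid:Int[,] = New Int[20, 10]
Grid[3, 4] = 42
Print(Grid[3, 4]) //Prints 42
```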
The type cast operator is of the form
Datatype(Expression)
where Datatype is a valid datatype as described in an earlier chapter and Expression has to be explicitly castable to Datatype.
The conversion follows standard type conversion rules. If it is a valid conversion, it will return a value of type Datatype.
Statements are a program's only way to actually execute code. A statement is the smallest standalone element of any programming language. A program consists of a sequence of one or more statements.
A statement is one of the following: an assignment, a function call statement or a control statement.
A statement list in Bismuth consists of a sequence of statements separated by line breaks. Multiple statements may be written on one line by separating them with a semicolon (;). One statement may be spread over multiple lines by ending each line with two periods (..).
The following statement lists are equivalent:
Print("Hello ")
Print("World!")
/* or */
Print("Hello "); Print("World!")
/* or */
Print( ..
"Hello " ..
)
Print( ..
"World!" ..
)
Writing multiple statements on one line may be necessary in some cases - for example, it's the only way to execute multiple statements in the ForStatements of a For-While-Do loop or inside a one-line If - but usually, this kind of code style should be avoided in favour of readability.
Assignments in Bismuth are of the form
Expression1 = Expression2
They are used to assign values to a writable location.
Expressions on the left of the assignment operator (=) have to be addressable and writable.
For example, assigning to 5 + 3 is not allowed - it may be an expression, but the result is not addressable (i.e. it does not resolve to some location in memory). Similarly, assigning to a constant outside its declaration won't work either, since constants are not writable.
Function call statements follow the exact same syntax and behaviour as function call expressions.
The only difference is that any value returned by the function is discarded. As such, functions of type Void can be called as well, in contrast to function call expressions, which do not allow this.
Control statements alter the order in which statements are executed. They are extremely important tools of a programming language, and it is important to know all of them in order to write good code.
Control statements in Bismuth each create their own scope; that is, a new space for variables to live in. Variables defined inside a scope "vanish" once the corresponding code block is left; also, variables from an outer scope may be overwritten by an inner scope ("shadowing").
See an example:
Local A:Int = 6
If A <> 0 Then
Local A:Int = 3 //Perfectly legal - "overwrites" the old A inside this scope
Local B:Int = 7
Print(A) //Prints 3
A = 23
Print(A) //Prints 23
EndIf
Print(A) //Prints 6
Print(B) //Error! B does not exist anymore
The If statement is used to conditionally execute blocks of code based on an expression. An If statement looks as follows:
If Expression1 Then
StatementList
ElseIf Expression2 Then
StatementList
ElseIf Expression3 Then
StatementList
.....
Else
StatementList
EndIf
The expressions will be evaluated in order until one is found that evaluates to True, in which case the matching list of statements will be executed. If none evaluate to True, the statements inside the Else block will be executed (if there is one).
The list of ElseIf blocks may be arbitrarily long or be omitted completely. The Else block is optional.
To support readability and compact code in case of short expressions and statement lists, Bismuth supports a one-line equivalent of a full If statement:
If Expression1 Then StatementList ElseIf Expression2 Then StatementList .... Else StatementList
The same rules as for the multiline version apply.
The Select statement is used to select a statement list to execute from a labeled set. The syntax is as follows:
Select Expression
Case Expression1
StatementList
Case Expression2
StatementList
....
Default
StatementList
End Select
This is equivalent to the following If statement:
If Expression = Expression1 Then
StatementList
ElseIf Expression = Expression2 Then
StatementList
....
Else
StatementList
EndIf
with the only difference that Expression is only evaluated once.
The Default case is optional.
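A concrete instance of the syntax above might look as follows. This is only a sketch; the variable Day and the printed strings are made up for illustration:

```
Local Day:Int = 3
Select Day
    Case 1
        Print("Monday")
    Case 3
        Print("Wednesday") //This branch is executed
    Default
        Print("Some other day")
End Select
```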
The For loop is used to execute a code block a specific number of times. The syntax is as follows:
For Variable = FirstValue To LastValue Step StepValue
StatementList
Next
The For loop will first set Variable to FirstValue. Then it performs a check to see whether Variable is less than or equal to LastValue (for a negative StepValue, greater than or equal to LastValue). If it is, it will execute the statement list, increment Variable by StepValue and jump back to the beginning of the loop (performing the check again). If the check fails, however, the statement list will be skipped and execution resumes with the statement following Next.
Variable has to be an existing variable and be of numeric type.
StepValue has to be constant. It may be omitted, in which case it defaults to 1.
Alternative forms of the For loop are available:
Variable may be replaced by Local Variable:Datatype, in which case a new local variable will be introduced instead of having to use an existing one.
To may be replaced by Until, in which case LastValue itself is excluded from the loop.
Consider the following examples:
//Prints all numbers from 0 to 10 inclusive
For Local I:Int = 0 To 10
Print(I)
Next
//Prints all numbers from 0 to 10 exclusive
For Local I:Int = 0 Until 10
Print(I)
Next
//Prints the numbers 0, 3, 6, 9
For Local I:Int = 0 To 10 Step 3
Print(I)
Next
//Prints all numbers from 10 down to 0
For Local I:Int = 10 To 0 Step -1
Print(I)
Next
The While loop will execute a code block while a certain condition holds true. It is of the form:
While Expression
StatementList
Wend
Expression will be evaluated before each execution of the statement list. If it evaluates to True, the statement list will be executed and Expression is checked again; otherwise, the statement list will be skipped and execution resumes with the statement following Wend.
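As a small sketch of the While loop, the following snippet doubles a number until it reaches at least 100. The variable N is made up for illustration:

```
Local N:Int = 1
While N < 100
    N = N * 2
Wend
Print(N) //Prints 128
```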
The Repeat loop will execute a code block until a certain condition is true. It is of the form:
Repeat
StatementList
Until Expression
Expression will be evaluated after executing the statement list, meaning that the statement list will be executed at least once. If it evaluates to True, execution leaves the loop and continues with the statement following Until. Otherwise, the statement list will be executed again.
An additional version of Repeat is available:
Repeat
StatementList
Forever
This will continue executing the statement list until the loop is broken manually with Exit or Return.
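The following sketch shows both variants of Repeat. The variable names are made up for illustration:

```
Local I:Int = 10
Repeat
    Print(I) //Prints 10 once: the body runs before the condition is checked
Until I > 5

Local Count:Int = 0
Repeat
    Count = Count + 1
    If Count = 5 Then Exit //The only way out of a Repeat-Forever loop here
Forever
Print(Count) //Prints 5
```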
The For-While-Do loop is an enhanced version of the For loop resembling for loops in other languages (such as C or Java). It is of the form
For ForStatements While WhileExpression Do DoStatement
StatementList
Next
This is equivalent to the following construct:
ForStatements
While WhileExpression
DoStatement
StatementList
Wend
The DoStatement is optional and may be omitted completely (the Do keyword, however, is mandatory).
Let's see an example:
//Computes and prints all Fibonacci numbers smaller than 1000
For Local A:Int, B:Int = 1 While B < 1000 Do Local C:Int = A + B
A = B
B = C
Print(A)
Next
Currently the DoStatement is not that useful. Future versions of Bismuth may change the behaviour of the For-While-Do loop to act more like a For loop, executing the DoStatement after the statement list and adding special cases for Exit and Continue.
Return may be used inside a function to leave it prematurely. Additionally, it is also used to return values from a function, in which case it is used as Return Expression.
If the function is of type Void, specifying a return value is not allowed. Similarly, if the function is non-void, a return value has to be specified when Return is used.
Note that Return ignores all enclosing control statements and always leaves the function directly.
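A minimal sketch of a function that returns a value. The function name LargerOf is hypothetical; the first Return leaves the function from inside an If statement:

```
Function LargerOf:Int(A:Int, B:Int)
    If A > B Then Return A
    Return B
End Function

Print(LargerOf(3, 7)) //Prints 7
```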
Exit and Continue may be used inside loops to change the loop behaviour.
Exit will leave the loop immediately and redirect execution to the statement following the loop block.
Continue, on the other hand, will skip the rest of the statement list and jump directly to the end of the loop block, execute all potential loop checks and start a new iteration if the checks still hold.
Note that Exit and Continue act on the innermost enclosing loop block. Other enclosing control statements, such as If or Select, will be ignored.
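The following sketch uses both keywords inside a For loop. It assumes that Continue in a For loop still performs the usual increment before the next check; the variable names are made up for illustration:

```
For Local I:Int = 0 To 10
    Local Remainder:Int = I Mod 2
    If Remainder = 0 Then Continue //Skip even numbers
    If I > 7 Then Exit //Leave the loop once I exceeds 7
    Print(I) //Prints 1, 3, 5, 7
Next
```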
Bismuth programs are stored as plain text in *.bi files and consist of a list of statements and declarations.
Unlike languages like C, Bismuth allows statements to stand on their own, outside any function or class declaration. These are the statements that will be executed when the program is compiled and run, i.e. a program consisting of only declarations won't actually do anything.
Typically a program starts out with a list of imports followed by the program statements and, after that, all function and class declarations. This ordering is not mandatory, but recommended to support readability.
Note that the program statements form their own scope, so local variables declared on the program level are not accessible inside functions or classes:
Local Message:String = "Hello, world!"
Print(Message) // Prints "Hello, world!"
PrintMessage()
Function PrintMessage()
Print(Message) // Error! Message not accessible from here
End Function
Every expression and variable in Bismuth has an associated datatype, which has been discussed in earlier chapters. As such, it is not unusual for operands, assignments or parameters to be of different datatypes. For these reasons, Bismuth provides a set of rules for implicit and explicit type conversion.
If the source expression of an assignment or a function parameter is of differing type than the target datatype, Bismuth will first try to do an implicit conversion of the source datatype. This means that in most cases, the source expression will silently be converted into the target datatype. If the implicit conversion succeeds, it behaves the same as performing an explicit conversion using the type cast operator.
Implicit conversion works if:
SourceExpression <> Null
.Null
. If the target type is""
[]
The source expression is a String and the target type is a Byte Ptr or a UByte Ptr, in which case a pointer to the first character of the string will be passed instead. This is a convenience when calling C/++ functions.
In any other case, the implicit conversion will fail and the compiler will issue an error.
Explicit conversion can be used to override the type system used by the compiler. Explicit conversion is done using the type cast expression introduced earlier.
An explicit conversion will succeed if either the corresponding implicit conversion would succeed, or if:
Null.
In any other case, an error will be issued by the compiler.
When using operands of different types inside an expression, Bismuth employs a set of rules to determine the datatype of the result. In such a case, the following list of rules will be checked from top to bottom. The first matching rule determines the datatype of the result:
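If either operand is a String, the result type will be a String
If either operand is a Double, the result type will be a Double
If either operand is a Float, the result type will be a Float
If either operand is a Long, the result type will be a Long
Otherwise, the result type will be an Int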
The only exception to this rule is the power-of operator (^), the result of which will always be a double.
In any case, the result of the operation will be the same as if both operands had been explicitly converted to the result datatype before the operation.
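A short sketch of how these rules play out. The variable names are made up for illustration, and the comments assume that a rule matches as soon as either operand has the named type:

```
Local I:Int = 3, L:Long = 4, F:Float = 0.5, D:Double = 2.0
Local S:String = "Total: "

Local A:Long = I + L //Int and Long: the result is a Long
Local B:Float = I * F //Int and Float: the result is a Float
Local C:Double = F + D //Float and Double: the result is a Double
Local T:String = S + I //String and Int: the result is a String
Local P:Double = I ^ 2 //Power-of always yields a Double, even for two Ints
```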
Classes represent a collection of data and functions to operate on said data.
Classes are defined with the Class keyword. This only defines the class itself, which is a blueprint for creating instances of the class. Only instances will occupy memory and can actually be referenced.
Variables of an instance holding data are called fields, whereas functions of an instance are called methods. Both fields and methods of an instance are referred to as members.
The most basic class definition is of the form:
Class Identifier
Field Identifier1:Datatype1
Field Identifier2:Datatype2
...
End Class
This class can now be instantiated using the New operator and its fields accessed with the . operator. Take a look at these examples:
Class Foo
Field A:Int
Field Message:String
End Class
Local Instance1:Foo = New Foo //Instantiates Foo and assigns a reference to Instance1
Instance1.A = 25
Instance1.Message = "Hello World!"
Print(Instance1.Message) //Prints "Hello World!"
Foo.A = 34 //Error! Foo is a class and not an instance
As you've seen in the example above, the variable Instance1 has the datatype Foo. As explained in an earlier chapter, class names can be used as a datatype, denoting that the variable can contain a reference to an instance of the class.
Note that specifying a class type does not mean that the variable actually contains an instance. Bismuth passes instances around by reference. This means that multiple variables can reference the same instance, or that one variable may reference no instance at all. Variables of class type are initialized to the null reference, meaning that no instance is referenced by the variable. The reference can be changed through assignment.
Let's see a few examples:
Local Instance2:Foo
Print(Instance2.Message) //Runtime error! Instance2 does not reference any object - accessing Message will crash
Local Instance3:Foo = Instance1 //Instance3 and Instance1 now reference the same class instance
Print(Instance3.A) //Prints 25
Instance3.A = 10
Print(Instance1.A) //Prints 10
As the last line shows, references and instances are two different things. When the instance is changed through one reference, the change will be seen through all other references.
Next to defining fields, it is also possible to define methods in a class definition. A method definition looks the same as a function definition - just replace the Function keyword by Method and place the definition inside a Class-End Class block.
Methods can be accessed using the . operator and called with the standard function call syntax. Consider the following example:
Class Bar
Field Value:Float
Method ChangeAndPrint(NewValue:Float)
Value = NewValue
Print("The value is: " + Value)
End Method
End Class
Local B:Bar = New Bar
B.ChangeAndPrint(1.37)
The useful thing about methods is that they have full access to all fields as if they were local variables. Inside a method, you will also have access to an implicitly defined Self, which always holds a reference to the current instance the method acts upon.
There is a special kind of method called the constructor. It follows the same syntax as other methods, only that it has the same name as the class. Constructors are provided by the programmer and serve to initialize a class with sensible data. They also provide a bit of extra security: A class can only be instantiated by calling the constructor, therefore ensuring classes are always initialized when instantiated.
Constructors couple tightly with the New expression - let's see some examples:
Class ChocolateBar
Field Price:Float
Method ChocolateBar(Price:Float)
//Another useful purpose of Self: access fields shadowed by parameters
Self.Price = Price
End Method
End Class
Class VendingMachine
Field Contents:TList
Method VendingMachine()
//Ensure that Contents references a valid instance
Contents = New TList
End Method
End Class
Local Machine:VendingMachine = New VendingMachine()
//Machine.Contents now references a valid instance!
Local Bar:ChocolateBar = New ChocolateBar(3.50)
Local Foo:ChocolateBar = New ChocolateBar //Error! Must provide constructor with parameters
The exact syntax and semantics of the constructor are explained in more detail in the section about the New operator.
So far we only looked at instance members, which belong to the instances of a class. Each instance has its own copy of members, which are independent of other instances - changing a field in one instance will leave the other instances untouched.
However, classes in Bismuth can also provide static members, which belong to the class rather than an instance of it. Regardless of how many instances of a class exist, there always exists only one set of static members.
Static fields are introduced using the Global keyword, static methods with Function. The syntax is exactly as you'd expect from the previous use of Global and Function - just the meaning is different.
What's useful about static members as opposed to just using Global and Function outside the class definition is that they are accessible (and modifiable) from all instances just like normal fields. Additionally, they allow for encapsulating all data and functionality required by a class right inside the class definition, instead of scattering required globals and functions across the file.
Static members can be accessed from outside the class using the . operator just like you would with an instance member - only using the class name instead of the instance reference on the left hand side.
Let's see an example:
Class Apple
Global AppleCount:Int
Method Apple() //Apple constructor
AppleCount = AppleCount + 1
End Method
Function PrintApples()
Print("There are " + AppleCount + " apples!")
End Function
End Class
Local A:Apple = New Apple
Local B:Apple = New Apple
Local C:Apple = New Apple
Apple.PrintApples()
If Apple.AppleCount > 2 Then Print("That's a lot of apples!")
Inheritance allows classes to extend the functionality of an existing class. It allows for code reuse and so-called subtype polymorphism, which will be explained later.
Classes can define a parent class (or "superclass") by using the Extends keyword in the class definition:
Class Polygon
End Class
Class Triangle Extends Polygon
End Class
In this case, Polygon would be the superclass and Triangle the subclass. The superclass may itself be a subclass of another class and so forth. Classes which don't define a superclass implicitly extend the Object class - thus, every class in Bismuth is either directly or indirectly a subclass of Object.
Subclasses inherit all members (that is, fields and methods) from their parent class, meaning subclasses can use members of the parent class as if they were defined in the subclass.
This is extremely useful if you have a handful of classes that share common attributes and code - just define a base class with all the common members and extend it! Continuing the example with polygons from before:
Class Polygon
Field Kind:String
Field CenterX:Float, CenterY:Float
Method PrintCenter()
Print("This polygon is a " + Kind + " and sits at " + CenterX + ", " + CenterY)
End Method
Method Polygon(X:Float, Y:Float)
CenterX = X
CenterY = Y
End Method
End Class
Class Triangle Extends Polygon
Method Triangle(X:Float, Y:Float)
Super(X, Y)
Kind = "Triangle"
End Method
End Class
Class Square Extends Polygon
Method Square(X:Float, Y:Float)
Super(X, Y)
Kind = "Square"
End Method
End Class
Local T:Triangle = New Triangle(5, 3.5)
Local S:Square = New Square(1.3, 2)
T.PrintCenter() //Prints "This polygon is a Triangle and sits at 5, 3.5"
S.PrintCenter() //Prints "This polygon is a Square and sits at 1.3, 2"
One could continue this with pentagons and hexagons, but you get the idea.
One interesting thing to note is that constructors are not inherited: if Triangle, for example, didn't define a constructor, the expression New Triangle(5, 3.5) would be invalid, even though Polygon defines a constructor with two parameters.
You can use Super to explicitly call the constructor of the superclass from inside a constructor of the subclass to ensure the inherited members are properly initialized.
Don't be scared by the buzzwords in the title - subtype polymorphism is actually really easy! It is one of the most useful properties of class inheritance, so it's important to understand the concept.
Polymorphism in this context means that an instance of a subclass may be given everywhere an instance of the superclass is expected. Example (continued from above):
Local Poly1:Polygon = T //Assigning a Triangle to a Polygon - no problem!
Local Poly2:Polygon = S //Same!
Poly1.PrintCenter()
Poly2.PrintCenter()
Print AddPolyCenterX(T, S) //Parameters expect a Polygon, but we give Triangles/Squares instead!
Function AddPolyCenterX:Float(A:Polygon, B:Polygon)
Return A.CenterX + B.CenterX
End Function
This makes inheritance even more useful! In addition to being able to reuse code from the superclass in all subclasses, we can now also reuse all code dealing with the superclass.
Note that casting to a superclass is not a one-way street. With an explicit typecast, we can get our original subclass back:
Local Tri:Triangle = Triangle(Poly1) //Works, because Poly1 was originally a Triangle
Local Sq:Square = Square(Poly1) //Doesn't work - Poly1 is not a Square. Typecast returns Null
Local Tri2:Triangle = Poly1 //Error! Need explicit typecast
A reference can only be cast to a class type if it has the corresponding "polymorphic attachment". The polymorphic attachments of a reference are the class of the original instance and all of its superclasses. In the case of Poly1, those would be Triangle, Polygon and Object.
Overriding methods is another feature to make class inheritance more powerful. It allows subclasses to override methods of the superclass by redefining them. Combined with subtype polymorphism, this allows for really neat code.
Extending the examples from above:
Class Polygon
Method Area:Float()
Return 0.0
End Method
End Class
Class Square Extends Polygon
Field SideLength:Float
Method Area:Float()
Return SideLength^2
End Method
Method Square(SideLength:Float)
Self.SideLength = SideLength
End Method
End Class
Class Triangle Extends Polygon
/* ... etc .... */
End Class
Local P1:Polygon = New Square(5.0) //Works because of polymorphism
Local P2:Polygon = New Polygon
Print "Area of polygon 1: " + P1.Area() //Prints 25.0!
Print "Area of polygon 2: " + P2.Area() //Prints 0.0!
As you can see, a method call on a Square instance will always call the overriding method - even if the reference is declared as a Polygon!
Sometimes, you need to call the original implementation of a method from a subclass. Just calling the method by its name will only ever call the overriding method - what to do? Luckily, the Super keyword allows us to access members of the superclass directly, disregarding the overriding versions. Extending the above example:
Class Square Extends Polygon
/* ..... same as above ..... */
Method PrintArea()
Print(Area()) //Prints 25.0
Print(Super.Area()) //Prints 0.0!
End Method
End Class
Overriding methods allows us to define behaviour for all subclasses in the superclass by specifying the methods we want and a generic implementation that works for most subclasses, and then overriding that implementation in the subclass where appropriate.
When a generic implementation doesn't make much sense, but you still want to provide methods for subclasses to override, you can use the Abstract keyword:
Class Polygon
Method Area:Float() Abstract
End Class
This is similar to declaring an empty method (i.e. one with no statements inside), except that subclasses have to override the method with an implementation of their own.
That is, instantiating a class or any of its subclasses will fail if there is at least one abstract method which has not been overridden by an actual implementation (thus, New Polygon would not work anymore). This ensures that methods provided by a superclass actually have a meaningful implementation.
If you want to disallow instantiation of a class, you can also use Abstract on the class itself:
Class Polygon Abstract
End Class
Now, only subclasses of Polygon may be instantiated, but not Polygon itself.
There is also the Final keyword, which may be used on methods and classes. When used on a method, it forbids any subclass from overriding that method. When used on a class, it prevents other classes from extending it.
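As a minimal sketch of both uses: the placement of Final after the method signature is an assumption that mirrors how Abstract is written above, and the names Describe and Settings are made up for illustration.

```bismuth
Class Polygon
    Method Describe() Final //Assumed placement, mirroring Abstract
        Print("I am a polygon")
    End Method
End Class

Class Square Extends Polygon
    //Redefining Describe() here would be rejected: it is Final in Polygon
End Class

Class Settings Final //No other class may extend Settings
End Class
```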
Using Abstract and Final together on a class makes sense for utility classes which only contain static members:
Class Math Abstract Final
Function Sin:Float(Angle:Float)
....
End Function
....
End Class
Instantiating such classes does not make sense, so the Abstract keyword is used. To prevent evasion of this restriction by subclassing, Final is used as well.
As a last feature, classes can control the visibility of their members. So far, members of a class were accessible from all parts of the program - inside the class itself, in subclasses and in code outside of the class. Sometimes, this is useful - most of the time, it isn't.
For this reason, Private, Protected and Public can be used to control who has access to which members of a class.
They are used like this:
Class Foo
....
Private
.....
Protected
.....
Public
....
End Class
The visibility mode in classes starts out as public and can then be changed with the corresponding keywords. Member declarations receive the current visibility mode. Visibility can be changed as many times as you want and in any order.
Members with Public visibility are accessible everywhere and act the way described previously in this chapter - which makes sense, since visibility of members is public unless you explicitly change it!
Members with Protected visibility are only accessible within the class and subclasses of it.
Members with Private visibility are only accessible within the class that defined them - subclasses won't be able to see them! This also affects inheritance - private methods cannot be overridden.
Examples:
Class Polygon
Method PrintInfo()
//Accessing CenterX/Y is fine here - they are defined in this class
Print("This is a " + Name + " at " + CenterX + ", " + CenterY + " and is " + Area() + " big!")
End Method
Private
Field CenterX:Float, CenterY:Float
Protected
Field Name:String
Method Area:Float() Abstract
Public //A second public section - perfectly fine!
Method Polygon(X:Float, Y:Float)
CenterX = X
CenterY = Y
End Method
End Class
Class Square Extends Polygon
Method Square(X:Float, Y:Float, Size:Float)
Super(X, Y) //Have to use constructor! Can't access CenterX/Y from here
Self.Size = Size
Name = "square"
End Method
Protected
Method Area:Float()
Return Size*Size
End Method
Private
Field Size:Float
End Class
Local S:Square = New Square(10.0, 10.0, 3.0)
S.PrintInfo()
Print(S.Area()) //Error! Can't access Area from here - it's Protected
Print(S.CenterX) //Error! CenterX is Private!
If member visibility is new to you, you might be wondering why this is useful. The main reason visibility was introduced is to protect class integrity and therefore prevent bugs. Even if you program on your own and only use classes you yourself designed, it's important nonetheless to sit down and think about which functionality should be exposed to other code and which shouldn't.
When to use which visibility? There are a handful of rules for each:
- Private for members that are irrelevant for users of the class (e.g. helper methods, internal variables etc.).
- Private for fields that need either validation or some extra processing. Provide public Set/Get methods to allow read/write access through a safe environment.
- Protected when Private would be appropriate, but the member is also relevant for subclasses. For example, shared functionality in the base class that is irrelevant for a user of the class, but important for implementations of subclasses (e.g. Area in the previous example).
- Public only for members that are useful to a user of the class, and only for fields that don't need validation. Setting everything as public is convenient, but error-prone - if a member is public, it will be used sooner or later, and usually in a way that endangers class integrity (by avoiding validation checks and accidentally setting a reference to Null, for example).
As seen in previous chapters, Bismuth programs can create new instances using New. These instances occupy memory, naturally, so over time, a program will use more and more memory. But how to free memory that isn't needed anymore?
Bismuth has two ways to deal with this problem: Automatic garbage collection and manual memory management.
As you should know by now, a program can hold references to instances it created. Access to an instance is done using a reference to it. If there is no reference left to an instance, this means that it is unreachable for the program - without a reference, there's no way the program will ever be able to access it again!
For this reason, the Bismuth compiler includes a garbage collector as part of your program. It will lie dormant most of the time, but wake up periodically and scan the program for instances that don't have any references left pointing to them, removing them from memory automatically.
As a programmer, you usually won't have to worry about the garbage collector - most of the time you won't even notice it's there! However, unreachable instances are not removed right away - the garbage collector only kicks in every so often, and even then it may not remove all unreachable instances at once. Sometimes, that is not what you want. Say you allocated a large instance that you only need temporarily and want to free its memory as soon as you are done with it: in that case, you may choose to manually remove the instance from memory using the Delete statement:
Function HeavyLifting()
Local Temp1:ReallyLargeClass = New ReallyLargeClass
Local Temp2:Float[,,] = New Float[1024, 1024, 1024]
/* do something with it */
/* And now get rid of it immediately */
Delete Temp1
Delete Temp2
End Function
A fair word of warning: Using Delete is dangerous if you don't know what you're doing. It should really only be used when it is absolutely necessary.
The reason is that Delete does not remove references to the instance being deleted - in our previous example, Temp1 and Temp2 would not be equal to Null, but point to a now unused chunk of memory. There is no way of telling whether the instance a reference points to has been removed with Delete, so you could end up accidentally modifying a deleted instance, leading to memory corruption or random crashes.
Therefore: Only use Delete if you're absolutely sure that no other code could end up accidentally trying to use the instance you're about to delete.
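One defensive habit, sketched here under assumptions, is to clear the local reference right after deleting the instance so that the stale reference does not linger. The text implies that Null can be assigned to a reference, but this chapter does not show it, so treat the assignment below as an assumption. Any other reference to the same instance would still be left dangling.

```bismuth
Function HeavyLifting()
    Local Temp:ReallyLargeClass = New ReallyLargeClass
    /* do something with it */
    Delete Temp
    Temp = Null //Assumed: clears only this reference, not copies held elsewhere
End Function
```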
Once a program grows longer, it is useful to split it up into multiple files to keep an overview of the code and to make compilation faster.
The Import statement allows you to import other code files into your code. It is used like this:
Import "Path/To/File.bi"
Semantically, there is no difference between importing a code file and copy-pasting it at the end of the file that imports it. However, compilation times will be vastly different. If a code file didn't change since the last compilation, the compiler will use a compiled version cached on disk, ensuring short compilation even in large projects - useful!
Note that imported code files may themselves import other files, leading to arbitrarily nested import chains. The compiler is robust enough to deal with cyclic imports; any file will be imported at most once.
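To illustrate the point about cyclic imports, here is a sketch with two hypothetical files, A.bi and B.bi, that import each other. The file names and functions are made up, and the sketch assumes both files sit in the same folder.

```bismuth
//File A.bi
Import "B.bi"
Function FromA()
    Print("Hello from A")
End Function

//File B.bi
Import "A.bi" //Cyclic import: A.bi is still imported only once
Function FromB()
    FromA()
End Function
```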
Import does not only work with *.bi files - the compiler recognizes most file types it can make use of. It will check the file extension, match it against a rule in the list below and take appropriate action. If the code being compiled is called Foo.bi, then, depending on the file extension of the import:
Import can also be used to import modules, which will be explained in the next chapter.
Bismuth is a modular language, allowing you to encapsulate often-used functionality in a unit called a module. A module is just a code file - possibly importing other modules or code files - which, upon import, will make its globals, functions, classes and constants available to the code that uses it.
Modules live inside the /mod/ folder inside the root directory of the compiler. The Bismuth compiler will scan this folder and its subfolders at startup. If a folder contains a *.bi file with the same name, it will be recognized as a module and loaded into the module tree.
Modules may contain modules themselves (so-called submodules), nesting arbitrarily deep. This allows grouping modules with similar functionality together.
Modules can be imported into Bismuth code using the Import statement. For example, Import MyModule will look for a folder called MyModule inside the main module folder and, if it exists, look for a file "MyModule.bi" inside it and load it.
You can import submodules by separating module names with . (e.g. Import MyModule.MySubmodule). Additionally, using the wildcard * will import all submodules of a module (Import MyModule.*, for example).
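Putting the rules above together, a module with one submodule might be laid out and imported as sketched below. The folder names are the hypothetical ones used in the text, and placing the submodule's folder inside its parent module's folder is an assumption.

```bismuth
/* Assumed layout inside the compiler's root directory:
   mod/
       MyModule/
           MyModule.bi
           MySubmodule/
               MySubmodule.bi
*/
Import MyModule             //Loads mod/MyModule/MyModule.bi
Import MyModule.MySubmodule //Loads the submodule
Import MyModule.*           //Loads all submodules of MyModule
```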
Importing a module is very similar to importing a code file. One of the advantages is that modules live in a directory the compiler knows about - there's no need to specify a full path to a code file or even copy it close to the code that uses it just to get shorter imports. Modules also encourage code reuse and make it easy to share code with others.
One big difference is that modules are precompiled. If the compiler stumbles upon an outdated build of a module (or an uncompiled module), it will warn you about it - but not compile it! This has to be done by the user beforehand.
All of the important language features - Strings, Arrays, Object, the garbage collector etc. - live in a module called Bismuth. For convenience, this module and its submodules are automatically imported in every Bismuth file (this is equivalent to adding Import Bismuth.* at the beginning of each code file). This module should not be changed or extended by new submodules - if you have suggestions or think you found a bug, notify the developers instead so everyone can profit.
Although Bismuth can do a lot of things C/++ can, you may find yourself in a situation where you'd rather use an existing library written in C/++ than write your own implementation. For this reason, the Extern statement has been introduced to make the compiler recognize functions, classes and variables declared in other languages.
The syntax of Extern is simple - just write normal Function/Class/Global declarations inside an Extern ... End Extern block as you normally would, except omitting any implementation.
This example:
Extern
Function Foo:Float()
Function Bar:Int(A:Int)
Global Scootaloo:Foobar
Class Bloomberg
Method Apple()
Method Bloom:Int()
End Class
End Extern
would describe the following .hpp header:
float Foo();
int Bar(int A);
class Bloomberg {
public:
void Apple();
int Bloom();
};
extern Foobar* Scootaloo;
Note that declarations inside an Extern block don't have to be complete - for example, specifying only the members of a class you need, rather than all the members there are, is absolutely fine. Also, declaring a C/++ macro as a Function in Bismuth is no problem.
Just remember to Import the header that contains all the constructs you declared inside the Extern block and the *.c/*.cpp/*.cxx file that actually implements them!
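A sketch of how the pieces might fit together, assuming the C/++ declarations live in hypothetical files MyLib.hpp and MyLib.cpp next to the Bismuth code:

```bismuth
Import "MyLib.hpp" //Header declaring Foo and Bar (hypothetical file name)
Import "MyLib.cpp" //Implementation of Foo and Bar (hypothetical file name)

Extern
    Function Foo:Float()
    Function Bar:Int(A:Int)
End Extern

Print(Foo())
Print(Bar(3))
```

Only the two functions that are actually used are declared here, which is fine since an Extern block does not have to be complete.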
It is important to know that Bismuth doesn't actually have a way of determining whether the Extern block is correct, that is, whether the declarations match the ones found in the corresponding C/++ source. This means that if they are incorrect, you will have to decipher the compiler error given by the C/++ compiler, which may involve having to take a look at code generated by the Bismuth compiler.
This is a bit of a hassle, but there is no way around it without having to implement a full C/++ compiler inside the Bismuth compiler, which is not an easy task given the syntax of the C/++ language.