High-level language design

Pascal is Pascal is Pascal is dog meat.

-- M. Devine and P. Larson, Computer Science 340

Description

Datatypes

The following are the datatypes fundamental to this language:

integer
A 32-bit signed integer value (range -2147483648 to 2147483647).
byte
An 8-bit unsigned integer (range 0-255). A synonym for this type is character.
real
A floating point number, IEEE 754 double precision.
array
An array is a collection of a fixed number of variables of a certain type. Multidimensional arrays can be formed by creating arrays of arrays.
reference
A reference to a variable of some type. A reference can be typed, and refer to a specific type of variable, or untyped and refer to an object of any type - this is used to implement explicit polymorphism.
function
A function is associated with a piece of code.
object
This is an advanced type that allows the compounding of other objects of various types. Its full purpose is described later.
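
As a quick check on the ranges quoted for integer and byte, the following Python sketch derives them from the bit widths. It assumes a two's complement representation for integer, which the stated range implies but the text does not spell out; the helper name fits is hypothetical and not part of the language.

```python
# Ranges implied by the bit widths given for integer and byte.
INTEGER_MIN, INTEGER_MAX = -2**31, 2**31 - 1   # 32-bit signed (assumed two's complement)
BYTE_MIN, BYTE_MAX = 0, 2**8 - 1               # 8-bit unsigned


def fits(value, lo, hi):
    """Return True if value lies in the inclusive range lo..hi (hypothetical helper)."""
    return lo <= value <= hi
```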

Syntax

A code file consists of a list of declarations. A declaration may be a variable declaration, a function declaration or an object declaration. Object and function definitions have blocks associated with them that contain further definitions, called declaration blocks and code blocks respectively. These are enclosed in square brackets [], and will be defined later. Whitespace is used to separate names and is otherwise ignored.

A declaration block has the same syntax as the 'top level' of the code file - that is, it consists of declarations as defined in this section:

Declarations

A variable is defined using the following syntax:

[scope] <type> <name> [assignment];

scope is optional, and may be one of 'public', 'protected', 'private' or 'static'. A public variable may be accessed outside of its immediate context: in a top-level definition this creates a global variable, and in an object it creates a variable that can be accessed by any other object. A protected variable is equivalent to a private variable at the top level, but in an object it specifies a variable that may be accessed by this object or any object derived from it. A private variable is one that can only be accessed from within the current source file (if declared at top level) or object (if declared in an object). A static variable is a private variable that is only ever allocated once, and all references are to that one allocation.

Objects are always referred to by reference, so if you have an object named 'foo', the declaration foo bar; is equivalent to reference{foo} bar;. Similarly for arrays: the declaration array[100] of integer foo; is equivalent to reference{array[100] of integer} foo;.

type can be one of the built-in fundamental types (integer, real or byte), an object name, an array declaration or a reference. A reference is given a type by placing a type definition in curly brackets after the keyword reference - untyped references are dealt with later. A reference can also refer to a function, in which case it has the format reference{function(<arguments>)}, where the arguments have the same format as for a function declaration. Such a reference can be assigned a function and called as if it were a function itself.

An array declaration has the form array[<expression>] of <type>, where <expression> is a constant integer expression evaluating to the number of elements in the array, and <type> is a type definition. The expression may be omitted, in which case the array is resizable.

<name> is a unique identifier for this variable in this context (no two global variables may have the same name, and neither may any two variables within an object). A name is a string consisting of alphanumeric characters or '_', and not beginning with a digit.

If an assignment is given, it is evaluated whenever the variable is created and the resulting value is assigned to the variable (an assignment is of the form '= expression').

When an object is declared, the special function 'create' is called for it and for all the object types it inherits from. When it is destroyed (either by delete or by passing out of scope), the 'destroy' function is called in a similar manner. Neither of these functions takes any arguments.
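
The naming rule for <name> is simple enough to express as a pattern. The following Python sketch is one possible reading of it, assuming that 'alphanumeric' means ASCII letters and digits; the function name is_valid_name is hypothetical.

```python
import re

# Letters, digits or '_', not starting with a digit (ASCII is an assumption).
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_name(text):
    """Return True if text satisfies the naming rule described above."""
    return NAME_PATTERN.fullmatch(text) is not None
```

Under this reading, foo_1 and _x are accepted, while 1foo and a-b are rejected.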

A function is declared using the following syntax:

[scope] function <type> <name>[type variables]([arguments]) <code>;

scope and name are as for variable declarations, above. type specifies the return type of the function. The arguments are a list of variable declarations, as above, and are local to the function (public, private and protected are all equivalent scopes). code is a code block, explained later. The type variables argument is a comma-separated list of names enclosed in angle brackets '<>' that can be used as types by the reference type (in the arguments or the code itself), creating untyped references. Note that there is no function overloading - name must be unique. Only function arguments can be specified using type variables given as part of the function name; when the function is called, these are resolved using the types of the arguments passed to it.
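
To make the resolution of type variables at a call more concrete, here is a rough Python model. It represents types as plain strings and binds each type variable to the type of the argument in the matching position. Treating a conflicting binding or a wrong argument count as an error is an assumption, since the text does not say what happens in those cases; the name resolve_type_variables is hypothetical.

```python
def resolve_type_variables(parameter_types, argument_types, type_variables):
    """Bind each type variable to the type of the argument in its position.

    parameter_types: declared types, where some entries may be type variables.
    argument_types:  types of the expressions passed at the call.
    type_variables:  set of names declared in the function's '<>' list.
    Raises TypeError on a mismatch (assumed behaviour, not from the text).
    """
    if len(parameter_types) != len(argument_types):
        raise TypeError("wrong number of arguments")
    bindings = {}
    for declared, actual in zip(parameter_types, argument_types):
        if declared in type_variables:
            # First use fixes the binding; later uses must agree with it.
            if bindings.setdefault(declared, actual) != actual:
                raise TypeError("conflicting types for " + declared)
        elif declared != actual:
            raise TypeError("expected " + declared + ", got " + actual)
    return bindings
```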

An object class definition has the following syntax:

object <name>[type variables] [inherits] <declarations>;

Once an object has been defined, it can be used as a type. inherits is an optional inheritance specifier of the form 'inherits from <name>'. declarations is a declaration block, enclosed in square brackets, with the same format as the top-level declaration block. If new object types are defined in this block, they are considered to have a scope local to this object. The special variable 'this' is always implicitly declared in an object function definition, and points to the instance of the object that a function was called for (or nil if a static function was called with no object). All object functions are considered 'virtual' - that is, if an object reuses a function name used in one of the objects it inherits from, the new function replaces the old in all object instances created from that type.
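
The 'virtual' rule can be illustrated with Python classes, whose methods happen to behave the same way. This is only an analogy, not the language's own syntax; the class names Shape and Circle are hypothetical.

```python
class Shape:
    def describe(self):
        return "shape"

    def report(self):
        # Calls whichever describe() the instance's own type provides.
        return "I am a " + self.describe()


class Circle(Shape):           # analogous to 'inherits from Shape'
    def describe(self):        # reuses the name, so it replaces Shape's version
        return "circle"
```

Calling report() on a Circle uses Circle's describe, even though report itself is defined only in Shape.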

A constant may be defined using the following syntax:

constant <name> = expression;

expression should contain only constants - no variables are permitted. A constant may not be assigned to, but is otherwise treated as a variable.

If many related constants need to be defined, the following syntax may be used:

enum <block>

block is a list of constant names enclosed in square brackets, each with the following syntax:

<name> [ = expression ];

If '= expression' is present, name is defined as a constant with the value of expression. If it is not present, name is given a value 1 greater than the previous constant in the enum block, or 0 if no such constant exists.
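
The numbering rule for enum blocks can be sketched in Python as follows. Each entry is modelled as a (name, value) pair, with None standing for an omitted '= expression'; the function name assign_enum_values and the example names are hypothetical.

```python
def assign_enum_values(entries):
    """entries: list of (name, value_or_None) pairs in declaration order.

    Returns a dict mapping each name to its constant value.
    """
    values = {}
    previous = None
    for name, explicit in entries:
        if explicit is not None:
            value = explicit          # '= expression' was given
        elif previous is None:
            value = 0                 # first constant in the block
        else:
            value = previous + 1      # one more than the previous constant
        values[name] = value
        previous = value
    return values
```

For example, the entries RED, GREEN, BLUE = 10, CYAN would receive the values 0, 1, 10 and 11.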

Finally, the 'import' directive may be used to specify that definitions from another source file should be used. It has the syntax import "filename";, where filename is a string of characters indicating which file should be imported. It is illegal to form a loop in this way, that is, to import a file that itself imports the original file.
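
One way a compiler might detect an illegal import loop is a depth-first walk over the import graph. The Python sketch below assumes the imports of each file have already been collected into a dictionary, and that loops through intermediate files are also illegal (the text only mentions the direct case); the name find_import_cycle is hypothetical.

```python
def find_import_cycle(imports, start):
    """imports maps a filename to the list of files it imports.

    Returns a list of filenames forming a cycle reachable from start,
    or None if there is no such cycle.
    """
    path = []          # files on the current chain of imports
    finished = set()   # files whose imports have been fully checked

    def visit(name):
        if name in path:
            return path[path.index(name):] + [name]
        if name in finished:
            return None
        path.append(name)
        for dependency in imports.get(name, []):
            cycle = visit(dependency)
            if cycle is not None:
                return cycle
        path.pop()
        finished.add(name)
        return None

    return visit(start)
```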

The compiler will make an initial pass through the source file to include any imports and to work out the complete list of object, variable and function names. Thus there is no such thing as a 'forward' declaration, as an object, variable or function can be referenced before it is defined.

Code

Code blocks are enclosed by square brackets []. They begin with a list of variable declarations (only 'static' has any effect on the variables' scope, though). This is followed by a list of statements, each of which is terminated by a semicolon. A statement may have one of the following forms:

variable = expression;

variable is a variable name - the syntax of these is given later. The type of the expression must match the type of the variable.

loop(expression) [next codeblock1] codeblock2;

codeblock1 and codeblock2 must be valid code blocks. expression is evaluated, and if it is true (non-zero), codeblock2 is executed, followed by codeblock1, before control passes back to the start of the loop. The 'next' part is optional, and if omitted, codeblock1 is treated as empty.
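
The order of evaluation in a loop statement can be modelled in Python as below. The three parts are passed as zero-argument callables, which is purely a device of this sketch. It assumes that the loop ends as soon as expression evaluates to zero, which the text implies but does not state; the name run_loop is hypothetical.

```python
def run_loop(condition, body, next_step=None):
    """Model of: loop(expression) [next codeblock1] codeblock2;

    condition, body and next_step stand in for expression, codeblock2
    and codeblock1 respectively.
    """
    while condition() != 0:      # non-zero is treated as true (assumption)
        body()                   # codeblock2 runs first
        if next_step is not None:
            next_step()          # then codeblock1, skipped when 'next' is omitted
```

With a counter incremented in the 'next' block, this behaves much like a C for loop without the initialiser.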

if (expression) codeblock [else codeblock];

expression is evaluated, and codeblock executed if the result is non-zero. There may also be an optional 'else codeblock' part, which is executed in the case where expression is zero.

<type> <name> -> codeblock;

If an exception of type type is thrown anywhere in the enclosing code block, a variable called name is created and assigned the value of the exception, and codeblock is executed. The program then exits the enclosing code block.

throw expression;

Throws an exception with the same type as the expression. The execution system will unwind the call stack until it finds a routine that handles an exception of the appropriate type.
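
The search for a handler can be pictured as a walk from the innermost active code block outwards. The Python sketch below models each block as a dictionary from exception type to handler and matches on the exact type only. How exceptions of derived object types are matched is not covered by the text, so that part is an assumption; the name find_handler is hypothetical.

```python
def find_handler(frames, exception):
    """frames: list of dicts (outermost first) mapping a type to a handler.

    Returns (index, handler) for the innermost frame that handles the
    exception's exact type, or None if no frame does.
    """
    for index in range(len(frames) - 1, -1, -1):
        handler = frames[index].get(type(exception))
        if handler is not None:
            return index, handler
    return None
```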

delete <name>;

Deletes the variable referred to by name. The variable must be a reference (remember that objects and arrays are implicitly declared as references). More precisely, delete decrements the reference count of the object, and only removes it once the count reaches 0.
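
The reference-counting behaviour of delete can be sketched in Python like this. When and how the count is incremented is not described in the text, so the acquire function is an assumption, as are all the names used here.

```python
class Counted:
    """A heap object with an explicit reference count."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.alive = True


def acquire(obj):
    """Model a new reference to obj being created (assumed mechanism)."""
    obj.count += 1
    return obj


def delete(obj):
    """Drop one reference; remove the object when none remain."""
    obj.count -= 1
    if obj.count == 0:
        obj.alive = False   # stands in for calling 'destroy' and freeing
```

With two references to the same object, the first delete leaves it alive and the second removes it.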

codeblock;

Code blocks can be nested, to allow temporary declaration of variables within a function, or to continue after an exception.

An expression may be a number (which gives it type integer), a decimal number (which gives it type real), a string enclosed in quotes (which gives it type array[] of byte), or one of the following:

sizeof(expression)
expression must evaluate to an array type. This evaluates to an integer giving the number of elements in the array.
real(expression)
expression must evaluate to an integer type. This evaluates to a real value with the same value as expression.
(expression)
Evaluates to the value of expression.
variable
Evaluates to the value of variable. If variable is a reference used in a context where a reference is not appropriate, it is automatically dereferenced: references can only be compared for equality and assigned - all other usages will result in automatic dereferencing.
expression1 + expression2
If the expressions are both of type integer (or are both of type real), performs integer or real addition. If the expressions are arrays, this creates a new array whose first elements are those of the first expression and whose remaining elements are those of the second. If the first expression is an object, then this evaluates to expression1.Add(expression2). This is invalid if either expression is a reference.
expression1 - expression2
If the expressions are both of type integer (or are both of type real), performs integer or real subtraction. If the first expression is an object, then this evaluates to expression1.Sub(expression2). This is invalid if either expression is a reference or array.
expression1 * expression2
If the expressions are both of type integer (or are both of type real), performs integer or real multiplication. If the first expression is an object, then this evaluates to expression1.Mul(expression2). This is invalid if either expression is a reference or array.
expression1 / expression2
If the expressions are both of type integer (or are both of type real), performs integer or real division. If the first expression is an object, then this evaluates to expression1.Div(expression2). This is invalid if either expression is a reference or array.
expression1 % expression2
If the expressions are both of type integer (or are both of type real), performs integer or real modulo. If the first expression is an object, then this evaluates to expression1.Mod(expression2). This is invalid if either expression is a reference or array.
expression1 == expression2
Evaluates to 1 if the expressions are equal, or 0 if the expressions are not. For references this checks that the two expressions reference the same object. For objects, this evaluates to the result of (expression1.Compare(expression2)==0).
expression1 != expression2
Evaluates to 1 if the expressions are not equal, or 0 if they are. For references this checks that the two expressions reference different objects. For objects, this evaluates to the result of (expression1.Compare(expression2)!=0).
expression1 > expression2
Evaluates to 1 if expression1 is greater than expression2 or 0 otherwise. For objects, this evaluates to the result of (expression1.Compare(expression2)>0). This is invalid for references.
expression1 >= expression2
Evaluates to 1 if expression1 is greater than or equal to expression2 or 0 otherwise. For objects, this evaluates to the result of (expression1.Compare(expression2)>=0).
expression1 < expression2
Evaluates to 1 if expression1 is less than expression2 or 0 otherwise. For objects, this evaluates to the result of (expression1.Compare(expression2)<0).
expression1 <= expression2
Evaluates to 1 if expression1 is less than or equal to expression2 or 0 otherwise. For objects, this evaluates to the result of (expression1.Compare(expression2)<=0).
function(arguments)
Calls the given function with the supplied arguments. The arguments are a comma-separated list of expressions, whose types must match the types specified by the function definition. This evaluates to the return value of the function.
nil
This value is a reference to nothing - it represents a reference that hasn't been initialised with a value yet.
new(type)
This allocates space for an object of type type and returns a reference to it.
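
The way the six comparison operators reduce to a single Compare function on objects can be modelled in Python as follows. The Version class and the helper names eq, ne, gt, ge, lt and le are hypothetical; the only thing taken from the text is that each operator tests the result of Compare against 0 and yields 1 or 0.

```python
class Version:
    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def Compare(self, other):
        """Return a negative, zero or positive integer."""
        if self.major != other.major:
            return self.major - other.major
        return self.minor - other.minor


# Each operator is a test on the result of Compare, yielding 1 or 0.
def eq(a, b): return int(a.Compare(b) == 0)
def ne(a, b): return int(a.Compare(b) != 0)
def gt(a, b): return int(a.Compare(b) > 0)
def ge(a, b): return int(a.Compare(b) >= 0)
def lt(a, b): return int(a.Compare(b) < 0)
def le(a, b): return int(a.Compare(b) <= 0)
```

An object therefore only needs to supply Compare to take part in every comparison.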

The infix operators defined above have the following priority, from highest to lowest:

()
* / %
+ -
<= < > >= == !=

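To show what the priority table for the infix operators means in practice, here is a small precedence-climbing evaluator in Python for integer expressions. It assumes that operators of equal priority group from left to right and that integer division rounds towards negative infinity, neither of which is stated in the text; the names tokenize and evaluate are hypothetical.

```python
import re

# Binding strength per operator, following the priority table.
PRECEDENCE = {
    "*": 3, "/": 3, "%": 3,
    "+": 2, "-": 2,
    "<=": 1, "<": 1, ">": 1, ">=": 1, "==": 1, "!=": 1,
}

OPERATIONS = {
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,    # rounding direction is an assumption
    "%": lambda a, b: a % b,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "<=": lambda a, b: int(a <= b),
    "<": lambda a, b: int(a < b),
    ">": lambda a, b: int(a > b),
    ">=": lambda a, b: int(a >= b),
    "==": lambda a, b: int(a == b),
    "!=": lambda a, b: int(a != b),
}


def tokenize(text):
    """Split text into integer literals, operators and parentheses."""
    return re.findall(r"\d+|<=|>=|==|!=|[-+*/%<>()]", text)


def evaluate(text):
    """Evaluate an integer expression using the priorities above."""
    tokens = tokenize(text)
    position = 0

    def primary():
        nonlocal position
        token = tokens[position]
        position += 1
        if token == "(":
            value = expression(1)
            position += 1            # skip the closing ')'
            return value
        return int(token)

    def expression(minimum):
        nonlocal position
        left = primary()
        while (position < len(tokens)
               and tokens[position] in PRECEDENCE
               and PRECEDENCE[tokens[position]] >= minimum):
            operator = tokens[position]
            position += 1
            # Left grouping: the right operand must bind strictly tighter.
            right = expression(PRECEDENCE[operator] + 1)
            left = OPERATIONS[operator](left, right)
        return left

    return expression(1)
```

Under these assumptions, 2 + 3 * 4 evaluates to 14, (2 + 3) * 4 to 20, and 1 + 2 < 4 to 1.
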
A variable can be a reference, a dereference, an object, an object's field or a global variable name. The syntax of a variable name is as follows:

&variable
Dereference variable. When reading, this has the value of the variable that is being referenced. When writing, the value is written to the referenced variable rather than to the reference itself.
variable1.variable2
This references a field in an object. variable1 specifies the object and variable2 specifies the field in that object. If variable1 is a reference, it is dereferenced (as all object variables are actually references, a dereference is therefore always performed).
variable[expression]
variable must be an array or an array reference. If used as part of an expression, this evaluates to the value at the position specified by the (integer) expression. If used as part of an assignment, this stores the value at the position specified by the (integer) expression.

These operators have the following priorities (highest first):

&
[]
.

In order to work out which variable a name refers to, the compiler should first check the variables declared in the current code block, then the containing code block and so on until it reaches the


Andrew Hunter
Last modified: Tue Jan 2 14:56:20 GMT 2001