 High-level language design
    Pascal is Pascal is Pascal is dog meat.
-- M. Devine and P. Larson, Computer Science 340
The following are the datatypes fundamental to this language:
A code file consists of a list of declarations. A declaration may be a variable declaration, a function declaration or an object declaration. Object and function definitions have blocks containing further definitions associated with them, called declaration blocks and code blocks respectively. These are enclosed using square brackets [], and will be defined later. Whitespace is used to separate names and is otherwise ignored.
A declaration block has the same syntax as the 'top level' of the code file - that is, it consists of declarations as defined in this section:
A variable is defined using the following syntax:
[scope] <type> <name> [assignment];
      scope is optional, and may be one of 'public',
      'protected', 'private' or 'static'. A public variable may be
      accessed outside of its immediate context - in a top-level
      definition this creates a global variable, and in an object it
      creates a variable that can be accessed by any other object. A
      protected variable is equivalent to a private variable at the
      top level, but in an object it specifies a variable that may be
      accessed by this object or any object derived from it. A private
      variable is one that can only be accessed from within the
      current source file (if declared at top level) or object (if
      declared in an object). A static variable is a private variable
      that is only ever allocated once, and all references are to that
      one allocation. Objects are always referred to by reference, so
      if you have an object named 'foo', the declaration
      foo bar; is equivalent to reference{foo} bar;.
      Similarly for arrays: the declaration
      array[100] of integer foo; is equivalent to
      reference{array[100] of integer} foo;.
    
      type can be one of the built-in fundamental types
      (integer, short real, long real or byte), an object name, an
      array declaration or a reference. A reference is given a
      type by placing a type definition in curly brackets after the
      word 'reference', as in reference{foo} - untyped references are
      dealt with later. A reference can also be a function
      reference, in which case it can be assigned a function and
      called as if it were a function itself. An array declaration
      has the form array[<expression>] of <type>, where
      <expression> is a constant integer expression evaluating to
      the number of elements in the array, and <type> is a type
      definition. The number may be omitted, in which case the array
      is resizable.

      <name> is a unique identifier for this variable in this
      context (no two global variables may have the same name, and
      neither may any two variables within an object). A name
      is a string consisting of alphanumeric characters or '_',
      and not beginning with a number. If an assignment is
      given, whenever the variable is created the assignment is
      evaluated and that value assigned to the variable (an
      assignment is of the form '= expression'). References can
      also refer to functions - in this case they have the format
      reference{function(<arguments>)}. The arguments have the
      same format as for a function declaration. When an object is
      declared, the special function 'create' is called for it and
      all the object types it inherited from. When it is destroyed
      (either by delete or by passing out of scope), the 'destroy'
      function is called in a similar manner. Neither of these
      functions takes any arguments.
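The naming rule for variables given above can be checked mechanically. The following is a minimal Python sketch, assuming that 'alphanumeric' means ASCII letters and digits (the text does not say); `is_valid_name` is a hypothetical helper used only for illustration and is not part of the language.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_name(name: str) -> bool:
    """Return True if name uses only letters, digits or '_' and does not start with a digit."""
    return _NAME.fullmatch(name) is not None
```

Under this reading, `foo_1` and `_x` are accepted, while `1foo` and the empty string are rejected.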
    
A function is declared using the following syntax:
[scope] function <type> <name>[type variables]([arguments]) <code>;
scope and name are as for variable declarations, above. type specifies the return type of the function. The arguments are a list of variable declarations, as above, and are local to the function (public, private and protected are all equivalent scopes). code is a code block, explained later. The type variables argument is a comma-separated list of names enclosed in angle brackets '<>' that can be used as types by the reference type (in the arguments or the code itself), creating untyped references. Note that there is no function overloading - name must be unique. Only function arguments can be specified using type variables specified as part of the function name, and on calling the function, these are resolved using the types of the arguments passed to the function.
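One way to picture how type variables are resolved at a call is the Python sketch below. It is a simplified model, not the specified algorithm: types are represented as plain strings, a type-variable name in the parameter list stands for an argument declared as a reference to that variable, and treating a conflicting binding as an error is an assumption, since the text does not say what happens in that case. `resolve_type_variables` is a hypothetical name.

```python
def resolve_type_variables(type_variables, parameter_types, argument_types):
    """Bind each type variable to the type of the argument passed in its position.

    parameter_types uses a type-variable name wherever the declaration
    refers to that variable; every other entry is a concrete type name.
    """
    if len(parameter_types) != len(argument_types):
        raise TypeError("wrong number of arguments")
    bindings = {}
    for declared, actual in zip(parameter_types, argument_types):
        if declared in type_variables:
            # Assumption: a variable used twice must resolve to a single type.
            if bindings.setdefault(declared, actual) != actual:
                raise TypeError(f"conflicting types for {declared}")
        elif declared != actual:
            raise TypeError(f"expected {declared}, got {actual}")
    return bindings
```

For a function with type variable `T` and parameters `(T, integer)`, a call passing an object of type `foo` and an integer binds `T` to `foo`.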
An object class definition has the following syntax:
object <name>[type variables] [inherits] <declarations>;
Once an object has been defined, it can be used as a type. inherits is an optional inheritance specifier of the form 'inherits from <name>'. declarations is a declaration block, enclosed in square brackets, with the same format as the top-level declaration block. If new object types are defined in this block, they are considered to have a scope local to this object. The special variable 'this' is always implicitly declared in an object function definition, and points to the instance of the object that a function was called for (or nil if a static function was called with no object). All object functions are considered 'virtual' - that is, if an object reuses a function name used in one of the objects it inherits from, the new function replaces the old in all object instances created from that type.
A constant may be defined using the following syntax:
constant <name> = expression;
expression should contain only constants - no variables are permitted. A constant may not be assigned to, but is otherwise treated as a variable.
If many related constants need to be defined, the following syntax may be used:
enum <block>
block is a list of constant names enclosed in square brackets, each with the following syntax:
<name> [ = expression ];
If '= expression' is present, name is defined as a constant with the value of expression. If it is not present, name is assigned a value 1 greater than the previous constant in the enum block, or 0 if no such constant exists.
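The numbering rule for enum blocks can be sketched in Python as follows. This is only an illustration of the rule as stated: entries are modelled as (name, value) pairs with `None` standing for a missing '= expression', and the constant names in the usage note are made up.

```python
def assign_enum_values(entries):
    """Assign values to enum constants.

    entries is a list of (name, explicit_value_or_None) in declaration order.
    Returns a dict mapping each name to its integer value.
    """
    values = {}
    previous = None  # value of the previous constant, if any
    for name, explicit in entries:
        if explicit is not None:
            value = explicit          # '= expression' was given
        elif previous is None:
            value = 0                 # first constant, no explicit value
        else:
            value = previous + 1      # one greater than the previous constant
        values[name] = value
        previous = value
    return values
```

For example, `assign_enum_values([("RED", None), ("GREEN", 5), ("BLUE", None)])` yields `{"RED": 0, "GREEN": 5, "BLUE": 6}`.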
      Finally, the 'import' directive may be used to specify that
      definitions from another source file should be used. It has the
      syntax import "filename";, where filename is a string of
      characters indicating which file should be imported (it is
      illegal to form a loop in this way, importing a file that
      itself imports the original file).
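A compiler could detect such import loops with a depth-first walk over the import graph. The Python sketch below assumes that longer loops (A imports B, B imports C, C imports A) are also illegal, which the text implies but does not state; the function name and the dictionary representation of the import graph are hypothetical.

```python
def find_import_cycle(imports, start):
    """Look for an import loop reachable from start.

    imports maps a file name to the list of files it imports.
    Returns the list of files forming a loop, or None if there is none.
    """
    path = []        # files on the current import chain, in order
    on_path = set()  # same files, for fast membership tests
    done = set()     # files already checked and known to be loop-free

    def visit(name):
        if name in on_path:
            # name is imported again while still being processed: a loop.
            return path[path.index(name):] + [name]
        if name in done:
            return None
        path.append(name)
        on_path.add(name)
        for dependency in imports.get(name, []):
            cycle = visit(dependency)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.remove(name)
        done.add(name)
        return None

    return visit(start)
```

With `{"a": ["b"], "b": ["c"], "c": ["a"]}` and start file `"a"`, the function reports the chain `["a", "b", "c", "a"]`.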
    
The compiler will make an initial pass through the source file to include any imports and to work out the complete list of object, variable and function names. Thus there is no such thing as a 'forward' declaration, as an object, variable or function can be referenced before it is defined.
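The effect of the initial pass can be illustrated with a small Python sketch. It models each declaration as a name plus the names it refers to, which is a simplification chosen for this example; `unresolved_references` is a hypothetical helper.

```python
def unresolved_references(declarations):
    """Report references to names that are declared nowhere in the file.

    declarations is a list of (name, referenced_names) in source order.
    Returns a list of (declaring_name, missing_name) pairs.
    """
    # Pass 1: collect every declared name before looking at any reference.
    declared = {name for name, _ in declarations}
    # Pass 2: check references; source order no longer matters.
    return [
        (name, reference)
        for name, references in declarations
        for reference in references
        if reference not in declared
    ]
```

Because all names are collected first, a function that calls another function declared later in the file produces no error.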
Code blocks are enclosed by square brackets []. They begin with a list of variable declarations (though only 'static' has any effect on a variable's scope). This is followed by a list of statements, each of which is terminated by a semicolon. A statement may have one of the following forms:
variable = expression;
variable is a variable name - the syntax of these is given later. The type of the expression must match the type of the variable.
loop(expression) [next codeblock1] codeblock2;
codeblock1 and codeblock2 must be valid code blocks. expression is evaluated, and if it is true, codeblock2 is executed, followed by codeblock1, before control passes back to the start of the loop. The 'next' part is optional, and if omitted, codeblock1 is treated as empty.
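The control flow of the loop statement can be modelled in Python as below. This is a sketch under two assumptions: 'true' means non-zero, as for the if statement, and an omitted 'next' block behaves like an empty block. The blocks are modelled as zero-argument callables, and all names are illustrative.

```python
def run_loop(condition, body, next_block=None):
    """Model of loop(expression) [next codeblock1] codeblock2."""
    while condition() != 0:          # assumption: non-zero counts as true
        body()                       # codeblock2
        if next_block is not None:
            next_block()             # codeblock1 runs after the body


# Usage: add up 0..3, advancing the counter in the 'next' block.
state = {"i": 0, "total": 0}


def condition():
    return 1 if state["i"] < 4 else 0


def body():
    state["total"] += state["i"]


def advance():
    state["i"] += 1


run_loop(condition, body, advance)
```

Keeping the counter update in the 'next' block plays the same role as the third clause of a C for loop.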
if (expression) codeblock [else codeblock];
expression is evaluated, and codeblock executed if the result is non-zero. There may also be an optional 'else codeblock' part, which is executed in the case where expression is zero.
<type> <name> -> codeblock;
If an exception of type type is thrown anywhere in the enclosing code block, a variable name is created, assigned the value of the exception, and the codeblock is executed. The program will then exit the enclosing code block.
throw expression;
Throws an exception with the same type as the expression. The execution system unwinds the call stack until it finds a routine that handles an exception of the appropriate type.
delete <name>;
Deletes the variable referred to by name. The variable must be a reference (remember that objects and arrays are implicitly declared as references). More precisely, delete decrements the reference count of the referenced object, and removes the object only when the count reaches 0.
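The reference-counting behaviour of delete can be sketched with a toy Python model. The text does not say when counts are incremented, so the `share` method below (one increment per additional reference) is an assumption, and the class and method names are hypothetical.

```python
class Heap:
    """Toy model of reference-counted storage."""

    def __init__(self):
        self.counts = {}   # object id -> number of live references
        self.next_id = 0

    def new(self):
        """Allocate an object holding one reference and return its id."""
        obj = self.next_id
        self.next_id += 1
        self.counts[obj] = 1
        return obj

    def share(self, obj):
        """Assumption: assigning the object to another reference adds one count."""
        self.counts[obj] += 1
        return obj

    def delete(self, obj):
        """Drop one reference. Returns True if the object was actually removed."""
        self.counts[obj] -= 1
        if self.counts[obj] == 0:
            del self.counts[obj]
            return True
        return False
```

In this model, deleting one of two references to the same object leaves the object alive; only the second delete removes it.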
codeblock;
Code blocks can be nested, to allow temporary declaration of variables within a function, or to continue after an exception.
An expression may be a number (which gives it type integer), a decimal number (which gives it type real), a string enclosed in quotes (which gives it type array[] of byte), or one of the following:
| sizeof(expression) | expression must evaluate to an array type. This evaluates to an integer giving the number of elements in the array. | 
| real(expression) | expression must evaluate to an integer type. This evaluates to a real value with the same value as expression. | 
| (expression) | Evaluates to the value of expression | 
| variable | Evaluates to the value of variable. If variable is a reference used in a context where a reference is not appropriate, it is automatically dereferenced: references can only be compared for equality and assigned - all other usages will result in automatic dereferencing. | 
| expression1 + expression2 | If the expressions are both of type integer (or are both of type real), performs integer or real addition. If the expressions are arrays, this creates a new array with the first elements consisting of those from the first expression, and the remaining elements consisting of the elements from the second. If the first expression is an object, then this evaluates to expression1.Add(expression2). This is invalid if either expression is a reference. | 
| expression1 - expression2 | If the expressions are both of type integer (or are both of type real), performs integer or real subtraction. If the first expression is an object, then this evaluates to expression1.Sub(expression2). This is invalid if either expression is a reference or array | 
| expression1 * expression2 | If the expressions are both of type integer (or are both of type real), performs integer or real multiplication. If the first expression is an object, then this evaluates to expression1.Mul(expression2). This is invalid if either expression is a reference or array | 
| expression1 / expression2 | If the expressions are both of type integer (or are both of type real), performs integer or real division. If the first expression is an object, then this evaluates to expression1.Div(expression2). This is invalid if either expression is a reference or array | 
| expression1 % expression2 | If the expressions are both of type integer (or are both of type real), performs integer or real modulo. If the first expression is an object, then this evaluates to expression1.Mod(expression2). This is invalid if either expression is a reference or array | 
| expression1 == expression2 | Evaluates to 1 if the expressions are equal, or 0 if the expressions are not. For references this checks that the two expressions reference the same object. For objects, this evaluates to the result of (expression1.Compare(expression2)==0) | 
| expression1 != expression2 | Evaluates to 1 if the expressions are not equal, or 0 if they are. For references this checks that the two expressions reference different objects. For objects, this evaluates to the result of (expression1.Compare(expression2)!=0) | 
| expression1 > expression2 | Evaluates to 1 if expression1 is greater than expression2 or 0 otherwise. For objects, this evaluates to the result of (expression1.Compare(expression2)>0). This is invalid for references | 
| expression1 >= expression2 | Evaluates to 1 if expression1 is greater than or equal to expression2 or 0 otherwise. For objects, this evaluates to the result of (expression1.Compare(expression2)>=0) | 
| expression1 < expression2 | Evaluates to 1 if expression1 is less than expression2 or 0 otherwise. For objects, this evaluates to the result of (expression1.Compare(expression2)<0) | 
| expression1 <= expression2 | Evaluates to 1 if expression1 is less than or equal to expression2 or 0 otherwise. For objects, this evaluates to the result of (expression1.Compare(expression2)<=0) | 
| function(arguments) | Calls the given function with the supplied arguments. The arguments are a comma-separated list of expressions, whose types must match the types specified by the function definition. This evaluates to the return value of the function | 
| nil | This value is a reference to nothing - it represents a reference that hasn't been initialised with a value yet. | 
| new(type) | This allocates space for an object of type type and returns a reference to it. | 
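The way the comparison operators are defined for objects, each one testing the result of Compare against 0, can be sketched in Python as follows. The `Version` class is a made-up object type for this illustration, and the convention that Compare returns a negative, zero or positive integer is an assumption consistent with the table but not stated in it.

```python
import operator

# Each comparison operator is applied to the result of Compare and 0.
_COMPARISONS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}


def compare_objects(op, left, right):
    """Evaluate 'left op right' for objects: 1 if Compare(...) op 0 holds, else 0."""
    return 1 if _COMPARISONS[op](left.Compare(right), 0) else 0


class Version:
    """Hypothetical object type used only for this illustration."""

    def __init__(self, number):
        self.number = number

    def Compare(self, other):
        # Assumption: negative, zero or positive, in the style of C's strcmp.
        return self.number - other.number
```

An object type therefore only needs to supply one Compare function to support all six comparison operators.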
The infix operators defined above have the following priority, from highest to lowest.
| () | 
| * / % | 
| + - | 
| <= < > >= == != | 
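The priority table can be turned into a small expression parser using precedence climbing. The Python sketch below handles integer literals and parentheses only, and it assumes that operators of equal priority group left to right, which the text does not specify. All function names are illustrative.

```python
import re

# Binding strength taken from the priority table: larger binds tighter.
PRECEDENCE = {
    "*": 3, "/": 3, "%": 3,
    "+": 2, "-": 2,
    "<=": 1, "<": 1, ">": 1, ">=": 1, "==": 1, "!=": 1,
}

TOKEN = re.compile(r"\s*(\d+|<=|>=|==|!=|[-+*/%<>()])")


def tokenize(text):
    """Split text into integer literals, operators and parentheses."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens, min_prec=1):
    """Precedence climbing; returns an int or a nested tuple (op, left, right)."""
    left = parse_primary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Assumption: operators of equal priority group left to right.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left


def parse_primary(tokens):
    """Parse an integer literal or a parenthesised expression."""
    if not tokens:
        raise SyntaxError("unexpected end of expression")
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return inner
    if token.isdigit():
        return int(token)
    raise SyntaxError(f"unexpected token {token!r}")
```

For example, `parse(tokenize("1 + 2 * 3 < 10"))` returns `("<", ("+", 1, ("*", 2, 3)), 10)`: multiplication binds tightest, then addition, then the comparison.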
A variable can be a reference, a dereference, an object, an object's field or a global variable name. The syntax of a variable name is as follows:
| &variable | Dereference variable. When read, this has the value of the variable that is being referenced. When written, the value is stored in the referenced variable rather than changing the reference itself. | 
| variable1.variable2 | This references a field in an object. variable1 specifies the object and variable2 specifies the field in that object. If variable1 is a reference, it is dereferenced (as all object variables are actually references, a dereference is therefore always performed). | 
| variable[expression] | variable must be an array or an array reference. If used as part of an expression, this evaluates to the value at the position specified by the (integer) expression. If used as part of an assignment, this stores the value at the position specified by the (integer) expression | 
These operators have the following priorities (highest first):
| & | 
| [] | 
| . | 
In order to work out which variable a name refers to, the compiler should first check the variables declared in the current code block, then the containing code block and so on until it reaches the