A compilation project for third-year students of IMT Atlantique (formerly Telecom Bretagne)
The specification of the project can be found here (Authorization Required)
- How to clone the project
- The project structure
- How to build the compiler
- How to execute the compiler
- How to test the Compiler
- How to contribute to the Project
- To do list
- Problems
- Contributors
Using git clone
git clone https://redmine-df.telecom-bretagne.eu/git/f2b304_compiler_cn
then enter your username and password at the command prompt
The project is divided into two parts
- phase1: the first deliverable, which contains the lexical and syntax analyzers for the minijava language
- phase2: the second deliverable, which contains type checking, compilation and execution for the minijava language, built on the lexical and syntax analyzers provided by the professors
To build or execute the compiler, please enter one of the two deliverables, either phase1 or phase2
Using cd ./phase1 or cd ./phase2
Then use the shell script build to build the compiler
./build
or use ocamlbuild to build the compiler
ocamlbuild Main.byte
Notes for contributors
The main file is Main/Main.ml; it should not be modified. It opens the given file, creates a lexing buffer, initializes the location and calls the compile function of the module Main/compile.ml. It is this function that you should modify to call your parser.
Use the shell script minijavac to execute the compiler
./minijavac <filename>
or use the following command to build and then execute the compiler on the given file named <filename>
ocamlbuild Main.byte -- <filename>
By default, the program searches for a file with the extension .java and appends it to the given filename if it does not end with it.
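The default-extension rule above can be sketched in a few lines of OCaml. This is only an illustration: the function name with_java_extension is hypothetical and is not taken from the project sources.

```ocaml
(* Hypothetical helper: append ".java" unless the name already ends with it. *)
let with_java_extension (filename : string) : string =
  if Filename.check_suffix filename ".java" then filename
  else filename ^ ".java"

let () =
  (* Both calls print "Foo.java". *)
  print_endline (with_java_extension "Foo");
  print_endline (with_java_extension "Foo.java")
```

With such a rule, ./minijavac Foo and ./minijavac Foo.java would refer to the same file.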
Use the shell script test to test the compiler
./test
It will execute Main.byte on all files in the directory Evaluator
If you are a team member of the project, please review the Guidelines for Contributing to this repository in order to make appropriate contributions
Deadline 15/01/2018
- Line Terminators
- Input Elements and Tokens
- White Space
- Comments
- Identifiers
- Keywords
  - for
  - while
  - else
  - if
- Literals
  - Int
  - String
- Separators
  - brace
  - parenthesis
  - dot
  - comma
  - semicolon
- Operators
  - = : Simple Assignment Operator
  - + - * / % : Arithmetic Operators
  - + - ++ -- ! : Unary Operators
  - == != > < <= >= : Equality and Relational Operators
  - && || : Conditional Operators
- Keywords
  - class
  - static
  - extends
  - return
  - new
- Classes
  - Class Declaration
    - simple class declaration
    - simple class declaration with extends
  - Field Declarations
    - static Fields
    - non-static Fields
  - Method Declarations
    - static Methods
    - non-static Methods
Deadline 25/02/2018
Note for the reviewer: [ ] denotes an item that still needs to be done, while [x] denotes an item that has been done
- The construction of the class definition environment. This environment contains the type of methods for each class. This phase ignores the attributes (which are not visible outside the class) and the method bodies.
  - create a class definition environment type called class_env; it contains 4 fields as follows
    - methods: a Hashtbl that maps from method name to method return type and argument type
    - constructors: a Hashtbl that maps from constructor name to class reference type and argument type
    - attributes: a Hashtbl that maps from attribute name to attribute type (declared type)
    - parent: a class reference type that refers to its parent class
  - create a Hashtbl that maps from class
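The class definition environment described above could look roughly like the following record. This is a minimal sketch under several assumptions: the type ty is a simplified stand-in for the project's own type representation, the global table is assumed to be keyed by class name, and the names empty_class_env, class_table and add_method are hypothetical.

```ocaml
(* Simplified stand-in for the project's type representation (assumption). *)
type ty = Int | Boolean | Void | Ref of string

exception MethodAlreadyExists of string

(* The four fields listed above. *)
type class_env = {
  methods : (string, ty * ty list) Hashtbl.t;      (* name -> return type, argument types *)
  constructors : (string, ty * ty list) Hashtbl.t; (* name -> class reference type, argument types *)
  attributes : (string, ty) Hashtbl.t;             (* name -> declared type *)
  parent : ty;                                     (* reference to the parent class *)
}

let empty_class_env (parent : ty) : class_env = {
  methods = Hashtbl.create 16;
  constructors = Hashtbl.create 16;
  attributes = Hashtbl.create 16;
  parent;
}

(* Assumed: a global table keyed by class name. *)
let class_table : (string, class_env) Hashtbl.t = Hashtbl.create 16

(* Hypothetical helper: register a method, rejecting duplicates. *)
let add_method (env : class_env) (name : string) (signature : ty * ty list) : unit =
  if Hashtbl.mem env.methods name then raise (MethodAlreadyExists name)
  else Hashtbl.add env.methods name signature

let () =
  let env = empty_class_env (Ref "Object") in
  add_method env "getX" (Int, []);
  Hashtbl.add class_table "Point" env
```

Keying methods by name alone means a second method with the same name is rejected, which is consistent with overloading not being supported in the type checker.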
- The second phase verifies that the inside of each class is correct (mainly the bodies of methods). It also checks that the top-level expressions are correct.
  - create 3 verification methods that verify the following aspects of the program
    - verify_methods that checks the type of methods
      - create a local definition environment type called current_env that contains 4 fields as follows
        - returntype: the declared return type of the method
        - variables: a Hashtbl that maps from local variable name to local variable declared type
        - this_class: the id of the class
        - env_type: a string that identifies the type of the local definition environment; it can be constructor, method or attribute. In this case, the env_type is method
      - write a verification method (verify_declared_args) that checks the declared type of variables in the method argument list
        - check whether a duplicate local variable exists
      - write a verification method (verify_statement) that checks the body of the method
        - check variable declaration statement
        - check block of statements
        - check expression
        - check return statement when it's none, ex: return;
        - check return statement when it's not none, ex: return x;
        - check throw statement
          - it does not check whether the exception type or a supertype of that exception type is mentioned in a throws clause in the declaration of the method; this should be checked in compiling
        - check while statement
        - check if without else statement
        - check if with else statement
        - check for statement
        - check try statement
    - verify_constructors that checks the type of constructors
      - same as verify_methods, except for the following minor differences
        - returntype in the local definition environment current_env is a reference to the class it belongs to
        - env_type in the local definition environment current_env is constructor
        - the check of the return statement in verify_statement is slightly different, since constructors can have return; but not something like return x;
    - verify_attributes that checks the type of attributes
      - create a local definition environment type called current_env; it contains 4 fields as follows
        - returntype: since attributes have no return value, it is set to Type.Void
        - variables: a Hashtbl that maps from local variable name to local variable declared type
        - this_class: the id of the current class
        - env_type: which is attribute here
      - write a verification method (verify_expression) that checks the declared type of an expression. In verify_expression:
        - check New expression type, which instantiates a class
        - check NewArray expression type, which declares an array like: new int[5]
        - check Call expression type, which calls a method. The this keyword is not checked when calling a method. For the moment, it only supports the case where the class name already exists in the class_en hashtable.
        - check Attr expression type, which accesses an attribute
        - check If expression type
        - check Val expression type, which is a primitive type like int, string... in an expression
        - check Name expression type, which represents a variable
        - check ArrayInit expression type, which initializes an array like {1,2,3}
        - check Array expression type (TODO). This part has not been done yet
        - check AssignExp expression type, which compares the types in an assignment operation
        - check Post expression type, which covers postfix operations like: a++, b--...
        - check Pre expression type, which covers prefix operations like: !a, ~b...
        - check Op expression type, which covers binary operations like: ||, &&, +, -...
        - check CondOp expression type, which is the conditional operation, like a ? b : c
        - check Cast expression type
        - check Type expression type
        - check ClassOf expression type
        - check Instanceof expression type
        - check VoidClass expression
      - write a verification method (verify_assignop_type) that checks that the declared type of an attribute matches the type of the expression. It has three inputs:
        - t1: the type of an attribute
        - t2: the type of the corresponding expression of an attribute
        - op: the type of operation, here Type.Assign
  - add support for the this keyword within a class in order to do type checking like this.a = 5;
  - add locations to exception messages in order to locate errors
- add support to overload methods and constructors
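The local definition environment and two of the checks listed above (duplicate arguments, and the return statement in constructors versus methods) can be sketched as follows. This is an illustration under stated assumptions, not the project's code: ty and string_of_ty are simplified stand-ins, verify_return is a hypothetical name, and the case where the returned type differs from the declared one is left out.

```ocaml
(* Simplified stand-in for the project's types (assumption). *)
type ty = Int | Boolean | Void | Ref of string

exception DuplicateLocalVariable of string
exception IncompatibleTypes of string

(* The four fields of the local definition environment. *)
type current_env = {
  returntype : ty;
  variables : (string, ty) Hashtbl.t;
  this_class : string;
  env_type : string; (* "constructor", "method" or "attribute" *)
}

let string_of_ty = function
  | Int -> "int"
  | Boolean -> "boolean"
  | Void -> "void"
  | Ref c -> c

(* Add each declared argument to the environment, rejecting duplicates. *)
let verify_declared_args (env : current_env) (args : (ty * string) list) : unit =
  List.iter
    (fun (t, id) ->
      if Hashtbl.mem env.variables id then
        raise (DuplicateLocalVariable (string_of_ty t ^ " " ^ id))
      else Hashtbl.add env.variables id t)
    args

(* Hypothetical helper: [actual] is None for "return;" and Some t for "return x;". *)
let verify_return (env : current_env) (actual : ty option) : unit =
  match env.env_type, actual with
  | "constructor", Some _ -> raise (IncompatibleTypes "unexpected return value")
  | "method", None when env.returntype <> Void ->
      raise (IncompatibleTypes "missing return value")
  | _ -> ()

let () =
  let env =
    { returntype = Int; variables = Hashtbl.create 8;
      this_class = "Point"; env_type = "method" } in
  verify_declared_args env [ (Int, "x"); (Boolean, "flag") ];
  verify_return env (Some Int)
```

Sharing one environment type between methods, constructors and attributes keeps verify_statement generic; only the env_type field changes how the return statement is judged.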
- ArgumentAlreadyExists
  - when a duplicated argument is found in a constructor argument list -> ArgumentAlreadyExists("[pident of argument]")
  - when a duplicated argument is found in a method argument list -> ArgumentAlreadyExists("[pident of argument]")
- ArgumentTypeNotExiste
  - when the arguments of a called function do not correspond to any declared method
- ArgumentTypeNotMatch
  - when the arguments of a called function do not match a declared method -> ArgumentTypeNotMatch("Arguments' type in "^meth_name^" not match")
- AttributeAlreadyExists
  - when a duplicated attribute is found in the class definition environment -> AttributeAlreadyExists("[aname of attribute]")
- ClassAlreadyExists
- ConstructorAlreadyExists
  - when a duplicated constructor is found in the class definition environment -> ConstructorAlreadyExists("[cname of constructor]")
- DuplicateLocalVariable
  - when a duplicated variable is found in a variable declaration statement (VarDecl) -> DuplicateLocalVariable("[declared type] [variable id]")
  - when a duplicated variable is found in the init part of a for loop statement (For(fil,eo,el,s)) -> DuplicateLocalVariable("[declared type] [variable id]")
  - also raised when a duplicated variable is found in a variable declaration statement in the body of a block, if, if else, for or while statement -> DuplicateLocalVariable("[declared type] [variable id]")
- IncompatibleTypes
  - when a constructor tries to return a variable -> IncompatibleTypes("unexpected return value")
  - when a method return does not contain a variable -> IncompatibleTypes("missing return value")
  - when a method return type does not correspond to the declared one -> IncompatibleTypes("missing return value")
  - when the condition in an if statement is not boolean -> IncompatibleTypes("[actual type] cannot be converted to boolean")
  - when the condition in an if else statement is not boolean -> IncompatibleTypes("[actual type] cannot be converted to boolean")
  - when the loop condition in a for statement is not boolean -> IncompatibleTypes("[actual type] cannot be converted to boolean")
  - when the loop condition in a while statement is not boolean -> IncompatibleTypes("[actual type] cannot be converted to boolean")
- InvalidMethodDeclaration
  - when a method declaration does not have a return type -> InvalidMethodDeclaration("return type required")
- MethodAlreadyExists
  - when a duplicated method is found in the class definition environment -> MethodAlreadyExists("[mname of method]")
- UnknownActualType
  - when the actual type of a variable cannot be determined in a variable declaration statement (VarDecl) -> UnknownActualType("[edesc] don't have type information")
  - when the actual type of a variable cannot be determined in the init part of a for loop statement (For(fil,eo,el,s)) -> UnknownActualType("[edesc] don't have type information")
  - when the actual type of a variable cannot be determined in the condition part of a while loop statement (While(e,s)) -> UnknownActualType("[edesc]: unknow type in while condition")
  - when the actual type of a variable cannot be determined in the condition part of an if statement (If(e,s,None)) -> UnknownActualType("[edesc]: unknow type in if condition")
  - when the actual type of a variable cannot be determined in the condition part of an if else statement (If(e,s,Some s2)) -> UnknownActualType("[edesc]: unknow type in if else condition")
- UnknownVariable
  - when the variable does not exist in the current environment or the global environment -> UnknownVariable("[variable_name]")
- UnknownClass
- UnknownMethod
- WrongTypePrefixOperation
  - when the operand type of a prefix operation does not match -> WrongTypePrefixOperation("[operation, expr]")
- WrongTypePostfixOperation
  - when the operand type of a postfix operation does not match -> WrongTypePostfixOperation("[operation, expr]")
- WrongInvokedArgumentsLength
  - when actual and formal argument lists differ in length -> WrongInvokedArgumentsLength()
- WrongTypesAssignOperation
  - when the types in an assignment operation do not match -> WrongTypesAssignOperation("[expr1_type, op, expr2_type]")
- WrongTypesOperation
  - when the types in an operation do not match -> WrongTypesAssignOperation("[expr1_type, op, expr2_type]")
- errors related to overloading
- errors related to generic types
- errors related to the this keyword
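As an illustration of how two of the exceptions above could be raised, here is a minimal sketch of a condition check shared by if, while and for statements. The function verify_condition, its labelled arguments and the simplified type ty are assumptions made for this example; the message strings are copied from the list above, including their original spelling.

```ocaml
(* Simplified stand-in for the project's types (assumption). *)
type ty = Int | Boolean | Str | Ref of string

exception IncompatibleTypes of string
exception UnknownActualType of string

let string_of_ty = function
  | Int -> "int"
  | Boolean -> "boolean"
  | Str -> "String"
  | Ref c -> c

(* Hypothetical helper. [etype] is the type computed for the condition,
   or None when it could not be determined. [context] is "if", "if else"
   or "while"; [edesc] is a printable form of the expression. *)
let verify_condition ~(context : string) ~(edesc : string) (etype : ty option) : unit =
  match etype with
  | None ->
      raise (UnknownActualType (edesc ^ ": unknow type in " ^ context ^ " condition"))
  | Some Boolean -> ()
  | Some t ->
      raise (IncompatibleTypes (string_of_ty t ^ " cannot be converted to boolean"))

let () =
  (* A boolean condition is accepted silently. *)
  verify_condition ~context:"while" ~edesc:"i < 10" (Some Boolean)
```

Carrying the message in the exception payload lets the caller print it directly, and a location could later be prepended to the same string.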
Evaluation and execution
Construction of the class descriptor table and the method table.
class descriptor table, named classTable : (string, globalClassDescriptor) Hashtbl.t
method table, named methodTable : (string, astmethod) Hashtbl.t
The bodies of the functions of all classes are saved in methodTable. All other contents of the classes are saved in classTable. The function name and the parameter types stored in the class descriptor are used to find the body of the function in methodTable.
functions:
- class descriptor of a class: func_name_typeOfpara1,typeOfpara2 maps to classname_func_name_typeOfpara1,typeOfpara2
- method table: classname_func_name_typeOfpara1,typeOfpara2 maps to astmethod
In this way, functions that have the same name but different parameter types are permitted in the compilation
constructors:
- name: typeOfpara1,typeOfpara2
- content: astconst
In this way, constructors with different parameter types are permitted
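The two-level lookup described above can be sketched as follows. The key format follows the text; everything else is an assumption for illustration: the record types are simplified stand-ins for astmethod and globalClassDescriptor, and the function names signature, register_method and find_method are hypothetical.

```ocaml
exception SameFunctionAlreadyDefined of string

(* Simplified stand-ins for the project's AST and descriptor types (assumption). *)
type astmethod = { mname : string; mbody : string }
type class_descriptor = {
  cname : string;
  cmethods : (string, string) Hashtbl.t; (* signature -> key in method_table *)
}

let method_table : (string, astmethod) Hashtbl.t = Hashtbl.create 16

(* "func_name_typeOfpara1,typeOfpara2" *)
let signature (func_name : string) (param_types : string list) : string =
  func_name ^ "_" ^ String.concat "," param_types

let register_method cd func_name param_types (m : astmethod) : unit =
  let sg = signature func_name param_types in
  if Hashtbl.mem cd.cmethods sg then raise (SameFunctionAlreadyDefined sg);
  (* "classname_func_name_typeOfpara1,typeOfpara2" *)
  let key = cd.cname ^ "_" ^ sg in
  Hashtbl.add cd.cmethods sg key;
  Hashtbl.add method_table key m

(* Raises Not_found when no method has this name and these parameter types. *)
let find_method cd func_name param_types : astmethod =
  Hashtbl.find method_table (Hashtbl.find cd.cmethods (signature func_name param_types))

let () =
  let cd = { cname = "A"; cmethods = Hashtbl.create 8 } in
  register_method cd "f" [ "int" ] { mname = "f"; mbody = "one argument" };
  register_method cd "f" [ "int"; "String" ] { mname = "f"; mbody = "two arguments" };
  (* Prints "two arguments". *)
  print_endline (find_method cd "f" [ "int"; "String" ]).mbody
```

Because the parameter types are part of the key, two functions named f with different parameter lists get distinct entries, which is what makes overloading possible at execution time.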
Please note that overriding is not supported by the type checking. To test overriding, please remove the type-checking function first.
- ParentClassNotDefined: raised when the parent class is not defined in the file
- SameFunctionAlreadyDefined: raised when functions of a class have the same name and the same parameter types
- SameFunctionConstructorsDefined: raised when constructors of a class have the same name and the same parameter types
- method overloading is not supported
- generic types are not supported
- typing related to the this keyword is not supported
- First part: Lexical and syntactic analyzers
- Expression: Shuwei ZHANG & Jinhai ZHOU
- Classes: Xiaofeng ZHOU & Keyu PU
- Second part: The Type-checking and the Execution
- Type-checking: Shuwei ZHANG & Jinhai ZHOU
- Execution: Xiaofeng ZHOU & Keyu PU
This work is licensed under a Creative Commons Attribution 4.0 International License.
