Generating Intermediate Code Using Syntax-Directed Translation in ANTLR












This question is less about a specific bug and more about a gap in my understanding.



I have this ANTLR grammar (a combined parser and lexer):



grammar Compiler;

prog
: Class Program '{' field_decls method_decls '}'
;

field_decls returns [String s1]
: field_decls field_decl ';'
{
$s1 = $field_decl.s2;
}
| field_decls inited_field_decl ';'
|
;

field_decl returns [String s2]
: field_decl ',' Ident
| field_decl ',' Ident '[' num ']'
| Type Ident
{
System.out.println($Ident.text);
$s2 = $Ident.text;
}
| Type Ident '[' num ']'
{
System.out.println($Ident.text+"["+"]");
$s2 = $Ident.text;
}
;

inited_field_decl
: Type Ident '=' literal
;

method_decls
: method_decls method_decl
|
;

method_decl
: Void Ident '(' params ')' block
| Type Ident '(' params ')' block
;

params
: Type Ident nextParams
|
;

nextParams
: ',' Type Ident nextParams
|
;

block
: '{' var_decls statements '}'
;

var_decls
: var_decls var_decl
|
;

var_decl
: Type Ident ';'
;

statements
: statement statements
|
;

statement
: location eqOp expr ';'
| If '(' expr ')' block
| If '(' expr ')' block Else block
| While '(' expr ')' statement
| Switch expr '{' cases '}'
| Ret ';'
| Ret '(' expr ')' ';'
| Brk ';'
| Cnt ';'
| block
| methodCall ';'
;

cases
: Case literal ':' statements cases
| Case literal ':' statements
;

methodCall
: Ident '(' args ')'
| Callout '(' Str calloutArgs ')'
;

args
: someArgs
|
;

someArgs
: someArgs ',' expr
| expr
;

calloutArgs
: calloutArgs ',' expr
| calloutArgs ',' Str
|
;

expr
: literal
| location
| '(' expr ')'
| SubOp expr
| '!' expr
| expr AddOp expr
| expr MulDiv expr
| expr SubOp expr
| expr RelOp expr
| expr AndOp expr
| expr OrOp expr
| methodCall
;

location
:Ident
| Ident '[' expr ']'
;

num
: DecNum
| HexNum
;

literal
: num
| Char
| BoolLit
;

eqOp
: '='
| AssignOp
;

//-----------------------------------------------------------------------------------------------------------
fragment Delim
: ' '
| '\t'
| '\n'
;

fragment Letter
: [a-zA-Z]
;

fragment Digit
: [0-9]
;

fragment HexDigit
: Digit
| [a-f]
| [A-F]
;

fragment Alpha
: Letter
| '_'
;

fragment AlphaNum
: Alpha
| Digit
;

WhiteSpace
: Delim+ -> skip
;

Char
: '\'' ~('\\') '\''
| '\'\\' . '\''
;

Str
:'"' ((~('\\' | '"')) | ('\\'.))* '"'
;

Class
: 'class'
;

Program
: 'Program'
;

Void
: 'void'
;

If
: 'if'
;

Else
: 'else'
;

While
: 'while'
;

Switch
: 'switch'
;

Case
: 'case'
;

Ret
: 'return'
;

Brk
: 'break'
;

Cnt
: 'continue'
;

Callout
: 'callout'
;

DecNum
: Digit+
;

HexNum
: '0x'HexDigit+
;

BoolLit
: 'true'
| 'false'
;

Type
: 'int'
| 'boolean'
;

Ident
: Alpha AlphaNum*
;

RelOp
: '<='
| '>='
| '<'
| '>'
| '=='
| '!='
;

AssignOp
: '+='
| '-='
;

MulDiv
: '*'
| '/'
| '%'
;

AddOp
: '+'
;

SubOp
: '-'
;

AndOp
: '&&'
;

OrOp
: '||'
;


Basically, we need to generate intermediate code using syntax-directed translation. As I understand it, this means adding semantic actions to the parser grammar. The generated output then has to be written to .csv files.



We have three files: symbols.csv, symtable.csv and instructions.csv.
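
As a rough sketch of the file-writing step (not part of the assignment), each table could be kept in memory as a list of already formatted rows and dumped at the end of the parse. The class name `CsvRows` and the method `writeRows` are placeholders I made up for illustration; the only library calls are standard `java.io` ones.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

// Hypothetical helper: writes one pre-formatted row per line.
final class CsvRows {
    private CsvRows() {}

    static void writeRows(Writer out, List<String> rows) {
        try {
            for (String row : rows) {
                out.write(row);
                out.write('\n');
            }
            out.flush();
        } catch (IOException e) {
            // Wrapped so that callers inside grammar actions need no throws clause.
            throw new UncheckedIOException(e);
        }
    }
}
```

For real files, a writer obtained from `Files.newBufferedWriter(Paths.get("symbols.csv"))` could be passed in and closed afterwards.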



In symbols.csv, the format of each row is:



int id; //serial no. of symbol, unique
int tabid; //id no. of symbol table
string name; //symbol name
enum types {INT, CHAR, BOOL, STR, VOID, LABEL, INVALID} ty; //symbol type
enum scope {GLOBAL, LOCAL, CONST, INVALID} sc; //symbol scope
boolean isArray; //is it an array variable
int arrSize; //array size, if applicable
boolean isInited; //is initialized
union initVal {
int i;
boolean b;
} in; //initial value, if applicable
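
A rough Java sketch of how one such row might be modeled is shown here. The names `Symbol`, `SymType` and `SymScope` are my own placeholders, and storing the union as a single int (with booleans as 0 or 1) is an assumption rather than something the assignment states.

```java
// Hypothetical model of one symbols.csv row; all names are illustrative.
enum SymType { INT, CHAR, BOOL, STR, VOID, LABEL, INVALID }

enum SymScope { GLOBAL, LOCAL, CONST, INVALID }

final class Symbol {
    final int id;           // serial no. of symbol, unique
    final int tabId;        // id no. of the owning symbol table
    final String name;      // symbol name
    final SymType type;     // symbol type
    final SymScope scope;   // symbol scope
    final boolean isArray;  // is it an array variable
    final int arrSize;      // array size, if applicable
    final boolean isInited; // is initialized
    final int initVal;      // assumption: a boolean initial value is stored as 0 or 1

    Symbol(int id, int tabId, String name, SymType type, SymScope scope,
           boolean isArray, int arrSize, boolean isInited, int initVal) {
        this.id = id;
        this.tabId = tabId;
        this.name = name;
        this.type = type;
        this.scope = scope;
        this.isArray = isArray;
        this.arrSize = arrSize;
        this.isInited = isInited;
        this.initVal = initVal;
    }

    // Fields in the order of the row format, each followed by a comma.
    String toCsvRow() {
        return id + ", " + tabId + ", " + name + ", " + type + ", " + scope + ", "
                + isArray + ", " + arrSize + ", " + isInited + ", " + initVal + ",";
    }
}
```

With this sketch, `new Symbol(4, 0, "w", SymType.INT, SymScope.GLOBAL, false, 0, true, 0).toCsvRow()` yields `4, 0, w, INT, GLOBAL, false, 0, true, 0,`.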


In symtable.csv, the format of each row is:



int id; //symbol table serial no., unique
int parent; //parent symbol table serial no.


In instructions.csv, the format of each row is:



int id; //serial no., unique
int res; //serial no. of result symbol
enum opcode {ADD, SUB, MUL, DIV, NEG, READ, WRITE, ASSIGN, GOTO, LT, GT, LE, GE, EQ, NE, PARAM, CALL, RET, LABEL} opc; //operation type
int op1; //serial no. of first operand symbol
int op2; //serial no. of second operand symbol
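
A possible Java shape for these rows is sketched here. `Instruction`, `InstructionList`, `emit` and `patchResult` are hypothetical names. The patch method assumes that the res column of a jump row holds a target instruction number that may only be known later (for example, the exit of a while loop); that is my reading of the format, not something it states. The trailing `#comment` column is likewise an assumption about how a row is annotated.

```java
import java.util.ArrayList;
import java.util.List;

enum Opcode { ADD, SUB, MUL, DIV, NEG, READ, WRITE, ASSIGN, GOTO,
              LT, GT, LE, GE, EQ, NE, PARAM, CALL, RET, LABEL }

// Hypothetical model of one instructions.csv row.
final class Instruction {
    final int id;         // serial no., unique
    int res;              // result symbol, or (assumed) jump target; patchable
    final Opcode opc;     // operation type
    final int op1;        // first operand symbol, -1 if unused
    final int op2;        // second operand symbol, -1 if unused
    final String comment; // free-text annotation printed after '#'

    Instruction(int id, int res, Opcode opc, int op1, int op2, String comment) {
        this.id = id;
        this.res = res;
        this.opc = opc;
        this.op1 = op1;
        this.op2 = op2;
        this.comment = comment;
    }

    String toCsvRow() {
        return id + ", " + res + ", " + opc + ", " + op1 + ", " + op2 + ", #" + comment;
    }
}

// Append-only instruction buffer; ids are assigned in emission order.
final class InstructionList {
    private final List<Instruction> rows = new ArrayList<>();

    // Appends a row and returns its serial number.
    int emit(int res, Opcode opc, int op1, int op2, String comment) {
        Instruction ins = new Instruction(rows.size(), res, opc, op1, op2, comment);
        rows.add(ins);
        return ins.id;
    }

    // Fills in a result/target that was not known when the row was emitted.
    void patchResult(int id, int target) {
        rows.get(id).res = target;
    }

    String toCsv() {
        StringBuilder sb = new StringBuilder();
        for (Instruction ins : rows) {
            sb.append(ins.toCsvRow()).append('\n');
        }
        return sb.toString();
    }
}
```

A grammar action would then call `emit(...)` as each statement or expression is recognized and keep the returned id around if the row needs patching later.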


As an example, let's say we have this input:



class Program {
    int x;
    int y, z;
    int w = 0;
    void main (int n) {
        int a;
        a = 0;
        while (a < n) {
            int n;
            n = a + 1;
            a = n;
        }
        callout("printf", "n = %d\n", n);
        return n;
    }
}


symbols.csv should look like this:



0, 0, x, INT, GLOBAL, false, 0, false, 0,
1, 0, y, INT, GLOBAL, false, 0, false, 0,
2, 0, z, INT, GLOBAL, false, 0, false, 0,
3, 0, 0, INT, CONST, false, 0, false, 0,
4, 0, w, INT, GLOBAL, false, 0, true, 0,
5, 0, main, LABEL, GLOBAL, false, 0, false, 0,
6, 1, n, INT, LOCAL, false, 0, false, 0,
7, 1, a, INT, LOCAL, false, 0, false, 0,
8, 1, 0, INT, CONST, false, 0, false, 0,
9, 2, n, INT, LOCAL, false, 0, false, 0,
10, 2, 1, INT, CONST, false, 0, false, 0,
11, 1, "printf", STR, CONST, false, 0, false, 0,
12, 1, "n = %d\n", STR, CONST, false, 0, false, 0,
13, 1, 2, INT, CONST, false, 0, false, 0,


symtables.csv should look like this:



0, -1,
1, 0,
2, 1,
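
The parent links above suggest a stack of scopes: a new table is opened when a method or block is entered and closed when it is left. A minimal sketch of such a "symstack" follows. `Scope`, `ScopeStack` and their methods are placeholder names of mine, and mapping names to symbol ids inside each scope is an assumption about how lookups would work.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical symbol table: an id, a parent id, and name -> symbol id.
final class Scope {
    final int id;
    final int parent;
    final Map<String, Integer> symbols = new HashMap<>();

    Scope(int id, int parent) {
        this.id = id;
        this.parent = parent;
    }
}

final class ScopeStack {
    private final Deque<Scope> open = new ArrayDeque<>(); // currently open scopes
    private final List<Scope> all = new ArrayList<>();    // every scope ever created

    // Opens a new scope whose parent is the innermost open scope (-1 for the first).
    int enter() {
        int parent = open.isEmpty() ? -1 : open.peek().id;
        Scope s = new Scope(all.size(), parent);
        all.add(s);
        open.push(s);
        return s.id;
    }

    void leave() {
        open.pop();
    }

    void declare(String name, int symbolId) {
        open.peek().symbols.put(name, symbolId);
    }

    // Searches from the innermost scope outwards; returns -1 if the name is unknown.
    int lookup(String name) {
        for (Scope s : open) { // ArrayDeque iterates from the most recently pushed element
            Integer id = s.symbols.get(name);
            if (id != null) {
                return id;
            }
        }
        return -1;
    }

    String toCsv() {
        StringBuilder sb = new StringBuilder();
        for (Scope s : all) {
            sb.append(s.id).append(", ").append(s.parent).append(",\n");
        }
        return sb.toString();
    }
}
```

Calling `enter()` three times in a row (class, method, while block) produces exactly the three rows shown above, and a lookup of `n` inside the while block finds the inner declaration before the parameter.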


instructions.csv should look like this:



0, 4, ASSIGN, 3, -1, #w = 0
1, 5, LABEL, -1, -1, #main:
2, 7, ASSIGN, 8, -1, #a = 0
3, 5, LT, 7, 6, #if a<n goto 5
4, 8, GE, 7, 6, #iffalse a<n goto 8
5, 9, ADD, 7, 10, #n = a + 1
6, 7, ASSIGN, 9, -1, #a = n
7, 2, GOTO, -1, -1, #goto 3
8, -1, PARAM, 12, -1, #"n = %d\n"
9, -1, PARAM, 6, -1, #n
10, -1, CALL, 11, 13, #callout("printf", "n = %d\n", n);
11, -1, RET, 6, -1, # return n


Simply put, I am not sure where to start. I understand that I must add semantic actions to my parser grammar to produce output like the examples above. From my own research, I also gather that I need to create Java classes for the symbols, the symbol table and a symbol-table stack (symstack). I am very new to ANTLR and would appreciate it if someone experienced with it could point me in the right direction.



Thank you in advance for any help.



P.S. My lexer and parser are based on the tiny C-like language posted below.



Tiny C-Like Language:



program
:'class Program {'field_decl* method_decl*'}'

field_decl
: type (id | id'['int_literal']') ( ',' id | id'['int_literal']')*';'
| type id '=' literal ';'

method_decl
: (type | 'void') id'('( (type id) ( ','type id)*)? ')'block

block
: '{'var_decl* statement*'}'

var_decl
: type id(','id)* ';'

type
: 'int'
| 'boolean'

statement
: location assign_op expr';'
| method_call';'
| 'if ('expr')' block ('else' block )?
| 'switch' expr '{'('case' literal ':' statement*)+'}'
| 'while (' expr ')' statement
| 'return' ( expr )? ';'
| 'break ;'
| 'continue ;'
| block

assign_op
: '='
| '+='
| '-='

method_call
: method_name '(' (expr ( ',' expr )*)? ')'
| 'callout (' string_literal ( ',' callout_arg )* ')'

method_name
: id

location
: id
| id '[' expr ']'

expr
: location
| method_call
| literal
| expr bin_op expr
| '-' expr
| '!' expr
| '(' expr ')'

callout_arg
: expr
| string_literal

bin_op
: arith_op
| rel_op
| eq_op
| cond_op

arith_op
: '+'
| '-'
| '*'
| '/'
| '%'

rel_op
: '<'
| '>'
| '<='
| '>='

eq_op
: '=='
| '!='

cond_op
: '&&'
| '||'

literal
: int_literal
| char_literal
| bool_literal

id
: alpha alpha_num*

alpha
: ['a'-'z''A'-'Z''_']

alpha_num
: alpha
| digit

digit
: ['0'-'9']

hex_digit
: digit
| ['a'-'f''A'-'F']

int_literal
: decimal_literal
| hex_literal

decimal_literal
: digit+

hex_literal
: '0x' hex_digit+

bool_literal
: 'true'
| 'false'

char_literal
: '‘'char'’'

string_literal
: '“'char*'”'









      So, this question isn't necessarily a problem I have, but rather a lack of understanding.



      I have this ANTLR code (which comprises of a parser and lexer):



      grammar Compiler;

      prog
      : Class Program '{' field_decls method_decls '}'
      ;

      field_decls returns [String s1]
      : field_decls field_decl ';'
      {
      $s1 = $field_decl.s2;
      }
      | field_decls inited_field_decl ';'
      |
      ;

      field_decl returns [String s2]
      : field_decl ',' Ident
      | field_decl ',' Ident '[' num ']'
      | Type Ident
      {
      System.out.println($Ident.text);
      $s2 = $Ident.text;
      }
      | Type Ident '[' num ']'
      {
      System.out.println($Ident.text+"["+"]");
      $s2 = $Ident.text;
      }
      ;

      inited_field_decl
      : Type Ident '=' literal
      ;

      method_decls
      : method_decls method_decl
      |
      ;

      method_decl
      : Void Ident '(' params ')' block
      | Type Ident '(' params ')' block
      ;

      params
      : Type Ident nextParams
      |
      ;

      nextParams
      : ',' Type Ident nextParams
      |
      ;

      block
      : '{' var_decls statements '}'
      ;

      var_decls
      : var_decls var_decl
      |
      ;

      var_decl
      : Type Ident ';'
      ;

      statements
      : statement statements
      |
      ;

      statement
      : location eqOp expr ';'
      | If '(' expr ')' block
      | If '(' expr ')' block Else block
      | While '(' expr ')' statement
      | Switch expr '{' cases '}'
      | Ret ';'
      | Ret '(' expr ')' ';'
      | Brk ';'
      | Cnt ';'
      | block
      | methodCall ';'
      ;

      cases
      : Case literal ':' statements cases
      | Case literal ':' statements
      ;

      methodCall
      : Ident '(' args ')'
      | Callout '(' Str calloutArgs ')'
      ;

      args
      : someArgs
      |
      ;

      someArgs
      : someArgs ',' expr
      | expr
      ;

      calloutArgs
      : calloutArgs ',' expr
      | calloutArgs ',' Str
      |
      ;

      expr
      : literal
      | location
      | '(' expr ')'
      | SubOp expr
      | '!' expr
      | expr AddOp expr
      | expr MulDiv expr
      | expr SubOp expr
      | expr RelOp expr
      | expr AndOp expr
      | expr OrOp expr
      | methodCall
      ;

      location
      :Ident
      | Ident '[' expr ']'
      ;

      num
      : DecNum
      | HexNum
      ;

      literal
      : num
      | Char
      | BoolLit
      ;

      eqOp
      : '='
      | AssignOp
      ;

      //-----------------------------------------------------------------------------------------------------------
      fragment Delim
      : ' '
      | 't'
      | 'n'
      ;

      fragment Letter
      : [a-zA-Z]
      ;

      fragment Digit
      : [0-9]
      ;

      fragment HexDigit
      : Digit
      | [a-f]
      | [A-F]
      ;

      fragment Alpha
      : Letter
      | '_'
      ;

      fragment AlphaNum
      : Alpha
      | Digit
      ;

      WhiteSpace
      : Delim+ -> skip
      ;

      Char
      : ''' ~('\') '''
      | ''\' . '''
      ;

      Str
      :'"' ((~('\' | '"')) | ('\'.))* '"'
      ;

      Class
      : 'class'
      ;

      Program
      : 'Program'
      ;

      Void
      : 'void'
      ;

      If
      : 'if'
      ;

      Else
      : 'else'
      ;

      While
      : 'while'
      ;

      Switch
      : 'switch'
      ;

      Case
      : 'case'
      ;

      Ret
      : 'return'
      ;

      Brk
      : 'break'
      ;

      Cnt
      : 'continue'
      ;

      Callout
      : 'callout'
      ;

      DecNum
      : Digit+
      ;

      HexNum
      : '0x'HexDigit+
      ;

      BoolLit
      : 'true'
      | 'false'
      ;

      Type
      : 'int'
      | 'boolean'
      ;

      Ident
      : Alpha AlphaNum*
      ;

      RelOp
      : '<='
      | '>='
      | '<'
      | '>'
      | '=='
      | '!='
      ;

      AssignOp
      : '+='
      | '-='
      ;

      MulDiv
      : '*'
      | '/'
      | '%'
      ;

      AddOp
      : '+'
      ;

      SubOp
      : '-'
      ;

      AndOp
      : '&&'
      ;

      OrOp
      : '||'
      ;


      And basically, we need to generate intermediate code using syntax directed translation. By my knowledge, this means that we must add semantic rules to the parser grammar. We need to take the output generated and encapsulate it into .csv files.



      So, we have three files: symbols.csv, symtable.csv and instructions.csv



      In symbols.csv, the format of each row is:



      int id; //serial no. of symbol, unique
      int tabid; //id no. of symbol table
      string name; //symbol name
      enum types {INT, CHAR, BOOL, STR, VOID, LABEL, INVALID} ty; //symbol type
      enum scope {GLOBAL, LOCAL, CONST, INVALID} sc; //symbol scope
      boolean isArray; //is it an array variable
      int arrSize; //array size, if applicable
      boolean isInited; //is initialized
      union initVal {
      int i;
      boolean b;
      } in; //initial value, if applicable


      In symtable.csv, the format of each row is:



      int id; //symbol table serial no., unique
      int parent; //parent symbol table serial no.
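
Because every table only stores the id of its parent, nested scopes can be kept as a chain. The sketch below is one possible way to do that, assuming a new table is opened on entering a block and closed on leaving it; `ScopeTables` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical scoped symbol tables; each table remembers the id of its parent.
final class ScopeTables {
    private final List<Integer> parents = new ArrayList<>();             // index = table id
    private final List<Map<String, Integer>> names = new ArrayList<>();  // name -> symbol id
    private int current = -1;                                             // -1 means no table is open

    // Opens a new table whose parent is the currently open one; returns its id.
    int enterScope() {
        parents.add(current);
        names.add(new HashMap<>());
        current = parents.size() - 1;
        return current;
    }

    // Closes the current table and makes its parent current again.
    void exitScope() {
        if (current == -1) throw new IllegalStateException("no open scope");
        current = parents.get(current);
    }

    // Registers a name in the current table.
    void declare(String name, int symbolId) {
        names.get(current).put(name, symbolId);
    }

    // Searches the current table first, then its ancestors; returns -1 if not found.
    int lookup(String name) {
        for (int t = current; t != -1; t = parents.get(t)) {
            Integer id = names.get(t).get(name);
            if (id != null) return id;
        }
        return -1;
    }

    // One "id, parent," row per table, in creation order.
    List<String> toCsvRows() {
        List<String> rows = new ArrayList<>();
        for (int t = 0; t < parents.size(); t++) {
            rows.add(t + ", " + parents.get(t) + ",");
        }
        return rows;
    }
}
```

With this layout, an inner declaration such as the second `n` in the sample program shadows the outer one while its block is open, and the outer one becomes visible again after `exitScope()`.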


      In instructions.csv, the format of each row is:



      int id; //serial no., unique
      int res; //serial no. of result symbol
      enum opcode {ADD, SUB, MUL, DIV, NEG, READ, WRITE, ASSIGN, GOTO, LT, GT, LE, GE, EQ, NE, PARAM, CALL, RET, LABEL} opc; //operation type
      int op1; //serial no. of first operand symbol
      int op2; //serial no. of second operand symbol
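
A small sketch of an instruction list that hands out serial numbers as rows are appended. `InstructionList` and `emit` are hypothetical names; the opcode list is copied from the row description, and `-1` is assumed to mark an unused field, as in the example output.

```java
import java.util.ArrayList;
import java.util.List;

// Opcodes copied from the instructions.csv row description.
enum Opcode {
    ADD, SUB, MUL, DIV, NEG, READ, WRITE, ASSIGN, GOTO,
    LT, GT, LE, GE, EQ, NE, PARAM, CALL, RET, LABEL
}

// Hypothetical append-only list of instructions.csv rows.
final class InstructionList {
    private final List<String> rows = new ArrayList<>();

    // Appends one instruction and returns its serial number.
    int emit(int res, Opcode opc, int op1, int op2) {
        int id = rows.size();
        rows.add(id + ", " + res + ", " + opc + ", " + op1 + ", " + op2 + ",");
        return id;
    }

    // Returns the rows in emission order.
    List<String> rows() {
        return rows;
    }
}
```

Returning the serial number from `emit` lets the caller remember where a loop test or a label was emitted, which is useful when a later jump has to refer back to it.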


      As an example, let's say we have this input:



      class Program {
      int x;
      int y, z;
      int w = 0;
      void main (int n) {
      int a;
      a = 0;
      while (a < n) {
      int n;
      n = a + 1;
      a = n;
      }
      callout("printf", "n = %d\n", n);
      return n;
      }
      }


      symbols.csv should look like this:



      0, 0, x, INT, GLOBAL, false, 0, false, 0,
      1, 0, y, INT, GLOBAL, false, 0, false, 0,
      2, 0, z, INT, GLOBAL, false, 0, false, 0,
      3, 0, 0, INT, CONST, false, 0, false, 0,
      4, 0, w, INT, GLOBAL, false, 0, true, 0,
      5, 0, main, LABEL, GLOBAL, false, 0, false, 0,
      6, 1, n, INT, LOCAL, false, 0, false, 0,
      7, 1, a, INT, LOCAL, false, 0, false, 0,
      8, 1, 0, INT, CONST, false, 0, false, 0,
      9, 2, n, INT, LOCAL, false, 0, false, 0,
      10, 2, 1, INT, CONST, false, 0, false, 0,
      11, 1, "printf", STR, CONST, false, 0, false, 0,
      12, 1, "n = %d\n", STR, CONST, false, 0, false, 0,
      13, 1, 2, INT, CONST, false, 0, false, 0,


      symtable.csv should look like this:



      0, -1,
      1, 0,
      2, 1,


      instructions.csv should look like this:



      0, 4, ASSIGN, 3, -1, #w = 0
      1, 5, LABEL, -1, -1, #main:
      2, 7, ASSIGN, 8, -1, #a = 0
      3, 5, LT, 7, 6, #if a<n goto 5
      4, 8, GE, 7, 6, #iffalse a<n goto 8
      5, 9, ADD, 7, 10, #n = a + 1
      6, 7, ASSIGN, 9, -1, #a = n
      7, 2, GOTO, -1, -1, #goto 3
      8, -1, PARAM, 12, -1, #"n = %d\n"
      9, -1, PARAM, 6, -1, #n
      10, -1, CALL, 11, 13, #callout("printf", "n = %d\n", n);
      11, -1, RET, 6, -1, # return n
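
In the rows above, the `res` field of the LT and GE instructions appears to hold the id of the instruction to jump to, and the exit target of the loop is not known until the body has been emitted. One common way to handle that is backpatching: emit the jump with a placeholder and fill it in later. The sketch below assumes that reading of the rows; `PatchableCode` and `WhileLoopDemo` are hypothetical names, opcodes are plain strings for brevity, and the GOTO row follows the `#goto 3` comment instead of the `2` shown in its `res` column.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical instruction list whose jump targets can be patched after emission.
final class PatchableCode {
    private final List<int[]> fields = new ArrayList<>();    // {res, op1, op2}
    private final List<String> opcodes = new ArrayList<>();

    // Appends one instruction and returns its serial number.
    int emit(int res, String opc, int op1, int op2) {
        fields.add(new int[] { res, op1, op2 });
        opcodes.add(opc);
        return fields.size() - 1;
    }

    // Serial number the next emitted instruction will receive.
    int nextId() {
        return fields.size();
    }

    // Overwrites the res field (the jump target) of an earlier instruction.
    void patchTarget(int instrId, int target) {
        fields.get(instrId)[0] = target;
    }

    String row(int id) {
        int[] f = fields.get(id);
        return id + ", " + f[0] + ", " + opcodes.get(id) + ", " + f[1] + ", " + f[2] + ",";
    }
}

final class WhileLoopDemo {
    // Emits "while (a < n) { n = a + 1; a = n; }" with the symbol ids of the example.
    static PatchableCode build() {
        PatchableCode code = new PatchableCode();
        code.emit(4, "ASSIGN", 3, -1);               // 0: w = 0
        code.emit(5, "LABEL", -1, -1);               // 1: main:
        code.emit(7, "ASSIGN", 8, -1);               // 2: a = 0
        int test = code.nextId();                    // the loop test starts here
        code.emit(test + 2, "LT", 7, 6);             // 3: if a < n goto the body
        int exitJump = code.emit(-1, "GE", 7, 6);    // 4: exit target not known yet
        code.emit(9, "ADD", 7, 10);                  // 5: n = a + 1
        code.emit(7, "ASSIGN", 9, -1);               // 6: a = n
        code.emit(test, "GOTO", -1, -1);             // 7: jump back to the test
        code.patchTarget(exitJump, code.nextId());   // the exit is the next instruction
        return code;
    }
}
```

The same pattern extends to `if`/`else`: remember the id of the conditional jump, emit the branch, then patch the jump once the id of the instruction after the branch is known.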


      Simply put, I am not sure where to start. I understand that I must add semantic rules to my parser grammar to produce output like the examples above. From my own research, I also gather that I need to create Java classes for the symbols, the symbol table, and a symbol-table stack (symstack). I am very new to ANTLR and would appreciate it if someone experienced could point me in the right direction.
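
One way to keep grammar actions short is to move the real work into a plain Java helper and call it from the action. The sketch below is a hypothetical helper for global field declarations only; `FieldActions`, `declareField`, and `parseNum` are made-up names. An action such as `{ fields.declareField($Type.text, $Ident.text, null); }` could call it, assuming a `fields` member has been declared in the grammar's `@members` section.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that grammar actions could call; it is not part of ANTLR.
final class FieldActions {
    private final List<String> rows = new ArrayList<>();
    private int nextId = 0;

    // Converts the text of a DecNum or HexNum token to an int.
    static int parseNum(String text) {
        return text.startsWith("0x")
                ? Integer.parseInt(text.substring(2), 16)
                : Integer.parseInt(text);
    }

    // Records a global field in table 0; arrSizeText is null for a scalar.
    int declareField(String typeText, String name, String arrSizeText) {
        String ty = typeText.equals("int") ? "INT" : "BOOL";
        boolean isArray = arrSizeText != null;
        int size = isArray ? parseNum(arrSizeText) : 0;
        int id = nextId++;
        rows.add(id + ", 0, " + name + ", " + ty + ", GLOBAL, "
                + isArray + ", " + size + ", false, 0,");
        return id;
    }

    // Rows collected so far, in declaration order.
    List<String> rows() {
        return rows;
    }
}
```

Keeping this logic outside the grammar also makes it possible to test it without running the parser at all.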



      Thank you in advance for any help.



      P.S. My lexer and parser are based on a tiny C-like language, whose grammar is posted below.



      Tiny C-Like Language:



      program
      :'class Program {'field_decl* method_decl*'}'

      field_decl
      : type (id | id '[' int_literal ']') (',' (id | id '[' int_literal ']'))* ';'
      | type id '=' literal ';'

      method_decl
      : (type | 'void') id'('( (type id) ( ','type id)*)? ')'block

      block
      : '{'var_decl* statement*'}'

      var_decl
      : type id(','id)* ';'

      type
      : 'int'
      | 'boolean'

      statement
      : location assign_op expr';'
      | method_call';'
      | 'if ('expr')' block ('else' block )?
      | 'switch' expr '{'('case' literal ':' statement*)+'}'
      | 'while (' expr ')' statement
      | 'return' ( expr )? ';'
      | 'break ;'
      | 'continue ;'
      | block

      assign_op
      : '='
      | '+='
      | '-='

      method_call
      : method_name '(' (expr ( ',' expr )*)? ')'
      | 'callout (' string_literal ( ',' callout_arg )* ')'

      method_name
      : id

      location
      : id
      | id '[' expr ']'

      expr
      : location
      | method_call
      | literal
      | expr bin_op expr
      | '-' expr
      | '!' expr
      | '(' expr ')'

      callout_arg
      : expr
      | string_literal

      bin_op
      : arith_op
      | rel_op
      | eq_op
      | cond_op

      arith_op
      : '+'
      | '-'
      | '*'
      | '/'
      | '%'

      rel_op
      : '<'
      | '>'
      | '<='
      | '>='

      eq_op
      : '=='
      | '!='

      cond_op
      : '&&'
      | '||'

      literal
      : int_literal
      | char_literal
      | bool_literal

      id
      : alpha alpha_num*

      alpha
      : ['a'-'z''A'-'Z''_']

      alpha_num
      : alpha
      | digit

      digit
      : ['0'-'9']

      hex_digit
      : digit
      | ['a'-'f''A'-'F']

      int_literal
      : decimal_literal
      | hex_literal

      decimal_literal
      : digit+

      hex_literal
      : '0x' hex_digit+

      bool_literal
      : 'true'
      | 'false'

      char_literal
      : '\'' char '\''

      string_literal
      : '"' char* '"'






      csv compiler-construction antlr






      asked Nov 24 '18 at 6:40









      J.Khelly

























          1 Answer














          This depends on which version of ANTLR you're using:

          • In ANTLR 3:
            • The most common approach is to use tree construction instructions to create a (modified) parse tree / AST, then walk through that tree as needed.
            • A less common approach is to embed actions (in the target language) directly into grammar rules to capture and interpret the parsed input.
          • In ANTLR 4, you use a Listener or a Visitor to process the parsed input.
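
To illustrate the visitor idea without depending on generated code, here is a hedged, self-contained sketch that walks a tiny hand-built tree and emits three-address code. The node classes and `CodeGenVisitor` are hypothetical stand-ins; with ANTLR 4 and the `-visitor` option, the generated `CompilerBaseVisitor<T>` and the rule context classes would play these roles instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written stand-in for a parse tree; ANTLR would generate its own node classes.
interface Node {
    <T> T accept(NodeVisitor<T> v);
}

interface NodeVisitor<T> {
    T visitName(Name n);
    T visitAdd(Add n);
}

final class Name implements Node {
    final String text;
    Name(String text) { this.text = text; }
    public <T> T accept(NodeVisitor<T> v) { return v.visitName(this); }
}

final class Add implements Node {
    final Node left;
    final Node right;
    Add(Node left, Node right) { this.left = left; this.right = right; }
    public <T> T accept(NodeVisitor<T> v) { return v.visitAdd(this); }
}

// Visits the children first, then emits one instruction per Add node.
final class CodeGenVisitor implements NodeVisitor<String> {
    final List<String> code = new ArrayList<>();
    private int temps = 0;

    // A name needs no code; it is its own "address".
    public String visitName(Name n) {
        return n.text;
    }

    // Returns the temporary that holds the result of the addition.
    public String visitAdd(Add n) {
        String l = n.left.accept(this);
        String r = n.right.accept(this);
        String t = "t" + temps++;
        code.add(t + " = " + l + " ADD " + r);
        return t;
    }
}
```

Each visit method returns the place where its result lives, which is the synthesized attribute that syntax-directed translation passes up the tree.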
















                answered Nov 25 '18 at 21:09









                Jiri Tousek
































