As part of my research I am building a programming language with ANTLR4, and I have spent a whole day trying to fix a problem where two words seem to be treated as one token once whitespace is skipped.
This is my ANTLR grammar:
grammar Grammar;
start: (statement ';')*;
//needs expressions extension
statement
: variable
| //class
| if
| function
| loop
| functionCall
| show
;
variable
: TYPE ID ('=' VAR_TYPE)?
| ...
;
array
: TYPE ID '[]' ('=' '[' VAR_TYPE (',' VAR_TYPE)* ']')?
;
//needs expressions extension
function
: (ACCESS TYPE ID '(' ID* ')' '{'
(
variable
| if
| loop
| functionCall
) 'return' VAR_TYPE
'}')
| (ACCESS 'void' ID '(' ID* ')' '{'
(
variable
| if
| loop
| functionCall
)
'}')
;
//needs expressions extension
if: 'if' (ID | VAR_TYPE) COMPARISON (ID | VAR_TYPE) ':'
(
'\t' variable
| '\t' if
| '\t' loop
| '\t' functionCall
| '\t' show
)*
('else if' (ID | VAR_TYPE) COMPARISON (ID | VAR_TYPE) ':'
(
'\t' variable
| '\t' if
| '\t' loop
| '\t' functionCall
| '\t' show
)*
)*
('else' ':'
(
'\t' variable
| '\t' if
| '\t' loop
| '\t' functionCall
| '\t' show
)*
)?
;
loop: 'foreach' ID 'in' ID ':'
(
'\t' variable
| '\t' if
| '\t' loop
| '\t' functionCall
| '\t' show
)*
;
functionCall: (ID '.')? ID '()';
//needs expressions extension
show: 'show' '(' (ID | VAR_TYPE)? ('+' (ID | VAR_TYPE))* ')';
ACCESS: 'private' | 'public';
COMPARISON: '>' | '<' | '>=' | '<=' | '==';
TYPE: 'int' | 'float' | 'string';
VAR_TYPE: STRING | INT | BOOL | FLOAT | ID;
ID: [a-zA-Z_][a-zA-Z0-9_]* ;
STRING : '"' .*? '"' ;
INT : [0-9]+ ;
BOOL : 'true' | 'false' ;
FLOAT : [0-9]+ '.' [0-9]+ ;
WS : [ \t\r\n]+ -> skip;
This is what the console prints when the parse tree is built:
line 1:7 no viable alternative at input 'stringname'
line 2:4 no viable alternative at input 'intage'
And here is the input.txt file that I am parsing with this grammar:
string name;
int age;
bool sex;
string children[];
public string returnPerson() {
return "Name " + name + "\nAge " + age + "\nSex " + sex + "\n";
}
public bool isMinor() {
if age > 17:
return false;
else:
return true;
}
public void showChildren() {
int i = 0;
foreach child in children:
show("Children №" + (i + 1) + ": " + child + "\n");
}
I don't know what to do with this. I have whitespace handled by the WS rule, but the parser still seems to think the two words are one token. Also, from the output tree I can see that it doesn't get any further than the first two lines of input.txt.
Please help me fix this problem.
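One possible reading of these errors, offered as a sketch rather than a confirmed diagnosis: the quoted text in a "no viable alternative" message is built by joining the text of the tokens involved, and skipped whitespace never reaches the parser, so 'stringname' may well be two separate tokens. An ANTLR lexer takes the longest match and, on a tie, the rule declared first. Because VAR_TYPE is declared before ID and also accepts identifiers, "name" would come out as VAR_TYPE, which the alternative TYPE ID cannot accept. The Python snippet below is a simplified stand-in for that matching rule, not ANTLR itself. The tokenize helper, the SEMI name for the ';' literal, and the FIXED_RULES list are assumptions made for illustration.

```python
import re

IDENT = r"[a-zA-Z_][a-zA-Z0-9_]*"

# Lexer rules in the order they appear in the grammar above.
# SEMI is a hypothetical name for the implicit ';' literal.
RULES = [
    ("ACCESS", ["private", "public"]),
    ("COMPARISON", [">=", "<=", "==", ">", "<"]),
    ("TYPE", ["int", "float", "string"]),
    ("VAR_TYPE", [r'"[^"]*"', r"[0-9]+\.[0-9]+", r"[0-9]+", "true", "false", IDENT]),
    ("ID", [IDENT]),
    ("SEMI", [";"]),
    ("WS", [r"[ \t\r\n]+"]),
]

# Assumed alternative: no VAR_TYPE lexer rule, literals as separate tokens,
# and BOOL placed before ID so that 'true' and 'false' are not lexed as ID.
FIXED_RULES = [
    ("ACCESS", ["private", "public"]),
    ("COMPARISON", [">=", "<=", "==", ">", "<"]),
    ("TYPE", ["int", "float", "string"]),
    ("BOOL", ["true", "false"]),
    ("ID", [IDENT]),
    ("STRING", [r'"[^"]*"']),
    ("FLOAT", [r"[0-9]+\.[0-9]+"]),
    ("INT", [r"[0-9]+"]),
    ("SEMI", [";"]),
    ("WS", [r"[ \t\r\n]+"]),
]


def tokenize(text, rules=RULES):
    """Longest match wins; on equal length the earlier rule wins. WS is dropped."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, alternatives in rules:
            for alt in alternatives:
                m = re.compile(alt).match(text, pos)
                # Strictly longer only, so an earlier rule keeps a tie.
                if m and m.end() - pos > best_len:
                    best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("string name;"))
# [('TYPE', 'string'), ('VAR_TYPE', 'name'), ('SEMI', ';')]
print(tokenize("string name;", FIXED_RULES))
# [('TYPE', 'string'), ('ID', 'name'), ('SEMI', ';')]
```

If this reading is right, turning VAR_TYPE into a parser rule (for example value: STRING | INT | BOOL | FLOAT | ID;) and moving BOOL above ID would let identifiers reach the parser as ID. Separately, input.txt uses "bool" as a type, but TYPE only lists 'int', 'float' and 'string'.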