Variable declaration and scope #150

Closed
cardillan opened this issue Sep 24, 2024 · 4 comments
Labels
enhancement New feature or request


@cardillan
Owner

cardillan commented Sep 24, 2024

See also #148 and #149.

New variable declaration syntax needs to support the following variable types and initialization:

  • Variable and constant visibility (global scope)
    • public is visible among all modules.
    • private is only visible inside the declaring module.
    • default (no public/private) is accessible from other modules, but only by using the full module name or through an import.
  • Internal variables (processor-based)
    • Local variables. Declaration is optional, but should be possible. Might be required in strict compilation setting.
      • Scope: enclosing code block. If there are two non-nested for loops with the same control variable declared in the loop statement, the variable will have to be declared twice. The names will be mangled to distinguish the two instances of the variable.
    • Global variables
      • Global variables need to be declared. If they were created just by assigning a value to them in the global scope (as in e.g. Python), we'd still need explicit declaration in case we want to have an uninitialized global variable.
      • Public global variable names are not mangled.
      • Default/private global variable names in the main module are not mangled if there's no collision with a public or imported default global variable from another module.
      • Mangling is done by prepending a prefix using module name or module identifier, e.g. __m01_X is a private global variable X in the first module.
    • Arrays
      • The same rules as global variables.
      • We need to devise a way to name the variables holding array elements and apply the same name mangling to these variables as for globals.
  • External variables (stored in memory cell or memory bank)
    • External variables or arrays
    • May be allocated on the heap (no additional information) or at a specific memory block/index
    • No direct representation as a processor variable --> no need for name mangling
  • Variable initialization
    • Is a must for internal arrays; when allowed for arrays, it makes sense to allow initialization for all internal variables.
    • Initialization of external variables is problematic and won't be initially supported:
      • The memory blocks might not be built when the initialization runs.
      • The variables pointing to memory blocks might not be set up.
  • Parameters
    • Only public. Private doesn't make sense, as parameters must be prominently visible in the compiled code.

Mindcode compiler will emit a mapping of variable names to mlog identifiers for reference.
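
The mangling rules for global variables listed above can be sketched roughly as follows. This is only an illustration in Python, not the compiler's implementation: `mangle_global`, `name_map`, the `foreign_names` set and the use of index 0 for the main module are all assumptions made for the example. Only the `__m01_X` prefix format comes from the text.

```python
def mangle_global(name, visibility, module_index, is_main=False, foreign_names=frozenset()):
    """Return an mlog identifier for a global variable (hypothetical helper).

    'visibility' is "public", "private" or "default"; 'foreign_names' holds
    public or imported default names coming from other modules.
    """
    if visibility == "public":
        return name                          # public names are never mangled
    if is_main and name not in foreign_names:
        return name                          # main-module name without a collision
    return f"__m{module_index:02d}_{name}"   # e.g. __m01_X


def name_map(declarations, module_index, is_main=False, foreign_names=frozenset()):
    """Map each declared source name to its mlog identifier."""
    return {
        name: mangle_global(name, visibility, module_index, is_main, foreign_names)
        for name, visibility in declarations
    }


if __name__ == "__main__":
    # Print a source-name to mlog-name mapping for one module.
    declared = [("X", "private"), ("Y", "public")]
    for source, mlog in name_map(declared, module_index=1).items():
        print(f"{source} -> {mlog}")
```

A table like the one printed here is the kind of mapping the compiler could emit for reference.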

We need a homogeneous, intuitive syntax covering the requirements given above. A C-style syntax comes to mind.

  • Visibility: default (no keyword), public or private.
  • Modifiers volatile, aliased, restricted and linked may be used on internal variables. var is optional in that case.
    • (Maybe modifiers could be applied to variables separately from the declaration, using, say, declare?)
  • Keyword var will be required. If types are introduced one day, a type name would be used in place of var.
  • var not needed (or outright forbidden, probably forbidden) for
    • external variables, as they're always numeric and can't be anything else,
    • linked variables, as they're always blocks and can't be anything else.

Examples of declarations:

public var a;
public var b = 10;
private var c;
private var d = 10;
public var e[10];
public var f[] = (1, 2, 3);
private var g[10];
private var h[] = (1, 2, 3);
public external volatile i;                 // allocated on the heap
private external j[10];                     // private array with 10 elements allocated on the heap
private external cell1[1] k;                // private variable stored in cell1[1]  
external cell1[11 .. 20] l[10];             // range size must match array size
external cell1[21] m[20];                   // it is possible to specify only the starting index 
external cell1[41 .. 60] n[];               // array size derived from range size

// Multiple declarations/initializations
var o, p = 10, q[] = (2, 4, 6), r[3];       // default visibility
external restricted s, t[5];                // all allocated on the heap
external cell1[5] u, v[3], w[6];            // all stored in cell1 in this order starting at slot 5
external aliased cell1[6 .. 12] x, y, z[5]; // range size must match total allocated size; all variables are aliased
external bank1 d1[256], d2[256];            // If no index given, starts at 0

external volatile bank3;                    // marks bank3 as volatile
external restricted cell5[0 .. 15];         // marks slots 0 .. 15 as restricted

// Maybe one day, a long, long time from now: declare a large array over several banks
// A loop over this array would become four loops, one per bank
external (bank1, bank2, bank3, bank4) array[2048];
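
The "range size must match" rule in the examples above could be checked along these lines. This is a hedged sketch: `layout` is a hypothetical helper, and it assumes that `..` ranges are inclusive (as the `cell1[11 .. 20] l[10]` example implies) and that variables are placed consecutively in declaration order.

```python
def layout(first, sizes, last=None):
    """Return the starting slot of each variable, in declaration order.

    'sizes' lists the slot count per variable (1 for a scalar, n for an
    array of n elements). 'last' is the inclusive end of the range, or
    None when only the starting index was given.
    """
    needed = sum(sizes)
    if last is not None and last - first + 1 != needed:
        raise ValueError(
            f"range {first} .. {last} holds {last - first + 1} slots, "
            f"but the declaration needs {needed}"
        )
    starts = []
    slot = first
    for size in sizes:
        starts.append(slot)   # this variable begins at the current slot
        slot += size          # the next variable follows immediately
    return starts


if __name__ == "__main__":
    # external aliased cell1[6 .. 12] x, y, z[5];
    print(layout(6, [1, 1, 5], last=12))
    # external cell1[5] u, v[3], w[6];
    print(layout(5, [1, 3, 6]))
```

Passing `last=None` models the form where only the starting index is given, in which case there is nothing to validate against.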

Generic syntax

Internal variables:

[public|private] var identifier[<array-decl>] [= init] (, identifier[<array-decl>] [= init])* ;

External variables:

[public|private] external [volatile|aliased|restricted] <ext_location> [var] identifier[<array-decl>] (, identifier[<array-decl>])* ;


[public|private] external [volatile|aliased|restricted] <ext_location>;

Linked variables:

[public] linked [var] identifier (, identifier)* ;

Linked variables are automatically public in global namespace. No initialization.

Not all of this will be supported from the get-go, but it is important to get the syntax right.

@cardillan cardillan added the enhancement New feature or request label Sep 24, 2024
@cardillan
Owner Author

Currently I'm leaning toward limiting the scope of local variables to the block they're declared in. For example, I want this code to generate a compile-time error:

for var i in 1 ... 10 do
    for var i in a, b, c do
        ...
    end;
end;

but this one to compile:

for var i in 1 ... 10 do
    ...
end;

for var i in a, b, c do
    ...
end;

That wouldn't be possible if the variable's scope were the whole function.

The two instances of the same variable should probably get compiled to different mlog variables, so that the second instance doesn't "inherit" the last value of the previous instance. I'm not entirely decided about this yet.
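
A minimal sketch of how such block scoping might be tracked is given below. `BlockScopes`, `ScopeError` and the `.1` suffix for later instances are hypothetical names chosen for illustration. The sketch covers local block scopes only (hiding of globals is not modelled), and it assumes the undecided option in which each instance gets its own mlog variable.

```python
class ScopeError(Exception):
    """Raised when a name is redeclared in the same or an enclosing block."""


class BlockScopes:
    def __init__(self):
        self._stack = [{}]      # innermost block is last; starts at the function body
        self._instances = {}    # how many times each name has been declared so far

    def enter(self):
        """Open a nested code block."""
        self._stack.append({})

    def leave(self):
        """Close the innermost code block, dropping its declarations."""
        self._stack.pop()

    def declare(self, name):
        """Declare 'name' in the innermost block and return a unique name for it."""
        if any(name in block for block in self._stack):
            raise ScopeError(f"variable '{name}' is already declared in this scope")
        count = self._instances.get(name, 0)
        self._instances[name] = count + 1
        # Hypothetical naming: the first instance keeps its name, later ones get a suffix.
        unique = name if count == 0 else f"{name}.{count}"
        self._stack[-1][name] = unique
        return unique


if __name__ == "__main__":
    scopes = BlockScopes()
    scopes.enter()                 # first loop
    print(scopes.declare("i"))
    scopes.leave()
    scopes.enter()                 # second, non-nested loop
    print(scopes.declare("i"))     # accepted, and mapped to a different name
    scopes.leave()
```

Declaring `i` again while the first loop's block is still open raises `ScopeError`, which corresponds to the nested example that should not compile.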

@cardillan
Owner Author

Version 3.0.0 will probably support optional variable declaration as an experimental feature.

Three kinds of variables will be available:

  • Standard variables: var variable1, variable2 = 10;
    • Corresponding to Mindustry processor variables
    • Name must not map to a recognized linked block name
    • Can be initialized at declaration to any expression
    • Allowed in global, main or local scope
      • To create a main scope, use begin ... end;
      • Multiple main scopes can be created; variables declared in one main scope are accessible in all other main scopes
    • Valid in the scope in which they were declared
  • Linked variables: linked cell1, message1, switch2;
    • Corresponding to Mindustry linked blocks
      • A warning will be generated when the linked block name isn't recognized
    • Allowed only in the global context (are therefore necessarily global)
    • Cannot be initialized
  • External variables: external externalVariable = 5, $externalVariable;
    • Assigned to a free memory slot from the "heap"
    • The $ prefix for external variables is allowed, but is optional
    • Allowed only in the global context (are therefore necessarily global)
    • Can be initialized
    • Support for cached modifier (a keyword): external cached $A, $B = 10;
      • Cached external variables are only read once from the memory at the declaration (or, if initialized, the initial value is written).
      • The variable value is held in a dedicated mlog variable ("shadow variable", named identically to the variable name).
      • When reading the value from the variable, the value is obtained from the shadow variable; the external memory is not accessed.
      • When writing a value, the shadow variable is updated and the value is written to the external memory.
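
The read and write behaviour of cached external variables can be modelled with a small sketch. `CachedExternal` is a hypothetical class and the Python list stands in for a memory cell or bank; none of this is the compiler's actual code.

```python
class CachedExternal:
    """Model of an 'external cached' variable backed by a memory block."""

    def __init__(self, memory, index, initial=None):
        self._memory = memory
        self._index = index
        if initial is None:
            # No initializer: read the memory exactly once, at the declaration.
            self._shadow = memory[index]
        else:
            # Initializer given: write it to memory and keep it in the shadow variable.
            self._shadow = initial
            memory[index] = initial

    def read(self):
        # Reads are served from the shadow variable; memory is not accessed.
        return self._shadow

    def write(self, value):
        # Writes update the shadow variable and are written through to memory.
        self._shadow = value
        self._memory[self._index] = value


if __name__ == "__main__":
    cell = [0, 0, 7, 0]
    a = CachedExternal(cell, 2)
    cell[2] = 99          # another processor changes the slot
    print(a.read())       # still the value read at the declaration
    a.write(5)
    print(cell[2])        # the write went through to memory
```

One consequence visible in the sketch is that a cached variable does not observe changes other processors make to the same slot after the declaration.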

Using undeclared variables will be supported alongside optional declarations with normal rules:

  • linked block names map to linked variables,
  • upper case names map to global variables,
  • lower case names/mixed case names map to main/local variables,
  • $ prefix marks external variables allocated on the heap.
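
As an illustration of these rules for undeclared names, a classifier might look like the sketch below. The list of block base names in `LINKED_PATTERN` is a small hypothetical sample, not the compiler's actual list, and the order in which the rules are tested is an assumption.

```python
import re

# Hypothetical sample of linked block base names followed by a number.
LINKED_PATTERN = re.compile(r"^(cell|bank|message|switch|sorter)\d+$")


def classify(name):
    """Return the kind of variable an undeclared identifier maps to."""
    if name.startswith("$"):
        return "external"      # allocated on the heap
    if LINKED_PATTERN.match(name):
        return "linked"
    if name.isupper():
        return "global"        # all cased characters are upper case
    return "local"             # main or local, depending on where it is used


if __name__ == "__main__":
    for identifier in ("$X", "cell1", "MESSAGE", "controlled", "fooBar"):
        print(identifier, "->", classify(identifier))
```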

A declared main/local variable will hide a global variable, a parameter or a constant with the same name. For now - and maybe never - there won't be a mechanism to access a global variable hidden by a local one.

A compiler option will probably be added to generate a warning or an error when an undeclared variable is encountered. Declaring a variable which was already used without declaration, as well as declaring a variable twice, will result in an error.

For the future, the following changes are planned:

  • The scope of main and local variables will be limited to the code block they are declared in. Multiple independent declarations of a variable with the same name in the same scope may or may not map to the same mlog variable.
  • It will be possible to use a linked block name for a main or global variable name. Declared global/main variable names will be decorated in mlog.
  • Variable types will be introduced.
    • var keyword will become a type corresponding to an mlog variable (the widest type).
    • Other types will include: int, float, string, item, liquid, unit and so on.

@cardillan
Owner Author

Implementing variable declarations as described above might lead to a global variable mapping to the same mlog name as a main variable. To prevent this, I'll change the way variable names are transformed to mlog:

  • linked block and parameters: no modification to the variable name
  • global variables: .variable (a . prefix)
  • main variables: :variable (a : prefix)
  • function variables: :fnXX:variable (a : prefix before both the function counter and the variable name)
  • temporary variables: *tmpXX (*tmp + counter)
  • function return address/variable: :fnXX*retaddr/:fnXX*retval
  • stack pointer: *sp

Once modules become available, a module prefix of the form .mXX will be prepended to the names generated as above.
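
The naming scheme above can be summarised in a short sketch. `mlog_name` and its `kind` strings are hypothetical and only restate the listed rules. Counters are written without zero padding, which is an assumption, and the future `.mXX` module prefix is left out.

```python
def mlog_name(kind, name="", index=None):
    """Return the mlog identifier for a variable of the given kind.

    'index' is the function counter or temporary counter where one applies.
    """
    if kind in ("linked", "parameter"):
        return name                    # no modification
    if kind == "global":
        return "." + name
    if kind == "main":
        return ":" + name
    if kind == "function":
        return f":fn{index}:{name}"
    if kind == "temp":
        return f"*tmp{index}"
    if kind == "retaddr":
        return f":fn{index}*retaddr"
    if kind == "retval":
        return f":fn{index}*retval"
    if kind == "sp":
        return "*sp"
    raise ValueError(f"unknown variable kind: {kind}")


if __name__ == "__main__":
    print(mlog_name("global", "MESSAGE"))
    print(mlog_name("main", "switch"))
    print(mlog_name("function", "condition", index=0))
    print(mlog_name("temp", index=4))
```

Because each kind carries a different leading character, a global `X` and a main variable `X` can no longer end up with the same mlog name.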

This is what nontrivial compiled code would look like:

set PCT_LOW 60
set PCT_HIGH 80
set :switch null
set :measure null
set :maximum null
set :sorter null
set :container null
set .MESSAGE null
set :repeat false
print "Configuring regulator...\n"
set :controlled 0
set :links @links
set :n 0
jump 63 greaterThanEq 0 :links
getlink :block :n
print "Found: "
print :block
print "\n"
sensor *tmp4 :block @type
jump 22 notEqual *tmp4 @message
set .MESSAGE :block
jump 61 always 0 0
jump 25 notEqual *tmp4 @switch
set :switch :block
jump 61 always 0 0
jump 28 equal *tmp4 @sorter
jump 28 equal *tmp4 @inverted-sorter
jump 30 notEqual *tmp4 @unloader
set :sorter :block
jump 61 always 0 0
jump 35 equal *tmp4 @vault
jump 35 equal *tmp4 @container
jump 35 equal *tmp4 @core-shard
jump 35 equal *tmp4 @core-foundation
jump 39 notEqual *tmp4 @core-nucleus
set :container :block
set :measure @totalItems
set :maximum @itemCapacity
jump 61 always 0 0
jump 42 equal *tmp4 @liquid-tank
jump 42 equal *tmp4 @liquid-container
jump 46 notEqual *tmp4 @liquid-router
set :container :block
set :measure @totalLiquids
set :maximum @liquidCapacity
jump 61 always 0 0
jump 48 equal *tmp4 @battery
jump 52 notEqual *tmp4 @battery-large
set :container :block
set :measure @totalPower
set :maximum @powerCapacity
jump 61 always 0 0
jump 55 equal *tmp4 @power-node
jump 55 equal *tmp4 @power-node-large
jump 59 notEqual *tmp4 @surge-tower
set :container :block
set :measure @powerNetStored
set :maximum @powerNetCapacity
jump 61 always 0 0
op shl *tmp6 1 :n
op or :controlled :controlled *tmp6
op add :n :n 1
jump 14 lessThan :n :links
print "Message: "
print .MESSAGE
print "\nSwitch: "
print :switch
print "\nSorter: "
print :sorter
print "\nContainer: "
print :container
print "\nControlled mask: "
print :controlled
print "\n"
jump 77 notEqual .MESSAGE null
print "No message.\n"
set :repeat true
jump 80 notEqual :switch null
print "No switch.\n"
set :repeat true
jump 83 notEqual :container null
print "No container.\n"
set :repeat true
printflush .MESSAGE
jump 8 notEqual :repeat false
printflush null
op greaterThanEq *tmp13 PCT_LOW 0
op lessThanEq *tmp14 PCT_LOW 100
op land *tmp15 *tmp13 *tmp14
op floor *tmp16 PCT_LOW 0
op equal *tmp17 PCT_LOW *tmp16
op land :fn0:condition *tmp15 *tmp17
jump 96 notEqual :fn0:condition false
print "PCT_LOW must be an integer between 0 to 100."
printflush .MESSAGE
stop
op greaterThanEq *tmp20 PCT_HIGH 0
op lessThanEq *tmp21 PCT_HIGH 100
op land *tmp22 *tmp20 *tmp21
op floor *tmp23 PCT_HIGH 0
op equal *tmp24 PCT_HIGH *tmp23
op land :fn0:condition *tmp22 *tmp24
jump 106 notEqual :fn0:condition false
print "PCT_HIGH must be an integer between 0 to 100."
printflush .MESSAGE
stop
op lessThan :fn0:condition PCT_LOW PCT_HIGH
jump 111 notEqual :fn0:condition false
print "PCT_LOW must be less than PCT_HIGH."
printflush .MESSAGE
stop
sensor :max :container :maximum
sensor *tmp30 :sorter @type
op strictEqual .INVERTED *tmp30 @inverted-sorter
set .STATE true
set .CYCLES 1
op xor .ON .INVERTED true
set .ACTIVE_TEXT "\nCurrently inactive:[salmon]"
jump 120 equal .ON false
set .ACTIVE_TEXT "\nCurrently active:[green]"
control enabled :switch 0 0 0 0
sensor *tmp39 :switch @enabled
jump 0 notEqual *tmp39 0
set :start @time
sensor *tmp42 :container @dead
jump 127 equal *tmp42 0
end
sensor *tmp45 .MESSAGE @dead
jump 130 equal *tmp45 0
end
sensor *tmp48 :switch @dead
jump 133 equal *tmp48 0
end
sensor *tmp51 :sorter @dead
jump 136 notEqual :sorter *tmp51
end
sensor :item :sorter @config
op equal *tmp55 :item null
op notEqual *tmp56 :measure @totalItems
op or *tmp58 *tmp55 *tmp56
jump 145 equal *tmp58 false
print "Measuring [gold]total[] in "
print :container
sensor :amount :container :measure
jump 150 always 0 0
print "Measuring [gold]"
print :item
print "[] in "
print :container
sensor :amount :container :item
op mul *tmp62 100 :amount
op idiv :pct *tmp62 :max
jump 161 greaterThan :pct PCT_LOW
jump 169 equal .STATE true
set .STATE true
op add .CYCLES .CYCLES 1
op xor .ON .INVERTED true
set .ACTIVE_TEXT "\nCurrently inactive:[salmon]"
jump 169 equal .ON false
set .ACTIVE_TEXT "\nCurrently active:[green]"
jump 169 always 0 0
jump 169 lessThan :pct PCT_HIGH
jump 169 equal .STATE false
set .STATE false
op add .CYCLES .CYCLES 1
op xor .ON .INVERTED false
set .ACTIVE_TEXT "\nCurrently inactive:[salmon]"
jump 169 equal .ON false
set .ACTIVE_TEXT "\nCurrently active:[green]"
print "\nLevel: [gold]"
print :pct
print "%[]"
jump 179 equal .INVERTED false
print "\nActivate above [green]"
print PCT_HIGH
print "%[]\nDeactivate below [salmon]"
print PCT_LOW
print "%[]"
jump 184 always 0 0
print "\nActivate below [green]"
print PCT_LOW
print "%[]\nDeactivate above [salmon]"
print PCT_HIGH
print "%[]"
print .ACTIVE_TEXT
jump 187 equal :links @links
end
set :n 0
jump 198 greaterThanEq 0 :links
getlink :block :n
op shl *tmp83 1 :n
op and *tmp84 :controlled *tmp83
jump 196 equal *tmp84 false
control enabled :block .ON 0 0 0
print "\n    "
print :block
op add :n :n 1
jump 189 lessThan :n :links
print "[]\n# of cycles: "
print .CYCLES
op sub *tmp88 @time :start
op floor *tmp89 *tmp88 0
print "\n[lightgray]Loop: "
print *tmp89
print " ms"
printflush .MESSAGE
sensor *tmp39 :switch @enabled
jump 123 equal *tmp39 0

@cardillan
Owner Author

Implemented in 3.0.0. The visibility issues will be solved as part of module implementation.
