Annotated CAST Part 3: CAST and AnnCast nodes

titomeister edited this page May 18, 2022 · 11 revisions

This document summarizes the attributes of CAST and annotated CAST nodes. CAST node classes are automatically generated from the Swagger spec and include many Swagger-internal attributes and methods that are not used in the program analysis (PA) pipeline. This document lists only the attributes that the PA pipeline uses, with hints for some of them on how they are used.

CAST Nodes

AstNode:
  - Base class
Assignment:
  - left: AstNode
  - right: AstNode
Attribute:
  - value: AstNode
  - attr: Name 
BinaryOp:
  - left: AstNode
  - right: AstNode
  - op: BinaryOperator
BinaryOperator: Enum of operators
Boolean:
  - boolean: bool
Call: 
  - func: Name
  - arguments: List[AstNode]  (often a Var or a literal, but may be any expression)
ClassDef:
  - name: str
  - funcs: List[FunctionDef]
  - fields: List[Var]
  - bases: List[str]
Dict:
  - keys: List[AstNode]
  - values: List[AstNode]
Expr:
  - expr: AstNode
FunctionDef:
  - name: str (TODO: convert this to a Name node (currently in Py2CAST, also do for GCC2CAST))
  - func_args: List[Var]
  - body: List[AstNode]
List:
  - values: List[AstNode]
Loop:
  - expr: AstNode
  - body: List[AstNode]
ModelBreak: basically unused, and no fields
ModelContinue: basically unused, and no fields
ModelIf:
  - expr: AstNode
  - body: List[AstNode] (if body)
  - orelse: List[AstNode] (else body)
ModelReturn:
  - value: AstNode
Module:
  - name: str
  - body: List[AstNode]
Name:
  - name: str
  - id: integer
Number:
  - number: float (any number in python)
Set:
  - values: List[AstNode]
String:
  - string: str
SourceRef: (We do not have an AnnCastSourceRef, we just use SourceRef)
  - source_file_name: str
  - col_start: number
  - col_end: number
  - row_start: number
  - row_end: number
Subscript:
  - value: AstNode
  - slice: AstNode
Tuple:
  - values: List[AstNode]
UnaryOp:
  - value: AstNode
  - op: UnaryOperator 
UnaryOperator: Enum of operators
VarType: Enum, but it's empty and unused
Var:
  - val: Name
  - type: VarType (unused)

Each of the above nodes X is converted to an annotated CAST node named AnnCastX, with the following exceptions:

  • SourceRef: the annotated CAST node copies the SourceRef from the associated CAST node
  • BinaryOperator, UnaryOperator, VarType: these are simply enums
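
To make the node listing above concrete, here is a minimal sketch of how the statement `x = 1 + 2` could be laid out as a CAST tree. The dataclasses are hypothetical stand-ins that mirror only the fields listed above; they are not the Swagger-generated classes, and the enum member `ADD` and the id value `0` are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class BinaryOperator(Enum):
    # Assumed member name; the real enum lists every supported operator.
    ADD = "Add"


@dataclass
class AstNode:
    """Stand-in base class; carries no fields of its own."""


@dataclass
class Name(AstNode):
    name: str
    id: int


@dataclass
class Var(AstNode):
    val: Name


@dataclass
class Number(AstNode):
    number: float


@dataclass
class BinaryOp(AstNode):
    op: BinaryOperator
    left: AstNode
    right: AstNode


@dataclass
class Assignment(AstNode):
    left: AstNode
    right: AstNode


# x = 1 + 2
tree = Assignment(
    left=Var(val=Name(name="x", id=0)),
    right=BinaryOp(
        op=BinaryOperator.ADD,
        left=Number(number=1),
        right=Number(number=2),
    ),
)
```

The variable on the left-hand side is a Var wrapping a Name, which is where the string name and the integer id live.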

Annotated CAST nodes

For every CAST node NodeName there is an associated annotated CAST node called AnnCastNodeName. The one naming exception is AstNode, whose annotated CAST analog is AnnCastNode. Each annotated CAST node has the same attributes as its associated CAST node, excluding the Swagger-generated internal attributes. Many annotated CAST nodes have additional attributes, which are populated during the various annotated CAST passes. Below, we list the attributes we have added to annotated CAST nodes.
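
The naming rule described above can be written down as a small helper. This is only an illustrative sketch: the function `ann_cast_name` and the set `UNCHANGED` are hypothetical names and are not part of the pipeline.

```python
# CAST names that have no AnnCast counterpart: SourceRef is reused as-is,
# and the remaining three are plain enums.
UNCHANGED = {"SourceRef", "BinaryOperator", "UnaryOperator", "VarType"}


def ann_cast_name(cast_name: str) -> str:
    """Return the annotated CAST class name for a CAST class name."""
    if cast_name in UNCHANGED:
        return cast_name
    if cast_name == "AstNode":
        # The base class is the one naming exception.
        return "AnnCastNode"
    return f"AnnCast{cast_name}"


names = [ann_cast_name(n) for n in ("Assignment", "AstNode", "SourceRef")]
```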

In addition to the attributes listed below, every annotated CAST (AnnCast) node has a to_dict method and an equiv method. to_dict converts the node into a dictionary, and each AnnCast node class has its own implementation. equiv checks whether two AnnCast nodes are equivalent by comparing the dictionaries that to_dict produces. equiv is particularly important for testing, where it is used to check generated AnnCast nodes against an expected set. An example implementation of both methods is shown after the list of added attributes.

The Annotated CAST node classes are defined in automates/program_analysis/CAST2GrFN/visitors/annotated_cast.py

AnnCastAssignment:
  - grfn_assignment: GrfnAssignment

AnnCastAttribute:
  # No new fields added
  
AnnCastBinaryOp:
  # No new fields added     

AnnCastBoolean:
  # No new fields added 

AnnCastCall: 
  # TODO: update if any removed

  # index of this Call node over all invocations of this function
  - invocation_index: int

  # dicts mapping a Name id to its fullid
  - top_interface_in: Dict[var_id, fullid]
  - top_interface_out: Dict[var_id, fullid]
  - bot_interface_in: Dict[var_id, fullid]
  - bot_interface_out: Dict[var_id, fullid]

  # dicts mapping Name id to Name string
  - top_interface_vars: Dict[var_id, str]
  - bot_interface_vars: Dict[var_id, str]

  # GrFN lambda expressions
  - top_interface_lambda: str
  - bot_interface_lambda: str

  # maps Name id to fullid
  - globals_accessed_before_mod: Dict[var_id, fullid]
  - used_globals:                Dict[var_id, fullid]

  # maps Name id to fullid
  - in_ret_val:  Dict[var_id, fullid]
  - out_ret_val: Dict[var_id, fullid]

  # Determines if this is a GrFN 2.2 call
  # The associated FunctionDef is copied to make a GrFN 2.2 container
  # if this flag is True
  - is_grfn_2_2: bool

  # copied function def used for GrFN 2.2
  - func_def_copy: typing.Optional[AnnCastFunctionDef]  

  # Maps argument index to created argument fullid
  - arg_index_to_fullid:   Dict[int, fullid]
  - param_index_to_fullid: Dict[int, fullid]

  # Maps positional index to GrfnAssignment
  # GrfnAssignment stores ASSIGN/LITERAL node, its inputs, and its outputs
  - arg_assignments:       Dict[int, GrfnAssignment]

  # Metadata attributes
  - grfn_con_src_ref: GrfnContainerSrcRef

AnnCastDict:
  # No new fields added

AnnCastExpr:
  # No new fields added

AnnCastFunctionDef: 
  # for bot_interface_in

  - con_scope: List
  - has_ret_val: bool

  - in_ret_val: Dict[var_id, fullid]
  - out_ret_val: Dict[var_id, fullid]

  - modified_vars: Dict[var_id, var_name]
  - vars_accessed_before_mod: Dict[var_id, var_name]
  - used_vars: Dict[var_id, var_name]

  - top_interface_vars: Dict[int, str] 
  - bot_interface_vars: Dict[int, str]  

  - globals_accessed_before_mod: Dict
  - used_globals:                Dict

  - modified_globals: Dict

  - arg_index_to_fullid: Dict[int, fullid]
  - param_index_to_fullid: Dict[int, fullid]

  # dicts mapping a Name id to its fullid
  - top_interface_in: Dict[var_id,fullid]
  - top_interface_out: Dict[var_id,fullid]
  - bot_interface_in: Dict[var_id,fullid]
  - bot_interface_out: Dict[var_id,fullid]

  - top_interface_lambda: str
  - bot_interface_lambda: str

  # dict mapping Name id to highest version at end of "block"
  - body_highest_var_vers: Dict[var_id, version]

  - grfn_con_src_ref: GrfnContainerSrcRef

  - dummy_grfn_assignments: List

AnnCastLoop:
  # Loop container scope
  - con_scope: List

  # Function scopestring 
  - base_func_scopestr: str

  # dicts mapping a Name id to its string name
  # used for container interfaces
  - modified_vars: Dict[var_id, var_name]       
  - vars_accessed_before_mod: Dict[var_id, var_name]
  - used_vars: Dict[var_id, str]
  - top_interface_vars: Dict[var_id, str]
  - top_interface_updated_vars: Dict[var_id, str]
  - bot_interface_vars: Dict[var_id, str]      

  # dicts mapping Name id to highest version at end of "block"
  - expr_highest_var_vers: Dict[var_id, version]
  - body_highest_var_vers: Dict[var_id, version]

  # dicts mapping a Name id to variable string name
  # for variables used in the Loop expr
  - expr_vars_accessed_before_mod: Dict[var_id, str]
  - expr_modified_vars: Dict[var_id, str]
  - expr_used_vars: Dict[var_id, str]

  # dicts mapping a Name id to its fullid
  # initial versions for the top interface come from enclosing scope
  # updated versions for the top interface are versions 
  # at the bottom of the loop after one or more executions of the loop
  - top_interface_initial: Dict[var_id, fullid]
  - top_interface_updated: Dict[var_id, fullid]
  - top_interface_out: Dict[var_id, fullid]
  - bot_interface_in: Dict[var_id, fullid]
  - bot_interface_out: Dict[var_id, fullid]
  - condition_in: Dict[var_id, fullid]
  - condition_out: Dict[var_id, fullid]

  # GrFN VariableNode for the condition node
  - condition_var: Optional[VariableNode] (initially None)

  # GrFN lambda expressions
  - top_interface_lambda: str
  - bot_interface_lambda: str
  - condition_lambda: str
  # metadata attributes
  - grfn_con_src_ref: GrfnContainerSrcRef

AnnCastModelBreak:
  # No new fields added

AnnCastModelContinue:  
  # No new fields added
  
AnnCastModelIf:
  # Container scope
  - con_scope: List
  # Function scope string this ModelIf node is "living" in
  - base_func_scopestr: str

  # dicts mapping a Name id to its string name
  - modified_vars: Dict[var_id, var_name]
  - vars_accessed_before_mod: Dict[var_id, var_name]
  - used_vars: Dict[var_id, var_name]
  - top_interface_vars: Dict[var_id, var_name]
  - bot_interface_vars: Dict[var_id, var_name]
  
  # dicts mapping a Name id to variable string name
  # for variables used in the if expr
  - expr_vars_accessed_before_mod: Dict[var_id, var_name]
  - expr_modified_vars: Dict[var_id, var_name]
  - expr_used_vars: Dict[var_id, var_name]

  # dicts mapping Name id to highest version at end of "block"
  - expr_highest_var_vers: Dict[var_id, version]
  - ifbody_highest_var_vers: Dict[var_id, version]
  - elsebody_highest_var_vers: Dict[var_id, version]

  # dicts mapping a Name id to its fullid
  - top_interface_in: Dict[var_id, fullid]
  - top_interface_out: Dict[var_id, fullid]
  - bot_interface_in: Dict[var_id, fullid]
  - bot_interface_out: Dict[var_id, fullid]
  - condition_in: Dict[var_id, fullid]
  - condition_out: Dict[var_id, fullid]
  - decision_in: Dict[var_id, fullid]
  - decision_out: Dict[var_id, fullid]

  # GrFN VariableNode for the condition node
  - condition_var: Optional[VariableNode] (initially None)

  # GrFN lambda expressions
  - top_interface_lambda: str
  - bot_interface_lambda: str
  - condition_lambda: str
  - decision_lambda: str

  # metadata attributes
  - grfn_con_src_ref: GrfnContainerSrcRef
 
AnnCastModelReturn:
  # Cache the FunctionDef node this return statement lies in
  - owning_func_def: Optional[AnnCastFunctionDef] 
  # Store GrfnAssignment for use in GrFN generation
  - grfn_assignment: Optional[GrfnAssignment]

AnnCastModule: 
  - modified_vars: Dict[int, str]
  - vars_accessed_before_mod: Dict[int,str]
  - used_vars: Dict[int,str]
  - con_scope: List

  - grfn_con_src_ref: GrfnContainerSrcRef

AnnCastName:
  # con_scope (the container scope) is used to aid GrFN generation
  - con_scope: List[str]

  # Function scopestr this Name node is "living" in
  - base_func_scopestr: str

  # versions are bound to the scope of the variable
  - version: int
  - grfn_id: grfn_var_id

AnnCastNumber:
  # No new fields added
   
AnnCastSet:
  # No new fields added

AnnCastString:
  # No new fields added

AnnCastSubscript:
  # No new fields added

AnnCastTuple: 
  # No new fields added

AnnCastUnaryOp:
  # No new fields added

AnnCastVar: 
  # No new fields added
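
Several of the dictionaries listed above share a few shapes: Name id to variable name, and Name id to highest version at the end of a block. The sketch below illustrates those shapes. It assumes that var_id and version are integers and that var_name is a string, which the Dict[int, str] entries and the `version: int` field suggest. The helper `highest_var_vers` is hypothetical and shows only one plausible way to derive such a dict from the (id, version) pairs seen in a block; the actual passes may compute it differently.

```python
from typing import Dict, Iterable, Tuple

# Assumed aliases for the type names used in the listing.
VarId = int
VarName = str
Version = int


def highest_var_vers(seen: Iterable[Tuple[VarId, Version]]) -> Dict[VarId, Version]:
    """Keep, for each Name id, the largest version observed in a block."""
    highest: Dict[VarId, Version] = {}
    for var_id, version in seen:
        if var_id not in highest or version > highest[var_id]:
            highest[var_id] = version
    return highest


# Example shapes: Name id -> variable name, and Name id -> highest version.
modified_vars: Dict[VarId, VarName] = {0: "x", 3: "y"}
body_highest_var_vers = highest_var_vers([(0, 0), (0, 1), (3, 0)])
```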

The equiv and to_dict methods of the AnnCastModelIf class are shown below as an example. Note that to_dict currently serializes only a subset of the node's attributes, so equiv compares only that subset.

def to_dict(self):
    result = super().to_dict()
    result["expr"] = self.expr.to_dict()
    result["body"] = [node.to_dict() for node in self.body]
    result["orelse"] = [node.to_dict() for node in self.orelse]
    result["con_scope"] = self.con_scope
    result["base_func_scopestr"] = self.base_func_scopestr
    # FUTURE: add attributes to enhance test coverage
    return result

def equiv(self, other):
    if not isinstance(other, AnnCastModelIf):
        return False
    return self.to_dict() == other.to_dict()
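
The following self-contained sketch shows how the two methods above behave when a generated node is checked against an expected one. The classes are simplified, hypothetical stand-ins rather than the real ones from annotated_cast.py: the base-class `to_dict`, the helper `make_if`, and the scope values are assumptions made for illustration.

```python
class AnnCastNode:
    """Hypothetical stand-in for the annotated CAST base class."""

    def to_dict(self):
        # Assumption: the base class records at least the node type.
        return {"node_type": type(self).__name__}


class AnnCastBoolean(AnnCastNode):
    def __init__(self, boolean):
        self.boolean = boolean

    def to_dict(self):
        result = super().to_dict()
        result["boolean"] = self.boolean
        return result

    def equiv(self, other):
        if not isinstance(other, AnnCastBoolean):
            return False
        return self.to_dict() == other.to_dict()


class AnnCastModelIf(AnnCastNode):
    def __init__(self, expr, body, orelse, con_scope, base_func_scopestr):
        self.expr = expr
        self.body = body
        self.orelse = orelse
        self.con_scope = con_scope
        self.base_func_scopestr = base_func_scopestr
        self.modified_vars = {}  # not serialized by to_dict

    def to_dict(self):
        result = super().to_dict()
        result["expr"] = self.expr.to_dict()
        result["body"] = [node.to_dict() for node in self.body]
        result["orelse"] = [node.to_dict() for node in self.orelse]
        result["con_scope"] = self.con_scope
        result["base_func_scopestr"] = self.base_func_scopestr
        return result

    def equiv(self, other):
        if not isinstance(other, AnnCastModelIf):
            return False
        return self.to_dict() == other.to_dict()


def make_if(flag):
    # Scope values are made up for illustration.
    return AnnCastModelIf(
        AnnCastBoolean(flag), [], [], ["module", "main", "if0"], "module.main"
    )


generated = make_if(True)
expected = make_if(True)
generated.modified_vars = {0: "x"}  # differs from expected, but is ignored

same = generated.equiv(expected)
different = generated.equiv(make_if(False))
wrong_type = generated.equiv(AnnCastBoolean(True))
```

In this sketch `same` is True even though `modified_vars` differs, because that attribute is not part of the dictionary. `different` and `wrong_type` are both False. An attribute that is not added to to_dict is invisible to equiv-based tests.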