
Tutorial on semantic actions

adunofaiur edited this page May 31, 2014 · 2 revisions

This tutorial will show you how to create useful semantic actions for the meta_metadata object urban_spoon_restaurant.

All semantic actions must be placed within the meta_metadata tag of the object they are intended for.
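
As a rough sketch of that placement, the fragment below nests an empty semantic_actions element inside the meta_metadata element for the restaurant object. The name attribute on meta_metadata and the placeholder comments are assumptions made for illustration; check the actual repository definition for the exact attributes.

```xml
<!-- Assumed wrapper: the name attribute is an illustrative guess, not taken from this page -->
<meta_metadata name="urban_spoon_restaurant">
  <!-- field definitions for the restaurant page would go here -->
  <semantic_actions>
    <!-- the semantic actions developed in this tutorial go here -->
  </semantic_actions>
</meta_metadata>
```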

In general there are three types of semantic actions: variable declarations, control flow statements, and bridge function statements.

Variable Declarations:

  • get_field - Puts a meta_metadata_field in context; this is required before the field can be passed as an argument.
  • check - Evaluates a condition on a field and stores the result in a flag.

Control Flow Statements:

  • for_each - Iterates through a collection meta_metadata_field.
  • if - Runs its nested actions only if its flag_check holds.
  • flag_check - Tests a flag; used in conjunction with an if statement.

Bridge Function Statements:

  • create_and_visualize_img_surrogate - Signals the visualization application to display an image; the image's URL is passed as an argument.
  • create_and_visualize_text_surrogate - Same as the previous function, except that the argument passed is text.
  • parse_document - Signals the information crawler to download and parse a resource, either immediately or later; takes a URL as an argument.

The actions we want to take for an Urban Spoon restaurant information page might include:

  • Displaying a picture of the restaurant
  • If that picture is not available, displaying a picture of the restaurant on the map
  • Queuing for download the pages that contain lists of similar restaurants

The semantic_actions for displaying the image of the restaurant (as stored in the meta_metadata_field pic) are:

<semantic_actions>
  <get_field name="pic"/>
  <create_and_visualize_img_surrogate>
    <arg value="pic" name="image_purl"/>
  </create_and_visualize_img_surrogate>
</semantic_actions>

Not all restaurant pages have a picture, and when pic is missing the action above does nothing. We can add a check to see whether pic is null, so that the picture is displayed only when it exists, and then display the map image.

<semantic_actions>
  <!-- Put pic in context and set the picFound flag if it is not null -->
  <get_field name="pic">
    <check condition="NOT_NULL" name="picFound"/>
  </get_field>
  <!-- Show the picture only when picFound is set -->
  <if>
    <flag_check value="picFound"/>
    <create_and_visualize_img_surrogate>
      <arg value="pic" name="image_purl"/>
    </create_and_visualize_img_surrogate>
  </if>
  <!-- Show the map image -->
  <get_field name="map"/>
  <create_and_visualize_img_surrogate>
    <arg value="map" name="image_purl"/>
  </create_and_visualize_img_surrogate>
</semantic_actions>

Now we will add the semantic actions for adding each food genre page to the download queue. To do this we will have to use a for_each statement as well as parse_document.

This is the corresponding action set that will be added to the end of our existing semantic actions:

<get_field name="genres"/>
<for_each collection="genres" as="genre">
  <get_field name="link" object="genre"/>
  <get_field name="heading" object="genre"/>
  <parse_document>
    <arg value="link" name="container_link"/>
    <arg value="heading" name="anchor_text"/>
  </parse_document>
</for_each>
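
For reference, here is a sketch of how the complete action set might look once the genre loop is appended to the picture and map actions from earlier. It only combines the snippets from this tutorial, with two assumptions: the second surrogate is passed the map field (which is presumably what the walkthrough intends), and for_each is written as an open tag so that it can enclose its nested actions.

```xml
<semantic_actions>
  <!-- Picture: shown only when the pic field is not null -->
  <get_field name="pic">
    <check condition="NOT_NULL" name="picFound"/>
  </get_field>
  <if>
    <flag_check value="picFound"/>
    <create_and_visualize_img_surrogate>
      <arg value="pic" name="image_purl"/>
    </create_and_visualize_img_surrogate>
  </if>

  <!-- Map image (assumed to come from the map field) -->
  <get_field name="map"/>
  <create_and_visualize_img_surrogate>
    <arg value="map" name="image_purl"/>
  </create_and_visualize_img_surrogate>

  <!-- Queue each genre page for download and parsing -->
  <get_field name="genres"/>
  <for_each collection="genres" as="genre">
    <get_field name="link" object="genre"/>
    <get_field name="heading" object="genre"/>
    <parse_document>
      <arg value="link" name="container_link"/>
      <arg value="heading" name="anchor_text"/>
    </parse_document>
  </for_each>
</semantic_actions>
```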

Tip: By default, parse_document adds the URL to a waiting list to be downloaded and parsed later. However, it has an attribute, now, that you can set to true if you want meta-metadata to download and parse the document immediately.
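
A minimal sketch of that option, assuming the attribute is written as now="true" directly on the parse_document element (the exact spelling of the value is not confirmed on this page):

```xml
<!-- Assumed syntax: now="true" asks for an immediate download and parse instead of queuing -->
<parse_document now="true">
  <arg value="link" name="container_link"/>
  <arg value="heading" name="anchor_text"/>
</parse_document>
```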

The next part of this tutorial covers some advanced topics in the meta-language.