Home
The web is a heterogeneous and interconnected space of hypermedia. User tasks often require multiple metadata types and information sources. BigSemantics addresses this heterogeneity by providing a repository of meta-metadata wrappers supporting a wide range of types and sources, and a polymorphic type system that maximizes reuse.
The repository includes wrappers ranging from everyday services such as Amazon Product, Trip Advisor, and Google Books, to professional digital libraries such as the ACM Digital Library and IEEE Xplore, as well as wrappers for movies, games, blog posts, and so on. In the repository, wrappers written in the meta-metadata language are organized by inheritance in a hierarchical type system, which makes them easy to reuse and extend.
The repository is distributed with BigSemantics and is also hosted on GitHub.
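To make the idea of inheritance-based reuse concrete, here is a conceptual sketch in Python. It is not meta-metadata syntax and not how BigSemantics is implemented; the `WrapperType` class, the type names (`document`, `product`, `book`), and the XPath rules are all hypothetical, chosen only to show how a derived type can inherit fields from its parent and override them.

```python
from dataclasses import dataclass, field


@dataclass
class WrapperType:
    """Hypothetical stand-in for a wrapper type (not meta-metadata syntax)."""
    name: str
    fields: dict = field(default_factory=dict)  # field name -> extraction rule
    parent: "WrapperType | None" = None

    def all_fields(self) -> dict:
        """Merge inherited fields with local ones; local definitions win."""
        inherited = self.parent.all_fields() if self.parent else {}
        return {**inherited, **self.fields}


# Hypothetical type hierarchy and XPath rules, for illustration only.
document = WrapperType("document", {"title": "//title"})
product = WrapperType(
    "product", {"price": "//span[@class='price']"}, parent=document
)
book = WrapperType(
    "book",
    {"title": "//h1[@id='book-title']", "isbn": "//span[@class='isbn']"},
    parent=product,
)

# "book" reuses "price" from "product" and overrides "title" from "document".
print(sorted(book.all_fields()))
```

In this sketch, a new type only declares what differs from its parent, which is the kind of reuse a hierarchical type system is meant to provide.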
You may want to work with semantics from a web site that BigSemantics does not yet support. In that case, you can author your own wrapper for the site; this is one of the advantages of using BigSemantics.
This section shows you how to author a wrapper for an unsupported web site. First, you need to set up a development environment. Then, you will learn the ins and outs of wrappers: defining data structures, extracting semantics, and attaching presentation semantics and semantic actions. The Wrapper Dev Assist tool can help you handle some of the details.
- Check out code and set up a development environment
- Compile wrappers and run the BigSemantics service, as explained in the workflow
- Author Wrappers with Meta-Metadata
- Check the extraction results in a browser, as explained in the workflow, and iterate on your wrappers. Note that after you edit wrappers, you need to recompile them and restart the BigSemantics service.
- When necessary, apply advanced wrapper features
For a complete specification of the meta-metadata language, see here.
These tools are not required, but we have found them helpful when authoring meta-metadata. They help you locate information in a page's HTML and form XPath expressions:
- If you use Google Chrome, we recommend using its built-in Developer Tools (accessible from the menu).
  - For XPath authoring, we recommend using $x("XPATH-EXPRESSION-HERE") in its JavaScript console, or the XPath Helper extension.
- If you use Firefox, we recommend the Firebug add-on to help identify the parts of the HTML that contain the information you need.
  - For XPath authoring, use XPather or Firefinder for Firebug.
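Besides the browser tools above, it can be handy to try XPath expressions against a saved page fragment outside the browser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The page fragment, the `extract` helper, and the expressions are hypothetical. Note that ElementTree supports only a subset of XPath and requires well-formed markup, so real-world HTML usually needs a dedicated HTML parser first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment standing in for a real product page.
PAGE = """
<html>
  <body>
    <h1 id="title">Example Product</h1>
    <span class="price">19.99</span>
    <ul class="reviews">
      <li>Good value</li>
      <li>Arrived late</li>
    </ul>
  </body>
</html>
"""


def extract(root, xpath):
    """Return the stripped text of every element matched by the expression."""
    return [(node.text or "").strip() for node in root.findall(xpath)]


root = ET.fromstring(PAGE)

# Each expression selects elements by tag and attribute, much like the
# expressions you would try with $x(...) in a browser console.
title = extract(root, ".//h1[@id='title']")
price = extract(root, ".//span[@class='price']")
reviews = extract(root, ".//ul[@class='reviews']/li")

print(title, price, reviews)
```

Checking an expression this way before placing it in a wrapper helps confirm that it matches exactly the elements you expect, and nothing else.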