Andy Jackson edited this page Nov 14, 2013 · 22 revisions

Adding An Entry

This is a crucial user workflow: adding an entry to W3ACT. In the prototype this is a single-page form, but in this version we need to break the workflow down into two stages:

  1. Look up a URI.
  2. If appropriate, add a new entry.

Stage 1 - Look up a URI

The user starts by looking up the URL of interest. This gives a status check, and allows the user to decide whether they need to create an entry for this item.

Here is an example from the current prototype:

http://www.webarchive.org.uk/act/websites/url-search?url=http%3A//www.bbc.co.uk/news/

This page allows the user to look up a URL in a way that is compatible with a bookmarklet or other RESTful services.

From here, they can find out:

  • If the URI, or a closely related URI (e.g. same domain), already has an entry in ACT.
  • If the URI has been crawled (hook into Monitrix - see Data sources).
  • If the URI is available from any of the Wayback instances (i.e. hooks to Wayback API - see Data sources).
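The lookup request itself is just a GET with the target URL passed as an encoded query parameter, as in the prototype example above. A minimal sketch of building that request in Python (the endpoint is the prototype's `url-search` URL; the encoding shown matches the example, with the scheme's colon escaped but path slashes left intact):

```python
from urllib.parse import quote

# Prototype lookup endpoint, taken from the example above.
ACT_LOOKUP = "http://www.webarchive.org.uk/act/websites/url-search"

def lookup_url(target_url: str) -> str:
    """Build the url-search request for a given target URL.

    quote() with the default safe='/' escapes ':' as %3A but leaves
    path slashes alone, matching the prototype's own links.
    """
    return f"{ACT_LOOKUP}?url={quote(target_url, safe='/')}"

print(lookup_url("http://www.bbc.co.uk/news/"))
# → http://www.webarchive.org.uk/act/websites/url-search?url=http%3A//www.bbc.co.uk/news/
```

Because the lookup is a plain GET, the same URL pattern can be generated from a bookmarklet or any other RESTful client.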

What this page should also do, but does not yet do: TBA

Stage 2 - Adding an entry

This uses the same form as the entry editor (below). However, the URL is passed in and should appear pre-filled in the form, and a number of other parameters should also be set.

Adding/Editing A Target

The same form is used to add and edit entries (aiming to avoid user confusion), but some fields may not be edited after the entry has been created. (TO BE SPECIFIED)

In the prototype, this was one long form. In the new version, this should be simplified as much as possible. The editor should present a horizontal array of tabs, one for each section of the editor:

  • Basic information
    • Title
    • URL(s)
    • Live Site status
    • Overall QA status
    • Key Site status
  • WCT/SPT IDs (shown, but not editable)
  • Metadata
    • Description
    • Subject
    • Collections
    • Nominating Organisation
  • Crawl permission
  • Crawl policy
    • Crawl frequency
    • Crawl scope
    • Crawl depth (or rather, crawl cap)
    • Whether to ignore Robots.txt
  • FUTURE whitelist and/or blacklist URLs/Regexes.
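The crawl policy fields above could be captured as a simple record. A sketch, assuming illustrative field names (not W3ACT's actual schema), including the FUTURE whitelist/blacklist regexes:

```python
import re
from dataclasses import dataclass, field

@dataclass
class CrawlPolicy:
    # Field names and values are illustrative only.
    frequency: str = "monthly"       # crawl frequency, e.g. daily/weekly/monthly
    scope: str = "root"              # crawl scope, e.g. root/subdomains/page
    crawl_cap: int = 10              # crawl depth, or rather a crawl cap
    ignore_robots: bool = False      # whether to ignore robots.txt
    whitelist: list = field(default_factory=list)  # FUTURE: URL regexes
    blacklist: list = field(default_factory=list)  # FUTURE: URL regexes

    def url_permitted(self, url: str) -> bool:
        """Apply the FUTURE white/blacklist regexes to a candidate URL."""
        if any(re.search(p, url) for p in self.blacklist):
            return False
        if self.whitelist:
            return any(re.search(p, url) for p in self.whitelist)
        return True
```

A blacklist match always rejects; if a whitelist is present, only matching URLs pass.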

TODO Add details.

The highest priority is to establish whether permission to crawl exists, based on the crawl scope policy.
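One way to make the scope check concrete is a predicate that decides whether a candidate URL falls inside a seed's permitted scope. A sketch, assuming three illustrative scope names ("page", "root", "subdomains") that are not yet specified here:

```python
from urllib.parse import urlparse

def in_scope(seed: str, candidate: str, scope: str) -> bool:
    """Decide whether a candidate URL is inside the seed's crawl scope.

    Scope names are assumptions for illustration:
      page       - only the seed URL itself
      root       - same host, path under the seed's path
      subdomains - seed host or any subdomain of it
    """
    seed_host = urlparse(seed).hostname
    cand_host = urlparse(candidate).hostname
    if scope == "page":
        return candidate == seed
    if scope == "root":
        return (cand_host == seed_host
                and urlparse(candidate).path.startswith(urlparse(seed).path))
    if scope == "subdomains":
        return cand_host == seed_host or cand_host.endswith("." + seed_host)
    return False
```

Unknown scope names fail closed, so a URL is only crawled when a policy explicitly permits it.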

TBA

Viewing A Target

This should use the same layout as the editor, but with static labels instead of form elements.

Beneath the tabbed pane, the current set of known Instances of this Target should be shown.

Annotating An Instance

Individual Instances of a Target are annotated in order to QA them and in order to add them to Collections.
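Since the annotation model is still TBA, here is only a minimal sketch of what an Instance annotation might carry, per the sentence above: a QA status and membership in Collections. All names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class InstanceAnnotation:
    # Illustrative only; the real W3ACT model is not specified here.
    instance_id: str
    qa_status: str = "unchecked"          # e.g. unchecked / passed / failed
    collections: set = field(default_factory=set)

    def add_to_collection(self, name: str) -> None:
        """Record this Instance as belonging to a named Collection."""
        self.collections.add(name)
```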

TBA