OPML Alerts is an RSS/Atom-based change detection and notification service, a simplified analogue of its namesake Google Alerts.
The application takes an OPML file containing a list of web feeds, each annotated with a pattern in the form of a regular expression. The feeds are polled at regular, user-configurable intervals for new entries, which are then filtered and reported according to the criteria imposed by the patterns.
OPML Alerts is written in Scala and its primary purpose is to provide a demo of Akka Typed, an actively developed branch intended to overcome the design flaws of the currently standard, untyped Akka. Since the new API has not yet been officially released, this project is meant as a playground for the new technology rather than as a stable product.
Parsing of both the OPML files and the RSS/Atom feeds is delegated to the ROME framework.
<opml version="1.0">
<body>
<outline type="link" text="GitHub Timeline" url="https://github.com/timeline"
interval="20" pattern="(for|while|if)\s*\(" />
<outline url="https://twitrss.me/twitter_search_to_rss/?term=scala" pattern="(?i)haskell|java" />
<outline url="https://twitrss.me/twitter_search_to_rss/?term=haskell" pattern="(?i)monad|functor" />
</body>
</opml>
OPML Alerts is guided by the following special attributes of the <outline /> tag:
pattern
: A mandatory attribute for the feed to be considered by OPML Alerts. Defines the regular expression to be matched against the text content of feed entries. See the documentation for java.util.regex.Pattern for details about the regular expression syntax.

interval
: An optional attribute. Determines the length of the interval in seconds between two consecutive pollings of the feed. Defaults to 60 seconds.
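
As a rough illustration of how these two attributes might be interpreted, here is a minimal sketch that uses only java.util.regex.Pattern and the Scala standard library. The names PatternSketch, hits and intervalSeconds are hypothetical and are not taken from the project. The sketch also assumes that a pattern counts as matched when it occurs anywhere in the text (find() rather than matches()), and that a missing or malformed interval silently falls back to the default of 60 seconds; the project may handle either case differently.

```scala
import java.util.regex.Pattern
import scala.util.Try

object PatternSketch {
  // Assumption: a hit means the pattern occurs somewhere in the text,
  // so Matcher.find() is used instead of Matcher.matches().
  def hits(pattern: String, text: String): Boolean =
    Pattern.compile(pattern).matcher(text).find()

  // Assumption: the interval attribute holds whole seconds; a missing or
  // unparsable value falls back to the documented default of 60.
  def intervalSeconds(attribute: Option[String]): Int =
    attribute.flatMap(s => Try(s.trim.toInt).toOption).getOrElse(60)
}
```

With the patterns from the example file, `PatternSketch.hits("(?i)monad|functor", "Why Monads matter")` is true because of the case-insensitive flag `(?i)`.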
Regarding the common attributes of <outline />, only a url/xmlUrl/htmlUrl is strictly necessary. Nonetheless, the presence of other attributes (such as title or text) may lead to more informative result reports.
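
The attribute rules above can be sketched with the XML parser bundled with the JDK. This is not how OPML Alerts itself reads the file (it delegates to ROME); the sketch only illustrates which outlines would qualify. The names OutlineSketch, Feed and feeds are hypothetical, and the lookup order url, then xmlUrl, then htmlUrl, as well as title taking precedence over text, are assumptions that the description above does not specify.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

object OutlineSketch {
  // Hypothetical model of one usable <outline /> element.
  final case class Feed(url: String, pattern: String, label: Option[String])

  // getAttribute returns "" for absent attributes, so empty values are dropped.
  private def attr(e: Element, name: String): Option[String] =
    Option(e.getAttribute(name)).map(_.trim).filter(_.nonEmpty)

  def feeds(opml: String): List[Feed] = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val doc = builder.parse(
      new ByteArrayInputStream(opml.getBytes(StandardCharsets.UTF_8)))
    val outlines = doc.getElementsByTagName("outline")
    (0 until outlines.getLength).toList
      .map(i => outlines.item(i).asInstanceOf[Element])
      .flatMap { e =>
        for {
          // Assumed precedence: url, then xmlUrl, then htmlUrl.
          url     <- attr(e, "url").orElse(attr(e, "xmlUrl")).orElse(attr(e, "htmlUrl"))
          // Outlines without a pattern are skipped, since pattern is mandatory.
          pattern <- attr(e, "pattern")
        } yield Feed(url, pattern, attr(e, "title").orElse(attr(e, "text")))
      }
  }
}
```

An outline that lacks either a URL or a pattern is simply left out of the result; the optional label is only there to make reports more readable.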
- When started, OPML Alerts spawns a Manager actor that oversees all the stages and serves as a common communication point for all the other actors.
- The ROME-based Parser class is called to parse the input OPML file. Manager dedicates a separate FeedHandler actor to each feed.
- A pool of EntryHandlers is also spawned; these are fungible and not tied to particular feeds.
- A collection of internal timers is set up to trigger feed re-fetching, whereupon the associated FeedHandler GETs the feed URL and sends the yet-unseen entries back to the Manager.
- The entries are distributed among the EntryHandlers by a round-robin scheduler.
- An EntryHandler scrapes the web page pointed to by the given feed entry and matches it against the pattern.
- Finally, a summary of the match results is displayed by all available Printer instances.
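
Two of the steps above, keeping only the yet-unseen entries and distributing them round-robin, can be sketched without any actors. This is a plain-collections illustration rather than the project's actor code: PipelineSketch, Entry, unseen and roundRobin are hypothetical names, and identifying an entry by its link is an assumption.

```scala
object PipelineSketch {
  // Hypothetical minimal view of a feed entry.
  final case class Entry(link: String, title: String)

  // Returns the entries whose link has not been seen before (in feed order),
  // together with the updated set of seen links.
  def unseen(seen: Set[String], fetched: List[Entry]): (List[Entry], Set[String]) = {
    val (freshReversed, known) =
      fetched.foldLeft((List.empty[Entry], seen)) {
        case ((fresh, acc), e) if acc(e.link) => (fresh, acc)
        case ((fresh, acc), e)                => (e :: fresh, acc + e.link)
      }
    (freshReversed.reverse, known)
  }

  // Assigns item i to worker (i % workers), which is the round-robin order.
  def roundRobin[A](items: List[A], workers: Int): Map[Int, List[A]] = {
    require(workers > 0, "at least one worker is needed")
    items.zipWithIndex
      .groupBy { case (_, i) => i % workers }
      .map { case (worker, assigned) => worker -> assigned.map(_._1) }
  }
}
```

In the application itself, the seen set would live inside each FeedHandler and the worker index would select an EntryHandler from the pool; here both are reduced to plain values so that the logic can be checked in isolation.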
The following concepts and features of Akka Typed can be seen in action with OPML Alerts:
- Extensive employment of (memoised) adapter actors together with private message classes to guarantee protocol safety, viz. to avoid carrying identifying information in response messages.
- The Actor.withTimers factory to configure a periodic timer for each feed interval.
- Actor discovery. Instances of Printer are discovered and registered automatically through registration & subscription with the Receptionist actor.
- Coexistence of typed and untyped actors within a single ActorSystem. OPML Alerts uses the schwatcher file-watching library to detect and react to any changes made to the OPML file. However, the library is written using untyped actors; the root behavior therefore guards an ActorSystem which, even though typed in itself, accommodates untyped elements through adapters.
- Synchronous testing of behaviors as such, instead of as instantiated within actors.