Micro Libraries wtf_fetch, ....
If we regard `wtf` as a "Wiki Transformation Framework", in which all submodules/micro libraries have a module name of the form `wtf_my_module`, then a `wtf` Node application or WebApp chains micro libraries in a specific order to perform a job.
- `wtf_fetch` has already been extracted. Fetching was previously one of the main jobs performed inside `wtf_wikipedia` (i.e. in v7.3.0).
- `wtf_tokenizer` tokenizes mathematical expressions and citations and stores the parsed content in a JSON structure similar to that of `wtf_wikipedia`. Options let you decide whether to tokenize citations and/or mathematical expressions. The tokenizer is required for Wiki2Reveal.
- `wtf_url` (not implemented yet) transforms e.g. relative links into absolute links, or replaces URLs for images and other media with absolute links to WikiCommons.
These processing chains are connected with Promises in the root `wtf`, which is currently `wtf_wikipedia`.
If `wtf_url` were already implemented for transforming relative links and URLs for images, videos and audio into absolute links, one example processing chain would be:
wtf_fetch > wtf_url > wtf_wikipedia
`wtf_fetch` downloads the wiki markdown, `wtf_url` converts relative links to absolute links, and at the very end of the pipe the converted result `wiki` is processed with `wtf(wiki)` using the current npm module `wtf_wikipedia`. If a developer wants to parse the wiki markdown with Parsoid instead, the processing chain would look like this:
wtf_fetch > wtf_url > parsoid
If a special wrapper for the Wiki Transformation Framework (WTF) were implemented, the module would be named `wtf_parsoid`, and the processing chain for `wtf_wiki2html` with absolute link transformation would look like this:
wtf_wiki2html := wtf_fetch > wtf_url > wtf_parsoid
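
The chains above could be wired together with Promises roughly as follows. This is only a minimal sketch: `wtfFetch`, `wtfUrl`, `wtfParse` and `runChain` are hypothetical stand-ins, not the API of any published module, and the fetch step resolves with fixed text instead of contacting a wiki.

```javascript
// Hypothetical stand-in for wtf_fetch: resolves with a small document object
// instead of performing a network request.
function wtfFetch(title, language, domain) {
  return Promise.resolve({
    title: title,
    language: language,
    domain: domain,
    wiki: "See [[Water]] and [[Swarm intelligence|swarms]]."
  });
}

// Hypothetical stand-in for wtf_url: rewrites internal links [[Target|label]]
// into external links that carry an absolute URL.
function wtfUrl(doc) {
  const base = "https://" + doc.language + "." + doc.domain + ".org/wiki/";
  const wiki = doc.wiki.replace(
    /\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/g,
    function (match, target, label) {
      const url = base + encodeURIComponent(target.replace(/ /g, "_"));
      return "[" + url + " " + (label || target) + "]";
    }
  );
  return Object.assign({}, doc, { wiki: wiki });
}

// Hypothetical stand-in for the last step of the chain
// (wtf_wikipedia, or a wtf_parsoid wrapper).
function wtfParse(doc) {
  return { title: doc.title, source: doc.wiki };
}

// wtf_fetch > wtf_url > parser, connected with Promises.
function runChain(title, language, domain) {
  return wtfFetch(title, language, domain).then(wtfUrl).then(wtfParse);
}
```

Because every step takes a document object and returns one (or a Promise of one), swapping the final parser only means replacing the last `.then(...)` in `runChain`.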
The processing of the library `wtf_wikipedia` can be split into the following 3 jobs:
- `wtf_fetch`, which fetches the wiki source from Wikipedia, Wikiversity, ... (a MediaWiki domain) with the parameters language (e.g. `en`, `de`, ...) and domain (e.g. `wikipedia`, `wikiversity`, `wikivoyage`, ...)
- `wtf_parse`, which parses the wiki source into a `Document` object (Abstract Syntax Tree)
- `wtf_output`, which generates/renders the output for a specific format from a given `Document` object. The output modes are attached to the tree nodes in the Abstract Syntax Tree (AST). The current tree nodes are defined in the `wtf_wikipedia` directory `src`. The order in which they are parsed is indicated by the prefix number of the folder name. Each tree node has output rendering functions for each available format (plaintext, latex, markdown, html).
The micro library `wtf_fetch` was extracted from `wtf_wikipedia` to perform just the cross-fetch download of the wiki source in the browser or in a NodeJS environment for further processing.
- GitHub repository for `wtf_fetch`: https://www.github.com/niebert/wtf_fetch
- Demo Web-App showing `wtf_fetch` with a form for populating the parameters of `wtf_fetch`
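
As an illustration of how the language and domain parameters can select the wiki to download from, the sketch below builds a request URL for the standard MediaWiki action API. The function name `buildApiUrl` is hypothetical, and the request that `wtf_fetch` actually sends may differ; the sketch also assumes a project hosted under `<language>.<domain>.org`.

```javascript
// Builds a MediaWiki action API URL that asks for the wiki source of a page.
// Assumption: the project is reachable as https://<language>.<domain>.org
function buildApiUrl(title, language, domain) {
  const url = new URL("https://" + language + "." + domain + ".org/w/api.php");
  url.search = new URLSearchParams({
    action: "query",      // query module of the action API
    prop: "revisions",    // ask for revision data ...
    rvprop: "content",    // ... including the page content
    rvslots: "main",      // content of the main slot
    titles: title,
    format: "json",
    origin: "*"           // allows anonymous cross-origin requests from a browser
  }).toString();
  return url.toString();
}
```

For example, `buildApiUrl("Swarm intelligence", "en", "wikiversity")` yields a URL on `en.wikiversity.org` that could be passed to `fetch` in a browser or in NodeJS.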
The following micro libraries may be implemented. If you contribute to the Wiki Transformation Framework (WTF) and implement one of those methods, please add a link to the repository to this Wiki document.
- `wtf_fetch`, which fetches the wiki source from Wikipedia, Wikiversity, ... (a MediaWiki domain) with the parameters language (e.g. `en`, `de`, ...) and domain (e.g. `wikipedia`, `wikiversity`, `wikivoyage`, ...)
- `wtf_wiki2odt` converts a wiki markdown source into a LibreOffice document. The template `ODT` file can be loaded with `LoadFile4DOM`. An `ODT` file is just a ZIP file with a specific internal folder and file structure. The `ODT` file can be handled with `JSZip` even in a browser, and the `content.xml` in the `ODT` file can be replaced by `wtf_wiki2odt`. This would allow the client-side generation of a LibreOffice document directly from the wiki source.
- `wtf_book_creator` is a client-side book creator that extracts links/references from articles previously downloaded with `wtf_wikipedia`. Categories and links can be used in a Wiki Book Creator to suggest new articles in Wikipedia or Wikiversity to be appended to the client-side generated wiki book. Together with `wtf_wiki2odt`, the output of the book could be a LibreOffice file for further editing.
- `wtf_url` is a link and URL transformation micro library for the Wiki Transformation Framework (WTF).
- `wtf_wiki2reveal` could be an additional export module for the Wiki Transformation Framework (WTF) (see Wiki2Reveal in Wikiversity for the current rapid prototype as proof of concept).
- `wtf_tokenizer`: some content elements need special transformations. A tokenizer replaces those content elements with tokens before `wtf_wikipedia` parses the wiki markdown. After `wtf_wikipedia` has processed the wiki markdown, the tokens are replaced by a specific syntax for the export format (e.g. for citations or mathematical expressions).
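
The tokenizer idea can be sketched for mathematical expressions as follows. The function names, the token format `___MATH_n___` and the delimiters chosen per output format are assumptions made for this illustration, not the behaviour of the actual `wtf_tokenizer`.

```javascript
// Step 1: before parsing, replace every <math>...</math> element by a token
// and remember the formula that belongs to the token.
function tokenizeMath(wiki) {
  const tokens = {};
  let counter = 0;
  const text = wiki.replace(/<math>([\s\S]*?)<\/math>/g, function (match, formula) {
    counter += 1;
    const token = "___MATH_" + counter + "___";
    tokens[token] = formula;
    return token;
  });
  return { text: text, tokens: tokens };
}

// Step 2: after the parser has produced its output, replace each token by the
// syntax of the export format. split/join is used so that characters such as
// "$" in the replacement are never interpreted as replacement patterns.
function detokenizeMath(output, tokens, format) {
  let result = output;
  Object.keys(tokens).forEach(function (token) {
    const formula = tokens[token];
    const replacement = format === "latex"
      ? "$" + formula + "$"
      : "\\(" + formula + "\\)";
    result = result.split(token).join(replacement);
  });
  return result;
}
```

Between the two steps the tokenized text would be handed to the parser, which treats each token as an ordinary word and therefore leaves the formula untouched.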
- Parsing Concepts are based on Parsoid - https://www.mediawiki.org/wiki/Parsoid
- Output: based on concepts of Pandoc, the Swiss-army knife of document conversion developed by John MacFarlane - https://www.pandoc.org