Parses an XML string into Venice data structures.
Returns a tree of XML element maps with the keys :tag (XML element name), :attrs (XML element attributes), and :content (XML element content).
(do
  ;; load the Venice XML extension module
  (load-module :xml)
  (str (xml/parse-str "<a><b>B</b></a>"))
  ; -> {:tag "a" :content [{:tag "b" :content ["B"]}]}
  (str (xml/parse-str
"""
<?xml version="1.0" encoding="UTF-8"?>
<a a1="100">
<b>B1</b>
<b>B2</b>
</a>
""")))
; -> {:tag "a"
;     :attrs {:a1 "100"}
;     :content [{:tag "b" :content ["B1"]}
;               {:tag "b" :content ["B2"]}]}
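Because the result is an ordinary Venice map, its parts can be read with standard map access. The following is a minimal sketch, assuming that `get`, `count`, and `first` behave on the parsed tree as they do on any other Venice map or vector; the sample XML string is made up for illustration.

```venice
(do
  (load-module :xml)

  ;; hypothetical sample document, not taken from a real file
  (def node (xml/parse-str "<a a1=\"100\"><b>B1</b><b>B2</b></a>"))

  (println (get node :tag))                          ; element name of the root
  (println (get node :attrs))                        ; attribute map of the root
  (println (count (get node :content)))              ; number of child nodes
  (println (get (first (get node :content)) :tag)))  ; name of the first child
```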
Alternatively, Venice can parse XML data from various sources:
String
(xml/parse-str "<a><b>B</b></a>")
SAX Parser InputSource
(xml/parse (->> (. :java.io.StringReader :new "<a><b>B</b></a>")
                (. :org.xml.sax.InputSource :new)))
InputStream
(try-with [is (. :java.io.FileInputStream :new (io/file "books.xml"))]
  (xml/parse is))
File
(xml/parse (io/file "books.xml"))
URI
(xml/parse "https://www.w3schools.com/xml/books.xml")
The following examples outline XPath-like navigation through parsed XML documents.
The XML books.xml:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="web" cover="paperback">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
Parse the XML
(do
  (load-module :xml)
  (def nodes (xml/parse "https://www.w3schools.com/xml/books.xml")))
xml/parse parses the XML into a tree structure like this:
{:tag "bookstore" :content [{:tag "book"} ...]}
Descend into the node's child elements
(xml/children nodes)
which results in
({:tag "bookstore"}
 {:tag "book"
  :attrs {:category "cooking"}
  :content [...]}
 {...}
 {:tag "book"
  :attrs {:category "children"}
  :content [...]})
;; titles of all books
(let [path [(xml/tag= "book")
            (xml/tag= "title")
            xml/text]]
  (xml/path-> path nodes))
result:
'("Everyday Italian" "Harry Potter" "XQuery Kick Start" "Learning XML")
;; title of the second book in the "web" category
(let [path [(xml/tag= "book")
            (xml/attr= :category "web")
            (xml/tag= "title")
            xml/text
            second]]
  (xml/path-> path nodes))
result:
"Learning XML"
Alternatively, the query can be written as:
(->> [nodes]
     ((xml/tag= "book"))
     ((xml/attr= :category "web"))
     ((xml/tag= "title"))
     xml/text
     second)
;; total price of all books in the "web" category
(let [path [(xml/tag= "book")
            (xml/attr= :category "web")
            (xml/tag= "price")
            xml/text]]
  (reduce + (map decimal (xml/path-> path nodes))))
result:
89.94M
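The category queries above share the same shape, so they can be wrapped in a small helper. This is only a sketch: `titles-in-category` is a hypothetical name, not part of the xml module, and the inline document is a shortened, made-up variant of books.xml used to keep the example self-contained.

```venice
(do
  (load-module :xml)

  ;; hypothetical helper, not part of the Venice xml module:
  ;; returns the title texts of all books in the given category
  (defn titles-in-category [nodes category]
    (xml/path-> [(xml/tag= "book")
                 (xml/attr= :category category)
                 (xml/tag= "title")
                 xml/text]
                nodes))

  ;; shortened, made-up variant of books.xml
  (def sample
    (xml/parse-str
      (str "<bookstore>"
           "<book category=\"web\"><title>Learning XML</title></book>"
           "<book category=\"cooking\"><title>Everyday Italian</title></book>"
           "</bookstore>")))

  (println (titles-in-category sample "web")))
```

Applied to the full books.xml tree, a call such as (titles-in-category nodes "web") should select the titles of the two web books, XQuery Kick Start and Learning XML.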
(xml/tag= "book")
is equivalent to
(xml/tagp #(== % "book"))
(xml/tagp (partial == "book"))
(xml/attr= :category "web")
is equivalent to
(xml/attrp :category #(== % "web"))
(xml/attrp :category (partial == "web"))
;; the same query, matching the tag name and the category against patterns
(let [path [(xml/tagp #(match? % "book.*"))
            (xml/attrp :category #(match? % "web.*"))
            (xml/tag= "title")
            xml/text
            second]]
  (xml/path-> path nodes))
result:
"Learning XML"
;; titles of all books that have a :cover attribute
(let [path [(xml/tag= "book")
            (xml/attrp :cover some?)
            (xml/tag= "title")
            xml/text]]
  (xml/path-> path nodes))
result:
'("Learning XML")