Xmls is a small, simple, non-validating xml parser for Common Lisp. It's designed to be a self-contained, easily embedded parser that recognizes a useful subset of the XML spec. It provides a simple mapping from xml to lisp structures or s-expressions and back.
Since XMLS was first released it has gained some additional complications/features. In particular:
There is now an additional system, xmls/octets, that will open streams for the XMLS parser, processing any content-type declarations in the process.
Parsed xml is represented as a nested lisp structure, unlike in the original version, where it was a lisp list. The s-expression representation is still maintained, and there are functions to translate to and from this notation.
In the structure representation, a node, corresponding to an XML element, is defined as follows:
(defstruct (node (:constructor %make-node)) name ns attrs children)
Xmls also includes a helper function, make-node, for creating xml nodes of this form:
(make-node &key name ns attrs children)
Xmls provides the corresponding accessor functions node-name, node-ns, node-attrs, and node-children.
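As a minimal sketch of the structure representation, the following builds one node by hand and reads it back with the accessors. It assumes xmls is installed where ASDF can find it; the variable name *author* is only for illustration.

```lisp
(require "asdf")
(asdf:load-system "xmls")

;; Build <author xmlns="http://authors">Stanislaw Lem</author> by hand.
(defparameter *author*
  (xmls:make-node :name "author"
                  :ns "http://authors"
                  :children (list "Stanislaw Lem")))

(xmls:node-name *author*)      ; => "author"
(xmls:node-ns *author*)        ; => "http://authors"
(xmls:node-attrs *author*)     ; => NIL
(xmls:node-children *author*)  ; => ("Stanislaw Lem")
```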
In the s-expression representation, a node is represented as follows:
(name (attributes) children*)
A name is either a simple string, if the element does not belong to a namespace, or a cons of the form (name . namespace-url) if the element does belong to a namespace.
Attributes are stored as (name value) lists.
Children are stored as a list of either element nodes or text nodes.
For example, the following xml document:
<?xml version="1.0"?>
<!-- test document -->
<book title='The Cyberiad'>
  <!-- comment in here -->
  <author xmlns='http://authors'>Stanislaw Lem</author>
  <info:subject xmlns:info='http://bookinfo' rank='1'>"Cybernetic Fables"</info:subject>
</book>
would parse as:
("book" (("title" "The Cyberiad")) (("author" . "http://authors") NIL "Stanislaw Lem") (("subject" . "http://bookinfo") (("rank" "1")) "\"Cybernetic Fables\""))
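To make the list layout concrete, here is a small sketch in plain Common Lisp (no XMLS calls) that picks the example list apart. The sexp-* function names are made up for this illustration and are not part of XMLS.

```lisp
;; The list form of the example document, copied from above.
(defparameter *book*
  '("book" (("title" "The Cyberiad"))
    (("author" . "http://authors") nil "Stanislaw Lem")
    (("subject" . "http://bookinfo") (("rank" "1")) "\"Cybernetic Fables\"")))

(defun sexp-name (node)
  "Local name of NODE, whether or not it carries a namespace."
  (let ((name (first node)))
    (if (consp name) (car name) name)))

(defun sexp-namespace (node)
  "Namespace URL of NODE, or NIL if it has none."
  (let ((name (first node)))
    (if (consp name) (cdr name) nil)))

(defun sexp-attrs (node)
  "Attribute list of NODE, as (name value) lists."
  (second node))

(defun sexp-children (node)
  "Child elements and text of NODE."
  (cddr node))

(sexp-name *book*)                                ; => "book"
(sexp-attrs *book*)                               ; => (("title" "The Cyberiad"))
(mapcar #'sexp-name (sexp-children *book*))       ; => ("author" "subject")
(sexp-namespace (first (sexp-children *book*)))   ; => "http://authors"
(sexp-children (first (sexp-children *book*)))    ; => ("Stanislaw Lem")
```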
To detect whether in this version of XMLS the return value of PARSE will be a list or a structure, check for the feature :XMLS-NODES-ARE-STRUCTS.
For old code that wants XML parsed into lists, instead of structures, you may replace calls to (parse str) with (node->nodelist (parse str)).
For greater convenience, we offer PARSE-TO-LIST, which performs the same function.
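One way code that must run against both old and new XMLS might use the feature is a read-time conditional, sketched below. The wrapper name parse-as-list is hypothetical, and the sketch assumes xmls is loadable through ASDF.

```lisp
(require "asdf")
(asdf:load-system "xmls")

(defun parse-as-list (source)
  "Hypothetical wrapper: always return the list representation of SOURCE."
  ;; Newer XMLS: PARSE returns structures, so convert them.
  #+xmls-nodes-are-structs (xmls:node->nodelist (xmls:parse source))
  ;; Older XMLS: PARSE already returns lists.
  #-xmls-nodes-are-structs (xmls:parse source))

(first (parse-as-list "<book title='The Cyberiad'/>"))  ; => "book"
```

Code that only targets the current version can call PARSE-TO-LIST directly instead.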
The interface is straightforward. The two main functions are PARSE and TOXML.
(parse source &key (compress-whitespace t) (quash-errors t))
Parse accepts either a string or an input stream and attempts to parse the xml document contained therein. It returns the parse tree if it's successful, or nil if parsing fails.
If COMPRESS-WHITESPACE is non-NIL, content nodes will be trimmed of whitespace and empty whitespace strings between nodes will be discarded.
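A short sketch of calling PARSE on a string and on a stream follows. It assumes a version of XMLS in which PARSE returns node structures; the variable names and the sample documents are made up for the example.

```lisp
(require "asdf")
(asdf:load-system "xmls")

(defparameter *doc*
  "<book title='The Cyberiad'><author>Stanislaw Lem</author></book>")

;; Parse from a string.
(defparameter *root* (xmls:parse *doc*))

;; Parse the same document from an input stream.
(defparameter *root-from-stream*
  (with-input-from-string (in *doc*)
    (xmls:parse in)))

;; PARSE returns NIL on failure, so check before using the result.
(when *root*
  (format t "root element: ~A~%" (xmls:node-name *root*))
  (format t "attributes:   ~S~%" (xmls:node-attrs *root*)))

;; Ask PARSE not to trim whitespace in content nodes.
(defparameter *untrimmed*
  (xmls:parse "<note>  keep my spaces  </note>" :compress-whitespace nil))
```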
(parse-to-list source (&rest args))
Functions as PARSE, but returns a list representation of the XML document, instead of a structure.
(write-prologue xml-decl doctype stream)
write-prologue writes the leading <?xml ... ?> and <!DOCTYPE ... > elements to stream. xml-decl is an alist of attribute name value pairs. Valid xml-decl attributes per the xml spec are "version", "encoding", and "standalone", though write-prologue does not verify this. doctype is a string containing the document type definition.
(write-prolog xml-decl doctype stream)
U.S. spelling alternative to write-prologue.
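The following sketch writes a prologue into a string. It assumes the alist entries are dotted (name . value) pairs, which the description above does not spell out, and the doctype string "book" is a made-up value.

```lisp
(require "asdf")
(asdf:load-system "xmls")

(defparameter *prologue*
  (with-output-to-string (out)
    (xmls:write-prologue '(("version" . "1.0")      ; assumed: dotted pairs
                           ("encoding" . "utf-8"))
                         "book"                      ; hypothetical doctype
                         out)))

;; Show what was written.
(write-string *prologue*)
```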
(write-xml xml stream &key (indent nil))
write-xml accepts a lisp list in the format described above and writes the equivalent xml string to stream. Currently, if nodes use namespaces, xmls will not assign namespace prefixes but will explicitly assign the namespace to each node. This will be changed in a later release. Xmls will indent the generated xml output if indent is non-nil.
(toxml node &key (indent nil))
TOXML is a convenience wrapper around write-xml that returns the xml in a newly allocated string.
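As a rough illustration, the sketch below parses a document, serializes it both ways, and parses the result again. It assumes xmls is loadable through ASDF; the variable names are made up, and the exact output text is not shown because it depends on the XMLS version.

```lisp
(require "asdf")
(asdf:load-system "xmls")

(defparameter *root*
  (xmls:parse "<book title='The Cyberiad'><author>Stanislaw Lem</author></book>"))

;; TOXML returns the serialized document as a fresh string.
(defparameter *xml-string* (xmls:toxml *root*))

;; WRITE-XML sends it to a stream instead, here with indentation.
(xmls:write-xml *root* *standard-output* :indent t)

;; The string can be fed back to PARSE.
(xmls:node-name (xmls:parse *xml-string*))  ; => "book"
```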
XMLS provides two exported functions to translate between the CL structure representation of the XML tree and the s-expression representation:
node->nodelist
nodelist->nodes
The following helper functions are intended to allow programmers to avoid direct manipulation of the XMLS element representation. If you use these, your code should be easier to read and you will avoid problems if there is a change in internal representation (such changes would be hard to even find, much less correct, if using the lists directly).
make-xmlrep (tag &key attribs children)
xmlrep-add-child! (xmlrep child)
xmlrep-tag (xmlrep)
xmlrep-tagmatch (tag treenode)
xmlrep-attribs (xmlrep)
xmlrep-children (xmlrep)
xmlrep-find-child-tags (tag treenode) (tags are matched using xmlrep-tagmatch)
xmlrep-find-child-tag (tag treenode &optional (if-unfound :error))
xmlrep-string-child (treenode &optional (if-unfound :error))
xmlrep-integer-child (treenode)
xmlrep-attrib-value (attrib treenode &optional (if-undefined :error))
xmlrep-boolean-attrib-value (attrib treenode &optional (if-undefined :error))
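A brief sketch of the helpers in use, on a parsed document and on a tree built by hand. It assumes xmls is loadable through ASDF; the sample documents and variable names are invented for the example.

```lisp
(require "asdf")
(asdf:load-system "xmls")

(defparameter *book*
  (xmls:parse "<book title='The Cyberiad'><author>Stanislaw Lem</author></book>"))

;; Read from a parsed tree without touching its representation.
(xmls:xmlrep-tag *book*)                    ; => "book"
(xmls:xmlrep-attrib-value "title" *book*)   ; => "The Cyberiad"
(xmls:xmlrep-string-child
 (xmls:xmlrep-find-child-tag "author" *book*))  ; => "Stanislaw Lem"

;; Build a tree with the same helpers.
(defparameter *shelf*
  (xmls:make-xmlrep "shelf" :attribs '(("room" "study"))))
(xmls:xmlrep-add-child! *shelf* *book*)

(xmls:xmlrep-attrib-value "room" *shelf*)   ; => "study"
(length (xmls:xmlrep-children *shelf*))     ; => 1
```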
XMLS itself simply processes strings or streams. This means that it does not provide native support for handling character encodings, as declared in the XML headers. The system xmls/octets, which depends on xmls, provides that support with the exported function make-xml-stream, which takes an octet-stream as argument, processes its header, choosing the appropriate character encoding, and then returns a stream suitable for passing to xmls:parse.
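A sketch of how this might be used to parse a file follows. The package prefix xmls/octets: is an assumption (only the system name and the function name are given above), and parse-xml-file is a hypothetical helper.

```lisp
(require "asdf")
(asdf:load-system "xmls/octets")

(defun parse-xml-file (pathname)
  "Hypothetical helper: parse the XML file at PATHNAME, honouring its declared encoding."
  ;; Open the file as raw octets so the encoding can be chosen from the header.
  (with-open-file (octets pathname :element-type '(unsigned-byte 8))
    ;; Assumed package name: xmls/octets.
    (xmls:parse (xmls/octets:make-xml-stream octets))))
```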
Probably make-xml-stream should be made generic, and support arguments of other types (e.g., strings interpreted as filenames, pathnames, etc.).
xmls can be installed as an asdf system. An asdf system definition is provided with the distribution.
Previous versions of XMLS were single files, and could be installed simply by loading the file xmls.lisp. This option is no longer supported.
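For instance, assuming the xmls directory is somewhere ASDF searches, loading it or depending on it might look like this. The system name "my-app" and its "main" component are hypothetical.

```lisp
(require "asdf")

;; Load xmls directly, e.g. at the REPL.
(asdf:load-system "xmls")

;; Or declare it as a dependency of your own system.
;; "my-app" and "main" are hypothetical names.
(asdf:defsystem "my-app"
  :depends-on ("xmls")
  :components ((:file "main")))
```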
Please contact Robert Goldman, rpgoldman AT sift.net with any questions or bug reports.