Table of Contents
fleXiParse is a framework for writing XML parsers. It is based on Java's DOM (Document Object Model) and XPath implementation. It uses a visitor pattern, where the user can define NodeHandlers whose handleNode is called for each DOM node matching one of the XPathExpressions provided by the NodeHandler's configuration.
Handlers can store objects in a tree managed by the parser framework. The framework creates a node in the object tree for each element in the source document. Different handlers can communicate with each other by storing objects in the tree and retrieving them from it. The final state of the object tree represents the result of the parsing process and is therefore returned to the code calling the parser.
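The dispatch idea can be illustrated with the JDK's own DOM and XPath classes. The following is a minimal sketch and not fleXiParse code: SketchHandler and VisitorSketch are hypothetical names, and the sketch evaluates the expression against the whole document instead of walking the tree node by node as the framework does.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a handler; this is not the fleXiParse NodeHandler interface.
interface SketchHandler {
    void handleNode(Node node);
}

class VisitorSketch {
    // Calls the handler once for every node selected by the XPath expression.
    static void dispatch(Document doc, String xpath, SketchHandler handler) throws Exception {
        XPathExpression expr = XPathFactory.newInstance().newXPath().compile(xpath);
        NodeList matches = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
        for (int i = 0; i < matches.getLength(); i++) {
            handler.handleNode(matches.item(i));
        }
    }

    // Parses the XML string and collects the text content of all matching nodes.
    static List<String> collectNames(String xml, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> names = new ArrayList<>();
        dispatch(doc, xpath, node -> names.add(node.getTextContent()));
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<addressbook><person>Ada</person><person>Grace</person></addressbook>";
        System.out.println(collectNames(xml, "/addressbook/person"));
    }
}
```

The handler in this sketch only reads the node. In fleXiParse a handler would additionally receive the object tree element described in the text.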
All parsers implement the com.marsching.flexiparse.parser.Parser interface. This interface contains the methods needed to parse a document and to add NodeHandlers to the parser. SimpleParser is a simple implementation that is used as a base for more complex implementations. XMLConfiguredParser extends the SimpleParser class with methods for adding NodeHandlers that are configured in an XML file. ClasspathConfiguredParser reads these XML files from a specified location within the class path and is therefore well suited for building an extensible parser. New handlers can be added simply by placing a JAR in the class path that contains the handlers and the XML configuration file at the right location.
In the following example we will show how to instantiate and use a ClasspathConfiguredParser. The process is basically the same for other parser implementations. However, handlers have to be added explicitly when using another implementation.
Parser parser = new ClasspathConfiguredParser("com/example/myhandlers.xml");
ObjectTreeElement result = parser.parse(new File("test.xml"));
This code first creates an instance of ClasspathConfiguredParser using the configuration path com/example/myhandlers.xml. This means that the handler configuration is expected in a file called myhandlers.xml that is in the com.example package of the class path. If there is more than one such file (e.g. the same file name in the same package in different JARs), all of the files found will be used. Thus the set of handlers can be extended by modules just by placing the configuration file in the right package.
Then the parser's parse method is called to parse a file called test.xml, and the root node of the resulting object tree is assigned to the result variable, which can be used to get objects from the tree that have been attached by the handlers during the parsing process.
The XML file containing the handler configuration has a very simple format:
<configuration
  xmlns="http://www.marsching.com/2008/flexiparse/configurationNS"
  xmlns:x="http://www.example.com/exampleNS"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="
    http://www.marsching.com/2008/flexiparse/configurationNS
    http://www.marsching.com/2008/flexiparse/flexiparse-configuration.xsd
  "
>
  <handler class="com.example.MyHandler">
    <match>/x:addressbook/x:person/x:address</match>
  </handler>
</configuration>
This configuration defines a handler using the class com.example.MyHandler that will be invoked for each element matched by the XPath expression. As you can see, the XPath expression uses namespace prefixes that are defined in the context of the match tag.
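How prefixes in an XPath expression are bound to namespace URIs can be tried out with the JDK alone. This is a sketch under assumptions: NamespaceMatchSketch and the prefix map passed to it are made up for illustration and say nothing about how fleXiParse resolves prefixes internally. It shows that the prefix used in the expression only has to map to the right URI and does not have to appear in the parsed document.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceMatchSketch {
    // Counts the nodes selected by the expression, resolving prefixes through the given map.
    static int countMatches(String xml, String expression, Map<String, String> prefixes)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // without this, namespace-qualified steps cannot match
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return prefixes.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return null; // reverse lookup is not needed for evaluating expressions
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.emptyIterator();
            }
        });
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // The document uses a default namespace, the expression uses the prefix "x".
        String xml = "<addressbook xmlns='http://www.example.com/exampleNS'>"
                + "<person><address/></person></addressbook>";
        Map<String, String> prefixes =
                Collections.singletonMap("x", "http://www.example.com/exampleNS");
        System.out.println(countMatches(xml, "/x:addressbook/x:person/x:address", prefixes));
    }
}
```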
When the parser walks through the DOM tree, there are two different points for each node where the corresponding handlers can be called: either before the child nodes of the node are processed or after the child nodes have been processed. This might be relevant if there are handlers attached to one of the child nodes, as such a handler might either expect some data that has been attached to the object tree by the handler for the parent node or might itself attach some data to the object tree that is then used by one of the parent node's handlers. Therefore you can define a run level for each handler. Valid run levels are start, end and both (start is the default).
Handlers with run level start will be called before the child nodes are processed. Handlers with run level end are called after the child nodes have been processed. Handlers with run level both are called before as well as after the child nodes have been processed.
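The effect of the three run levels on the call order can be sketched with a plain recursive DOM walk. RunLevelSketch and its log format are hypothetical and only illustrate the ordering described above. The sketch applies one run level to every element, whereas fleXiParse configures the run level per handler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class RunLevelSketch {
    enum RunLevel { START, END, BOTH }

    // Records when a handler with the given run level would be invoked for each element.
    static void walk(Element element, RunLevel level, List<String> log) {
        if (level != RunLevel.END) {
            log.add("before " + element.getTagName()); // START and BOTH: before the children
        }
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                walk((Element) child, level, log);
            }
        }
        if (level != RunLevel.START) {
            log.add("after " + element.getTagName()); // END and BOTH: after the children
        }
    }

    static List<String> trace(String xml, RunLevel level) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        List<String> log = new ArrayList<>();
        walk(root, level, log);
        return log;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(trace("<person><address/></person>", RunLevel.BOTH));
    }
}
```

With run level END the entry for address is logged before the entry for person, which is why a parent handler at that level can rely on data attached by handlers of its child nodes.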
<configuration
  xmlns="http://www.marsching.com/2008/flexiparse/configurationNS"
  xmlns:x="http://www.example.com/exampleNS"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="
    http://www.marsching.com/2008/flexiparse/configurationNS
    http://www.marsching.com/2008/flexiparse/flexiparse-configuration.xsd
  "
>
  <handler class="com.example.PersonHandler" run-level="end">
    <match>/x:addressbook/x:person</match>
  </handler>
  <handler class="com.example.AddressHandler">
    <match>/x:addressbook/x:person/x:address</match>
  </handler>
</configuration>
In this example the handler com.example.PersonHandler will be called after the child nodes have been processed and might therefore collect objects from the object tree that have been created by the com.example.AddressHandler.
Each handler may specify an id using the id attribute. If no explicit id is specified, the handler's class name is implicitly used as the id. The id has to be unique, that is, there must not be more than one handler using the same id.
Handlers may specify run order dependencies on other handlers using the preceding-handler and following-handler tags. These dependencies are only considered if both handlers are acting on the same node. Otherwise a dependency will be silently ignored.
<configuration
  xmlns="http://www.marsching.com/2008/flexiparse/configurationNS"
  xmlns:x="http://www.example.com/exampleNS"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="
    http://www.marsching.com/2008/flexiparse/configurationNS
    http://www.marsching.com/2008/flexiparse/flexiparse-configuration.xsd
  "
>
  <handler class="com.example.PersonHandler" run-level="end">
    <match>/x:addressbook/x:person</match>
  </handler>
  <handler class="com.example.AddressHandler" id="com.example.SomeOtherId">
    <match>/x:addressbook/x:person/x:address</match>
  </handler>
  <handler class="com.example.VerificationHandler">
    <preceding-handler>com.example.PersonHandler</preceding-handler>
    <preceding-handler>com.example.SomeOtherId</preceding-handler>
    <match>/x:addressbook/x:person/x:address</match>
    <match>/x:addressbook/x:person</match>
  </handler>
</configuration>
In this example the VerificationHandler is called after the PersonHandler or AddressHandler (depending on the node being processed) has been called. Thus the verification handler may use objects from the object tree that have been placed there by the other handler for the same node. Placing <following-handler>com.example.VerificationHandler</following-handler> in the configurations of the PersonHandler and AddressHandler instead of the preceding-handler declarations in the VerificationHandler configuration would have the same effect.
In fact, both could even be present at the same time, because these constraints do not conflict. If there are conflicts (a circular dependency graph), an exception is thrown by the parser.
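One way to turn such constraints into a call order is a simple topological sort. The sketch below is an assumption about how this could be done, not a description of the fleXiParse implementation. HandlerOrderSketch is a hypothetical name, and IllegalStateException was chosen only for illustration, since the text does not say which exception type the parser throws. A following-handler declaration could be handled by recording the inverse preceding relation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class HandlerOrderSketch {
    // ids: handlers acting on the current node.
    // precedingOf.get(id): ids of the handlers that have to run before that handler.
    static List<String> order(List<String> ids, Map<String, List<String>> precedingOf) {
        List<String> ordered = new ArrayList<>();
        Set<String> remaining = new LinkedHashSet<>(ids);
        while (!remaining.isEmpty()) {
            String next = null;
            for (String candidate : remaining) {
                boolean ready = true;
                for (String dep : precedingOf.getOrDefault(candidate, Collections.emptyList())) {
                    // A dependency on a handler that does not act on this node is ignored.
                    if (remaining.contains(dep)) {
                        ready = false;
                        break;
                    }
                }
                if (ready) {
                    next = candidate;
                    break;
                }
            }
            if (next == null) {
                // Every remaining handler waits for another remaining handler: a cycle.
                throw new IllegalStateException("Circular handler dependency among " + remaining);
            }
            remaining.remove(next);
            ordered.add(next);
        }
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, List<String>> precedingOf = new HashMap<>();
        precedingOf.put("com.example.VerificationHandler",
                Arrays.asList("com.example.PersonHandler", "com.example.SomeOtherId"));
        System.out.println(order(
                Arrays.asList("com.example.VerificationHandler", "com.example.PersonHandler"),
                precedingOf));
    }
}
```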
The object tree is used to store the result of the parsing process and to share data between different handlers (or several invocations of the same handler). The object tree has one root node for the document being parsed and one child node for each XML element in the document (including the root element). The tree structure of these nodes reflects the tree structure of the parsed document.
Each time a handler is invoked, a reference to an object tree element is passed. This object tree element corresponds to the XML element (or the parent XML element for non-element nodes) being processed at this time.
The handler can use this object tree element to attach or retrieve Java objects created in the context of the current node or to navigate through the object tree and retrieve Java objects attached to other object tree elements.
A handler can attach an arbitrary Java object to an object tree element by invoking the addObject(Object object) method. By convention, a handler should usually only attach objects to the object tree element corresponding to the current context XML element, although there is no technical restriction requiring or enforcing this behavior.
The method getObjects() returns all objects attached to the object tree element regardless of their type. The objects are returned in the order they have been attached to the object tree element.
The method getObjectsOfType(Class type) returns all objects attached to the object tree element that are sub-types of the type specified by the parameter type. The objects are returned in the order they have been attached to the object tree element.
The method getObjectsOfTypeFromSubTree(Class type) returns objects of the given type (or a sub-type) that are attached to the object tree element or one of its descendant elements. The order is parent elements before child elements, child elements on the same level in the order of the corresponding XML elements in the source document, and objects attached to the same element in the order they have been added to the element.
The method getObjectsOfTypeFromTopTree(Class type) returns objects of the given type (or a sub-type) that are attached to the object tree element or one of its ancestor elements. The order is child element before parent element, and objects attached to the same element in the order they have been added to the element.
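The ordering rules of these lookup methods can be made concrete with a small stand-alone tree class. TreeNodeSketch is a hypothetical simplification written for this illustration. It borrows the method names from the text but is not the fleXiParse ObjectTreeElement, and its generic signatures are an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class TreeNodeSketch {
    private final TreeNodeSketch parent;
    private final List<TreeNodeSketch> children = new ArrayList<>();
    private final List<Object> objects = new ArrayList<>();

    // Creates a node and registers it as the last child of the parent (null for the root).
    TreeNodeSketch(TreeNodeSketch parent) {
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    void addObject(Object object) {
        objects.add(object);
    }

    // Objects attached to this node that are instances of the type, in attachment order.
    <T> List<T> getObjectsOfType(Class<T> type) {
        List<T> result = new ArrayList<>();
        for (Object o : objects) {
            if (type.isInstance(o)) {
                result.add(type.cast(o));
            }
        }
        return result;
    }

    // This node first, then each child subtree in child order (pre-order traversal).
    <T> List<T> getObjectsOfTypeFromSubTree(Class<T> type) {
        List<T> result = getObjectsOfType(type);
        for (TreeNodeSketch child : children) {
            result.addAll(child.getObjectsOfTypeFromSubTree(type));
        }
        return result;
    }

    // This node first, then its parent, then the parent's parent, up to the root.
    <T> List<T> getObjectsOfTypeFromTopTree(Class<T> type) {
        List<T> result = new ArrayList<>();
        for (TreeNodeSketch node = this; node != null; node = node.parent) {
            result.addAll(node.getObjectsOfType(type));
        }
        return result;
    }

    public static void main(String[] args) {
        TreeNodeSketch root = new TreeNodeSketch(null);
        TreeNodeSketch person = new TreeNodeSketch(root);
        root.addObject("root object");
        person.addObject("person object");
        System.out.println(root.getObjectsOfTypeFromSubTree(String.class));
        System.out.println(person.getObjectsOfTypeFromTopTree(String.class));
    }
}
```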
When invoking the parse method of the Parser interface, arbitrary objects can be passed as parameters. These objects are attached to the root element of the object tree before the parsing process begins. In this way parameters can be passed from the code invoking the parser to the parsing handlers.
If an instance of com.marsching.flexiparse.objecttree.DisableParsingFlag is attached to an XML element's object tree element before the child nodes of the XML element have been processed, these child nodes are excluded from the parsing process. This mechanism can be used in order to exclude certain parts of the XML tree from the parsing process based on runtime parameters.
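The effect of such a flag on the tree walk can be sketched without the framework. In the following illustration SkipChildrenSketch and SkipFlag are hypothetical. SkipFlag merely plays the role that DisableParsingFlag has in the text, and the set of tag names stands in for a runtime parameter evaluated by a handler with run level start.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SkipChildrenSketch {
    // Hypothetical marker object standing in for DisableParsingFlag.
    static final class SkipFlag { }

    static void walk(Element element, Set<String> disabledTags, List<String> visited) {
        visited.add(element.getTagName());

        // Objects attached for this element by a handler that runs before the child nodes.
        List<Object> attached = new ArrayList<>();
        if (disabledTags.contains(element.getTagName())) {
            attached.add(new SkipFlag());
        }

        for (Object o : attached) {
            if (o instanceof SkipFlag) {
                return; // flag present: the child nodes are not processed
            }
        }
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                walk((Element) child, disabledTags, visited);
            }
        }
    }

    static List<String> visit(String xml, Set<String> disabledTags) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        List<String> visited = new ArrayList<>();
        walk(root, disabledTags, visited);
        return visited;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(visit("<a><b><c/></b><d/></a>", java.util.Collections.singleton("b")));
    }
}
```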
While fleXiParse's handler concept provides a maximum of flexibility and extensibility, writing a handler for every tag can be a time-consuming task. Therefore fleXiParse provides a facility for the automatic mapping of XML data to objects. This facility is enabled by using configuration tags from the http://www.marsching.com/2008/flexiparse/xml2objectNS namespace.
<configuration
  xmlns="http://www.marsching.com/2008/flexiparse/configurationNS"
  xmlns:t="http://www.example.com/exampleNS"
  xmlns:xo="http://www.marsching.com/2008/flexiparse/xml2objectNS"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="
    http://www.marsching.com/2008/flexiparse/configurationNS
    http://www.marsching.com/2008/flexiparse/flexiparse-configuration.xsd
    http://www.marsching.com/2008/flexiparse/xml2objectNS
    http://www.marsching.com/2008/flexiparse/flexiparse-xml2object.xsd
  "
>
  <xo:element name="t:test" target-type="com.example.ExampleObjectA">
    <xo:attribute name="a" target-attribute="a"/>
    <xo:attribute name="b" target-attribute="b" target-type="com.example.ExampleObjectB" occurrence="1" />
  </xo:element>
</configuration>
XML to object mapping configurations use the same configuration files as handlers, but a different namespace. There are three kinds of mappings: mappings for elements, for attributes, and for text nodes.
The mapping for an XML element is defined using the element tag. The name attribute takes a local or qualified name. If a local name is given, the mapping matches elements with this name using no namespace. If the qualified form is used, the mapping matches elements with this name and the namespace bound to the prefix in the context of the configuration element.
The target-type attribute specifies the name of the Java type the XML element is mapped to. This has to be either a Java primitive (e.g. int, boolean) or a fully qualified type name (e.g. java.lang.String, which is the default). The type has to have a default constructor.
The target-attribute attribute specifies the name of the attribute in the parent Java object in which the object created for this mapping should be stored. This attribute is only used if the corresponding tag is encountered within another tag handled by this XML to object facility. There has to be a setter method (adhering to the JavaBeans convention) for the specified attribute in the parent object. If the special name !mapentry is used, the parent object has to implement java.util.Map and the object has to implement java.util.Map.Entry. In this case the object will be added to the parent map. If the special name !collectionentry is used, the parent object has to implement java.util.Collection and the object will be added to the parent collection. If the special name !parent is used, the parent object will be replaced by this object.
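How a target attribute name could be turned into a setter call or a collection or map insertion is sketched below using plain reflection. This is an illustration under assumptions and not the fleXiParse implementation: TargetAttributeSketch, its store method and the Person bean are hypothetical, and the special name !parent is left out.

```java
import java.lang.reflect.Method;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TargetAttributeSketch {
    // Example bean used only by this sketch.
    static class Person {
        private String name;
        public void setName(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Stores the value in the parent according to the target attribute name.
    @SuppressWarnings("unchecked")
    static void store(Object parent, String targetAttribute, Object value)
            throws ReflectiveOperationException {
        if ("!collectionentry".equals(targetAttribute)) {
            ((Collection<Object>) parent).add(value);
        } else if ("!mapentry".equals(targetAttribute)) {
            Map.Entry<?, ?> entry = (Map.Entry<?, ?>) value;
            ((Map<Object, Object>) parent).put(entry.getKey(), entry.getValue());
        } else {
            // JavaBeans convention: attribute "name" is written through setName(...).
            String setter = "set" + Character.toUpperCase(targetAttribute.charAt(0))
                    + targetAttribute.substring(1);
            for (Method method : parent.getClass().getMethods()) {
                if (method.getName().equals(setter) && method.getParameterCount() == 1
                        && method.getParameterTypes()[0].isInstance(value)) {
                    method.invoke(parent, value);
                    return;
                }
            }
            throw new NoSuchMethodException(setter);
        }
    }

    public static void main(String[] args) throws Exception {
        Person person = new Person();
        store(person, "name", "Ada");
        List<Object> people = new ArrayList<>();
        store(people, "!collectionentry", person);
        Map<String, String> parameters = new HashMap<>();
        store(parameters, "!mapentry", new AbstractMap.SimpleEntry<>("key", "value"));
        System.out.println(person.getName() + " " + people.size() + " " + parameters);
    }
}
```

The setter lookup in this sketch only accepts reference types. Primitive setter parameters, which the text allows as target types, would need additional conversion.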
If the deep-search attribute is set to true, the XML to object mapper will not restrict the search to direct child elements, but will use all descendant elements when looking for elements matching the child mapping configurations.
Root element mappings have no parent mappings. This kind of mapping is used for every element which matches the specified namespace and local name. When processing the child nodes of a mapped element, the mappings nested inside the parent definition will be used first. However, if no matching mapping is found, the root mappings with the target-attribute set will be used, too.
The occurrence attribute can be set to either 0..1, allowing a maximum of one instance per context, or 0..n, allowing an unlimited number of instances per context.
Nested element mappings are children of other nested or root element mappings. The target-attribute attribute is mandatory for this kind of mapping.
The occurrence attribute may be set to either 0..1 (default), 0..n, 1 or 1..n.
Attribute mappings basically support the settings described in Section 5.3, “Element Mappings”. However, the target Java type has to have a constructor taking a single parameter of type java.lang.String instead of a default constructor. The deep-search attribute does not exist for attribute mappings.
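The requirement of a constructor taking a single java.lang.String can be illustrated with standard reflection. StringConstructorSketch and its convert method are hypothetical names for this sketch. How fleXiParse itself instantiates target types, and how it treats primitive types, is not shown here.

```java
import java.lang.reflect.Constructor;

class StringConstructorSketch {
    // Creates an instance of the named type from the raw attribute value.
    static Object convert(String typeName, String rawValue) throws ReflectiveOperationException {
        Class<?> type = Class.forName(typeName);
        // Fails with NoSuchMethodException if the type has no public (String) constructor.
        Constructor<?> constructor = type.getConstructor(String.class);
        return constructor.newInstance(rawValue);
    }

    public static void main(String[] args) throws Exception {
        Object amount = convert("java.math.BigDecimal", "12.50");
        System.out.println(amount.getClass().getName() + " " + amount);
    }
}
```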
The occurrence attribute can be set to either 0..1 (default) or 1.
Text mappings have the same options as attribute mappings; however, there is no name attribute, and the occurrence attribute may be set to either 0..1 (default), 0..n, 1 or 1..n.
If the append attribute is set to true, all text nodes in a given context are concatenated, creating one single string.
If the ignore-white-space attribute is set to true, text nodes that contain white space only are ignored.
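The interplay of the two options can be sketched on a plain DOM element. TextMappingSketch is a hypothetical helper. It assumes that white-space-only nodes are dropped before concatenation and that "white space only" means the text is empty after trimming. Neither detail is confirmed by the text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TextMappingSketch {
    // Collects the values of the direct text node children of the element.
    static List<String> collectText(Element element, boolean append, boolean ignoreWhiteSpace) {
        List<String> parts = new ArrayList<>();
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() != Node.TEXT_NODE) {
                continue;
            }
            String text = child.getNodeValue();
            if (ignoreWhiteSpace && text.trim().isEmpty()) {
                continue; // node contains white space only
            }
            parts.add(text);
        }
        if (append && !parts.isEmpty()) {
            // All remaining text nodes become one single string.
            return Collections.singletonList(String.join("", parts));
        }
        return parts;
    }

    static List<String> fromXml(String xml, boolean append, boolean ignoreWhiteSpace)
            throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        return collectText(root, append, ignoreWhiteSpace);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fromXml("<p> <b/>Hello<b/> World</p>", true, true));
    }
}
```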
Objects can be added to collections using the special target attribute !collectionentry.
<xo:element name="addressbook" target-type="java.util.HashSet">
  <xo:element name="person" target-attribute="!collectionentry" target-type="com.example.Person">
    ...
  </xo:element>
</xo:element>
Map entries can be added to a map using the special target attribute !mapentry. The entries have to implement java.util.Map.Entry. If this interface is specified as the target type of a mapping, fleXiParse will use an internal implementation which has a key and a value attribute.
<xo:element name="parameters" target-type="java.util.HashMap">
  <xo:element name="parameter" target-attribute="!mapentry" target-type="java.util.Map.Entry">
    <xo:attribute name="name" target-attribute="key"/>
    <xo:text target-attribute="value"/>
  </xo:element>
</xo:element>