I changed the Blogger configuration to make later blogging easier. Older entries may not be formatted correctly.

Showing posts with label XML. Show all posts

Friday, 12 July 2013

Axiom - XML Processing for Axis and others

I have been taking a look at the Apache Axis2 web service framework. Reading the documentation, I was reminded of the Axiom framework, which it uses for XML processing. I will therefore gather here a few pieces of information that I find useful. They come partly from the quickstart-samples page. The examples are changed a little to fit my way of thinking. This also serves as documentation for myself.


In addition to processing XML efficiently, Axiom can be used to process MTOM messages, which SOAP uses to transport binary data.
The quickstart-samples page has some examples of this as well, but I will use a separate post for that.


How to retrieve child elements from root:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.namespace.QName;
import org.apache.axiom.om.OMElement;
import org.apache.axiom.om.OMException;
import org.apache.axiom.om.OMXMLBuilderFactory;

public void processXmlFile(File file, String namespace, String elementName) throws IOException, OMException {
    // open the file input stream; it is closed at the end of this method
    InputStream in = new FileInputStream(file);
    try {
        // get the root element
        OMElement root = OMXMLBuilderFactory.createOMBuilder(in).getDocumentElement();

        // look up the first child with the given qualified name,
        // e.g. namespace "http://maven.apache.org/POM/4.0.0" and element "url"
        OMElement element = root.getFirstChildWithName(new QName(namespace, elementName));
        if (element == null) {
            System.out.println("No <" + elementName + "> element found");
        } else {
            System.out.println(elementName + " = " + element.getText());
        }
    } finally {
        // Because Axiom uses deferred parsing, the stream must be closed AFTER
        // processing the document (unless OMElement#build() is called)
        in.close();
    }
}

Schema validation

How to perform partial schema validation (from quickstart-samples).

public void validate(InputStream in, URL schemaUrl) throws Exception {
  SOAPModelBuilder builder = OMXMLBuilderFactory.createSOAPModelBuilder(in, "UTF-8");
  SOAPEnvelope envelope = builder.getSOAPEnvelope();
  OMElement bodyContent = envelope.getBody().getFirstElement();
  SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
  Schema schema = schemaFactory.newSchema(schemaUrl);
  Validator validator = schema.newValidator();
  validator.validate(bodyContent.getSAXSource(true));
}

The quickstart-samples page also shows a way to do this with a DOMSource instead of the SAXSource.
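To see what the DOMSource variant looks like in isolation, here is a minimal sketch that uses only the JDK (JAXP) and no Axiom at all. The class name, the tiny inline schema and the sample documents are assumptions made up for this illustration; in the Axiom case the DOM node would be the SOAP body content rather than a parsed string.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomValidationDemo {

    // Hypothetical schema for the illustration: a single <amount> element holding an int.
    static final String AMOUNT_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='amount' type='xs:int'/>"
        + "</xs:schema>";

    /** Returns true if the XML is well formed and valid against AMOUNT_XSD. */
    static boolean isValid(String xml) {
        try {
            // Parse into a namespace-aware DOM tree.
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            // Compile the schema and validate the DOM tree through a DOMSource.
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(AMOUNT_XSD)));
            schema.newValidator().validate(new DOMSource(doc));
            return true;
        } catch (SAXException e) {
            // Thrown for malformed XML as well as for schema violations.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<amount>42</amount>"));
        System.out.println(isValid("<amount>abc</amount>"));
    }
}
```

Without a custom ErrorHandler the Validator throws on the first error, which is why a simple try/catch is enough here.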


Processing fragments of an XML document


Often it is useful to process only a particular fragment of the XML document.

public interface FragmentProcessor {

    public QName getTagName();

    public void processFragment(OMElement element);

}

  public void processFragments(InputStream in, FragmentProcessor fragmentProcessor) throws XMLStreamException {
    // Create an XMLStreamReader without building the object model
    XMLStreamReader reader =
       OMXMLBuilderFactory.createOMBuilder(in).getDocument().getXMLStreamReader(false);
    QName tagName = fragmentProcessor.getTagName();
    while (reader.hasNext()) {
       if (reader.getEventType() == XMLStreamReader.START_ELEMENT &&
          reader.getName().equals(tagName)) {
             OMElement element =
                OMXMLBuilderFactory.createStAXOMBuilder(reader).getDocumentElement();
             // Make sure that all events belonging to the element are consumed so
             // that the XMLStreamReader points to a well defined location (namely the
             // event immediately following the END_ELEMENT event).
             element.build();
             // Now process the element.
             fragmentProcessor.processFragment(element);
          } else {
             reader.next();
          }
       }
   }

Sunday, 24 June 2012

xmllint tutorial

http://linux.byexamples.com/archives/565/your-xml-friend-xpath-command-line-xmllint/ is a very small tutorial on how to use xmllint. I think I should explore it a little further. So far it has not done what I wanted it to do.


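While trying xmllint on a Maven settings.xml file, I had trouble getting the content out. After searching the internet, I found out that you need to register a prefix for the namespace you are looking in, even if it is the document's default namespace. As far as I understand, the reason is that XPath 1.0 treats an unprefixed name as belonging to no namespace, so the default namespace declared in the document is never applied to the expression. Note also that the prefix must be bound to the namespace URI itself and not to the schema location: my first attempt with the .xsd URL returned nothing. Anyway, this is how I did it: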

/ > setns r=http://maven.apache.org/xsd/settings-1.0.0.xsd
/ > cat //r:settings
/ > cat //r:id
/ > setns r=http://maven.apache.org/SETTINGS/1.0.0
/ > cat //r:id
 -------
<id>jsf-app-profile</id>
 -------
<id>snapshots.jboss.org</id>
 -------
<id>repository.jboss.org</id>
 -------
<id>sonar</id>
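The same namespace rule shows up when evaluating XPath from Java. The following is a minimal sketch with the JDK's javax.xml.xpath API, not something taken from the tutorial; the class name and the shortened settings document are made up for the illustration. An unprefixed `//id` finds nothing, while `//r:id` works once the prefix `r` is bound to the settings namespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SettingsXPathDemo {

    static final String SETTINGS_NS = "http://maven.apache.org/SETTINGS/1.0.0";

    // Hypothetical, shortened settings.xml used only for this illustration.
    static final String SAMPLE =
        "<settings xmlns='" + SETTINGS_NS + "'>"
        + "<profiles><profile><id>jsf-app-profile</id></profile></profiles>"
        + "</settings>";

    /** Evaluates the expression with prefix "r" bound to the settings namespace. */
    static List<String> select(String xml, String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // without this, namespaces are ignored entirely
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "r".equals(prefix) ? SETTINGS_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return SETTINGS_NS.equals(uri) ? "r" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    return SETTINGS_NS.equals(uri)
                        ? Collections.singletonList("r").iterator()
                        : Collections.<String>emptyList().iterator();
                }
            });

            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(SAMPLE, "//id"));   // unprefixed name: no namespace
        System.out.println(select(SAMPLE, "//r:id")); // prefixed name: settings namespace
    }
}
```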

Thursday, 6 May 2010

AXIOM - an Apache Stax Parser

I will have to take a look at Axiom, which provides a StAX-based object model to access XML infosets. It was developed for Axis2, but it can be used independently.

Monday, 7 January 2008

StAX- Streaming API for XML in Java

A number of ways exist to access XML data from Java. The DOM and SAX APIs are the two most traditional ones. A newer alternative comes in the form of the Streaming API for XML (StAX). This explanation uses the information found in this tutorial.

The principle behind this API is to give the programmer control over which element of the file is treated next. This simplifies things greatly when there are many elements.

There are mainly two sets of APIs within the StAX API: the cursor API and the iterator API.

Cursor API

The cursor walks the document one infoset element at a time and always forward. For this there are two interfaces: XMLStreamReader and XMLStreamWriter. These interfaces are very similar to the SAX handlers.
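As a small illustration of the cursor style, here is a sketch that uses only the JDK's javax.xml.stream classes. The class and method names are made up for the example; it reads element names with an XMLStreamReader and writes a one-element document with an XMLStreamWriter.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

class CursorDemo {

    /** Collects the local names of all start tags, in document order. */
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            // The caller pulls the next event; the cursor only ever moves forward.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    /** Writes a single <greeting> element containing the given text. */
    static String writeGreeting(String text) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("greeting");
            writer.writeCharacters(text);
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><c>text</c></a>"));
        System.out.println(writeGreeting("hello"));
    }
}
```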

Iterator API

The Iterator API works on the idea that an XML document can be seen as a sequence of events. Each of these events is of a given type: StartDocument, StartElement, Attribute, ....
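A minimal sketch of the iterator style, again with the JDK only and with made-up class and method names: each call to nextEvent() returns an XMLEvent object that can be inspected for its type. Coalescing is switched on here so that a run of text arrives as a single Characters event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class IteratorDemo {

    /** Returns a short description of every element and text event in the document. */
    static List<String> describe(String xml) {
        List<String> events = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    events.add("start:" + event.asStartElement().getName().getLocalPart());
                } else if (event.isEndElement()) {
                    events.add("end:" + event.asEndElement().getName().getLocalPart());
                } else if (event.isCharacters() && !event.asCharacters().isWhiteSpace()) {
                    events.add("text:" + event.asCharacters().getData());
                }
                // StartDocument, EndDocument and other event types are skipped here.
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(describe("<a><b>hi</b></a>"));
    }
}
```

Because every event is an object of its own, the events can be stored in a list or passed on to other code, which is not possible with the cursor's transient state.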

Differences

First of all, some things are possible with the iterator API that are not possible with the cursor API. However, with the cursor API the code tends to be smaller and more efficient.

For processing event pipelines or modifying the event stream, use the iterator API.
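One simple form of such a pipeline is a filtered reader. The sketch below (JDK only, hypothetical class and method names) wraps an XMLEventReader with an EventFilter so that downstream code only ever sees StartElement events.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class EventFilterDemo {

    /** Reads the document through a filter that lets only StartElement events pass. */
    static List<String> startElementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLEventReader raw = factory.createXMLEventReader(new StringReader(xml));

            // First stage of the pipeline: decide which events are passed on.
            EventFilter onlyStartElements = new EventFilter() {
                @Override
                public boolean accept(XMLEvent event) {
                    return event.isStartElement();
                }
            };
            XMLEventReader filtered = factory.createFilteredReader(raw, onlyStartElements);

            // Second stage: the consumer no longer needs to check the event type.
            while (filtered.hasNext()) {
                names.add(filtered.nextEvent().asStartElement().getName().getLocalPart());
            }
            filtered.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(startElementNames("<a><b/><c>t</c></a>"));
    }
}
```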

In general, it is preferable to use the iterator API: it is more easily extensible, so it will probably require less refactoring for future versions of the application.

Use of StAX APIs

The StAX APIs are accessed through the XMLInputFactory, XMLOutputFactory and XMLEventFactory classes. The input and output factories are configured using the setProperty() method.
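A minimal sketch of such a configuration, with made-up class and method names: the input factory is switched to coalescing mode, in which adjacent character data (including CDATA sections) should be reported as one text event instead of several.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class FactoryConfigDemo {

    /** Creates an input factory with two standard StAX properties set explicitly. */
    static XMLInputFactory newConfiguredFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, Boolean.TRUE);
        return factory;
    }

    /** Returns the text of every character event reported by the reader. */
    static List<String> textChunks(XMLInputFactory factory, String xml) {
        List<String> chunks = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS || event == XMLStreamConstants.CDATA) {
                    chunks.add(reader.getText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Mixed text and CDATA; with coalescing the chunks should arrive merged.
        System.out.println(textChunks(newConfiguredFactory(), "<a>x<![CDATA[y]]>z</a>"));
    }
}
```

The readers and writers created by a factory pick up its properties, so the configuration is done once on the factory and not on each reader.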

Sunday, 6 January 2008

JAXB

Another interesting Java technology pushed by Sun is JAXB, which stands for "Java Architecture for XML Binding". Most of the information found here comes from Sun's JAXB tutorial. JAXB is a means of mapping Java objects to XML using the descriptive power of XML Schema. It is possible to generate Java classes from a schema. To simplify the process, it is also possible to put JAXB annotations on Java classes in order to generate the schemas needed for them. So the important steps to consider when using this are:
  1. write the classes of the objects which should be stored as XML files, DOM nodes, etc.
  2. annotate them so that the corresponding schema can be generated
  3. generate content according to the schema
  4. unmarshal the content (i.e. load it into a content tree in Java)
  5. validate if required
  6. marshal, i.e. export again as XML
This page gives a list of XML Schema to Java correspondences. As mentioned above, the binding can be specified either with annotations inside the Java files or with external configuration files.