Val's Blog
Lots of stuff for Web 2.0 freaks and Java addicts

June 2005

In one of my recent comments to Steve Loughran, I mentioned that Java 7 (aka Dolphin) native XML support could provide an elegant solution for bridging the gap between the Java programming language and XML. I have already argued that I don't see XML imposing itself as a WATER-style programming language. I think anyone can understand what I mean when looking at the WATER and Java comparison page provided by the WATER folks. The following "XML programming" code is worth a million pictures.

 thing.<set the_cache=<thing/>/>
 <defmethod cache_it expr=required="ek_string">
      <if> the_cache.<has_key expr/> 
           the_cache.<get expr/>
 	 else
 	   <set val=expr.<execute/>>
                 the_cache.<set_value expr val/>
                 val
             </set>
      </if>
 </defmethod>
 <cache_it 2.<plus 3/> /> 

Steve, please tell me that Alpine is NOT headed in that direction.

Assuming that Alpine's goals are legitimate and considering Steve's assumption that there is no good solution for efficiently working with XML today, let's delve a little bit into the current state of Java and XML programming in order to discover what the ideal solution could be and how I would like to see native XML support implemented in the Java programming language.

One of the first attempts at simplifying DOM programming in Java was tackled by the JDOM guys, whose main goal was to provide a much simpler programming interface for Java programmers to manipulate XML files. Admittedly, they have been very successful. But, even with JDOM, we are still programming at the metalevel since we have to use methods such as getAttribute() or getChildren() in order to manipulate the content of an XML document. In order to illustrate the different ways of handling XML data using DOM, JDOM or Dolphin's XML native support, let's take the infamous purchase order XML document (po.xml) as an example (repeated below for convenience) and show how to:

  1. retrieve all partNum attribute values of the item elements; and
  2. compute the grand total for the whole purchase order using the text value of the USPrice element.
[Note that Dolphin's work on native XML support is still at the research stage, so the code I provide is only a tentative sketch of what I think such native support could look like.]
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
   <shipTo country="US">
      <name>Alice Smith</name>
      <street>123 Maple Street</street>
      <city>Mill Valley</city>
      <state>CA</state>
      <zip>90952</zip>
   </shipTo>
   <billTo country="US">
      <name>Robert Smith</name>
      <street>8 Oak Avenue</street>
      <city>Old Town</city>
      <state>PA</state>
      <zip>95819</zip>
   </billTo>
   <comment>Hurry, my lawn is going wild!</comment>
   <items>
      <item partNum="872-AA">
         <productName>Lawnmower</productName>
         <quantity>1</quantity>
         <USPrice>148.95</USPrice>
         <comment>Confirm this is electric</comment>
      </item>
      <item partNum="926-AA">
         <productName>Baby Monitor</productName>
         <quantity>1</quantity>
         <USPrice>39.98</USPrice>
         <shipDate>1999-05-21</shipDate>
      </item>
   </items>
</purchaseOrder>

Using DOM, the Java code would look like the following:

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
...
//Retrieve the Document object
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document po = builder.parse(new File("po.xml"));
Element root = po.getDocumentElement();

//Retrieve all partNums and compute the grand total for the purchase order
double total = 0;
NodeList children = root.getChildNodes();
for (int i = 0; i < children.getLength(); i++) {
   Node node = children.item(i);

   //Find the items child element
   //(getNodeName() also works when the parser is not namespace-aware, which is the default)
   if ("items".equals(node.getNodeName())) {
      NodeList itemList = node.getChildNodes();
      for (int j = 0; j < itemList.getLength(); j++) {
         Node item = itemList.item(j);

         //Skip the whitespace text nodes between item elements; they have no attributes
         if (item.getNodeType() != Node.ELEMENT_NODE) {
            continue;
         }

         //Get the partNum attribute value
         NamedNodeMap attrs = item.getAttributes();
         System.out.println("partNum:" + attrs.getNamedItem("partNum").getNodeValue());

         //Find the USPrice child element
         NodeList itemChildren = item.getChildNodes();
         for (int k = 0; k < itemChildren.getLength(); k++) {
            Node child = itemChildren.item(k);
            if ("USPrice".equals(child.getNodeName())) {
               //An element's getNodeValue() is null; the price is in its text content
               total += Double.parseDouble(child.getTextContent());
            }
         }
      }
   }
}
System.out.println("Grand total = " + total);

I'll spare you the comments, as this DOM code does an excellent job of discrediting itself by screaming "Please don't use me" at whoever looks at it!

With JDOM, things are considerably easier, as the methods provided for browsing through the hierarchy and for getting at the raw meat of the XML document are far less convoluted. Plus, JDOM gives you neat hooks that take org.w3c.dom.Document objects and build the corresponding JDOM hierarchy from them.

import java.io.File;
import java.util.Iterator;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
...

//Retrieve the Document object
Document po = new SAXBuilder().build(new File("po.xml"));
Element root = po.getRootElement();

//Retrieve all partNums and compute the grand total for the purchase order
double total = 0;
Element items = root.getChild("items");
Iterator itemIt = items.getChildren().iterator();
while (itemIt.hasNext()) {
   Element item = (Element) itemIt.next();
   System.out.println("partNum:" + item.getAttributeValue("partNum"));
   
   total += Double.valueOf(item.getChildText("USPrice")).doubleValue();
}
System.out.println("Grand total = " + total);

As the above code shows, JDOM definitely fulfills its goal of providing a simpler API for DOM programming in Java. However, JDOM still requires developers to program at the metalevel and this is a big problem in my opinion as we cannot completely concentrate on solving the problem at hand. More on this later...

Finally, as I see it, there are two ways of having native XML support in Java: explicit support, similar to Ecma's E4X (ECMAScript for XML) specification, and implicit support, where the programming language shields you from all that angle-bracket clutter. The latter is my preferred one. Thus, the way I would like to see native XML support implemented is as some sort of mix between XPath and native data binding built into the Java programming language. Stefan Tilkov should interpret this as my way of seeing "native" O/X mapping built into a programming language such as Java. This native support should allow one to manipulate the real business entities directly instead of their metalevel counterparts. I would really like not to have to deal with the document-children-node-element terminology any more. I would like to solve business problems, and it would be far more productive to have a language that lets me develop solutions using business terms rather than terms made up by some bunch of geeks hiding deep in a development cave. Looking at the code below, this is what I call ease of development! I'm quite sure this can be taken a little further, but in comparison to the two previous solutions, I see this as a kind of developer's Holy Grail.

//Retrieve the PurchaseOrder object
Document doc = new Document("po.xml");
PurchaseOrder po = doc.getPurchaseOrder();

//Retrieve all partNums and compute the grand total for the purchase order
double total = 0;
for (Item item : po.getItems()) {
   System.out.println("partNum:" + item.getPartNum());
   total += item.getUSPrice().asDouble();
}
System.out.println("Grand total = " + total);

The solution could even be better in terms of performance if we set aside the object-oriented dogma of tight encapsulation for a second and access attributes directly using the traditional dot notation instead of having the JVM create a costly activation frame upon each method invocation. Things would look like this.

//Retrieve the PurchaseOrder object
Document doc = new Document("po.xml");
PurchaseOrder po = doc.getPurchaseOrder();

//Retrieve all partNums and compute the grand total for the purchase order
double total = 0;
for (Item item : po.items) {
   System.out.println("partNum:" + item.partNum);
   total += item.usPrice.asDouble();
}
System.out.println("Grand total = " + total);

Agreed, saving a handful of activation frames won't change much in this small example, but I'm quite sure we could gain a substantial amount of time when processing heavy XML documents. Imagine a way "à la XPath" to access a value three levels deep in just one line of code. For instance, po.items[1].productName would return "Baby Monitor". Great, isn't it?
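As a point of comparison, here is a minimal sketch of that same three-levels-deep lookup using the javax.xml.xpath package that ships with the JDK. The class name PoXPathSketch, its helper methods and the inlined, trimmed copy of po.xml are assumptions made for illustration; none of this is the native syntax imagined above. Note that XPath positions start at 1, so item[2] is the element that po.items[1] would designate.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical class name; only the JDK's javax.xml.xpath API is used.
class PoXPathSketch {

    // Trimmed copy of po.xml, inlined so the sketch runs without a file
    static final String PO_XML =
        "<purchaseOrder orderDate=\"1999-10-20\"><items>"
      + "<item partNum=\"872-AA\"><productName>Lawnmower</productName>"
      + "<quantity>1</quantity><USPrice>148.95</USPrice></item>"
      + "<item partNum=\"926-AA\"><productName>Baby Monitor</productName>"
      + "<quantity>1</quantity><USPrice>39.98</USPrice></item>"
      + "</items></purchaseOrder>";

    // Evaluate an XPath expression against the XML text, asking for the given result type
    static Object eval(String xml, String expr, QName type) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(xml)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate " + expr, e);
        }
    }

    // String value of the first node selected by the expression
    static String text(String xml, String expr) {
        return (String) eval(xml, expr, XPathConstants.STRING);
    }

    // Numeric value of an expression such as sum(...)
    static double number(String xml, String expr) {
        return (Double) eval(xml, expr, XPathConstants.NUMBER);
    }

    public static void main(String[] args) {
        // One line to reach a value three levels deep
        System.out.println(text(PO_XML, "/purchaseOrder/items/item[2]/productName"));
        // The grand total, computed by XPath's own sum() function
        System.out.println("Grand total = "
            + number(PO_XML, "sum(/purchaseOrder/items/item/USPrice)"));
    }
}
```

The path lives in a string, so a typo in an element name only shows up at run time, which is exactly what compiler-checked access such as po.items[1].productName would avoid.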

Your next question might be how we come up with the PurchaseOrder and Item class definitions. In my opinion, on-the-fly bytecode generators can do the job pretty well. For instance, when the compiler encounters the Document object creation, it would dynamically create a new class definition for the root element (purchaseOrder) and populate it with all the business methods needed to retrieve its content (getOrderDate(), getItems(), etc.). In addition, the compiler would dynamically modify the Document class definition to add a method (getPurchaseOrder()) for retrieving the root element as an instance of the newly created class. This removes the need to explicitly cast a root element that would otherwise be returned as an Object. Then, the compiler may freely proceed with the dereferencing of the newly created PurchaseOrder instance, as it now has everything it needs to go on. I'm quite sure there are other solutions as well, but I find this one quite developer-friendly.
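To make the idea more tangible, here is a hand-written sketch of the kind of classes such a generator might emit for po.xml. Everything in it is an assumption for illustration: the field names, the parse() factory, the grandTotal() helper and the inlined, trimmed sample document. A real implementation would derive these classes from the document (or a schema) at compile time; here the DOM plumbing is simply written by hand and confined to the constructors and the factory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the class generated for the purchaseOrder root element
class PurchaseOrder {

    // Trimmed copy of po.xml, inlined so the sketch runs without a file
    static final String SAMPLE =
        "<purchaseOrder orderDate=\"1999-10-20\"><items>"
      + "<item partNum=\"872-AA\"><productName>Lawnmower</productName>"
      + "<USPrice>148.95</USPrice></item>"
      + "<item partNum=\"926-AA\"><productName>Baby Monitor</productName>"
      + "<USPrice>39.98</USPrice></item>"
      + "</items></purchaseOrder>";

    final String orderDate;
    final List<Item> items = new ArrayList<Item>();

    PurchaseOrder(Element root) {
        orderDate = root.getAttribute("orderDate");
        NodeList nodes = root.getElementsByTagName("item");
        for (int i = 0; i < nodes.getLength(); i++) {
            items.add(new Item((Element) nodes.item(i)));
        }
    }

    // Plays the role of doc.getPurchaseOrder(): no cast needed by the caller
    static PurchaseOrder parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return new PurchaseOrder(root);
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable purchase order", e);
        }
    }

    // Sum of the USPrice values, as in the examples above (quantity is ignored)
    double grandTotal() {
        double total = 0;
        for (Item item : items) {
            total += item.usPrice;
        }
        return total;
    }

    public static void main(String[] args) {
        PurchaseOrder po = parse(SAMPLE);
        for (Item item : po.items) {
            System.out.println("partNum:" + item.partNum);
        }
        System.out.println("Grand total = " + po.grandTotal());
    }
}

// Hypothetical stand-in for the class generated for each item element
class Item {
    final String partNum;
    final String productName;
    final double usPrice;

    Item(Element e) {
        partNum = e.getAttribute("partNum");
        productName = childText(e, "productName");
        usPrice = Double.parseDouble(childText(e, "USPrice"));
    }

    // Text of the first descendant element with the given tag name
    static String childText(Element parent, String name) {
        return parent.getElementsByTagName(name).item(0).getTextContent();
    }
}
```

With classes like these in place, the client code reads in business terms only, for example po.items.get(1).productName, and the metalevel vocabulary never leaks out of the generated layer.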

Now, which solution do you guys prefer? Don't tell me the first or the second, because I won't believe you. Imagine the potential of the third solution for programming SOAP (if it still exists in 2007, of course) or for programming against any other XML-intensive framework, for that matter. Personally, I see native XML support in Java as a way to finally let the IT division focus on its primary goal, which is to support the business of the company, and to let developers and business guys talk with a compatible vocabulary.

Today, a non-negligible number of IT guys act as if the company's primary business were IT. They come to meetings with their convoluted geeky dictionary and take great pleasure in mystifying information system development, making it seem like climbing Mount Everest right after coming back from K2. What those guys don't understand is that their primary role is to develop information systems that support the business of their company. To achieve this, they need frameworks and APIs that are far simpler to use than they are today.

To me, improvements such as the one proposed for Dolphin definitely go in that direction and would bring Java closer to a true 4GL by moving it toward natural human language and away from that silly metalevel. This would clearly benefit everyone, from the developers, who would understand much more quickly what the code does (or should do) and which business concerns it implements, to the business guys, who would finally share a common vocabulary with the development teams. Let's stop (or at least cut down on) programming at the metalevel, as 95% of the time this practice is not productive at all. I'm sure that would help polish IT's image and help our sector recover from the bad press it has been receiving over the last couple of years.

Related entries:
Native XML Support in Java (Round 2)
HP's "Rethinking the Java SOAP stack" should be rethought
"Alps" Defender Steve Loughran strikes back


I totally agree that native XML support in the programming language is the best possible solution to the O/X mapping problem, because no mapping would be needed.
Your ideas are spot on, but an XML doc processed this way is going to have to be validated against a schema first, or else error handling (and code generation too) will become a nightmare.
That's correct, Fred! The schema file name could be provided as another argument to the Document constructor for instance, along with some kind of error handler or something like that to take care of possible error conditions. If the document is not valid, an exception should be thrown to indicate that the document is not valid, period. My goal was not to provide a solution for the problem but just to show that we now have all the technologies we need to get this thing done. I don't think that validation and error handling are showstoppers, all we have to do is to find the right way of handling them in a graceful manner.
Love the approach suggested but just curious whether it will work within an IDE, meaning: will an IDE be able to parse the PO file and be able to provide code completion help? I'd hate to have to hold within my brain all the available methods.
Why not use JXPath. Then you can just access your xml/tree structures using XPath expressions. What else would you want ?
"However, JDOM still requires developers to program at the metalevel and this is a big problem in my opinion"
It is a problem, but the solution is not to work closer to XML, but further from it. XML is a nice format for storing structured data, but it isn't, no matter how it's wrapped up, a good format for working with the data. XML documents are inherently tree-shaped. Not all data fits that. That's why we have ID and IDREF attributes. If the data isn't tree-shaped, then XML is the wrong format to work with it in. Parse the data from the XML into a real object model, do your work there, and save it out to XML again. Don't work with XML, not even on a metamodel level. Apart from that, if we start embedding other languages in Java, I want general embedded language support, not just XML. /L
Take a look at Scala.
Is it only me or is this an extremely limited and completely useless code sample?
Let me expand a little. The first two lines are:
Document doc = new Document("po.xml");
PurchaseOrder po = doc.getPurchaseOrder();
Where is PurchaseOrder defined? How can the compiler know about purchase order? Should the compiler read po.xml??? How often do you need locally stored static XML files that doesn't change? The only solution is to provide a DTD (oh god, the horror!) or a schema to the compiler. Thus we have suddenly exploded the size and the complexity of the Java compiler. And what if we don't have a schema ready at hand -- do we have to make one (the horrror) or autocreate one (i'm repeating myselves here, fyi)? Or are we then reduced to DOM calls? Your code sample looks really nice. But I don't think it belongs in the language at all. What you really want, if you want to process XML _with fixed structure_, is code generation. I'm not all in favour for that, but it works quite well in many cases. You provide no samples of how to process XML with unknown structure, which actually is a quite common problem. Or at least, the structure is poorly defined and changes all the time.
Been there. Learnt XML and XSD, used Castor to generate validating XML beans from XSD, so no data typing/conversion issues, easy navigation, successfully used in high-volume commercial products. Enough said. If you only need to read XML, either use code generation, XPath, Sasax, or simple/recursive string extraction; anything else is a waste of valuable time IMHO. XML writing, however, is best done with XML bean tools like Castor or the JRE's XML serialisation mechanism. Sasax is at http://sasax.sourceforge.net/index.en.html. It allows you to build a typed schema/parser in Java objects and is a real product, unlike your naive ideas.
Take a look at XMLBeans. It compiles your schema into java classes and gives you the strongly typed getters (eg. getPurchaseOrder().getItems()) that you're looking for. Plus you can use xpath and xquery.

