13 March 2014

PHP has several built-in strategies for handling XML, but each comes with its own limitations (e.g., DOM is verbose, SimpleXML has limited creation capabilities). The weakness of built-in solutions has led to a proliferation of XML libraries. Stack Overflow has a post that identifies 13+ substantial projects. The syntax of many of the projects has been influenced by jQuery, and several of the projects (e.g., phpQuery and QueryPath) explicitly attempt to replicate jQuery functionality. As with any crowded open source space, some of the projects have been forked or abandoned as other projects gain more traction (e.g., phpQuery appears abandoned but QueryPath lives on).
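To make the trade-off concrete, here is a minimal sketch that builds the same small document with both built-in extensions. The document content (books, titles, ids) is made up for illustration; the calls themselves are standard DOM and SimpleXML methods. It shows how explicit DOM is, and how SimpleXML stays terse until you need something like inserting a node before a sibling, at which point you end up borrowing DOM through dom_import_simplexml().

```php
<?php
// Build <books><book id="1"><title>Dune</title></book></books> two ways.
// The element names and values are illustrative only.

// DOM: every node is created and attached explicitly.
$dom = new DOMDocument('1.0', 'UTF-8');
$books = $dom->createElement('books');
$dom->appendChild($books);
$book = $dom->createElement('book');
$book->setAttribute('id', '1');
$books->appendChild($book);
$book->appendChild($dom->createElement('title', 'Dune'));
$domXml = $dom->saveXML($books);

// SimpleXML: noticeably shorter for the same structure.
$sx = new SimpleXMLElement('<books/>');
$sxBook = $sx->addChild('book');
$sxBook->addAttribute('id', '1');
$sxBook->addChild('title', 'Dune');

// SimpleXML can only append children, so inserting a sibling *before*
// an existing node means dropping down to DOM for that one step.
$node = dom_import_simplexml($sxBook);
$first = $node->ownerDocument->createElement('book');
$first->setAttribute('id', '0');
$node->parentNode->insertBefore($first, $node);

echo $domXml, "\n";
echo $sx->asXML(), "\n";
```

Because dom_import_simplexml() exposes the same underlying tree rather than a copy, the node inserted through DOM is visible from the SimpleXML object afterwards.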

From my perspective, the most prominent solutions are QueryPath, Symfony's DomCrawler, and FluentDOM. Of course, that perception is biased by my technology stack, since the stack guides most of my reading.

However, I am not a fan of the significant amount of code each of those projects requires to even load a small XML document. Each also relies on an internal state strategy that complicates storing references to multiple positions in the document, and that complexity comes with some performance hits.

When I started the research, I was really just looking for a project that plugged a few holes in the SimpleXML functionality. SimpleXML is fast because it is a C extension that ships with PHP, and it is very simple to use. In the end, I opted to create a lightweight wrapper that adds a few small missing features to SimpleXML without the extreme bloat of the libraries above. It took fewer than 500 lines (with comments) to accomplish what I wanted.

Using QuipXML, XPath queries are chainable, the DOM can be manipulated easily, and there are a handful of trivial traversal functions.
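To illustrate the general idea of chainable XPath on top of SimpleXML, here is a rough sketch. It is not QuipXML's actual API: the NodeSet class, its xpath() and remove() methods, and the sample document are all hypothetical names invented for this example. The only real library calls are SimpleXMLElement::xpath(), dom_import_simplexml(), and DOMNode::removeChild().

```php
<?php
// Hypothetical illustration only; this is NOT QuipXML's actual API.
// A thin wrapper holding a list of SimpleXMLElement nodes.
final class NodeSet implements IteratorAggregate, Countable
{
    /** @var SimpleXMLElement[] */
    private array $nodes;

    public function __construct(SimpleXMLElement ...$nodes)
    {
        $this->nodes = $nodes;
    }

    // Run an XPath expression relative to every node in the set and
    // return a NEW set, so the original set stays usable.
    public function xpath(string $expr): self
    {
        $found = [];
        foreach ($this->nodes as $node) {
            // xpath() may return false or null on failure; treat as empty.
            foreach ($node->xpath($expr) ?: [] as $match) {
                $found[] = $match;
            }
        }
        return new self(...$found);
    }

    // Detach every node in the set from the document (done via DOM).
    public function remove(): self
    {
        foreach ($this->nodes as $node) {
            $dom = dom_import_simplexml($node);
            if ($dom->parentNode !== null) {
                $dom->parentNode->removeChild($dom);
            }
        }
        return $this;
    }

    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->nodes);
    }

    public function count(): int
    {
        return count($this->nodes);
    }
}

$xml = simplexml_load_string(
    '<books><book year="1965"><title>Dune</title></book>'
    . '<book year="1984"><title>Neuromancer</title></book></books>'
);

$set = new NodeSet($xml);
$titles = $set->xpath('book[@year > 1970]')->xpath('title'); // chained queries
$set->xpath('book[@year < 1970]')->remove();                 // simple manipulation
```

Since each xpath() call returns a fresh set instead of moving a shared cursor, keeping references to several positions in the document is just a matter of keeping several variables. The sketch makes no attempt to de-duplicate nodes matched from more than one context.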

