Compare two XML files
-
So, you mean to do something like this?
@
<someXml>
<A>
<A2/>
<A1/>
</A>
<B>
<B1/>
<B2/>
</B>
</someXml>
@
compared with@
<someXml>
<B>
<B2/>
<B1/>
</B>
<A>
<A1/>
<A2/>
</A>
</someXml>
@And say that the files are the same?
How big are the documents we're talking about? And do you only need "same"/"different", or do you also need to identify these differences?
-
To explain further: I need to create a tool something like this:
!http://s17.a-img.com/images/shots/DiffDog2010r3_XML_comparison.gif(diff)!
but simpler, of course, and with highlighting of the nodes that differ.
-
Basically, I'd say you'd need to build a tree representation of each of your XML files. Then, you iterate over tree A recursively, trying to find a match for each node in tree B. The tree would contain everything, down to the attributes.
You can remove every end-node (a node without children) from both trees if a match is found. After iterating over the tree, you're left with two trees that contain only those nodes that are not in the other tree: the difference. Note that for this to work, you have to go as deep as you can before you start deleting. If you can't get to the same depth on both sides, you have a difference and there is nothing to delete.
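The pruning idea above could be sketched roughly as follows. This is only a minimal illustration, not a finished diff tool: `Node`, `sameLabel` and `pruneMatches` are hypothetical names, the tree is a plain standard-library structure rather than a QDomDocument, text content is ignored, and matching is greedy (each node is paired with the first sibling on the other side that has the same name and attributes), so repeated sibling names may not be paired optimally. It assumes C++17.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical minimal tree node; a real tool would fill this from an XML parser.
struct Node {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::vector<Node> children;
};

// Two nodes are candidates for a match when name and attributes agree.
bool sameLabel(const Node& a, const Node& b) {
    return a.name == b.name && a.attrs == b.attrs;
}

// Removes from a and b every pair of children that matches completely,
// ignoring sibling order. Whatever is left in the trees is the difference.
void pruneMatches(Node& a, Node& b) {
    for (std::size_t i = 0; i < a.children.size();) {
        bool removed = false;
        for (std::size_t j = 0; j < b.children.size(); ++j) {
            if (!sameLabel(a.children[i], b.children[j])) continue;
            // Go as deep as possible before deleting anything.
            pruneMatches(a.children[i], b.children[j]);
            // Both sides pruned down to end-nodes: the subtrees were identical.
            if (a.children[i].children.empty() && b.children[j].children.empty()) {
                a.children.erase(a.children.begin() + i);
                b.children.erase(b.children.begin() + j);
                removed = true;
            }
            break;  // greedy: only the first candidate is tried
        }
        if (!removed) ++i;  // after an erase, index i already holds the next child
    }
}
```

After calling `pruneMatches(rootA, rootB)` on two roots that are themselves known to match, two empty child lists mean the documents are the same apart from sibling order. Any node still present is one you would highlight.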
To search efficiently, I think QDomDocument may not be the ideal in-memory representation, because it is hard to search for nodes. Instead, I would consider a custom data structure, or alternatively build an index of the QDomDocument first so you can quickly retrieve nodes from it based on a path.
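As a rough illustration of the index idea, the sketch below walks a tree once and records every node under its path. It uses a hypothetical plain `Node` struct and standard containers instead of QDomDocument; `PathIndex` and `indexTree` are made-up names, and the `/parent/child` path format is just one possible choice. It assumes C++17.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical minimal tree node, standing in for whatever the parser produces.
struct Node {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::vector<Node> children;
};

// Maps a path such as "/someXml/A/A1" to every node found at that path.
using PathIndex = std::multimap<std::string, const Node*>;

// Recursively records node and all of its descendants in the index.
void indexTree(const Node& node, const std::string& parentPath, PathIndex& index) {
    const std::string path = parentPath + "/" + node.name;
    index.emplace(path, &node);
    for (const Node& child : node.children) {
        indexTree(child, path, index);
    }
}
```

Calling `indexTree(root, "", index)` fills the index, after which `index.equal_range(path)` gives all nodes at a given path without walking the tree again. The stored pointers are only valid as long as the tree is not modified, so the index would have to be rebuilt after any pruning.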
Note that a Google search on "compare two trees algorithm" returns quite a number of useful-looking results.