Issues with QDomDocument parsing in Qt6.7.0
-
I'm in the process of porting my application from Qt 5.3 to Qt 6.7.0. The code below is used to parse a 2.7 megabyte document. In PyQt6 it does not find the node with the name attribute of "project_locations". However, in PyQt5 it does. With such a large file, I'm not sure where to begin troubleshooting. Any ideas on how to start troubleshooting the issue? Is there already a known issue with version 6.7.0?
```python
import sys

from PyQt6.QtCore import QFile, QIODevice
from PyQt6.QtWidgets import QApplication
from PyQt6.QtXml import QDomDocument


class ProjectNotesCommon:
    def find_node(self, node, type, attribute, name):
        # Depth-first search for an element of the given type whose
        # attribute equals name; returns None if nothing matches.
        if node is None:
            return None
        children = node.firstChild()
        while not children.isNull():
            if children.nodeName() == "table":
                print("comparing: ", children.nodeName(), children.toElement().attribute(attribute))
            if children.nodeName() == type and children.toElement().attribute(attribute) == name:
                return children
            subsearch = self.find_node(children, type, attribute, name)
            if subsearch is not None:
                return subsearch
            children = children.nextSibling()
        return None


app = QApplication(sys.argv)
xmldoc = QDomDocument("TestDocument")
f = QFile("project.xml")
if f.open(QIODevice.OpenModeFlag.ReadOnly):
    print("example project opened")
    xmldoc.setContent(f)
    pnc = ProjectNotesCommon()
    xmlroot = xmldoc.elementsByTagName("projectnotes").at(0)
    node = pnc.find_node(xmlroot, "table", "name", "project_locations")
    print(node)
    f.close()
```
-
Any ideas on how to start troubleshooting the issue?
Start by adding a `print(children.nodeName())` for every node visited (not just yours inside an `if`). Pipe the output through `grep`. Do you ever encounter the node you are looking for? Extend it to print the attributes as necessary. Find out whether you ever visit the desired node and, if you do, whether it fails on the attributes. With this I really think you will be close to figuring out where the problem is.
Also try finding a different node/attribute: does it ever work? On a different document? Although I don't believe it is a problem, try an attribute name without an underscore in it, just in case.
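A rough sketch of that "print every node" idea, for what it's worth. `dump_nodes` is a made-up helper name, not part of Qt; it assumes it is handed a `QDomNode` (for example the `xmlroot` from the question) and only uses `firstChild()`, `nextSibling()`, `isNull()`, `nodeName()`, `nodeValue()` and `attributes()`.

```python
def dump_nodes(node, depth=0):
    """Yield one line per descendant: indentation, node name, sorted attributes.

    Hypothetical helper; `node` is assumed to behave like a QDomNode.
    """
    child = node.firstChild()
    while not child.isNull():
        attrs = child.attributes()  # QDomNamedNodeMap, empty for non-elements
        pairs = sorted(
            (attrs.item(i).nodeName(), attrs.item(i).nodeValue())
            for i in range(attrs.count())
        )
        text = " ".join(f'{key}="{value}"' for key, value in pairs)
        yield f"{'  ' * depth}{child.nodeName()} {text}".rstrip()
        # Recurse into this child before moving on to its sibling.
        yield from dump_nodes(child, depth + 1)
        child = child.nextSibling()
```

Something like `for line in dump_nodes(xmlroot): print(line)` then gives one line per node that can be piped through `grep project_locations`.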
-
@JonB Thanks for this idea. I did that, and it never finds the node. I think QDomDocument isn't parsing the XML document correctly.
@kesterpm said in Issues with QDomDocument parsing in Qt6.7.0:
I think QDomDocument isn't parsing the XML document correctly.
This would be rather surprising... you are doing something wrong.
-
@Christian-Ehrlicher It's certainly possible I'm doing something wrong :). My application uses QDomDocument to construct the XML file. I send it to a Python script to reload it. For the smaller exports, the Qt6 version works just fine. I've exported the XML using the Qt5 version of my app and the Qt6 Python still misses that node. The Qt5 Python finds it just fine. The Qt5 app has been in active use for about 2 years now, without any XML-related issues. The code changes between the two are so minimal that it's hard to diagnose.
-
@kesterpm
I don't see why you cannot find out more about what it is/is not doing with a little diagnostic output. For example, make the code just print out the node names/attributes it encounters, one per line. Run it under the Qt5 you say works and the Qt6 you say does not. Run a `diff` or similar on the output. Find where you say the parsing differs. That sort of thing. It's what I would do.
(BTW: While I think of it, if you do this it's possible they could output attributes in a different order, messing up `diff`. If so, sort them alphabetically before outputting.)
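If no `diff` tool is at hand (e.g. on Windows), the comparison could be sketched with Python's standard `difflib`. The helper names `format_node` and `diff_dumps` and the file labels `qt5.txt`/`qt6.txt` are invented for illustration; the inputs are assumed to be the per-node lines printed by the Qt5 and Qt6 runs.

```python
import difflib


def format_node(name, attrs):
    """Build one dump line with attributes sorted by name, so that a
    different attribute order between Qt versions does not show up as a diff."""
    parts = [name] + [f'{key}="{value}"' for key, value in sorted(attrs.items())]
    return " ".join(parts)


def diff_dumps(qt5_lines, qt6_lines):
    """Return unified-diff lines between the two dumps (empty list if identical)."""
    return list(
        difflib.unified_diff(
            qt5_lines, qt6_lines, fromfile="qt5.txt", tofile="qt6.txt", lineterm=""
        )
    )
```

The first `-` line in the result marks the first node that the Qt5 run saw and the Qt6 run did not, which is roughly where parsing starts to differ.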
-
@JonB OK, I think I'm not setting up the XML encoding correctly. If I open the file in Windows Notepad, add a space and save it, it works. I'll have to do some investigation. Maybe it has something to do with the QFile.open flags. The Qt5 version may have been more forgiving.
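One way to test the encoding theory without Qt is to check whether the raw bytes actually decode in the encoding the file declares. This is only a sketch of that check, not a confirmed diagnosis; `check_declared_encoding` is a hypothetical helper, and it assumes a single-byte-compatible XML declaration at the very start of the file (falling back to UTF-8, the XML default, when none is found).

```python
import re


def check_declared_encoding(data: bytes):
    """Return (encoding, problem); problem is None if the bytes decode cleanly."""
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', data)
    encoding = match.group(1).decode("ascii") if match else "utf-8"
    try:
        data.decode(encoding)
    except LookupError:
        return encoding, "Python does not know this encoding name"
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that failed to decode.
        return encoding, f"byte 0x{data[exc.start]:02x} at offset {exc.start}"
    return encoding, None
```

Calling it on `open("project.xml", "rb").read()` reports the first offending byte, if any, which can then be looked up in a hex editor.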
-
I did a compare of the file I saved with Notepad to the original. Notepad filtered out the non-breaking space characters from the XML. I found online that QDomDocument doesn't handle those and a few other characters. I added some code to filter those characters out and the resulting file works now! Thanks!
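For anyone landing here later, the filtering could look roughly like the sketch below. It is not the code from my application: the helper names are invented, it replaces the non-breaking space (U+00A0) with a regular space rather than dropping it, and since the "few other characters" are not listed here, it only counts the remaining non-ASCII characters so you can decide what else to handle.

```python
from collections import Counter

NBSP = "\u00a0"  # non-breaking space, the one character named in this thread


def replace_nbsp(text: str) -> str:
    """Replace every non-breaking space with an ordinary space."""
    return text.replace(NBSP, " ")


def non_ascii_census(text: str) -> Counter:
    """Count each non-ASCII character so other suspects can be spotted."""
    return Counter(ch for ch in text if ord(ch) > 127)
```

Running `non_ascii_census` on the exported XML text before and after `replace_nbsp` shows whether anything unusual is left in the file.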
-
kesterpm has marked this topic as solved.