Strange behavior of QDomDocument::toString()

    a.v.mich wrote (#1):

    Hello. I am using Qt 4.7.2 with this code:
    @
    #include <QtCore/QCoreApplication>
    #include <QtXml/QDomDocument>
    #include <QtCore/QDebug>

    int main(int argc, char *argv[])
    {
        QCoreApplication a(argc, argv);

        // Text content of node1 is the escaped form of "<node2/>"
        QString simpleXml = "<node1>&lt;node2/&gt;</node1>";
        QDomDocument doc("simple");
        doc.setContent(simpleXml);

        // Serialize the parsed document back to a string
        QString simpleXml2 = doc.toString();
        qDebug() << simpleXml2;

        return a.exec();
    }
    @

    The result is "<node1>&lt;node2/></node1>".

    A literal greater-than sign appears instead of &gt;. Why?

      dangelog wrote (#2):

      Why not? I guess it's perfectly valid XML to have a literal '>' if it can be parsed in the right way. Did you use xmllint on that?

      Software Engineer
      KDAB (UK) Ltd., a KDAB Group company

        a.v.mich wrote (#3):

        I mean that, ideally, I should get back exactly the same string that I passed to QDomDocument::setContent(). Why does QDomDocument unescape the QDomText content for me?

          andre wrote (#4):

          Sorry, but no. I think that QDomDocument constructs the text when you call toString(). It does not keep track of whether the document was modified, or of whether it could simply return whatever was set on it.

            a.v.mich wrote (#5):

            Agreed, but I was referring to the example above. Where could the document have been modified, and what justifies this behavior of setContent() or toString()?

              andre wrote (#6):

              Your reply shows that you did not understand my reply. Let me try to rephrase. My idea is that QDomDocument does not keep a string representation of the document. It parses the document into its internal, node-based data structure when you set it using setContent(), and then discards the string. Now, when you request a string representation of the document, such a string is constructed from the internal representation.*
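
A minimal sketch of that idea in plain C++, not Qt's actual implementation. The `TextNode` struct and the `decodeEntities` and `parseText` helpers are hypothetical names introduced for illustration. The point is that a parsed node stores only the decoded characters, so two different spellings of the same text end up as the same stored value.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed text node: it stores only the decoded
// characters, not the way they were spelled in the source string.
struct TextNode {
    std::string value;
};

// Hypothetical helper: replaces the five predefined XML entities with the
// characters they stand for. Numeric references are not handled here.
std::string decodeEntities(const std::string &in)
{
    static const std::vector<std::pair<std::string, char>> entities = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'},
        {"&quot;", '"'}, {"&apos;", '\''}};
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool matched = false;
        for (const auto &e : entities) {
            // Does the entity spelling start at position i?
            if (in.compare(i, e.first.size(), e.first) == 0) {
                out += e.second;
                i += e.first.size();
                matched = true;
                break;
            }
        }
        if (!matched)
            out += in[i++];
    }
    return out;
}

// Hypothetical "parser" for text content: decode, then keep only the result.
TextNode parseText(const std::string &source)
{
    return TextNode{decodeEntities(source)};
}
```

Under these assumptions, `parseText("&lt;node2/&gt;")` and `parseText("&lt;node2/>")` hold the same `value`, so nothing is left from which the original spelling could be reconstructed.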

              The document is not modified at all. You are just getting a different representation than you expected of the same document. It is, as peppe pointed out, perfectly valid and represents the same document as the one you first put in. Similarly, the order of attributes in an XML document representation is not significant. That means that
              @
              <doc>
              <node arg1="foo" arg2="bar"/>
              </doc>
              @

              represents exactly the same document as
              @
              <doc>
              <node arg2="bar" arg1="foo"/>
              </doc>
              @

              even if the textual representations of the document differ. That's XML for you. Deal with it. Relying on this kind of thing to be stable or in a specific form is a bug, IMO.
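
The attribute-order point can be sketched in plain C++ as well. The `Element` struct and `sameElement` function are hypothetical and only model the comparison; they are not part of Qt. Attributes are keyed by name, so the order in which they were written plays no part in deciding whether two elements are equal.

```cpp
#include <map>
#include <string>

// Hypothetical model of an element: a tag name plus its attributes.
// std::map stores attributes by name, so the order in which they appeared
// in the source text is not part of the stored data.
struct Element {
    std::string tag;
    std::map<std::string, std::string> attributes;
};

// Two elements are treated as equal when the tag and the set of
// name/value attribute pairs match.
bool sameElement(const Element &a, const Element &b)
{
    return a.tag == b.tag && a.attributes == b.attributes;
}
```

Building one `Element` with `arg1` before `arg2` and another with `arg2` before `arg1` gives two values for which `sameElement` returns true.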

              *) Note, this is my idea of how QDomDocument works, I did not verify this against the docs or the source code. You probably should do that yourself to be sure.

                a.v.mich wrote (#7):

                No, I did not mean the string representation of the document. I was trying to understand why the content of a QDomText changes after parsing and serializing back to a string. I wanted to know why only one character, the closing angle bracket (>), is converted from its entity to the literal character in this round trip, while the ampersand, the quotes, and the opening angle bracket are not. It seemed strange to me, and I suspected that something was not working as it should. I started this discussion to find out whether I can expect that data placed between tags, which does not violate the validity of the XML document, will be unchanged after being serialized back to a string.

                  andre wrote (#8):

                  Final attempt:
                  I tried to explain above that the different representations you get represent the very same document. That is, the content of your document was not changed. As for why one angle bracket is encoded and not the other: as peppe said, why not? It is valid XML, so why make the representation longer than needed by writing &gt; instead of > ? Again: I doubt QDomDocument (or any of the nodes, for that matter) keeps track of how these were originally encoded.
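
A sketch of a serializer that escapes only what it has to, in plain C++. This is an assumption about the strategy, consistent with the output shown in the first post, and has not been checked against the Qt source. The `escapeText` function is a hypothetical name.

```cpp
#include <string>

// Hypothetical serializer helper for element text content. It escapes only
// '&' and '<', which XML never allows unescaped in character data. A '>' is
// written literally, which is legal except at the end of a "]]>" sequence
// (that case is not handled in this sketch).
std::string escapeText(const std::string &text)
{
    std::string out;
    for (char c : text) {
        switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        default:  out += c; break;
        }
    }
    return out;
}
```

Given the decoded text `<node2/>`, `escapeText` returns `&lt;node2/>`, which has the same shape as the text inside node1 in the reported output.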

                    goetz wrote (#9):

                    Both string representations are semantically identical, though not literally identical. That is all that matters; everything else is subject to the inner workings of the libraries and classes used. As long as you get semantically identical XML out of what you put in, everything is OK.

                    http://www.catb.org/~esr/faqs/smart-questions.html
