Parsing hyperlinks from HTML using QTextDocument?

  • Violet Giraffe wrote (last edited by Violet Giraffe) #1

    I'm having trouble parsing simple things from a simple HTML page without using WebKit and the DOM (I don't want a WebKit dependency).

    1. This page says QTextDocument can parse stuff from HTML.
    2. I do the following and I can see that my HTML has been parsed nicely, but I only get the user-visible text, with no markup:
    for (QTextBlock block = doc.begin(), end = doc.end(); block != end; block = block.next())
    {
    	qDebug() << block.text();
    }
    
    3. I read some QTextBlock docs and tried this, but anchorHref() and anchorNames() are always empty, even for the blocks that I know contain <a href>:
    for (QTextBlock block = doc.begin(), end = doc.end(); block != end; block = block.next())
    {
    	qDebug() << block.text();
    	qDebug() << block.charFormat().anchorNames();
    	qDebug() << block.charFormat().anchorHref();
    	qDebug() << "-------------------------------";
    }
    

    Is there any way to get the hyperlink URLs?

      raven-worx (Moderators) wrote #2

      @Violet-Giraffe
      This should work (untested though)

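      #include <QDebug>
      #include <QTextBlock>
      #include <QTextDocument>
      #include <QTextFragment>
      #include <QTextFrame>

      // Declared up front so the frame overload below can call it.
      void searchLink(const QTextBlock & parent);

      // Walks a frame: recurses into child frames and scans plain blocks.
      void searchLink(QTextFrame * parent)
      {
          for( QTextFrame::iterator it = parent->begin(); !it.atEnd(); ++it )
          {
              QTextFrame *textFrame = it.currentFrame();
              QTextBlock textBlock = it.currentBlock();

              if( textFrame )
              {
                  searchLink(textFrame);
              }
              else if( textBlock.isValid() )
              {
                  searchLink(textBlock);
              }
          }
      }

      // Scans the fragments of one block; the anchor format is set on fragments.
      void searchLink(const QTextBlock & parent)
      {
          for(QTextBlock::iterator it = parent.begin(); !it.atEnd(); ++it)
          {
              QTextFragment textFragment = it.fragment();
              if( textFragment.isValid() )
              {
                  QTextCharFormat textCharFormat = textFragment.charFormat();
                  if( textCharFormat.isAnchor() )
                  {
                       qDebug() << textCharFormat.anchorHref();  // <-- URL
                  }
              }
          }
      }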
      

      searchLink() recurses through nested frames, so calling it on the document's root frame covers the whole document:

      searchLink( textDocument->rootFrame() );
      

      • Violet Giraffe wrote #3

        Aha! So my mistake was that I only looked at blocks and not fragments. Thank you.
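
        For later readers, the block-versus-fragment point can be sketched as one small helper. This is a minimal sketch, not code from the thread: the name extractLinks() is hypothetical, and instead of the recursive frame traversal shown in post #2 it uses a flat walk over the document's blocks with QTextBlock::next(), reading the anchor format from each fragment.

        ```cpp
        #include <QString>
        #include <QStringList>
        #include <QTextBlock>
        #include <QTextCharFormat>
        #include <QTextDocument>
        #include <QTextFragment>

        // Hypothetical helper (not from the thread): parses the HTML with
        // QTextDocument and returns the href of every anchor fragment,
        // in document order.
        QStringList extractLinks(const QString &html)
        {
            QTextDocument doc;
            doc.setHtml(html);

            QStringList links;
            // Flat walk over all blocks of the document.
            for (QTextBlock block = doc.begin(); block.isValid(); block = block.next())
            {
                // The anchor format is read from the fragments inside the block.
                for (QTextBlock::iterator it = block.begin(); !it.atEnd(); ++it)
                {
                    const QTextCharFormat format = it.fragment().charFormat();
                    // Skip named anchors (<a name="...">) that carry no href.
                    if (format.isAnchor() && !format.anchorHref().isEmpty())
                        links << format.anchorHref();
                }
            }
            return links;
        }
        ```

        QTextDocument lives in the QtGui module, so it is probably safest to create a QGuiApplication before calling the helper. A link whose text mixes several formats (for example bold text inside the <a> element) may be split into more than one fragment, so the same href can appear more than once; deduplicate the result if that matters.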
