
Parsing hyperlinks from HTML using QTextDocument?

    Violet Giraffe
    wrote on last edited by Violet Giraffe
    #1

    I'm having trouble extracting simple data from a simple HTML page without using WebKit and its DOM API (I don't want a WebKit dependency).

    1. This page says QTextDocument can parse stuff from HTML.
    2. I do the following and can see that my HTML has been parsed nicely, but I only get the user-visible text, with no markup:
    for (QTextBlock block = doc.begin(), end = doc.end(); block != end; block = block.next())
    {
    	qDebug() << block.text();
    }
    
    3. I read some of the QTextBlock docs and tried this, but anchorHref() and anchorNames() are always empty, even for blocks that I know contain <a href>:
    for (QTextBlock block = doc.begin(), end = doc.end(); block != end; block = block.next())
    {
    	qDebug() << block.text();
    	qDebug() << block.charFormat().anchorNames();
    	qDebug() << block.charFormat().anchorHref();
    	qDebug() << "-------------------------------";
    }
    

    Is there any way to get the hyperlink URLs?

      raven-worx
      Moderators
      wrote on last edited by
      #2

      @Violet-Giraffe
      This should work (untested though)

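      #include <QDebug>
      #include <QTextBlock>
      #include <QTextCharFormat>
      #include <QTextFragment>
      #include <QTextFrame>

      // Scans one block. Anchors live in the char format of each text
      // fragment, not in the char format of the block itself.
      void searchLink(const QTextBlock &parent)
      {
          for (QTextBlock::iterator it = parent.begin(); !it.atEnd(); ++it)
          {
              QTextFragment textFragment = it.fragment();
              if (textFragment.isValid())
              {
                  QTextCharFormat textCharFormat = textFragment.charFormat();
                  if (textCharFormat.isAnchor())
                  {
                      qDebug() << textCharFormat.anchorHref();  // <-- URL
                  }
              }
          }
      }

      // Walks a frame: recurses into child frames, scans plain blocks.
      // These are free functions, so there is no this-> on the calls.
      void searchLink(QTextFrame *parent)
      {
          for (QTextFrame::iterator it = parent->begin(); !it.atEnd(); ++it)
          {
              QTextFrame *textFrame = it.currentFrame();
              QTextBlock textBlock = it.currentBlock();

              if (textFrame)
              {
                  searchLink(textFrame);
              }
              else if (textBlock.isValid())
              {
                  searchLink(textBlock);
              }
          }
      }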
      

      searchLink() recurses into child frames (tables, for example), so start it from the document's root frame:

      searchLink( textDocument->rootFrame() );
      

        Violet Giraffe
        wrote on last edited by
        #3

        Aha! So my mistake was that I only looked at blocks and not fragments. Thank you.
