Extract parts from webpage

    realroot
    wrote on last edited by
    #9

    The data I fetch should always be inside the same kind of "block". Another example:

    <div class="infobox-works"><img src="/images/infobox/focus-abs.jpg" alt=""></div>
    

    So I want "/images/infobox/focus-abs.jpg".
    I think a regex can do this, but my attempt didn't work.

    I will look at QDomDocument to see whether it can parse HTML.

      Christian Ehrlicher
      Lifetime Qt Champion
      wrote on last edited by
      #10

      Why don't you simply search for <img src= then?

      Qt Online Installer direct download: https://download.qt.io/official_releases/online_installers/
      Visit the Qt Academy at https://academy.qt.io/catalog

        realroot
        wrote on last edited by
        #11

        Search with QRegularExpression? Could you clarify?

          Christian Ehrlicher
          Lifetime Qt Champion
          wrote on last edited by
          #12

          @realroot said in Extract parts from webpage:

          Search with QRegularExpression?

          Why do you need a regexp when you want to search for a simple string?

            JonB
            wrote on last edited by JonB
            #13

            @realroot
            If you don't want to use regular expressions then, as @Christian-Ehrlicher has said, you could search for literal string <img src=" via indexOf(), find the next " after that, and the filepath is in-between the quote indexes.
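The indexOf() approach described above can be sketched as follows. This is only a minimal sketch in plain standard C++ so that it compiles without Qt: `std::string::find` and `substr` stand in for `QString::indexOf()` and `QString::mid()`, and the function name `firstImgSrc` is a hypothetical helper, not something from this thread.

```cpp
#include <string>

// Hypothetical helper: copies the value of the first <img src="..."> attribute
// into 'path' and returns true; returns false when no complete match exists.
// Assumption: the markup uses exactly the literal text <img src=" as in the
// example posted earlier. With Qt, QString::indexOf() and QString::mid()
// play the roles of find() and substr().
bool firstImgSrc(const std::string &html, std::string &path)
{
    const std::string marker = "<img src=\"";

    // Locate the literal marker.
    const std::size_t start = html.find(marker);
    if (start == std::string::npos)
        return false;

    // The file path begins right after the marker...
    const std::size_t pathBegin = start + marker.size();

    // ...and ends just before the next double quote.
    const std::size_t pathEnd = html.find('"', pathBegin);
    if (pathEnd == std::string::npos)
        return false;

    path = html.substr(pathBegin, pathEnd - pathBegin);
    return true;
}
```

Searching for the closing quote from `pathBegin`, rather than from the start of the string, is what keeps the quote inside the marker itself from being matched.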

            For picking out the filepath in a regular expression you will want something like
            <img src="([^"]*)"
            The parentheses (...) allow you to capture the string inside. You have to escape the double quotes (and any backslashes) to embed the pattern in a C++ string literal, or use a raw string literal.
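The capture-group pattern above can be tried out with a sketch like the following. It uses `std::regex` (ECMAScript grammar) instead of QRegularExpression so that it compiles without Qt; as far as I know this simple pattern should behave the same in QRegularExpression, where `globalMatch()` and `captured(1)` would be the counterparts. The function name `allImgSrc` is a hypothetical helper.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the src value of every <img src="..."> found.
// With Qt, QRegularExpression::globalMatch() and
// QRegularExpressionMatch::captured(1) would do the equivalent work.
std::vector<std::string> allImgSrc(const std::string &html)
{
    // Raw string literal, so the quotes need no backslash escaping.
    // A custom delimiter (rx) is required because the pattern itself
    // contains the sequence )" which would end a plain R"(...)" literal.
    static const std::regex pattern(R"rx(<img src="([^"]*)")rx");

    std::vector<std::string> paths;
    for (std::sregex_iterator it(html.begin(), html.end(), pattern), end;
         it != end; ++it) {
        // Index 0 is the whole match; index 1 is the first capture group.
        paths.push_back((*it)[1].str());
    }
    return paths;
}
```

The same pattern written as an ordinary string literal would be `"<img src=\"([^\"]*)\""`.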

            Even though it does not offer precise Qt syntax, I would recommend playing at e.g. https://regex101.com/ (EcmaScript (JavaScript) flavor) with bits of your input to learn how to match.

              SGaist
              Lifetime Qt Champion
              wrote on last edited by
              #14

              The regular expression tool might be worth building and trying out, to work out the correct syntax to use with QRegularExpression.

              Interested in AI ? www.idiap.ch
              Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

                realroot
                wrote on last edited by
                #15

                I did not think about indexOf().
                I can use that indeed.

                To save a JPG or PDF, can I use a QPixmap?

                void onFinished(QNetworkReply *reply) {
                    ...
                    QPixmap pm;
                    pm.loadFromData(reply->readAll());
                }
                  Lifetime Qt Champion
                  wrote on last edited by
                  #16

                  What does your reply contain? If it's the image data, then write it directly to a file.

                    realroot
                    wrote on last edited by realroot
                    #17

                    I still have to try it, but the request should be:

                    QNetworkReply* reply = m_manager->get(QNetworkRequest(QUrl("https://site.com/image.jpg")));
                    

                    So I think it is the image data.

                    I use QTextStream for text, but I am not sure how to handle images.

                      SGaist
                      Lifetime Qt Champion
                      wrote on last edited by
                      #18

                      So this is binary data; just use QFile to write it to disk directly.
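Writing the bytes straight to disk could look roughly like the sketch below. It uses `std::ofstream` so that it compiles without Qt. In a Qt slot the equivalent, as far as I can tell, would be opening a `QFile` with `QIODevice::WriteOnly` and passing `reply->readAll()` to `QFile::write()`. The name `saveBytes` is a hypothetical helper.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes the bytes unchanged to fileName.
// Returns false if the file cannot be opened or the write fails.
// No decoding step (such as QPixmap) is involved, so this works the same
// for a JPG, a PDF, or any other downloaded payload.
bool saveBytes(const std::string &fileName, const std::vector<char> &bytes)
{
    // Binary mode prevents newline translation, which would corrupt the data.
    std::ofstream out(fileName, std::ios::binary | std::ios::trunc);
    if (!out)
        return false;

    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    out.flush();
    return out.good();
}
```

A text-oriented writer such as QTextStream is the wrong tool here because it applies text encoding, while image and PDF data must be stored byte for byte.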

                      • R realroot has marked this topic as solved on
