Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • Users
  • Groups
  • Search
  • Get Qt Extensions
  • Unsolved
Collapse
Brand Logo
  1. Home
  2. Qt Development
  3. General and Desktop
  4. fast way to update 10M record in Mysql database using QT
Forum Updated to NodeBB v4.3 + New Features

fast way to update 10M record in Mysql database using QT

Scheduled Pinned Locked Moved General and Desktop
7 Posts 4 Posters 1.3k Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • A Offline
    A Offline
    AmrCoder
    wrote on 5 Apr 2020, 11:34 last edited by
    #1

    I have a 500MB file which contains a data that I need to use it to update a table that contains 10M rows so I use this code in using QT SQL library

    QSqlDatabase db = QSqlDatabase::addDatabase("QMYSQL");
    // my database info 
    db.transaction(); // start the transaction
        QSqlQuery q;
        QString sql = ""; // to append many sql statment to this string
    
        QFile file(dataFile);
        file.open(QIODevice::Text | QIODevice::ReadOnly);
        QTextStream stream(&file);
    
        int counter = 0;
        clock_t tStart = clock();
        while (!stream.atEnd()) {
            counter++;
            QString str = stream.readLine();
            QStringList list = str.split(';');
            QString id = list[0];
            QString streetName = list[1];
            QString HouseName = list[2];
            QString HouseNumber = list[3];
            QString city = list[4];
            QString postalCode = list[5];
    
            QString temp_sql = "UPDATE temp_table_copy SET street_name='!1', house_number='!2',postal_code='!3',address_city='!4', house='!5' where id=" + id + ";";
            temp_sql = temp_sql.replace("!1", streetName);
            temp_sql = temp_sql.replace("!2", HouseNumber);
            temp_sql = temp_sql.replace("!3", postalCode);
            temp_sql = temp_sql.replace("!4", city);
            temp_sql = temp_sql.replace("!5", HouseName);
    
            sql += temp_sql;
            if (sql.size() >= 100000) {
                if (!q.exec(sql)) {
                     qDebug() << "Excute Error: " << q.lastError().text();
                }
                sql = "";
                printf("Finished: %d sql statment\n", counter);
            }
    
            if (counter == Number)
                break;
        }
        if (!sql.isEmpty()) {
            if (!q.exec(sql)) {
                qDebug() << "Excute Error: " << q.lastError().text();
            }
            printf("Finished: %d sql statment\n", counter);
        }
        db.commit(); // commit and
    
    
        printf("Time taken: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);
        db.close();
    

    the idea is I append the big SQL string with the SQL statement and when it reaches the 100,000 in size I execute it and empty the string to avoid bad allocation when the string is too much size in memory, and also I read that MySQL database has a certain size to the SQL statement to be executed and I get its limit using

    SHOW VARIABLES LIKE 'max_allowed_packet';
    

    the problem here is that not all the data are updated some ids have a value in the text file and not updated inside the MySQL database table so what can be the problem? those columns are null before the update so when I try to count how many not update I found 7M only which means some ids are not updated and also I did not get any error from the query, any help or suggestion to know why this happened, or if there is another way to it.
    Thanks in advance.

    J 1 Reply Last reply 6 Apr 2020, 09:10
    0
    • C Offline
      C Offline
      Christian Ehrlicher
      Lifetime Qt Champion
      wrote on 5 Apr 2020, 11:53 last edited by Christian Ehrlicher 4 May 2020, 11:54
      #2

      Use a prepared statement + db transaction. Your current approach is a big invitation for SQL injection.

      First read the data into a QVector<struct> and then add it to the database:

      QVector <MyStruct> data;
      while (!stream.atEnd()) {
        data += ...
      }
      
      QSqlDatabase db = ...
      db.beginTransaction()
      QSqlQuery q(db);
      if (q.prepare("UPDATE temp_table_copy SET street_name=:sn, house_number=:hn, postal_code=: pc, address_city=:ac, house=:house where id= :id"))
        return false;
      for (const auto &line : data)
      {
        q.bindValue(":sn", line.street_name);
        q.bindValue(":hn", line.house_number);
       ...
        q.exec();
      }
      
      db.commit();
      

      Qt Online Installer direct download: https://download.qt.io/official_releases/online_installers/
      Visit the Qt Academy at https://academy.qt.io/catalog

      1 Reply Last reply
      4
      • A Offline
        A Offline
        AmrCoder
        wrote on 5 Apr 2020, 13:40 last edited by AmrCoder 4 May 2020, 13:41
        #3

        is this the faster thing we can get, is it fine to use exec on each loop? I still running the command and I will write to you how much it take to finish

        1 Reply Last reply
        0
        • S Offline
          S Offline
          saulos
          wrote on 6 Apr 2020, 08:28 last edited by
          #4

          Performance can be better by having an index on the WHERE field (ID) another think is use a Store Procedure and pass the data to the SP this way the SQL don't spend time to recompile the SQL statement every time you call it

          1 Reply Last reply
          0
          • C Offline
            C Offline
            Christian Ehrlicher
            Lifetime Qt Champion
            wrote on 6 Apr 2020, 08:44 last edited by
            #5

            @saulos said in fast way to update 10M record in Mysql database using QT:

            SQL don't spend time to recompile the SQL statement every time you call it

            As you can see the statement is a prepared statement in my example - it's recompiled exactly once

            Qt Online Installer direct download: https://download.qt.io/official_releases/online_installers/
            Visit the Qt Academy at https://academy.qt.io/catalog

            S 1 Reply Last reply 6 Apr 2020, 12:57
            0
            • A AmrCoder
              5 Apr 2020, 11:34

              I have a 500MB file which contains a data that I need to use it to update a table that contains 10M rows so I use this code in using QT SQL library

              QSqlDatabase db = QSqlDatabase::addDatabase("QMYSQL");
              // my database info 
              db.transaction(); // start the transaction
                  QSqlQuery q;
                  QString sql = ""; // to append many sql statment to this string
              
                  QFile file(dataFile);
                  file.open(QIODevice::Text | QIODevice::ReadOnly);
                  QTextStream stream(&file);
              
                  int counter = 0;
                  clock_t tStart = clock();
                  while (!stream.atEnd()) {
                      counter++;
                      QString str = stream.readLine();
                      QStringList list = str.split(';');
                      QString id = list[0];
                      QString streetName = list[1];
                      QString HouseName = list[2];
                      QString HouseNumber = list[3];
                      QString city = list[4];
                      QString postalCode = list[5];
              
                      QString temp_sql = "UPDATE temp_table_copy SET street_name='!1', house_number='!2',postal_code='!3',address_city='!4', house='!5' where id=" + id + ";";
                      temp_sql = temp_sql.replace("!1", streetName);
                      temp_sql = temp_sql.replace("!2", HouseNumber);
                      temp_sql = temp_sql.replace("!3", postalCode);
                      temp_sql = temp_sql.replace("!4", city);
                      temp_sql = temp_sql.replace("!5", HouseName);
              
                      sql += temp_sql;
                      if (sql.size() >= 100000) {
                          if (!q.exec(sql)) {
                               qDebug() << "Excute Error: " << q.lastError().text();
                          }
                          sql = "";
                          printf("Finished: %d sql statment\n", counter);
                      }
              
                      if (counter == Number)
                          break;
                  }
                  if (!sql.isEmpty()) {
                      if (!q.exec(sql)) {
                          qDebug() << "Excute Error: " << q.lastError().text();
                      }
                      printf("Finished: %d sql statment\n", counter);
                  }
                  db.commit(); // commit and
              
              
                  printf("Time taken: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);
                  db.close();
              

              the idea is I append the big SQL string with the SQL statement and when it reaches the 100,000 in size I execute it and empty the string to avoid bad allocation when the string is too much size in memory, and also I read that MySQL database has a certain size to the SQL statement to be executed and I get its limit using

              SHOW VARIABLES LIKE 'max_allowed_packet';
              

              the problem here is that not all the data are updated some ids have a value in the text file and not updated inside the MySQL database table so what can be the problem? those columns are null before the update so when I try to count how many not update I found 7M only which means some ids are not updated and also I did not get any error from the query, any help or suggestion to know why this happened, or if there is another way to it.
              Thanks in advance.

              J Offline
              J Offline
              JonB
              wrote on 6 Apr 2020, 09:10 last edited by
              #6

              @AmrCoder
              One observation: you are doing 500MB's-worth of data read update statements inside a single transaction. I don't know what sort of behind-the-scenes usage that puts on your database. Do you want all of these updates to either succeed or fail as a single transaction? There are possibilities like taking a table lock for a large number of updates. Or there are "batch" operations, include QSqlQuery::execBatch(). You might like to investigate these possibilities.

              1 Reply Last reply
              1
              • C Christian Ehrlicher
                6 Apr 2020, 08:44

                @saulos said in fast way to update 10M record in Mysql database using QT:

                SQL don't spend time to recompile the SQL statement every time you call it

                As you can see the statement is a prepared statement in my example - it's recompiled exactly once

                S Offline
                S Offline
                saulos
                wrote on 6 Apr 2020, 12:57 last edited by
                #7

                @Christian-Ehrlicher
                I was not referring to your code, it was just a generic suggestion :)

                1 Reply Last reply
                0

                1/7

                5 Apr 2020, 11:34

                • Login

                • Login or register to search.
                1 out of 7
                • First post
                  1/7
                  Last post
                0
                • Categories
                • Recent
                • Tags
                • Popular
                • Users
                • Groups
                • Search
                • Get Qt Extensions
                • Unsolved