Important: Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

fast way to update 10M record in Mysql database using QT



  • I have a 500MB file which contains a data that I need to use it to update a table that contains 10M rows so I use this code in using QT SQL library

    QSqlDatabase db = QSqlDatabase::addDatabase("QMYSQL");
    // my database info 
    db.transaction(); // start the transaction
        QSqlQuery q;
        QString sql = ""; // to append many sql statment to this string
    
        QFile file(dataFile);
        file.open(QIODevice::Text | QIODevice::ReadOnly);
        QTextStream stream(&file);
    
        int counter = 0;
        clock_t tStart = clock();
        while (!stream.atEnd()) {
            counter++;
            QString str = stream.readLine();
            QStringList list = str.split(';');
            QString id = list[0];
            QString streetName = list[1];
            QString HouseName = list[2];
            QString HouseNumber = list[3];
            QString city = list[4];
            QString postalCode = list[5];
    
            QString temp_sql = "UPDATE temp_table_copy SET street_name='!1', house_number='!2',postal_code='!3',address_city='!4', house='!5' where id=" + id + ";";
            temp_sql = temp_sql.replace("!1", streetName);
            temp_sql = temp_sql.replace("!2", HouseNumber);
            temp_sql = temp_sql.replace("!3", postalCode);
            temp_sql = temp_sql.replace("!4", city);
            temp_sql = temp_sql.replace("!5", HouseName);
    
            sql += temp_sql;
            if (sql.size() >= 100000) {
                if (!q.exec(sql)) {
                     qDebug() << "Excute Error: " << q.lastError().text();
                }
                sql = "";
                printf("Finished: %d sql statment\n", counter);
            }
    
            if (counter == Number)
                break;
        }
        if (!sql.isEmpty()) {
            if (!q.exec(sql)) {
                qDebug() << "Excute Error: " << q.lastError().text();
            }
            printf("Finished: %d sql statment\n", counter);
        }
        db.commit(); // commit and
    
    
        printf("Time taken: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);
        db.close();
    

    the idea is I append the big SQL string with the SQL statement and when it reaches the 100,000 in size I execute it and empty the string to avoid bad allocation when the string is too much size in memory, and also I read that MySQL database has a certain size to the SQL statement to be executed and I get its limit using

    SHOW VARIABLES LIKE 'max_allowed_packet';
    

    the problem here is that not all the data are updated some ids have a value in the text file and not updated inside the MySQL database table so what can be the problem? those columns are null before the update so when I try to count how many not update I found 7M only which means some ids are not updated and also I did not get any error from the query, any help or suggestion to know why this happened, or if there is another way to it.
    Thanks in advance.


  • Lifetime Qt Champion

    Use a prepared statement + db transaction. Your current approach is a big invitation for SQL injection.

    First read the data into a QVector<struct> and then add it to the database:

    QVector <MyStruct> data;
    while (!stream.atEnd()) {
      data += ...
    }
    
    QSqlDatabase db = ...
    db.beginTransaction()
    QSqlQuery q(db);
    if (q.prepare("UPDATE temp_table_copy SET street_name=:sn, house_number=:hn, postal_code=: pc, address_city=:ac, house=:house where id= :id"))
      return false;
    for (const auto &line : data)
    {
      q.bindValue(":sn", line.street_name);
      q.bindValue(":hn", line.house_number);
     ...
      q.exec();
    }
    
    db.commit();
    


  • is this the faster thing we can get, is it fine to use exec on each loop? I still running the command and I will write to you how much it take to finish



  • Performance can be better by having an index on the WHERE field (ID) another think is use a Store Procedure and pass the data to the SP this way the SQL don't spend time to recompile the SQL statement every time you call it


  • Lifetime Qt Champion

    @saulos said in fast way to update 10M record in Mysql database using QT:

    SQL don't spend time to recompile the SQL statement every time you call it

    As you can see the statement is a prepared statement in my example - it's recompiled exactly once



  • @AmrCoder
    One observation: you are doing 500MB's-worth of data read update statements inside a single transaction. I don't know what sort of behind-the-scenes usage that puts on your database. Do you want all of these updates to either succeed or fail as a single transaction? There are possibilities like taking a table lock for a large number of updates. Or there are "batch" operations, include QSqlQuery::execBatch(). You might like to investigate these possibilities.



  • @Christian-Ehrlicher
    I was not referring to your code, it was just a generic suggestion :)


Log in to reply