Qt SQL - optimize bulk insert
-
Hi,
currently, we are evaluating a DBMS abstraction library and we are looking into Qt SQL. We observed that another C++ lib (SOCI) outperforms Qt SQL in speed dramatically when inserting 1000000 records (at least on Windows with ODBC and PostgreSQL). Does somebody know how to tweak the following code to achieve better performance? It's really urgent because we would like to take a decision by tomorrow.
Thanks in advance!
@
// Qt SQL implementation
void CQtDBEvaluation::PerformBulkOperations()
{
static const int NUMBER_OF_USERS = 1000000;
static const int BATCH_SIZE = 250;QSqlQuery query; // BULK insert CTimer timer(1000); // start timer QSqlDatabase::database().transaction(); QVariantList userIds; QVariantList domains; QVariantList creationTimes; QVariantList modifyTimes; for (int i=0; i<NUMBER_OF_USERS; ++i) { userIds.push_back(QString("user") + QString::number(i)); domains.push_back("master"); creationTimes.push_back(QDateTime::currentDateTime()); modifyTimes.push_back(QDateTime::currentDateTime()); if ((i + 1) % BATCH_SIZE == 0) { query.prepare("INSERT INTO vdsUser(vdsuserid, vdsdomain, vdscreatetime, vdsmodifytime) values (?, ?, ?, ?)"); // BULK INSERT query.addBindValue(userIds); query.addBindValue(domains); query.addBindValue(creationTimes); query.addBindValue(modifyTimes); query.execBatch(); userIds.clear(); domains.clear(); creationTimes.clear(); modifyTimes.clear(); } } QSqlDatabase::database().commit(); // stop timer int64 measuredTime = timer.Tick(); std::cout << "CQtDBEvaluation::PerformBulkOperations - inserting " << NUMBER_OF_USERS << " took " << measuredTime << " ms." << std::endl;
}
@@
// SOCI implementation
void CSOCIDBEvaluation::PerformBulkOperations()
{
std::tm time;
time.tm_year = 115;
time.tm_mday = 23;
time.tm_mon = 2;
time.tm_hour = 8;
time.tm_min = 8;
time.tm_sec = 8;
std::string domain = "master";const int NUMBER_OF_USERS = 1000000; const int BATCH_SIZE = 250; // ORM not(!!!) supported for bulk insert! std::vector<std::string> vdsUserIds; std::vector<std::string> vdsDomains; std::vector<std::tm> vdsCreateTimes; std::vector<std::tm> vdsModifyTimes; try { soci::session sql(soci::odbc, "DSN=test;Uid=test;Pwd=test;"); // start timer CTimer timer(1000); // start timer sql.begin(); for (int i = 0; i != NUMBER_OF_USERS; ++i) { std::stringstream ss; ss << "user" << i; vdsUserIds.push_back(ss.str()); vdsDomains.push_back(domain); vdsCreateTimes.push_back(time); vdsModifyTimes.push_back(time); if ((i + 1) % BATCH_SIZE == 0) { // BULK INSERT soci::statement st = (sql.prepare << "insert into vdsuser(vdsuserid, vdsdomain, vdscreatetime, vdsmodifytime) values(:vdsuserid, :vdsdomain, :vdscreatetime, :vdsmodifytime)", soci::use(vdsUserIds), soci::use(vdsDomains), soci::use(vdsCreateTimes), soci::use(vdsModifyTimes)); st.execute(true); // clear current ones vdsUserIds.clear(); vdsDomains.clear(); vdsCreateTimes.clear(); vdsModifyTimes.clear(); } } sql.commit(); // stop timer int64 measuredTime = timer.Tick(); std::cout << "CSOCIDBEvaluation::PerformBulkOperations - inserting " << NUMBER_OF_USERS << " took " << measuredTime << " ms." << std::endl; } catch (std::exception const &e) { std::cout << "Error: " << e.what() << std::endl; }
}
@ -
If you are looking for performance, you should consider using the QPSQL PostgreSQL native driver instead of the ODBC one.
-
this reply is out of time but prepare statement in both case should by out of "for loop"... and in SOCI example you have one "time" and "domain" creation outside "for loop"...