Run Python3 C Interface script with arguments
-
I need a way to get the video title from a YouTube URL. While searching, all the solutions I found were in PHP, JavaScript or Python. I chose Python, as I have been wanting to learn how to use Python in C++.
So I downloaded Python for Windows and, after finding a script that does what I want, I began searching for how to use it in C++.
A lot of answers suggested using QProcess, which is fine but not in my case: as I understand it, the user of my app would have to have Python installed to be able to use it, or I would have to ship Python in the installer package, making it really large. Please correct me if I am wrong.
So I set up my environment by adding these two lines to my .pro file:
    INCLUDEPATH += C:/Python39/include
    LIBS += -LC:/Python39/libs -lpython39
I then found a way to run it embedded (at least I expect it is embedded), but it always fails with IndexError: list index out of range. It seems the script is not receiving any arguments, even though I am setting them with PySys_SetArgv.
Here is the example code I am trying to test:
    void MainWindow::runPyScriptArgs(const char* file, int argc, const char *argv[])
    {
        FILE* file_;
        Py_SetProgramName((wchar_t*)file);
        Py_Initialize();
        PySys_SetArgv(argc, (wchar_t**)argv);
        file_ = _Py_fopen(file,"r");
        PyRun_SimpleFile(file_,file);
        Py_Finalize();
    }
Code used to call it:
    int py_argc = 1;
    const char* py_argv[] = {"https://www.youtube.com/watch?v=bUUZ1iD9_e4"};
    runPyScriptArgs("script.py", py_argc, py_argv);
script.py:
    import urllib.request
    import json
    import urllib
    import sys

    url = str(sys.argv[1])
    params = {"format": "json", "url": url}
    url = "https://www.youtube.com/oembed"
    query_string = urllib.parse.urlencode(params)
    url = url + "?" + query_string
    with urllib.request.urlopen(url) as response:
        response_text = response.read()
        data = json.loads(response_text.decode())
        f = open("demofile2.txt", "w")
        f.write(data['title'])
        f.close()
Am I missing something?
-
@hbatalha said in Run Python3 C Interface script with arguments:
    void MainWindow::runPyScriptArgs(const char* file, int argc, const char *argv[])
    {
        FILE* file_;
        Py_SetProgramName((wchar_t*)file);
        Py_Initialize();
        PySys_SetArgv(argc, (wchar_t**)argv);
Casting a string literal (char*) to wchar_t* is unlikely to work as expected. You either need to convert it, or start with a wide string literal. QString::toWCharArray() might be of interest.
Also, argv style arguments are often assumed to be mutable. Casting from a constant can lead to crashes. I don't know if this applies to PySys_SetArgv.
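Two things in the quoted code seem worth spelling out. The cast converts nothing: it only reinterprets narrow bytes as wide characters. And because py_argc is 1, sys.argv ends up with a single element, so sys.argv[1] raises IndexError whatever that element contains. By convention the script name occupies slot 0. Below is a minimal sketch, using only the C++ standard library, of building a converted and mutable argv-style array. The helper names widen, buildWideArgv and argvPointers are hypothetical, and the conversion assumes the arguments are valid in the current C locale (a plain ASCII URL is). When Python.h is available, CPython's own Py_DecodeLocale() is probably the more natural choice for the conversion step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: convert a narrow string to a wide string using the
// current C locale. Returns an empty string if the input cannot be converted.
std::wstring widen(const char* s)
{
    assert(s != nullptr);
    // A wide string never needs more elements than the narrow one has bytes.
    std::wstring out(std::strlen(s), L'\0');
    const std::size_t n = std::mbstowcs(&out[0], s, out.size());
    if (n == static_cast<std::size_t>(-1))
        return std::wstring();
    out.resize(n);
    return out;
}

// Hypothetical helper: build sys.argv-style storage with the script name in
// slot 0, followed by the real arguments.
std::vector<std::wstring> buildWideArgv(const char* script, int argc,
                                        const char* const argv[])
{
    std::vector<std::wstring> storage;
    storage.push_back(widen(script));
    for (int i = 0; i < argc; ++i)
        storage.push_back(widen(argv[i]));
    return storage;
}

// Mutable wchar_t* view over the storage. The pointers are valid only while
// `storage` is alive and unmodified.
std::vector<wchar_t*> argvPointers(std::vector<std::wstring>& storage)
{
    std::vector<wchar_t*> pointers;
    for (std::wstring& s : storage)
        pointers.push_back(&s[0]);
    return pointers;
}
```

With these helpers, the call site could keep `auto storage = buildWideArgv("script.py", py_argc, py_argv);` and `auto ptrs = argvPointers(storage);` alive and pass `static_cast<int>(ptrs.size())` and `ptrs.data()` to PySys_SetArgv, so that the URL arrives as sys.argv[1].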
script.py
[...]Am I missing something?
If the goal is to retrieve the data rather than utilize python, this algorithm should be easy to convert to C++ Qt api calls. QNetworkAccessManager::get() and QJsonDocument::fromJson() will cover most of what is needed.
-
@jeremy_k said in Run Python3 C Interface script with arguments:
If the goal is to retrieve the data rather than utilize python, this algorithm should be easy to convert to C++ Qt api calls. QNetworkAccessManager::get() and QJsonDocument::fromJson() will cover most of what is needed.
Thanks for the tip, I was able to do just that:
    QNetworkAccessManager* net = new QNetworkAccessManager(this);
    net->get(QNetworkRequest(QUrl("https://noembed.com/embed?url=https://www.youtube.com/watch?v=dQw4w9WgXcQ")));
    connect(net, &QNetworkAccessManager::finished, [](QNetworkReply* reply)
    {
        QString output = reply->readAll();
        QJsonDocument doc;
        QJsonParseError errorPtr;
        doc = QJsonDocument::fromJson(output.toUtf8(), &errorPtr);
        if (!doc.isNull())
        {
            QJsonObject videoInfoJson = doc.object();
            qDebug() << videoInfoJson.value("title").toString();
        }
        reply->deleteLater(); // the reply is owned by the caller and must be released
    });
However, I noticed that the Python script, run with QProcess for example, takes half the time. Time is really important in this case.
Also, argv style arguments are often assumed to be mutable. Casting from a constant can lead to crashes. I don't know if this applies to PySys_SetArgv.
Is there a way to verify that?
As a side note, I found a slick way to do what I want: hardcode the Python code in a const char* and pass it to PyRun_SimpleString.
-
@SGaist Done.
    cpr::Response r = cpr::Get(cpr::Url("https://noembed.com/embed?url=https://www.youtube.com/watch?v=dQw4w9WgXcQ"));
    if (r.status_code == 0)
        std::cerr << r.error.message << std::endl;
    else if (r.status_code >= 400)
    {
        std::cerr << "Error [" << r.status_code << "] making request" << std::endl;
    }
    else
    {
        std::cout << "Request took " << r.elapsed << std::endl;
        std::cout << "Body:" << std::endl << r.text;
    }
Still, the Python script is faster.
As for the problem in the OP, I found out I could just hardcode the Python code in a const char* and pass it to PyRun_SimpleString:
    void MainWindow::runPyScript(const char* code)
    {
        Py_Initialize();
        PyRun_SimpleString(code);
        Py_Finalize();
    }

    QString url = "https://www.youtube.com/watch?v=bUUZ1iD9_e4";
    QString script = "import urllib.request\n"
                     "import json\n"
                     "import urllib\n"
                     "import sys\n"
                     "url = '" + url + "'\n"
                     "params = {\"format\": \"json\", \"url\": url}\n"
                     "url = \"https://www.youtube.com/oembed\"\n"
                     "query_string = urllib.parse.urlencode(params)\n"
                     "url = url + \"?\" + query_string\n"
                     "with urllib.request.urlopen(url) as response:\n"
                     "    response_text = response.read()\n"
                     "    data = json.loads(response_text.decode())\n"
                     "    print(data['title'])\n"
                     "    f = open(\"" + AppDataLocation + "/demofile2.txt\", \"w\")\n"
                     "    f.write(data['title'])\n"
                     "    f.close()";
    runPyScript(script.toStdString().c_str());
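One caveat with pasting the URL straight into the script text: a quote or backslash inside the URL would end the Python string early or change its meaning. A minimal sketch of quoting the value first is shown here. The helper name pythonStringLiteral is hypothetical, and it only handles the characters that matter for a single-quoted literal (backslash, single quote, newline, carriage return).

```cpp
#include <string>

// Hypothetical helper: render `value` as a single-quoted Python string
// literal, escaping the characters that would otherwise break the literal.
std::string pythonStringLiteral(const std::string& value)
{
    std::string out = "'";
    for (char c : value) {
        switch (c) {
        case '\\': out += "\\\\"; break;  // backslash becomes two backslashes
        case '\'': out += "\\'";  break;  // quote becomes backslash + quote
        case '\n': out += "\\n";  break;  // newline becomes backslash + n
        case '\r': out += "\\r";  break;  // carriage return becomes backslash + r
        default:   out += c;      break;
        }
    }
    out += "'";
    return out;
}
```

In the snippet above, the line that builds `url = '...'` could then use `QString::fromStdString(pythonStringLiteral(url.toStdString()))` instead of adding the quotes by hand.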
-
@hbatalha said in Run Python3 C Interface script with arguments:
However, I noticed that the Python script, run with QProcess for example, takes half the time. Time is really important in this case.
That's surprising. Is this difference consistent across multiple runs? Usually code precompiled to machine language is significantly faster. Profiling could help to clarify what is going on.
In this case, the wall-clock time should be predominantly spent waiting on network traffic.
Also, argv style arguments are often assumed to be mutable. Casting from a constant can lead to crashes. I don't know if this applies to PySys_SetArgv.
Is there a way to verify that?
Read the relevant documentation or look at the function declaration.
-
@hbatalha If it's important, you should try to profile the code and determine where the time is spent. For Python, profile and cProfile are easy to use. I like snakeviz for viewing the results. For C++, it depends on what platform you are using. On macOS and Linux, Qt Creator knows how to use Valgrind. A low-latency timer like QElapsedTimer can work for simple profiling.
Use a network capture tool like Wireshark to see the various parts of the network traffic. The first run will have to wait for the DNS query response. Subsequent executions might be able to take advantage of the DNS cache, and appear faster as a result. This will be visible in the capture.
Then again, sometimes any working implementation is good enough.
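For the simple-timer approach, a rough sketch using std::chrono::steady_clock as a standard-library stand-in for QElapsedTimer is given below. The helper name timeRunsMs is hypothetical. Keeping one sample per run, instead of an average, makes a slow first run (for example, an uncached DNS lookup) stand out from the later ones. For an asynchronous request, the callable would have to block until the reply has arrived.

```cpp
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical helper: run `task` `runs` times and return the wall-clock
// duration of each run in milliseconds, in execution order.
std::vector<double> timeRunsMs(const std::function<void()>& task, int runs)
{
    std::vector<double> samples;
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        task();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return samples;
}
```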
-
If it's important, you should try to profile the code and determine where the time is spent.
It is not that important right now, mostly because I don't have a lot of time and I would have to learn how to use all the tools you mentioned. It is worth noting that I don't speak Python; I just google the script I need for some task and then use it. As I know C++, reading Python code is not that difficult.
However, I already took a look at the tools and had a quick read; those are tools I will need in the near future. Thank you!
Right now I am just using QElapsedTimer to time the code.
I made some more tests comparing QNetworkAccessManager::get and the Python script.
QNetworkAccessManager always took 1200-1300 milliseconds to get a reply, while the Python script (Edit: run with QProcess) took only 400-500 milliseconds.
Edit: Using embedded Python, it took even less, roughly 350-430 milliseconds.
Edit 2: I just found out that the Python script makes a GET request that is quicker than the one I was using with QNetworkAccessManager::get. I made that same GET request with QNetworkAccessManager::get and it took 350-400 milliseconds.
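Going by the code posted earlier, the faster request appears to be the https://www.youtube.com/oembed endpoint used by script.py, whereas the C++ snippets queried noembed.com. A minimal sketch of building that URL the way the script does with urlencode, using only the standard library, is shown here. The helper names percentEncode and oembedUrl are hypothetical, and unlike urlencode this version would encode a space as %20 instead of +. In a Qt program, QUrlQuery is probably the more natural tool for the same job.

```cpp
#include <string>

// Hypothetical helper: percent-encode every byte except the RFC 3986
// unreserved characters (letters, digits, '-', '.', '_', '~').
std::string percentEncode(const std::string& text)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : text) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: build the oEmbed request URL used by script.py.
std::string oembedUrl(const std::string& videoUrl)
{
    return "https://www.youtube.com/oembed?format=json&url=" + percentEncode(videoUrl);
}
```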