Run Python3 C Interface script with arguments



  • I need a way to get the video title from a YouTube URL. While searching, all the solutions I found were in PHP, JavaScript or Python. I chose to go with Python as I have been wanting to learn how to use Python in C++.

    So I downloaded Python for Windows and, after finding a script that does what I want, began looking into how to use it from C++.
    A lot of answers suggested using QProcess, which is fine in general but not in my case: as I understand it, users of my app would then need Python installed, or I would have to ship Python in the installer package, making it really large. Please correct me if I am wrong.

    So I set up my env by adding these two lines to my .pro file:

    INCLUDEPATH += C:/Python39/include
    LIBS += -LC:/Python39/libs -lpython39
    

    I then found a way to run it embedded (at least I expect it is embedded), but it always fails with IndexError: list index out of range. It seems the script is not receiving any arguments, even though I am setting them with PySys_SetArgv.

    Here is the example code I am trying to test:

    void MainWindow::runPyScriptArgs(const char* file, int argc, const char *argv[])
    {
        FILE* file_;
        Py_SetProgramName((wchar_t*)file);
        Py_Initialize();
        PySys_SetArgv(argc, (wchar_t**)argv);
        file_ = _Py_fopen(file,"r");
        PyRun_SimpleFile(file_,file);
        Py_Finalize();
    }
    

    Code used to call it:

    int py_argc = 1;
    const char* py_argv[] = {"https://www.youtube.com/watch?v=bUUZ1iD9_e4"};
     runPyScriptArgs("script.py", py_argc, py_argv);
    

    script.py

    import urllib.request
    import json
    import urllib
    import sys
    
    url = str(sys.argv[1])
    
    params = {"format": "json", "url": url}
    url = "https://www.youtube.com/oembed"
    
    query_string = urllib.parse.urlencode(params)
    url = url + "?" + query_string
    
    with urllib.request.urlopen(url) as response:
        response_text = response.read()
        data = json.loads(response_text.decode())
        f = open("demofile2.txt", "w")
        f.write(data['title'])
        f.close()
    

    Am I missing something?



  • @hbatalha said in Run Python3 C Interface script with arguments:

    void MainWindow::runPyScriptArgs(const char* file, int argc, const char *argv[])
    {
        FILE* file_;
        Py_SetProgramName((wchar_t*)file);
        Py_Initialize();
        PySys_SetArgv(argc, (wchar_t**)argv);
    

    Casting a string literal (const char*) to wchar_t* is unlikely to work as expected. You either need to convert, or start with a wide string literal. QString::toWCharArray() might be of interest.

    Also, argv style arguments are often assumed to be mutable. Casting from a constant can lead to crashes. I don't know if this applies to PySys_SetArgv.

    script.py
    [...]

    Am I missing something?

    If the goal is to retrieve the data rather than to use Python, this algorithm should be easy to convert to Qt C++ API calls. QNetworkAccessManager::get() and QJsonDocument::fromJson() will cover most of what is needed.
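On the IndexError itself: PySys_SetArgv fills sys.argv from the array it is given, and the first entry is expected to be the script name. With argc = 1 the URL therefore ends up in sys.argv[0] and sys.argv[1] does not exist, independently of the cast problem. Below is a minimal sketch, using only the standard library, of building a mutable wide argument vector with the script name first. The helper names widenAscii, makeWideArgs and makeArgvPointers are hypothetical, and the character-by-character widening assumes ASCII-only arguments; anything else needs a real conversion such as QString::toWCharArray().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Widen a narrow string to a wide string, one character at a time.
// Assumption: the input is 7-bit ASCII (true for the URL in this thread).
std::wstring widenAscii(const std::string& narrow)
{
    std::wstring wide;
    wide.reserve(narrow.size());
    for (char c : narrow) {
        assert(static_cast<unsigned char>(c) < 128 && "ASCII input expected");
        wide.push_back(static_cast<wchar_t>(c));
    }
    return wide;
}

// Build the storage for a wide argv whose first entry is the script name,
// matching the layout of sys.argv when a script is started from a shell.
std::vector<std::wstring> makeWideArgs(const std::string& script,
                                       const std::vector<std::string>& args)
{
    std::vector<std::wstring> wide;
    wide.reserve(args.size() + 1);
    wide.push_back(widenAscii(script));   // becomes sys.argv[0]
    for (const std::string& arg : args)
        wide.push_back(widenAscii(arg));  // becomes sys.argv[1], sys.argv[2], ...
    return wide;
}

// Collect mutable wchar_t* pointers into the storage above.
// The pointers are valid only while `storage` is alive and unmodified.
std::vector<wchar_t*> makeArgvPointers(std::vector<std::wstring>& storage)
{
    std::vector<wchar_t*> pointers;
    pointers.reserve(storage.size());
    for (std::wstring& s : storage)
        pointers.push_back(&s[0]);
    return pointers;
}
```

With the embedding API, the result would presumably be handed over as PySys_SetArgv(static_cast<int>(pointers.size()), pointers.data()) after Py_Initialize(), which avoids both the cast and the const problem mentioned above.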



  • @jeremy_k said in Run Python3 C Interface script with arguments:

    If the goal is to retrieve the data rather than to use Python, this algorithm should be easy to convert to Qt C++ API calls. QNetworkAccessManager::get() and QJsonDocument::fromJson() will cover most of what is needed.

    Thanks for the tip, I was able to do just that:

     QNetworkAccessManager* net = new QNetworkAccessManager(this);

     // Connect before sending the request so the finished signal cannot be missed.
     connect(net, &QNetworkAccessManager::finished, this, [](QNetworkReply* reply)
     {
         reply->deleteLater(); // the reply must be deleted by the caller

         QJsonParseError parseError;
         const QJsonDocument doc = QJsonDocument::fromJson(reply->readAll(), &parseError);

         if (!doc.isNull())
         {
             QJsonObject videoInfoJson = doc.object();
             qDebug() << videoInfoJson.value("title").toString();
         }
     });

     net->get(QNetworkRequest(QUrl("https://noembed.com/embed?url=https://www.youtube.com/watch?v=dQw4w9WgXcQ")));
    

    However, I noticed that the Python script, run with QProcess for example, takes half the time. Time is really important in this case.

    Also, argv style arguments are often assumed to be mutable. Casting from a constant can lead to crashes. I don't know if this applies to PySys_SetArgv.

    Is there a way to verify that?

    As a side note, I found a slick way to do what I want: hardcode the Python code in a const char* and pass it to PyRun_SimpleString.


  • Hi,

    You should then have a look at libcurl which is lighter weight than embedding Python in your application.



  • @SGaist Good tip. The cpr C++ wrapper for libcurl seems to be really good for this purpose. I haven't tried it yet though.
    Will post as soon as I have it working.



  • @SGaist Done.

    cpr::Response r = cpr::Get(cpr::Url("https://noembed.com/embed?url=https://www.youtube.com/watch?v=dQw4w9WgXcQ"));

    if (r.status_code == 0)
    {
        std::cerr << r.error.message << std::endl;
    }
    else if (r.status_code >= 400)
    {
        std::cerr << "Error [" << r.status_code << "] making request" << std::endl;
    }
    else
    {
        std::cout << "Request took " << r.elapsed << std::endl;
        std::cout << "Body:" << std::endl << r.text;
    }
    

    Still, the Python script is faster.

    As for the problem in the OP: I found out I could just hardcode the Python code in a const char* and pass it to PyRun_SimpleString.

    // Named runPyScript to match the call below.
    void MainWindow::runPyScript(const char* code)
    {
        Py_Initialize();
        PyRun_SimpleString(code);
        Py_Finalize();
    }
    
     QString url = "https://www.youtube.com/watch?v=bUUZ1iD9_e4";
        QString script = "import urllib.request\n"
                         "import json\n"
                         "import urllib\n"
                         "import sys\n"
    
                         "url = '"+ url + "'\n"
    
                         "params = {\"format\": \"json\", \"url\": url}\n"
                         "url = \"https://www.youtube.com/oembed\"\n"
                         "query_string = urllib.parse.urlencode(params)\n"
                         "url = url + \"?\" + query_string\n"
    
                         "with urllib.request.urlopen(url) as response:\n"
                         "    response_text = response.read()\n"
                         "    data = json.loads(response_text.decode())\n"
                         "    print(data['title'])\n"
                         "    f = open(\"" +AppDataLocation +  "/demofile2.txt\", \"w\")\n"
                         "    f.write(data['title'])\n"
                         "    f.close()";
    
    runPyScript(script.toStdString().c_str());
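One caveat with splicing values into the script text: a URL or path that contains a single quote or a backslash would end the Python string literal early or change its meaning. A minimal sketch of escaping a value before it is inserted follows. The helper name toPythonLiteral is hypothetical, and the sketch assumes the value contains no embedded NUL bytes, since PyRun_SimpleString takes a C string.

```cpp
#include <string>

// Return `text` wrapped as a single-quoted Python string literal.
// Backslashes, single quotes and line breaks are escaped so the value
// cannot terminate the literal early or span multiple source lines.
std::string toPythonLiteral(const std::string& text)
{
    std::string out = "'";
    for (char c : text) {
        switch (c) {
        case '\\': out += "\\\\"; break;  // backslash  -> two backslashes
        case '\'': out += "\\'";  break;  // quote      -> backslash + quote
        case '\n': out += "\\n";  break;  // line feed  -> backslash + n
        case '\r': out += "\\r";  break;  // carriage return -> backslash + r
        default:   out += c;      break;
        }
    }
    out += "'";
    return out;
}
```

In the script above, the line that sets url could then be built as "url = " + toPythonLiteral(url.toStdString()) + "\n", and the file path could be handled the same way.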
    


  • @hbatalha said in Run Python3 C Interface script with arguments:

    However, I noticed that the Python script, run with QProcess for example, takes half the time. Time is really important in this case.

    That's surprising. Is this difference consistent across multiple runs? Usually code precompiled to machine language is significantly faster. Profiling could help to clarify what is going on.

    In this case, the wall clock time should be spent predominantly waiting on network traffic.

    Also, argv style arguments are often assumed to be mutable. Casting from a constant can lead to crashes. I don't know if this applies to PySys_SetArgv.

    Is there a way to verify that?

    Read the relevant documentation or look at the function declaration.



  • @jeremy_k said in Run Python3 C Interface script with arguments:

    Is this difference consistent across multiple runs?

    I ran it a few times, but not enough I guess. I will run some more tests and get back to you.



  • @hbatalha If it's important, you should try to profile the code and determine where the time is spent. For Python, profile and cProfile are easy to use. I like snakeviz for viewing the results. For C++, it depends on what platform you are using. On macOS and Linux, Qt Creator knows how to use Valgrind. A low latency timer like QElapsedTimer can work for simple profiling.

    Use a network capture tool like Wireshark to see the various parts of the network traffic. The first run will have to wait for the DNS query response. Subsequent executions might be able to take advantage of the DNS cache, and appear faster as a result. This will be visible in the capture.

    Then again, sometimes any working implementation is good enough.
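As a rough illustration of the simple-timer approach, here is a minimal sketch that times several runs and averages all but the first, so that one-off costs such as the initial DNS lookup do not skew the comparison. It uses std::chrono::steady_clock as a standard-library stand-in for QElapsedTimer; the names timeRuns and warmMeanMs are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Run `task` `runs` times and return each run's wall-clock duration in milliseconds.
std::vector<double> timeRuns(const std::function<void()>& task, int runs)
{
    assert(task && "task must be callable");
    std::vector<double> elapsedMs;
    elapsedMs.reserve(static_cast<std::size_t>(runs > 0 ? runs : 0));
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        task();
        const auto stop = std::chrono::steady_clock::now();
        elapsedMs.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return elapsedMs;
}

// Mean of every run except the first, which may include one-off costs
// such as a cold DNS lookup. Returns 0.0 if there are fewer than two runs.
double warmMeanMs(const std::vector<double>& elapsedMs)
{
    if (elapsedMs.size() < 2)
        return 0.0;
    double sum = 0.0;
    for (std::size_t i = 1; i < elapsedMs.size(); ++i)
        sum += elapsedMs[i];
    return sum / static_cast<double>(elapsedMs.size() - 1);
}
```

The task passed in has to block until the work is actually finished. For an asynchronous call such as QNetworkAccessManager::get() that would mean waiting for the reply inside the task (for example with a local QEventLoop), otherwise only the time to queue the request is measured.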



  • @jeremy_k

    If it's important, you should try to profile the code and determine where the time is spent.

    It is not that important right now, mostly because I don't have much time and would have to learn how to use all the tools you mentioned. It is worth noting that I don't speak Python; I just google the script I need for some task and then use it. As I know C++, reading Python code is not that difficult.
    However, I already took a look at the tools and had a quick read. Those are tools I will need in the near future. Thank you!

    Right now I am just using QElapsedTimer to time the code.

    I made some more tests comparing QNetworkAccessManager::get and the Python script.
    QNetworkAccessManager always took 1200-1300 milliseconds to get a reply, while the Python script (Edit: run with QProcess) took only 400-500 milliseconds.

    Edit: Using embedded Python, it took even less, roughly 350-430 milliseconds.

    Edit 2: It turns out the Python script makes a GET request that is quicker than the one I was using with QNetworkAccessManager::get. I made that same GET request with QNetworkAccessManager::get and it took 350-400 milliseconds.
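For reference, the two requests compared in this thread differ in the endpoint: script.py queries https://www.youtube.com/oembed with format and url parameters, while the C++ snippets query noembed.com. Below is a minimal standard-library sketch of building the same oembed URL that the script's urlencode call produces for this video URL. The names percentEncode and buildOembedUrl are hypothetical. One known difference from Python's urlencode is that a space is encoded here as %20 rather than +.

```cpp
#include <string>

// Percent-encode every byte except the unreserved characters
// (letters, digits, '-', '_', '.', '~').
std::string percentEncode(const std::string& value)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : value) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        }
    }
    return out;
}

// Build the oembed request URL used by script.py for a given video URL.
std::string buildOembedUrl(const std::string& videoUrl)
{
    return "https://www.youtube.com/oembed?format=json&url=" + percentEncode(videoUrl);
}
```

The resulting string can be passed to QUrl for the QNetworkAccessManager::get() call; in Qt code, QUrlQuery would likely be the more natural way to assemble the query.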

