Download videos from youtube - help.



  • Hello there.
    I am a Qt newbie and I am trying to download videos from YouTube using QNetworkAccessManager.

    First of all, I use a Python script called "youtube-dl" (more info at: "Youtube-dl website":https://github.com/rg3/youtube-dl) to get the real URL of the video.

    To get the real video URL, I do the following:

    @
    void Script_Youtube_Downloader::set_real_url(QString url, QString cfn)
    {

    QObject * parent = new QObject();
    qint64 ll = -1; // stays -1 unless a line is actually read
    char buf[2048];
    QString cookie_param;
    QString program = "/usr/bin/python";
    QStringList arguments;
    cookie_param = "--cookie=";
    cookie_param += cfn;
    arguments << SCRIPT_PATH << cookie_param << "-g" << url;
    qDebug("running");
    qDebug("\n");
    qDebug("Script_path: %s", SCRIPT_PATH );
    QProcess *python_process = new QProcess(parent);
    python_process->setReadChannel(QProcess::StandardOutput); //standard output
    python_process->start(program, arguments);
    if (!python_process->waitForStarted(100))
    {
        qDebug("big error: could not start python. Is the python path right?\n");
        this->real_url = "ERROR";
    }
    else
    {
        if(python_process->state() == 2)
        {
            if(python_process->waitForReadyRead(-1))
            {
               ll = python_process->readLine(buf, sizeof(buf));
            }
            else
                ll = -1;
    
        }
        if (ll != -1) //data captured
        {
            this->real_url = QString(buf);
        }
        else
            this->real_url.append("FUNCTION ERROR");
    }
    python_process->waitForFinished(-1);
    python_process->kill();
    delete python_process;
    delete parent;
    

    }
    @
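
    One detail worth checking in the function above: as far as I know, a line read with readLine() still ends with the newline the script printed, so the stored URL may carry a trailing "\n". Whether that plays any part in the 403 is not established in this thread. A minimal sketch in plain C++ of stripping it (the helper name trim_line is hypothetical; with Qt, QString::trimmed() serves the same purpose):

```cpp
#include <string>

// Strip leading and trailing whitespace (spaces, tabs, CR, LF) from a line
// captured from another process's standard output.
std::string trim_line(const std::string &line)
{
    const char *ws = " \t\r\n";
    const std::string::size_type first = line.find_first_not_of(ws);
    if (first == std::string::npos)
        return std::string();   // the line was empty or all whitespace
    const std::string::size_type last = line.find_last_not_of(ws);
    return line.substr(first, last - first + 1);
}
```

    Applied to the captured buffer, this would hand a clean URL string to QUrl, which matters most when QUrl::StrictMode is used.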

    The real URL is something like this:
    "http://v13.lscache3.c.youtube.com/videoplayback?ip=0.0.0.0&sparams=id,expire,ip,ipbits,itag,algorithm,burst,factor,oc-0.139711U0dYSVNQVl9FSkNNOF9LSlpF&algorithm=throttle-factor&itag=35&ipbits=0&burst=40&sver=3&expire=1294358400&key=yt1&signature=2692D3308AFF7DD578DF45F60E31D1051F5A24AF.7E101F55A1A0E1768DB7E3FFCCD91159F4FFCDE9&factor=1.25&id=cac6fc2f5dfdbda2"

    This script also writes a cookiejar file. I use this cookiejar to try to download the video too.
    It works with curl (a command line tool to transfer data from any server).

    The cookiejar file is something like this:
    " # Netscape HTTP Cookie File
    # http://www.netscape.com/newsref/std/cookie_spec.html
    # This is a generated file! Do not edit.

    .youtube.com TRUE / FALSE 1609612543 PREF f1=50000000&gl=US&hl=en
    .youtube.com TRUE / FALSE 1314988543 VISITOR_INFO1_LIVE FVfDznMtcAM"

    Before starting the download, I parse this cookiejar file and make a QList<QNetworkCookie> variable.
    Then I set this variable to a QNetworkCookieJar variable:
    @
    if (cookie_JAR.setCookiesFromUrl(this->parse_my_cookiejar(this->get_cookie_info()), url))
    qDebug("cookie JAr OK!\n");
    else
    qDebug("COOKIE_JAR_ERROR\n");
    manager->setCookieJar(&cookie_JAR);
    @
    Then, I try to download the video file:
    @
    reply = manager->get(request);
    connect(reply, SIGNAL(downloadProgress(qint64,qint64)), this, SLOT(printProgress(qint64, qint64)));
    connect(reply, SIGNAL(error(QNetworkReply::NetworkError)), this, SLOT(printNotOk(QNetworkReply::NetworkError)));
    connect(manager, SIGNAL(finished(QNetworkReply*)), this, SLOT(videoDownloaded(QNetworkReply*)));
    @
    The output is something like this:

    "
    An error HAS SERIOUSLY OCCURED:
    bytesReceived: 0 / bytesTotal: 0
    received "finished" SIGNAL.

    And error occured while I was trying to download
    Http attribute:
    HttpStatusCode-->403
    "
    I do not know what is happening. It just doesn't download. Can someone help me? Thanks in advance.



  • HTTP Error 403 Forbidden

    When I try to access that link from my browser I get the same error, so your cookie trick is probably not working as expected. You should start by looking into that.



  • Thanks for replying, ian.todorovich.
    I looked into it, and the cookies are correct.
    When I pass the cookies and the URL together to curl, it follows a URL redirect before downloading the video.

    I think I'm having problems with the URL redirection, or my cookiejar parser is wrong somewhere:
    @
    QList<QNetworkCookie> Script_Youtube_Downloader::parse_my_cookiejar(QString cookie_jar_string)
    {
    QNetworkCookie ck1, ck2;
    QList<QNetworkCookie> Cookie_List;
    bool ok;
    QDateTime cookie_date;

    QList<QString> linhas = cookie_jar_string.split("\n", QString::SkipEmptyParts);

    // qDebug("Taking first cookie");
    QList<QString> linhas_cookie1 = linhas[3].split("\t", QString::SkipEmptyParts);

    // FIRST COOKIE:
    //set domain:
    ck1.setDomain(linhas_cookie1[0]);
    //set HTTPONLY:

    if (linhas_cookie1[1] == "TRUE")
    {
    qDebug("cookie 1 http only : True");
    ck1.setHttpOnly(true);
    }
    else
    {
    ck1.setHttpOnly(false);
    qDebug("cookie 1 http only: False");
    }
    //set path:

    ck1.setPath(linhas_cookie1[2]);
    //set secure

    if (linhas_cookie1[3] == "TRUE")
    {
    ck1.setSecure(true);
    qDebug("cookie 1 secure: True");
    }
    else
    {
    ck1.setSecure(false);
    qDebug("cookie 1 secure: False");
    }

    //set expiration date:

    cookie_date.setTime_t((unsigned int) linhas_cookie1[4].toInt(&ok, 10));
    if (ok)
    ck1.setExpirationDate(cookie_date);
    else
    qDebug("An error while I was trying to set date for cookie 1 has happened");
    //set name:

    ck1.setName(linhas_cookie1[5].toUtf8());
    //set value:

    ck1.setValue(linhas_cookie1[6].toUtf8());

    //inserting cookie 1 on the list:
    Cookie_List.append(ck1);

    //SECOND COOKIE:

    QList<QString> linhas_cookie2 = linhas[4].split("\t", QString::SkipEmptyParts);
    //set domain:
    ck2.setDomain(linhas_cookie2[0]);
    //set HTTPOnly
    //qDebug(linhas_cookie2[1].toUtf8());
    if (linhas_cookie2[1] == "TRUE")
    {
    ck2.setHttpOnly(true);
    qDebug("Cookie 2 http only: True");
    }
    else
    {
    ck2.setHttpOnly(false);
    qDebug("Cookie 2 http only: False");
    }
    //set path:
    ck2.setPath(linhas_cookie2[2]);
    //set secure:

    if (linhas_cookie2[3] == "TRUE")
    {
    ck2.setSecure(true);
    qDebug("Cookie 2 secure: True");
    }
    else
    {
    ck2.setSecure(false);
    qDebug("Cookie 2 secure: False");
    }
    //set expiration time:

    cookie_date.setTime_t((unsigned int) linhas_cookie2[4].toInt(&ok, 10));
    if (ok)
    ck2.setExpirationDate(cookie_date);
    else
    qDebug("An error has happened while I was trying to set date in cookie 2");

    //set name:
    ck2.setName(linhas_cookie2[5].toUtf8());
    //set value:
    ck2.setValue(linhas_cookie2[6].toUtf8());

    //inserting cookie 2 on the list:
    Cookie_List.append(ck2);

    return Cookie_List;
    }
    @
    The QString variable:
    @QString cookie_jar_string@
    Is the string captured from the cookiejar file.
    I tested this code and everything seems ok.
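
    For comparison, here is a minimal sketch in plain C++ of parsing the whole file in a loop instead of reading fixed line indices. It assumes the usual Netscape cookie-file layout of seven tab-separated fields: domain, subdomain flag, path, secure, expiry, name, value. Note that in that layout the second column says whether subdomains are included; it is not an HttpOnly flag. The names CookieRecord, split_tabs and parse_cookie_file are hypothetical, and lines starting with "#" are simply treated as comments.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// One record of a Netscape-format cookie file, in file column order.
struct CookieRecord {
    std::string domain;
    bool include_subdomains;   // second column; not an HttpOnly flag
    std::string path;
    bool secure;
    long long expires;         // seconds since the Unix epoch
    std::string name;
    std::string value;
};

// Split one line on tab characters, keeping empty fields.
std::vector<std::string> split_tabs(const std::string &line)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type tab = line.find('\t', start);
        if (tab == std::string::npos) {
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, tab - start));
        start = tab + 1;
    }
    return fields;
}

// Parse every cookie line, skipping blank lines, comment lines, and lines
// that do not have exactly seven tab-separated fields.
std::vector<CookieRecord> parse_cookie_file(const std::string &text)
{
    std::vector<CookieRecord> cookies;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line[line.size() - 1] == '\r')
            line.erase(line.size() - 1);   // tolerate CRLF line endings
        if (line.empty() || line[0] == '#')
            continue;
        const std::vector<std::string> f = split_tabs(line);
        if (f.size() != 7)
            continue;
        CookieRecord c;
        c.domain = f[0];
        c.include_subdomains = (f[1] == "TRUE");
        c.path = f[2];
        c.secure = (f[3] == "TRUE");
        c.expires = std::strtoll(f[4].c_str(), 0, 10);   // 0 if not numeric
        c.name = f[5];
        c.value = f[6];
        cookies.push_back(c);
    }
    return cookies;
}
```

    Each record could then be copied into a QNetworkCookie. Looping this way also keeps working if the script writes more or fewer than two cookies.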

    Does QNetworkAccessManager need to post a cookie request before downloading the video?
    I've done this:
    @
    if (cookie_JAR.setCookiesFromUrl(this->parse_my_cookiejar(this->get_cookie_info()),
    passed_url))
    qDebug("cookie JAr OK!\n");
    else
    qDebug("COOKIE_JAR_ERROR\n");
    @
    And it returns "cookie JAR OK!".

    Now I really don't know what to do.
    =(



  • Just one more question:
    When I set a cookiejar for QNetworkAccessManager:
    @
    QNetworkAccessManager * manager = new QNetworkAccessManager(this);
    manager->setCookieJar(&cookie_JAR);
    @
    Do I need to post the cookie info before downloading the video?



  • [quote author="skycrusher" date="1294345809"]Just one more question:
    When I set a cookiejar for QNetworkAccessManager:
    @
    QNetworkAccessManager * manager = new QNetworkAccessManager(this);
    manager->setCookieJar(&cookie_JAR);
    @
    Do I need to post the cookie info before downloading the video?[/quote]

    What do you mean by post the "cookie info"? You need to add the cookies to the cookie jar and set the cookie jar on the manager (in no particular order) before you use the manager to get the file. The manager will take care of sending the cookies based on the domain.
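
    To illustrate what "based on the domain" means, here is a deliberately simplified sketch in plain C++. It is not Qt's actual matching code, it ignores case and path rules, and the function name cookie_domain_matches is hypothetical. The idea is that a cookie stored for ".youtube.com" also applies to hosts such as "v13.lscache3.c.youtube.com":

```cpp
#include <string>

// Simplified domain match: a cookie domain with a leading dot, such as
// ".youtube.com", applies to "youtube.com" itself and to any host ending in
// ".youtube.com". A domain without a leading dot must match the host exactly.
bool cookie_domain_matches(const std::string &cookie_domain,
                           const std::string &host)
{
    if (cookie_domain.empty() || host.empty())
        return false;
    const bool dotted = (cookie_domain[0] == '.');
    const std::string bare = dotted ? cookie_domain.substr(1) : cookie_domain;
    if (bare.empty())
        return false;
    if (host == bare)
        return true;
    if (!dotted)
        return false;   // host-only cookie: exact match required
    const std::string suffix = "." + bare;
    return host.size() > suffix.size()
        && host.compare(host.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

    Under this simplified rule, the two cookies from the jar shown earlier would be sent to the videoplayback host without any extra "post" step.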

    One question though (I must be missing something): how do you get the cookies in the first place? Is it just a simple request, or is there some scraping to find the cookies in the response? If a request returns the cookies (as cookies), you are all set for the second request.



  • Hello, fcrochik. Thanks for replying.
    I thought I had to post the cookies names/values to download the video.
    If you're saying that
    @
    manager->setCookieJar(&cookie_JAR);
    @

    is enough, so be it! =)
    But even so, I still can't download the video.

    Answering your question: I get the cookies (and the real video url) from a python script called "youtube-dl.":https://github.com/rg3/youtube-dl

    To make things clearer, I'll show you an example:
    In a command line:
    @
    ./youtube-dl --cookies=cookiejarfile -g 'http://www.youtube.com/watch?v=OYjZK_6i37M&feature=fvst'
    @
    I get the link:
    @
    "http://v2.lscache6.c.youtube.com/videoplayback?ip=0.0.0.0&sparams=id,expire,ip,ipbits,itag,algorithm,burst,factor,oc:U0dYSVNTTl9FSkNNOF9LTVJB&fexp=905500&algorithm=throttle-factor&itag=35&ipbits=0&burst=40&sver=3&expire=1294380000&key=yt1&signature=267F9CBE2AEAF4D1459FBC7369BDCB88DEEF96F7.7E6725F492AB86449870565C68653D73C3F4131B&factor=1.25&id=3988d92bfea2dfb3"
    @

    And this python script also writes a cookie jar file. This file's contents are:
    @
    # Netscape HTTP Cookie File
    # http://www.netscape.com/newsref/std/cookie_spec.html
    # This is a generated file! Do not edit.

    .youtube.com TRUE / FALSE 1609716210 PREF f1=50000000&gl=US&hl=en
    .youtube.com TRUE / FALSE 1315092210 VISITOR_INFO1_LIVE TnDv4gFkTIA
    @
    I read this file, parse its two cookies, and add them to a QNetworkCookieJar.
    I set this QNetworkCookieJar on a QNetworkAccessManager.
    Then, with a QNetworkRequest, I use the video URL and try to download, with no success. =(
    The script really works. I tested with "Curl":https://github.com/shuber/curl and the video was downloaded just fine.

    I'm trying to download the video with Qt, but I don't know what's wrong. =/



  • It looks like you create the QNetworkCookieJar object on the stack, because you get the pointer with the & operator:

    @
    manager->setCookieJar(&cookie_JAR)
    @

    If you do so, it is wrong. You have to create the cookie jar on the heap with new:

    @
    QNetworkCookieJar *cookie_JAR = new QNetworkCookieJar(this);
    // ...
    manager->setCookieJar(cookie_JAR);
    @

    The reason is in the API docs for "QNetworkAccessManager::setCookieJar() ":http://doc.qt.nokia.com/stable/qnetworkaccessmanager.html#setCookieJar:

    bq. Note: QNetworkAccessManager takes ownership of the cookieJar object.

    And being QObject based, it is always a good idea to create QNetworkCookieJar on the heap.

    Also, the object you created on the stack might go out of scope (and be deleted) before QNetworkAccessManager accesses it.
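
    The ownership point can be shown without Qt. The sketch below uses hypothetical stand-in classes (Jar and Manager are not Qt types): the manager deletes whatever jar it is handed, so the jar has to come from new.

```cpp
#include <memory>

// Stand-in for a cookie jar; counts live instances so that ownership
// transfers can be observed.
struct Jar {
    static int live;
    Jar() { ++live; }
    ~Jar() { --live; }
};
int Jar::live = 0;

// Stand-in for a manager that takes ownership of the jar it is given.
class Manager {
public:
    // The caller hands over a heap-allocated jar; the manager deletes it
    // when it is replaced or when the manager itself is destroyed.
    void setJar(Jar *jar) { jar_.reset(jar); }
    bool hasJar() const { return jar_ != nullptr; }
private:
    std::unique_ptr<Jar> jar_;
};

// Passing the address of a stack object, as in m.setJar(&stack_jar), would
// make the manager call delete on memory that did not come from new, which
// is undefined behavior. Heap allocation avoids that:
//     Manager m;
//     m.setJar(new Jar);
```

    This only mirrors the documented "takes ownership" note. How Qt actually behaves when given a stack object is not something this sketch demonstrates.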



  • [quote author="Volker" date="1294357703"]
    And being QObject based, it is always a good idea to create QNetworkCookieJar on the heap.

    Also, the object you created on the stack might go out of scope (and be deleted) before QNetworkAccessManager accesses it.

    [/quote]

    Volker: good catch! But I am not sure this is the only problem. I would expect a seg fault...

    Skycrusher: since you are taking the time to write this part as a Qt application, why don't you rewrite the Python script as well? It may actually make the overall process easier and certainly more portable.



  • If you want to get a little crazier, you can check out "cuteTube":http://talk.maemo.org/showthread.php?t=65854

    I haven't tried it and don't know how it works, but it may actually be as simple as recompiling it for your platform.



  • fcrochik: This "cutetube" is awesome, but I want a program a little different than that. :)
    I'm trying to make something like a "virtual video jukebox", but thanks anyway. ;)
    I want this program to be Qt-based because I think it is easier to make GUIs with Qt/C++.

    Volker:
    [quote author="Volker" date="1294357703"]It looks like you create the QNetworkCookieJar object on the stack, because you get the pointer with the & operator:

    @
    manager->setCookieJar(&cookie_JAR)
    @

    If you do so, it is wrong. You have to create the cookie jar on the heap with new.

    [/quote]

    I just made the changes you suggested, and the problem remains.
    Let me show you the code:

    @
    void Script_Youtube_Downloader::do_download()
    {
    qDebug("do_download");

    QNetworkAccessManager * manager = new QNetworkAccessManager();
    QNetworkCookieJar * cookie_JAR = new QNetworkCookieJar();
    QUrl url;
    QUrl passed_url;
    QList<QNetworkCookie> cookies;
    QNetworkRequest* request = new QNetworkRequest();
    cookies = this->parse_my_cookiejar(this->get_cookie_info());
    
    
    passed_url.setUrl(this->get_passed_url(), QUrl::StrictMode);
    url.setUrl(this->get_real_url(), QUrl::StrictMode);
    
    request->setUrl(url);
    
    request->setRawHeader("User-Agent",
        "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101028 Firefox/3.6.12");
    request->setRawHeader("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
    request->setRawHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
    request->setRawHeader("Accept-Encoding", "gzip, deflate");
    request->setRawHeader("Accept-Language", "en-us,en;q=0.5");
    
    if (cookie_JAR->setCookiesFromUrl(cookies, url))
        qDebug("cookie JAr OK!\n");
    else
        qDebug("COOKIE_JAR_ERROR\n");
    manager->setCookieJar(cookie_JAR);
    
    if (this->get_cookie_info().isEmpty())
        qDebug("EMPTY cookie jar from python");
    request->setUrl(url);
    
    reply = manager->get(*request);
    //"reply" is a public QNetworkReply attribute of this class
    reply->waitForReadyRead(-1);
    
    
    qDebug("Is it any redirection address?");
    qDebug(reply->attribute(QNetworkRequest::RedirectionTargetAttribute).toUrl().toString().toUtf8());
    connect(reply, SIGNAL(downloadProgress(qint64,qint64)), this, SLOT(printProgress(qint64, qint64)));
    connect(reply, SIGNAL(readyRead()), this, SLOT(printOk()));
    connect(manager, SIGNAL(finished(QNetworkReply*)), this, SLOT(videoDownloaded(QNetworkReply*)));
    connect(reply, SIGNAL(readChannelFinished()), this, SLOT(printOk()) );
    connect(reply, SIGNAL(error(QNetworkReply::NetworkError)), this, SLOT(printNotOk(QNetworkReply::NetworkError)));
    

    }
    @

    And the program's debug output is:
    @
    /*
    do_download
    Is it any redirection address?

    An error HAS SERIOUSLY OCCURED: 202
    Redirection URL is EMPTY...
    And error occured while I was trying to download
    Http attribute:
    HttpStatusCode-->403
    */
    @



  • Why the waitForReadyRead? Especially before you connect the signals...

    Try removing it.

    202 is not an error... How did you get two status codes (202, 403) for the same request?



  • Fcrochik,

    I just removed the "waitForReadyRead" call.

    The error persists.

    Actually, 202 is a QNetworkReply error code (QNetworkReply::ContentOperationNotPermittedError: the operation requested on the remote content is not permitted). Look at what I wrote in the "printNotOk" slot:
    @
    void Script_Youtube_Downloader::printNotOk(QNetworkReply::NetworkError error)
    {
    qDebug("An error HAS SERIOUSLY OCCURED: " + QString::number(error).toUtf8());
    }
    @

    And 403 is the HTTP status code (Forbidden: the server refused the request).
    Look at the "videoDownloaded" slot implementation:
    @
    void Script_Youtube_Downloader::videoDownloaded(QNetworkReply* reply)
    {
    qDebug("received \"finished\" SIGNAL.\n");

    QUrl url;
    QUrl redir;
    
    redir = reply->attribute(QNetworkRequest::RedirectionTargetAttribute).toUrl();
    url = reply->url();
    
    if(!url.isValid())
        qDebug("Reply's url is not valid! Review your code.");
    
    if (redir.isEmpty())
    {
        qDebug("Redirection URL is EMPTY...");
    }
    
    if (reply->error())
    {
        qDebug("And error occured while I was trying to download");
        qDebug("Http attribute:");
        qDebug("HttpStatusCode-->%d", reply->attribute(QNetworkRequest::HttpStatusCodeAttribute).toInt());
    }
    else
    {
        if (saveToDisk("SALVA_DIABO!!!", reply))
            qDebug("file saved to: SALVA_DIABO" );
    }
    

    }
    @
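
    On the redirection question raised earlier: as far as I know, QNetworkAccessManager in Qt 4 does not follow redirects on its own. It only reports the target through RedirectionTargetAttribute, and the application has to issue a new request. In the output above the redirect target was empty, so this may well not be the problem here. The general pattern is sketched below in plain C++. Response, fetch and fetch_following_redirects are hypothetical names, and fetch stands in for whatever actually performs the request.

```cpp
#include <functional>
#include <string>

// Minimal view of a reply: the HTTP status and the redirect target, if any.
struct Response {
    int status;             // HTTP status code
    std::string location;   // redirect target; empty when there is none
};

// Follow redirects by hand: issue a request and, while the response names a
// redirect target, request that target instead, at most max_hops times.
// Returns the last response; *final_url receives the URL that produced it.
Response fetch_following_redirects(
    const std::function<Response(const std::string &)> &fetch,
    std::string url, int max_hops, std::string *final_url)
{
    Response r = fetch(url);
    for (int hop = 0; hop < max_hops && !r.location.empty(); ++hop) {
        url = r.location;
        r = fetch(url);
    }
    if (final_url)
        *final_url = url;
    return r;
}
```

    In the slot above, the equivalent step would be to check whether redir is non-empty and, if so, start a new get() on it with the same manager so the cookie jar is reused. The hop limit guards against redirect loops.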



  • There is a nice technique that you can use to explore what is going on. Instead of using the network access manager directly, use a QWebView. You can get access to its network access manager (through its page), set the cookie jar, and then use the web view to navigate to the URL. This puts WebKit in charge of all the request handling, and you will be able to verify whether the cookie/request handling is working.

    p.s. Gostei do "Salva diabo"! (Portuguese)



  • I am having almost the same problem. I am decrypting the real URL of the YouTube video myself, the same way JDownloader does, as follows:

    • Use QNetworkAccessManager to access the HTML content of the youtube URL, say: www.youtube.com/watch?v=tBF6MA0xfps

    • Analyze HTML and construct the video's real URL (a very long URL).

    After the above two steps, I have the video's real URL as well as the QNetworkCookieJar from the QNetworkAccessManager I used.

    However, I get error 202 from QNetworkReply when I try to download from the real URL using the same QNetworkAccessManager and the same QNetworkCookieJar.

    The decrypted URL can be played directly by Chrome, Safari, and VLC, and downloaded by wget, curl, and libcurl. This suggests that accessing the decrypted URL does not require any cookie magic.
    I tried using the same or a different QNetworkAccessManager, with the same or a different QNetworkCookieJar, to access the decrypted URL, but I always got error 202.
    I suspect there is a bug in QNetworkAccessManager.

    I also tried Qt 4.8.1 on Windows 7 and Qt 4.7.4 on Mac OS X 10.7, and got the same error 202, so the bug is probably in Qt's OS-independent code path.



