Wednesday, August 4, 2010

Downloading and converting multiple flv files from an external website

I wanted to download all the flv files shown on a website outside my control (in my case, to be able to play them offline at any time).

To do this, I wrote a script that scrapes every page of the site (the page URLs follow a simple format), extracts the URL of the flv file from the HTML, downloads each file and converts it to a standard avi format.

This is the Python script; feel free to adapt it to your needs:

import urllib2
import os
import re

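base = ""  # root URL of the site, without a trailing slash

for id in range(1, 422):
    # Fetch the HTML page for this id
    page = urllib2.urlopen(base + "/melodias.php?id=" + str(id))
    html = page.read()
    # The player setup call in the page carries the path of the flv file
    m = re.search("so.addVariable\(\"file\",\"(.*)&autostart=true\"\);", html)
    print "Downloaded HTML Page " + str(id)
    if m:
        print "Found flv reference " + m.group(1)
        url = base + m.group(1)
        output = "video_" + str(id) + ".avi"
        # Stream the download straight into ffmpeg for conversion
        os.system("wget \"" + url + "\" -O - | ffmpeg -i pipe: " + output)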
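
If you prefer Python 3, the extraction and download steps could look roughly like the sketch below. It is only a variation under a few assumptions, not the script I actually ran: the function names (extract_flv_path, build_commands, download_and_convert) are made up for illustration, the regular expression is the same one as above rewritten as a raw string with a non-greedy group, and the wget | ffmpeg pipe is started with subprocess instead of os.system so the URL never passes through a shell.

```python
import re
import subprocess

# Same pattern as the script above, written as a raw string. The non-greedy
# group stops at the first "&autostart=true" (an assumption about the page).
FLV_PATTERN = re.compile(r'so\.addVariable\("file","(.*?)&autostart=true"\);')


def extract_flv_path(html):
    """Return the flv path embedded in the page, or None if there is none."""
    match = FLV_PATTERN.search(html)
    return match.group(1) if match else None


def build_commands(url, output):
    """Build the wget and ffmpeg argument lists for one video."""
    wget_cmd = ["wget", url, "-O", "-"]             # write the download to stdout
    ffmpeg_cmd = ["ffmpeg", "-i", "pipe:", output]  # read the input from stdin
    return wget_cmd, ffmpeg_cmd


def download_and_convert(url, output):
    """Pipe wget's output into ffmpeg without going through a shell."""
    wget_cmd, ffmpeg_cmd = build_commands(url, output)
    wget = subprocess.Popen(wget_cmd, stdout=subprocess.PIPE)
    ffmpeg = subprocess.Popen(ffmpeg_cmd, stdin=wget.stdout)
    wget.stdout.close()  # let wget receive SIGPIPE if ffmpeg exits early
    ffmpeg.wait()
    wget.wait()
    return ffmpeg.returncode
```

For a page containing so.addVariable("file","/videos/melodia_1.flv&autostart=true"); (a made-up path), extract_flv_path returns "/videos/melodia_1.flv", and it returns None when the page has no player call. Passing argument lists to subprocess also avoids the manual quoting of the URL that the os.system version needs.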