Downloading a picture via urllib and Python


So I'm trying to make a Python script that downloads webcomics and puts them in a folder on my desktop. I've found a few programs on here that do something similar, but nothing quite like what I need. The closest one I found is here (http://bytes.com/topic/python/answers/850927-problem-using-urllib-download-images). I tried using this code:

>>> import urllib 
>>> image = urllib.URLopener() 
>>> image.retrieve("http://www.gunnerkrigg.com//comics/00000001.jpg","00000001.jpg") 
('00000001.jpg', <httplib.HTTPMessage instance at 0x1457a80>)

I then searched my computer for a file "00000001.jpg", but all I found was the cached picture of it. I'm not even sure it saved the file to my computer. Once I understand how to get the file downloaded, I think I know how to handle the rest. Essentially, just use a for loop, split the string at the '00000000'.'jpg', and increment the '00000000' up to the largest number, which I would have to determine somehow. Any recommendations on the best way to do this, or on how to download the file correctly?
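
For what it's worth, a relative file name such as "00000001.jpg" is resolved against the interpreter's current working directory, so that is the first place to look for the file. Below is a minimal sketch of saving to an explicit folder instead. The helper names `target_path` and `download_to` are made up for illustration and are not part of urllib. It uses `urlretrieve` rather than `URLopener`, with an import that should work on both Python 2 and Python 3.

```python
import os

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2, as in the session above


def target_path(folder, file_name):
    """Return the absolute path a download would be written to (hypothetical helper)."""
    return os.path.join(os.path.abspath(folder), file_name)


def download_to(folder, url, file_name):
    """Download url into folder and return the absolute path of the saved file."""
    path = target_path(folder, file_name)
    urlretrieve(url, path)  # writes the response body to path
    return path


if __name__ == "__main__":
    # A relative file name resolves against this directory.
    print(os.getcwd())
    print(target_path(".", "00000001.jpg"))
```

Calling `download_to` with an absolute folder (for example the desktop folder) and the comic URL from the session above would then save the image there, assuming the folder already exists.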

Thanks!

EDIT 6/15/10

Here is the completed script; it saves the files to any directory you choose. For some odd reason the files weren't downloading at first, and then they just did. Any suggestions on how to clean it up would be much appreciated. I'm currently working out how to find how many comics exist on the site, so that I can get just the latest one rather than having the program quit after a certain number of exceptions.

import urllib 
import os 

comicCounter=len(os.listdir('/file'))+1 # reads the number of files in the folder to start downloading at the next comic 
errorCount=0 

def download_comic(url,comicName): 
    """ 
    download a comic in the form of 

    url = http://www.example.com 
    comicName = '00000000.jpg' 
    """ 
    image=urllib.URLopener() 
    image.retrieve(url,comicName) # download comicName at URL 

while comicCounter <= 1000: # not the most elegant solution
    os.chdir('/file') # set where files download to
    try:
        if comicCounter < 10: # needed to break into 10^n segments because comic names are a set of zeros followed by a number
            comicNumber=str('0000000'+str(comicCounter)) # string containing the eight digit comic number
            comicName=str(comicNumber+".jpg") # string containing the file name
            url=str("http://www.gunnerkrigg.com//comics/"+comicName) # creates the URL for the comic
            comicCounter+=1 # increments the comic counter to go to the next comic, must be before the download in case the download raises an exception
            download_comic(url,comicName) # uses the function defined above to download the comic
            print url
        elif 10 <= comicCounter < 100:
            comicNumber=str('000000'+str(comicCounter))
            comicName=str(comicNumber+".jpg")
            url=str("http://www.gunnerkrigg.com//comics/"+comicName)
            comicCounter+=1
            download_comic(url,comicName)
            print url
        elif 100 <= comicCounter < 1000:
            comicNumber=str('00000'+str(comicCounter))
            comicName=str(comicNumber+".jpg")
            url=str("http://www.gunnerkrigg.com//comics/"+comicName)
            comicCounter+=1
            download_comic(url,comicName)
            print url
        else: # leave the loop if any number outside this range shows up (a bare `quit` does nothing)
            break
    except IOError: # urllib raises an IOError for a 404 error, when the comic doesn't exist
        errorCount+=1 # add one to the error count
        if errorCount>3: # if more than three errors occur during downloading, stop
            break
        else:
            print str("comic"+ ' ' + str(comicCounter) + ' ' + "does not exist") # otherwise say that the certain comic number doesn't exist
print "all comics are up to date" # prints once the loop has finished
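
The three branches in the script above differ only in how many zeros they prepend, so one cleanup idea is to let `str.zfill` do the padding. This is only a sketch: the helper names `comic_file_name` and `comic_url` are made up for illustration, and the eight-digit width and base URL are taken from the script.

```python
def comic_file_name(number, width=8, extension=".jpg"):
    """Return a zero-padded file name, e.g. 42 -> '00000042.jpg'."""
    return str(number).zfill(width) + extension


def comic_url(number, base="http://www.gunnerkrigg.com//comics/"):
    """Return the URL for comic `number`, using the base URL from the script."""
    return base + comic_file_name(number)


if __name__ == "__main__":
    for n in (1, 10, 123):
        print(comic_url(n))
```

With helpers like these, the loop body would only need to build one name and one URL per iteration, whatever the number of digits.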
