I'm trying to write a Python function that uses an anonymous public proxy to fetch a web page, but I'm getting a rather strange error.
The code (I'm using Python 2.4):
import urllib2

def get_source_html_proxy(url, pip, timeout):
    # timeout is in seconds: the maximum time the code is willing to wait
    # before giving up on a proxy that is not working
    proxy_handler = urllib2.ProxyHandler({'http': pip})
    opener = urllib2.build_opener(proxy_handler)
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    urllib2.install_opener(opener)
    req = urllib2.Request(url)
    sock = urllib2.urlopen(req)
    timp = 0  # a counter that measures the time until the result (web page)
              # is returned
    timpLimita = 50000000 * timeout  # set before the loop so it is always defined
    while 1:
        data = sock.read(1024)
        timp = timp + 1
        if len(data) < 1024:
            break
        if timp == timpLimita:  # 5 millions is about 1 second
            break
    if timp == timpLimita:
        print pip + ": Connection is working, but the webpage is fetched in more than 50 seconds. This proxy returns the following IP: " + str(data)
        return str(data)
    else:
        print "This proxy " + pip + "= good proxy. " + "It returns the following IP: " + str(data)
        return str(data)

# Now I call the function to test it with a single proxy (IP:port) that does not
# need a user name and password (a public high-anonymity proxy).
# (I used a proxy that I know is working: slow, but working.)
rez = get_source_html_proxy("http://www.whatismyip.com/automation/n09230945.asp", "93.84.221.248:3128", 50)
print rez
The error:
Traceback (most recent call last):
  File "./public_html/cgi-bin/teste5.py", line 43, in ?
    rez=get_source_html_proxy("http://www.whatismyip.com/automation/n09230945.asp", "xx.yy.zzz.ww:3128", 50)
  File "./public_html/cgi-bin/teste5.py", line 18, in get_source_html_proxy
    sock=urllib2.urlopen(req)
  File "/usr/lib64/python2.4/urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "/usr/lib64/python2.4/urllib2.py", line 358, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.4/urllib2.py", line 376, in _open
    '_open', req)
  File "/usr/lib64/python2.4/urllib2.py", line 337, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.4/urllib2.py", line 573, in <lambda>
    lambda r, proxy=url, type=type, meth=self.proxy_open: \
  File "/usr/lib64/python2.4/urllib2.py", line 580, in proxy_open
    if '@' in host:
TypeError: iterable argument required
I don't know why the "@" character is a problem (there is no such character in my code. Should there be one?)
Thanks in advance for your valuable help.
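One plausible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the failing test `'@' in host` raises a TypeError when `host` is `None`, and `host` may end up as `None` when the proxy string has no `http://` prefix. The sketch below uses simplified, hypothetical stand-ins (`split_scheme`, `split_host`, `proxy_host`, `normalize_proxy`) for the kind of scheme/host splitting the proxy handler appears to perform; they are not the real `urllib2` functions. It avoids version-specific syntax so it can be tried on other Python versions too.

```python
import re

# Hypothetical, simplified stand-ins for the scheme/host splitting that the
# proxy handler seems to apply to the proxy string. Not the real urllib2 code.

def split_scheme(url):
    """Return (scheme, rest); scheme is None if there is no 'scheme:' prefix."""
    match = re.match(r'^([^/:]+):', url)
    if match:
        return match.group(1).lower(), url[match.end():]
    return None, url

def split_host(rest):
    """Return (host, path); host is None unless rest starts with '//'."""
    match = re.match(r'^//([^/?]*)(.*)$', rest)
    if match:
        return match.group(1), match.group(2)
    return None, rest

def proxy_host(proxy):
    """Return the host part extracted from a proxy string, or None."""
    scheme, rest = split_scheme(proxy)
    host, _ = split_host(rest)
    return host

def normalize_proxy(proxy):
    """Prefix a bare 'ip:port' proxy with 'http://' (hypothetical helper)."""
    if '://' not in proxy:
        return 'http://' + proxy
    return proxy

def contains_at(host):
    """Mimic the failing test from the traceback: raises TypeError for None."""
    return '@' in host
```

Under these assumptions, `proxy_host("93.84.221.248:3128")` yields `None` because the IP address is mistaken for a scheme, while `proxy_host(normalize_proxy("93.84.221.248:3128"))` yields the expected `IP:port` string. Passing the proxy as `"http://93.84.221.248:3128"` would therefore be worth trying; whether it resolves the error on Python 2.4 is not verified here.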
Thanks, but what is the host? The proxy's IP:port? Or the URL? – carmao
A debugger can show you more detail in the traceback. Try winpdb, Wing IDE, or ipython (with '%xmode verbose' and '%debug') – keturn
Thanks for the tips. – carmao
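As a rough illustration of what a debugger adds to a plain traceback, the sketch below uses only the standard library to read the local variables of the innermost frame after an exception. The names `innermost_locals`, `check_host`, and `demo` are hypothetical, and `check_host` merely imitates the failing library line rather than calling `urllib2`.

```python
import sys

def innermost_locals(exc_traceback):
    """Return (function name, locals) for the deepest frame of a traceback."""
    tb = exc_traceback
    while tb.tb_next is not None:
        tb = tb.tb_next
    frame = tb.tb_frame
    return frame.f_code.co_name, dict(frame.f_locals)

def check_host(host):
    # Stand-in for the failing line in the traceback: `'@' in host`.
    return '@' in host

def demo():
    """Trigger the TypeError and report where it happened and with what values."""
    try:
        check_host(None)
    except TypeError:
        return innermost_locals(sys.exc_info()[2])
```

Calling `demo()` returns the name of the function that failed together with its locals, which here shows `host` bound to `None`. In an interactive session, `pdb.pm()` after an uncaught exception offers a similar view without extra code.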