à partir du navigateur d'importation mécaniser br = Navigateur() page = br.open ('http://wow.interzet.ru/news.php?readmore=23') br.form = br.forms() suivant() impression br.form me donne l'erreur suivante:.Comment réparer ce bug ClientForm?
Traceback (most recent call last):
File "C:\Users\roddik\Desktop\mech.py", line 6, in <module>
br.form = br.forms().next()
File "build\bdist.win32\egg\mechanize\_mechanize.py", line 426, in forms
File "D:\py26\lib\site-package\mechanize-0.1.11-py2.6.egg\mechanize\_html.py", line 559, in forms
File "D:\py26\lib\site-packages\mechanize-0.1.11-py2.6.egg\mechanize\_html.py", line 225, in forms
File "D:\py26\lib\site-packages\clientform-0.2.10-py2.6.egg\ClientForm.py", line 967, in ParseResponseEx
File "D:\py26\lib\site-packages\clientform-0.2.10-py2.6.egg\ClientForm.py", line 1100, in _ParseFileEx
File "D:\py26\lib\site-packages\clientform-0.2.10-py2.6.egg\ClientForm.py", line 870, in feed
File "D:\py26\lib\sgmllib.py", line 104, in feed
self.goahead(0)
File "D:\py26\lib\sgmllib.py", line 138, in goahead
k = self.parse_starttag(i)
File "D:\py26\lib\sgmllib.py", line 290, in parse_starttag
self._convert_ref, attrvalue)
File "D:\py26\lib\sgmllib.py", line 302, in _convert_ref
return self.convert_charref(match.group(2)) or \
File "D:\py26\lib\site-packages\clientform-0.2.10-py2.6.egg\ClientForm.py", line 850, in convert_charref
File "D:\py26\lib\site-packages\clientform-0.2.10-py2.6.egg\ClientForm.py", line 244, in unescape_charref
ValueError: invalid literal for int() with base 10: 'e'
Comment puis-je résoudre ce problème?
Edit:
Je l'ai fixé cette façon. Est-ce que c'est bon? Si non, comment plutôt?
import ClientForm
from mechanize import Browser
def myunescape_charref(data, encoding):
if not str(data).isdigit(): return 0
name, base = data, 10
if name.startswith("x"):
name, base= name[1:], 16
uc = unichr(int(name, base))
if encoding is None:
return uc
else:
try:
repl = uc.encode(encoding)
except UnicodeError:
repl = "&#%s;" % data
return repl
ClientForm.unescape_charref = myunescape_charref
Eh bien, je suppose que je n'ai pas besoin de fragments de l'uri du tout. maintenant comment changer de mécaniser pour les remplacer en html avant de l'envoyer au clientform? – Fluffy