How to install Scrapy on Ubuntu 16.04?

I followed the official guide, but got this error message:

The following packages have unmet dependencies: 
scrapy : Depends: python-support (>= 0.90.0) but it is not installable 
     Recommends: python-setuptools but it is not going to be installed 
E: Unable to correct problems, you have held broken packages. 

I then tried sudo apt-get install python-support, but found that Ubuntu 16.04 has removed python-support.

Finally, I tried installing python-setuptools, but it seems that would only install Python 2 instead.

The following additional packages will be installed: 
libpython-stdlib libpython2.7-minimal libpython2.7-stdlib python 
python-minimal python-pkg-resources python2.7 python2.7-minimal 
Suggested packages: 
python-doc python-tk python-setuptools-doc python2.7-doc binutils 
binfmt-support 
The following NEW packages will be installed: 
libpython-stdlib libpython2.7-minimal libpython2.7-stdlib python 
python-minimal python-pkg-resources python-setuptools python2.7 
python2.7-minimal 

What should I do to use Scrapy in a Python 3 environment on Ubuntu 16.04? Thanks.

Answer


You should be fine with:

apt-get install -y \ 
    python3 \ 
    python-dev \ 
    python3-dev 

# for cryptography 
apt-get install -y \ 
    build-essential \ 
    libssl-dev \ 
    libffi-dev 

# for lxml 
apt-get install -y \ 
    libxml2-dev \ 
    libxslt-dev 

# install pip 
apt-get install -y python-pip 
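
Note that the python-pip package above provides pip for Python 2. If you would rather install Scrapy for Python 3 system-wide, instead of inside the virtualenv used later in this answer, you would most likely want python3-pip; a minimal sketch under that assumption (not part of the original answer):

# pip for Python 3 (assumption: system-wide install instead of a virtualenv)
apt-get install -y python3-pip
pip3 install scrapy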

Here is an example Dockerfile for testing a Scrapy install with Python 3 on Ubuntu 16.04/Xenial:

$ cat Dockerfile 
FROM ubuntu:xenial 

ENV DEBIAN_FRONTEND noninteractive 

RUN apt-get update 

# Install Python3 and dev headers 
RUN apt-get install -y \ 
    python3 \ 
    python-dev \ 
    python3-dev 

# Install cryptography 
RUN apt-get install -y \ 
    build-essential \ 
    libssl-dev \ 
    libffi-dev 

# install lxml 
RUN apt-get install -y \ 
    libxml2-dev \ 
    libxslt-dev 

# install pip 
RUN apt-get install -y python-pip 

RUN useradd --create-home --shell /bin/bash scrapyuser 

USER scrapyuser 
WORKDIR /home/scrapyuser 

Then, after building the Docker image and running a container from it with:

$ sudo docker build -t redapple/scrapy-ubuntu-xenial . 
$ sudo docker run -t -i redapple/scrapy-ubuntu-xenial 

you can run pip install scrapy.

Below, I use virtualenvwrapper to create a Python 3 virtualenv:

[email protected]:~$ pip install --user virtualenvwrapper 
Collecting virtualenvwrapper 
    Downloading virtualenvwrapper-4.7.1-py2.py3-none-any.whl 
Collecting virtualenv-clone (from virtualenvwrapper) 
    Downloading virtualenv-clone-0.2.6.tar.gz 
Collecting stevedore (from virtualenvwrapper) 
    Downloading stevedore-1.14.0-py2.py3-none-any.whl 
Collecting virtualenv (from virtualenvwrapper) 
    Downloading virtualenv-15.0.2-py2.py3-none-any.whl (1.8MB) 
    100% |################################| 1.8MB 320kB/s 
Collecting pbr>=1.6 (from stevedore->virtualenvwrapper) 
    Downloading pbr-1.10.0-py2.py3-none-any.whl (96kB) 
    100% |################################| 102kB 1.5MB/s 
Collecting six>=1.9.0 (from stevedore->virtualenvwrapper) 
    Downloading six-1.10.0-py2.py3-none-any.whl 
Building wheels for collected packages: virtualenv-clone 
    Running setup.py bdist_wheel for virtualenv-clone ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/24/51/ef/93120d304d240b4b6c2066454250a1626e04f73d34417b956d 
Successfully built virtualenv-clone 
Installing collected packages: virtualenv-clone, pbr, six, stevedore, virtualenv, virtualenvwrapper 
Successfully installed pbr six stevedore virtualenv virtualenv-clone virtualenvwrapper 
You are using pip version 8.1.1, however version 8.1.2 is available. 
You should consider upgrading via the 'pip install --upgrade pip' command. 
[email protected]:~$ source ~/.local/bin/virtualenvwrapper.sh 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/premkproject 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postmkproject 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/initialize 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/premkvirtualenv 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postmkvirtualenv 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/prermvirtualenv 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postrmvirtualenv 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/predeactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postdeactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/preactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/get_env_details 
[email protected]:~$ export PATH=$PATH:/home/scrapyuser/.local/bin 
[email protected]:~$ mkvirtualenv --python=/usr/bin/python3 scrapy11.py3 
Running virtualenv with interpreter /usr/bin/python3 
Using base prefix '/usr' 
New python executable in /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/python3 
Also creating executable in /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/python 
Installing setuptools, pip, wheel...done. 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/predeactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/postdeactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/preactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/postactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/get_env_details 
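
If you prefer not to use virtualenvwrapper, the stock venv module should work as well; a minimal sketch, assuming the python3-venv package is available (this is not part of the original answer):

# create and activate a Python 3 virtual environment with the built-in venv module
sudo apt-get install -y python3-venv
python3 -m venv ~/venvs/scrapy11.py3
source ~/venvs/scrapy11.py3/bin/activate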

Either way, installing Scrapy 1.1 is then just a matter of pip install scrapy:

(scrapy11.py3) [email protected]:~$ pip install scrapy 
Collecting scrapy 
    Downloading Scrapy-1.1.0-py2.py3-none-any.whl (294kB) 
    100% |################################| 296kB 1.0MB/s 
Collecting PyDispatcher>=2.0.5 (from scrapy) 
    Downloading PyDispatcher-2.0.5.tar.gz 
Collecting pyOpenSSL (from scrapy) 
    Downloading pyOpenSSL-16.0.0-py2.py3-none-any.whl (45kB) 
    100% |################################| 51kB 1.8MB/s 
Collecting lxml (from scrapy) 
    Downloading lxml-3.6.0.tar.gz (3.7MB) 
    100% |################################| 3.7MB 312kB/s 
Collecting parsel>=0.9.3 (from scrapy) 
    Downloading parsel-1.0.2-py2.py3-none-any.whl 
Collecting six>=1.5.2 (from scrapy) 
    Using cached six-1.10.0-py2.py3-none-any.whl 
Collecting Twisted>=10.0.0 (from scrapy) 
    Downloading Twisted-16.2.0.tar.bz2 (2.9MB) 
    100% |################################| 2.9MB 307kB/s 
Collecting queuelib (from scrapy) 
    Downloading queuelib-1.4.2-py2.py3-none-any.whl 
Collecting cssselect>=0.9 (from scrapy) 
    Downloading cssselect-0.9.1.tar.gz 
Collecting w3lib>=1.14.2 (from scrapy) 
    Downloading w3lib-1.14.2-py2.py3-none-any.whl 
Collecting service-identity (from scrapy) 
    Downloading service_identity-16.0.0-py2.py3-none-any.whl 
Collecting cryptography>=1.3 (from pyOpenSSL->scrapy) 
    Downloading cryptography-1.4.tar.gz (399kB) 
    100% |################################| 409kB 1.1MB/s 
Collecting zope.interface>=4.0.2 (from Twisted>=10.0.0->scrapy) 
    Downloading zope.interface-4.1.3.tar.gz (141kB) 
    100% |################################| 143kB 1.3MB/s 
Collecting attrs (from service-identity->scrapy) 
    Downloading attrs-16.0.0-py2.py3-none-any.whl 
Collecting pyasn1 (from service-identity->scrapy) 
    Downloading pyasn1-0.1.9-py2.py3-none-any.whl 
Collecting pyasn1-modules (from service-identity->scrapy) 
    Downloading pyasn1_modules-0.0.8-py2.py3-none-any.whl 
Collecting idna>=2.0 (from cryptography>=1.3->pyOpenSSL->scrapy) 
    Downloading idna-2.1-py2.py3-none-any.whl (54kB) 
    100% |################################| 61kB 2.0MB/s 
Requirement already satisfied (use --upgrade to upgrade): setuptools>=11.3 in ./.virtualenvs/scrapy11.py3/lib/python3.5/site-packages (from cryptography>=1.3->pyOpenSSL->scrapy) 
Collecting cffi>=1.4.1 (from cryptography>=1.3->pyOpenSSL->scrapy) 
    Downloading cffi-1.6.0.tar.gz (397kB) 
    100% |################################| 399kB 1.1MB/s 
Collecting pycparser (from cffi>=1.4.1->cryptography>=1.3->pyOpenSSL->scrapy) 
    Downloading pycparser-2.14.tar.gz (223kB) 
    100% |################################| 225kB 1.2MB/s 
Building wheels for collected packages: PyDispatcher, lxml, Twisted, cssselect, cryptography, zope.interface, cffi, pycparser 
    Running setup.py bdist_wheel for PyDispatcher ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/86/02/a1/5857c77600a28813aaf0f66d4e4568f50c9f133277a4122411 
    Running setup.py bdist_wheel for lxml ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/6c/eb/a1/e4ff54c99630e3cc6ec659287c4fd88345cd78199923544412 
    Running setup.py bdist_wheel for Twisted ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/fe/9d/3f/9f7b1c768889796c01929abb7cdfa2a9cdd32bae64eb7aa239 
    Running setup.py bdist_wheel for cssselect ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/1b/41/70/480fa9516ccc4853a474faf7a9fb3638338fc99a9255456dd0 
    Running setup.py bdist_wheel for cryptography ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/f6/6c/21/11ec069285a52d7fa8c735be5fc2edfb8b24012c0f78f93d20 
    Running setup.py bdist_wheel for zope.interface ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/52/04/ad/12c971c57ca6ee5e6d77019c7a1b93105b1460d8c2db6e4ef1 
    Running setup.py bdist_wheel for cffi ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/8f/00/29/553c1b1db38bbeec3fec428ae4e400cd8349ecd99fe86edea1 
    Running setup.py bdist_wheel for pycparser ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/9b/f4/2e/d03e949a551719a1ffcb659f2c63d8444f4df12e994ce52112 
Successfully built PyDispatcher lxml Twisted cssselect cryptography zope.interface cffi pycparser 
Installing collected packages: PyDispatcher, idna, pyasn1, six, pycparser, cffi, cryptography, pyOpenSSL, lxml, w3lib, cssselect, parsel, zope.interface, Twisted, queuelib, attrs, pyasn1-modules, service-identity, scrapy 
Successfully installed PyDispatcher-2.0.5 Twisted-16.2.0 attrs-16.0.0 cffi-1.6.0 cryptography-1.4 cssselect-0.9.1 idna-2.1 lxml-3.6.0 parsel-1.0.2 pyOpenSSL-16.0.0 pyasn1-0.1.9 pyasn1-modules-0.0.8 pycparser-2.14 queuelib-1.4.2 scrapy-1.1.0 service-identity-16.0.0 six-1.10.0 w3lib-1.14.2 zope.interface-4.1.3 
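
At this point, a quick sanity check inside the virtualenv confirms that Scrapy is running on Python 3 (a minimal check, not part of the original output; exact version strings will vary):

# inside the activated virtualenv, "python" is the Python 3 interpreter
python -c "import sys, scrapy; print(scrapy.__version__, sys.version)"
scrapy version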

Finally, test with an example project:

(scrapy11.py3) [email protected]:~$ scrapy startproject tutorial 
New Scrapy project 'tutorial', using template directory '/home/scrapyuser/.virtualenvs/scrapy11.py3/lib/python3.5/site-packages/scrapy/templates/project', created in: 
    /home/scrapyuser/tutorial 

You can start your first spider with: 
    cd tutorial 
    scrapy genspider example example.com 
(scrapy11.py3) [email protected]:~$ cd tutorial 
(scrapy11.py3) [email protected]:~/tutorial$ scrapy genspider example example.com 
Created spider 'example' using template 'basic' in module: 
    tutorial.spiders.example 
(scrapy11.py3) [email protected]:~/tutorial$ cat tutorial/spiders/example.py 
# -*- coding: utf-8 -*- 
import scrapy 


class ExampleSpider(scrapy.Spider): 
    name = "example" 
    allowed_domains = ["example.com"] 
    start_urls = (
     'http://www.example.com/', 
    ) 

    def parse(self, response): 
     pass 
(scrapy11.py3) [email protected]:~/tutorial$ scrapy crawl example 
2016-06-07 11:08:27 [scrapy] INFO: Scrapy 1.1.0 started (bot: tutorial) 
2016-06-07 11:08:27 [scrapy] INFO: Overridden settings: {'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial', 'ROBOTSTXT_OBEY': True, 'NEWSPIDER_MODULE': 'tutorial.spiders'} 
2016-06-07 11:08:27 [scrapy] INFO: Enabled extensions: 
['scrapy.extensions.logstats.LogStats', 'scrapy.extensions.corestats.CoreStats'] 
2016-06-07 11:08:27 [scrapy] INFO: Enabled downloader middlewares: 
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware', 
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware', 
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware', 
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware', 
'scrapy.downloadermiddlewares.retry.RetryMiddleware', 
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware', 
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware', 
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware', 
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware', 
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware', 
'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware', 
'scrapy.downloadermiddlewares.stats.DownloaderStats'] 
2016-06-07 11:08:27 [scrapy] INFO: Enabled spider middlewares: 
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware', 
'scrapy.spidermiddlewares.referer.RefererMiddleware', 
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 
'scrapy.spidermiddlewares.depth.DepthMiddleware'] 
2016-06-07 11:08:27 [scrapy] INFO: Enabled item pipelines: 
[] 
2016-06-07 11:08:27 [scrapy] INFO: Spider opened 
2016-06-07 11:08:28 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) 
2016-06-07 11:08:28 [scrapy] DEBUG: Crawled (404) <GET http://www.example.com/robots.txt> (referer: None) 
2016-06-07 11:08:28 [scrapy] DEBUG: Crawled (200) <GET http://www.example.com/> (referer: None) 
2016-06-07 11:08:28 [scrapy] INFO: Closing spider (finished) 
2016-06-07 11:08:28 [scrapy] INFO: Dumping Scrapy stats: 
{'downloader/request_bytes': 436, 
'downloader/request_count': 2, 
'downloader/request_method_count/GET': 2, 
'downloader/response_bytes': 1921, 
'downloader/response_count': 2, 
'downloader/response_status_count/200': 1, 
'downloader/response_status_count/404': 1, 
'finish_reason': 'finished', 
'finish_time': datetime.datetime(2016, 6, 7, 11, 8, 28, 614605), 
'log_count/DEBUG': 2, 
'log_count/INFO': 7, 
'response_received_count': 2, 
'scheduler/dequeued': 1, 
'scheduler/dequeued/memory': 1, 
'scheduler/enqueued': 1, 
'scheduler/enqueued/memory': 1, 
'start_time': datetime.datetime(2016, 6, 7, 11, 8, 28, 24624)} 
2016-06-07 11:08:28 [scrapy] INFO: Spider closed (finished) 
(scrapy11.py3) [email protected]:~/tutorial$ 
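
The generated spider's parse() callback is empty (it just passes). As a next step you could, for example, yield the page title as an item; here is a minimal sketch of what tutorial/spiders/example.py might look like (an illustration, not part of the original answer):

# -*- coding: utf-8 -*-
import scrapy


class ExampleSpider(scrapy.Spider):
    name = "example"
    allowed_domains = ["example.com"]
    start_urls = (
        'http://www.example.com/',
    )

    def parse(self, response):
        # extract the <title> text with a CSS selector and yield it as an item
        yield {'title': response.css('title::text').extract_first()}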

Thanks Paul. I followed your steps (but without using Docker or virtualenv), and the install succeeded. However, my default Python is now 2.7.11, whereas it was 3.5.1 before. Is there a way to fix this? – Harrison


I got the 'Successfully installed ... scrapy ...' message, but when I ran 'scrapy startproject myProject' I got an error saying 'The program 'scrapy' is currently not installed. You can install it by typing: sudo apt install python-scrapy'. – Harrison


The second problem was solved by 'sudo pip install scrapy'. – Harrison