Commit

Use https links wherever possible
codeaditya committed Oct 28, 2017
1 parent 79df51a commit 9d9d83a
Showing 37 changed files with 76 additions and 76 deletions.
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -1,6 +1,6 @@
 The guidelines for contributing are available here:
-http://doc.scrapy.org/en/master/contributing.html
+https://doc.scrapy.org/en/master/contributing.html
 
 Please do not abuse the issue tracker for support questions.
 If your issue topic can be rephrased to "How to ...?", please use the
-support channels to get it answered: http://scrapy.org/community/
+support channels to get it answered: https://scrapy.org/community/
2 changes: 1 addition & 1 deletion INSTALL
@@ -1,4 +1,4 @@
 For information about installing Scrapy see:
 
 * docs/intro/install.rst (local file)
-* http://doc.scrapy.org/en/latest/intro/install.html (online version)
+* https://doc.scrapy.org/en/latest/intro/install.html (online version)
14 changes: 7 additions & 7 deletions README.rst
@@ -31,7 +31,7 @@ crawl websites and extract structured data from their pages. It can be used for
 a wide range of purposes, from data mining to monitoring and automated testing.
 
 For more information including a list of features check the Scrapy homepage at:
-http://scrapy.org
+https://scrapy.org
 
 Requirements
 ============
@@ -47,12 +47,12 @@ The quick way::
 pip install scrapy
 
 For more details see the install section in the documentation:
-http://doc.scrapy.org/en/latest/intro/install.html
+https://doc.scrapy.org/en/latest/intro/install.html
 
 Documentation
 =============
 
-Documentation is available online at http://doc.scrapy.org/ and in the ``docs``
+Documentation is available online at https://doc.scrapy.org/ and in the ``docs``
 directory.
 
 Releases
@@ -63,12 +63,12 @@ You can find release notes at https://doc.scrapy.org/en/latest/news.html
 Community (blog, twitter, mail list, IRC)
 =========================================
 
-See http://scrapy.org/community/
+See https://scrapy.org/community/
 
 Contributing
 ============
 
-See http://doc.scrapy.org/en/master/contributing.html
+See https://doc.scrapy.org/en/master/contributing.html
 
 Code of Conduct
 ---------------
@@ -82,9 +82,9 @@ Please report unacceptable behavior to [email protected].
 Companies using Scrapy
 ======================
 
-See http://scrapy.org/companies/
+See https://scrapy.org/companies/
 
 Commercial Support
 ==================
 
-See http://scrapy.org/support/
+See https://scrapy.org/support/
6 changes: 3 additions & 3 deletions debian/control
@@ -4,7 +4,7 @@ Priority: optional
 Maintainer: Scrapinghub Team <[email protected]>
 Build-Depends: debhelper (>= 7.0.50), python (>=2.7), python-twisted, python-w3lib, python-lxml, python-six (>=1.5.2)
 Standards-Version: 3.8.4
-Homepage: http://scrapy.org/
+Homepage: https://scrapy.org/
 
 Package: scrapy
 Architecture: all
@@ -15,6 +15,6 @@ Conflicts: python-scrapy, scrapy-0.25
 Provides: python-scrapy, scrapy-0.25
 Description: Python web crawling and web scraping framework
 Scrapy is a fast high-level web crawling and web scraping framework,
-used to crawl websites and extract structured data from their pages.
-It can be used for a wide range of purposes, from data mining to
+used to crawl websites and extract structured data from their pages.
+It can be used for a wide range of purposes, from data mining to
 monitoring and automated testing.
8 changes: 4 additions & 4 deletions debian/copyright
@@ -1,6 +1,6 @@
 This package was debianized by the Scrapinghub team <[email protected]>.
 
-It was downloaded from http://scrapy.org
+It was downloaded from https://scrapy.org
 
 Upstream Author: Scrapy Developers
 
@@ -14,10 +14,10 @@ All rights reserved.
 Redistribution and use in source and binary forms, with or without modification,
 are permitted provided that the following conditions are met:
 
-1. Redistributions of source code must retain the above copyright notice,
+1. Redistributions of source code must retain the above copyright notice,
 this list of conditions and the following disclaimer.
-2. Redistributions in binary form must reproduce the above copyright
+
+2. Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.
2 changes: 1 addition & 1 deletion docs/contributing.rst
@@ -7,7 +7,7 @@ Contributing to Scrapy
 .. important::
 
 Double check you are reading the most recent version of this document at
-http://doc.scrapy.org/en/master/contributing.html
+https://doc.scrapy.org/en/master/contributing.html
 
 There are many ways to contribute to Scrapy. Here are some of them:
 
2 changes: 1 addition & 1 deletion docs/intro/overview.rst
@@ -160,7 +160,7 @@ The next steps for you are to :ref:`install Scrapy <intro-install>`,
 a full-blown Scrapy project and `join the community`_. Thanks for your
 interest!
 
-.. _join the community: http://scrapy.org/community/
+.. _join the community: https://scrapy.org/community/
 .. _web scraping: https://en.wikipedia.org/wiki/Web_scraping
 .. _Amazon Associates Web Services: https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html
 .. _Amazon S3: https://aws.amazon.com/s3/
2 changes: 1 addition & 1 deletion docs/topics/practices.rst
@@ -248,7 +248,7 @@ If you are still unable to prevent your bot getting banned, consider contacting
 `commercial support`_.
 
 .. _Tor project: https://www.torproject.org/
-.. _commercial support: http://scrapy.org/support/
+.. _commercial support: https://scrapy.org/support/
 .. _ProxyMesh: https://proxymesh.com/
 .. _Google cache: http://www.googleguide.com/cached_pages.html
 .. _testspiders: https://github.com/scrapinghub/testspiders
4 changes: 2 additions & 2 deletions docs/topics/selectors.rst
@@ -86,7 +86,7 @@ To explain how to use the selectors we'll use the `Scrapy shell` (which
 provides interactive testing) and an example page located in the Scrapy
 documentation server:
 
-http://doc.scrapy.org/en/latest/_static/selectors-sample1.html
+https://doc.scrapy.org/en/latest/_static/selectors-sample1.html
 
 .. _topics-selectors-htmlcode:
 
@@ -99,7 +99,7 @@ Here's its HTML code:
 
 First, let's open the shell::
 
-scrapy shell http://doc.scrapy.org/en/latest/_static/selectors-sample1.html
+scrapy shell https://doc.scrapy.org/en/latest/_static/selectors-sample1.html
 
 Then, after the shell loads, you'll have the response available as ``response``
 shell variable, and its attached selector in ``response.selector`` attribute.
8 changes: 4 additions & 4 deletions docs/topics/shell.rst
@@ -142,7 +142,7 @@ Example of shell session
 ========================
 
 Here's an example of a typical shell session where we start by scraping the
-http://scrapy.org page, and then proceed to scrape the https://reddit.com
+https://scrapy.org page, and then proceed to scrape the https://reddit.com
 page. Finally, we modify the (Reddit) request method to POST and re-fetch it
 getting an error. We end the session by typing Ctrl-D (in Unix systems) or
 Ctrl-Z in Windows.
@@ -154,7 +154,7 @@ shell works.
 
 First, we launch the shell::
 
-scrapy shell 'http://scrapy.org' --nolog
+scrapy shell 'https://scrapy.org' --nolog
 
 Then, the shell fetches the URL (using the Scrapy downloader) and prints the
 list of available objects and useful shortcuts (you'll notice that these lines
@@ -164,7 +164,7 @@ all start with the ``[s]`` prefix)::
 [s] scrapy scrapy module (contains scrapy.Request, scrapy.Selector, etc)
 [s] crawler <scrapy.crawler.Crawler object at 0x7f07395dd690>
 [s] item {}
-[s] request <GET http://scrapy.org>
+[s] request <GET https://scrapy.org>
 [s] response <200 https://scrapy.org/>
 [s] settings <scrapy.settings.Settings object at 0x7f07395dd710>
 [s] spider <DefaultSpider 'default' at 0x7f0735891690>
@@ -182,7 +182,7 @@ After that, we can start playing with the objects::
 >>> response.xpath('//title/text()').extract_first()
 'Scrapy | A Fast and Powerful Scraping and Web Crawling Framework'
 
->>> fetch("http://reddit.com")
+>>> fetch("https://reddit.com")
 
 >>> response.xpath('//title/text()').extract()
 ['reddit: the front page of the internet']
4 changes: 2 additions & 2 deletions scrapy/_monkeypatches.py
@@ -4,12 +4,12 @@
 if sys.version_info[0] == 2:
     from urlparse import urlparse
 
-    # workaround for http://bugs.python.org/issue7904 - Python < 2.7
+    # workaround for https://bugs.python.org/issue7904 - Python < 2.7
     if urlparse('s3://bucket/key').netloc != 'bucket':
         from urlparse import uses_netloc
         uses_netloc.append('s3')
 
-    # workaround for http://bugs.python.org/issue9374 - Python < 2.7.4
+    # workaround for https://bugs.python.org/issue9374 - Python < 2.7.4
     if urlparse('s3://bucket/key?key=value').query != 'key=value':
         from urlparse import uses_query
         uses_query.append('s3')
4 changes: 2 additions & 2 deletions scrapy/core/downloader/contextfactory.py
@@ -64,7 +64,7 @@ class BrowserLikeContextFactory(ScrapyClientContextFactory):
     """
     Twisted-recommended context factory for web clients.
-    Quoting http://twistedmatrix.com/documents/current/api/twisted.web.client.Agent.html:
+    Quoting https://twistedmatrix.com/documents/current/api/twisted.web.client.Agent.html:
     "The default is to use a BrowserLikePolicyForHTTPS,
     so unless you have special requirements you can leave this as-is."
@@ -100,6 +100,6 @@ def __init__(self, method=SSL.SSLv23_METHOD):
     def getContext(self, hostname=None, port=None):
         ctx = ClientContextFactory.getContext(self)
         # Enable all workarounds to SSL bugs as documented by
-        # http://www.openssl.org/docs/ssl/SSL_CTX_set_options.html
+        # https://www.openssl.org/docs/manmaster/man3/SSL_CTX_set_options.html
         ctx.set_options(SSL.OP_ALL)
         return ctx
2 changes: 1 addition & 1 deletion scrapy/crawler.py
@@ -83,7 +83,7 @@ def crawl(self, *args, **kwargs):
             yield defer.maybeDeferred(self.engine.start)
         except Exception:
             # In Python 2 reraising an exception after yield discards
-            # the original traceback (see http://bugs.python.org/issue7563),
+            # the original traceback (see https://bugs.python.org/issue7563),
             # so sys.exc_info() workaround is used.
             # This workaround also works in Python 3, but it is not needed,
             # and it is slower, so in Python 3 we use native `raise`.
2 changes: 1 addition & 1 deletion scrapy/downloadermiddlewares/chunked.py
@@ -11,7 +11,7 @@
 
 class ChunkedTransferMiddleware(object):
     """This middleware adds support for chunked transfer encoding, as
-    documented in: http://en.wikipedia.org/wiki/Chunked_transfer_encoding
+    documented in: https://en.wikipedia.org/wiki/Chunked_transfer_encoding
     """
 
     def process_response(self, request, response, spider):
2 changes: 1 addition & 1 deletion scrapy/downloadermiddlewares/httpcache.py
@@ -75,7 +75,7 @@ def process_response(self, request, response, spider):
             return response
 
         # RFC2616 requires origin server to set Date header,
-        # http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.18
+        # https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.18
         if 'Date' not in response.headers:
             response.headers['Date'] = formatdate(usegmt=1)
 
2 changes: 1 addition & 1 deletion scrapy/exporters.py
@@ -188,7 +188,7 @@ def _export_xml_field(self, name, serialized_value, depth):
         self.xg.endElement(name)
         self._beautify_newline()
 
-    # Workaround for http://bugs.python.org/issue17606
+    # Workaround for https://bugs.python.org/issue17606
     # Before Python 2.7.4 xml.sax.saxutils required bytes;
     # since 2.7.4 it requires unicode. The bug is likely to be
    # fixed in 2.7.6, but 2.7.6 will still support unicode,
10 changes: 5 additions & 5 deletions scrapy/extensions/httpcache.py
@@ -70,8 +70,8 @@ def should_cache_request(self, request):
         return True
 
     def should_cache_response(self, response, request):
-        # What is cacheable - http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec14.9.1
-        # Response cacheability - http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.4
+        # What is cacheable - https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec14.9.1
+        # Response cacheability - https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.4
         # Status code 206 is not included because cache can not deal with partial contents
         cc = self._parse_cachecontrol(response)
         # obey directive "Cache-Control: no-store"
@@ -163,7 +163,7 @@ def _get_max_age(self, cc):
 
     def _compute_freshness_lifetime(self, response, request, now):
         # Reference nsHttpResponseHead::ComputeFreshnessLifetime
-        # http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpResponseHead.cpp#410
+        # https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpResponseHead.cpp#706
         cc = self._parse_cachecontrol(response)
         maxage = self._get_max_age(cc)
         if maxage is not None:
@@ -194,7 +194,7 @@ def _compute_freshness_lifetime(self, response, request, now):
 
     def _compute_current_age(self, response, request, now):
         # Reference nsHttpResponseHead::ComputeCurrentAge
-        # http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpResponseHead.cpp#366
+        # https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpResponseHead.cpp#658
         currentage = 0
         # If Date header is not set we assume it is a fast connection, and
         # clock is in sync with the server
@@ -414,7 +414,7 @@ def _request_key(self, request):
 def parse_cachecontrol(header):
     """Parse Cache-Control header
-    http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
+    https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
     >>> parse_cachecontrol(b'public, max-age=3600') == {b'public': None,
     ... b'max-age': b'3600'}
2 changes: 1 addition & 1 deletion scrapy/extensions/telnet.py
@@ -82,7 +82,7 @@ def _get_telnet_vars(self):
             'prefs': print_live_refs,
             'hpy': hpy,
             'help': "This is Scrapy telnet console. For more info see: " \
-                    "http://doc.scrapy.org/en/latest/topics/telnetconsole.html",
+                    "https://doc.scrapy.org/en/latest/topics/telnetconsole.html",
         }
         self.crawler.signals.send_catch_log(update_telnet_vars, telnet_vars=telnet_vars)
         return telnet_vars
4 changes: 2 additions & 2 deletions scrapy/pipelines/files.py
@@ -120,7 +120,7 @@ def _onsuccess(boto_key):
 
     def _get_boto_bucket(self):
         # disable ssl (is_secure=False) because of this python bug:
-        # http://bugs.python.org/issue5103
+        # https://bugs.python.org/issue5103
         c = self.S3Connection(self.AWS_ACCESS_KEY_ID, self.AWS_SECRET_ACCESS_KEY, is_secure=False)
         return c.get_bucket(self.bucket, validate=False)
 
@@ -268,7 +268,7 @@ class FilesPipeline(MediaPipeline):
     def __init__(self, store_uri, download_func=None, settings=None):
         if not store_uri:
             raise NotConfigured
-
+
         if isinstance(settings, dict) or settings is None:
             settings = Settings(settings)
 
2 changes: 1 addition & 1 deletion scrapy/signalmanager.py
@@ -55,7 +55,7 @@ def send_catch_log_deferred(self, signal, **kwargs):
         The keyword arguments are passed to the signal handlers (connected
         through the :meth:`connect` method).
-        .. _deferreds: http://twistedmatrix.com/documents/current/core/howto/defer.html
+        .. _deferreds: https://twistedmatrix.com/documents/current/core/howto/defer.html
         """
         kwargs.setdefault('sender', self.sender)
         return _signal.send_catch_log_deferred(signal, **kwargs)
2 changes: 1 addition & 1 deletion scrapy/templates/project/module/items.py.tmpl
@@ -3,7 +3,7 @@
 # Define here the models for your scraped items
 #
 # See documentation in:
-# http://doc.scrapy.org/en/latest/topics/items.html
+# https://doc.scrapy.org/en/latest/topics/items.html
 
 import scrapy
 
2 changes: 1 addition & 1 deletion scrapy/templates/project/module/middlewares.py.tmpl
@@ -3,7 +3,7 @@
 # Define here the models for your spider middleware
 #
 # See documentation in:
-# http://doc.scrapy.org/en/latest/topics/spider-middleware.html
+# https://doc.scrapy.org/en/latest/topics/spider-middleware.html
 
 from scrapy import signals
 
2 changes: 1 addition & 1 deletion scrapy/templates/project/module/pipelines.py.tmpl
@@ -3,7 +3,7 @@
 # Define your item pipelines here
 #
 # Don't forget to add your pipeline to the ITEM_PIPELINES setting
-# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html
+# See: https://doc.scrapy.org/en/latest/topics/item-pipeline.html
 
 
 class ${ProjectName}Pipeline(object):