Commit

Use https links wherever possible
codeaditya committed Oct 28, 2017
1 parent 79df51a commit 9d9d83a
Showing 37 changed files with 76 additions and 76 deletions.
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -1,6 +1,6 @@
 The guidelines for contributing are available here:
-http://doc.scrapy.org/en/master/contributing.html
+https://doc.scrapy.org/en/master/contributing.html
 
 Please do not abuse the issue tracker for support questions.
 If your issue topic can be rephrased to "How to ...?", please use the
-support channels to get it answered: http://scrapy.org/community/
+support channels to get it answered: https://scrapy.org/community/
2 changes: 1 addition & 1 deletion INSTALL
@@ -1,4 +1,4 @@
 For information about installing Scrapy see:
 
 * docs/intro/install.rst (local file)
-* http://doc.scrapy.org/en/latest/intro/install.html (online version)
+* https://doc.scrapy.org/en/latest/intro/install.html (online version)
14 changes: 7 additions & 7 deletions README.rst
@@ -31,7 +31,7 @@ crawl websites and extract structured data from their pages. It can be used for
 a wide range of purposes, from data mining to monitoring and automated testing.
 
 For more information including a list of features check the Scrapy homepage at:
-http://scrapy.org
+https://scrapy.org
 
 Requirements
 ============
@@ -47,12 +47,12 @@ The quick way::
 pip install scrapy
 
 For more details see the install section in the documentation:
-http://doc.scrapy.org/en/latest/intro/install.html
+https://doc.scrapy.org/en/latest/intro/install.html
 
 Documentation
 =============
 
-Documentation is available online at http://doc.scrapy.org/ and in the ``docs``
+Documentation is available online at https://doc.scrapy.org/ and in the ``docs``
 directory.
 
 Releases
@@ -63,12 +63,12 @@ You can find release notes at https://doc.scrapy.org/en/latest/news.html
 Community (blog, twitter, mail list, IRC)
 =========================================
 
-See http://scrapy.org/community/
+See https://scrapy.org/community/
 
 Contributing
 ============
 
-See http://doc.scrapy.org/en/master/contributing.html
+See https://doc.scrapy.org/en/master/contributing.html
 
 Code of Conduct
 ---------------
@@ -82,9 +82,9 @@ Please report unacceptable behavior to [email protected].
 Companies using Scrapy
 ======================
 
-See http://scrapy.org/companies/
+See https://scrapy.org/companies/
 
 Commercial Support
 ==================
 
-See http://scrapy.org/support/
+See https://scrapy.org/support/
6 changes: 3 additions & 3 deletions debian/control
@@ -4,7 +4,7 @@ Priority: optional
 Maintainer: Scrapinghub Team <[email protected]>
 Build-Depends: debhelper (>= 7.0.50), python (>=2.7), python-twisted, python-w3lib, python-lxml, python-six (>=1.5.2)
 Standards-Version: 3.8.4
-Homepage: http://scrapy.org/
+Homepage: https://scrapy.org/
 
 Package: scrapy
 Architecture: all
@@ -15,6 +15,6 @@ Conflicts: python-scrapy, scrapy-0.25
 Provides: python-scrapy, scrapy-0.25
 Description: Python web crawling and web scraping framework
 Scrapy is a fast high-level web crawling and web scraping framework,
-used to crawl websites and extract structured data from their pages.
-It can be used for a wide range of purposes, from data mining to
+used to crawl websites and extract structured data from their pages.
+It can be used for a wide range of purposes, from data mining to
 monitoring and automated testing.
8 changes: 4 additions & 4 deletions debian/copyright
@@ -1,6 +1,6 @@
 This package was debianized by the Scrapinghub team <[email protected]>.
 
-It was downloaded from http://scrapy.org
+It was downloaded from https://scrapy.org
 
 Upstream Author: Scrapy Developers
 
@@ -14,10 +14,10 @@ All rights reserved.
 Redistribution and use in source and binary forms, with or without modification,
 are permitted provided that the following conditions are met:
 
-1. Redistributions of source code must retain the above copyright notice,
+1. Redistributions of source code must retain the above copyright notice,
 this list of conditions and the following disclaimer.
-2. Redistributions in binary form must reproduce the above copyright
+
+2. Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.
2 changes: 1 addition & 1 deletion docs/contributing.rst
@@ -7,7 +7,7 @@ Contributing to Scrapy
 .. important::
 
 Double check you are reading the most recent version of this document at
-http://doc.scrapy.org/en/master/contributing.html
+https://doc.scrapy.org/en/master/contributing.html
 
 There are many ways to contribute to Scrapy. Here are some of them:
 
2 changes: 1 addition & 1 deletion docs/intro/overview.rst
@@ -160,7 +160,7 @@ The next steps for you are to :ref:`install Scrapy <intro-install>`,
 a full-blown Scrapy project and `join the community`_. Thanks for your
 interest!
 
-.. _join the community: http://scrapy.org/community/
+.. _join the community: https://scrapy.org/community/
 .. _web scraping: https://en.wikipedia.org/wiki/Web_scraping
 .. _Amazon Associates Web Services: https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html
 .. _Amazon S3: https://aws.amazon.com/s3/
2 changes: 1 addition & 1 deletion docs/topics/practices.rst
@@ -248,7 +248,7 @@ If you are still unable to prevent your bot getting banned, consider contacting
 `commercial support`_.
 
 .. _Tor project: https://www.torproject.org/
-.. _commercial support: http://scrapy.org/support/
+.. _commercial support: https://scrapy.org/support/
 .. _ProxyMesh: https://proxymesh.com/
 .. _Google cache: http://www.googleguide.com/cached_pages.html
 .. _testspiders: https://github.com/scrapinghub/testspiders
4 changes: 2 additions & 2 deletions docs/topics/selectors.rst
@@ -86,7 +86,7 @@ To explain how to use the selectors we'll use the `Scrapy shell` (which
 provides interactive testing) and an example page located in the Scrapy
 documentation server:
 
-http://doc.scrapy.org/en/latest/_static/selectors-sample1.html
+https://doc.scrapy.org/en/latest/_static/selectors-sample1.html
 
 .. _topics-selectors-htmlcode:
 
@@ -99,7 +99,7 @@ Here's its HTML code:
 
 First, let's open the shell::
 
-scrapy shell http://doc.scrapy.org/en/latest/_static/selectors-sample1.html
+scrapy shell https://doc.scrapy.org/en/latest/_static/selectors-sample1.html
 
 Then, after the shell loads, you'll have the response available as ``response``
 shell variable, and its attached selector in ``response.selector`` attribute.
8 changes: 4 additions & 4 deletions docs/topics/shell.rst
@@ -142,7 +142,7 @@ Example of shell session
 ========================
 
 Here's an example of a typical shell session where we start by scraping the
-http://scrapy.org page, and then proceed to scrape the https://reddit.com
+https://scrapy.org page, and then proceed to scrape the https://reddit.com
 page. Finally, we modify the (Reddit) request method to POST and re-fetch it
 getting an error. We end the session by typing Ctrl-D (in Unix systems) or
 Ctrl-Z in Windows.
@@ -154,7 +154,7 @@ shell works.
 
 First, we launch the shell::
 
-scrapy shell 'http://scrapy.org' --nolog
+scrapy shell 'https://scrapy.org' --nolog
 
 Then, the shell fetches the URL (using the Scrapy downloader) and prints the
 list of available objects and useful shortcuts (you'll notice that these lines
@@ -164,7 +164,7 @@ all start with the ``[s]`` prefix)::
 [s] scrapy scrapy module (contains scrapy.Request, scrapy.Selector, etc)
 [s] crawler <scrapy.crawler.Crawler object at 0x7f07395dd690>
 [s] item {}
-[s] request <GET http://scrapy.org>
+[s] request <GET https://scrapy.org>
 [s] response <200 https://scrapy.org/>
 [s] settings <scrapy.settings.Settings object at 0x7f07395dd710>
 [s] spider <DefaultSpider 'default' at 0x7f0735891690>
@@ -182,7 +182,7 @@ After that, we can start playing with the objects::
 >>> response.xpath('//title/text()').extract_first()
 'Scrapy | A Fast and Powerful Scraping and Web Crawling Framework'
 
->>> fetch("http://reddit.com")
+>>> fetch("https://reddit.com")
 
 >>> response.xpath('//title/text()').extract()
 ['reddit: the front page of the internet']
4 changes: 2 additions & 2 deletions scrapy/_monkeypatches.py
@@ -4,12 +4,12 @@
 if sys.version_info[0] == 2:
     from urlparse import urlparse
 
-    # workaround for http://bugs.python.org/issue7904 - Python < 2.7
+    # workaround for https://bugs.python.org/issue7904 - Python < 2.7
     if urlparse('s3://bucket/key').netloc != 'bucket':
         from urlparse import uses_netloc
         uses_netloc.append('s3')
 
-    # workaround for http://bugs.python.org/issue9374 - Python < 2.7.4
+    # workaround for https://bugs.python.org/issue9374 - Python < 2.7.4
     if urlparse('s3://bucket/key?key=value').query != 'key=value':
         from urlparse import uses_query
         uses_query.append('s3')
4 changes: 2 additions & 2 deletions scrapy/core/downloader/contextfactory.py
@@ -64,7 +64,7 @@ class BrowserLikeContextFactory(ScrapyClientContextFactory):
     """
     Twisted-recommended context factory for web clients.
-    Quoting http://twistedmatrix.com/documents/current/api/twisted.web.client.Agent.html:
+    Quoting https://twistedmatrix.com/documents/current/api/twisted.web.client.Agent.html:
     "The default is to use a BrowserLikePolicyForHTTPS,
     so unless you have special requirements you can leave this as-is."
@@ -100,6 +100,6 @@ def __init__(self, method=SSL.SSLv23_METHOD):
     def getContext(self, hostname=None, port=None):
         ctx = ClientContextFactory.getContext(self)
         # Enable all workarounds to SSL bugs as documented by
-        # http://www.openssl.org/docs/ssl/SSL_CTX_set_options.html
+        # https://www.openssl.org/docs/manmaster/man3/SSL_CTX_set_options.html
         ctx.set_options(SSL.OP_ALL)
         return ctx
2 changes: 1 addition & 1 deletion scrapy/crawler.py
@@ -83,7 +83,7 @@ def crawl(self, *args, **kwargs):
             yield defer.maybeDeferred(self.engine.start)
         except Exception:
             # In Python 2 reraising an exception after yield discards
-            # the original traceback (see http://bugs.python.org/issue7563),
+            # the original traceback (see https://bugs.python.org/issue7563),
             # so sys.exc_info() workaround is used.
             # This workaround also works in Python 3, but it is not needed,
             # and it is slower, so in Python 3 we use native `raise`.
2 changes: 1 addition & 1 deletion scrapy/downloadermiddlewares/chunked.py
@@ -11,7 +11,7 @@
 
 class ChunkedTransferMiddleware(object):
     """This middleware adds support for chunked transfer encoding, as
-    documented in: http://en.wikipedia.org/wiki/Chunked_transfer_encoding
+    documented in: https://en.wikipedia.org/wiki/Chunked_transfer_encoding
     """
 
     def process_response(self, request, response, spider):
2 changes: 1 addition & 1 deletion scrapy/downloadermiddlewares/httpcache.py
@@ -75,7 +75,7 @@ def process_response(self, request, response, spider):
             return response
 
         # RFC2616 requires origin server to set Date header,
-        # http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.18
+        # https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.18
         if 'Date' not in response.headers:
             response.headers['Date'] = formatdate(usegmt=1)
 
2 changes: 1 addition & 1 deletion scrapy/exporters.py
@@ -188,7 +188,7 @@ def _export_xml_field(self, name, serialized_value, depth):
         self.xg.endElement(name)
         self._beautify_newline()
 
-    # Workaround for http://bugs.python.org/issue17606
+    # Workaround for https://bugs.python.org/issue17606
     # Before Python 2.7.4 xml.sax.saxutils required bytes;
     # since 2.7.4 it requires unicode. The bug is likely to be
    # fixed in 2.7.6, but 2.7.6 will still support unicode,
10 changes: 5 additions & 5 deletions scrapy/extensions/httpcache.py
@@ -70,8 +70,8 @@ def should_cache_request(self, request):
         return True
 
     def should_cache_response(self, response, request):
-        # What is cacheable - http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec14.9.1
-        # Response cacheability - http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.4
+        # What is cacheable - https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec14.9.1
+        # Response cacheability - https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.4
         # Status code 206 is not included because cache can not deal with partial contents
         cc = self._parse_cachecontrol(response)
         # obey directive "Cache-Control: no-store"
@@ -163,7 +163,7 @@ def _get_max_age(self, cc):
 
     def _compute_freshness_lifetime(self, response, request, now):
         # Reference nsHttpResponseHead::ComputeFreshnessLifetime
-        # http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpResponseHead.cpp#410
+        # https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpResponseHead.cpp#706
         cc = self._parse_cachecontrol(response)
         maxage = self._get_max_age(cc)
         if maxage is not None:
@@ -194,7 +194,7 @@ def _compute_freshness_lifetime(self, response, request, now):
 
     def _compute_current_age(self, response, request, now):
         # Reference nsHttpResponseHead::ComputeCurrentAge
-        # http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpResponseHead.cpp#366
+        # https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpResponseHead.cpp#658
         currentage = 0
         # If Date header is not set we assume it is a fast connection, and
         # clock is in sync with the server
@@ -414,7 +414,7 @@ def _request_key(self, request):
 def parse_cachecontrol(header):
     """Parse Cache-Control header
-    http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
+    https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
     >>> parse_cachecontrol(b'public, max-age=3600') == {b'public': None,
     ... b'max-age': b'3600'}
2 changes: 1 addition & 1 deletion scrapy/extensions/telnet.py
@@ -82,7 +82,7 @@ def _get_telnet_vars(self):
             'prefs': print_live_refs,
             'hpy': hpy,
             'help': "This is Scrapy telnet console. For more info see: " \
-                    "http://doc.scrapy.org/en/latest/topics/telnetconsole.html",
+                    "https://doc.scrapy.org/en/latest/topics/telnetconsole.html",
         }
         self.crawler.signals.send_catch_log(update_telnet_vars, telnet_vars=telnet_vars)
         return telnet_vars
4 changes: 2 additions & 2 deletions scrapy/pipelines/files.py
@@ -120,7 +120,7 @@ def _onsuccess(boto_key):
 
     def _get_boto_bucket(self):
         # disable ssl (is_secure=False) because of this python bug:
-        # http://bugs.python.org/issue5103
+        # https://bugs.python.org/issue5103
         c = self.S3Connection(self.AWS_ACCESS_KEY_ID, self.AWS_SECRET_ACCESS_KEY, is_secure=False)
         return c.get_bucket(self.bucket, validate=False)
 
@@ -268,7 +268,7 @@ class FilesPipeline(MediaPipeline):
     def __init__(self, store_uri, download_func=None, settings=None):
         if not store_uri:
             raise NotConfigured
-
+
         if isinstance(settings, dict) or settings is None:
             settings = Settings(settings)
 
2 changes: 1 addition & 1 deletion scrapy/signalmanager.py
@@ -55,7 +55,7 @@ def send_catch_log_deferred(self, signal, **kwargs):
         The keyword arguments are passed to the signal handlers (connected
         through the :meth:`connect` method).
-        .. _deferreds: http://twistedmatrix.com/documents/current/core/howto/defer.html
+        .. _deferreds: https://twistedmatrix.com/documents/current/core/howto/defer.html
         """
         kwargs.setdefault('sender', self.sender)
         return _signal.send_catch_log_deferred(signal, **kwargs)
2 changes: 1 addition & 1 deletion scrapy/templates/project/module/items.py.tmpl
@@ -3,7 +3,7 @@
 # Define here the models for your scraped items
 #
 # See documentation in:
-# http://doc.scrapy.org/en/latest/topics/items.html
+# https://doc.scrapy.org/en/latest/topics/items.html
 
 import scrapy
 
2 changes: 1 addition & 1 deletion scrapy/templates/project/module/middlewares.py.tmpl
@@ -3,7 +3,7 @@
 # Define here the models for your spider middleware
 #
 # See documentation in:
-# http://doc.scrapy.org/en/latest/topics/spider-middleware.html
+# https://doc.scrapy.org/en/latest/topics/spider-middleware.html
 
 from scrapy import signals
 
2 changes: 1 addition & 1 deletion scrapy/templates/project/module/pipelines.py.tmpl
@@ -3,7 +3,7 @@
 # Define your item pipelines here
 #
 # Don't forget to add your pipeline to the ITEM_PIPELINES setting
-# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html
+# See: https://doc.scrapy.org/en/latest/topics/item-pipeline.html
 
 
 class ${ProjectName}Pipeline(object):