Python: masquerading as HTTP/1.1 when crawling with Scrapy
- 2020-05-07 19:57:15
- OfStack
This example shows how to make Scrapy present itself as an HTTP/1.1 client when crawling with Python. It is shared here for your reference. The details are as follows:
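Add the following line to your project's settings.py file. The setting names the client factory class used by Scrapy's HTTP/1.0 download handler, so the dotted path must match the module where the class is saved: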
DOWNLOADER_HTTPCLIENTFACTORY = 'myproject.downloader.HTTPClientFactory'
Save the following code to a separate .py file. To match the setting above, it should live at myproject/downloader.py:
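from scrapy.core.downloader.webclient import ScrapyHTTPClientFactory, ScrapyHTTPPageGetter

class PageGetter(ScrapyHTTPPageGetter):
    # Override the request line so it advertises HTTP/1.1 instead of HTTP/1.0.
    def sendCommand(self, command, path):
        # On Python 3, command and path arrive as bytes, so format with a bytes literal.
        self.transport.write(b'%s %s HTTP/1.1\r\n' % (command, path))

class HTTPClientFactory(ScrapyHTTPClientFactory):
    # Have the factory build connections with the customized protocol class.
    protocol = PageGetter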
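The override above changes only the first line of each request (the request line); header handling and response parsing are inherited unchanged. To see exactly what is written to the transport, here is a minimal standalone sketch that needs neither Scrapy nor Twisted. The helper name build_request_line and its version parameter are hypothetical, introduced only for illustration, and are not part of either library.

```python
def build_request_line(command: bytes, path: bytes, version: bytes = b"HTTP/1.1") -> bytes:
    """Return an HTTP request line (method, path, version) terminated by CRLF.

    Hypothetical helper: it mirrors the bytes formatting used in sendCommand,
    but is not a Scrapy or Twisted API.
    """
    return b"%s %s %s\r\n" % (command, path, version)


if __name__ == "__main__":
    # The line the customized PageGetter would send for GET /index.html.
    print(build_request_line(b"GET", b"/index.html"))
    # For comparison, the same request line with the HTTP/1.0 version token.
    print(build_request_line(b"GET", b"/index.html", version=b"HTTP/1.0"))
```

Bytes formatting with % requires Python 3.5 or later; on Python 2 the same expression works because b'' literals are ordinary strings.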
I hope this article is helpful for your Python programming.