python gets the code for how to encode web pages
- 2020-05-27 05:56:35
- OfStack
python gets the code for the implementation of web page encoding
<span style="font-family: Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255);">
</span><span style="font-family: Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255);">
python Development, automatic access to web code used chardet Library, character set detection, this class in python2.7 No, it needs to be downloaded from the official website.
I'll download it here chardet-2.3.0.tar.gz Compress the package file, only need to decompress the compressed package file chardet Files in python Install under the package
python27/lib/site-packages/ Down, that's it. </span>
Then import chardet
An automated detection function is written below to detect Url connections and return the encoding of url for the web page.
import chardet # Character set detection
import urllib
url="http://www.jd.com"
def automatic_detect(url):
content=urllib.urlopen(url).read()
result=chardet.detect(content)
encoding=result['encoding']
return encoding
urls=['http://www.baidu.com','http://www.163.com','http://dangdang.com']
for url in urls:
print url,automatic_detect(url)
The detect method of the chardet class is used above, the dictionary is returned, and the encoding encoding is pulled out
Thank you for reading, I hope to help you, thank you for your support of this site!