Selenium + BeautifulSoup + json: Getting JSON Data from a Script Tag
- 2021-08-17 00:29:57
- OfStack
When crawling with Selenium, the data you need is sometimes wrapped as a JSON string inside a script tag.
Suppose the script tag contains the following:
<script id="DATA_INFO" type="application/json" >
{
"user": {
"isLogin": true,
"userInfo": {
"id": 123456,
"nickname": "LiMing",
"intro": " Life is too short, I use python"
}
}
}
</script>
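At this point, drive.find_elements_by_xpath('//*[@id="DATA_INFO"]') can locate the element, but the JSON data inside the script tag cannot be retrieved through the .text attribute, because .text returns only the rendered, visible text and a script tag's contents are never rendered. Instead, parse the page source with BeautifulSoup and load the tag's contents with json: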
from bs4 import BeautifulSoup
import json
# Get the source of the current page from Selenium
html = drive.page_source
# Parse the page source with BeautifulSoup (the 'lxml' parser requires the lxml package)
soup = BeautifulSoup(html, 'lxml')
# Get the full JSON text inside the script tag and load it into a dictionary
js_test = json.loads(soup.find("script", {"id": "DATA_INFO"}).get_text())
# Get the nickname value from the script tag's data
js_test["user"]["userInfo"]["nickname"]
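
The parsing step can also be tried without a browser. The following is a minimal sketch, not the original author's code: SAMPLE_HTML stands in for drive.page_source, extract_script_json is a hypothetical helper name, and the built-in "html.parser" is used in place of lxml so that no extra parser has to be installed. It reads the tag's contents via .string and returns None when the tag is missing.

```python
import json

from bs4 import BeautifulSoup

# Stand-in for drive.page_source (assumption: the page contains a tag like this).
SAMPLE_HTML = """
<html><body>
<script id="DATA_INFO" type="application/json">
{"user": {"isLogin": true, "userInfo": {"id": 123456, "nickname": "LiMing"}}}
</script>
</body></html>
"""


def extract_script_json(html, script_id):
    """Return the JSON inside <script id=script_id> as a dict, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("script", {"id": script_id})
    # tag.string is None when the tag is empty or has more than one child node.
    if tag is None or tag.string is None:
        return None
    return json.loads(tag.string)


data = extract_script_json(SAMPLE_HTML, "DATA_INFO")
print(data["user"]["userInfo"]["nickname"])  # prints LiMing
```

Checking for None before calling json.loads avoids an AttributeError on pages where the script tag has not been rendered yet or carries a different id.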