Using file_get_contents to fetch data while imitating a browser's User-Agent header

  • 2020-06-22 23:57:21
  • OfStack

What is a user agent
The User Agent, UA for short, is a special request header string that lets the server identify the client's operating system and version, CPU type, browser and version, rendering engine, browser language, browser plug-ins, and so on.
Websites can serve different pages depending on the UA, for example showing one page to mobile visitors and another to PC visitors.
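As a rough illustration of how a site might tell mobile and PC visitors apart, here is a minimal sketch. The function name isMobileUserAgent and the keyword list are assumptions made for this example, not something a particular site is known to use, and real detection is usually more involved.

```php
<?php
// Hypothetical helper: returns true when the UA string looks like a mobile browser.
// The keyword list is illustrative only, not exhaustive.
function isMobileUserAgent(string $userAgent): bool
{
    return preg_match('/Mobile|Android|iPhone|iPad/i', $userAgent) === 1;
}

// On a web server the UA sent by the client is available in $_SERVER.
// It may be missing (for example on the command line), so fall back to ''.
$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
echo isMobileUserAgent($ua) ? "mobile page\n" : "desktop page\n";
```

A request with no UA at all falls into the "desktop" branch here; a stricter site could just as easily refuse such a request.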
When PHP fetches a web page with the file_get_contents function, sometimes the page is clearly visible in a browser, yet nothing is retrieved.
This is most likely because the server checks the User-Agent header to decide whether a request comes from a normal browser, and PHP's file_get_contents does not send a UA by default.
To scrape such a site, PHP has to impersonate a browser by sending a UA, so that the site returns its normal content.

The implementation is as follows:
ini_set('user_agent','Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; 4399Box.560; .NET4.0C; .NET4.0E)');
This UA imitates IE8, but you can switch to another browser's UA, such as Firefox's.
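For example, switching to a Firefox UA could look like the sketch below. The UA string is an assumed example for Firefox on Windows, and the URL in the commented-out line is only a placeholder.

```php
<?php
// Assumed example UA string for Firefox on Windows; any browser's UA string would do.
$firefoxUa = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0';

// Set the UA for all later HTTP requests made through PHP's stream wrappers.
ini_set('user_agent', $firefoxUa);

// Confirm which UA will now be sent.
echo ini_get('user_agent'), "\n";

// Placeholder URL; uncomment to send a real request with the UA above.
// $html = file_get_contents('http://example.com/');
```

Because ini_set changes the setting for the rest of the script, every subsequent file_get_contents call over HTTP sends the same UA.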
You can also write it like this:


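$opts = array(
    'http' => array(
        'method' => "GET",
        // Every header line must end with \r\n, including the User-Agent line
        'header' => "Host: zh.wikipedia.org\r\n" .
                    "Accept-Language: zh-cn\r\n" .
                    "User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; 4399Box.560; .NET4.0C; .NET4.0E)\r\n" .
                    "Accept: */*\r\n"
    )
);
// Build a stream context from the options and pass it as the third argument
$context = stream_context_create($opts);
$html = file_get_contents('http://zh.wikipedia.org/', false, $context);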
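If the same headers are needed for several requests, the options can be wrapped in a small helper. The following is only a sketch: the names buildBrowserContext and fetchAsBrowser, the 10-second timeout, and the default language are assumptions made for this example.

```php
<?php
// Hypothetical helper: builds a stream context that sends the given User-Agent.
function buildBrowserContext(string $userAgent, string $language = 'zh-cn')
{
    $headers = [
        "Accept-Language: $language",
        "User-Agent: $userAgent",
        "Accept: */*",
    ];

    return stream_context_create([
        'http' => [
            'method'  => 'GET',
            // Header lines are separated and terminated by \r\n.
            'header'  => implode("\r\n", $headers) . "\r\n",
            // Assumed value: give up after 10 seconds.
            'timeout' => 10,
        ],
    ]);
}

// Hypothetical wrapper: returns the response body, or null if the request fails.
function fetchAsBrowser(string $url, string $userAgent): ?string
{
    // @ suppresses the warning file_get_contents raises on failure;
    // the false return value is checked instead.
    $body = @file_get_contents($url, false, buildBrowserContext($userAgent));
    return $body === false ? null : $body;
}
```

Unlike ini_set, a context only affects the calls it is passed to, which is useful when different sites need different UAs within one script.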

