Solution for 403 error in PHP Curl

  • 2021-06-28 11:43:53
  • OfStack

I have a small PHP application of my own that uses curl to fetch web pages and process them. To make it easier to get around the firewall, I run Privoxy as a proxy and let it decide which sites go through the proxy and which do not. Today, however, I ran into a strange problem: requests to google and baidu returned 403 errors, while other websites worked fine. With the proxy turned off, both could be reached normally.

Do google and baidu refuse connections made through a proxy? That seemed impossible, so I turned on curl's verbose output (curl_setopt($this->mSh, CURLOPT_VERBOSE, 1);) and got the following:


*   Trying 127.0.0.1... * connected
* Connected to 127.0.0.1 (127.0.0.1) port 8118 (#0)
* Establish HTTP proxy tunnel to www.baidu.com:80
> CONNECT www.baidu.com:80 HTTP/1.0
Host: www.baidu.com:80
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Proxy-Connection: Keep-Alive
< HTTP/1.0 403 Connection not allowable
< X-Hint: If you read this message interactively, then you know why this happens ,-)
< 
* The requested URL returned error: 403
* Received HTTP code 403 from proxy after CONNECT
* Closing connection #0
... Failed.
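
A trace like the one above goes to STDERR by default, which is easy to miss in a web application. As a minimal sketch, assuming you would rather keep it in memory, the verbose output can be redirected with CURLOPT_STDERR. The helper names attach_verbose_log() and read_verbose_log() are hypothetical and not part of the original code.

```php
<?php
// Sketch: capture curl's verbose trace in an in-memory stream.
// attach_verbose_log() and read_verbose_log() are hypothetical helper names.
function attach_verbose_log($ch)
{
    $log = fopen('php://temp', 'w+');        // temporary read/write stream
    curl_setopt($ch, CURLOPT_VERBOSE, true); // same switch as in the article
    curl_setopt($ch, CURLOPT_STDERR, $log);  // send the trace to the stream, not STDERR
    return $log;
}

function read_verbose_log($log): string
{
    rewind($log);                            // go back to the start before reading
    return (string) stream_get_contents($log);
}

$ch  = curl_init('http://www.baidu.com/');
$log = attach_verbose_log($ch);
// curl_exec($ch);               // run the request, then inspect the trace:
// echo read_verbose_log($log);
```

Reading the log after curl_exec() gives the same lines shown above, including the "Received HTTP code 403 from proxy after CONNECT" message.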

You can see that the proxy server itself is reachable. The request to baidu really did end in a 403 (the trace reports it as received from the proxy after the CONNECT), but the cause had to be on my side. I finally got a hint from the Internet (1of2, 2of2): I was using proxytunnel rather than a plain proxy.

The code contained these two lines:


 curl_setopt($this->mSh, CURLOPT_HTTPPROXYTUNNEL, true);
 curl_setopt($this->mSh, CURLOPT_PROXY, $phost);

The PHP documentation does not describe these options in detail, but man curl does: both are proxy modes. proxytunnel (the -p option) lets other protocols be tunneled through the HTTP proxy, while proxy (the -x option) speaks only plain HTTP. So my guess is that curl's proxytunnel does not get along with requests to google and baidu, and the 403 is the result.

After disabling the first of the two lines above, curl access returned to normal.
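
The fix can be sketched as an options array with tunneling off by default. The function name proxy_options() and its parameters are hypothetical, and the address 127.0.0.1:8118 is simply the Privoxy address that appears in the trace above.

```php
<?php
// Sketch: build proxy options with tunneling off by default.
// proxy_options() is a hypothetical helper, not the author's code.
function proxy_options(string $proxy, bool $tunnel = false): array
{
    return [
        CURLOPT_PROXY           => $proxy,   // e.g. Privoxy on 127.0.0.1:8118
        CURLOPT_HTTPPROXYTUNNEL => $tunnel,  // false: ordinary proxied request, no CONNECT for http://
        CURLOPT_RETURNTRANSFER  => true,     // return the page instead of printing it
    ];
}

$ch = curl_init('http://www.baidu.com/');
curl_setopt_array($ch, proxy_options('127.0.0.1:8118'));
// $html = curl_exec($ch);
```

Keeping the tunnel flag as a parameter makes it easy to switch it back on for the cases that need it. Note that for https:// URLs libcurl has to use CONNECT through an HTTP proxy regardless of this flag.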

Strangely, the behavior differs between operating systems. Only on Mac OS X was it necessary to explicitly disable proxytunnel. The curl version there:


$ curl --version
curl 7.16.3 (powerpc-apple-darwin9.0) libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
Protocols: tftp ftp telnet dict ldap http file https ftps 
Features: GSS-Negotiate IPv6 Largefile NTLM SSL libz 

Ubuntu, on the other hand, was completely unaffected and worked either way. Its curl version:

$ curl --version
curl 7.18.2 (i486-pc-linux-gnu) libcurl/7.18.2 OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.10
Protocols: tftp ftp telnet dict ldap ldaps http file https ftps 
Features: GSS-Negotiate IDN IPv6 Largefile NTLM SSL libz 

CentOS on the MT host was fine as well. Its curl version:

$ curl --version
curl 7.15.5 (i686-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5
Protocols: tftp ftp telnet dict ldap http file https ftps 
Features: GSS-Negotiate IDN IPv6 Largefile NTLM SSL libz 

So it does not seem to be purely a curl version issue. Mac OS X really is different.
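
The versions above come from the command-line tool. The libcurl that PHP is linked against can be a different build, so when comparing machines it may help to check it from PHP as well. A minimal sketch using curl_version():

```php
<?php
// Sketch: print the libcurl build that the PHP curl extension uses.
$info = curl_version();

printf("libcurl %s (%s)\n", $info['version'], $info['host']);
printf("SSL: %s, zlib: %s\n", $info['ssl_version'], $info['libz_version']);
printf("Protocols: %s\n", implode(' ', $info['protocols']));
```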

There is one more reason curl can return a 403 error. If you set:


 curl_setopt($ch, CURLOPT_NOBODY, true);

you need to follow it with:

 curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');

Otherwise a 403 error is returned, because CURLOPT_NOBODY makes curl send a HEAD request and the HTTP server does not allow the HEAD command. Reference: Trouble with a cURL request in PHP (http://forums.devshed.com/php-development-5/trouble-with-a-curl-request-in-php-445222.html). This is also part of why curl on Mac OS X stood out.
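
Put together, the two settings can be sketched as follows. The function name headers_only_options() and the $forceGet parameter are hypothetical. CURLOPT_HEADER and CURLOPT_RETURNTRANSFER are added here on the assumption that you want the response headers back as a string.

```php
<?php
// Sketch: skip the response body, but send GET instead of HEAD as the method.
// headers_only_options() is a hypothetical helper, not the author's code.
function headers_only_options(bool $forceGet = true): array
{
    $opts = [
        CURLOPT_NOBODY         => true,  // no body download; curl would send HEAD
        CURLOPT_HEADER         => true,  // include response headers in the output
        CURLOPT_RETURNTRANSFER => true,  // return the output as a string
    ];
    if ($forceGet) {
        // Set after CURLOPT_NOBODY, as in the article, so the method sent is GET.
        $opts[CURLOPT_CUSTOMREQUEST] = 'GET';
    }
    return $opts;
}

$ch = curl_init('http://www.baidu.com/');
curl_setopt_array($ch, headers_only_options());
// $headers = curl_exec($ch);
```

curl_setopt_array() applies the options in array order, so CURLOPT_CUSTOMREQUEST is set after CURLOPT_NOBODY here, matching the order of the two lines above.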

