Using HttpWebRequest and HttpWebResponse in C#

2021-11-24 02:37:29
OfStack

This class is written specifically for HTTP GET and POST requests. It deals with encoding, certificates, automatic cookie handling, and similar problems.

C# HttpHelper: a helper class that makes real HTTP requests without worrying about encoding, certificates, or cookies, for web page crawling

1. The first trick: fetching a web page from its URL

Let's look at the code first

GET method


public static string GetUrltoHtml(string Url, string type)
{
 try
 {
  System.Net.WebRequest wReq = System.Net.WebRequest.Create(Url);
  // Get the response and read its body using the named encoding, e.g. "utf-8".
  using (System.Net.WebResponse wResp = wReq.GetResponse())
  using (System.IO.Stream respStream = wResp.GetResponseStream())
  using (System.IO.StreamReader reader = new System.IO.StreamReader(respStream, System.Text.Encoding.GetEncoding(type)))
  {
   return reader.ReadToEnd();
  }
 }
 catch (System.Exception)
 {
  // Any failure (network error, unknown encoding name) falls through to the empty result.
 }
 return "";
}

POST method


// Requires: using System.IO; using System.Net; using System.Text;
///<summary>
/// Sends a form-encoded POST request to an http or https URL and returns the response body
///</summary>
public string OpenReadWithHttps(string URL, string strPostdata, string strEncoding)
{
 // Encoding.Default is platform-dependent (the ANSI code page on .NET Framework).
 Encoding encoding = Encoding.Default;
 HttpWebRequest request = (HttpWebRequest)WebRequest.Create(URL);
 request.Method = "POST";
 request.Accept = "text/html, application/xhtml+xml, */*";
 request.ContentType = "application/x-www-form-urlencoded";
 byte[] buffer = encoding.GetBytes(strPostdata);
 request.ContentLength = buffer.Length;
 // Write the body and close the request stream before asking for the response.
 using (Stream requestStream = request.GetRequestStream())
 {
  requestStream.Write(buffer, 0, buffer.Length);
 }
 using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
 using (StreamReader reader = new StreamReader(response.GetResponseStream(), System.Text.Encoding.GetEncoding(strEncoding)))
 {
  return reader.ReadToEnd();
 }
}

This trick is the entry-level one. Its characteristics:

1. It is the simplest and most intuitive approach, an introductory lesson.

2. It suits plain-text pages that can be accessed without login or verification.

3. The data type obtained is an HTML document.

4. The request method is GET or POST.
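The POST method above takes its body as a ready-made string (strPostdata) but does not show how to build one. Below is a minimal sketch of one way to do it. The class and method names (FormBodySketch, BuildFormBody) are hypothetical and not part of the original helper. It percent-encodes each name and value with Uri.EscapeDataString, so the result is plain ASCII and produces the same bytes whichever encoding the POST method uses.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FormBodySketch
{
    // Hypothetical helper: joins name/value pairs into "a=1&b=2" form,
    // percent-encoding every name and value so the result is plain ASCII.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        StringBuilder body = new StringBuilder();
        foreach (KeyValuePair<string, string> field in fields)
        {
            if (body.Length > 0)
            {
                body.Append('&');
            }
            body.Append(Uri.EscapeDataString(field.Key));
            body.Append('=');
            // Treat a null value as an empty string.
            body.Append(Uri.EscapeDataString(field.Value ?? ""));
        }
        return body.ToString();
    }
}
```

The returned string can then be passed as the strPostdata argument of OpenReadWithHttps.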

2. The second trick: fetching a web page that requires certificate verification

Let's look at the code first

GET method


// Requires: using System; using System.IO; using System.Net; using System.Net.Security;
// using System.Security.Cryptography.X509Certificates; using System.Text; using System.Windows.Forms;

// Certificate validation callback
public bool CheckValidationResult(object sender, X509Certificate certificate, X509Chain chain, SslPolicyErrors errors)
{
 // Always accept. This skips server certificate validation entirely.
 return true;
}
/// <summary>
/// Takes a URL and returns the HTML of the page
/// </summary>
public string GetUrltoHtml(string Url)
{
 StringBuilder content = new StringBuilder();
 try
 {
  // This line must come before the request is created. It routes certificate validation through the callback.
  ServicePointManager.ServerCertificateValidationCallback = new System.Net.Security.RemoteCertificateValidationCallback(CheckValidationResult);
  // Create the HTTP request for the given URL
  HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Url);
  // Load the client certificate from a file
  X509Certificate objx509 = new X509Certificate(Application.StartupPath + "\\123.cer");
  // Add it to the request
  request.ClientCertificates.Add(objx509);
  // Get the response, then wrap its stream in a reader (UTF-8 character set)
  using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
  using (Stream responseStream = response.GetResponseStream())
  using (StreamReader sReader = new StreamReader(responseStream, Encoding.GetEncoding("utf-8")))
  {
   // Read the data in chunks of 256 characters
   Char[] sReaderBuffer = new Char[256];
   int count = sReader.Read(sReaderBuffer, 0, 256);
   while (count > 0)
   {
    content.Append(sReaderBuffer, 0, count);
    count = sReader.Read(sReaderBuffer, 0, 256);
   }
  }
 }
 catch (Exception)
 {
  content = new StringBuilder("Runtime Error");
 }
 return content.ToString();
}

POST method


// Certificate validation callback
public bool CheckValidationResult(object sender, X509Certificate certificate, X509Chain chain, SslPolicyErrors errors)
{
 // Always accept. This skips server certificate validation entirely.
 return true;
}
///<summary>
/// Sends a form-encoded POST request over HTTPS with a client certificate and returns the response body
///</summary>
public string OpenReadWithHttps(string URL, string strPostdata, string strEncoding)
{
 // This line must come before the request is created. It routes certificate validation through the callback.
 ServicePointManager.ServerCertificateValidationCallback = new System.Net.Security.RemoteCertificateValidationCallback(CheckValidationResult);
 Encoding encoding = Encoding.Default;
 HttpWebRequest request = (HttpWebRequest)WebRequest.Create(URL);
 // Load the client certificate from a file
 X509Certificate objx509 = new X509Certificate(Application.StartupPath + "\\123.cer");
 // Attach a cookie container so cookies from the response are collected
 request.CookieContainer = new CookieContainer();
 // Add the certificate to the request
 request.ClientCertificates.Add(objx509);
 request.Method = "POST";
 request.Accept = "text/html, application/xhtml+xml, */*";
 request.ContentType = "application/x-www-form-urlencoded";
 byte[] buffer = encoding.GetBytes(strPostdata);
 request.ContentLength = buffer.Length;
 // Write the body and close the request stream before asking for the response.
 using (Stream requestStream = request.GetRequestStream())
 {
  requestStream.Write(buffer, 0, buffer.Length);
 }
 using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
 using (StreamReader reader = new StreamReader(response.GetResponseStream(), System.Text.Encoding.GetEncoding(strEncoding)))
 {
  return reader.ReadToEnd();
 }
}

This trick gets you through the door: any page that requires certificate verification can be accessed this way. I use a certificate validation callback, so whether the certificate passes is decided on the client side, by a method we define ourselves. Some people will say they do not know how to validate it. It is actually very simple. We write the code ourselves, so why make it hard for ourselves? Just return true. Validation then always passes, and we can ignore the certificate altogether. Characteristics:

1. A small hurdle before you get started, a beginner's lesson.

2. It suits plain-text pages that need no login but require a certificate to access.

3. The data type obtained is an HTML document.

4. The request method is GET or POST.
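Since the callback is a method we define ourselves, it does not have to return true unconditionally. The following is only a sketch of a narrower variant, not the author's method: it accepts certificates the platform already considers valid, and otherwise accepts only a certificate whose thumbprint matches one we have pinned. The class name CertificateCheckSketch and the TrustedThumbprint field are hypothetical; the thumbprint value has to be supplied by you.

```csharp
using System;
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

public static class CertificateCheckSketch
{
    // Hypothetical setting: hex thumbprint (no spaces) of the one certificate
    // we are willing to trust even when normal validation reports errors.
    public static string TrustedThumbprint = "";

    public static bool CheckValidationResult(object sender, X509Certificate certificate, X509Chain chain, SslPolicyErrors errors)
    {
        // The platform found nothing wrong with the certificate: accept it.
        if (errors == SslPolicyErrors.None)
        {
            return true;
        }
        // There were errors and nothing is pinned (or no certificate was sent): reject.
        if (certificate == null || string.IsNullOrEmpty(TrustedThumbprint))
        {
            return false;
        }
        // Accept only if the certificate's hash matches the pinned thumbprint.
        return string.Equals(certificate.GetCertHashString(), TrustedThumbprint, StringComparison.OrdinalIgnoreCase);
    }
}
```

It has the same signature as CheckValidationResult above, so it can be assigned to ServicePointManager.ServerCertificateValidationCallback in the same way.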

3. The third trick: fetching a web page that requires login

Let's first analyze this type of page. Requiring login is really just another kind of verification. What is verified? Whether the client has logged in and holds the corresponding credentials, which for a login means the SessionID. Every page that requires login checks this. So how do we handle it? The first step is to get the data stored in the cookies, including the SessionID. How? There are many ways; it is easy to do with IE9 or the Firefox browser.

There is a separate example that crawls hao123 for mobile phone number attribution, which describes the IE9 procedure in detail.

Once we have the login cookie information, visiting the page again is very simple. To put it bluntly, we only need to send the locally saved cookie information along with the request.

Look at the code

GET method


/// <summary>
/// Takes a URL and returns the HTML of the page, sending the login cookies with the request
/// </summary>
public string GetUrltoHtml(string Url)
{
 StringBuilder content = new StringBuilder();
 try
 {
  // Create the HTTP request for the given URL
  HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Url);
  request.UserAgent = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; BOIE9;ZHCN)";
  request.Method = "GET";
  request.Accept = "*/*";
  // Set this if the server checks the referring page; otherwise it can be omitted
  request.Referer = "http://txw1958.cnblogs.com";
  // Placeholder names and values: replace them with the cookies captured from the browser.
  // Cookie names must not contain spaces.
  CookieContainer objcok = new CookieContainer();
  objcok.Add(new Uri("http://txw1958.cnblogs.com"), new Cookie("Key1", "Value1"));
  objcok.Add(new Uri("http://txw1958.cnblogs.com"), new Cookie("Key2", "Value2"));
  objcok.Add(new Uri("http://txw1958.cnblogs.com"), new Cookie("sidi_sessionid", "360A748941D055BEE8C960168C3D4233"));
  request.CookieContainer = objcok;
  // Keep the connection alive (set to false to close it after the response)
  request.KeepAlive = true;
  // Get the response, then wrap its stream in a reader (GB2312 character set)
  using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
  using (Stream responseStream = response.GetResponseStream())
  using (StreamReader sReader = new StreamReader(responseStream, Encoding.GetEncoding("gb2312")))
  {
   // Read the data in chunks of 256 characters
   Char[] sReaderBuffer = new Char[256];
   int count = sReader.Read(sReaderBuffer, 0, 256);
   while (count > 0)
   {
    content.Append(sReaderBuffer, 0, count);
    count = sReader.Read(sReaderBuffer, 0, 256);
   }
  }
 }
 catch (Exception)
 {
  content = new StringBuilder("Runtime Error");
 }
 return content.ToString();
}

POST method


///<summary>
/// Sends a form-encoded POST request that carries the login cookies and returns the response body
///</summary>
public string OpenReadWithHttps(string URL, string strPostdata)
{
 Encoding encoding = Encoding.Default;
 HttpWebRequest request = (HttpWebRequest)WebRequest.Create(URL);
 request.Method = "POST";
 request.Accept = "text/html, application/xhtml+xml, */*";
 request.ContentType = "application/x-www-form-urlencoded";
 // Placeholder names and values: replace them with the cookies captured from the browser.
 CookieContainer objcok = new CookieContainer();
 objcok.Add(new Uri("http://txw1958.cnblogs.com"), new Cookie("Key1", "Value1"));
 objcok.Add(new Uri("http://txw1958.cnblogs.com"), new Cookie("Key2", "Value2"));
 objcok.Add(new Uri("http://txw1958.cnblogs.com"), new Cookie("sidi_sessionid", "360A748941D055BEE8C960168C3D4233"));
 request.CookieContainer = objcok;
 byte[] buffer = encoding.GetBytes(strPostdata);
 request.ContentLength = buffer.Length;
 // Write the body and close the request stream before asking for the response.
 using (Stream requestStream = request.GetRequestStream())
 {
  requestStream.Write(buffer, 0, buffer.Length);
 }
 using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
 using (StreamReader reader = new StreamReader(response.GetResponseStream(), System.Text.Encoding.GetEncoding("utf-8")))
 {
  return reader.ReadToEnd();
 }
}

Characteristics:

1. Still fairly basic. Once you have mastered it, you can show off a little.

2. It suits pages that require login to access.

3. The data type obtained is an HTML document.

4. The request method is GET or POST.

To sum up, these are really all the basic skills. Going further is just a matter of combining them.

For example,

1. First log in with GET or POST, then take the cookies that come back and use them to visit the page and get the information. This is really just a combination of the skills above. It needs one extra step after the request: reading response.Cookies. This property is only filled in when a CookieContainer has been set on the request.

That is how you get the current cookies after your request, and you can pass them straight to the previous method. There we constructed the cookies by hand; here we can use the returned ones directly.
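The login-then-fetch combination can be sketched as follows. This is a sketch under assumptions, not the author's code: the class name CookieSessionSketch and its methods are hypothetical, and the body is sent as UTF-8. The idea is that every request shares one CookieContainer, so cookies set by the login response are stored and sent automatically with later requests to the same site.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public class CookieSessionSketch
{
    // One container for the whole session. Cookies set by any response are
    // stored here and sent with later requests to matching URLs.
    public readonly CookieContainer Cookies = new CookieContainer();

    // Create a request that uses the shared container.
    public HttpWebRequest CreateRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = Cookies;
        return request;
    }

    // POST a form body (for example the login form) and return the response text.
    public string Post(string url, string postData)
    {
        HttpWebRequest request = CreateRequest(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        byte[] buffer = Encoding.UTF8.GetBytes(postData);
        request.ContentLength = buffer.Length;
        using (Stream body = request.GetRequestStream())
        {
            body.Write(buffer, 0, buffer.Length);
        }
        return ReadBody(request);
    }

    // GET a page, sending whatever cookies the container holds for that URL.
    public string Get(string url)
    {
        return ReadBody(CreateRequest(url));
    }

    private static string ReadBody(HttpWebRequest request)
    {
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A typical use would be to call Post once with the login URL and credentials, then call Get for each page that requires the session.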

2. What if a page requires both login and certificate verification? This is also very simple: combine the methods above. The following code uses GET as an example; POST works the same way.


/// <summary>
/// Takes a URL and returns the HTML of the page, sending both a client certificate and the login cookies
/// </summary>
public string GetUrltoHtml(string Url)
{
 StringBuilder content = new StringBuilder();
 try
 {
  // This line must come before the request is created. It routes certificate validation through the callback.
  ServicePointManager.ServerCertificateValidationCallback = new System.Net.Security.RemoteCertificateValidationCallback(CheckValidationResult);
  // Create the HTTP request for the given URL
  HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Url);
  // Load the client certificate from a file
  X509Certificate objx509 = new X509Certificate(Application.StartupPath + "\\123.cer");
  // Add it to the request
  request.ClientCertificates.Add(objx509);
  // Placeholder names and values: replace them with the cookies captured from the browser.
  CookieContainer objcok = new CookieContainer();
  objcok.Add(new Uri("http://www.cnblogs.com"), new Cookie("Key1", "Value1"));
  objcok.Add(new Uri("http://www.cnblogs.com"), new Cookie("Key2", "Value2"));
  objcok.Add(new Uri("http://www.cnblogs.com"), new Cookie("sidi_sessionid", "360A748941D055BEE8C960168C3D4233"));
  request.CookieContainer = objcok;
  // Get the response, then wrap its stream in a reader (UTF-8 character set)
  using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
  using (Stream responseStream = response.GetResponseStream())
  using (StreamReader sReader = new StreamReader(responseStream, Encoding.GetEncoding("utf-8")))
  {
   // Read the data in chunks of 256 characters
   Char[] sReaderBuffer = new Char[256];
   int count = sReader.Read(sReaderBuffer, 0, 256);
   while (count > 0)
   {
    content.Append(sReaderBuffer, 0, count);
    count = sReader.Read(sReaderBuffer, 0, 256);
   }
  }
 }
 catch (Exception)
 {
  content = new StringBuilder("Runtime Error");
 }
 return content.ToString();
}

3. What should we do if a page verifies where the request came from? Some programmers anticipate that you may use a program to fetch their pages automatically, and they check the referring page to prevent it: a request that does not come from their own page or domain is rejected. Some check the source IP directly. For the referring-page check, the following line gets you in, because this address can be forged directly.


request.Referer = "https://www.ofstack.com";

It really is that simple, because this address can be modified directly. But if the server validates the actual source URL, this is not enough. We would have to modify the packets, which is somewhat difficult and is not discussed for now.

4. Some helper methods to go with these examples

Method for filtering HTML tags


/// <summary>
/// Strips HTML tags from a string
/// </summary>
public static string StripHTML(string stringToStrip)
{
 // Parse using Regex: paragraph boundaries become blank lines
 stringToStrip = Regex.Replace(stringToStrip, "</p(?:\\s*)>(?:\\s*)<p(?:\\s*)>", "\n\n", RegexOptions.IgnoreCase | RegexOptions.Compiled);
 // Line-break tags (<br>, <br/>, <br />) become newlines
 stringToStrip = Regex.Replace(stringToStrip, "<br(?:\\s*)/?>", "\n", RegexOptions.IgnoreCase | RegexOptions.Compiled);
 // Double quotes become two single quotes
 stringToStrip = Regex.Replace(stringToStrip, "\"", "''", RegexOptions.IgnoreCase | RegexOptions.Compiled);
 stringToStrip = StripHtmlXmlTags(stringToStrip);
 return stringToStrip;
}
private static string StripHtmlXmlTags(string content)
{
 // Remove every remaining tag
 return Regex.Replace(content, "<[^>]+>", "", RegexOptions.IgnoreCase | RegexOptions.Compiled);
}
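Tag stripping leaves HTML entities such as &amp;amp; in the text. As a sketch of how the same idea can be extended, the hypothetical helper below (HtmlTextSketch.ToPlainText, not part of the original class) applies similar replacements and then decodes entities with WebUtility.HtmlDecode. Unlike StripHTML above, it leaves double quotes unchanged.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTextSketch
{
    // Hypothetical helper: converts an HTML fragment to plain text.
    public static string ToPlainText(string html)
    {
        // A paragraph boundary becomes a blank line.
        string text = Regex.Replace(html, @"</p\s*>\s*<p\s*>", "\n\n", RegexOptions.IgnoreCase);
        // <br>, <br/> and <br /> become a single newline.
        text = Regex.Replace(text, @"<br\s*/?>", "\n", RegexOptions.IgnoreCase);
        // Remove every remaining tag.
        text = Regex.Replace(text, "<[^>]+>", "");
        // Turn entities such as &amp; back into the characters they stand for.
        return WebUtility.HtmlDecode(text);
    }
}
```

Like any regex-based approach, this is only suitable for simple markup; it does not handle script blocks, comments, or a ">" inside an attribute value.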

Methods for URL encoding and decoding


// Requires a reference to System.Web (using System.Web; using System.Text;)
#region URL encoding and decoding
public static string URLDecode(string text)
{
 return HttpUtility.UrlDecode(text, Encoding.Default);
}
public static string URLEncode(string text)
{
 return HttpUtility.UrlEncode(text, Encoding.Default);
}
#endregion

Here is a practical example: querying mobile phone number attribution through IP138. It already appeared in my previous article, and I include it again so that it is convenient to read. This area of technology is very interesting to study. I welcome your suggestions, and I believe there are better and more complete methods; this is offered only as a reference. Thank you for your support.

Here is the example


///<summary>
/// Sends a form-encoded POST request to an http or https URL and returns the response body
///</summary>
public string OpenReadWithHttps(string URL, string strPostdata, string strEncoding)
{
 Encoding encoding = Encoding.Default;
 HttpWebRequest request = (HttpWebRequest)WebRequest.Create(URL);
 request.Method = "POST";
 request.Accept = "text/html, application/xhtml+xml, */*";
 request.ContentType = "application/x-www-form-urlencoded";
 byte[] buffer = encoding.GetBytes(strPostdata);
 request.ContentLength = buffer.Length;
 // Write the body and close the request stream before asking for the response.
 using (Stream requestStream = request.GetRequestStream())
 {
  requestStream.Write(buffer, 0, buffer.Length);
 }
 using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
 using (StreamReader reader = new StreamReader(response.GetResponseStream(), System.Text.Encoding.GetEncoding(strEncoding)))
 {
  return reader.ReadToEnd();
 }
}

This example is not written all that well, and some places could be simplified. The interface can also return XML directly, but my focus here is to let a beginner see the methods and the ideas.

The fourth trick is access through Socket. Note that the listing given here is the same HttpWebRequest POST helper as above; it does not use the Socket class directly.


///<summary>
/// Sends a form-encoded POST request to an http or https URL and returns the response body
///</summary>
public string OpenReadWithHttps(string URL, string strPostdata, string strEncoding)
{
 Encoding encoding = Encoding.Default;
 HttpWebRequest request = (HttpWebRequest)WebRequest.Create(URL);
 request.Method = "POST";
 request.Accept = "text/html, application/xhtml+xml, */*";
 request.ContentType = "application/x-www-form-urlencoded";
 byte[] buffer = encoding.GetBytes(strPostdata);
 request.ContentLength = buffer.Length;
 // Write the body and close the request stream before asking for the response.
 using (Stream requestStream = request.GetRequestStream())
 {
  requestStream.Write(buffer, 0, buffer.Length);
 }
 using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
 using (StreamReader reader = new StreamReader(response.GetResponseStream(), System.Text.Encoding.GetEncoding(strEncoding)))
 {
  return reader.ReadToEnd();
 }
}
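For comparison, here is a minimal sketch of what access at the socket level might look like, using TcpClient. It is an illustration under assumptions, not the author's code: the class and method names are hypothetical, it speaks plain HTTP only (no TLS, so it will not work for https URLs), and it returns the raw response including the status line and headers, without decoding chunked bodies.

```csharp
using System.IO;
using System.Net.Sockets;
using System.Text;

public static class SocketHttpSketch
{
    // Build the raw text of a minimal HTTP/1.1 GET request.
    // "Connection: close" asks the server to close the connection when it is
    // done, which is what lets the reader below detect the end of the response.
    public static string BuildGetRequest(string host, string path)
    {
        return "GET " + path + " HTTP/1.1\r\n" +
               "Host: " + host + "\r\n" +
               "Accept: */*\r\n" +
               "Connection: close\r\n" +
               "\r\n";
    }

    // Send the request over a plain TCP connection and return everything the
    // server sends back: status line, headers, and body.
    public static string Get(string host, int port, string path)
    {
        using (TcpClient client = new TcpClient(host, port))
        using (NetworkStream stream = client.GetStream())
        {
            byte[] request = Encoding.ASCII.GetBytes(BuildGetRequest(host, path));
            stream.Write(request, 0, request.Length);
            using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

Working at this level means handling by hand everything HttpWebRequest does for you (redirects, cookies, chunked transfer, TLS), which is why the earlier tricks are usually the more practical choice.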