Getting a web page's source in a C# WinForms application

  • 2020-05-12 03:04:41
  • OfStack

In a C# WinForms application, you can obtain the source of a web page as follows.

First, import the namespaces:
using System.IO;
using System.Net;


WebClient MyWebClient = new WebClient();

// Authenticate requests with the current user's default network credentials.
MyWebClient.Credentials = CredentialCache.DefaultCredentials;
byte[] pageData = MyWebClient.DownloadData("http://www.baidu.com");
// To get the page as a string instead (requires using System.Text;):
//string pageHtml = Encoding.Default.GetString(pageData);
// The using block closes the file so the data is flushed to disk.
using (FileStream file = new FileStream("C:\\test.html", FileMode.Create))
{
  file.Write(pageData, 0, pageData.Length);
}
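
The commented-out line above hints at converting the downloaded bytes to a string. A minimal sketch of that step follows; `PageDecoder` and `DecodePage` are hypothetical names that do not appear in the original, and the sketch assumes the caller already knows the page's charset.

```csharp
using System;
using System.Text;

public static class PageDecoder
{
    // Hypothetical helper: converts downloaded bytes to text.
    // Assumption: charsetName matches the encoding the page was served in.
    public static string DecodePage(byte[] pageData, string charsetName)
    {
        if (pageData == null) throw new ArgumentNullException("pageData");
        // Throws ArgumentException if the charset name is not supported.
        Encoding encoding = Encoding.GetEncoding(charsetName);
        return encoding.GetString(pageData);
    }
}
```

With the snippet above, this could be called as `string pageHtml = PageDecoder.DecodePage(pageData, "utf-8");`. Passing the charset explicitly avoids depending on `Encoding.Default`, which varies with the machine's settings.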

The following examples show how to get the HTML source of a specified web page in C#. There are three ways to do it: WebClient, WebRequest, and HttpWebRequest.
You can also use the WebBrowser control; if you are interested, you can explore that on your own.

1. WebClient approach


private string GetWebClient(string url)
{
  // The using blocks dispose the client, stream, and reader even if the download throws.
  using (WebClient myWebClient = new WebClient())
  using (Stream myStream = myWebClient.OpenRead(url))
  using (StreamReader sr = new StreamReader(myStream, System.Text.Encoding.GetEncoding("utf-8")))
  {
    return sr.ReadToEnd();
  }
}

2. WebRequest approach


private string GetWebRequest(string url)
{
  Uri uri = new Uri(url);
  WebRequest myReq = WebRequest.Create(uri);
  // The using blocks close the response, stream, and reader even if reading throws.
  using (WebResponse result = myReq.GetResponse())
  using (Stream receiveStream = result.GetResponseStream())
  using (StreamReader readerOfStream = new StreamReader(receiveStream, System.Text.Encoding.GetEncoding("utf-8")))
  {
    return readerOfStream.ReadToEnd();
  }
}

3. HttpWebRequest approach


private string GetHttpWebRequest(string url)
{
  Uri uri = new Uri(url);
  HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create(uri);
  // Set only the header value here; the property supplies the "User-Agent" header name itself.
  myReq.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705)";
  myReq.Accept = "*/*";
  myReq.KeepAlive = true;
  myReq.Headers.Add("Accept-Language", "zh-cn,en-us;q=0.5");
  // The using blocks close the response, stream, and reader even if reading throws.
  using (HttpWebResponse result = (HttpWebResponse)myReq.GetResponse())
  using (Stream receiveStream = result.GetResponseStream())
  using (StreamReader readerOfStream = new StreamReader(receiveStream, System.Text.Encoding.GetEncoding("utf-8")))
  {
    return readerOfStream.ReadToEnd();
  }
}

Note: replace "utf-8" with the actual encoding of the page you request (for example, "gb2312"); otherwise the text may come out garbled.
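
One way to follow this note without hard-coding the name is to read the charset from the response's Content-Type header. The sketch below is only an illustration: `CharsetHelper` and `ResolveEncoding` are hypothetical names, and falling back to UTF-8 when no usable charset is present is an assumption, not something the original specifies.

```csharp
using System;
using System.Text;

public static class CharsetHelper
{
    // Hypothetical helper: extracts the charset from a Content-Type value such as
    // "text/html; charset=gb2312". Assumption: fall back to UTF-8 when the header
    // has no charset or names one this runtime does not support.
    public static Encoding ResolveEncoding(string contentType)
    {
        if (!string.IsNullOrEmpty(contentType))
        {
            foreach (string part in contentType.Split(';'))
            {
                string trimmed = part.Trim();
                if (trimmed.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
                {
                    string name = trimmed.Substring("charset=".Length).Trim('"', '\'', ' ');
                    try
                    {
                        return Encoding.GetEncoding(name);
                    }
                    catch (ArgumentException)
                    {
                        // Unknown or unsupported charset name: use the fallback below.
                    }
                }
            }
        }
        return Encoding.UTF8;
    }
}
```

In the WebRequest and HttpWebRequest examples, the result of `CharsetHelper.ResolveEncoding(result.ContentType)` could be passed to the StreamReader constructor in place of the fixed encoding. Pages that declare their charset only in an HTML meta tag are not covered by this sketch.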
Conclusion
The HttpWebRequest approach is the most complex, but it offers the most options.
Some websites check the client's User-Agent header. For example, 163.com returns an error page if you use the WebClient or WebRequest code above, which does not set a User-Agent.
The HttpWebRequest code does not have this problem because it sets the UserAgent property.
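
If the User-Agent check is the only obstacle, WebClient can also send that header through its Headers collection. The following is a minimal sketch, not part of the original tests: `BrowserLikeClient` and `Create` are hypothetical names, and whether a given site accepts a particular user-agent value is an assumption you would need to verify.

```csharp
using System.Net;

public static class BrowserLikeClient
{
    // Hypothetical factory: returns a WebClient that sends a User-Agent header.
    // Assumption: the caller supplies a value the target site accepts.
    public static WebClient Create(string userAgent)
    {
        WebClient client = new WebClient();
        client.Headers.Add("User-Agent", userAgent);
        return client;
    }
}
```

A client created this way could then be used like the one in section 1, for example by calling `OpenRead(url)` on it inside a using block.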
Test environment: Windows Server 2003 + Visual Studio 2005 + C# + WinForms

