ASP.NET (C#): Implementation Principle and Code for Collecting Pages That Require Login

  • 2020-05-26 08:11:49
  • OfStack

A note first: the code snippets were found online and then modified by me. I think good things should be shared.

Implementation principle: when the site to be collected requires login, whether it tracks the login with a Cookie or a Session, the HTTP request we send must carry the Cookie information the site expects in its request headers (for a Session, this is the cookie that holds the session ID). When the site receives the request, it reads the Cookie or Session information from the headers and uses it to decide whether you are allowed to access the current page.
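
To make the principle concrete, here is a minimal sketch of the `Cookie` header value that a `CookieContainer` would attach to a request. It runs offline and sends nothing. The class and method names (`CookieHeaderDemo`, `BuildCookieHeader`) are made up for illustration; `CookieContainer.GetCookieHeader` is the standard .NET call that returns the header value for a given URL.

```csharp
using System;
using System.Net;

public static class CookieHeaderDemo
{
    // Returns the Cookie header value that a request to pageUrl would carry
    // if a cookie (name, value) had been stored for host.
    // Hypothetical helper, for illustration only.
    public static string BuildCookieHeader(string host, string pageUrl, string name, string value)
    {
        CookieContainer jar = new CookieContainer();
        jar.Add(new Uri(host), new Cookie(name, value));
        // Only cookies whose domain and path match pageUrl are included.
        return jar.GetCookieHeader(new Uri(pageUrl));
    }
}
```

For example, storing `username=youraccount` for `http://www.ibest100.com` and asking for the header of a page under that host should give back `username=youraccount`, while a URL on a different host gives an empty string. This is the information the server checks when it decides whether you are logged in.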

Once the principle is clear, the rest is easy. All we need to do is put the Cookie information into the HTTP request headers when collecting the page (that is, when HttpWebRequest submits the request).

Here are two ways to do it.
The first is to put the Cookie information directly into the CookieContainer of the HttpWebRequest. Here is the code:
 
// Required namespaces: System, System.Collections, System.IO, System.Net, System.Text
protected void Page_Load(object sender, EventArgs e)
{
    // Put the cookie values into a Hashtable
    Hashtable ht = new Hashtable();
    ht.Add("username", "youraccount");
    ht.Add("id", "yourid");
    this.Collect(ht);
}

public void Collect(Hashtable ht)
{
    string content = string.Empty; // request body; empty in this example
    // Replace with the URL of the page that requires login
    string url = "http://www.ibest100.com/";
    string host = "http://www.ibest100.com";
    try
    {
        // Get the bytes to submit
        byte[] bs = Encoding.UTF8.GetBytes(content);
        // Set the request parameters
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
        req.Method = "POST";
        req.ContentType = "application/json;charset=utf-8";
        req.ContentLength = bs.Length;
        // Put the cookies into a CookieContainer, then attach it to the HttpWebRequest
        CookieContainer cc = new CookieContainer();
        cc.Add(new Uri(host), new Cookie("username", ht["username"].ToString()));
        cc.Add(new Uri(host), new Cookie("id", ht["id"].ToString()));
        req.CookieContainer = cc;
        // Submit the request data
        using (Stream reqStream = req.GetRequestStream())
        {
            reqStream.Write(bs, 0, bs.Length);
        }
        // Read the returned page; this step is required and must not be skipped
        using (WebResponse wr = req.GetResponse())
        using (StreamReader reader = new StreamReader(wr.GetResponseStream(), Encoding.UTF8))
        {
            string t = reader.ReadToEnd();
            System.Web.HttpContext.Current.Response.Write(t);
        }
    }
    catch (Exception ex)
    {
        System.Web.HttpContext.Current.Response.Write("Exception in Collect: " + ex.Source + ":" + ex.Message);
    }
}

The second way: each time the collection program starts, it first simulates a login to the target site to obtain a populated CookieContainer, and then collects the page using that same container. Here is the code:
 
// Required namespaces: System, System.IO, System.Net, System.Text
protected void Page_Load(object sender, EventArgs e)
{
    try
    {
        CookieContainer cookieContainer = new CookieContainer();
        // Change the field names to match the login form of the target site
        string formatString = "username={0}&password={1}";
        // Escape the values so special characters survive form encoding
        string postString = string.Format(formatString,
            Uri.EscapeDataString("youradminaccount"), Uri.EscapeDataString("yourpassword"));
        // Convert the submitted string into a byte array
        byte[] postData = Encoding.UTF8.GetBytes(postString);
        // Set the request parameters; replace with the URL of the login page
        string URI = "http://www.ibest100.com/";
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(URI);
        request.Method = "POST";
        request.KeepAlive = false;
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookieContainer;
        request.ContentLength = postData.Length;
        // Submit the request data
        using (Stream outputStream = request.GetRequestStream())
        {
            outputStream.Write(postData, 0, postData.Length);
        }
        // Read the returned page; this step is required and must not be skipped.
        // The login cookies are stored in cookieContainer as a side effect.
        string srcString;
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.GetEncoding("gb2312")))
        {
            srcString = reader.ReadToEnd();
        }
        // Open the page you want to visit; replace with the URL of the page that requires login
        URI = "http://www.ibest100.com/";
        request = (HttpWebRequest)WebRequest.Create(URI);
        request.Method = "GET";
        request.KeepAlive = false;
        // Reuse the same container so the login cookies are sent
        request.CookieContainer = cookieContainer;
        // Read the returned page (use the encoding of the target site)
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.GetEncoding("gb2312")))
        {
            srcString = reader.ReadToEnd();
        }
        // Output the retrieved page, or process it further
        Response.Write(srcString);
    }
    catch (WebException we)
    {
        string msg = we.Message;
        Response.Write(msg);
    }
}
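
The second method works because the same CookieContainer is attached to both requests: cookies set by the login response are stored in it and sent back on the next request. The offline sketch below imitates that flow without touching the network. `CookieContainer.SetCookies` stands in for the login response's `Set-Cookie` header. The class name, method name, and the cookie name `sessionid` are assumptions for illustration, not something the target site is known to use.

```csharp
using System;
using System.Net;

public static class LoginCookieFlow
{
    // Simulates "log in, then request a protected page" with one shared container.
    // setCookieValue plays the role of the Set-Cookie header of the login response.
    // Hypothetical helper, for illustration only.
    public static string HeaderAfterLogin(string loginUrl, string setCookieValue, string targetUrl)
    {
        CookieContainer jar = new CookieContainer();
        // Step 1: the login response stores its cookies in the container.
        jar.SetCookies(new Uri(loginUrl), setCookieValue);
        // Step 2: a later request to targetUrl would carry this Cookie header.
        return jar.GetCookieHeader(new Uri(targetUrl));
    }
}
```

With a real HttpWebRequest both steps happen automatically once `request.CookieContainer` is set, which is why the code above never touches the cookies directly.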

Some people may ask: what if the site requires a captcha at login? In that case use the first method: log in manually in a browser, inspect the cookies the site has set, and put those into the request.
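
One possible way to do that, sketched under assumptions: copy the `Cookie` request header from the browser's developer tools after a manual login and turn it into a CookieContainer for the first method. The class and method names here (`BrowserCookieImport`, `FromHeader`) are hypothetical, and the parsing is deliberately simple: it splits on `;` and `=` and does not handle values that themselves contain commas or semicolons.

```csharp
using System;
using System.Net;

public static class BrowserCookieImport
{
    // Converts a header string such as "username=youraccount; id=yourid"
    // (copied from the browser) into a CookieContainer scoped to host.
    // Hypothetical helper, for illustration only.
    public static CookieContainer FromHeader(string cookieHeader, string host)
    {
        CookieContainer jar = new CookieContainer();
        Uri hostUri = new Uri(host);
        foreach (string part in cookieHeader.Split(';'))
        {
            string pair = part.Trim();
            int eq = pair.IndexOf('=');
            if (eq <= 0) continue; // skip empty or malformed pieces
            string name = pair.Substring(0, eq).Trim();
            string value = pair.Substring(eq + 1).Trim();
            jar.Add(hostUri, new Cookie(name, value));
        }
        return jar;
    }
}
```

The returned container can be assigned to `req.CookieContainer` in place of the hand-built one in the first method. The copied cookies stop working once the session expires on the server, so the manual login has to be repeated from time to time.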

Applications: data collection, posting to forums, posting to blogs.
