A regex-based automatic web scraper ("thief") program

  • 2020-05-16 06:46:24
  • OfStack

There is one defect I have not had time to fix, but the program does what it is meant to do. Let's look at how the regular expression is written:
URL: http://news.szhome.com/83642.html
Content:
 
object></div></div> 
</div> 

<div class="share"><div class="linkshare" style="right: 0"> 

The goal is to capture the code between these two markers. The problem with the END marker was resolved, but the START marker was the sticking point, because there is a line feed between its second DIV and the third. I was at a loss and did not know how to handle that in the regex.
What made it more frustrating is that the page contains many such repetitive markers, and I am not very familiar with regular expressions. My solution is as follows:

 
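// Lazily capture everything between the txtmsg DIV and the share DIV, line feeds included.
MatchCollection mc = Regex.Matches(ghoPage.Trim(), @"(?<=<div class=['""]txtmsg['""]>)[\s\S]*?(?=<div class=['""]share['""]><div class=)", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase); 
foreach (Match mm in mc) 
{ 
    // Use the current match (the original always read mc[0]) and skip the
    // two leading Flash ad DIVs, measured at 1933 characters in total.
    if (mm.Value.Length > 1933)
        sb.Append(mm.Value.Substring(1933)); 
} 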

I measured the two extra Flash advertising DIVs that get captured at the start of the match: together they are 1933 characters long. I then cut them off with ordinary string processing to get the article text. The drawback of this approach is that as soon as the length of the two Flash advertising DIVs changes, the data I obtain is no longer complete. If you are interested, look into how the regex could handle the line feed between the DIVs.
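One way to handle the line feed between the DIVs, sketched here under assumptions rather than taken from the original program, is to put `\s*` between the tags of the START marker so that spaces and line feeds are absorbed. The class name `NewlineTolerantScrape`, the method `ExtractBody`, and the sample page are hypothetical. The sketch also assumes the `object></div></div></div>` sequence appears only once, right before the article text, and it relies on .NET allowing variable-length lookbehind.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NewlineTolerantScrape
{
    // \s* between the closing tags absorbs spaces, tabs, and line feeds,
    // so the START marker matches whether or not the DIVs share a line.
    private const string Pattern =
        @"(?<=object>\s*</div>\s*</div>\s*</div>)[\s\S]*?(?=<div class=['""]share['""]>\s*<div class=)";

    // Returns the text between the START and END markers, or an empty
    // string when the markers are not found.
    public static string ExtractBody(string page)
    {
        Match m = Regex.Match(page, Pattern,
            RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);
        return m.Success ? m.Value.Trim() : string.Empty;
    }

    public static void Main()
    {
        // Hypothetical sample page shaped like the fragment quoted in the article.
        string page =
            "<div class=\"txtmsg\"><div><div><object>ad</object></div></div> \n" +
            "</div> \n" +
            "<p>Article text.</p>\n" +
            "<div class=\"share\"><div class=\"linkshare\" style=\"right: 0\">";
        Console.WriteLine(ExtractBody(page));
    }
}
```

Because the match now starts after the ad markup itself, the result would no longer depend on the hard-coded length of 1933, provided the assumption about the marker holds for the real pages.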
For my own use I wired the scraper to a BUTTON control: after a click the button is disabled to prevent repeated clicks, and then a few checks are run. It works quite well. There are also ways to keep it fetching continuously. I have not built it as a Windows service because I do not write that type of program often, but it could be turned into one: write the rules to an INI file, keeping the scraping rules and regular expressions in the configuration file, and automatic collection becomes possible.
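The idea of keeping the rules and regular expressions in an INI file might look roughly like the sketch below. It is not the original program's code: the class `RuleFile`, the section name `szhome`, and the keys `url` and `pattern` are all hypothetical, and the parser only understands a minimal INI dialect (section headers, `key=value` lines, and `;` comments).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RuleFile
{
    // Parses a minimal INI dialect: [section] headers, key=value pairs,
    // and ';' comment lines. Section and key names are case-insensitive.
    public static Dictionary<string, Dictionary<string, string>> Parse(string ini)
    {
        var sections = new Dictionary<string, Dictionary<string, string>>(
            StringComparer.OrdinalIgnoreCase);
        Dictionary<string, string> current = null;
        foreach (string raw in ini.Split('\n'))
        {
            string line = raw.Trim();
            if (line.Length == 0 || line[0] == ';') continue;
            if (line[0] == '[' && line[line.Length - 1] == ']')
            {
                current = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
                sections[line.Substring(1, line.Length - 2).Trim()] = current;
                continue;
            }
            // Split at the first '=' only, since a regex value may contain '='.
            int eq = line.IndexOf('=');
            if (eq > 0 && current != null)
                current[line.Substring(0, eq).Trim()] = line.Substring(eq + 1).Trim();
        }
        return sections;
    }

    public static void Main()
    {
        // Hypothetical rule file: one section per site.
        string ini =
            "; one section per site\n" +
            "[szhome]\n" +
            "url=http://news.szhome.com/83642.html\n" +
            "pattern=(?<=<div class=\"txtmsg\">)[\\s\\S]*?(?=<div class=\"share\">)\n";
        Dictionary<string, string> rule = Parse(ini)["szhome"];

        // Stand-in for a downloaded page; a real run would fetch rule["url"].
        string page = "<div class=\"txtmsg\">Hello</div><div class=\"share\">";
        Console.WriteLine(Regex.Match(page, rule["pattern"]).Value);
    }
}
```

With the patterns stored this way, a change in a site's markup would only require editing the rule file, not recompiling the program.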

The code is very short; anyone interested in this kind of scraping can give it a try.
