Scraping email addresses from web pages in Java
- 2020-04-01 03:33:46
- OfStack
This article shows an example of how a Java program can scrape email addresses from a web page. It reads the page line by line and prints every substring that matches an email regular expression. The implementation is as follows:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class h1
{
public static String getWebCon(String domain)
{
System.out.println("Start scraping email addresses... (" + domain + ")");
StringBuffer sb=new StringBuffer();
try
{
java.net.URL url=new java.net.URL(domain);
BufferedReader in=new BufferedReader(new InputStreamReader(url.openStream()));
String line;
while((line=in.readLine())!=null)
{
parse(line);
}
in.close();
}
catch(Exception e)
{
sb.append(e.toString());
System.err.println(e);
}
return sb.toString();
}
public static void main(String[] args)
{
String s;
s=h1.getWebCon("http://post.baidu.com/f?kz=34942387"); // The web page to scrape; substitute any URL you want to try.
//System.out.println(s);
}
private static void parse(String line)
{
Pattern p=Pattern.compile("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+"); // Regular expression for email addresses (backslashes doubled for the Java string literal)
Matcher m=p.matcher(line);
while(m.find())
{
System.out.println(m.group());
}
}
}
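The listing above compiles the pattern again for every line and prints the same address each time it appears. As a minimal sketch of one possible variation, the matching step can be separated from the network code, with the pattern compiled once and the results collected into a set so that duplicates are dropped. The class name `EmailExtractor`, the method name `extract`, and the sample text are illustrative assumptions, not part of the original program, and the pattern remains a loose match rather than a full address validator.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for illustration.
public class EmailExtractor {
    // Compiled once and reused. Same idea as the pattern in parse(),
    // written without the nested character class.
    private static final Pattern EMAIL = Pattern.compile("[\\w.-]+@[\\w.-]+\\.\\w+");

    // Returns the distinct addresses found in text, in order of first appearance.
    public static Set<String> extract(String text) {
        Set<String> found = new LinkedHashSet<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample input stands in for a line read from a web page.
        String sample = "Contact alice@example.com or bob.smith@mail.example.org; "
                + "alice@example.com again.";
        for (String address : extract(sample)) {
            System.out.println(address);
        }
    }
}
```

In the scraper, each line read from the page could be passed to `extract` and the returned sets merged. Keeping the matching logic free of network access also makes it easy to check against fixed strings.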
I hope this article helps with your Java programming.