Fixing connection timeouts when using Jsoup in Java

  • 2020-04-01 02:24:51
  • OfStack

Today I worked on a project that uses Jsoup to parse a website. Connecting with Jsoup.connect(url).get() occasionally threw the exception
java.net.SocketTimeoutException: Read timed out.
The cause is that the default socket timeout is relatively short and some sites are slow to respond,
so the read times out before the response arrives.
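
The stack trace later in this article shows jsoup reading the response through the JDK's HttpURLConnection, so the same failure can be reproduced with the JDK alone. The following is a minimal sketch, not taken from the original project: the class name ReadTimeoutDemo and the method readTimesOut are made up for illustration, and the "slow site" is simulated by a local server socket that accepts connections but never sends a response.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;

public class ReadTimeoutDemo {

    // Hypothetical helper: returns true if a request to a server that never
    // answers fails with SocketTimeoutException once the timeout expires.
    static boolean readTimesOut(int timeoutMillis) {
        // Port 0 asks the OS for any free port. The OS completes the TCP
        // handshake, but this code never accepts or answers the request.
        try (ServerSocket silentServer = new ServerSocket(0)) {
            URL url = URI.create("http://127.0.0.1:" + silentServer.getLocalPort() + "/").toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis); // limit on waiting for response data
            try (InputStream in = conn.getInputStream()) {
                in.read();
                return false; // a response arrived in time
            } catch (SocketTimeoutException e) {
                return true;  // no data arrived within timeoutMillis
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        boolean timedOut = readTimesOut(500);
        System.out.println("Timed out: " + timedOut
                + ", Time is:" + (System.currentTimeMillis() - start) + "ms");
    }
}
```

A real site that answers slowly behaves the same way from the client's point of view: whether the exception appears depends only on whether the response arrives before the read timeout runs out.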

The solution:

Set a timeout when connecting:
doc = Jsoup.connect(url).timeout(5000).get();
Here 5000 sets the timeout to 5000 ms, i.e. 5 seconds.

The test code is as follows:
1. When the timeout is not set:


package jsoupTest;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class JsoupTest {
    public static void main(String[] args) {
        // jsoup needs a full URL including the scheme.
        String url = "http://www.jb51.net";
        long start = System.currentTimeMillis();
        Document doc = null;
        try {
            // No timeout() call, so jsoup's default timeout applies.
            doc = Jsoup.connect(url).get();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            System.out.println("Time is:" + (System.currentTimeMillis() - start) + "ms");
        }
        // If get() threw, doc is still null and this line throws NullPointerException.
        Elements elem = doc.getElementsByTag("title");
        System.out.println("Title is:" + elem.text());
    }
}

Sometimes a timeout occurs:
java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.io.BufferedInputStream.fill(Unknown Source)
    at java.io.BufferedInputStream.read1(Unknown Source)
    at java.io.BufferedInputStream.read(Unknown Source)
    at sun.net.www.http.ChunkedInputStream.fastRead(Unknown Source)
    at sun.net.www.http.ChunkedInputStream.read(Unknown Source)
    at java.io.FilterInputStream.read(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(Unknown Source)
    at java.util.zip.InflaterInputStream.fill(Unknown Source)
    at java.util.zip.InflaterInputStream.read(Unknown Source)
    at java.util.zip.GZIPInputStream.read(Unknown Source)
    at java.io.BufferedInputStream.read1(Unknown Source)
    at java.io.BufferedInputStream.read(Unknown Source)
    at java.io.FilterInputStream.read(Unknown Source)
    at org.jsoup.helper.DataUtil.readToByteBuffer(DataUtil.java:113)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:447)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:393)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:159)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:148)
    at jsoupTest.JsoupTest.main(JsoupTest.java:17)
Time is:3885ms
Exception in thread "main" java.lang.NullPointerException
    at jsoupTest.JsoupTest.main(JsoupTest.java:25)

2. When the timeout is set, the request usually does not time out:


package jsoupTest;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class JsoupTest {
    public static void main(String[] args) {
        // jsoup needs a full URL including the scheme.
        String url = "http://www.jb51.net";
        long start = System.currentTimeMillis();
        Document doc = null;
        try {
            // Wait up to 5000 ms (5 s) before giving up.
            doc = Jsoup.connect(url).timeout(5000).get();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            System.out.println("Time is:" + (System.currentTimeMillis() - start) + "ms");
        }
        if (doc == null) {
            return; // the request still failed; the stack trace was printed above
        }
        Elements elem = doc.getElementsByTag("title");
        System.out.println("Title is:" + elem.text());
    }
}
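
Since a 5-second timeout only "usually" avoids the exception, one possible follow-up is to retry with a longer timeout when a read still times out. The sketch below is an illustration under assumptions, not part of the original test: the names TimeoutRetry, Attempt and withGrowingTimeout are hypothetical, and doubling the timeout on each retry is just one simple policy. The slow site is simulated by a lambda so the example runs without a network.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.SocketTimeoutException;

public class TimeoutRetry {

    /** Hypothetical callback: performs one fetch attempt with the given timeout. */
    interface Attempt<T> {
        T run(int timeoutMillis) throws IOException;
    }

    /** Runs the attempt up to maxAttempts times, doubling the timeout after each read timeout. */
    static <T> T withGrowingTimeout(Attempt<T> attempt, int firstTimeoutMillis, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        int timeout = firstTimeoutMillis;
        SocketTimeoutException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return attempt.run(timeout);
            } catch (SocketTimeoutException e) {
                last = e;      // remember the failure and try again with more time
                timeout *= 2;
            } catch (IOException e) {
                throw new UncheckedIOException(e); // other I/O errors are not retried
            }
        }
        throw new UncheckedIOException(last); // every attempt timed out
    }

    public static void main(String[] args) {
        // Stand-in for a slow site: it only "responds" when given at least 8000 ms.
        String result = withGrowingTimeout(t -> {
            if (t < 8000) {
                throw new SocketTimeoutException("Read timed out");
            }
            return "ok with timeout " + t;
        }, 5000, 3);
        System.out.println(result); // prints "ok with timeout 10000"
    }
}
```

With jsoup on the classpath, the attempt passed in would be something like t -> Jsoup.connect(url).timeout(t).get(), which uses the same timeout(int) call as the test code above.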

