Understanding the Java InputStream class and using it to read a PPT file

  • 2020-04-01 04:23:48
  • OfStack

1. About InputStream.read()
        The InputStream.read() method is often used for the sake of simplicity when reading data from a stream. It reads only one byte per call, which is very inefficient. A better approach is to use InputStream.read(byte[] b) or InputStream.read(byte[] b, int off, int len), which read multiple bytes per call.
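
As a minimal sketch of the buffered approach described above, the following reads a stream in chunks instead of one byte per call. The class name BufferedCopy, the method name readAll, and the 4096-byte buffer size are illustrative choices, not part of the original article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferedCopy {
    // Reads the whole stream in chunks of up to 4096 bytes
    // instead of issuing one read() call per byte.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096]; // buffer size is an arbitrary choice
        int n;
        while ((n = in.read(buffer)) != -1) { // -1 signals end of stream
            out.write(buffer, 0, n);          // only the first n bytes are valid
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for a file or socket stream here.
        InputStream in = new ByteArrayInputStream(new byte[10000]);
        System.out.println(readAll(in).length);
    }
}
```

Each call to read(buffer) may fill only part of the buffer, which is why the loop uses the returned count n rather than buffer.length.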


2. About the available() method of the InputStream class
      When reading more than one byte at a time, the InputStream.available() method is often used; it returns an estimate of how many bytes can be read from the stream without blocking. It is important to note that using this method to read a local file usually causes no problems, but using it for network operations often does. For example, in Socket communication the other side has clearly sent 1000 bytes, yet your own program calls available() and gets only 900, or 100, or even 0, which is puzzling until you know the reason. Network communication is intermittent, and a run of bytes is often delivered in several batches. When the local program calls available(), it may get 0 either because the other side has not responded yet, or because it has responded but the data has not arrived. If the other side sends 1000 bytes in, say, three batches, you may have to call available() and read three times before you have received them all.
          Suppose the code is written like this:


 int count = in.available();
 byte[] b = new byte[count];
 in.read(b);

          This often goes wrong in network operations, because when available() is called the data from the sender may not have arrived yet, so count is 0.
          It needs to be written this way:


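 int count = 0;
 // Poll until at least one byte has arrived. This is a busy-wait, and it
 // never exits if no more data ever arrives.
 while (count == 0) {
  count = in.available();
 }
 byte[] b = new byte[count];
 in.read(b); // read the bytes that have arrived so far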
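
As a possible alternative to polling (not the approach used in this article), read(byte[]) itself blocks until at least one byte is available, so it can be called directly. The sketch below imitates a slow sender with a PipedOutputStream written from a second thread; the class name BlockingReadDemo, the helper delayedSource, and the delay value are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

class BlockingReadDemo {
    // Starts a writer thread that sends 3 bytes after a short delay,
    // imitating a slow network peer. Returns the reading end of the pipe.
    static PipedInputStream delayedSource(long delayMillis) throws IOException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
                out.write(new byte[] {1, 2, 3});
                out.close(); // closing marks end of stream for the reader
            } catch (IOException | InterruptedException e) {
                throw new IllegalStateException(e);
            }
        });
        writer.start();
        return in;
    }

    public static void main(String[] args) throws IOException {
        PipedInputStream in = delayedSource(200);
        // Very likely 0 here, because the writer is still sleeping.
        System.out.println("available before data arrives: " + in.available());
        byte[] b = new byte[16];
        int n = in.read(b); // blocks until at least one byte is available
        System.out.println("bytes read: " + n);
    }
}
```

The returned count n still has to be used, since the call may deliver fewer bytes than the array can hold.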

3. InputStream.read(byte[] b) and InputStream.read(byte[] b, int off, int len)

Both of these methods read multiple bytes from a stream, and experienced programmers will have noticed that they often return fewer bytes than requested. With the first method, for example, programmers often expect b.length bytes to be read, but in practice the call frequently returns fewer. A closer look at the Java API specification shows that the method does not guarantee to read that many bytes; it only guarantees to read at most that many (and at least 1, unless the stream has ended). Therefore, to read exactly count bytes, it is best to use the following code:


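 byte[] b = new byte[count];
 int readCount = 0; // number of bytes successfully read so far
 while (readCount < count) {
  int n = in.read(b, readCount, count - readCount);
  if (n == -1) { // the stream ended before count bytes arrived
   throw new java.io.EOFException("stream ended after " + readCount + " bytes");
  }
  readCount += n;
 }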

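          This loop keeps reading until count bytes have been read, unless an IOException occurs or the stream ends first. Note that read() reports the end of the stream by returning -1 rather than by throwing an exception, so its return value has to be checked.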
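
The standard library offers the same guarantee through DataInputStream.readFully, which loops internally and throws EOFException if the stream ends early. The sketch below is illustrative only: the class ReadFullyDemo and the helper trickle (which hands out at most 3 bytes per call to imitate data arriving in batches) are hypothetical names that do not appear in the original article.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadFullyDemo {
    // Hypothetical helper: returns at most 3 bytes per read call,
    // imitating a network stream that delivers data in small batches.
    static InputStream trickle(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    // Reads exactly count bytes, or throws EOFException if the stream ends early.
    static byte[] readExactly(InputStream in, int count) throws IOException {
        byte[] b = new byte[count];
        new DataInputStream(in).readFully(b);
        return b;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 20, 30, 40, 50, 60, 70, 80};
        // A single read returns only one batch, fewer than the 8 bytes requested.
        System.out.println(trickle(data).read(new byte[8]));
        // readFully keeps reading until all 8 bytes are in the array.
        System.out.println(readExactly(trickle(data), 8).length);
        try {
            readExactly(trickle(data), 9); // only 8 bytes exist
        } catch (EOFException e) {
            System.out.println("stream ended early");
        }
    }
}
```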

4. Example: reading a PowerPoint file


import java.io.InputStream;

import org.apache.lucene.document.Document;
import org.apache.poi.hslf.HSLFSlideShow;
import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.model.TextRun;
import org.apache.poi.hslf.usermodel.SlideShow;

// Index and DocCenterException are not defined in this article;
// presumably they belong to the author's own project.
public Document getDocument(Index index, String url, String title, InputStream is)
  throws DocCenterException {
 StringBuffer content = new StringBuffer("");
 try {
  // is is the InputStream of the PPT file; build the SlideShow from it
  SlideShow ss = new SlideShow(new HSLFSlideShow(is));
  Slide[] slides = ss.getSlides(); // get every slide
  for (int i = 0; i < slides.length; i++) {
   TextRun[] t = slides[i].getTextRuns(); // the text runs of this slide
   for (int j = 0; j < t.length; j++) {
    content.append(t[j].getText()); // append the text to content
   }
   content.append(slides[i].getTitle());
  }
  index.AddIndex(url, title, content.toString());
 } catch (Exception ex) {
  System.out.println(ex.toString());
 }
 return null; // no Document is built here; the text has already been passed to the index
}

