Java uses the poi package to read Excel document code sharing

  • 2020-12-09 00:53:02
  • OfStack

The project needed to parse Excel document to get data, so I found some materials on the Internet, combined with my own use this time, and wrote down my experience:

1. maven project needs to add the following dependencies:


<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.10-FINAL</version>
  </dependency>
  <dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.10-FINAL</version>
  </dependency>

Directly on the test class, the class has a complete comment:


package shindo.Java;

import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelUtil {

  public static void main(String[] args) {
    String path = "D:\\IDE\\workspace-Neon\\Java\\src\\refund.xls";
    try {
      List<List<String>> result = new ExcelUtil().readXls(path);
      System.out.println(result.size());
      for (int i = 0; i < result.size(); i++) {
        List<String> model = result.get(i);
        System.out.println("orderNum:" + model.get(0) + "--> orderAmount:" + model.get(1));
      }

    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  /**
   * 
  * @Title: readXls 
  * @Description:  To deal with xls file 
  * @param @param path
  * @param @return
  * @param @throws Exception   Set file  
  * @return List<List<String>>   The return type  
  * @throws
  * 
  *  From the code, it is not difficult to find its processing logic: 
  * 1. Use first InputStream To obtain excel Of the file io flow 
  * 2. Then wear a 1 In memory excel file HSSFWorkbook Type object that represents the entire object excel File. 
  * 3. For this excel Each page of the file is looped 
  * 4. Loop through each line on each page 
  * 5. Process each cell in each row to get the value of that cell 
  * 6. Add the result of this row to 1 a List In the array 
  * 7. Add the results for each row to the final total 
  * 8. You get it when you've parsed it 1 a List<List<String>> Object of type 
  * 
   */
  private List<List<String>> readXls(String path) throws Exception {
    InputStream is = new FileInputStream(path);
    // HSSFWorkbook  Identify the excel
    HSSFWorkbook hssfWorkbook = new HSSFWorkbook(is);
    List<List<String>> result = new ArrayList<List<String>>();
    int size = hssfWorkbook.getNumberOfSheets();
    //  Each cycle 1 Page and process the current loop page 
    for (int numSheet = 0; numSheet < size; numSheet++) {
      // HSSFSheet  Identify a 1 page 
      HSSFSheet hssfSheet = hssfWorkbook.getSheetAt(numSheet);
      if (hssfSheet == null) {
        continue;
      }
      //  Process the current page, loop read each 1 line 
      for (int rowNum = 1; rowNum <= hssfSheet.getLastRowNum(); rowNum++) {
        // HSSFRow Said line 
        HSSFRow hssfRow = hssfSheet.getRow(rowNum);
        int minColIx = hssfRow.getFirstCellNum();
        int maxColIx = hssfRow.getLastCellNum();
        List<String> rowList = new ArrayList<String>();
        //  Iterate over the line changes, getting each processed cell The element 
        for (int colIx = minColIx; colIx < maxColIx; colIx++) {
          // HSSFCell  Representation cell 
          HSSFCell cell = hssfRow.getCell(colIx);
          if (cell == null) {
            continue;
          }
          rowList.add(getStringVal(cell));
        }
        result.add(rowList);
      }
    }
    return result;
  }

  /**
   * 
  * @Title: readXlsx 
  * @Description:  To deal with Xlsx file 
  * @param @param path
  * @param @return
  * @param @throws Exception   Set file  
  * @return List<List<String>>   The return type  
  * @throws
   */
  private List<List<String>> readXlsx(String path) throws Exception {
    InputStream is = new FileInputStream(path);
    XSSFWorkbook xssfWorkbook = new XSSFWorkbook(is);
    List<List<String>> result = new ArrayList<List<String>>();
    //  Each cycle 1 Page and process the current loop page 
    for (XSSFSheet xssfSheet : xssfWorkbook) {
      if (xssfSheet == null) {
        continue;
      }
      //  Process the current page, loop read each 1 line 
      for (int rowNum = 1; rowNum <= xssfSheet.getLastRowNum(); rowNum++) {
        XSSFRow xssfRow = xssfSheet.getRow(rowNum);
        int minColIx = xssfRow.getFirstCellNum();
        int maxColIx = xssfRow.getLastCellNum();
        List<String> rowList = new ArrayList<String>();
        for (int colIx = minColIx; colIx < maxColIx; colIx++) {
          XSSFCell cell = xssfRow.getCell(colIx);
          if (cell == null) {
            continue;
          }
          rowList.add(cell.toString());
        }
        result.add(rowList);
      }
    }
    return result;
  }

  //  Existing problems 
  /*
   *  In fact, sometimes the data that we want to get is excel But finally found that the results are not ideal 
   *  If you have a excel The numbers are numbers, as you'll see Java And the corresponding one becomes the scientific counting method. 
   *  So you do that when you get the value 1 Special treatment to ensure that you get the results you want 
   *  The online approach is to format numeric data to get the result you want. 
   *  The following provides additional information 1 So, before we do that, let's look at it 1 Under the poi For in the toString() methods :
   * 
   *  The method is poi From the source code, we can find that the processing flow is: 
   * 1. Gets the type of the cell 
   * 2. Format the data by type and output. And that creates a lot of things that we don't want 
   *  So let's do it this way 1 A transformation. 
   */
  /*public String toString(){
    switch(getCellType()){
      case CELL_TYPE_BLANK:
        return "";
      case CELL_TYPE_BOOLEAN:
        return getBooleanCellValue() ? "TRUE" : "FALSE";
      case CELL_TYPE_ERROR:
        return ErrorEval.getText(getErrorCellValue());
      case CELL_TYPE_FORMULA: 
        return getCellFormula();
      case CELL_TYPE_NUMERIC:
        if(DateUtil.isCellDateFormatted(this)){
          DateFormat sdf = new SimpleDateFormat("dd-MMM-yyyy")
          return sdf.format(getDateCellValue());
        }
        return getNumericCellValue() + "";
      case CELL_TYPE_STRING: 
        return getRichStringCellValue().toString();
      default :
        return "Unknown Cell Type:" + getCellType();
    }
  }*/

  /**
   *  transform poi The default toString () The method is as follows 
  * @Title: getStringVal 
  * @Description: 1. Returns for unfamiliar types or if null "" Control list 
  *        2. If it is a number, change the cell type to String And then go back String This ensures that the numbers are not formatted 
  * @param @param cell
  * @param @return   Set file  
  * @return String   The return type  
  * @throws
   */
  public static String getStringVal(HSSFCell cell) {
    switch (cell.getCellType()) {
    case Cell.CELL_TYPE_BOOLEAN:
      return cell.getBooleanCellValue() ? "TRUE" : "FALSE";
    case Cell.CELL_TYPE_FORMULA:
      return cell.getCellFormula();
    case Cell.CELL_TYPE_NUMERIC:
      cell.setCellType(Cell.CELL_TYPE_STRING);
      return cell.getStringCellValue();
    case Cell.CELL_TYPE_STRING:
      return cell.getStringCellValue();
    default:
      return "";
    }
  }
}

conclusion

That's it for this article about Java using the poi package to read the Excel document code share, I hope you find it helpful. Interested friends can continue to refer to other related topics in this site, if there is any deficiency, welcome to comment out. Thank you for your support!


Related articles: