Transferring FTP and HTTP files directly to HDFS in Java

  • 2020-04-01 03:43:29
  • OfStack

Earlier articles used streams to download HTTP and FTP files to the local disk, and to upload local files to HDFS. Combining the two, FTP and HTTP files can now be transferred to HDFS directly, without first copying them to the local machine and then uploading. The principle is simple: read the FTP or HTTP file into a stream, then write the contents of that stream to HDFS, so the data never touches the local hard disk and the whole transfer happens in memory. I hope this tool helps anyone with a similar need.
Here are the links to the previous tools:

(link: #)
(link: #)
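To make the principle concrete, here is a minimal sketch of the same stream-to-stream idea, written against the standard Hadoop FileSystem API rather than the helper classes used below; the source URL and HDFS address are placeholders:


import java.io.InputStream;
import java.net.URI;
import java.net.URL;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class StreamToHdfs {
  public static void main(String[] args) throws Exception {
    String srcUrl = "http://example.com/data/log.txt";      //placeholder remote file
    String dst = "hdfs://namenode:9000/user/demo/log.txt";  //placeholder HDFS target

    FileSystem fs = FileSystem.get(URI.create(dst), new Configuration());
    InputStream in = new URL(srcUrl).openStream();  //open the remote file as a stream
    //copy in 4 KB chunks straight into HDFS; nothing is written to the local disk
    //the final 'true' asks copyBytes to close both streams when it finishes
    IOUtils.copyBytes(in, fs.create(new Path(dst)), 4096, true);
  }
}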

The code is as follows:


import java.io.InputStream;
import java.io.OutputStream;
import java.io.IOException;


//FtpClient, HttpUtil and HDFSUtil are the helper classes from the articles linked above
public class FileTrans {
  private String head = "";         //protocol: ftp, http or https
  private String hostname = "";     //remote server hostname
  private String filePath = "";     //remote file path
  private String hdfsFilePath = ""; //target path on HDFS
  private HDFSUtil hdfsutil = null;
  private FtpClient ftp;
  private HttpUtil http;

  public void setFilePath(String filePath){
    this.filePath = filePath;
  }

  public String getFilePath(){
    return this.filePath;
  }

  public void sethdfsFilePath(String hdfsFilePath){
    this.hdfsFilePath = hdfsFilePath;
  }

  public String gethdfsFilePath(){
    return this.hdfsFilePath;
  }

  public void setHostName(String hostname){
    this.hostname = hostname;
  }

  public String getHostName(){
    return this.hostname;
  }

  public void setHead(String head){
    this.head = head;
  }

  public String getHead(){
    return this.head;
  }

  public FileTrans(String head, String hostname, String filepath, String hdfsnode, String hdfsFilepath){
    this.head = head;
    this.hostname = hostname;
    this.filePath = filepath;
    this.hdfsFilePath = hdfsFilepath;
    //compare string contents with equals()/isEmpty(), not == or !=
    if (head.equals("ftp") && !hostname.isEmpty()){
      this.ftp = new FtpClient(this.hostname);
    }
    if ((head.equals("http") || head.equals("https")) && !hostname.isEmpty()){
      String httpurl = head + "://" + hostname + "/" + filepath;
      this.http = new HttpUtil(httpurl);
    }
    if (!hdfsnode.isEmpty()){
      this.hdfsutil = new HDFSUtil(hdfsnode);
      //configure the HDFS helper only when it was actually created,
      //otherwise these calls would throw a NullPointerException
      this.hdfsutil.setHdfsPath(this.hdfsFilePath);
      this.hdfsutil.setFilePath(hdfsutil.getHdfsNode() + hdfsutil.getHdfsPath());
      this.hdfsutil.setHadoopSite("./hadoop-site.xml");
      this.hdfsutil.setHadoopDefault("./hadoop-default.xml");
      this.hdfsutil.setConfigure(false);
    }
  }

  public static void main(String[] args) throws IOException{
    String head = "";
    String hostname = "";
    String filepath = "";
    String hdfsfilepath = "";
    String hdfsnode = "";
    String localpath = "";
    InputStream inStream = null;
    int samplelines = 0;
    try{
      head = args[0];         //remote server type: ftp, http or https
      hostname = args[1];       //remote server hostname
      filepath = args[2];       //remote file path
      hdfsnode = args[3];       //HDFS namenode host, without the leading hdfs://
      hdfsfilepath = args[4];     //target file path on HDFS
      localpath = args[5];       //local path for an optional copy; pass an empty string to skip saving
      samplelines = Integer.parseInt(args[6]); //save the first N lines locally; pass 0 to skip
    }catch (Exception e){
      System.out.println("[FileTrans]:input args error!");
      e.printStackTrace();
      return;  //abort instead of continuing with empty arguments
    }
    FileTrans filetrans = new FileTrans(head, hostname, filepath, hdfsnode, hdfsfilepath);
    if (filetrans.ftp == null && head.equals("ftp")){
      System.out.println("filetrans ftp null");
      return;
    }
    if (filetrans.http == null && (head.equals("http") || head.equals("https"))){
      System.out.println("filetrans http null");
      return;
    }
    try{
      if (head.equals("ftp")){
        inStream = filetrans.ftp.getStream(filepath);
        if (samplelines > 0){
          filetrans.ftp.writeStream(inStream, localpath, samplelines);
        }
      }
      else{
        inStream = filetrans.http.getStream(head + "://" + hostname + "/" + filepath);
        if (samplelines > 0){
          filetrans.http.downLoad(head + "://" + hostname + "/" + filepath, localpath, samplelines);
        }
      }
      filetrans.hdfsutil.upLoad(inStream, filetrans.hdfsutil.getFilePath()); 
      if (head.equals("ftp")){  //use equals(); == only compares references
        filetrans.ftp.disconnect();
      }
    }catch (IOException e){
      System.out.println("[FileTrans]: file trans failed!");
      e.printStackTrace();
      return;  //do not fall through to the success message on failure
    }
    System.out.println("[FileTrans]: file trans success!");
  }

}

If you run into compilation problems, see the notes in the earlier article on the Hadoop tool for reference.
Note: it is best to put the other three tools (FtpClient, HttpUtil and HDFSUtil) in the same directory as this class; if they are kept elsewhere, adjust your references accordingly.
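
For reference, a hypothetical compile line, assuming all four source files sit in one directory and the Hadoop core jar is on the classpath (the jar name varies by Hadoop version, so treat it as a placeholder):

javac -classpath .:$HADOOP_HOME/hadoop-core.jar FileTrans.java FtpClient.java HttpUtil.java HDFSUtil.java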

This tool can transfer a file from FTP or HTTP straight to HDFS, and can also save the first N lines of the file locally for analysis.
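
For example, a hypothetical invocation (the hostname and paths are placeholders) that pulls a file over HTTP into HDFS and also keeps its first 100 lines in a local sample file:

java FileTrans http example.com data/log.txt namenode:9000 /user/demo/log.txt /tmp/log_sample.txt 100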

The above is all the content of this article. I hope it can help you in your study of Java.


