Transferring FTP and HTTP files directly to HDFS in Java
- 2020-04-01 03:43:29
- OfStack
The previous tools used streams to download HTTP and FTP files to the local machine, and to upload local files to HDFS. Building on them, this tool transfers FTP and HTTP files straight to HDFS, with no need to copy the file locally first and then upload it. The principle is simple: read the FTP or HTTP file into a stream, then write the contents of that stream to HDFS. The data never touches the local hard disk; the whole transfer is completed in memory. I hope this tool helps anyone with a similar need.
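The core of that principle can be sketched in a few lines of plain Java: copy an input stream to an output stream through a small buffer, so the payload only ever lives in memory. This is a minimal illustration, not the tool's actual code; the `StreamCopyDemo` class name and the byte-array streams are stand-ins, where the real input would come from FtpClient/HttpUtil and the real output would go to HDFS through HDFSUtil.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopyDemo {
    // Copy everything from in to out through a fixed-size buffer,
    // so the payload never has to touch the local disk.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Byte-array streams stand in for the remote source and the HDFS sink.
        byte[] payload = "hello hdfs".getBytes("UTF-8");
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(payload), sink);
        System.out.println(copied + " bytes copied");
    }
}
```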
The previous tools are linked below:
(link: #)
(link: #)
The code is as follows:
import java.io.InputStream;
import java.io.OutputStream;
import java.io.IOException;

public class FileTrans {
    private String head = "";
    private String hostname = "";
    private String FilePath = "";
    private String hdfsFilePath = "";
    private HDFSUtil hdfsutil = null;
    private FtpClient ftp;
    private HttpUtil http;

    public void setFilePath(String FilePath) {
        this.FilePath = FilePath;
    }

    public String getFilePath() {
        return this.FilePath;
    }

    public void sethdfsFilePath(String hdfsFilePath) {
        this.hdfsFilePath = hdfsFilePath;
    }

    public String gethdfsFilePath() {
        return this.hdfsFilePath;
    }

    public void setHostName(String hostname) {
        this.hostname = hostname;
    }

    public String getHostName() {
        return this.hostname;
    }

    public void setHead(String head) {
        this.head = head;
    }

    public String getHead() {
        return this.head;
    }

    public FileTrans(String head, String hostname, String filepath, String hdfsnode, String hdfsFilepath) {
        this.head = head;
        this.hostname = hostname;
        this.FilePath = filepath;
        this.hdfsFilePath = hdfsFilepath;
        // Compare string contents with equals()/isEmpty(), never with == or !=
        if (head.equals("ftp") && !hostname.isEmpty()) {
            this.ftp = new FtpClient(this.hostname);
        }
        if ((head.equals("http") || head.equals("https")) && !hostname.isEmpty()) {
            String httpurl = head + "://" + hostname + "/" + filepath;
            this.http = new HttpUtil(httpurl);
        }
        // Only configure hdfsutil if it was actually created, to avoid a NullPointerException
        if (!hdfsnode.isEmpty()) {
            this.hdfsutil = new HDFSUtil(hdfsnode);
            this.hdfsutil.setHdfsPath(this.hdfsFilePath);
            this.hdfsutil.setFilePath(hdfsutil.getHdfsNode() + hdfsutil.getHdfsPath());
            this.hdfsutil.setHadoopSite("./hadoop-site.xml");
            this.hdfsutil.setHadoopDefault("./hadoop-default.xml");
            this.hdfsutil.setConfigure(false);
        }
    }

    public static void main(String[] args) throws IOException {
        String head = "";
        String hostname = "";
        String filepath = "";
        String hdfsfilepath = "";
        String hdfsnode = "";
        String localpath = "";
        InputStream inStream = null;
        int samplelines = 0;
        try {
            head = args[0];          // remote server type: ftp, http or https
            hostname = args[1];      // remote server hostname
            filepath = args[2];      // remote file path
            hdfsnode = args[3];      // HDFS node name, without the leading hdfs://
            hdfsfilepath = args[4];  // target file path on HDFS
            localpath = args[5];     // local path if a local copy is wanted; pass an empty string (with samplelines 0) to skip it
            samplelines = Integer.parseInt(args[6]); // number of leading lines to save locally; 0 to save nothing
        } catch (Exception e) {
            System.out.println("[FileTrans]:input args error!");
            e.printStackTrace();
            return; // do not continue with incomplete arguments
        }
        FileTrans filetrans = new FileTrans(head, hostname, filepath, hdfsnode, hdfsfilepath);
        if (filetrans.ftp == null && head.equals("ftp")) {
            System.out.println("filetrans ftp null");
            return;
        }
        if (filetrans.http == null && (head.equals("http") || head.equals("https"))) {
            System.out.println("filetrans http null");
            return;
        }
        try {
            if (head.equals("ftp")) {
                inStream = filetrans.ftp.getStream(filepath);
                if (samplelines > 0) {
                    filetrans.ftp.writeStream(inStream, localpath, samplelines);
                }
            } else {
                inStream = filetrans.http.getStream(head + "://" + hostname + "/" + filepath);
                if (samplelines > 0) {
                    filetrans.http.downLoad(head + "://" + hostname + "/" + filepath, localpath, samplelines);
                }
            }
            filetrans.hdfsutil.upLoad(inStream, filetrans.hdfsutil.getFilePath());
            if (head.equals("ftp")) {
                filetrans.ftp.disconnect();
            }
        } catch (IOException e) {
            System.out.println("[FileTrans]: file trans failed!");
            e.printStackTrace();
            return; // do not report success after a failure
        }
        System.out.println("[FileTrans]: file trans success!");
    }
}
If you run into compilation problems, see the earlier article on the Hadoop tool for reference.
Note: it is best to put the other three tool classes in the same directory; if they are not together, import them explicitly.
This tool can transfer an FTP or HTTP file to HDFS, and can also save the first N lines locally for analysis.
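The "first N lines" sampling can be sketched like this. This is a minimal illustration, not the tool's actual writeStream/downLoad code; the `SampleLinesDemo` class name and `sample` method are made up for the example, and a `StringWriter` stands in for the local file the real tool would write.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringWriter;
import java.io.Writer;

public class SampleLinesDemo {
    // Write at most maxLines lines from the stream to the writer and
    // return how many were written; in the real tool the writer would
    // wrap a local file.
    public static int sample(InputStream in, Writer out, int maxLines) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
        String line;
        int written = 0;
        while (written < maxLines && (line = reader.readLine()) != null) {
            out.write(line);
            out.write('\n');
            written++;
        }
        out.flush();
        return written;
    }

    public static void main(String[] args) throws IOException {
        String data = "a\nb\nc\nd\n";
        StringWriter sw = new StringWriter();
        int n = sample(new ByteArrayInputStream(data.getBytes("UTF-8")), sw, 2);
        System.out.println(n + " lines sampled");
    }
}
```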
That is all the content of this article; I hope it helps you learn Java.