Simple Java Program to Append to a File in HDFS


In this blog, I will present a simple Java program to append to a file in HDFS.

I will be using Maven as the build tool.

Now to start with-

First, we need to add the required Maven dependencies to pom.xml.





<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>2.8.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.8.0</version>
    </dependency>
</dependencies>

Now we need to import the following classes-

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.*;

We will be using the org.apache.hadoop.conf.Configuration class to set the file system configuration according to the configuration of the installed Hadoop cluster.

Let us now start by configuring the file system-

public FileSystem configureFileSystem(String coreSitePath, String hdfsSitePath) {
    FileSystem fileSystem = null;
    try {
        Configuration conf = new Configuration();
        conf.setBoolean("dfs.support.append", true);
        Path coreSite = new Path(coreSitePath);
        Path hdfsSite = new Path(hdfsSitePath);
        conf.addResource(coreSite);
        conf.addResource(hdfsSite);
        fileSystem = FileSystem.get(conf);
    } catch (IOException ex) {
        System.out.println("Error occurred while configuring FileSystem");
    }
    return fileSystem;
}

Make sure that the property “dfs.support.append” in hdfs-site.xml is set to true.

You can either set it manually by editing hdfs-site.xml file or programmatically using-

conf.setBoolean("dfs.support.append", true);
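If you prefer to set it manually, the corresponding entry in hdfs-site.xml looks like this (only this property block needs to be added; the rest of the file stays as it is):

```xml
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```

Note that on Hadoop 2.x releases append support is enabled by default, so this property usually only needs attention on older or customised clusters.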

Now that the file system is configured, we can access the files stored in HDFS.

Let us start with appending to a file in HDFS.

public String appendToFile(FileSystem fileSystem, String content, String dest) throws IOException {

    Path destPath = new Path(dest);
    if (!fileSystem.exists(destPath)) {
        System.err.println("File doesn't exist");
        return "Failure";
    }

    boolean isAppendable = Boolean.parseBoolean(fileSystem.getConf().get("dfs.support.append"));

    if(isAppendable) {
        FSDataOutputStream fs_append = fileSystem.append(destPath);
        PrintWriter writer = new PrintWriter(fs_append);
        writer.append(content);
        writer.flush();
        fs_append.hflush();
        writer.close();
        fs_append.close();
        return "Success";
    }
    else {
        System.err.println("Please set the dfs.support.append property to true");
        return "Failure";
    }
}

Now, to verify that the data has been correctly written to HDFS, let us write a method that reads the file from HDFS and returns its content as a String.

public String readFromHdfs(FileSystem fileSystem, String hdfsFilePath) {
    Path hdfsPath = new Path(hdfsFilePath);
    StringBuilder fileContent = new StringBuilder();
    // try-with-resources ensures the reader (and the underlying HDFS stream) is closed
    try (BufferedReader bfr = new BufferedReader(new InputStreamReader(fileSystem.open(hdfsPath)))) {
        String str;
        while ((str = bfr.readLine()) != null) {
            fileContent.append(str).append("\n");
        }
    } catch (IOException ex) {
        System.out.println("----------Could not read from HDFS---------\n");
    }
    return fileContent.toString();
}
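Since the methods above need a running HDFS cluster, the append-then-read flow can also be exercised locally. The following is a minimal, self-contained sketch that mirrors the same PrintWriter and BufferedReader patterns used in appendToFile and readFromHdfs, but against the local filesystem (the class and method names here are illustrative, not part of the HDFS API):

```java
import java.io.*;
import java.nio.file.*;

public class AppendRoundTrip {

    // Append a line to a file and read the whole file back, mirroring the
    // appendToFile / readFromHdfs pair above, but on the local filesystem.
    static String roundTrip() {
        try {
            Path p = Files.createTempFile("append-demo", ".txt");
            Files.write(p, "first line\n".getBytes());

            // Same pattern as appendToFile: wrap the output stream in a
            // PrintWriter, append, then flush before closing.
            try (OutputStream out = Files.newOutputStream(p, StandardOpenOption.APPEND);
                 PrintWriter writer = new PrintWriter(out)) {
                writer.append("appended line\n");
                writer.flush();
            }

            // Same pattern as readFromHdfs: read back line by line.
            StringBuilder content = new StringBuilder();
            try (BufferedReader bfr = Files.newBufferedReader(p)) {
                String str;
                while ((str = bfr.readLine()) != null) {
                    content.append(str).append("\n");
                }
            }
            Files.delete(p);
            return content.toString();
        } catch (IOException ex) {
            return "IO error: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.print(roundTrip());
    }
}
```

The flow is the same against HDFS; only the streams come from FileSystem.append() and FileSystem.open() instead of java.nio.file.Files.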

Now that we have successfully written to and read from a file in HDFS, it is time to close the file system.

public void closeFileSystem(FileSystem fileSystem){
    try {
        fileSystem.close();
    }
    catch (IOException ex){
        System.out.println("----------Could not close the FileSystem----------");
    }
}

Now, before executing the code, you should have Hadoop running on your system.

You just need to go to your HADOOP_HOME and run the following command (on recent Hadoop versions, start-dfs.sh and start-yarn.sh are preferred over the deprecated start-all.sh)-

./sbin/start-all.sh

For the complete program, refer to my GitHub repository: https://github.com/ksimar/HDFS_AppendAPI

Happy Coding !!



Written by 

Simar is a Software Consultant with more than 1.5 years of experience. She has an interest in both object-oriented and functional programming. She is a Java enthusiast and is currently learning the Scala programming language. She is also familiar with relational database technologies such as MySQL, as well as NoSQL database technologies. She has worked on Kafka, Spark, Hadoop, etc. Her hobbies include coding and listening to music.
