www.HadoopExam.com

HadoopExam Learning Resources


What is the best way to write and append to a large file in Java?

I have a Java program which sends a series of GET requests to a web service and stores the response bodies in a text file.

I have implemented the following example code (much of the code is filtered out to highlight the part in question), which appends each response to the text file as a new line at the EOF. The code works correctly, but performance suffers as the file grows bigger.

The total size of the data is almost 4 GB, and each append adds about 500 KB to 1 MB of data on average.

do {
    // send the GET request and fetch the data as a string
    String resultData = HTTP.GET <uri>;
    // buffered writer to create the file
    BufferedWriter writer = new BufferedWriter(new FileWriter(path, true));
    // write or append to the file
    writer.write(resultData + "\n");
} while (resultData.exists());

These files are created on a daily basis and moved to HDFS for Hadoop consumption and as a real-time archive. Is there a better way to achieve this?


Why are you re-opening the writer for each individual request? Just open it once, before the do-while loop. Don't forget to close it after the do-while loop.


1) You are opening a new writer every time, without closing the previous writer object.

2) Don't open the file for each write operation. Instead, open it before the loop and close it after the loop.

BufferedWriter writer = new BufferedWriter(new FileWriter(path, true));
do {
    String resultData = HTTP.GET <uri>;
    writer.write(resultData + "\n");
} while (resultData.exists());
writer.close();
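As a minimal sketch of the open-once pattern, the version below uses try-with-resources so the writer is flushed and closed even if a request throws. The `fetchNext` supplier is a hypothetical stand-in for the HTTP GET call, which the post leaves elided, and it is assumed to return null when there is nothing more to fetch. The class and method names are illustrative only.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.function.Supplier;

class AppendOnce {
    // fetchNext is a hypothetical stand-in for the elided HTTP GET call;
    // it is assumed to return null when no more data is available.
    static int appendAll(String path, Supplier<String> fetchNext) throws IOException {
        int written = 0;
        // Open the file once, in append mode. try-with-resources flushes and
        // closes the writer on normal exit and when an exception is thrown.
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(path, true))) {
            String resultData;
            while ((resultData = fetchNext.get()) != null) {
                // Two writes avoid building a second large string with "+".
                writer.write(resultData);
                writer.write('\n');
                written++;
            }
        }
        return written;
    }
}
```

A caller would pass the target path and a lambda wrapping its own HTTP client, for example `AppendOnce.appendAll(path, () -> client.nextBody())`, where `client.nextBody()` is again a placeholder.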

3) The default buffer size of BufferedWriter is 8192 characters. Since you have 4 GB of data, I would increase the buffer size to improve performance, but at the same time make sure your JVM has enough memory to hold the buffer.

BufferedWriter writer = new BufferedWriter(new FileWriter(path, true), 8192 * 4);
do {
    String resultData = HTTP.GET <uri>;
    writer.write(resultData + "\n");
} while (resultData.exists());
writer.close();
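The larger buffer can be combined with the open-once pattern, as in the sketch below. It makes two assumptions that are not in the post: the responses arrive as an `Iterable<String>` (a stand-in for the elided GET loop), and the file is written as UTF-8 through an `OutputStreamWriter` rather than with the platform default charset that `FileWriter(path, true)` uses. The class, method, and constant names are illustrative.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

class BufferedAppend {
    // 8192 * 4 characters, the buffer size suggested in the answer.
    static final int BUFFER_CHARS = 8192 * 4;

    // Opens path in append mode with an explicit charset and buffer size.
    static BufferedWriter openForAppend(String path, int bufferChars) throws IOException {
        return new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(path, true), StandardCharsets.UTF_8),
                bufferChars);
    }

    // Appends each response as one line. "responses" is an assumed stand-in
    // for the data fetched by the GET loop.
    static void appendLines(String path, Iterable<String> responses) throws IOException {
        try (BufferedWriter writer = openForAppend(path, BUFFER_CHARS)) {
            for (String response : responses) {
                writer.write(response);
                writer.write('\n');
            }
        }
    }
}
```

The buffer is a `char` array on the heap, so a buffer of 8192 * 4 characters costs roughly 64 KB. Whether a bigger buffer helps noticeably when each response is already 500 KB to 1 MB is something to measure rather than assume.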

 

4) Since you are making a GET web service call, performance also depends on the response time of the web service.
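One way to check how much of the run time is the web service and how much is the file write is to time the two steps separately. The sketch below is a rough illustration under stated assumptions: `fetchNext` is a hypothetical stand-in for the GET call and returns null when done, and `FetchWriteTimer` is an invented name. With a buffered writer, the write figure mostly reflects copying into the buffer plus occasional flushes.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.function.Supplier;

class FetchWriteTimer {
    long fetchNanos; // total time spent waiting for fetchNext
    long writeNanos; // total time spent in writer.write calls

    // Fetches until fetchNext returns null, timing fetch and write separately.
    // fetchNext is a hypothetical stand-in for the HTTP GET call.
    void run(Supplier<String> fetchNext, Writer writer) throws IOException {
        while (true) {
            long beforeFetch = System.nanoTime();
            String data = fetchNext.get();
            long afterFetch = System.nanoTime();
            fetchNanos += afterFetch - beforeFetch;
            if (data == null) {
                break;
            }
            writer.write(data);
            writer.write('\n');
            writeNanos += System.nanoTime() - afterFetch;
        }
    }
}
```

If `fetchNanos` dominates, tuning the writer will change little, and the request side is the place to look.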


 
