www.HadoopExam.com

Hadoop MapReduce Join | error java.io.IOException: Unable to load Abbrevation data..!

I have written a small map-side join program, but the job fails with this error: java.io.IOException: Unable to load Abbrevation data..!!

My code for this is below -

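package com.mr.join.mapside;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;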

/*
 * Usage:
 * hadoop jar mapsidejoin2.jar com.mr.join.mapside.MapSideJoin pupulation_dataset.txt op_mapsidejoin
 */
public class MapSideJoin {

    public static class MyMapper extends Mapper<LongWritable, Text, Text, Text>{
        private Map<String ,String> abMap = new HashMap<String, String>();
        private Text outputKey = new Text();
        private Text outputValue = new Text();

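        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            // Local copies of the files registered in the distributed cache (may be null if none were localized).
            Path[] files = DistributedCache.getLocalCacheFiles(context.getConfiguration());

            if (files != null) {
                for (Path p : files) {
                    if (p.getName().equals("abc.dat")) {
                        // try-with-resources closes the reader even if parsing fails.
                        try (BufferedReader reader = new BufferedReader(new FileReader(p.toString()))) {
                            String line;
                            while ((line = reader.readLine()) != null) {
                                String[] tokens = line.split("\t");
                                if (tokens.length < 2) {
                                    continue; // skip blank or malformed lines
                                }
                                // tokens[0] is the abbreviation, tokens[1] the state name
                                abMap.put(tokens[0], tokens[1]);
                            }
                        }
                    }
                }
            }
            // Reached when no cached file is named abc.dat, or when no line in it could be parsed.
            if (abMap.isEmpty()) {
                throw new IOException("Unable to load Abbrevation data..!!");
            }
        }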

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String row = value.toString();
            String[] tokens = row.split("\t");
            String inab = tokens[0];
            String state = abMap.get(inab);
            if (state == null) {
                // No matching abbreviation: skip the row rather than pass null to Text.set().
                return;
            }
            outputKey.set(state);
            outputValue.set(row);
            context.write(outputKey, outputValue);
        }
    }

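    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MapSideJoin <input path> <output path>");
            System.exit(2);
        }

        Job job = new Job();
        job.setJarByClass(MapSideJoin.class);
        job.setJobName("Distributed Cache");
        job.setNumReduceTasks(0); // map-only job: the join happens in the mapper

        // A URI without a scheme is typically resolved against the default file system,
        // so abc.dat has to exist there already. A bad URI now fails the job here
        // instead of being printed and ignored.
        DistributedCache.addCacheFile(new URI("abc.dat"), job.getConfiguration());

        job.setMapperClass(MyMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Report job success or failure through the process exit code.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }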
}

Can anybody help me fix this error, or suggest debugging techniques that would help figure out what has gone wrong here?
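
One way to narrow this down: the exception is thrown only when abMap is still empty at the end of setup(), which means either no cached file named abc.dat was found or no line in it was parsed. The sketch below isolates the parsing step so it can be run locally without Hadoop against a copy of the data. The class name AbbreviationLoader and the method load are hypothetical names introduced for illustration, and the sample rows are made up; it assumes the file holds one "abbreviation<TAB>state" pair per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of the original job): parses the lookup data
// independently of Hadoop so the file format can be checked on its own.
class AbbreviationLoader {

    // Reads "abbreviation<TAB>state" lines; blank or malformed lines are skipped.
    static Map<String, String> load(Reader source) throws IOException {
        Map<String, String> result = new HashMap<String, String>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] tokens = line.split("\t");
                if (tokens.length < 2) {
                    continue; // not a tab-separated pair
                }
                result.put(tokens[0].trim(), tokens[1].trim());
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data; swap in a FileReader on a local copy of abc.dat.
        String sample = "CA\tCalifornia\nbad line\nNY\tNew York\n";
        Map<String, String> map = load(new StringReader(sample));
        System.out.println(map.size() + " entries, CA=" + map.get("CA"));
        // prints "2 entries, CA=California"
    }
}
```

If this loads the real file correctly, the parsing is probably fine and the file is more likely not reaching the mapper under the name abc.dat. Printing p.getName() for each entry returned by getLocalCacheFiles in setup() would show what the task actually sees.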


 
