Background
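My question:
Environment: Hadoop 2.6.0
JDK: 1.8.0_121
Problem description:
When the input directory used by the program below (/init_data/) contains files, output files appear in the output directory, but they are empty.
When the input directory (/init_data/) contains no files, no files appear in the output directory, and the program still reports no error.
So my question is: what is going on here?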
My code:

package com.yangsen;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class mapreduce {

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://192.168.100.112:9000");
        //conf.set("fs.hdfs.impl","org.apache.hadoop.hdfs.DistributedFileSystem");
        Job job = Job.getInstance(conf, "kieryum");
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        job.setJarByClass(mapreduce.class);
        job.setMapperClass(MyMapper.class);
        job.setNumReduceTasks(4);
        FileInputFormat.addInputPath(job, new Path("hdfs://192.168.100.112:9000/init_data"));
        // Set the output path
        Path output = new Path("hdfs://192.168.100.112:9000/clear_data");
        // Force-delete the output path
        output.getFileSystem(conf).delete(output, true);
        FileOutputFormat.setOutputPath(job, output);
        try {
            job.waitForCompletion(true);
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
        //System.out.println(job.getJobFile());
    }

    class MyMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            value.set("dd");
            context.write(value, NullWritable.get());
        }
    }
}
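One thing that may be worth checking in the code above: MyMapper is declared as a non-static inner class. Frameworks that create objects by reflection usually look up a zero-argument constructor, and a non-static inner class does not have one, because its constructor implicitly takes the enclosing instance. The sketch below shows that behavior in plain Java without any Hadoop dependency. The names InnerClassReflectionDemo, InnerMapper, NestedMapper and canInstantiateWithNoArgs are hypothetical and exist only for this illustration; whether this is the actual cause of the empty output is an assumption that the job logs would have to confirm.

```java
import java.lang.reflect.Constructor;

public class InnerClassReflectionDemo {

    // Non-static inner class (hypothetical stand-in for MyMapper as posted):
    // its constructor implicitly takes the enclosing instance as a parameter,
    // so no true zero-argument constructor exists.
    class InnerMapper {
    }

    // Static nested class: has a real zero-argument constructor.
    static class NestedMapper {
    }

    // Tries to create an instance by looking up the zero-argument constructor
    // and invoking it. Returns false if the lookup or the call fails.
    static boolean canInstantiateWithNoArgs(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("inner:  " + canInstantiateWithNoArgs(InnerMapper.class));
        System.out.println("nested: " + canInstantiateWithNoArgs(NestedMapper.class));
    }
}
```

If the same thing is happening in the job, declaring the mapper as `static class MyMapper` would be the change to try. Note also that `job.waitForCompletion(true)` returns a boolean that the posted code ignores, so a failed job would not necessarily surface as an exception in `main`.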
Result screenshot:
Run output: