Configure the maven-jar-plugin in the POM file (it writes the main class and classpath into the jar manifest)
<build>
    <pluginManagement>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <version>2.4</version>
                <configuration>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <classpathPrefix>lib/</classpathPrefix>
                            <mainClass>com.pro.main</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
        </plugins>
    </pluginManagement>
</build>
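With this configuration the jar plugin writes Main-Class and Class-Path entries into META-INF/MANIFEST.MF. The following is a minimal sketch, using only the JDK's java.util.jar API, of how those two entries can be read back. The class name ManifestSketch, the helper method, and the dependency jar name in the sample text are hypothetical and only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper: reads the two manifest entries that the POM configuration controls.
final class ManifestSketch {

    // Parses manifest text and returns { Main-Class, Class-Path }.
    static String[] mainClassAndClasspath(String manifestText) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            Attributes main = manifest.getMainAttributes();
            return new String[] {
                main.getValue(Attributes.Name.MAIN_CLASS),
                main.getValue(Attributes.Name.CLASS_PATH)
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample manifest text; the dependency jar name is made up for illustration.
        String text = "Manifest-Version: 1.0\r\n"
                + "Main-Class: com.pro.main\r\n"
                + "Class-Path: lib/example-dependency-1.0.jar\r\n"
                + "\r\n";
        String[] entries = mainClassAndClasspath(text);
        System.out.println("Main-Class: " + entries[0]);
        System.out.println("Class-Path: " + entries[1]);
    }
}
```

To inspect a jar that has actually been built, the manifest can be obtained with new java.util.jar.JarFile(pathToJar).getManifest() instead of parsing a string.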
Code in the main method of the driver class (Submitter)
// Imports needed by this fragment:
//   org.apache.hadoop.conf.Configuration, org.apache.hadoop.fs.FileSystem, org.apache.hadoop.fs.Path,
//   org.apache.hadoop.io.IntWritable, org.apache.hadoop.io.Text, org.apache.hadoop.mapreduce.Job,
//   org.apache.hadoop.mapreduce.lib.input.FileInputFormat, org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
if (args.length != 2) {
    System.out.println("Please enter the input path and the output path");
    System.exit(-1);
}
// Create the configuration first so that the job is built from it
Configuration conf = new Configuration();
Job job = Job.getInstance(conf);
// 1. Specify the jar that contains the job classes
job.setJarByClass(Submitter.class);
// 2. Specify the mapper and reducer implementation classes for this job
job.setMapperClass(WordCountMapper.class);
job.setReducerClass(WordConutReduce.class);
// 3. Specify the key and value types of the map output
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);
// 4. Specify the key and value types of the reduce (final) output
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
// Delete the output directory if it already exists; otherwise the job fails at submission
Path path = new Path(args[1]);
FileSystem fileSystem = path.getFileSystem(conf); // get the file system that owns this path
if (fileSystem.exists(path)) {
    fileSystem.delete(path, true); // true = delete recursively, including any contents
}
// 5. Specify the input path of the dataset to process and the output path for the results
FileInputFormat.setInputPaths(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
// 6. Set the number of reduce tasks to start
job.setNumReduceTasks(1);
// 7. Submit the job and wait for it to finish
boolean b = job.waitForCompletion(true);
System.exit(b ? 0 : 1);
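The mapper and reducer classes referenced by the driver are not shown here. As a rough illustration of the data flow such a word-count job is expected to perform, here is a plain-Java sketch that does not use the Hadoop API. The class WordCountSketch and its methods are hypothetical, and splitting lines on whitespace is an assumption about how the mapper tokenizes its input.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the word-count data flow; this is not Hadoop code.
final class WordCountSketch {

    // "Map" step: emit a (word, 1) pair for every whitespace-separated token in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // "Shuffle" and "reduce" steps: group the pairs by word and sum their values.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>(); // keys kept in sorted order
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Runs the map step on every line, then reduces all emitted pairs together.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = wordCount(Arrays.asList("hello hadoop", "hello maven"));
        // Print one "word<TAB>count" line per key.
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

In the real job, the Text and IntWritable types set in steps 3 and 4 play the roles of the String keys and Integer values used in this sketch.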
Package the project as a jar with Maven (for example, mvn package) and copy the jar to the Hadoop environment
Upload the text file to HDFS
hadoop fs -put xxx.info /input
In the Hadoop environment, run the following command to start the job
hadoop jar mapreducedemo-1.0-SNAPSHOT.jar /input /output