Recently in the course design, the main process is to collect data from the CSV file, store it in HBase, and then use MapReduce for statistical analysis of the data. During this period, we encountered some problems, which were finally solved through various searches. Record these problems and their solutions here.
1. HBase hmaster auto close problem
Enter zookeeper, delete HBase data (use with caution), and restart HBase
./zkCli.sh
rmr /hbase
stop-hbase.sh
start-hbase.sh
2. Dealing with multi module dependency when packaging with Maven
The project structure is shown in the figure below
ETL and statistics both refer to the common module. When they are packaged separately, they are prompted that the common dependency cannot be found and the packaging fails.
Solution steps:
1. To do Maven package and Maven install for common, I use idea to operate directly in Maven on the right.
2. Run Maven clean and Maven install commands in the outermost total project (root)
After completing these two steps, the problem can be solved.
3. When the Chinese language is stored in HBase, it becomes a form similar to “ XE5 x8f x91 Xe6 x98 x8e”
The classic Chinese encoding problem can be solved by calling the following method before using.
public static String decodeUTF8Str(String xStr) throws UnsupportedEncodingException {
return URLDecoder.decode(xStr.replaceAll("\\\\x", "%"), "utf-8");
}
4. Error in job submission of MapReduce
Write the code locally, type it into a jar package and run it on the server. The error is as follows:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.Job.getArchiveSharedCacheUploadPolicies(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/Map;
at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:491)
at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:92)
at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:172)
at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at MapReduce.main(MapReduce.java:49)
Solution: add dependencies
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>3.1.3</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-common</artifactId>
<version>3.1.3</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>3.1.3</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-jobclient</artifactId>
<version>3.1.3</version>
<scope>provided</scope>
</dependency>
Among them
Hadoop-mapreduce-client-core.jar supports running on a cluster
Hadoop-mapreduce-client-common.jar supports running locally
After solving the above problems, my code can run smoothly on the server.
Finally, it should be noted that the output path of MapReduce cannot already exist, otherwise an error will be reported.
I hope this article can help you with similar problems.
Read More:
- The list command in HBase shell reported an error org.apache.hadoop . hbase.PleaseHoldException : Master is initializing
- Some problems encountered in the development of ReportViewer
- A series of problems in configuring OpenGL development environment in vs2015
- Problems encountered in configuring OpenGL development environment in vs2015
- hbase ERROR org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
- Common errors and solutions in MapReduce stage
- Phpcms development problems and Solutions
- OpenGL development environment configuration [vs2017] + common problems
- Solution of server not running yetexception in HBase
- [Solved] Import org.apache.hadoop.hbase.hbaseconfiguration package cannot be imported
- HBase hangs up immediately after startup. The port reports an error of 500 and hmaster aborted
- How to Solve Hbase Error: Master is initializing
- HBase import CSV file
- An error occurs when HBase uses the shell command: pleaseholdexception: Master is initializing solution
- Java query HBase outoforderscannernextexception
- ERROR: KeeperErrorCode = NoNode for /hbase/master
- Kylin startup error HBase common lib not found
- Some problems in installing wsl2 and NVIDIA docker in win10
- Hadoop reports an error. Cannot access scala.serializable and python MapReduce reports an error
- Analysis and solution of common errors in ArcEngine development