Tag Archives: Hadoop

Exception on starting Hive: Caused by: java.net.NoRouteToHostException: No route to host

When starting Hive, we may run into the error that there is no route to the host. The most likely cause is that the firewall is turned on; we just need to turn it off, and the steps are easy to find online.
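For example, on CentOS 7 and later the firewall can be stopped with the systemd commands below; older systems use the iptables service instead. This is only a sketch, adjust it for your own distribution:

systemctl stop firewalld
systemctl disable firewalld

# on CentOS 6 and older:
service iptables stop
chkconfig iptables off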

Another possible cause is the IP address. Start the whole Hadoop cluster with start-all.sh,

and check whether all five daemon processes show up. If one is missing, the machine's IP address may have changed; in that case we need to modify the hadoop-env.sh file:

cd ~/hadoop/etc/hadoop

vi hadoop-env.sh

Then just update the IP address there.
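As a quick check (a sketch; the exact daemon list depends on your deployment), the steps above look roughly like this:

start-all.sh                # start HDFS and YARN
jps                         # on a single-node setup you would typically expect NameNode, DataNode,
                            # SecondaryNameNode, ResourceManager and NodeManager
hostname -i                 # compare the machine's current IP with the one recorded in the config
cd ~/hadoop/etc/hadoop
vi hadoop-env.sh            # update the IP address here, as described above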

Some problems encountered while developing with HBase and MapReduce

Recently, for a course project, the main workflow was to collect data from a CSV file, store it in HBase, and then use MapReduce to do statistical analysis on the data. Along the way I ran into some problems that were eventually solved through a lot of searching; they and their solutions are recorded here.

1. HBase HMaster automatically shutting down

Enter ZooKeeper, delete the HBase data (use with caution), and restart HBase:

./zkCli.sh                 # open the ZooKeeper client
rmr /hbase                 # delete HBase's znode (use deleteall /hbase on newer ZooKeeper versions)
stop-hbase.sh
start-hbase.sh

2. Dealing with multi-module dependencies when packaging with Maven

The project structure is shown in the figure below

Both the ETL and statistics modules depend on the common module. When they are packaged separately, Maven reports that the common dependency cannot be found and the packaging fails.

Solution steps:

1. Run Maven package and Maven install for the common module; I use IDEA and trigger them directly from the Maven panel on the right.

2. Run the Maven clean and Maven install commands in the outermost aggregate project (root).

After completing these two steps, the problem can be solved.
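The same two steps can also be run from the command line; a minimal sketch, assuming the common module lives in a directory named common under the root project:

# step 1: install the common module into the local Maven repository
cd common
mvn clean install

# step 2: build and install from the outermost (root) project
cd ..
mvn clean install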

3. Chinese text stored in HBase appears as something like “\xE5\x8F\x91\xE6\x98\x8E”

This is the classic Chinese encoding problem. It can be solved by calling the following method on the value before using it: the method rewrites each \x escape as a % escape and then URL-decodes the string as UTF-8.

public static String decodeUTF8Str(String xStr) throws UnsupportedEncodingException {
    // turn "\xE5\x8F\x91..." into "%E5%8F%91..." and URL-decode it as UTF-8
    return URLDecoder.decode(xStr.replaceAll("\\\\x", "%"), "utf-8");
}

4. Error when submitting a MapReduce job

The code was written locally, packaged into a jar, and run on the server, where it failed with the following error:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.Job.getArchiveSharedCacheUploadPolicies(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/Map;
    at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:491)
    at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:92)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:172)
    at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
    at MapReduce.main(MapReduce.java:49)

Solution: add dependencies

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>3.1.3</version>
</dependency>

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-common</artifactId>
    <version>3.1.3</version>
</dependency>

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>3.1.3</version>
</dependency>

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
    <version>3.1.3</version>
    <scope>provided</scope>
</dependency>

Among them,

hadoop-mapreduce-client-core.jar supports running on a cluster;

hadoop-mapreduce-client-common.jar supports running locally.

After solving the above problems, my code can run smoothly on the server.

Finally, note that the MapReduce output path must not already exist, otherwise the job will fail with an error.
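For example, the old output directory can be removed before resubmitting the job (/output below is just a placeholder for your own output path):

hadoop fs -rm -r -f /output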

I hope this article can help you with similar problems.

HBase shell reports ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing

Problem description

Because of an HBase version issue, after I changed the HBase version, opening the HBase shell and querying a table reported the error in the title:

How to solve it

This problem is generally caused by the HRegionServer node failing.

    1. First, use jps to check whether the HRegionServer process started normally (it has usually died), and look at the hbase-site.xml configuration file. My problem was here: because the installation switched versions and has not been integrated with Phoenix yet, all the Phoenix mapping configuration in the file had to be commented out, otherwise HRegionServer will not start normally.

 

    2. The following is a correct configuration file:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/export/data/hbase/tmp</value>
   <!-- Create this tmp directory under the HBase installation directory -->
  </property>
  <property>
    <name>hbase.master</name>
    <value>node1:16010</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>
    <property>
    <name>hbase.rootdir</name>
    <value>hdfs://node1:9000/hbase</value>
  </property>
    <property>
    <name>hbase.zookeeper.quorum</name>
    <value>node1,node2,node3:2181</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
	<value>false</value>
  </property>
  <property>
	<name>hbase.zookeeper.property.dataDir</name>
	<value>/export/servers/zookeeper/data</value>
  </property>
</configuration>

3. Stop HBase, delete /hbase on HDFS, and restart HBase

# stop hbase
stop-hbase.sh
# delete /hbase
hadoop fs -rm -r /hbase
# start hbase
start-hbase.sh

Note: when this kind of error occurs, check whether the configuration file is correct and restart HBase (the restart requires deleting /hbase first), i.e. go through step 3 again.

[Solved] Hadoop error java.lang.NoSuchMethodError

Record a Hadoop error:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.Job.getArchiveSharedCacheUploadPolicies(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/Map;
	at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:491)
	at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:93)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:172)
	at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:794)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
	at hadoop.mapjoin.MapJoinDriver.main(MapJoinDriver.java:59)

At this point, the dependencies introduced in pom.xml were:

<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-web</artifactId>
    <version>4.3.16.RELEASE</version>
</dependency>
<!--Dependencies used by hbase-->
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>2.0.0</version>
</dependency>

<!--hadoop dependencies-->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>3.2.2</version>
</dependency>
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.16.18</version>
</dependency>

<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
</dependency>
<!--Log information-->
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>1.7.30</version>
</dependency>

However, after the HBase dependencies are removed, WordCount runs normally, so it looks like a compatibility problem between the two versions. The recommended version combinations are listed on the official website: http://hbase.apache.org/book.html#java
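A NoSuchMethodError like this usually means that two different Hadoop versions end up on the classpath. One way to confirm this is to inspect which Hadoop artifacts each module actually pulls in, for example:

mvn dependency:tree -Dincludes=org.apache.hadoop

If hbase-server 2.0.0 drags in an older hadoop-mapreduce-client-core than the Hadoop 3.2.2 cluster expects, that would explain the missing method.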

CDH NameNode abnormal stop, Error: flush failed for required journal (JournalAndStream(mgr=QJM to

The error information is as follows:

2020-12-09 14:07:56,509 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [xxx:8485, xxx:8485, xxx:8485], stream=QuorumOutputStream starting at txid 74798133))
2020-12-09 14:07:56,499 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting QuorumOutputStream starting at txid 74798133
        at java.lang.Thread.run(Thread.java:748)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:243)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:711)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:521)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:55)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:385)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:525)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:109)
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:286)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:27401)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:162)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:179)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:372)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:484)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:458)
xxx:8485: IPC's epoch 33 is less than the last promised epoch 34

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:27401)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:162)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:179)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:372)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:484)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:458)
xxx:8485: IPC's epoch 33 is less than the last promised epoch 34

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:27401)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:162)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:179)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:372)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:484)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:458)
xxx:8485: IPC's epoch 33 is less than the last promised epoch 34
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
2020-12-09 14:07:56,496 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [xxx:8485, xxx:8485, xxx:8485], stream=QuorumOutputStream starting at txid 74798133))
2020-12-09 14:07:56,494 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 7611ms to send a batch of 2 edits (179 bytes) to remote journal xxx:8485
        at java.lang.Thread.run(Thread.java:748)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:389)
        at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:396)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal(QJournalProtocolTranslatorPB.java:187)
        at com.sun.proxy.$Proxy19.journal(Unknown Source)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
        at org.apache.hadoop.ipc.Client.call(Client.java:1355)
        at org.apache.hadoop.ipc.Client.call(Client.java:1445)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1499)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:27401)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:162)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:179)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:372)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:484)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:458)
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 33 is less than the last promised epoch 34
2020-12-09 14:07:56,492 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Remote journal xxx:8485 failed to write txns 74798134-74798135. Will try to write to this JN again after the next log roll.
]
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:27401)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:162)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:179)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:372)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:484)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:458)
, xxx:8485: IPC's epoch 33 is less than the last promised epoch 34
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:27401)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:162)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:179)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:372)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:484)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:458)
2020-12-09 14:07:55,886 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 7003 ms (timeout=20000 ms) for a response for sendEdits. Exceptions so far: [xxx:8485: IPC's epoch 33 is less than the last promised epoch 34
]
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:27401)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:162)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:179)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:372)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:484)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:458)
, xxx:8485: IPC's epoch 33 is less than the last promised epoch 34
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:27401)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:162)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:179)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:372)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:484)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:458)
2020-12-09 14:07:54,883 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 6001 ms (timeout=20000 ms) for a response for sendEdits. Exceptions so far: [xxx:8485: IPC's epoch 33 is less than the last promised epoch 34
        at java.lang.Thread.run(Thread.java:748)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:389)
        at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:396)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal(QJournalProtocolTranslatorPB.java:187)
        at com.sun.proxy.$Proxy19.journal(Unknown Source)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
        at org.apache.hadoop.ipc.Client.call(Client.java:1355)
        at org.apache.hadoop.ipc.Client.call(Client.java:1445)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1499)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:27401)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:162)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:179)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:372)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:484)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:458)
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 33 is less than the last promised epoch 34
2020-12-09 14:07:49,776 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Remote journal xxx:8485 failed to write txns 74798134-74798135. Will try to write to this JN again after the next log roll.

With HA configured, one of the NameNodes stopped. The key message "IPC's epoch 33 is less than the last promised epoch 34" usually points to a network problem. Reading the log shows that every time the other NameNode starts it probes port 8485 on the three JournalNode services and reports failures,
which again suggests a network problem. The troubleshooting went as follows:
ifconfig -a: check whether the network card is dropping packets
check /etc/sysconfig/selinux: SELINUX=disabled is set correctly
/etc/init.d/iptables status: check whether the firewall is running; Hadoop runs on the intranet, and the firewall was turned off when the cluster was deployed
check the firewalls on the three JournalNode servers one by one; they are all off

Solutions found online:
1) Increase the JournalNode write timeout,
for example dfs.qjournal.write-txns.timeout.ms = 90000

In a real production environment this kind of timeout happens fairly easily, so the default 20 s should be raised to a larger value such as 60 s or 90 s.

We can add the following property to hdfs-site.xml under hadoop/etc/hadoop:

<property>
  <name>dfs.qjournal.write-txns.timeout.ms</name>
  <value>60000</value>
</property>

On a CDH cluster, search for dfs.qjournal.write-txns.timeout.ms in the HDFS configuration interface.
2) Tune the NameNode JVM parameters so that full GC is triggered earlier and each full GC takes less time.
3) The NameNode's default full GC collector is the parallel collector, which is fully stop-the-world; switch it to CMS by adjusting the NameNode startup parameters:
-XX:+UseCompressedOops
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
-XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0
-XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC
-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75
-XX:SoftRefLRUPolicyMSPerMB=0

Error in initializing namenode when configuring Hadoop!!!

When formatting the NameNode on Linux, the following exception may appear during format:

ERROR namenode: Failed to start namenode (JAVA_HOME is not set and could not be found)
1. Check the NameNode path in hdfs-site.xml
2. If the user's permissions are insufficient, give the user ownership of the Hadoop folder: sudo chown -R hadoop ./hadoop
3. Configure JAVA_HOME in hadoop-env.sh, as sketched below.
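A minimal sketch of step 3, assuming the JDK is installed under /opt/module/jdk1.8.0 (replace the path with your own installation):

# in hadoop-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0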

ERROR: Invalid HADOOP_COMMON_HOME

Error when starting Hadoop YARN:
./start-yarn.sh
ERROR: Invalid HADOOP_COMMON_HOME

Solution:
1. First check whether JAVA_HOME is configured correctly:

java
javac

2. Then check whether HADOOP_HOME is configured correctly (if it is missing, see the sketch after the version output below):

hadoop version
Hadoop 3.1.3
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r ba631c436b806728f8ec2f54ab1e289526c90579
Compiled by ztang on 2019-09-12T02:47Z
Compiled with protoc 2.5.0
From source with checksum ec785077c385118ac91aadde5ec9799
This command was run using /opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-common-3.1.3.jar
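If HADOOP_HOME itself is missing, it can be set in /etc/profile (or your shell profile); a sketch assuming Hadoop is installed under /opt/module/hadoop-3.1.3, as in the output above:

# /etc/profile
export HADOOP_HOME=/opt/module/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

source /etc/profile    # reload, then re-check with: hadoop version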

Solving the problem where the node count does not increase and the node list is unstable after dynamically adding a node to a Hadoop cluster

Symptoms

The Hadoop cluster currently has three nodes, 01, 02 and 03, where 01 is the master node and 02 and 03 are slave nodes. After dynamically adding a new slave node 04, the number of nodes displayed in the Hadoop web interface does not increase. Closer observation shows that the slave nodes in the node list are sometimes 02 and 04 and sometimes 02 and 03, which is very strange.

 

Cause analysis

Looking at the current/VERSION file in the dataDir directory, the datanodeUuid values on the 03 and 04 servers are exactly the same, so it can be inferred that the master node treats 03 and 04 as the same server. The reason is that the 03 server had been started before, so its datanodeUuid was already recorded in the generated VERSION file; when the virtual machine was cloned, all of that information was carried over to the 04 server unchanged.

 

Solution

If the 04 server had been installed directly rather than cloned, its dataDir and tmpDir directories would be empty. So stop the datanode and nodemanager services on the 04 server, delete all files in the dataDir and tmpDir directories, and restart the datanode and nodemanager services. Observing again, the Hadoop cluster now shows the three datanodes 02, 03 and 04, which matches the expectation, and the problem is solved.
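The cleanup on the 04 server looks roughly like this (a sketch using Hadoop 3.x commands; on Hadoop 2.x use hadoop-daemon.sh/yarn-daemon.sh instead, and the two directories below are placeholders for the dataDir and tmpDir actually configured on your cluster):

# on the 04 server
hdfs --daemon stop datanode
yarn --daemon stop nodemanager

# clear the data cloned from the 03 server so fresh IDs are generated on the next start
rm -rf /export/data/hadoop/data/*    # placeholder for the configured dataDir
rm -rf /export/data/hadoop/tmp/*     # placeholder for the configured tmpDir

hdfs --daemon start datanode
yarn --daemon start nodemanager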

Points to note

When adding new nodes by copying or cloning virtual machines, the following operations need to be done:

1. Modify the host name and IP address, restart the new node, and copy the /etc/hosts file to the other nodes in the cluster (if a local DNS server is used, this step can be skipped; just add a domain name resolution record for the modified node on the DNS server);

2. The new node needs to re-run ssh-keygen -t rsa to generate a key pair, add the public key to authorized_keys, and copy it to the other nodes in the cluster, as sketched below.
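A sketch of the SSH setup on the new node (node01, node02 and node03 are placeholders for the existing cluster nodes):

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh-copy-id node01
ssh-copy-id node02
ssh-copy-id node03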

Reference material:

1. Hadoop datanode starts normally, but there is no node in live nodes

https://blog.csdn.net/wk51920/article/details/51729460

2. Hadoop 2.7 dynamically adding and deleting nodes

https://blog.csdn.net/Mark_LQ/article/details/53393081

Solution for the Hadoop DataNode process not showing up in jps

Problem description: after the Hadoop environment was deployed, running jps on the slave machine does not show the DataNode process.

Solution: on the slave, delete everything in the DataNode data folder configured in hdfs-site.xml (the dfs.data.dir parameter), and then run

hadoop namenode -format

to reinitialize the NameNode.

Reason: the NameNode has been formatted several times, but the old initialization data in the DataNode folder was never cleared, so the IDs recorded in the two folders no longer match. After deleting the initialization data in the DataNode folder, the NameNode format takes effect; start Hadoop again and jps will show the DataNode process.
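To confirm the mismatch before deleting anything, compare the IDs recorded in the VERSION files on both sides (the /data/hadoop/dfs paths below are placeholders for your configured directories):

# on the master (NameNode)
cat /data/hadoop/dfs/name/current/VERSION    # note the clusterID
# on the slave (DataNode)
cat /data/hadoop/dfs/data/current/VERSION    # the clusterID here should match the NameNode's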

Solution to the @vue/cli error "failed to get response from /vue-cli-version-marker"

Thanks to CSDN user @beeegood for reporting the problem.

Today I ran into a very interesting bug: when creating a project with @vue/cli, it reported an error I had never seen before:

At first I thought it was a version problem, since the error message mentioned updating, but the CLI version was already the latest. After asking, the Node and npm versions were also the latest (12.16.1, the latest as of writing). Most importantly, there was no old vue-cli installed.

That is quite interesting. By convention, the first reaction to a front-end problem is to uninstall and reinstall:

npm uninstall -g @vue/cli
npm cache clean --force
npm install -g @vue/cli

However, it was useless. After searching for a long time I could not find any related error report online, which was quite embarrassing.

Later I noticed that there was some yarn output below. Is yarn built into the CLI? But that shouldn't be the case:

Although it seemed unlikely, I decided to check the yarn version anyway, since yarn was the prime suspect. Sure enough:

The problem was found. But what is this? Hadoop?

Then I remembered that YARN is also a component of Hadoop, used for resource scheduling.
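A quick way to see which yarn the shell is actually picking up:

which yarn     # if this resolves to $HADOOP_HOME/bin/yarn, it is Hadoop's YARN script, not the JS package manager
echo $PATH     # check whether Hadoop's bin directory comes before the Node.js global bin directory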

Hadoop — HDFS data writing process

HDFS write data flow

Suppose there is a local file data.txt. First, the HDFS client creates a FileSystem connection object and then asks the NameNode to upload the file /user/fzl/data.txt. After receiving the request, the NameNode responds that the client may upload. The client then requests to upload the first block (0-128 MB) and asks which DataNodes to use; the NameNode returns datanode1, datanode2 and datanode3 (chosen with node distance, load balancing and other factors in mind), meaning these three nodes will store the data. After receiving the node list, the client creates an FSDataOutputStream and asks datanode1 to set up a block transmission channel; datanode1 asks datanode2 to set up a channel, and datanode2 does the same with datanode3. datanode3 replies to datanode2, datanode2 replies to datanode1, and datanode1 replies to the FSDataOutputStream that the pipeline is ready. Data is then sent as packets: the stream collects chunks of 512 bytes into a 64 KB packet and then transmits the packet object. When a packet reaches datanode1, it first writes the data to disk and then forwards it from memory directly to datanode2, and datanode2 forwards to datanode3 in the same way. This avoids the situation where one DataNode failing to write breaks the whole transfer. When all blocks have been transferred, the data write operation is complete.
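For reference, the client-side action that triggers this whole pipeline is simply an upload, for example:

hdfs dfs -put data.txt /user/fzl/data.txt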

The flow chart is as follows:

Node distance calculation

Node distance: the sum of the distances from two nodes to the nearest common ancestor

Figure 2-10: node distance: 3

Rack awareness

The first replica is placed on the node where the client is located; if the client is outside the cluster, a node is chosen at random. The second replica is placed on a random node of another rack, and the third replica on a random node of the rack where the second replica is located.