
How to Solve HiveServer2 & Beeline Errors

1. HiveServer2 starts and the process exists, but port 10000 is not open?

Add the following configuration to hive-site.xml and restart the HiveServer2 service:

<property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
</property>
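After restarting HiveServer2, you can verify that port 10000 is actually listening before trying Beeline again. A quick check, assuming netstat is installed on the node:

# the port should appear in LISTEN state once HiveServer2 is fully up
netstat -nltp | grep 10000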

2. Beeline cannot connect to HiveServer2 and reports org.apache.hadoop.security.authorize.AuthorizationException?

This is a permission problem: by default, Hadoop does not allow a user such as root to impersonate (proxy for) other users.

Add the following configuration to core-site.xml, distribute it to all nodes (do not forget this step), restart Hadoop, and connect again:

<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>
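If you prefer not to restart the whole cluster, the proxy-user settings can usually be reloaded on a running cluster with the refresh commands below (a sketch; run them against the NameNode and ResourceManager):

hdfs dfsadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshSuperUserGroupsConfiguration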

[Solved] Error: Could not open client transport with JDBC Uri

Error Messages:

[root@cpucode100 bin]# beeline -u jdbc:hive2://cpucode100:10000 -n root
Connecting to jdbc:hive2://cpucode100:10000
21/12/15 21:41:51 [main]: WARN jdbc.HiveConnection: Failed to connect to cpucode100:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://cpucode100:10000: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate root (state=08S01,code=0)
Beeline version 3.1.2 by Apache Hive

Connecting to jdbc:hive2://cpucode100:10000
21/12/16 21:15:53 [main]: WARN jdbc.HiveConnection: Failed to connect to cpucode100:10000
Could not open connection to the HS2 server. Please check the server URI and if the URI is correct, then ask the administrator to check the server status.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://cpucode100:10000: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)
Beeline version 3.1.2 by Apache Hive

Solution:
Add the following configuration to Hadoop's core-site.xml, restart Hadoop, then start HiveServer2 and Beeline again.
Replace root below with your own username:

	<property>
		<name>hadoop.proxyuser.root.hosts</name>
		<value>*</value>
	</property>
	<property>
		<name>hadoop.proxyuser.root.groups</name>
		<value>*</value>
	</property>

Distribute the configuration to all nodes:

xsync /etc

Restart Hadoop:

myhadoop.sh stop

myhadoop.sh start

hive --service hiveserver2

Wait here; starting HiveServer2 can take a long time, roughly 10 minutes.

beeline -u jdbc:hive2://cpucode100:10000 -n root
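Once connected, a simple statement confirms that the session works, for example:

0: jdbc:hive2://cpucode100:10000> show databases;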

[Solved] Hadoop Error When Cancelling the Quota Restriction on the Root Directory

When we apply a directory quota to the root directory of the HDFS instance:

hdfs dfsadmin -setQuota 2 /

The quota can be set successfully, but an error is reported when you try to clear it:

hdfs dfsadmin -clrQuota  /
clrQuota: Cannot clear namespace quota on root.

This cannot be fixed in the current version (2.9.2). The HDP release includes a patch that solves the problem.
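As a workaround, quotas can be applied to subdirectories instead of the root directory; those can be cleared normally. A sketch (the path /user/data is only an example):

# set and clear a quota on a subdirectory instead of /
hdfs dfsadmin -setQuota 2 /user/data
hdfs dfsadmin -clrQuota /user/data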

[Solved] Hive Query Error: java.net.ConnectException: Connection refused (ResourceManager port 8032)

The following errors occur when executing query statements in Hive:

ERROR : Job Submission failed with exception 'java.net.ConnectException(Call From ************ failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused)'
java.net.ConnectException: Call From *************** to ****************:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at sun.reflect.GeneratedConstructorAccessor34.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:755)
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1549)
	at org.apache.hadoop.ipc.Client.call(Client.java:1491)
	at org.apache.hadoop.ipc.Client.call(Client.java:1388)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
	at com.sun.proxy.$Proxy85.getNewApplication(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:274)
	at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy86.getNewApplication(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:270)
	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:278)
	at org.apache.hadoop.mapred.ResourceMgrDelegate.getNewJobID(ResourceMgrDelegate.java:196)
	at org.apache.hadoop.mapred.YARNRunner.getNewJobID(YARNRunner.java:271)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:157)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:423)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
	at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:329)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:700)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:804)
	at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:421)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1606)
	at org.apache.hadoop.ipc.Client.call(Client.java:1435)
	... 54 more

ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Call From ****** to hadoop102:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

Analyzing the problem, the connection exception keeps pointing at port 8032, which is the YARN ResourceManager port in my configuration. It turned out that my YARN service had never been started, so simply run the following command:

start-yarn.sh

After restarting YARN, execute the query statement again and it succeeds.
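Before re-running the query, you can confirm that YARN is actually up by checking the Java processes on the node:

jps
# ResourceManager and NodeManager should appear in the output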

[Solved] HBase Create Table Error: ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing

To create a table in the HBase shell:

A few days ago this worked fine; today, when I came back to review, it suddenly stopped working.
The symptom is that list works but tables cannot be created.
Much of what I read online blames ZooKeeper and suggests clearing HBase's data in ZooKeeper, which I could not do because I had not installed a ZooKeeper client, let alone cleared anything.
So the next steps were to change the configuration, synchronize the clocks, and eliminate the strange warnings.

Problem Description:

list works without problems in the HBase shell.
Creating a table fails with: ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing

Cause analysis:

1. HBase configuration file issues
2. Clocks are not synchronized across nodes
3. Strange warnings appear when HBase runs (it ran normally before, but these are hidden risks)
4. HBase is simply acting up

Solution:

1. Modify the HBase configuration file

In the HBase configuration file hbase-site.xml, change hbase.rootdir to hbase.root.dir. The resulting configuration file is as follows:


<configuration>
        <property>
                <name>hbase.root.dir</name>
                <value>hdfs://localhost:9000/hbase</value>
        </property>
        <property>
                <name>hbase.cluster.distributed</name>
                <value>true</value>
        </property>
        <property>
                <name>hbase.unsafe.stream.capability.enforce</name>
                <value>false</value>
        </property>
</configuration>

2. Synchronize the time

Enter the following command directly in the shell:

ntpdate 1.cn.pool.ntp.org

3. Eliminate strange messages from HBase

Add the following line directly to the hbase-env.sh file:

 export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP=true

Restart HBase and try creating the table again.
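A typical restart and verification sequence, assuming HBase's bin directory is on the PATH and the table name is just an example:

stop-hbase.sh
start-hbase.sh
# then verify in the HBase shell
hbase shell
status
create 'test_table', 'cf'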

Python: pip install pyinstaller Error [Solved]

1. Error when running pip

D:\date\IDEA\Game\PyInstaller-3.4>pip install pyinstaller
'pip' is not recognized as an internal or external command,
operable program or batch file.

Solution: add environment variables

Add a new system variable:
	Variable name: PYTHON_HOME
	Variable value: D:\software\anaconda3\Scripts
Modify the Path variable:
	Add: %PYTHON_HOME%
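After adding the variable, open a new command prompt and check that pip is now found (the exact version depends on your Anaconda installation):

pip --version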

2. Error when installing PyInstaller with pip

D:\date\IDEA\Game\PyInstaller-3.4>pip install pyinstaller
WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/pyinstaller/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/pyinstaller/
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/pyinstaller/
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/pyinstaller/
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/pyinstaller/
Could not fetch URL https://pypi.tuna.tsinghua.edu.cn/simple/pyinstaller/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.tuna.tsinghua.edu.cn', port=443): Max retries exceeded with url: /simple/pyinstaller/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping
ERROR: Could not find a version that satisfies the requirement pyinstaller (from versions: none)
ERROR: No matching distribution found for pyinstaller
WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
Could not fetch URL https://pypi.tuna.tsinghua.edu.cn/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.tuna.tsinghua.edu.cn', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping

Solution: add a trusted index source

D:\date\IDEA\Game\PyInstaller-3.4>pip install  pyinstaller  --index-url http://pypi.douban.com/simple/    --trusted-host pypi.douban.com
Looking in indexes: http://pypi.douban.com/simple/
Collecting pyinstaller
  Downloading http://pypi.doubanio.com/packages/da/98/ba0071620a2bd672cd98b398653517a296f988a8271d9d2f365a7922a51b/pyinstaller-4.7-py3-none-win_amd64.whl (2.0 MB)
     |████████████████████████████████| 2.0 MB 2.2 MB/s
Requirement already satisfied: pywin32-ctypes>=0.2.0 in d:\software\anaconda3\lib\site-packages (from pyinstaller) (0.2.0)
Collecting pyinstaller-hooks-contrib>=2020.6
  Downloading http://pypi.doubanio.com/packages/80/74/4c885df43604c4ae570610e187052f29d806d582e398c2e48b83dad74610/pyinstaller_hooks_contrib-2021.4-py2.py3-none-any.whl (215 kB)
     |████████████████████████████████| 215 kB ...
Collecting pefile>=2017.8.1
  Downloading http://pypi.doubanio.com/packages/ee/e1/a7bd302cf5f74547431b4e9b206dbef782d112df6b531f193bb4a29fb1b9/pefile-2021.9.3.tar.gz (72 kB)
     |████████████████████████████████| 72 kB ...
Collecting altgraph
  Downloading http://pypi.doubanio.com/packages/84/3f/1a5c9bef54cac9bf41edd6f4aaf61cd52ed578e10ccc607e0278012cb4a5/altgraph-0.17.2-py2.py3-none-any.whl (21 kB)
Requirement already satisfied: setuptools in d:\software\anaconda3\lib\site-packages (from pyinstaller) (58.0.4)
Requirement already satisfied: future in d:\software\anaconda3\lib\site-packages (from pefile>=2017.8.1->pyinstaller) (0.18.2)
Building wheels for collected packages: pefile
  Building wheel for pefile (setup.py) ... done
  Created wheel for pefile: filename=pefile-2021.9.3-py3-none-any.whl size=68844 sha256=139bf21106643db9019b3f0c33e2da41e2bda609170130bb24491cf1a5eeba0a
  Stored in directory: c:\users\edz\appdata\local\pip\cache\wheels\7b\f8\a2\c28c8f2797bf32ae30e18da14e5a5ab9cfbfb72da03593426d
Successfully built pefile
Installing collected packages: pyinstaller-hooks-contrib, pefile, altgraph, pyinstaller
Successfully installed altgraph-0.17.2 pefile-2021.9.3 pyinstaller-4.7 pyinstaller-hooks-contrib-2021.4

[Solved] MapReduce Class Cast Error: java.lang.ClassCastException

An error was reported while writing a MapReduce program:
java.lang.ClassCastException: class date2021_11_27_5.Commodity
    at java.lang.Class.asSubclass(Unknown Source)
    at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:887)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:1004)
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:402)
    at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:698)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
2021-11-29 10:00:26,301 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map task executor complete.
  2021-11-29 10:00:26,302 WARN [org.apache.hadoop.mapred.LocalJobRunner] - job_local49120036_0001
  java.lang.Exception: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :class date2021_11_27_5.Commodity
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :class date2021_11_27_5.Commodity
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:415)
    at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:698)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.ClassCastException: class date2021_11_27_5.Commodity
    at java.lang.Class.asSubclass(Unknown Source)
    at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:887)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:1004)
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:402)
    ... 10 more

 

How to Solve:
This happens because the custom entity class does not implement the WritableComparable<Commodity> interface, or does not override the compareTo method, so the key cannot be serialized and compared, which leads to the class cast exception.
There is another possibility: the class implements the Writable and Comparable<Commodity> interfaces separately instead of implementing WritableComparable<Commodity>, which MapReduce does not accept for keys.
Implementing the two interfaces separately works fine when the custom entity class is used as the value, but produces the error above as soon as it is used as the key.
The fix is to implement the WritableComparable<Commodity> interface instead and import the correct package.
Hadoop's rule is: if you implement Writable and Comparable<Commodity> separately, the custom entity class can only be used as a value; if you implement WritableComparable<Commodity>, it can be used as both key and value. A minimal sketch is shown below.
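A minimal sketch of such an entity class, assuming it only carries an id and a price field (the real Commodity fields are not shown in the post):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical Commodity entity usable as a MapReduce key or value.
public class Commodity implements WritableComparable<Commodity> {
    private String id = "";
    private double price;

    // A no-argument constructor is required so Hadoop can create instances by reflection.
    public Commodity() {}

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(id);
        out.writeDouble(price);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Must read fields in the same order they were written.
        id = in.readUTF();
        price = in.readDouble();
    }

    @Override
    public int compareTo(Commodity other) {
        // Example ordering: sort by price, descending.
        return Double.compare(other.price, this.price);
    }

    @Override
    public String toString() {
        return id + "\t" + price;
    }
}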

[Solved] Hive execute insert overwrite error: could not be cleaned up

Problem description

1. User Zhangsan executes an insert overwrite:

INSERT OVERWRITE TABLE temp.push_temp PARTITION(d_layer='app_video_uid_d_1')
SELECT ...

It fails with "could not be cleaned up":

Failed with exception Directory hdfs://Ucluster/user/hive/warehouse/temp.db/push_temp/d_layer=app_video_uid_d_1 could not be cleaned up.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Directory hdfs://Ucluster/user/hive/warehouse/temp.db/push_temp/d_layer=app_video_uid_d_1 could not be cleaned up.

2. Checking the HDFS directory permissions shows that the directory is writable by everyone, and the directory owner is Lisi:

drwxrwxrwt   - lisi supergroup          0 2021-11-29 15:04 /user/hive/warehouse/temp.db/push_temp/d_layer=app_video_uid_d_1

3. When user Lisi runs the same SQL from step 1, it executes successfully.

Cause of problem

In short: the sticky bit.
Look carefully at the directory permissions above. The last bit is "t", which means the sticky bit is set on the directory, so only the owner of a file (or the directory owner) can delete files under that directory:

# Non-owner delete sticky bit files
$ hadoop fs -rm /user/hive/warehouse/temp.db/push_temp/d_layer=app_video_uid_d_1/000000_0
21/11/29 16:32:59 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 7320 minutes, Emptier interval = 0 minutes.
rm: Failed to move to trash: hdfs://Ucluster/user/hive/warehouse/temp.db/push_temp/d_layer=app_video_uid_d_1/000000_0: Permission denied by sticky bit setting: user=admin, inode=000000_0

insert overwrite needs to delete the original files in the directory, but they cannot be deleted because of the sticky bit, so the HQL fails.

Solution

Cancel the sticky bit of the directory

# Remove the sticky bit
hadoop fs -chmod -R o-t /user/hive/warehouse/temp.db/push_temp/d_layer=app_video_uid_d_1

# Set the sticky bit again (to restore it afterwards)
hadoop fs -chmod -R o+t /user/hive/warehouse/temp.db/push_temp/d_layer=app_video_uid_d_1
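You can check whether the sticky bit is set by looking at the trailing permission bit of the directory:

hadoop fs -ls -d /user/hive/warehouse/temp.db/push_temp/d_layer=app_video_uid_d_1
# a trailing "t" (drwxrwxrwt) means the sticky bit is set; a plain "x" means it has been removed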

Datanode startup failed with an error: incompatible clusterids


Environment

Hadoop version: 3.3.1; OS: CentOS 7.4; Java version: Java SE 1.8.0_301

Error report summary

java.io.IOException: Incompatible clusterIDs in /opt/module/hadoop-3.3.1/data/dfs/data: namenode clusterID = CID-aa23cfe4-9ad3-4c06-87fc-e862c8f3a722; datanode clusterID = CID-55fa9a51-7777-4ff4-87d6-4df7cf2cb8b9

Problem description

An error is reported when the DataNode starts. The errors in /opt/module/hadoop-3.3.1/logs/hadoop-bordy-datanode-hadoop102.log are as follows:

2021-11-29 21:58:51,350 INFO org.apache.hadoop.hdfs.server.common.Storage: Using 1 threads to upgrade data directories (dfs.datanode.parallel.volumes.load.threads.num=1, dataDirs=1)
2021-11-29 21:58:51,354 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /opt/module/hadoop-3.3.1/data/dfs/data/in_use.lock acquired by nodename 13694@hadoop102
2021-11-29 21:58:51,356 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/opt/module/hadoop-3.3.1/data/dfs/data
java.io.IOException: Incompatible clusterIDs in /opt/module/hadoop-3.3.1/data/dfs/data: namenode clusterID = CID-aa23cfe4-9ad3-4c06-87fc-e862c8f3a722; datanode clusterID = CID-55fa9a51-7777-4ff4-87d6-4df7cf2cb8b9
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:746)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:296)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:409)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:389)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:561)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1753)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1689)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:394)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:295)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:854)
        at java.lang.Thread.run(Thread.java:748)
2021-11-29 21:58:51,358 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid a4eeff59-0192-4402-8278-4743158fa405) service to hadoop101/192.168.2.101:8020. Exiting.
java.io.IOException: All specified directories have failed to load.
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:562)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1753)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1689)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:394)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:295)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:854)
        at java.lang.Thread.run(Thread.java:748)
2021-11-29 21:58:51,359 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid a4eeff59-0192-4402-8278-4743158fa405) service to hadoop101/192.168.2.101:8020
2021-11-29 21:58:51,363 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid a4eeff59-0192-4402-8278-4743158fa405)
2021-11-29 21:58:53,364 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2021-11-29 21:58:53,424 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at hadoop102/192.168.2.102
************************************************************/

Cause of problem

Hadoop's upgrade feature requires each DataNode to store a permanent clusterID in its VERSION file. When the DataNode starts, it checks this clusterID against the NameNode's; if the two do not match, an "Incompatible clusterIDs" exception is thrown. See the official issue [HDFS-107].

Analysis steps

1. View the clusterID in the VERSION file under the DataNode directory /opt/module/hadoop-3.3.1/data/dfs/data/current.
2. View the clusterID in the VERSION file under the NameNode directory /opt/module/hadoop-3.3.1/data/dfs/name/current.
3. The clusterIDs in the two files do not match. In the HDFS architecture, every DataNode needs to communicate with the NameNode, and the clusterID is the unique ID of the cluster generated when the NameNode is formatted.

Solution

Change the clusterID of the failed DataNode to match the clusterID of the NameNode.
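A minimal sketch of the fix, assuming the default data directories shown in the log above (back the VERSION files up first):

# clusterID recorded by the NameNode
grep clusterID /opt/module/hadoop-3.3.1/data/dfs/name/current/VERSION

# clusterID recorded by the failed DataNode (run on that host)
grep clusterID /opt/module/hadoop-3.3.1/data/dfs/data/current/VERSION

# edit the DataNode VERSION file so its clusterID matches the NameNode's, then restart the DataNode
vi /opt/module/hadoop-3.3.1/data/dfs/data/current/VERSION
hdfs --daemon start datanode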

Reference

Hadoop failed to start DataNode: there is a problem with the clusterID - Wang Shen - cnblogs.com

Spark Error: java.lang.StackOverflowError [How to Solve]

Broadcasting a class instance in Spark raises java.lang.StackOverflowError.

Background: a 167 MB tree object needs to be broadcast, so the default stack memory is not enough.
Solution: add the option below to spark-submit (for now, because the data volume is small, the job runs in local mode):

spark-submit \
--class bp_beauty_op.beauty_op \
--master local[*] \
--driver-java-options "-Xss256m" \
test-1.0-SNAPSHOT.jar

Alternatively, add spark.driver.extraJavaOptions=-Xss256m to spark-defaults.conf and run the test again.
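A minimal sketch of the equivalent spark-defaults.conf entries; the executor-side option only becomes relevant once the job runs on a cluster rather than in local mode:

spark.driver.extraJavaOptions    -Xss256m
spark.executor.extraJavaOptions  -Xss256m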