Tag Archives: Hadoop

HDFS Java API operation error (user permission)

Problem Description:

There is a problem when running Hadoop HDFS in idea. The error is as follows:
org. Apache. Hadoop. Security. Accesscontrolexception: permission denied: user = XXXX, access = write, inode = “/”: root: supergroup: drwxr-xr-x
because the user name of this machine is different from that of the Linux operating system, an error will be reported.

Solution:

Under Linux system, find the directory where Hadoop is installed, and find etc/Hadoop/HDFS site. XML
under this directory

<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
  <description>
    If "true", enable permission checking in HDFS.
    If "false", permission checking is turned off,
    but all other behavior is unchanged.
    Switching from one parameter value to the other does not change the mode,
    owner or group of files or directories.
  </description>
</property>

Add the above code, restart the cluster, and there will be no problem in operation.

Solution:

The ultimate code, one line solution
Add system.setproperty ("hadoop_user_name", "root") to the Java code to set the permissions of the client to operate on HDFS. that will do


		
		
			This entry was posted in How to Fix and tagged Hadoop, hdfs, java on 2021-10-29 by Robins.



	
				
			
						
				Hadoop reports an error. Cannot access scala.serializable and python MapReduce reports an error
			
								


				
			
Record the problems encountered when doing school Hadoop homework. The homework is more basic, that is, calling Hadoop through makefile to execute the MapReduce program written in advance
Error 1
An error occurred in the Hadoop wordcount code
java: cannot access scala.Serializable class file for scala.Serializable not found

An error is reported
Solution: 
 through this Q & A on stack overflow, I guess that the scala version is incompatible with the Hadoop version, so rollback to 2.7 will solve the problem
Error report 2
Attempting to run Python on Hadoop. But an error is reported. The error information is not detailed: 
 insert a picture description here 
  
 solution: 
 add the following at the beginning of the source code:
#!/usr/bin/env python
# -*-coding:utf-8 -*


(the problem with the coding format is really that I don’t know how to debug it.)


					

		
		
			This entry was posted in How to Fix and tagged 1024 programmer section, Hadoop, MapReduce, python, Scala on 2021-10-29 by Robins.								

	


	
				
			
						
				hdfs-bug:DataXceiver error processing WRITE_BLOCK operation
			
								


				
			
The error message and screenshot are as follows:

calculation112.aggrx:50010:DataXceiver error processing WRITE_BLOCK operation  src: /10.1.1.116:36274 dst: /10.1.1.112:50010
java.io.IOException: Premature EOF from inputStream
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:203)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:901)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:808)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
    at java.lang.Thread.run(Thread.java:748)
　　......


Reason: the file operation exceeds the lease term, that is, the file is deleted during the data stream operation
Scheme:
Step 1: modify the maximum number of files opened in the process
cat /etc/security/limits.conf  | grep -v ^#


*        soft    nofile        1000000
*         hard    nofile        1048576
*        soft    nproc        65536
*        hard    nproc        unlimited
*        soft    memlock        unlimited
*        hard    memlock        unlimited
*         -      nofile          1000000


Step 2: (modify the number of data transmission threads)
 
Done.
 

					

		
		
			This entry was posted in How to Fix and tagged Big data, Counting warehouse, Hadoop, Hive, java on 2021-10-26 by Robins.								

	


	
				
			
						
				[Solved] failed on connection exception: java.net.ConnectException:  Connection denied
			
								


				
			
Processing method:

start ResourceManager

command:
yarn --daemon start resourcemanager

If failed on connection exception: java.net.connectexception: connection timeout occurs

just turn off the firewall.

					

		
		
			This entry was posted in JAVA and tagged abnormal, Hadoop, java on 2021-10-22 by Robins.								

	


	
				
			
						
				Tez Execute MR Task Error [How to Solve]
			
								


				
			Question
I’m executing the DWS Layer command  DWS_load_member_When start.sh 2020-07-21 , an error is reported. This is all the error information
which: no hbase in (:/opt/install/jdk1.8.0_231/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/install/hadoop-2.9.2/bin:/opt/install/hadoop-2.9.2/sbin:/opt/install/flume-1.9.0/bin:/opt/install/hive-2.3.7/bin:/opt/install/datax/bin:/opt/install/spark-2.4.5/bin:/opt/install/spark-2.4.5/sbin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/install/hive-2.3.7/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/install/tez-0.9.2/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/install/hadoop-2.9.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/opt/install/hive-2.3.7/lib/hive-common-2.3.7.jar!/hive-log4j2.properties Async: true
Query ID = root_20211014210413_76de217f-e97b-4435-adca-7e662260ab0b
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1634216554071_0002)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED  
----------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------
VERTICES: 00/00  [>>--------------------------] 0%    ELAPSED TIME: 8.05 s     
----------------------------------------------------------------------------------------------
Status: Failed--------------------------------------------------------------------------------
Application application_1634216554071_0002 failed 2 times due to AM Container for appattempt_1634216554071_0002_000002 exited with  exitCode: -103
Failing this attempt.Diagnostics: [2021-10-14 21:04:29.444]Container [pid=20544,containerID=container_1634216554071_0002_02_000001] is running beyond virtual memory limits. Current usage: 277.4 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1634216554071_0002_02_000001 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 20544 20543 20544 20544 (bash) 0 0 115900416 304 /bin/bash -c /opt/install/jdk1.8.0_231/bin/java  -Xmx819m -Djava.io.tmpdir=/opt/install/hadoop-2.9.2/data/tmp/nm-local-dir/usercache/root/appcache/application_1634216554071_0002/container_1634216554071_0002_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/opt/install/hadoop-2.9.2/logs/userlogs/application_1634216554071_0002/container_1634216554071_0002_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster --session 1>/opt/install/hadoop-2.9.2/logs/userlogs/application_1634216554071_0002/container_1634216554071_0002_02_000001/stdout 2>/opt/install/hadoop-2.9.2/logs/userlogs/application_1634216554071_0002/container_1634216554071_0002_02_000001/stderr  
	|- 20551 20544 20544 20544 (java) 367 99 2771484672 70721 /opt/install/jdk1.8.0_231/bin/java -Xmx819m -Djava.io.tmpdir=/opt/install/hadoop-2.9.2/data/tmp/nm-local-dir/usercache/root/appcache/application_1634216554071_0002/container_1634216554071_0002_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/opt/install/hadoop-2.9.2/logs/userlogs/application_1634216554071_0002/container_1634216554071_0002_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel= org.apache.tez.dag.app.DAGAppMaster --session 

[2021-10-14 21:04:29.458]Container killed on request. Exit code is 143
[2021-10-14 21:04:29.481]Container exited with a non-zero exit code 143. 
For more detailed output, check the application tracking page: http://hadoop1:8088/cluster/app/application_1634216554071_0002 Then click on links to logs of each attempt.
. Failing the application.
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Application application_1634216554071_0002 failed 2 times due to AM Container for appattempt_1634216554071_0002_000002 exited with  exitCode: -103
Failing this attempt.Diagnostics: [2021-10-14 21:04:29.444]Container [pid=20544,containerID=container_1634216554071_0002_02_000001] is running beyond virtual memory limits. Current usage: 277.4 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1634216554071_0002_02_000001 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 20544 20543 20544 20544 (bash) 0 0 115900416 304 /bin/bash -c /opt/install/jdk1.8.0_231/bin/java  -Xmx819m -Djava.io.tmpdir=/opt/install/hadoop-2.9.2/data/tmp/nm-local-dir/usercache/root/appcache/application_1634216554071_0002/container_1634216554071_0002_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/opt/install/hadoop-2.9.2/logs/userlogs/application_1634216554071_0002/container_1634216554071_0002_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster --session 1>/opt/install/hadoop-2.9.2/logs/userlogs/application_1634216554071_0002/container_1634216554071_0002_02_000001/stdout 2>/opt/install/hadoop-2.9.2/logs/userlogs/application_1634216554071_0002/container_1634216554071_0002_02_000001/stderr  
	|- 20551 20544 20544 20544 (java) 367 99 2771484672 70721 /opt/install/jdk1.8.0_231/bin/java -Xmx819m -Djava.io.tmpdir=/opt/install/hadoop-2.9.2/data/tmp/nm-local-dir/usercache/root/appcache/application_1634216554071_0002/container_1634216554071_0002_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/opt/install/hadoop-2.9.2/logs/userlogs/application_1634216554071_0002/container_1634216554071_0002_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel= org.apache.tez.dag.app.DAGAppMaster --session 

[2021-10-14 21:04:29.458]Container killed on request. Exit code is 143
[2021-10-14 21:04:29.481]Container exited with a non-zero exit code 143. 
For more detailed output, check the application tracking page: http://hadoop1:8088/cluster/app/application_1634216554071_0002 Then click on links to logs of each attempt.
. Failing the application.

It’s taken off to interpret this error message.
The logs at the beginning of  slf4j before line 9 do not need to be concerned, but only indicate that some unimpeded packages are missing;
Then is the implementation of the task I submitted this time
Logging initialized using configuration in jar:file:/opt/install/hive-2.3.7/lib/hive-common-2.3.7.jar!/hive-log4j2.properties Async: true
Query ID = root_20211014210413_76de217f-e97b-4435-adca-7e662260ab0b
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1634216554071_0002)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED  
----------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------
VERTICES: 00/00  [>>--------------------------] 0%    ELAPSED TIME: 8.05 s     
----------------------------------------------------------------------------------------------

Tell me again that the task execution failed, and the program exit number is  - 103 ①
Status: Failed--------------------------------------------------------------------------------
Application application_1634216554071_0002 failed 2 times due to AM Container for appattempt_1634216554071_0002_000002 exited with  exitCode: -103
Failing this attempt.Diagnostics: [2021-10-14 21:04:29.444]Container [pid=20544,containerID=container_1634216554071_0002_02_000001] is running beyond virtual memory limits. Current usage: 277.4 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container.

Followed by a list of reasons:  container [attribute describing container] is running beyond virtual memory limits. Current usage: 277.4 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container ② means that my running task (actually container, which is easy to understand here) exceeds the limit of virtual memory. The usage is 1g of physical memory. My task uses 277.4m, which is OK. It’s not too much, but I use 2.7g of virtual memory, which is obviously unreasonable, So nodemanager killed it.
The last part of the log, which is also the most informative part, will tell us where the problem will be recorded
Dump of the process-tree for container_1634216554071_0002_02_000001 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- ...
	|- ...

[2021-10-14 21:04:29.458]Container killed on request. Exit code is 143
[2021-10-14 21:04:29.481]Container exited with a non-zero exit code 143. 
For more detailed output, check the application tracking page: http://hadoop1:8088/cluster/app/application_1634216554071_0002 Then click on links to logs of each attempt.
. Failing the application.

Focus on the penultimate sentence:  for more detailed output, check the application tracking page: http://hadoop1:8088/cluster/app/application_1634216554071_0002 then click on links to logs of each attempt. http://hadoop1:8088/cluster/app/application_1634216554071_0002 find it. ③
Solution:
The old idea is that there are problems in resources, and there are only two directions: 1. Too heavy tasks and 2. Too few resources. In this case, the task is not heavy. It can also be seen from the occupied physical memory that the memory I allocate is more than enough to complete the task. The problem lies in the virtual memory. Then I have two ideas about the “virtual memory”. First, what configuration can intervene? Which configuration is it?
Next time you encounter this kind of problem, you can think about these points:
Cancel the check of virtual memory
yarn-site.xmlSet in or program yarn.nodemanager.vmem-check-enabledasfalse
<property>
 	<name>yarn.nodemanager.vmem-check-enabled</name>
	 <value>false</value>
 	<description>Whether virtual memory limits will be enforced for containers.</description>
</property>

In addition to virtual memory super, there may be a super-physical memory, also can be set to check physical memory yarn.nodemanager.pmem-check-enabledis false, personally think that this approach is not very good, if a program has a memory leak and other issues, cancel the check, it could lead to cluster collapse.
Increase mapreduce.map.memory.mbormapreduce.reduce.memory.mb
This method should be given priority. This method can not only solve the virtual memory, perhaps most of the time the physical memory is not enough, this method is just suitable.
<property>    
    <name>mapreduce.map.memory.mb</name>    
    <value>2048</value>    
    <description>maps</description>
</property>
<property>    
    <name>mapreduce.reduce.memory.mb</name>    
    <value>2048</value>    
    <description>reduces</description>
</property>

 
Properly increase yarn.nodemanager.vmem-pmem-ratiothe size, one physical memory increases multiple virtual memory, but this parameter should not be too outrageous, the essence is to deal with mapreduce.reduce.memory.db* yarn.nodemanager.vmem-pmem-ratio.
If the memory occupied by the task is too ridiculous, more consideration should be whether the program has a memory leak, whether there is data skew, etc., and the program should be given priority to solve such problems.
					

		
		
			This entry was posted in Error and tagged Hadoop, Hive, MR on 2021-10-20 by Robins.								

	


	
				
			
						
				The MapReduce program generates a jar package and runs with an error classnotfoundexception
			
								


				
			

Reason: The path to the driver class is not written up



					

		
		
			This entry was posted in JAVA and tagged Big data, error, Hadoop, jar, java, MapReduce on 2021-10-19 by Robins.								

	


	
				
			
						
				Error reported when debugging Hadoop cluster under windows failed to find winutils.exe
			
								


				
			
 
Modify 
  set Java in/etc/Hadoop/Hadoop env.cmd file_ HOME=%JAVA_ Home%  
 is (modified to the JDK location configured by your own machine) 
  set Java_ HOME=C:\Program Files\Java\jdk1.8.0_ 144  check fs.default.name 
 check whether the attribute value of fs.default.name in/etc/Hadoop/core-site.xml is consistent with that in the server. Inconsistency needs to be changed to consistency. Configure environment variables 
  Hadoop_ Home  and % Hadoop in path_ Home% \ bin  configure local hosts 
 192.168.19.185 Server1 
 192.168.19.184 server2 
 192.168.19.199 server3winutils.exe and hadoop.dll in Hadoop/bin directory, hadoop.dll in C: \ window \ system32 directory, and restart the IDE. If an error is still reported, restart the computer

					

		
		
			This entry was posted in How to Fix and tagged error, Hadoop, windows on 2021-10-19 by Robins.								

	


	
				
			
						
				docker service Failed to get D-Bus connection: Operation not permitted
			
								


				
			
[hadoop@hadoop03 fastdfs-docker]$ sudo docker run -d --privileged=true  a:v1
2be67a4f7e63d5277fc7bea8ca46946e40d526f93bc634166826de2bd5179554
[hadoop@hadoop03 fastdfs-docker]$ sudo docker exec -it 2b bash
[root@2be67a4f7e63 /]# service keepalived start
Redirecting to /bin/systemctl start keepalived.service
Failed to get D-Bus connection: Operation not permitted
[root@2be67a4f7e63 /]# exit
exit
[hadoop@hadoop03 fastdfs-docker]$ sudo docker stop 2b
2b
[hadoop@hadoop03 fastdfs-docker]$ sudo docker rm 2b
2b


Create a container and start with/usr/SBIN/init
######################
[hadoop@hadoop03 fastdfs-docker]$ sudo docker run -d --privileged=true  a:v1 /usr/sbin/init
2dfab2702a180537f58549457fd5993bb0ee3e6e2da72e0672af58d922286ffd
[hadoop@hadoop03 fastdfs-docker]$ sudo docker exec -it 2d bash
[root@2dfab2702a18 /]# service keepalived start

Redirecting to /bin/systemctl start keepalived.service
[root@2dfab2702a18 /]#
[root@2dfab2702a18 /]# service keepalived status
Redirecting to /bin/systemctl status keepalived.service
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2021-10-08 06:26:55 UTC; 4s ago
  Process: 250 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 251 (keepalived)
   CGroup: /docker/2dfab2702a180537f58549457fd5993bb0ee3e6e2da72e0672af58d922286ffd/system.slice/keepalived.service
           └─251 /usr/sbin/keepalived -D
           ‣ 251 /usr/sbin/keepalived -D

Oct 08 06:26:55 2dfab2702a18 systemd[1]: Started LVS and VRRP High Availability Monitor.
Oct 08 06:26:55 2dfab2702a18 Keepalived_vrrp[253]: Unable to load module xt_set - not using ipsets
Oct 08 06:26:55 2dfab2702a18 Keepalived_vrrp[253]: VRRP_Instance(VI_1) removing protocol VIPs.
Oct 08 06:26:55 2dfab2702a18 Keepalived_vrrp[253]: VRRP_Instance(VI_1) removing protocol iptable drop rule
Oct 08 06:26:55 2dfab2702a18 Keepalived_vrrp[253]: Using LinkWatch kernel netlink reflector...
Oct 08 06:26:55 2dfab2702a18 Keepalived_vrrp[253]: VRRP_Instance(VI_1) Entering BACKUP STATE
Oct 08 06:26:55 2dfab2702a18 Keepalived_vrrp[253]: VRRP sockpool: [ifindex(447), proto(112), unicast(0), fd(10,11)]
Oct 08 06:26:55 2dfab2702a18 Keepalived[251]: Keepalived_healthcheckers exited with permanent error FATAL. Terminating
Oct 08 06:26:55 2dfab2702a18 Keepalived[251]: Stopping
Oct 08 06:26:56 2dfab2702a18 Keepalived_vrrp[253]: Stopped
[root@2dfab2702a18 /]#


					

		
		
			This entry was posted in How to Fix and tagged docker, Hadoop on 2021-10-10 by Robins.								

	


	
				
			
						
				Run hadoop fs -put Command Error: java.io.IOException: Got error, status message , ack with firstBadLink
			
								


				
			
After the Hadoop cluster is set up, the following port error occurs on one node when the Hadoop FS – put command is executed
location:org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1495)
throwable:java.io.IOException: Got error, status message , ack with firstBadLink as ip:port
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:142)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1482)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1385)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:554)

First, check whether the firewalls of the three nodes are closed
firewall-cmd --state

Turn off firewall command
systemctl stop firewalld.service

If this situation still occurs after the firewall is closed, it may be that the ports of ECs are not open to the public. Take Alibaba cloud server as an example (Tencent cloud opens all ports by default):

After setting, execute the command again to succeed.

					

		
		
			This entry was posted in JAVA and tagged Big data, Hadoop, java on 2021-10-08 by Robins.								

	


	
				
			
						
				Error during job, obtaining debugging information [How to Solve]
			
								


				
			
Error:
ERROR : Ended Job = job_1631679144970_1574917 with errors
ERROR : Error during job, obtaining debugging information...
ERROR :
Task with the most failures(4):
-----
Task ID:
task_1631679144970_1574917_m_000158

URL:
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1631679144970_1574917&tipid=task_1631679144970_1574917_m_000158
-----
Diagnostic Messages for this Task:
Container [pid=18442,containerID=container_1631679144970_1574917_01_003316] is running beyond physical memory limits. Current usage: 4.1 GB of 4 GB physical memory used; 5.8 GB of 80 GB virtual memory used. Killing container.

I checked the written information on the Internet and said it was

java.lang.outofmemoryerror: Java heap space. The reason is that the memory space of the namenode is not enough and the JVM is not enough. It is caused by the start of a new job

the solution is to set the local mode
set hive.exec.mode.local.auto=true;

The specific error reported by me is running beyond physical memory limits. Current usage: 4.1 GB of 4 GB physical memory used

the physical memory size is insufficient
Solution:
set mapreduce.map.memory.mb=8192;
set mapreduce.reduce.memory.mb=8192;


					

		
		
			This entry was posted in Error and tagged Big data, bug, Hadoop, Hive on 2021-09-27 by Robins.								

	


	
				
			
						
				[Solved] NoClassDefFoundError: jline/console/completer/ArgumentCompleter$ArgumentDelimiter
			
								


				
			
An error occurs when importing data using sqoop:
Exception in thread “main” java.lang.NoClassDefFoundError: jline/console/completer/ArgumentCompleter$ArgumentDelimiter
Caused by: java.lang.ClassNotFoundException: jline.console.completer.ArgumentCompleter$ArgumentDelimiter
Solution:

Hadoop cannot find the jline package. Go to the Lib directory of hive to find jline.jar, put it in/share/Hadoop/yarn/Lib in Hadoop, and then execute it.

					

		
		
			This entry was posted in Error and tagged Big data, Hadoop, Hive on 2021-09-24 by Robins.								

	


	
				
			
						
				[Solved] Hadoop Start NameNode Error: ERROR: Cannot set priority of namenode process 2639
			
								


				
			
Project scene:
HadoopStart NameNodeReport Error: ERROR: Could not set priority of namenode process

=

Description of the question:
Hadoop Start NameNode errror: ERROR: Cannot set priority of namenode process 2639
[ atguigu@localhost333 ↓ logs]$ tail -100 [UNK]hadoop-localhost -namenode-hadoop333.log
😉
@Override public void run() { bytes = mmInStream.read(buffer); mHandler.obtainMessage(READ_DATA, bytes, -1, buffer).sendToTarget(); } 

Reasons analysis:
Date of First Visit: hadoop-atguigu-namenode-hadoop103.log
order as follows:

[localhost @hadoop333 logs]$ tail -100 ˚hadoop-atguigu-namenode-hadoop103.log log log log
2021-09-17 16:41:25,656 INFO org.apache.localhost.util.ExitUtil: 
Exiting with status 1: org.apache.localhost.hdfs.server.common.InconsistentFSStateException: 
Directory /opt/ha/hadoop-3.1.3/data/name is in an inconsistent state: 
storage directory does not exist or is not accessible.
2021-09-17 16:41:25,663 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop333/192.168.1.103


Solution:
Try to restart centos7, and then re delete the data and logs in Hadoop and the files in TMP in the system root directory to re initialize Hadoop. The problem is solved.
rm -rf /opt/ha/hadoop-3.1.3/data /opt/ha/hadoop-3.1.3/logs

sudo rm -rf /tmp/*

hdfs namenode -format


					

		
		
			This entry was posted in Error and tagged Big data, Hadoop, Hadoop Start NameNode Error on 2021-09-22 by Robins.								

	

			
				Post navigation
				← Older posts
				Newer posts →

ProgrammerAH

Programmer Guide, Tips and Tutorial

Tag Archives: Hadoop

HDFS Java API operation error (user permission)

Hadoop reports an error. Cannot access scala.serializable and python MapReduce reports an error

hdfs-bug:DataXceiver error processing WRITE_BLOCK operation

[Solved] failed on connection exception: java.net.ConnectException: Connection denied

Tez Execute MR Task Error [How to Solve]

The MapReduce program generates a jar package and runs with an error classnotfoundexception

Error reported when debugging Hadoop cluster under windows failed to find winutils.exe

docker service Failed to get D-Bus connection: Operation not permitted

Run hadoop fs -put Command Error: java.io.IOException: Got error, status message , ack with firstBadLink

Error during job, obtaining debugging information [How to Solve]

[Solved] NoClassDefFoundError: jline/console/completer/ArgumentCompleter$ArgumentDelimiter

[Solved] Hadoop Start NameNode Error: ERROR: Cannot set priority of namenode process 2639