
[Solved] Kafka Restart Error | Cloudera Manager Access Returns 500 | HDFS Startup Error

Hi~ Long time no update

1. Problems that need attention after restarting Kafka
While Kafka is running, it keeps a write file a in the target storage location. This file a stays in write state for a while (usually one hour), after which a new write file b is generated and the previous write file a is closed (how long this closing takes depends on each cluster's configuration). Then comes the restart.
Here comes the problem: after the restart a new write file b is created, but the previous write file a is never properly closed, so a stays in write state. Reading or writing file a then reports an error, and so does querying it after importing into Hive (loading into the Hive table does not report an error, but a select does), because the file is stuck in write state and cannot be operated on. This is also called a write lock (I believe everyone has heard of it).
Solution: we need to manually terminate the write state of those files. First we have to determine which files are in write state. Execute this command on the command line:
hdfs fsck /data/logs/ -openforwrite
(/data/logs/ is the directory where the write files live; change it to wherever your files are located.)
The files it lists are all in write state.
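If the directory is large, the raw fsck output can be noisy. A minimal filter sketch, assuming (as is usual) that fsck marks open files with an OPENFORWRITE flag in their line:

# keep only the entries that are still open for write
hdfs fsck /data/logs/ -openforwrite 2>/dev/null | grep OPENFORWRITE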


After seeing the write files, execute the command to close all of them. A quick explanation of why we close all of them: logically only the previous write file a needs to be closed, but closing everything also solves the problem and is simpler and more brute-force, because a new write file is generated automatically after the manual close, so it is safe to close them all. Now execute the command:
hdfs debug recoverLease -path /logs/common_log/2022-09-16/FlumeData.1663292498820.tmp -retries 3
(the -path value is a write-file path shown by the previous command)
Run it once for each file and the problem is solved. One more note: if the file has already been loaded into Hive, you also need to go to /user/warehouse/hive/ and find the write-state file there.
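To avoid typing recoverLease by hand for every file, here is a minimal loop sketch. It assumes the fsck line for each open file starts with the file path, and /data/logs/ is again just the example directory:

# close every file that HDFS still reports as open for write
DIR=/data/logs/
for f in $(hdfs fsck "$DIR" -openforwrite 2>/dev/null | grep OPENFORWRITE | awk '{print $1}'); do
    echo "recovering lease on $f"
    hdfs debug recoverLease -path "$f" -retries 3
done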


2. CDH's Cloudera Manager starts but browser access returns a 500 error
First check the configuration of the /etc/hosts file. Only these two lines plus the cluster's intranet IP mappings need to remain:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
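
For example, a full /etc/hosts could end up looking like this; the 192.168.x.x addresses and host names below are made-up placeholders, so use your own intranet values:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
# intranet IP mappings of the cluster (placeholder values)
192.168.1.10   nameNode
192.168.1.11   dataNode01
192.168.1.12   dataNode02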

Also check whether the firewall is blocking the CM-related ports, or whether something else is already occupying them.
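A minimal sketch of that check, assuming the default CM web port 7180; the firewall-cmd line only applies if firewalld is in use:

# is anything already listening on the CM web port (7180 by default)?
ss -lntp | grep 7180

# is the firewall running, and which ports does it currently allow? (firewalld systems)
systemctl status firewalld
firewall-cmd --list-ports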

Then restart CM. Execute the commands:
On the nameNode: systemctl stop cloudera-scm-server
Then execute systemctl stop cloudera-scm-agent on each node

On the nameNode: systemctl start cloudera-scm-server
Then execute systemctl start cloudera-scm-agent on each node
Attention!!! The execution order of these commands cannot be reversed; otherwise there may be problems with cluster startup.
Then you can run systemctl status cloudera-scm-server and systemctl status cloudera-scm-agent
to check that everything is running.
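Put together, the restart sequence looks like this; nameNode, dataNode01 and dataNode02 are placeholder host names, and the ssh loop assumes passwordless ssh from the nameNode to the other nodes:

# stop: server first (on the nameNode), then the agent on every node
systemctl stop cloudera-scm-server
for h in nameNode dataNode01 dataNode02; do ssh "$h" systemctl stop cloudera-scm-agent; done

# start in the same order: server first, then the agents
systemctl start cloudera-scm-server
for h in nameNode dataNode01 dataNode02; do ssh "$h" systemctl start cloudera-scm-agent; done

# verify
systemctl status cloudera-scm-server
systemctl status cloudera-scm-agent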

If CM starts and can be accessed, but starting HDFS fails with error 1 or 2:
1. Unable to retrieve non-local non-loopback IP address. Seeing address: cm/127.0.0.1
2. 2017-04-18 09:40:29,308 ERROR ScmActive-0:com.cloudera.server.cmf.components.ScmActive: ScmActive was not able to access CM identity to validate it.

Then congratulations, there is a solution.
First find CM's backing database. It was configured back when the cluster was installed; if you don't know which one it is, ask the person who installed it. In most cases it lives on the nameNode (don't ask me for the account and password ~). Then run show databases; and you will see a cm or scm database.


Use that database, then run show tables;
You will see a table called HOSTS. View its data: select * from HOSTS;


You will find that one row is off, i.e. its NAME and IP_ADDRESS are not the intranet values. You need to change them back to
the intranet name and IP_ADDRESS (I believe everyone knows how to modify it), then restart CM. It's done!
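
A minimal sketch of the whole database fix, assuming the backing database is MySQL and is named scm; the account, host name and IP below are placeholders, so substitute your own (the bad 127.0.0.1 value comes from the error message above):

# the account/password are whatever was configured when CM was installed;
# the database may be called cm instead of scm
mysql -u root -p scm -e "select * from HOSTS;"

# put the intranet host name and IP back (placeholder values);
# the where clause matches the bad row found by the select above
mysql -u root -p scm -e "update HOSTS set NAME='nameNode', IP_ADDRESS='192.168.1.10' where IP_ADDRESS='127.0.0.1';"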