Tag Archives: Hadoop

[Solved] Flink Error: Flink Hadoop is not in the classpath/dependencies

Error background:

When installing Flink on a YARN cluster, the Flink cluster could not be started.

Version:

flink-1.14.6

hadoop-3.2.3

org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:216) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:617) [flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:59) [flink-dist_2.12-1.14.6.jar:1.14.6]
Caused by: java.io.IOException: Could not create FileSystem for highly available storage path (hdfs:/flink/ha/default)
	at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:92) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:76) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:121) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:361) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:318) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:243) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:193) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:190) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	... 2 more
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded. For a full list of supported file systems, please see https://nightlies.apache.org/flink/flink-docs-stable/ops/filesystems/.
	at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:532) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:409) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.core.fs.Path.getFileSystem(Path.java:274) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:89) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:76) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:121) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:361) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:318) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:243) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:193) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:190) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	... 2 more
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.
	at org.apache.flink.core.fs.UnsupportedSchemeFactory.create(UnsupportedSchemeFactory.java:55) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:528) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:409) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.core.fs.Path.getFileSystem(Path.java:274) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:89) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:76) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:121) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:361) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:318) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:243) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:193) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:190) ~[flink-dist_2.12-1.14.6.jar:1.14.6]
	... 2 more

Cause of the error:
Flink needs two additional jar dependencies to access HDFS. They are not shipped with the Flink distribution, so you have to add them yourself:

  1. flink-shaded-hadoop-3-3.1.1.7.2.9.0-173-9.0.jar
  2. commons-cli-1.5.0.jar

Solution:

Search for these two jar packages in the Maven repository and download them: https://mvnrepository.com/

Put both jars into the lib directory of the Flink installation and restart the cluster, for example as sketched below.
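
A minimal sketch of that step, assuming Flink is installed under /opt/flink-1.14.6 and the two jars have already been downloaded to the current directory (adjust the paths to your environment):

# copy the downloaded jars into Flink's lib directory
cp flink-shaded-hadoop-3-3.1.1.7.2.9.0-173-9.0.jar commons-cli-1.5.0.jar /opt/flink-1.14.6/lib/

# restart the cluster so the new jars are picked up
/opt/flink-1.14.6/bin/stop-cluster.sh
/opt/flink-1.14.6/bin/start-cluster.sh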

[Solved] Kafka Restarts error | Cloudera Manager Access Returns 500 | HDFS Startup Error

Hi~ Long time no update

1. What to watch out for after restarting Kafka:
While running, Kafka keeps a write file "a" open at the target storage location. File "a" stays in write state for a while (usually about an hour), after which a new write file "b" is created and the previous file "a" is closed (how long this takes depends on each cluster's configuration). Then comes the restart, and here is the problem: after restarting, a new write file is created, but the previous write file "a" is left in write state. Reading or writing file "a" then reports an error, and importing it into Hive also reports an error (loading it into a Hive table does not fail, but selecting from the table does), because the file is permanently in write state and cannot be operated on; this is also known as a write lock (which I believe everyone has heard of).
Solution: we need to manually terminate the write state of these files. First determine which files are in write state by executing this command on the command line:
hdfs fsck /data/logs/ -openforwrite
(replace /data/logs/ with the directory your files are in)
The files it lists are all in write state:


After seeing the write files, execute the command to stop all of them. Why stop all of them? Logically only the previous write file needs to be stopped, but stopping all of them also solves the problem and is simpler and more brute-force: manually stopping a file automatically triggers a new write file, so it is safe to stop them all. Now execute the command:
hdfs debug recoverLease -path /logs/common_log/2022-09-16/FlumeData.1663292498820.tmp -retries 3
(the path is a write file path shown by the previous command)
Run the command once per file; this per-file step can also be scripted, as sketched below. One more note: if the file has already been loaded into Hive, you also need to look under /user/warehouse/hive/ for the corresponding write-state file.
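
A minimal sketch of that loop, assuming each open file's path is the first token on its line in the fsck output (check against your own output before running; the directory is the example one from above):

# close the write state (recover the lease) of every file still open for write
for f in $(hdfs fsck /data/logs/ -openforwrite 2>/dev/null | grep OPENFORWRITE | awk '{print $1}'); do
    hdfs debug recoverLease -path "$f" -retries 3
done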


2. CDH's Cloudera Manager starts, but accessing it in the browser returns a 500 error:
① First check the /etc/hosts file: only the following two lines, plus the intranet IP mappings of the cluster hosts, should remain:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

② Also check whether the ports that CM needs are being blocked by the firewall.

③ Then restart CM by executing the commands:
On nameNode: systemctl stop cloudera-scm-server
Then on each node: systemctl stop cloudera-scm-agent

On nameNode: systemctl start cloudera-scm-server
Then on each node: systemctl start cloudera-scm-agent
Attention!!! The execution order of these commands must not be reversed, otherwise the cluster may fail to start. A scripted version of this sequence is sketched below.
Afterwards, check the result with systemctl status cloudera-scm-server and systemctl status cloudera-scm-agent.
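
A minimal sketch of that ordered restart, assuming it is run on the CM server host (nameNode) and that agent1/agent2 are placeholder agent hostnames reachable over ssh (replace with your own):

# stop the server first, then the agents; start them back in the same order
systemctl stop cloudera-scm-server
for h in agent1 agent2; do ssh "$h" systemctl stop cloudera-scm-agent; done

systemctl start cloudera-scm-server
for h in agent1 agent2; do ssh "$h" systemctl start cloudera-scm-agent; done

# check the result
systemctl status cloudera-scm-server
for h in agent1 agent2; do ssh "$h" systemctl status cloudera-scm-agent; done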

④ If CM starts and is accessible, but starting HDFS reports error 1 or 2:
1. Unable to retrieve non-local non-loopback IP address. Seeing address: cm/127.0.0.1
2. ERROR ScmActive-0:com.cloudera.server.cmf.components.ScmActive: ScmActive was not able to access CM identity to validate it. 2017-04-18 09:40:29,308 ERROR ScmActive-0

Then congratulations, there is a solution.
First find CM's backing database. It was configured when CM was installed (if you don't know where, ask whoever installed it; in most cases it is on nameNode, and don't ask me for the account and password ~). Connect to it and run show databases; you will see a cm or scm database.


Switch to that database with use, then run show tables;
You will see a table called HOSTS. View its data with select * from HOSTS;


You will find one row that is different: its NAME and IP_ADDRESS do not match the others. Modify them back to the intranet hostname and IP_ADDRESS (I believe everyone can write that update; see the sketch below). Then restart CM and it's done!
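
A minimal sketch of that fix, assuming a MySQL-backed scm database; the account, database name, WHERE condition and values are placeholders to replace with what you actually see in your own HOSTS table:

# inspect the rows, then correct the mismatched one (example values only)
mysql -u scm -p scm -e "SELECT NAME, IP_ADDRESS FROM HOSTS;"
mysql -u scm -p scm -e "UPDATE HOSTS SET NAME='nameNode', IP_ADDRESS='192.168.1.10' WHERE HOST_ID=1;"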

start-all.sh Execute error: Stopping journal nodes [slave2 slave1 master]…

Stopping journal nodes [slave2 slave1 master]
ERROR: Attempting to operate on hdfs journalnode as root
ERROR: but there is no HDFS_JOURNALNODE_USER defined. Aborting operation.
Stopping ZK Failover Controllers on NN hosts [master slave1 slave2]
ERROR: Attempting to operate on hdfs zkfc as root
ERROR: but there is no HDFS_ZKFC_USER defined. Aborting operation.

 

Error Cause:
Attempting to operate on the HDFS daemons (journalnode, zkfc, namenode, ...) as the root user, but the corresponding HDFS_*_USER variables (e.g. HDFS_JOURNALNODE_USER) are not defined, so the operation is aborted.

Solution:
Add the missing user variables to the environment.
1. Open the environment variable file:
vi ~/.bash_profile
2. Add the following lines:

#hadoop
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HDFS_ZKFC_USER=root

3. Enable the environment variables:
source ~/.bash_profile
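
After sourcing, the failing script can be retried in the same shell; a quick sanity check (sketch):

# confirm the variables are now set, then rerun the start script
echo "$HDFS_JOURNALNODE_USER $HDFS_ZKFC_USER"
start-all.sh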

[Solved] java Internal error in the mapping processor java.lang.NullPointerException

Error:

java: Internal error in the mapping processor: java.lang.NullPointerException  	
at org.mapstruct.ap.internal.processor.DefaultVersionInformation.createManifestUrl(DefaultVersionInformation.java:182)  	
at org.mapstruct.ap.internal.processor.DefaultVersionInformation.openManifest(DefaultVersionInformation.java:153)  	
at org.mapstruct.ap.internal.processor.DefaultVersionInformation.getLibraryName(DefaultVersionInformation.java:129)  	
at org.mapstruct.ap.internal.processor.DefaultVersionInformation.getCompiler(DefaultVersionInformation.java:122)  	
at org.mapstruct.ap.internal.processor.DefaultVersionInformation.fromProcessingEnvironment(DefaultVersionInformation.java:95)  	
at org.mapstruct.ap.internal.processor.DefaultModelElementProcessorContext.<init>(DefaultModelElementProcessorContext.java:50)  
at org.mapstruct.ap.MappingProcessor.processMapperElements(MappingProcessor.java:218)  	
at org.mapstruct.ap.MappingProcessor.process(MappingProcessor.java:156)  	
at org.jetbrains.jps.javac.APIWrappers$ProcessorWrapper.process(APIWrappers.java:109)  	
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)  	
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)  	
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)  	
at java.lang.reflect.Method.invoke(Method.java:498)  	
at org.jetbrains.jps.javac.APIWrappers$1.invoke(APIWrappers.java:213)  	
at org.mapstruct.ap.MappingProcessor.process(Unknown Source)  	
at com.sun.tools.javac.processing.JavacProcessingEnvironment.callProcessor(JavacProcessingEnvironment.java:794)  	
at com.sun.tools.javac.processing.JavacProcessingEnvironment.discoverAndRunProcs(JavacProcessingEnvironment.java:705)  
at com.sun.tools.javac.processing.JavacProcessingEnvironment.access$1800(JavacProcessingEnvironment.java:91)  	
at com.sun.tools.javac.processing.JavacProcessingEnvironment$Round.run(JavacProcessingEnvironment.java:1035)  	
at com.sun.tools.javac.processing.JavacProcessingEnvironment.doProcessing(JavacProcessingEnvironment.java:1176)  	
at com.sun.tools.javac.main.JavaCompiler.processAnnotations(JavaCompiler.java:1170)  	
at com.sun.tools.javac.main.JavaCompiler.compile(JavaCompiler.java:856)  	
at com.sun.tools.javac.main.Main.compile(Main.java:523)  	
at com.sun.tools.javac.api.JavacTaskImpl.doCall(JavacTaskImpl.java:129)  	
at com.sun.tools.javac.api.JavacTaskImpl.call(JavacTaskImpl.java:138)  	
at org.jetbrains.jps.javac.JavacMain.compile(JavacMain.java:231)  	
at org.jetbrains.jps.incremental.java.JavaBuilder.compileJava(JavaBuilder.java:501)  	
at org.jetbrains.jps.incremental.java.JavaBuilder.compile(JavaBuilder.java:353)  	
at org.jetbrains.jps.incremental.java.JavaBuilder.doBuild(JavaBuilder.java:277)  	
at org.jetbrains.jps.incremental.java.JavaBuilder.build(JavaBuilder.java:231)  	
at org.jetbrains.jps.incremental.IncProjectBuilder.runModuleLevelBuilders(IncProjectBuilder.java:1441)  	
at org.jetbrains.jps.incremental.IncProjectBuilder.runBuildersForChunk(IncProjectBuilder.java:1100)
at org.jetbrains.jps.incremental.IncProjectBuilder.buildTargetsChunk(IncProjectBuilder.java:1224)
at org.jetbrains.jps.incremental.IncProjectBuilder.buildChunkIfAffected(IncProjectBuilder.java:1066)
at org.jetbrains.jps.incremental.IncProjectBuilder.buildChunks(IncProjectBuilder.java:832)
at org.jetbrains.jps.incremental.IncProjectBuilder.runBuild(IncProjectBuilder.java:419)
at org.jetbrains.jps.incremental.IncProjectBuilder.build(IncProjectBuilder.java:183)
at org.jetbrains.jps.cmdline.BuildRunner.runBuild(BuildRunner.java:132)
at org.jetbrains.jps.cmdline.BuildSession.runBuild(BuildSession.java:302)
at org.jetbrains.jps.cmdline.BuildSession.run(BuildSession.java:132)
at org.jetbrains.jps.cmdline.BuildMain$MyMessageHandler.lambda$channelRead0$0(BuildMain.java:219)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

When using MapStruct with IDEA version 2020.3, this error occurs while building the project: java: Internal error in the mapping processor: java.lang.NullPointerException

Solution:
In Settings –> Build, Execution, Deployment –> Compiler, add the following parameter to the user-local build process VM options:
-Djps.track.ap.dependencies=false

How to Solve hadoop3.x.x sh start-dfs.sh Startup Error

hadoop3.x.x sh start-dfs.sh startup error

Error information:

/app/module/hadoop310/libexec/hadoop-functions.sh: line 398: syntax error near unexpected token `<'
/app/module/hadoop310/libexec/hadoop-functions.sh: line 398: `  done < <(for text in "${input[@]}"; do'
/app/module/hadoop310/libexec/hadoop-config.sh: line 70: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 87: hadoop_bootstrap: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 104: hadoop_parse_args: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 105: shift: : numeric argument required
/app/module/hadoop310/libexec/hadoop-config.sh: line 110: hadoop_find_confdir: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 111: hadoop_exec_hadoopenv: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 112: hadoop_import_shellprofiles: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 113: hadoop_exec_userfuncs: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 119: hadoop_exec_user_hadoopenv: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 120: hadoop_verify_confdir: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 122: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 123: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 124: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 129: hadoop_os_tricks: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 131: hadoop_java_setup: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 133: hadoop_basic_init: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 36: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 38: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 40: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 42: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 44: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 46: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 48: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 50: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 52: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 54: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 62: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/yarn-config.sh: line 64: hadoop_deprecate_envvar: command not found
/app/module/hadoop310/libexec/hadoop-config.sh: line 140: hadoop_shellprofiles_init: command not found

Likely cause: running the script with sh invokes a POSIX shell that does not accept the bash-only process substitution (done < <(...)) used by hadoop-functions.sh, so the function library fails to load and all of its functions are reported as "command not found".
Solution: switch to the Hadoop directory and run the script directly (so it runs under bash via its own shebang):
cd /app/module/hadoop310
./sbin/start-dfs.sh
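
Equivalently, invoking the script explicitly with bash instead of sh avoids the same problem (a sketch, using the install path from the error messages above):

# run under bash so the process substitution in hadoop-functions.sh is accepted
cd /app/module/hadoop310
bash sbin/start-dfs.sh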

[Solved] ANTRL Compile HiveLexer.g File Error: syntax error: antlr: NoViableAltException(@[])

Project scenario:

Programmatically extracting data lineage is not a common need for data warehouses; dedicated lineage work only becomes necessary once the data volume, or the number of reports with complex production dependencies, reaches a significant scale. Below that scale, manual identification and management is more cost effective.

ANTLR (Another Tool for Language Recognition, formerly PCCTS) is an open-source parser generator that automatically builds syntax trees from input and can display them visually. From a grammar description it constructs a custom language recognizer, parser and translator framework, with target languages including Java, C++ and C#. ANTLR comes in v2, v3 and v4; most of the Chinese documentation covers v2, and the comments in Hive 1.1.0 mention ANTLR 3.4. In short, ANTLR lets us define lexer rules that recognize a character stream and parser rules that interpret the token stream, and then generates the corresponding lexer/parser from the grammar file, which can compile input text and convert it into other forms (e.g. an AST, abstract syntax tree).

Using ANTLR 3.5.2 to parse HiveLexer.g, the lexer grammar file from Hive's SQL source code, the file contains this block:

@lexer::header {
package org.apache.hadoop.hive.ql.parse;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.conf.HiveConf;
}

@lexer::members {
  private Configuration hiveConf;

  public void setHiveConf(Configuration hiveConf) {
    this.hiveConf = hiveConf;
  }

  protected boolean allowQuotedId() {
    if(hiveConf == null){
      return false;
    }
    String supportedQIds = HiveConf.getVar(hiveConf, HiveConf.ConfVars.HIVE_QUOTEDID_SUPPORT);
    return !"none".equals(supportedQIds);
  }
}

This block needs to be commented out so that the Java code generated by ANTLR does not reference these packages.


Problem description

After commenting this block out with /* */ and compiling again, the following errors are still reported:

[10:58:10] error(100): HiveLexer.g:42:1: syntax error: antlr: NoViableAltException(9@[])
[10:58:10] error(100): HiveLexer.g:42:2: syntax error: antlr: NoViableAltException(49@[])
[10:58:10] error(100): HiveLexer.g:42:7: syntax error: antlr: MissingTokenException(inserted [@-1,0:0='<missing ACTION>',<4>,42:6] at :)
[10:58:10] error(100): HiveLexer.g:42:8: syntax error: antlr: NoViableAltException(22@[])
[10:58:10] error(100): HiveLexer.g:42:9: syntax error: antlr: NoViableAltException(80@[])
[10:58:10] error(100): HiveLexer.g:42:9: syntax error: antlr: NoViableAltException(80@[])
[10:58:10] error(100): HiveLexer.g:47:1: syntax error: antlr: MissingTokenException(inserted [@-1,0:0='<missing SEMI>',<82>,47:0] at KW_TRUE)

Cause analysis:

After searching around, the reason turns out to be that ';' still acts as a terminator even inside the comment, which is what triggers the error.


Solution:

Delete the commented-out block entirely, or at least remove the ';' characters inside it.

Problem solved.

[Solved] hadoop-2.7.1 Error: Error Cannot find configuration directory etchadoop

Since the configuration in use is hadoop-2.7.1, the following shows up later during startup.

The terminal executes ./start-yarn.sh:
starting yarn daemons
Error: Cannot find configuration directory: /etc/hadoop
Error: Cannot find configuration directory: /etc/hadoop

This is a "configuration directory cannot be found" error; the solution can be worked out by reading the corresponding shell script~

Solution:

Configure the directory where the Hadoop configuration files are located in hadoop-env.sh (as below):

export HADOOP_CONF_DIR=/opt/hadoop-2.7.1/etc/hadoop/

Execute command

source  hadoop-env.sh
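
After sourcing, YARN can be started again; a quick check (sketch, using the install path from above):

# confirm the variable points at the configuration directory, then retry
echo "$HADOOP_CONF_DIR"
/opt/hadoop-2.7.1/sbin/start-yarn.sh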

[Solved] Hive Run SQL error: mapreduce failed to initiate a task

When running SQL in Hive, MapReduce fails to launch the task; the log shows that the ApplicationMaster cannot find the Hadoop path.
Solution:
Add the following configuration to mapred-site.xml and the error will be resolved.

<configuration>
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/apps/hadoop-3.1.1</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/apps/hadoop-3.1.1</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/apps/hadoop-3.1.1</value>
</property>
 
</configuration>
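
The HADOOP_MAPRED_HOME value must match the actual Hadoop installation path on the cluster (the /opt/apps/hadoop-3.1.1 above is this cluster's layout); a quick sanity check before re-running the Hive query (sketch):

# the configured directory must exist and contain the MapReduce jars
ls -d /opt/apps/hadoop-3.1.1
ls /opt/apps/hadoop-3.1.1/share/hadoop/mapreduce | head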

[Solved] Failed to re-init queues: Illegal queue capacity setting (abs-capacity=0.6) > (abs-maximum-capacity

The following exception was thrown today while configuring queue capacities for YARN:

llq@hadoop001:/software/hadoop-3.1.3$ yarn rmadmin -refreshQueues
2022-07-30 05:43:14,554 INFO client.RMProxy: Connecting to ResourceManager at hadoop002/192.168.86.102:8033
refreshQueues: java.io.IOException: Failed to re-init queues : Illegal queue capacity setting (abs-capacity=0.6) > (abs-maximum-capacity=0.4) for queue=[root.default],label=[]
	at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.logAndWrapException(AdminService.java:920)
	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:406)
	at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshQueues(ResourceManagerAdministrationProtocolPBServiceImpl.java:114)
	at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:271)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:527)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1000)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2916)
Caused by: java.io.IOException: Failed to re-init queues : Illegal queue capacity setting (abs-capacity=0.6) > (abs-maximum-capacity=0.4) for queue=[root.default],label=[]
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:477)
	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:430)
	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:401)
	... 10 more
Caused by: java.lang.IllegalArgumentException: Illegal queue capacity setting (abs-capacity=0.6) > (abs-maximum-capacity=0.4) for queue=[root.default],label=[]
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.capacitiesSanityCheck(CSQueueUtils.java:75)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.loadUpdateAndCheckCapacities(CSQueueUtils.java:116)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.setupConfigurableCapacities(AbstractCSQueue.java:179)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.setupQueueConfigs(AbstractCSQueue.java:356)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.setupQueueConfigs(LeafQueue.java:177)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:162)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:141)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.parseQueue(CapacitySchedulerQueueManager.java:259)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.parseQueue(CapacitySchedulerQueueManager.java:283)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.reinitializeQueues(CapacitySchedulerQueueManager.java:171)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:726)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:472)

Reason: the configured capacity of the default queue (abs-capacity=0.6) is greater than its maximum capacity (abs-maximum-capacity=0.4).
Solution: adjust the two values in capacity-scheduler.xml so that capacity does not exceed maximum-capacity, for example:

<!-- Lower the default queue's configured capacity to 40% (default 100%) -->
<property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>40</value>
</property>

<!-- Set the default queue's maximum capacity to 60% (default 100%) -->
<property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>60</value>
</property>
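
After editing capacity-scheduler.xml, re-apply the queue configuration with the same command as above:

# reload the capacity scheduler queues without restarting the ResourceManager
yarn rmadmin -refreshQueues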

Hive operation TMP file viewing content error [How to Solve]

1. Hive operation TMP file viewing content error

Permission denied: user=dr.who, access=READ_EXECUTE, inode="/tmp":hadoopadmin:supergroup:drwx-wx-wx

2. Cause analysis:

The user has insufficient permissions: as the message shows, the /tmp directory has no read (r) permission for other users, and the default login user of the web page is dr.who.


3. Solution:

1. Change the permissions of /tmp

Just execute the command:

hdfs dfs -chmod -R 777 /tmp

2. Modify the default login user of the web UI (port 50070)

Configure the following in core-site.xml, setting the value to the user name that runs Hadoop:

<property>
    <name>hadoop.http.staticuser.user</name>
    <value>username</value>
</property>
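
The HDFS daemons need to be restarted for the web UI to pick up the new static user (a sketch, assuming the standard scripts are on the PATH):

# restart HDFS so the NameNode web UI uses the configured static user
stop-dfs.sh
start-dfs.sh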

Hive: How to Solve Derby Database Initialization Error

Error Messages:

Metastore connection URL:     jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver :     org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User:     APP
Starting metastore schema initialization to 3.1.0
Initialization script hive-schema-3.1.0.derby.sql
 
Error: FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
Use --verbose for detailed stacktrace.
*** schemaTool failed ***

 

Solution:

Take share/hadoop/common/lib/guava-27.0-jre.jar from the Hadoop installation
and use it to replace the older guava jar under Hive's lib directory.
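
A minimal sketch of that replacement, assuming HADOOP_HOME and HIVE_HOME are set and the guava version matches the one named above (adjust the version to what your distributions actually ship):

# remove Hive's bundled (older) guava and copy in Hadoop's version
rm -f $HIVE_HOME/lib/guava-*.jar
cp $HADOOP_HOME/share/hadoop/common/lib/guava-27.0-jre.jar $HIVE_HOME/lib/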

[Solved] Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.

Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.

This error occurs when executing SQL that involves collections on the Hive on Spark engine:
[42000][3] Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed during runtime. Please check stacktrace for the root cause.

 

Solution 1: switch the execution engine to MR
set hive.execution.engine=mr;

Solution 2: enable speculative execution
set mapred.map.tasks.speculative.execution=true;
set mapred.reduce.tasks.speculative.execution=true;