Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide"
This error is caused by the missing spark-sql-kafka-0-10_2.11-2.4.5.jar dependency. Download the jar, place it on the server, and add it to the spark-submit command:
--jars spark-sql-kafka-0-10_2.11-2.4.5.jar
The job still fails, this time with a different stack trace:
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:87)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:177)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:173)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:201)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:198)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:173)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:93)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:91)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:727)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:727)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1$$anonfun$apply$1.apply(SQLExecution.scala:95)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:144)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:86)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:789)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:63)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:727)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:313)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:288)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:694)
Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.serialization.ByteArrayDeserializer
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
... 45 more
Checking the Spark jars directory confirms that no kafka-clients jar is present, so add the kafka-clients dependency to the submit command as well:
spark-submit --master yarn --deploy-mode cluster --jars spark-sql-kafka-0-10_2.11-2.4.5.jar,kafka-clients-2.0.0.jar
Resubmit the job and the problem is resolved.
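An alternative to copying jars by hand is letting spark-submit resolve the connector and its transitive dependencies (including kafka-clients) from Maven via --packages. A sketch, assuming the submitting host has internet access and "app.jar" stands in for the real application jar:

```shell
#!/bin/sh
# Maven coordinates for the Kafka connector matching Spark 2.4.5 / Scala 2.11;
# --packages pulls kafka-clients transitively, so no manual jar copying.
PKG="org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5"
CMD="spark-submit --master yarn --deploy-mode cluster --packages $PKG app.jar"
# Print the command instead of running it, since it needs a live YARN cluster.
echo "$CMD"
```

This avoids version drift between the connector and its client library, at the cost of requiring repository access from the submission host.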