Run Spark Program in idea and find that datafram can perform df.show() but just df.count() will display the following exception information:
2022-03-25 17:56:13,691 ERROR executor.Executor: Exception in task 14.0 in stage 7.0 (TID 222) java.lang.NullPointerException at $line33.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.$anonfun$rdd01$1(<console>:26) at scala.collection.Iterator$$anon$10.next(Iterator.scala:461) at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.agg_doAggregateWithKeys_0$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:140) at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52) at org.apache.spark.scheduler.Task.run(Task.scala:131) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 2022-03-25 17:56:13,728 WARN scheduler.TaskSetManager: Lost task 14.0 in stage 7.0 (TID 222) (westgis-134 executor driver): java.lang.NullPointerException at $line33.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.$anonfun$rdd01$1(<console>:26) at scala.collection.Iterator$$anon$10.next(Iterator.scala:461) at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.agg_doAggregateWithKeys_0$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:140) at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52) at org.apache.spark.scheduler.Task.run(Task.scala:131) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
The reason for this error is that there is a null value in the dataframe attribute in the previous processing, using na.drop() is removed and the error is resolved.
df05=df05.select("direction","station_name","order_no","lat","lng").na.drop()
Read More:
- Spark Error: java.lang.StackOverflowError [How to Solve]
- [Solved] Request processing failed; nested exception is java.lang.NullPointerException
- [Solved] Springboot Reflection calls ServiceImpl Error: java.lang.NullPointerException, mapper is null
- POI Export Excel Error: HTTP Status 500 – Request processing failed; nested exception is java.lang.NullPointerException
- [Solved] swagger3 Error: org.springframework.context.ApplicationContextException: Failed to start bean ‘documentationPluginsBootstrapper’; nested exception is java.lang.NullPointerException
- How to Solve Spark Writes Hudi Error
- [Solved] Spark Error: org.apache.spark.SparkException: A master URL must be set in your configuration
- Java.lang.AbstractMethodError: org.mybatis.spring.transaction.SpringManagedTransaction.getTimeout()Ljava/lang/Integer; error resolution
- [Solved] org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow
- [Solved] Response Export error on submit request on future invoke, java.lang.OutOfMemoryError: Java heap space
- Hive ERROR Failed with exception java.io.IOException:java.lang.IllegalArgumentException
- Tomcat startup error: java.lang.NoClassDefFoundError
- Idea Error: java: java.lang.OutOfMemoryError: GC overhead limit exceeded
- [Solved] java.lang.reflect.InaccessibleObjectException: Unable to make protected java.net.http.HttpRequest()…
- [Solved] Hive tez due to: ROOT_INPUT_INIT_FAILURE java.lang.IllegalArgumentException: Illegal Capacity: -38297
- [Solved] Error: exception: java.lang.reflect.InvocationTargetException: null
- [Android Error] java.lang.RuntimeException: An error occurred while executing doInBackground()
- [Solved] Hadoop error java.lang.nosuchmethoderror
- [Solved] jhat Analyzes dump File Error: java.lang.OutOfMemoryError