
Spark Program Error: java.lang.NullPointerException

When running a Spark program in IDEA, the DataFrame can execute df.show() successfully, but df.count() throws the following exception:

2022-03-25 17:56:13,691 ERROR executor.Executor: Exception in task 14.0 in stage 7.0 (TID 222)
java.lang.NullPointerException
	at $line33.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.$anonfun$rdd01$1(<console>:26)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.agg_doAggregateWithKeys_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:140)
	at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2022-03-25 17:56:13,728 WARN scheduler.TaskSetManager: Lost task 14.0 in stage 7.0 (TID 222) (westgis-134 executor driver): java.lang.NullPointerException
	... (same stack trace as above)

The cause of this error is a null value in one of the DataFrame columns produced by earlier processing. Dropping the rows containing nulls with na.drop() resolves the error:

df05 = df05.select("direction", "station_name", "order_no", "lat", "lng").na.drop()
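The reason df.show() can succeed while df.count() fails is Spark's lazy evaluation: show() only materializes a handful of rows, while count() forces every row through the pipeline, including the ones containing nulls. The same pattern can be illustrated with plain Java streams (an analogy, not Spark code; the data is made up):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class DropNulls {
    public static void main(String[] args) {
        // A column with a null value, like the DataFrame above.
        List<String> stationNames = Arrays.asList("east", null, "west");

        // Filtering out nulls first -- the analogue of na.drop() -- lets the
        // whole pipeline run; without the filter, map(String::length) would
        // throw a NullPointerException when it reaches the null element.
        List<Integer> lengths = stationNames.stream()
                .filter(Objects::nonNull)
                .map(String::length)
                .collect(Collectors.toList());

        System.out.println(lengths); // prints [4, 4]
    }
}
```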

Request processing failed; nested exception is java.lang.*

Contents

Problems

Analysis

Solution

Appendix

Notes

References


Problems

Error 1:

HttpClient sends requests to the server, and the server occasionally returns a 500 error to the client.

The server error log shows the following:

2021-05-28 21:05:06.548 default [http-nio-0.0.0.0-xxxx-exec-6] ERROR o.a.c.c.C.[.[localhost].[/].[dispatcherServlet] - Line:175 - Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is java.lang.NullPointerException] with root cause
java.lang.NullPointerException: null

(The problem: the log does not show on which line the null pointer occurred, so it is hard to locate. Moreover, the error is not reported every time, only occasionally.)

Error 2:

java.net.SocketTimeoutException

Analysis

Error 1 Analysis

Further analysis showed that the problem lies in reading the InputStream:

InputStream inputStream = request.getInputStream();
inputStream.read();

After the InputStream is obtained from the request, Error 2 (java.net.SocketTimeoutException) occurs while the stream is being read. The program catches this exception but does not handle it, which later results in the null pointer.
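The failure mode can be sketched in a few lines of plain Java (hypothetical names, no servlet API; the boolean flag stands in for a real slow or broken connection). The timeout is caught but swallowed, so the variable it should have filled stays null, and the NullPointerException surfaces later, far from the real cause:

```java
import java.net.SocketTimeoutException;

public class SwallowedTimeout {
    // Stand-in for reading the request body; `simulateTimeout` is a
    // hypothetical flag replacing a real client-side timeout.
    static String readBody(boolean simulateTimeout) {
        String body = null;
        try {
            if (simulateTimeout) {
                throw new SocketTimeoutException("Read timed out");
            }
            body = "payload";
        } catch (SocketTimeoutException e) {
            // Caught but not handled: body stays null.
        }
        return body;
    }

    public static void main(String[] args) {
        String body = readBody(true);
        // The NPE is thrown here, with no hint that a timeout was the cause.
        System.out.println(body.length());
    }
}
```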

So the question becomes: why is Error 2 reported, and only sometimes?

Error 2 Analysis

Error 2 is caused by the HttpClient response timeout: before the server has finished reading the InputStream, the HttpClient has already timed out waiting for the response and closed the connection, so reading the InputStream fails with an exception.

There are many possible triggers for a response timeout:

the network may be slow, delaying transmission;

the client may be sending a large amount of data in the request, which the server reads slowly;

or the server may be under high load and processing data slowly.

Solution

Having analyzed the causes, the problem can be addressed from two sides.

On the server side, catch and handle the exception and return an appropriate error message to the client. On the client side, the response timeout (http.socket.timeout) can be increased appropriately.
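The server-side fix can be sketched in self-contained Java (hypothetical names, no servlet API; the boolean flag stands in for a real timeout): catch the timeout where it happens and turn it into an explicit response, so no null escapes into later code:

```java
import java.net.SocketTimeoutException;

public class HandleTimeout {
    // Catch the timeout at the point of failure and return an explicit
    // error response instead of silently swallowing the exception.
    static String handle(boolean simulateTimeout) {
        try {
            if (simulateTimeout) {
                throw new SocketTimeoutException("Read timed out");
            }
            return "200 OK: payload";
        } catch (SocketTimeoutException e) {
            return "408 Request Timeout: client closed the connection before the body was read";
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(true));
        System.out.println(handle(false));
    }
}
```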

Appendix

On the two timeout settings of HttpClient:

Connection timeout (http.connection.timeout): the time allowed from when the client initiates the HTTP connection until the connection is fully established.

Response timeout (http.socket.timeout): the read-data timeout, i.e. the time allowed from when the client sends the HTTP request until it receives the server's response.
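The two settings map directly onto the JDK's built-in HttpURLConnection, which can illustrate them without the Apache HttpClient dependency (the URL is a placeholder; opening the connection object does not yet touch the network):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class TimeoutSettings {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.com/"); // placeholder URL
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();

        // Connection timeout: how long to wait for the connection to be
        // established (analogue of http.connection.timeout).
        conn.setConnectTimeout(5_000);

        // Read timeout: how long to wait for response data once the request
        // has been sent (analogue of http.socket.timeout). Increasing this
        // is the client-side mitigation described above.
        conn.setReadTimeout(30_000);

        System.out.println(conn.getConnectTimeout() + " " + conn.getReadTimeout());
    }
}
```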

Notes

When writing Java, catch and handle every kind of exception; otherwise problems are hard to locate. For file streams, network streams, and the like, always close the stream in finally {} so that it is closed on both the normal and the exceptional path; otherwise resources may leak under abnormal conditions, eventually leaving the server unable to serve requests.

The hardest problems to locate are the intermittent ones, where the same request sometimes fails and sometimes succeeds. These sometimes have to be investigated at the network, hardware, CPU, memory, or operating-system level.
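The stream-closing advice can be followed with a finally block or, more idiomatically since Java 7, with try-with-resources. A minimal sketch using an in-memory stream so it runs standalone:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SafeClose {
    public static void main(String[] args) throws IOException {
        byte[] data = "hello".getBytes();

        // try-with-resources closes the stream on both the normal and the
        // exceptional path, replacing the manual finally { in.close(); } idiom.
        try (InputStream in = new ByteArrayInputStream(data)) {
            int first = in.read();
            System.out.println((char) first); // prints h
        }
    }
}
```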

References

https://blog.csdn.net/goodlixueyong/article/details/50676821

https://blog.csdn.net/weixin_38629529/article/details/89788963

https://blog.csdn.net/senblingbling/article/details/43916851

https://blog.csdn.net/u010142437/article/details/18091545

https://tech.kujiale.com/ying-yong-pin-fan-bao-chu-cause-java-net-sockettimeoutexception-read-timed-outzen-yao-ban/