
[Solved] Jupyter Notebook Error: SparkException: Python worker failed to connect back

Error message:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-24-bafca16b0526> in <module>
      8     return jobitem, ratingsRDD
      9 jobitem, jobRDD = preparJobdata(sc)
---> 10 jobRDD.collect() 

G:\Projects\python-3.6.4-amd64\lib\site-packages\pyspark\rdd.py in collect(self)
    947         """
    948         with SCCallSiteSync(self.context) as css:
--> 949             sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
    950         return list(_load_from_socket(sock_info, self._jrdd_deserializer))
    951 

G:\Projects\python-3.6.4-amd64\lib\site-packages\py4j\java_gateway.py in __call__(self, *args)
   1303         answer = self.gateway_client.send_command(command)
   1304         return_value = get_return_value(
-> 1305             answer, self.gateway_client, self.target_id, self.name)
   1306 
   1307         for temp_arg in temp_args:

G:\Projects\python-3.6.4-amd64\lib\site-packages\py4j\protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (192.168.101.68 executor driver): org.apache.spark.SparkException: Python worker failed to connect back.
	at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:182)
	at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:107)
	at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:119)
	at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:145)
	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: Accept timed out
	at java.net.DualStackPlainSocketImpl.waitForNewConnection(Native Method)
	at java.net.DualStackPlainSocketImpl.socketAccept(DualStackPlainSocketImpl.java:135)
	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
	at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:199)
	at java.net.ServerSocket.implAccept(ServerSocket.java:545)
	at java.net.ServerSocket.accept(ServerSocket.java:513)
	at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:174)
	... 14 more

Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2253)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2202)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2201)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2201)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1078)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1078)
	at scala.Option.foreach(Option.scala:407)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1078)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2440)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2382)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2371)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:868)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2202)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2223)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2242)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2267)
	at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1030)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
	at org.apache.spark.rdd.RDD.collect(RDD.scala:1029)
	at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:180)
	at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Python worker failed to connect back.
	at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:182)
	at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:107)
	at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:119)
	at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:145)
	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	... 1 more
Caused by: java.net.SocketTimeoutException: Accept timed out
	at java.net.DualStackPlainSocketImpl.waitForNewConnection(Native Method)
	at java.net.DualStackPlainSocketImpl.socketAccept(DualStackPlainSocketImpl.java:135)
	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
	at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:199)
	at java.net.ServerSocket.implAccept(ServerSocket.java:545)
	at java.net.ServerSocket.accept(ServerSocket.java:513)
	at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:174)
	... 14 more

Solution:

Configure the following environment variables (on Windows):

# Hadoop environment variable
HADOOP_HOME = F:\hadoop-common-2.2.0-bin-master\hadoop-common-2.2.0-bin-master

# JDK environment variable
JAVA_HOME = F:\jdk-8u121-windows-x64_8.0.1210.13

# PySpark environment variables
PYSPARK_DRIVER_PYTHON = jupyter
PYSPARK_DRIVER_PYTHON_OPTS = notebook
PYSPARK_PYTHON = python

Remember to restart the computer after the configuration is completed!
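
If you want to verify the fix without relying on the system-wide variables, a minimal sketch (the paths below are the example paths from this post and must be adapted to your own installation; this is my addition, not part of the original solution) is to set the same variables from inside the notebook before the SparkContext is created:

import os
from pyspark import SparkContext

# Example paths copied from the configuration above; adjust them to your machine.
os.environ["JAVA_HOME"] = r"F:\jdk-8u121-windows-x64_8.0.1210.13"
os.environ["HADOOP_HOME"] = r"F:\hadoop-common-2.2.0-bin-master\hadoop-common-2.2.0-bin-master"
# Point the driver and the workers at the same Python interpreter so the
# worker process can connect back to the driver.
os.environ["PYSPARK_PYTHON"] = "python"
os.environ["PYSPARK_DRIVER_PYTHON"] = "python"

sc = SparkContext("local[*]", "connect-back-test")
print(sc.parallelize(range(10)).collect())   # should print [0, 1, ..., 9] instead of timing out
sc.stop()

If the variables are set system-wide instead, a restart (or at least restarting the notebook server) is needed, because a process only sees the environment that existed when it was started.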

[Solved] Error: Raw kernel process exited code: 3221226505

1. Error Description:

After installing CUDA, running a Jupyter notebook in Visual Studio Code fails as soon as model-related code is executed, with both TensorFlow and PyTorch, and the following error is reported:

info 10:44:12.758: kill daemon
error 10:44:12.758: Raw kernel process exited code: 3221226505
error 10:44:12.768: Error in waiting for cell to complete [Error: Canceled future for execute_request message before replies were done
	at t.KernelShellFutureHandler.dispose (c:\Users\Eddie\.vscode\extensions\ms-toolsai.jupyter-2022.6.1201981810\out\extension.node.js:2:32353)
	at c:\Users\Eddie\.vscode\extensions\ms-toolsai.jupyter-2022.6.1201981810\out\extension.node.js:2:51405
	at Map.forEach (<anonymous>)
	at y._clearKernelState (c:\Users\Eddie\.vscode\extensions\ms-toolsai.jupyter-2022.6.1201981810\out\extension.node.js:2:51390)
	at y.dispose (c:\Users\Eddie\.vscode\extensions\ms-toolsai.jupyter-2022.6.1201981810\out\extension.node.js:2:44872)
	at c:\Users\Eddie\.vscode\extensions\ms-toolsai.jupyter-2022.6.1201981810\out\extension.node.js:24:251157
	at t.swallowExceptions (c:\Users\Eddie\.vscode\extensions\ms-toolsai.jupyter-2022.6.1201981810\out\extension.node.js:29:120529)
	at dispose (c:\Users\Eddie\.vscode\extensions\ms-toolsai.jupyter-2022.6.1201981810\out\extension.node.js:24:251135)
	at t.RawSession.dispose (c:\Users\Eddie\.vscode\extensions\ms-toolsai.jupyter-2022.6.1201981810\out\extension.node.js:24:256072)
	at processTicksAndRejections (node:internal/process/task_queues:96:5)]
warn 10:44:12.771: Cell completed with errors {
  message: 'Canceled future for execute_request message before replies were done'
}

(1) Note

Looking at the log output just before the error, CUDA and cuDNN appear to be loaded normally:

warn 10:44:12.276: StdErr from Kernel Process 2022-07-27 10:44:12.27561
warn 10:44:12.276: StdErr from Kernel Process 7: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384]
warn 10:44:12.277: StdErr from Kernel Process  Loaded cuDNN version 8401

error 10:44:12.756: Disposing session as kernel process died ExitCode: 3221226505, Reason: c:\Users\Eddie\AppData\Local\Programs\Python\Python39\lib\site-packages\traitlets\traitlets.py:2392: FutureWarning: Supporting extra quotes around strings is deprecated in traitlets 5.0. You can use 'hmac-sha256' instead of '"hmac-sha256"' if you require traitlets >=5.
  warn(
c:\Users\Eddie\AppData\Local\Programs\Python\Python39\lib\site-packages\traitlets\traitlets.py:2346: FutureWarning: Supporting extra quotes around Bytes is deprecated in traitlets 5.0. Use '59483b09-e83e-4bf0-a9b6-82301995d744' instead of 'b"59483b09-e83e-4bf0-a9b6-82301995d744"'.
  warn(
2022-07-27 10:44:06.428210: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-27 10:44:07.682221: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1335 MB memory:  -> device: 0, name: NVIDIA GeForce MX150, pci bus id: 0000:01:00.0, compute capability: 6.1
2022-07-27 10:44:12.275617: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8401

Look carefully at the error content: it seems to come from Jupyter itself, but Jupyter is not the real cause.

error 10:44:12.768: Error in waiting for cell to complete [Error: Canceled future for execute_request message before replies were done
	at t.KernelShellFutureHandler.dispose (c:\Users\Eddie\.vscode\extensions\ms-toolsai.jupyter-2022.6.1201981810\out\extension.node.js:2:32353)
	at c:\Users\Eddie\.vscode\extensions\ms-toolsai.jupyter-2022.6.1201981810\out\extension.node.js:2:51405

2. Solution:

(1) Run the notebook in the browser to see the underlying error (recommended to skip)

You can skip this step; it is described here only to record the complete troubleshooting process.

Install Jupyter with the following command:

pip install jupyter

Open the .ipynb file in the browser and run it again. The error is different from that in Visual Studio Code:

Could not locate zlibwapi.dll

(2) Install Zlib

Install Zlib as described in the tutorial (on Windows, cuDNN requires zlibwapi.dll to be on the PATH), and the error will be resolved.
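
As an optional sanity check (my addition, not part of the original fix), you can ask Python whether zlibwapi.dll is visible to the kernel process after installing Zlib:

import ctypes.util

# Searches the DLL search path (including PATH) for zlibwapi.dll
dll_path = ctypes.util.find_library("zlibwapi")
if dll_path is None:
    print("zlibwapi.dll not found - cuDNN will not be able to load it")
else:
    print("zlibwapi.dll found at:", dll_path)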

[Solved] Jupyter Notebook Error: IOPub data rate exceeded

When running code in a Jupyter notebook, print produces no output; the cell only shows an "IOPub data rate exceeded" warning.

Don't panic. This is not a code error: Jupyter limits the IOPub data rate, i.e. the amount of output a cell is allowed to stream to the browser.

Solution:

1. Open a command prompt (CMD) and enter: jupyter notebook --generate-config (note the space after notebook)

If CMD does not recognize the command, enter jupyter notebook --generate-config in Anaconda Prompt instead.

2. The command prints the path of the Jupyter configuration file. You don't need to type anything further; just locate the jupyter_notebook_config.py file at that path.

3. Open the file with Notepad or Python, press Ctrl + F to open the search box, and search for iopub_data_rate_limit.

Find the line containing iopub_data_rate_limit, uncomment it, and enlarge the value by appending several zeros (see the example after these steps).

4. Restart the Jupyter notebook and output is displayed normally.
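
For reference, the edited line in jupyter_notebook_config.py could look like the sketch below (the stock default is 1000000 bytes/sec; the larger value is only an example):

# jupyter_notebook_config.py
# Uncomment the line and raise the limit, e.g. from the default 1000000 to:
c.NotebookApp.iopub_data_rate_limit = 1000000000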

[Solved] jupyter notebook Error: 500 : Internal Server Error

1. Problem Description:
Jupyter Notebook can open the directory page, but cannot open .ipynb files; it reports a 500 : Internal Server Error.

After some research, it turns out the error is caused by an incompatibility between nbconvert and pandoc.
2. Solution:
Enter the following command to install/upgrade nbconvert:

pip install --upgrade --user nbconvert

After nbconvert is successfully upgraded, start Jupyter Notebook again and .ipynb files open normally in the browser.
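
As an optional check (my addition), you can confirm which nbconvert version the environment now has:

# Confirm the upgraded nbconvert is the one Python imports
import nbconvert
print(nbconvert.__version__)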

[Solved] jupyter notebook Error: ModuleNotFoundError: No module named jupyter_nbextensions_configurator

Problem description

Platform: Windows 10 Professional, Anaconda3

When starting Jupyter Notebook, the following error message appears:

ModuleNotFoundError: No module named  jupyter_nbextensions_configurator

Although JupyterLab can still be used once it opens, the error message remains a hidden risk, so after some searching the following solution was found.

Solution:

python -m pip install --user jupyter_contrib_nbextensions

#jupyter contrib nbextension install --user --skip-running-check

python -m pip install --user jupyter_nbextensions_configurator

#jupyter nbextensions_configurator enable --user

Running only the two pip commands above (the commented-out commands were not needed) was enough to resolve the error.
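
As a small optional check (my addition, not from the original post), confirm that the module is now importable by the same interpreter that launches Jupyter:

# If the import succeeds, the ModuleNotFoundError is gone
import jupyter_nbextensions_configurator
print(jupyter_nbextensions_configurator.__file__)   # prints where the package was installed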