Tag Archives: tensorflow

[Solved] PointSIFT Error: cannot find -ltensorflow_framework

My environment: Ubuntu 18.04, TensorFlow 2.1.
When reproducing PointSIFT, I followed the README, edited the TensorFlow and lib paths in the .sh file, and ran it. The build failed with:
/usr/bin/ld: cannot find -ltensorflow_framework
collect2: error: ld returned 1 exit status

Cause: the shell script links against the dynamic library libtensorflow_framework.so, but TensorFlow 2.1 ships it as libtensorflow_framework.so.2, so the linker cannot find it.

Solution: create a symbolic link so that libtensorflow_framework.so points to libtensorflow_framework.so.2.

# My files are in this directory; on some installs they are in the tensorflow directory instead. Run the command in whichever directory contains the .so.2 file.
cd /usr/local/lib/python3.6/dist-packages/tensorflow_core
ln -s libtensorflow_framework.so.2 libtensorflow_framework.so
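
The same fix can be sketched in Python for cases where the version suffix is not known in advance. This is only an illustration: the helper name link_unversioned_framework is hypothetical, and the demo runs in a throwaway directory rather than the real package directory. On a real install the directory could be obtained from tf.sysconfig.get_lib().

```python
import os
import re
import tempfile


def link_unversioned_framework(lib_dir):
    """Create libtensorflow_framework.so next to its versioned sibling.

    Returns the name of the versioned library the link points to,
    or None if lib_dir contains no libtensorflow_framework.so.<N> file.
    """
    pattern = re.compile(r"^libtensorflow_framework\.so\.\d+$")
    versioned = sorted(name for name in os.listdir(lib_dir) if pattern.match(name))
    if not versioned:
        return None
    target = versioned[-1]
    link_path = os.path.join(lib_dir, "libtensorflow_framework.so")
    # lexists also detects a dangling symlink left by an earlier attempt.
    if not os.path.lexists(link_path):
        # Relative link, equivalent to running `ln -s <target> <link>` inside lib_dir.
        os.symlink(target, link_path)
    return target


# Demo in a temporary directory standing in for the TensorFlow package directory.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "libtensorflow_framework.so.2"), "w").close()
linked_to = link_unversioned_framework(demo_dir)
print(linked_to)
```

The helper leaves an existing libtensorflow_framework.so untouched, so running it twice is harmless.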

[Solved] bert_as_service startup error: Tensorflow 2.1.0 is not tested!

Error Messages:

bert_as_service + tensorflow 2.1.0
Tensorflow 2.1.0 is not tested!

I ended up reinstalling the virtual environment. The full error output was:

I:VENTILATOR:freeze, optimize and export graph, could take a while...
d:\anaconda\envs\tensorflow\lib\site-packages\bert_serving\server\helper.py:176: UserWarning: Tensorflow 2.1.0 is not tested! It may or may not work. Feel free to submit an issue at https://github.com/hanxiao/bert-as-service/issues/
  'Feel free to submit an issue at https://github.com/hanxiao/bert-as-service/issues/' % tf.__version__)
E:GRAPHOPT:fail to optimize the graph!
Traceback (most recent call last):
  File "d:\anaconda\envs\tensorflow\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\anaconda\envs\tensorflow\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\Anaconda\envs\tensorflow\Scripts\bert-serving-start.exe\__main__.py", line 9, in <module>
  File "d:\anaconda\envs\tensorflow\lib\site-packages\bert_serving\server\cli\__init__.py", line 4, in main
    with BertServer(get_run_args()) as server:
  File "d:\anaconda\envs\tensorflow\lib\site-packages\bert_serving\server\__init__.py", line 71, in __init__
    self.graph_path, self.bert_config = pool.apply(optimize_graph, (self.args,))
TypeError: 'NoneType' object is not iterable

It works after installing TensorFlow 1.10 with Python 3.6.10.
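
A small guard can catch the mismatch before the server is started. The sketch below assumes that any TensorFlow 1.x release from 1.10 onward is acceptable, which goes beyond what was tested here (only 1.10 was confirmed to work and 2.1.0 to fail). The function name is hypothetical.

```python
def is_tf1_at_least_1_10(version):
    """Return True for TensorFlow 1.x version strings from 1.10 onward."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return major == 1 and minor >= 10


for candidate in ("1.10.0", "1.15.2", "2.1.0"):
    print(candidate, is_tf1_at_least_1_10(candidate))
```

In a real environment the string to check would be tf.__version__.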

InternalError: GPU sync failed (How to Solve)

1. Error report (code from the book Deep Learning with Python, pp. 178-179)

Running the following code in a Jupyter notebook inside VS Code raises InternalError: GPU sync failed:

from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.optimizers import RMSprop

model = Sequential()
model.add(layers.Flatten(input_shape=(lookback // step, float_data.shape[-1])))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(1))

model.compile(optimizer=RMSprop(), loss='mae')
history = model.fit_generator(train_gen,
                              steps_per_epoch=500,
                              epochs=20,
                              validation_data=val_gen,
                              validation_steps=val_steps)

2. Solution:

(1) Don't keep too many .ipynb windows open. Leave only one running, restart, and the problem should go away.

(2) Some people say it may be related to Wallpaper Engine and that turning it off helps. I haven't verified this.

However, I did notice that GPU utilization rises sharply while Wallpaper Engine is showing a dynamic desktop.

ERROR: Could not find a version that satisfies the requirement tensorflow==1.14

After setting up the Linux environment, installing TensorFlow failed with "ERROR: Could not find a version that satisfies the requirement tensorflow==1.14".

Cause

The installed Anaconda version was too new, so no matching TensorFlow version could be found.
The original version was Anaconda3-5.3.0.

Solution

Downgrade from Anaconda3-5.3.0 to Anaconda3-5.2.0, then install tensorflow==1.14.0.

Installation command

wget https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh

If the installer finishes without errors, Anaconda3-5.2.0 has been installed successfully.

Then install tensorflow==1.14.0 with pip.

[Solved] KeyError: 'tensorflow' when installing tensorflow-gpu


1. Error content

The error is as follows (example):

ERROR: Exception:
Traceback (most recent call last):
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 171, in _merge_into_criterion
    crit = self.state.criteria[name]
KeyError: 'numpy'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_vendor/urllib3/response.py", line 438, in _error_catcher
    yield
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_vendor/urllib3/response.py", line 519, in read
    data = self._fp.read(amt) if not fp_closed else b""
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_vendor/cachecontrol/filewrapper.py", line 62, in read
    data = self.__fp.read(amt)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/http/client.py", line 463, in read
    n = self.readinto(b)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/http/client.py", line 507, in readinto
    n = self.fp.readinto(b)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/ssl.py", line 1012, in recv_into
    return self.read(nbytes, buffer)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/ssl.py", line 874, in read
    return self._sslobj.read(len, buffer)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/ssl.py", line 631, in read
    v = self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 189, in _main
    status = self.run(options, args)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_internal/cli/req_command.py", line 178, in wrapper
    return func(self, options, args)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 317, in run
    reqs, check_supported_wheels=not options.target_dir
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 122, in resolve
    requirements, max_rounds=try_to_avoid_resolution_too_deep,
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 453, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 347, in resolve
    failure_causes = self._attempt_to_pin_criterion(name, criterion)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 207, in _attempt_to_pin_criterion
    criteria = self._get_criteria_to_update(candidate)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 199, in _get_criteria_to_update
    name, crit = self._merge_into_criterion(r, parent=candidate)
  File "/home/guest/anaconda3/envs/tf_1.8/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 173, in _merge_into_criterion

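2. Solution

The traceback contains socket.timeout: The read operation timed out, so the download was cut off partway. Re-run the install with a much larger timeout: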

pip install tensorflow-gpu==1.8.0 --default-timeout=10000 --upgrade

Summary

Build up experience in peacetime and you will make fewer mistakes in wartime!

[Solved] Tensorflow Error: failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED


Preface

After many days of studying deep learning, I finally learned to use the GPU. I was very happy, but after chatting with classmates I learned that my 1660 Ti is nothing much for deep learning, and my hopes were dashed at once. A laptop is fine for learning, but to really run deep learning you need the lab machines. Alas, still no money.

Problem description

An error occurred while using GPU

2021-11-09 20:43:26.114720: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2021-11-09 20:43:26.386261: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-11-09 20:43:26.386617: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-11-09 20:43:26.386735: W tensorflow/stream_executor/stream.cc:1919] attempting to perform BLAS operation using StreamExecutor without BLAS support
Traceback (most recent call last):
  File "first.py", line 30, in <module>
    gpu_time = timeit.timeit(gpu_run,number=10)
  File "D:\Anaconda\Anaconda3\envs\tensorflow2_0_0_gpu\lib\timeit.py", line 233, in timeit
    return Timer(stmt, setup, timer, globals).timeit(number)
  File "D:\Anaconda\Anaconda3\envs\tensorflow2_0_0_gpu\lib\timeit.py", line 177, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
  File "first.py", line 21, in gpu_run
    c = tf.matmul(gpu_a,gpu_b)
  File "D:\Anaconda\Anaconda3\envs\tensorflow2_0_0_gpu\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "D:\Anaconda\Anaconda3\envs\tensorflow2_0_0_gpu\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 2765, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "D:\Anaconda\Anaconda3\envs\tensorflow2_0_0_gpu\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 6126, in mat_mul
    _six.raise_from(_core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(10000, 1000), b.shape=(1000, 2000), m=10000, n=2000, k=1000 [Op:MatMul] name: MatMul/

I rushed to find the cause: the process could not get enough video memory, even though the GPU was not fully loaded.

Solution:

There are two main causes:
1. The cuDNN, CUDA, and TensorFlow versions are incompatible. Mine were installed following a tutorial and checked several times, so I ruled this out.
2. GPU video memory is insufficient. This can be solved with the method from the official website, because TensorFlow 2.0 supports two ways of managing GPU memory:
(1) allocate video memory dynamically
(2) set a hard video memory limit (for example, use only 1 GB and leave the rest for other things such as games)
I set the mode to (1), dynamic allocation, with the following code:

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
tf.config.experimental.set_memory_growth(gpus[0], True)
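
The snippet above configures only the first GPU and fails with an IndexError when no GPU is visible. A slightly more defensive sketch that covers both modes is shown below. The function name configure_gpu_memory and the memory_limit_mb parameter are hypothetical, and the import guard is there only so the sketch also runs on a machine without TensorFlow. It must be called before any operation has initialized the GPU.

```python
def configure_gpu_memory(memory_limit_mb=None):
    """Apply one of the two memory modes to every visible GPU.

    memory_limit_mb=None -> mode (1): grow allocations on demand.
    memory_limit_mb=1024 -> mode (2): cap each GPU at roughly 1 GB.
    Returns the number of GPUs configured (0 if there are none or
    TensorFlow is not installed).
    """
    try:
        import tensorflow as tf
    except ImportError:
        return 0

    configured = 0
    for gpu in tf.config.experimental.list_physical_devices("GPU"):
        try:
            if memory_limit_mb is None:
                tf.config.experimental.set_memory_growth(gpu, True)
            else:
                tf.config.experimental.set_virtual_device_configuration(
                    gpu,
                    [tf.config.experimental.VirtualDeviceConfiguration(
                        memory_limit=memory_limit_mb)],
                )
            configured += 1
        except RuntimeError as err:
            # Raised when the GPU has already been initialized.
            print("Could not configure", gpu, "-", err)
    return configured


configured_count = configure_gpu_memory()
print("GPUs configured:", configured_count)
```

Calling configure_gpu_memory(memory_limit_mb=1024) instead selects mode (2).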

[Solved] TF2.4 Error: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

First, check whether the CUDA and cuDNN versions match the TensorFlow version.

Check the version numbers.

Note that the listed CUDA version is a minimum. For example, TF 2.4 works with CUDA 11.0 and above; I have 11.5 and there is no problem.
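
One way to see which versions are involved is to ask TensorFlow what it was built against. The sketch below relies on tf.sysconfig.get_build_info(), which is available from TF 2.3 onward; the helper name is hypothetical. It reports the build-time versions, not what is installed on the machine, so compare the result with the output of nvcc --version or nvidia-smi.

```python
def cuda_build_versions():
    """Return the CUDA/cuDNN versions TensorFlow was built against.

    Returns None if TensorFlow is not installed or is too old to
    provide tf.sysconfig.get_build_info().
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_build_info is None:
        return None

    info = get_build_info()
    return {
        "tensorflow": tf.__version__,
        # Both keys are absent on CPU-only builds, hence .get().
        "cuda": info.get("cuda_version"),
        "cudnn": info.get("cudnn_version"),
    }


versions = cuda_build_versions()
print(versions)
```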

In my case the error was caused by insufficient video memory.

If that is your problem too, add the following code.

import tensorflow as tf
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.InteractiveSession(config=config)

[Solved] TensorFlow Error: GetNext() failed because the iterator has not been initialized

Error Messages:
FailedPreconditionError (see above for traceback): GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element. [[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[, ], output_types=[DT_UINT8, DT_UINT8], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Cause:
The iterator was never initialized.

Solution:
Run sess.run(iterator.initializer) before the code that raises the error.

Reference: https://stackoverflow.com/questions/48443203/tensorflow-getnext-failed-because-the-iterator-has-not-been-initialized
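
A minimal sketch of where the initializer call belongs is shown below. It assumes a TensorFlow version that provides tf.compat.v1 (1.13 or later, including 2.x). The dataset of three integers and the function name are made up for illustration, and the import guard only lets the sketch run where TensorFlow is missing.

```python
def read_all_with_initializer():
    """Read a tiny dataset through an initializable iterator.

    Returns the values read, or None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # Build a TF1-style graph; eager execution is off inside this context.
    with tf.Graph().as_default():
        dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3])
        iterator = tf.compat.v1.data.make_initializable_iterator(dataset)
        next_element = iterator.get_next()

        values = []
        with tf.compat.v1.Session() as sess:
            # Skipping this line makes the first sess.run(next_element)
            # fail with "GetNext() failed because the iterator has not
            # been initialized".
            sess.run(iterator.initializer)
            while True:
                try:
                    values.append(int(sess.run(next_element)))
                except tf.errors.OutOfRangeError:
                    break  # dataset exhausted
    return values


result = read_all_with_initializer()
print(result)
```

Running the initializer again resets the iterator, which is how a new epoch is usually started with this kind of iterator.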

tensorflow.python.framework.errors_impl.InternalError: Blas xGEMM launch failed

When running the image stylization code with tensorflow version 2.4.0, the following error occurred:

tensorflow.python.framework.errors_impl.InternalError: Blas xGEMM launch failed : a.shape=[1,480000,64], b.shape=[1,480000,64], m=64, n=64, k=480000 [Op:Einsum]

I found the following two solutions online:
1. Add the following code to the program:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '/gpu:0'

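With this change the program runs normally, but on the CPU, so it is much slower. The reason is that '/gpu:0' is not a valid value for CUDA_VISIBLE_DEVICES (it expects device indices such as 0), so CUDA exposes no GPU and TensorFlow falls back to the CPU.
2. Change the cuDNN version. This is generally not recommended because it is too much trouble.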
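
To make the choice between CPU and GPU explicit rather than accidental, the variable can be set through a small helper. This is a sketch only: set_visible_gpus is a hypothetical name, and the value "-1" is the usual convention for hiding every GPU. The variable has to be set before TensorFlow is imported, otherwise it has no effect.

```python
import os


def set_visible_gpus(indices):
    """Set CUDA_VISIBLE_DEVICES from a list of GPU indices.

    An empty list hides all GPUs, which forces CPU execution.
    Returns the value written to the environment.
    """
    indices = list(indices)
    for index in indices:
        if not isinstance(index, int) or index < 0:
            raise ValueError("GPU indices must be non-negative integers")
    value = ",".join(str(index) for index in indices) if indices else "-1"
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


forced_cpu = set_visible_gpus([])    # no GPU visible, TensorFlow uses the CPU
first_gpu = set_visible_gpus([0])    # only GPU 0 is visible
print(forced_cpu, first_gpu)
```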

[Solved] TensorFlow, Keras, and tf.keras Error: OOM (insufficient video memory)

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Problem description

I ran into this today while running 5-fold cross-validation with grid search. It also occurs when the dataset or the batch_size is too large.
On Linux, run watch -n 0.1 nvidia-smi to monitor GPU usage.

Cause

The error reports insufficient video memory, but the memory is not truly exhausted: TensorFlow has claimed all of it without actually making effective use of it. The fix is to let TensorFlow allocate only the video memory it needs. (This also applies to Keras running on the TensorFlow backend.)

Solution:

1. Use a smaller batch_size. This works, but it treats the symptom rather than the root cause.
2. Configure GPU memory manually in train.py:

(1) in tensorflow
import tensorflow as tf
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0" # Specify which GPU to use
config = tf.ConfigProto()
config.gpu_options.allow_growth = True # Allocate video memory on demand
config.gpu_options.per_process_gpu_memory_fraction = 0.4 # Maximum memory usage 40%
session = tf.Session(config=config) # Create tensorflow session
...
(2) in keras
import tensorflow as tf
from keras.models import Sequential
import os
from keras.backend.tensorflow_backend import set_session # Different from tf.keras

os.environ["CUDA_VISIBLE_DEVICES"] = "0"
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # Allocate video memory on demand
set_session(tf.Session(config=config)) # Pass the settings to keras

model = Sequential()
...
(3) in tf.keras
import tensorflow as tf
from tensorflow.keras.models import Sequential

import os
from tensorflow_core.python.keras.backend import set_session # Different from standalone keras

os.environ["CUDA_VISIBLE_DEVICES"] = "0"
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # Allocate video memory on demand
config.gpu_options.per_process_gpu_memory_fraction = 0.4 # use 40% of the maximum video memory
set_session(tf.Session(config=config)) # Pass the settings to tf.keras

model = Sequential()
...

Supplement:
tf.keras can speed up data loading with multiple worker processes:

model.fit(x_train,y_train,use_multiprocessing=True, workers=4) # 4 worker processes; only takes effect for generator or keras.utils.Sequence input

Clear the session:

from tensorflow import keras
keras.backend.clear_session()

After clearing, you can go on to create a new session.

[Solved] Tensorflow Win10: ImportError: DLL load failed

VS 2019 was installed, Microsoft Visual C++ 2015 was available, and the GPU is not supported, so I skipped the CUDA installation and ran pip install tensorflow==2.1.0 directly. Importing TensorFlow then failed with ImportError: DLL load failed.

Solution: uninstall TensorFlow and install version 2.0.0 instead.

pip uninstall tensorflow
pip install tensorflow==2.0.0