
[Solved] Python Error: tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.

Environment

Environment: Docker
System: Ubuntu 18.04
Graphics card: RTX 1080ti
TensorFlow version: 1.15.0

Running a neural network model produced the error in the title.

The screenshot of the error is as follows:

Solution:

Add the following code after the imports in the entry file:

import tensorflow as tf
import keras

config = tf.compat.v1.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.6  # limit TensorFlow to ~60% of GPU memory
keras.backend.tensorflow_backend.set_session(tf.compat.v1.Session(config=config))

This limits the amount of GPU memory TensorFlow is allowed to allocate.
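
A related option (not from the original post, but a standard ConfigProto setting) is to let TensorFlow grow its GPU memory usage on demand instead of reserving a fixed fraction:

import tensorflow as tf
import keras

# Alternative sketch: allocate GPU memory on demand rather than reserving 60% up front.
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
keras.backend.tensorflow_backend.set_session(tf.compat.v1.Session(config=config))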

failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error

Question

Everything worked before. After restarting the computer and running the program again, this error is reported:

failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error
retrieving CUDA diagnostic information for host: ...

The program then falls back to running on the CPU, which is slow.

Environment

Ubuntu 20.04
TensorFlow 2.5
cudatoolkit 11.2
cudnn 8.1

Solution

Most likely the graphics card driver got corrupted.

An automatic Ubuntu system update had happened to run shortly before, which probably replaced some NVIDIA driver files, so after the restart the driver no longer worked.

I opened the Software Updater, applied all pending updates, restarted, and it was fixed.
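
To confirm the GPU is visible again after the driver repair, a quick sanity check (a sketch, assuming TensorFlow 2.x as in this environment) is to run nvidia-smi first and then:

import tensorflow as tf

# Should list at least one GPU device once the driver and CUDA libraries load correctly.
print(tf.config.list_physical_devices('GPU'))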

Error report: AttributeError: module '*******' has no attribute '*******' and other problems are solved

Error reported:

Solution:

1. Install the third-party library directly from the command line:

pip install xxxxxx

2. In PyCharm, open Interpreter Settings in the lower-left corner, click the plus sign, enter the name of the third-party library, and then click Install Package to install it.

3. TensorFlow 2.x is incompatible with many third-party libraries. I tried many approaches, but the most direct and effective (simple and crude) fix is to install a 1.x version. Correspondingly, you may also need to downgrade numpy, because the installed tensorflow and numpy versions may not match.

Uninstall tensorflow first:

pip uninstall tensorflow

Then install the desired version:

pip install tensorflow==1.xx

Check which version is applicable in PyCharm, then install it, for example from the Douban mirror:

pip install tensorflow==1.14 -i https://pypi.douban.com/simple

If running the code afterwards produces a warning indicating that the numpy version does not match:

For example: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint32 = np.dtype([("qint32", np.int32, 1)])

Recommended installation (or another compatible version):

pip install numpy==1.16.0

After this, everything runs normally.
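
To confirm the downgrade took effect, a quick version check (just a sketch) is:

import numpy as np
import tensorflow as tf

# Expect something like 1.14.x for tensorflow and 1.16.0 for numpy after the downgrade.
print(tf.__version__)
print(np.__version__)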

[Solved] TFRecords Data Creation Error: Number of int64 values != expected. Values size: 1 but output shape: [3]

1. Fixed-length labels and features

Generate tfrecord data:

Multi-label samples, where each label contains 5 values:

import os
import tensorflow as tf
import numpy as np
output_flie = str(os.path.dirname(os.getcwd()))+"/train.tfrecords"
with tf.python_io.TFRecordWriter(output_flie) as writer:
    labels = np.array([[1,0,0,1,0],[0,1,0,0,1],[0,0,0,0,1],[1,0,0,0,0]])
    features = np.array([[0,0,0,0,0,0],[1,1,1,1,1,2],[1,1,1,0,0,2],[0,0,0,0,1,9]])
    for i in range(4):
        label = labels[i]
        feature = features[i]
        example = tf.train.Example(features=tf.train.Features(feature={
            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=label)),
            'feature': tf.train.Feature(int64_list=tf.train.Int64List(value=feature))
        }))
        writer.write(example.SerializeToString())

Parse tfrecord data:

import os
import tensorflow as tf
import numpy as np


def read_tf(output_flie):
    # TF1-style input pipeline: a filename queue feeding a TFRecordReader
    filename_queue = tf.train.string_input_producer([output_flie])
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    # The declared shapes ([5] and [6]) must match the lengths written to the tfrecord
    result = tf.parse_single_example(serialized_example,
                                     features={
                                         'label': tf.FixedLenFeature([5], tf.int64),
                                         'feature': tf.FixedLenFeature([6], tf.int64),
                                     })
    feature = result['feature']
    label = result['label']
    return feature, label


output_flie = str(os.path.dirname(os.getcwd())) + "/train.tfrecords"
feature, label = read_tf(output_flie)
imageBatch, labelBatch = tf.train.batch([feature, label], batch_size=2, capacity=3)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    print(1)
    images, labels = sess.run([imageBatch, labelBatch])
    print(images)
    print(labels)
    coord.request_stop()
    coord.join(threads)

Output:

1
('----images: ', array([[0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 2]]))
('----labels:', array([[1, 0, 0, 1, 0],
       [0, 1, 0, 0, 1]]))

2. Variable-length labels and features

Generate tfrecord

Generation is the same as for fixed-length data:

import os
import tensorflow as tf
import numpy as np
train_TFfile = str(os.path.dirname(os.getcwd()))+"/hh.tfrecords"
writer = tf.python_io.TFRecordWriter(train_TFfile)
labels = [[1,2,3],[3,4],[5,2,6],[6,4,9],[9]]
features = [[2,5],[3],[5,8],[1,4],[5,9]]
for i in range(5):
    label = labels[i]
    print(label)
    feature = features[i]
    example = tf.train.Example(
        features=tf.train.Features(
            feature={'label': tf.train.Feature(int64_list=tf.train.Int64List(value=label)),
                       'feature': tf.train.Feature(int64_list=tf.train.Int64List(value=feature))}))
    writer.write(example.SerializeToString())
writer.close()

Parsing tfrecord

The main changes are:

tf.VarLenFeature(tf.int64)

Unfinished; to be continued.
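
Not part of the original post, but a rough sketch of how the parsing side might look with VarLenFeature (the parsed values come back as SparseTensors, which can be converted to dense if needed):

import tensorflow as tf

def read_var_tf(tfrecord_path):
    # read_var_tf and tfrecord_path are illustrative names, not from the original code
    filename_queue = tf.train.string_input_producer([tfrecord_path])
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    result = tf.parse_single_example(serialized_example,
                                     features={
                                         'label': tf.VarLenFeature(tf.int64),
                                         'feature': tf.VarLenFeature(tf.int64),
                                     })
    # VarLenFeature returns SparseTensors; pad with 0 to get dense tensors
    label = tf.sparse_tensor_to_dense(result['label'])
    feature = tf.sparse_tensor_to_dense(result['feature'])
    return feature, label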

Common errors:

When the label length written to the tfrecord differs from the length declared during parsing, an error like the following is reported:

Error details:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Name: <unknown>, Key: label, Index: 0.  Number of int64 values != expected.  Values size: 1 but output shape: [3]

Here the stored label contains only 1 value, but the parser declared a shape of [3].

Solution: make sure the label length written when generating the tfrecord matches the shape declared when parsing it.

TensorFlow GPU Errors (4 Types of Errors and Their Solutions)

I have just switched to a new laptop and kept getting errors when training models with tensorflow-gpu, so I am writing down my solution here. While trying to get the GPU working I swapped CUDA versions, cuDNN versions, and tensorflow-gpu/keras versions, so the reported errors are also a mess.
Error 1:
2021-08-09 21:04:53.637764: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2021-08-09 21:04:58.598447: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-08-09 21:17:47.603456: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
2021-08-09 21:17:47.675868: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
2021-08-09 21:17:47.676730: I tensorflow/stream_executor/stream.cc:4963] [stream=000001774007A1F0,impl=00000177393F7250] did not memzero GPU location; source: 000000726209DF28
2021-08-09 21:17:47.676867: I tensorflow/stream_executor/stream.cc:316] did not allocate timer: 000000726209DED0
2021-08-09 21:17:47.676954: I tensorflow/stream_executor/stream.cc:1964] [stream=000001774007A1F0,impl=00000177393F7250] did not enqueue ‘start timer’: 000000726209DED0
2021-08-09 21:17:47.677084: I tensorflow/stream_executor/stream.cc:1976] [stream=000001774007A1F0,impl=00000177393F7250] did not enqueue ‘stop timer’: 000000726209DED0
2021-08-09 21:17:47.677201: F tensorflow/stream_executor/gpu/gpu_timer.cc:65] Check failed: start_event_ != nullptr && stop_event_ != nullptr
Error 2:
Error 3:
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
Error 4:
CuDNN library: 7.4.1 but source was compiled with: 7.6.0.  CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration
In fact, all of these errors come down to one problem: my graphics card is a 3050 Ti, which belongs to the 30 series and only supports CUDA 11 or higher, so I reinstalled CUDA 11.3.1 and the corresponding cuDNN 8.2.0 (cuDNN 8.2.1 still produced an error).
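
Not from the original post, but in newer TensorFlow 2.x builds you can check which CUDA and cuDNN versions your TensorFlow was compiled against, which must be compatible with the locally installed cudatoolkit and cudnn:

import tensorflow as tf

# Prints build metadata, including the expected CUDA and cuDNN versions (available in newer TF 2.x).
print(tf.sysconfig.get_build_info())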

ModuleNotFoundError: No module named ‘tensorflow_core.estimator‘

Question

When using tensorflow, an error is reported: ModuleNotFoundError: No module named 'tensorflow_core.estimator'

Possible causes and corresponding solutions

1.

Problem: the matplotlib library was not imported.
Solution: add import matplotlib.pyplot as plt.
If matplotlib is not installed, install it from the command line with conda install matplotlib.

2.

Problem: the tensorflow version is inconsistent with the tensorflow-estimator version.
Solution: check whether the installed tensorflow and tensorflow-estimator versions match by running conda list on the command line. If they do not match, downgrade or upgrade one of them until the versions agree.
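
For example (illustrative commands; the exact version number depends on your installed tensorflow release):

conda list | grep tensorflow
pip install tensorflow-estimator==2.5.0   # pick the release that matches your tensorflow version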

from keras.preprocessing.text import Tokenizer error: AttributeError: module ‘tensorflow.compat.v2‘ has..

Error when running from keras.preprocessing.text import Tokenizer: AttributeError: module 'tensorflow.compat.v2' has ...

Importing the vocabulary mapper Tokenizer from keras in NLP code:

from keras.preprocessing.text import Tokenizer

Executing the code reports the error:

AttributeError: module 'tensorflow.compat.v2' has no attribute '__internal__'

I searched Baidu for this error for a long time and only found similar problems. The fix is simply to change the import above to:

from tensorflow.keras.preprocessing.text import Tokenizer

That’s it!

[Solved] RuntimeError: Numpy is not available (Associated Torch or Tensorflow)

Solution for RuntimeError: Numpy is not available

Problem Description:

Today my old computer died (sob) and I set up a new one. I couldn't wait to install pytorch and check that my previous project still ran. As a result, model training threw an error.

The key line is the one that loads the training data:

for i, data in enumerate(train_loader):

Under the hood, this statement converts tensor variables to numpy arrays, i.e.

import numpy
import torch
test_torch = torch.rand(100,1,28,28)
print(test_torch.numpy())
# will produce the same error

Root cause

The direct dependency between pytorch and tensorflow is easy to overlook:
if the environment contains only torch but no tensorflow, torch tensors lose tensor-related operations such as Tensor.numpy().

Solution:

Install tensorflow in the base environment (either the CPU or the GPU version).
In an Anaconda environment you can simply run:

conda install tensorflow

Otherwise, with pip:

pip install --ignore-installed --upgrade tensorflow-gpu
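
After installing, a quick check (a sketch) that the tensor-to-numpy conversion works again:

import torch

# If the environment is fixed, this prints a 2x2 numpy array instead of raising
# "RuntimeError: Numpy is not available".
print(torch.rand(2, 2).numpy())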