A summary of problems encountered when training with TensorFlow


I am 繁荣大神, a blogger on 靠谱客 (Kaopuke). This post summarizes the problems I ran into while training with TensorFlow, shared here as a reference.

1、OP_REQUIRES failed at assign_op.h models

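The root cause is insufficient GPU resources. The exact fix varies from case to case, but the following approaches apply broadly:

Add os.environ['CUDA_VISIBLE_DEVICES']='2' to the eval file
Force evaluation to run on the CPU
Reduce batch_size
Change the tensorflow-gpu version, which may help
Switch to a different network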
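
As a minimal sketch of the first two suggestions: the device mask should be set before TensorFlow initializes CUDA, so in practice before `import tensorflow`. The helper name `restrict_visible_gpus` is hypothetical, and using "-1" to hide every GPU is a common convention rather than something stated in the original post.

```python
import os


def restrict_visible_gpus(device_ids):
    """Set CUDA_VISIBLE_DEVICES from a list of GPU indices.

    An empty list sets the value to "-1", which hides every GPU
    so that evaluation falls back to the CPU.
    """
    value = ",".join(str(i) for i in device_ids) if device_ids else "-1"
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Call this at the very top of the eval script, before importing tensorflow.
restrict_visible_gpus([2])    # evaluate on GPU 2 only
# restrict_visible_gpus([])   # evaluate on the CPU only
```

Setting the variable inside the script keeps the choice next to the eval code; exporting it in the shell before launching the script has the same effect.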

2、Argument must be a dense tensor: range(0, 3) - got shape [3], but wanted []

In models/research/object_detection/utils/learning_schedules.py, change:

  rate_index = tf.reduce_max(tf.where(tf.greater_equal(global_step, boundaries),
                                      range(num_boundaries),
                                      [0] * num_boundaries))

to:

  rate_index = tf.reduce_max(tf.where(tf.greater_equal(global_step, boundaries),
                                      list(range(num_boundaries)),
                                      [0] * num_boundaries))
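
In Python 3, `range(...)` returns a lazy range object rather than a list, which appears to be why the original call is rejected; wrapping it in `list(...)` hands TensorFlow a plain list. The pure-Python analogue below only illustrates what the expression computes. The function name `rate_index_sketch` and the sample boundaries are made up for illustration and are not part of the Object Detection API.

```python
def rate_index_sketch(global_step, boundaries):
    """Pure-Python analogue of the tf.where / tf.reduce_max expression."""
    num_boundaries = len(boundaries)
    indices = list(range(num_boundaries))   # what list(range(...)) supplies
    zeros = [0] * num_boundaries
    # Analogue of tf.greater_equal(global_step, boundaries)
    reached = [global_step >= b for b in boundaries]
    # Analogue of tf.where(condition, indices, zeros)
    selected = [i if r else z for r, i, z in zip(reached, indices, zeros)]
    # Analogue of tf.reduce_max
    return max(selected)


print(isinstance(range(3), list))              # False
print(rate_index_sketch(150, [0, 100, 200]))   # 1
```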

3、ValueError: not enough values to unpack (expected 7, got 0)

The batch_size in config file should be set the same number as your num_clones, which could prevent this.
The batch_size in detection and classification tasks has different definition.
(quoted from GitHub)

In other words, the batch_size in your config file needs to match the num_clones used in your training script.
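
A small pre-flight check along these lines could catch the mismatch before training starts. This is a sketch only: `check_batch_size` is a hypothetical helper, the sample config is invented, and the code assumes the first `batch_size:` entry in the config text is the training one.

```python
import re


def check_batch_size(config_text, num_clones):
    """Return True if the first batch_size in the config equals num_clones."""
    match = re.search(r"\bbatch_size\s*:\s*(\d+)", config_text)
    if match is None:
        raise ValueError("no batch_size entry found in config text")
    return int(match.group(1)) == num_clones


# Invented sample standing in for the real pipeline config file.
sample_config = """
train_config {
  batch_size: 2
}
"""
print(check_batch_size(sample_config, num_clones=2))  # True
print(check_batch_size(sample_config, num_clones=4))  # False
```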

4、TensorBoard does not display anything

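This happens when TensorBoard is not reading from the correct path. The following approach fixes it.

In the command prompt, cd to the parent directory of the log folder (for example, cd home/tensorBoard). Then pass only the log folder name after the equals sign instead of the full path: tensorboard --logdir=logs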
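
Before launching TensorBoard it can help to confirm that the directory passed to --logdir actually contains event files. The sketch below assumes the usual `events.out.tfevents.*` file naming; `find_event_files` is a hypothetical helper, not part of TensorBoard.

```python
from pathlib import Path


def find_event_files(logdir):
    """Return all TensorBoard event files found under logdir (recursively)."""
    root = Path(logdir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("events.out.tfevents.*"))


files = find_event_files("logs")
if files:
    print(f"found {len(files)} event file(s); run: tensorboard --logdir=logs")
else:
    print("no event files under ./logs; check the current directory")
```

If the list is empty when run from the directory where TensorBoard is started, the relative path is the likely culprit.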

5、No scalar data was found…

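Scalar data may not show up at first, possibly because eval has not produced any results yet. As long as TensorBoard itself loads normally, the data will probably appear after a short wait.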

6、Value Error: First Step Cannot Be Zero

Find a block like the following in the config file:

schedule {
  step: 0
  learning_rate: .0001
}

Change step to a non-zero value, or delete the block.
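
To locate such entries in a long config file, a quick scan like the one below may help. It is only a sketch: `find_zero_steps` is a hypothetical helper, the sample text is invented, and the regular expression assumes each `step:` entry sits on its own line.

```python
import re


def find_zero_steps(config_text):
    """Return 1-based line numbers of every `step: 0` entry."""
    pattern = re.compile(r"^\s*step\s*:\s*0\s*$")
    return [n for n, line in enumerate(config_text.splitlines(), start=1)
            if pattern.match(line)]


# Invented sample standing in for the real config file.
sample = """schedule {
  step: 0
  learning_rate: .0001
}
schedule {
  step: 900000
  learning_rate: .00001
}"""
print(find_zero_steps(sample))  # [2]
```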

7、Checking GPU and CPU information

https://blog.csdn.net/weiyumeizi/article/details/83035711
https://blog.csdn.net/wujizhishui/article/details/89333957
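
For a quick look from Python rather than the shell, something like the following sketch could be used. It relies only on the standard library and calls `nvidia-smi` when that tool is on the PATH; `hardware_summary` is a hypothetical helper name, and machines without NVIDIA drivers simply get an empty GPU list.

```python
import os
import platform
import shutil
import subprocess


def hardware_summary():
    """Collect basic CPU info and, if nvidia-smi is available, GPU names and memory."""
    info = {
        "machine": platform.machine(),
        "cpu_count": os.cpu_count(),
        "gpus": [],
    }
    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True, text=True, check=False,
        )
        if result.returncode == 0:
            info["gpus"] = [line.strip() for line in result.stdout.splitlines()
                            if line.strip()]
    return info


print(hardware_summary())
```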

8、fail to start snmpd

package snmpd 5.7.3+dfsg-1ubuntu4 failed to install/upgrade: subprocess installed post-installation script returned error exit status 1

https://answers.launchpad.net/ubuntu/+source/net-snmp/+question/656995

9、Collected TensorFlow 2.1 errors

RuntimeError: loss passed to Optimizer.compute_gradients should be a function when eager execution is enabled.
RuntimeError: Attempting to capture an EagerTensor without building a function.
RuntimeError: When eager execution is enabled, var_list must specify a list or dict of variables to save

When eager execution is enabled, loss must be a Python function (a callable).
In TensorFlow 2.0, eager execution is enabled by default.
So eager execution needs to be disabled first:
tf.compat.v1.disable_eager_execution()
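
Placement matters: the call belongs at the start of the script, before any graph-building code runs. The sketch below assumes TensorFlow 2.x; `disable_eager_if_tf2` is a hypothetical wrapper that returns None when TensorFlow is not installed, so the snippet can be tried in any environment.

```python
def disable_eager_if_tf2():
    """Disable eager execution if TensorFlow is importable.

    Returns the eager state afterwards (False on success),
    or None when TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    tf.compat.v1.disable_eager_execution()
    return tf.executing_eagerly()


# Run this first, before defining losses, optimizers, or savers.
state = disable_eager_if_tf2()
print("eager execution enabled:", state)
```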

10、Fixing slow GitHub clones

https://www.jianshu.com/p/fb9848d5418c

11、How to fix the bug "Expected "required", "optional", or "repeated"."?

The problem is a bug in the currently installed protobuf version, so a different version needs to be installed. The steps are:

tensorflow$ mkdir protoc_3.3
tensorflow$ cd protoc_3.3
tensorflow/protoc_3.3$ wget https://github.com/google/protobuf/releases/download/v3.3.0/protoc-3.3.0-linux-x86_64.zip
tensorflow/protoc_3.3$ chmod 775 protoc-3.3.0-linux-x86_64.zip
tensorflow/protoc_3.3$ unzip protoc-3.3.0-linux-x86_64.zip
tensorflow/protoc_3.3$ cd ../models/
tensorflow/models$ /home/humayun/tensorflow/protoc_3.3/bin/protoc object_detection/protos/*.proto --python_out=.

https://github.com/tensorflow/models/issues/1834

12、A tutorial that works for installing the Google Object Detection API

https://zhuanlan.zhihu.com/p/215456184

13、Fixing "no module named 'pycocotools_mask'"

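At first I assumed cocoAPI had not been installed properly. In fact, there is a pycocotools directory under tensorflow/models/research, and Python imports that package first, but the _mask inside it is not a Python module. Delete that package, then reinstall under models/research with these commands:

git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py install
make
# activate your environment before running make install
make install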

14、findfont: Font family ['serif'] not found. Falling back to DejaVu Sans.

https://blog.csdn.net/mr_muli/article/details/89485619

15、LaTeX Error: File `type1ec.sty' not found.

apt install cm-super

16、FileNotFoundError: [Errno 2] No such file or directory: 'latex': 'latex' (Python 3.6 issue)

sudo aptitude install texlive-fonts-recommended texlive-fonts-extra
sudo apt-get install dvipng

16、WARNING:root:image 4000 does not have groundtruth difficult flag specified

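The issue is that too many images are being evaluated, which makes eval take too long, so the number should be reduced. The setting lives in the config file: under "eval_config", set "num_examples" to the number you want, for example 100 or 10.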

17、Images in a custom dataset use a different format, which breaks detection

The images being detected were in RGBA format while the program expects RGB, so detection kept failing with the error below.

ValueError: cannot reshape array of size 60654 into shape (264,256,1,3)

The cause is that an RGB image has 3 channels while an RGBA image has 4, so the shapes do not match.

When loading an image, therefore, convert it from RGBA to RGB first.

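import cv2
import numpy as np

def load_image_into_numpy_array(image):
    # Convert the loaded image to an array; drop the alpha channel if present
    image_np = np.asarray(image)
    if image_np.ndim == 3 and image_np.shape[2] == 4:
        image_np = cv2.cvtColor(image_np, cv2.COLOR_RGBA2RGB)
    return image_np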

The conversion uses OpenCV: image_np = cv2.cvtColor(image_np, cv2.COLOR_RGBA2RGB).
With that in place, the problem should be resolved.

18、cannot import name AsyncGenerator

The fix is to downgrade prompt-toolkit:

pip install --upgrade prompt-toolkit==2.0.1

Once the installation succeeds, run:

python -m ipykernel --version

If a version number is printed, the problem is solved and Jupyter can be used normally.

Alternatively, uninstall and reinstall IPython:

pip uninstall Ipython
pip install Ipython
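
To confirm which versions ended up installed after either fix, a check like this sketch can be run in the same environment. It uses the standard-library `importlib.metadata` module (Python 3.8 or later); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("prompt-toolkit", "ipython", "ipykernel"):
    print(name, installed_version(name))
```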

19、Cannot uninstall 'ipython'. It is a distutils installed project and thus we cannot accurately det…

Fix: force the update with the command below. Tested and working.

sudo pip3 install --ignore-installed ipython --upgrade

20、ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.9' not found

https://blog.csdn.net/bitcarmanlee/article/details/90242598

Category: Deep Learning
Published: 2023-06-10 19:04:02
Link: https://www.kaopuke.com/article/k-p-k_14_ujokf2_14__23__6_x.html
