Python data cleaning: delete images that fail to load (simple version)

While I was training a classification model with Caffe, training was interrupted by errors about images that could not be read, so I wrote a script to find and delete the unreadable images in advance. The script is as follows:

import os
import warnings

from PIL import Image

# Turn UserWarning into an exception so that images Pillow only warns about
# are also treated as failed reads.
warnings.filterwarnings("error", category=UserWarning)


base_dir = "/data/chw/images"
i = 0

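def is_read_successfully(file):
    # Image.open() is lazy and only reads the header, so verify() is called
    # as well to catch files that are damaged further in.
    try:
        with Image.open(file) as img:
            img.verify()
        return True
    except Exception:
        return False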

            
for parent, dirs, files in os.walk(base_dir):
    for file in files:
        if not is_read_successfully(os.path.join(parent, file)):
            print(os.path.join(parent, file))
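            # os.remove(os.path.join(parent, file))  # Uncomment this line for real use. I usually do a dry run first and only delete once the output looks right, to avoid deleting the wrong files.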
            i = i + 1
print(i)
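
The dry-run habit mentioned in the comment above can also be built into the script. The following is only a sketch under a few assumptions: the names is_decodable, find_bad_images and the delete flag are made up for illustration and are not part of the original script, and the check relies on Pillow's verify() followed by a full load(), which is stricter than the version above but may still differ from what Caffe's own image reader accepts.

```python
import os

from PIL import Image


def is_decodable(path):
    """Return True if Pillow can verify and fully decode the file at path."""
    try:
        # verify() checks the file for damage without decoding pixel data.
        # The image cannot be used afterwards, so it is reopened below.
        with Image.open(path) as img:
            img.verify()
        # load() forces a full decode of the pixel data.
        with Image.open(path) as img:
            img.load()
        return True
    except Exception:
        return False


def find_bad_images(root, delete=False):
    """Walk root and return the paths that fail to decode.

    Files are only removed when delete=True, so the default is a dry run.
    """
    bad = []
    for parent, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(parent, name)
            if not is_decodable(path):
                bad.append(path)
                if delete:
                    os.remove(path)
    return bad
```

A typical use would be to call find_bad_images(base_dir) first, look over the returned list, and only then call it again with delete=True.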
