tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches

一、Summary

In one sentence:

Make sure that batch_size (set in the image-augmentation generator) * steps_per_epoch (set in fit) is less than or equal to the number of training samples.
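The rule above can be sketched as a small arithmetic check (`check_generator_capacity` is an illustrative helper written for this post, not a Keras API):

```python
def check_generator_capacity(n_samples, batch_size, steps_per_epoch):
    """Return True if n_samples training images can supply
    steps_per_epoch batches of size batch_size in one epoch."""
    return batch_size * steps_per_epoch <= n_samples

# 2000 training images (1000 cats + 1000 dogs):
print(check_generator_capacity(2000, 20, 100))  # True:  20 * 100 == 2000
print(check_generator_capacity(2000, 32, 100))  # False: 32 * 100 == 3200 > 2000
```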


train_generator = train_datagen.flow_from_directory(
    train_dir,              # target directory
    target_size=(150, 150), # resize all images to 150×150
    batch_size=20,
    class_mode='binary')    # binary labels, because binary_crossentropy loss is used

history = model.fit(       
    train_generator,
    steps_per_epoch=100,
    epochs=150,
    validation_data=validation_generator,
    validation_steps=50)

# case 1
# If train_generator above has batch_size=32 and steps_per_epoch=100 here, it errors:
"""
tensorflow:Your input ran out of data; interrupting training.
Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 50 batches).
You may need to use the repeat() function when building your dataset.
"""
# because the training set has 2000 samples (1000 cats, 1000 dogs), which is less than 100 * 32 = 3200.
# case 2
# With batch_size=20 and steps_per_epoch=100, there is no error:
# 20 * 100 = 2000 is an exact fit.
# case 3
# With batch_size=32 and steps_per_epoch=int(1000/32),
# there is no error, but there is still a warning, because the batches do not divide the data evenly.
# No error, because int(1000/32) * 32 = 992 < 2000.
# case 4
# With batch_size=40 and steps_per_epoch=100, it errors again,
# because 40 * 100 = 4000 > 2000.
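The four cases can be verified with a few lines of plain arithmetic, no Keras required (a sketch written for this post):

```python
n_train = 2000  # 1000 cats + 1000 dogs

cases = [(32, 100),           # case 1
         (20, 100),           # case 2
         (32, int(1000/32)),  # case 3
         (40, 100)]           # case 4

report = []
for batch_size, steps_per_epoch in cases:
    needed = batch_size * steps_per_epoch
    status = "OK" if needed <= n_train else "ERROR"
    report.append(status)
    print(f"batch_size={batch_size}, steps_per_epoch={steps_per_epoch}: "
          f"needs {needed} samples per epoch -> {status}")
```

Running it reproduces the pattern above: cases 1 and 4 exceed the 2000 available samples, cases 2 and 3 do not.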

 

 

二、tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches

Source / reference: https://stackoverflow.com/questions/60509425/how-to-use-repeat-function-when-building-data-in-keras

 

1、The error message

WARNING:tensorflow:Your input ran out of data;
interrupting training. Make sure that your dataset or generator can generate at least
steps_per_epoch * epochs batches (in this case, 5000 batches).
You may need to use the repeat() function when building your dataset.

 

2、The symptom

I am training a binary classifier on a dataset of cats and dogs:
Total Dataset: 10000 images
Training Dataset: 8000 images
Validation/Test Dataset: 2000 images

The Jupyter notebook code:

# Part 2 - Fitting the CNN to the images
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')
test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')
history = model.fit_generator(training_set,
                              steps_per_epoch = 8000,
                              epochs = 25,
                              validation_data = test_set,
                              validation_steps = 2000)

I trained it on a CPU without a problem but when I run on GPU it throws me this error:

Found 8000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
WARNING:tensorflow:From <ipython-input-8-140743827a71>:23: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.
WARNING:tensorflow:sample_weight modes were coerced from ... to ['...']
WARNING:tensorflow:sample_weight modes were coerced from ... to ['...']
Train for 8000 steps, validate for 2000 steps
Epoch 1/25
 250/8000 [..............................] - ETA: 21:50 - loss: 7.6246 - accuracy: 0.5000
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 200000 batches). You may need to use the repeat() function when building your dataset.
 250/8000 [..............................] - ETA: 21:52 - loss: 7.6246 - accuracy: 0.5000

I would like to know how to use the repeat() function in Keras with TensorFlow 2.0.

 

3、The solution

Your problem stems from the fact that the parameters steps_per_epoch and validation_steps need to equal the total number of data points divided by the batch_size.

Your code would work in Keras 1.X, prior to August 2017.

Change your model.fit function to:

history = model.fit_generator(training_set,
                              steps_per_epoch=int(8000/batch_size),
                              epochs=25,
                              validation_data=test_set,
                              validation_steps=int(2000/batch_size))

As of TensorFlow 2.1, fit_generator is deprecated; you can call the .fit() method directly on generators as well.

TensorFlow >= 2.1 code:

history = model.fit(training_set.repeat(),
                    steps_per_epoch=int(8000/batch_size),
                    epochs=25,
                    validation_data=test_set.repeat(),
                    validation_steps=int(2000/batch_size))

Notice that int(8000/batch_size) is equivalent to 8000 // batch_size (integer division).
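For example, with batch_size = 32 the two spellings agree, including when the division is not exact:

```python
batch_size = 32
print(int(8000 / batch_size), 8000 // batch_size)  # both 250 (exact division)
print(int(2000 / batch_size), 2000 // batch_size)  # both 62  (2000/32 = 62.5 rounds down)
```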

 

============================================================================

In other words, steps_per_epoch=int(8000/batch_size), where 8000 is the number of training samples.
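Note that repeat() is a method of tf.data.Dataset: it makes the dataset start over whenever it is exhausted, so it can always supply steps_per_epoch * epochs batches. Its behaviour can be sketched in plain Python (a simplified stand-in written for this post, not the real tf.data API):

```python
from itertools import islice

def repeat(make_batches):
    """Re-create and re-yield the batches forever, like tf.data.Dataset.repeat()."""
    while True:
        yield from make_batches()

# a "dataset" holding only 3 batches...
make_batches = lambda: iter(["batch0", "batch1", "batch2"])

# ...can now feed 8 training steps without running out of data
steps = list(islice(repeat(make_batches), 8))
print(steps)
# ['batch0', 'batch1', 'batch2', 'batch0', 'batch1', 'batch2', 'batch0', 'batch1']
```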

 

 

 

4、Example

There are 1000 training images per class (2000 in total: 1000 cats and 1000 dogs).

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)
# Note: the validation data must not be augmented
test_datagen = ImageDataGenerator(rescale=1./255)


# batch_size here must not be 32, otherwise the following error is raised:
'''
WARNING:tensorflow:Your input ran out of data;
interrupting training. Make sure that your dataset or generator can generate at least
steps_per_epoch * epochs batches (in this case, 5000 batches).
You may need to use the repeat() function when building your dataset.
'''
# (because 32 * 100 = 3200 exceeds the 2000 training samples; see the cases in 4.1)



train_generator = train_datagen.flow_from_directory(
    train_dir,              # target directory
    target_size=(150, 150), # resize all images to 150×150
    batch_size=20,
    class_mode='binary')    # binary labels, because binary_crossentropy loss is used

validation_generator = test_datagen.flow_from_directory(         
    validation_dir,         
    target_size=(150, 150),         
    batch_size=20,         
    class_mode='binary') 
 

 

 

history = model.fit(       
    train_generator,
    steps_per_epoch=100,
    epochs=150,
    validation_data=validation_generator,
    validation_steps=50)

If batch_size above were 32, then steps_per_epoch=100 here would raise the error.

What the steps_per_epoch argument does: after drawing steps_per_epoch batches from the generator (that is, after running steps_per_epoch gradient-descent steps), the fitting process moves on to the next epoch.
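The interruption itself is easy to reproduce with a toy loop: a finite generator of 2000 samples at batch_size 32 holds only 62 full batches, so it cannot feed 100 steps and the "epoch" stops early, which is exactly what the warning reports (an illustrative sketch, not Keras internals):

```python
def finite_generator(n_samples=2000, batch_size=32):
    """Yield one dummy batch per full batch of data -- only 62 here."""
    for i in range(n_samples // batch_size):
        yield f"batch {i}"

gen = finite_generator()
steps_per_epoch = 100
steps_done = 0
for _ in range(steps_per_epoch):
    if next(gen, None) is None:  # generator exhausted before the epoch finished
        print("Your input ran out of data; interrupting training")
        break
    steps_done += 1

print(f"completed {steps_done} of {steps_per_epoch} steps")  # completed 62 of 100 steps
```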

 

4.1、Detailed test cases

# case 1
# If train_generator above has batch_size=32 and steps_per_epoch=100 here, it errors:
"""
tensorflow:Your input ran out of data; interrupting training.
Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 50 batches).
You may need to use the repeat() function when building your dataset.
"""
# because the training set has 2000 samples (1000 cats, 1000 dogs), which is less than 100 * 32 = 3200.
# case 2
# With batch_size=20 and steps_per_epoch=100, there is no error:
# 20 * 100 = 2000 is an exact fit.
# case 3
# With batch_size=32 and steps_per_epoch=int(1000/32),
# there is no error, but there is still a warning, because the batches do not divide the data evenly.
# No error, because int(1000/32) * 32 = 992 < 2000.
# case 4
# With batch_size=40 and steps_per_epoch=100, it errors again,
# because 40 * 100 = 4000 > 2000.

 

