tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches

一、Summary

In one sentence:

Make sure that batch_size (set in the image-augmentation generator) * steps_per_epoch (set in fit) is less than or equal to the number of training samples.
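The rule above can be sketched as a small arithmetic check (`check_generator_capacity` is an illustrative helper written for this post, not a Keras API):

```python
def check_generator_capacity(n_samples, batch_size, steps_per_epoch):
    """Return True if n_samples training images can supply
    steps_per_epoch batches of size batch_size in one epoch."""
    return batch_size * steps_per_epoch <= n_samples

# 2000 training images (1000 cats + 1000 dogs):
print(check_generator_capacity(2000, 20, 100))  # True:  20 * 100 == 2000
print(check_generator_capacity(2000, 32, 100))  # False: 32 * 100 == 3200 > 2000
```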


train_generator = train_datagen.flow_from_directory(
    train_dir,              # target directory
    target_size=(150, 150), # resize all images to 150×150
    batch_size=20,
    class_mode='binary')    # binary labels, because binary_crossentropy loss is used

history = model.fit(       
    train_generator,
    steps_per_epoch=100,
    epochs=150,
    validation_data=validation_generator,
    validation_steps=50)

# case 1
# If train_generator above has batch_size=32 and steps_per_epoch=100 here, it errors:
"""
tensorflow:Your input ran out of data; interrupting training.
Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 50 batches).
You may need to use the repeat() function when building your dataset.
"""
# because the training set has 2000 samples (1000 cats, 1000 dogs), which is less than 100 * 32 = 3200.
# case 2
# With batch_size=20 and steps_per_epoch=100, there is no error:
# 20 * 100 = 2000 is an exact fit.
# case 3
# With batch_size=32 and steps_per_epoch=int(1000/32),
# there is no error, but there is still a warning, because the batches do not divide the data evenly.
# No error, because int(1000/32) * 32 = 992 < 2000.
# case 4
# With batch_size=40 and steps_per_epoch=100, it errors again,
# because 40 * 100 = 4000 > 2000.
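The four cases can be verified with a few lines of plain arithmetic, no Keras required (a sketch written for this post):

```python
n_train = 2000  # 1000 cats + 1000 dogs

cases = [(32, 100),           # case 1
         (20, 100),           # case 2
         (32, int(1000/32)),  # case 3
         (40, 100)]           # case 4

report = []
for batch_size, steps_per_epoch in cases:
    needed = batch_size * steps_per_epoch
    status = "OK" if needed <= n_train else "ERROR"
    report.append(status)
    print(f"batch_size={batch_size}, steps_per_epoch={steps_per_epoch}: "
          f"needs {needed} samples per epoch -> {status}")
```

Running it reproduces the pattern above: cases 1 and 4 exceed the 2000 available samples, cases 2 and 3 do not.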

 

 

二、tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches

Source / reference: https://stackoverflow.com/questions/60509425/how-to-use-repeat-function-when-building-data-in-keras

 

1、The error message

WARNING:tensorflow:Your input ran out of data;
interrupting training. Make sure that your dataset or generator can generate at least
steps_per_epoch * epochs batches (in this case, 5000 batches).
You may need to use the repeat() function when building your dataset.

 

2、The symptom

I am training a binary classifier on a dataset of cats and dogs:
Total Dataset: 10000 images
Training Dataset: 8000 images
Validation/Test Dataset: 2000 images

The Jupyter notebook code:

# Part 2 - Fitting the CNN to the images
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')
test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')
history = model.fit_generator(training_set,
                              steps_per_epoch = 8000,
                              epochs = 25,
                              validation_data = test_set,
                              validation_steps = 2000)

I trained it on a CPU without a problem but when I run on GPU it throws me this error:

Found 8000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
WARNING:tensorflow:From <ipython-input-8-140743827a71>:23: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.
WARNING:tensorflow:sample_weight modes were coerced from ... to ['...']
WARNING:tensorflow:sample_weight modes were coerced from ... to ['...']
Train for 8000 steps, validate for 2000 steps
Epoch 1/25
 250/8000 [..............................] - ETA: 21:50 - loss: 7.6246 - accuracy: 0.5000
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 200000 batches). You may need to use the repeat() function when building your dataset.
 250/8000 [..............................] - ETA: 21:52 - loss: 7.6246 - accuracy: 0.5000

I would like to know how to use the repeat() function in Keras with TensorFlow 2.0.

 

3、The solution

Your problem stems from the fact that the parameters steps_per_epoch and validation_steps need to equal the total number of data points divided by the batch_size.

Your code would work in Keras 1.X, prior to August 2017.

Change your model.fit function to:

history = model.fit_generator(training_set,
                              steps_per_epoch=int(8000/batch_size),
                              epochs=25,
                              validation_data=test_set,
                              validation_steps=int(2000/batch_size))

As of TensorFlow 2.1, fit_generator is deprecated; you can call the .fit() method directly on generators as well.

TensorFlow >= 2.1 code:

history = model.fit(training_set.repeat(),
                    steps_per_epoch=int(8000/batch_size),
                    epochs=25,
                    validation_data=test_set.repeat(),
                    validation_steps=int(2000/batch_size))

Notice that int(8000/batch_size) is equivalent to 8000 // batch_size (integer division).
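For example, with batch_size = 32 the two spellings agree, including when the division is not exact:

```python
batch_size = 32
print(int(8000 / batch_size), 8000 // batch_size)  # both 250 (exact division)
print(int(2000 / batch_size), 2000 // batch_size)  # both 62  (2000/32 = 62.5 rounds down)
```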

 

============================================================================

In other words, steps_per_epoch=int(8000/batch_size), where 8000 is the number of training samples.
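Note that repeat() is a method of tf.data.Dataset: it makes the dataset start over whenever it is exhausted, so it can always supply steps_per_epoch * epochs batches. Its behaviour can be sketched in plain Python (a simplified stand-in written for this post, not the real tf.data API):

```python
from itertools import islice

def repeat(make_batches):
    """Re-create and re-yield the batches forever, like tf.data.Dataset.repeat()."""
    while True:
        yield from make_batches()

# a "dataset" holding only 3 batches...
make_batches = lambda: iter(["batch0", "batch1", "batch2"])

# ...can now feed 8 training steps without running out of data
steps = list(islice(repeat(make_batches), 8))
print(steps)
# ['batch0', 'batch1', 'batch2', 'batch0', 'batch1', 'batch2', 'batch0', 'batch1']
```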

 

 

 

4、Example

There are 1000 training images per class (2000 in total: 1000 cats and 1000 dogs).

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)
# Note: the validation data must not be augmented
test_datagen = ImageDataGenerator(rescale=1./255)


# batch_size here must not be 32, otherwise the following error is raised:
'''
WARNING:tensorflow:Your input ran out of data;
interrupting training. Make sure that your dataset or generator can generate at least
steps_per_epoch * epochs batches (in this case, 5000 batches).
You may need to use the repeat() function when building your dataset.
'''
# (because 32 * 100 = 3200 exceeds the 2000 training samples; see the cases in 4.1)



train_generator = train_datagen.flow_from_directory(
    train_dir,              # target directory
    target_size=(150, 150), # resize all images to 150×150
    batch_size=20,
    class_mode='binary')    # binary labels, because binary_crossentropy loss is used

validation_generator = test_datagen.flow_from_directory(         
    validation_dir,         
    target_size=(150, 150),         
    batch_size=20,         
    class_mode='binary') 
 

 

 

history = model.fit(       
    train_generator,
    steps_per_epoch=100,
    epochs=150,
    validation_data=validation_generator,
    validation_steps=50)

If batch_size above were 32, then steps_per_epoch=100 here would raise the error.

What the steps_per_epoch argument does: after drawing steps_per_epoch batches from the generator (that is, after running steps_per_epoch gradient-descent steps), the fitting process moves on to the next epoch.
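The interruption itself is easy to reproduce with a toy loop: a finite generator of 2000 samples at batch_size 32 holds only 62 full batches, so it cannot feed 100 steps and the "epoch" stops early, which is exactly what the warning reports (an illustrative sketch, not Keras internals):

```python
def finite_generator(n_samples=2000, batch_size=32):
    """Yield one dummy batch per full batch of data -- only 62 here."""
    for i in range(n_samples // batch_size):
        yield f"batch {i}"

gen = finite_generator()
steps_per_epoch = 100
steps_done = 0
for _ in range(steps_per_epoch):
    if next(gen, None) is None:  # generator exhausted before the epoch finished
        print("Your input ran out of data; interrupting training")
        break
    steps_done += 1

print(f"completed {steps_done} of {steps_per_epoch} steps")  # completed 62 of 100 steps
```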

 

4.1、Detailed test cases

# case 1
# If train_generator above has batch_size=32 and steps_per_epoch=100 here, it errors:
"""
tensorflow:Your input ran out of data; interrupting training.
Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 50 batches).
You may need to use the repeat() function when building your dataset.
"""
# because the training set has 2000 samples (1000 cats, 1000 dogs), which is less than 100 * 32 = 3200.
# case 2
# With batch_size=20 and steps_per_epoch=100, there is no error:
# 20 * 100 = 2000 is an exact fit.
# case 3
# With batch_size=32 and steps_per_epoch=int(1000/32),
# there is no error, but there is still a warning, because the batches do not divide the data evenly.
# No error, because int(1000/32) * 32 = 992 < 2000.
# case 4
# With batch_size=40 and steps_per_epoch=100, it errors again,
# because 40 * 100 = 4000 > 2000.

 

