The principle of image super-resolution reconstruction: take a low-resolution image with few pixels as input, and output a high-resolution image with many pixels.
In the author's code this pair is constructed with downsample_fn, which uses imresize(img, [96, 96]) to reduce the image from its original 384 x 384 to 96 x 96, thereby producing the high-resolution / low-resolution training pair.
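A minimal sketch of this pairing, using a random stand-in image (scipy.misc.imresize is what the code below uses; note it was removed in SciPy >= 1.2, so an older SciPy is assumed):

import numpy as np
from scipy.misc import imresize  # removed in SciPy >= 1.2; this repo relies on an older SciPy

hr = np.random.randint(0, 256, size=(384, 384, 3), dtype=np.uint8)  # stand-in for a 384 x 384 HR crop
lr = imresize(hr, [96, 96], interp='bicubic')                       # 4x bicubic downsample -> the LR input
print(hr.shape, lr.shape)  # (384, 384, 3) (96, 96, 3)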
The author builds the network from three parts:
Part 1 is the generator, which produces the image. It uses 16 residual blocks, and its final output passes through tf.nn.tanh(), i.e. lies in (-1, 1), because the images are preprocessed into [-1, 1].
Part 2 is the discriminator, which judges images: it should classify generated images as fake and real images as real.
Part 3 is VGG19, which extracts the conv5 feature maps of both the generated image and the real image; their MSE serves as a perceptual loss on local detail.
Loss definitions:
d_loss:
    d_loss_1: tl.cost.sigmoid_cross_entropy(logits_real, tf.ones_like(logits_real))  # loss on the discriminator's verdict for real images
    d_loss_2: tl.cost.sigmoid_cross_entropy(logits_fake, tf.zeros_like(logits_fake))  # loss on the discriminator's verdict for generated images
g_loss:
    g_gan_loss: 1e-3 * tl.cost.sigmoid_cross_entropy(logits_fake, tf.ones_like(logits_fake))  # i.e. -log(D(G(lr))): the loss for the generated image being judged real
    mse_loss: tl.cost.mean_squared_error(net_g.outputs, t_target_image)  # pixel-wise difference between the generated and target images
    vgg_loss: 2e-6 * tl.cost.mean_squared_error(vgg_pred_emb.outputs, vgg_target_emb.outputs)  # difference between the VGG19 conv5 feature maps of the generated and real images, capturing feature-level detail
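Put together, with the 1e-3 and 2e-6 weights used in the code below, the losses look like this. The placeholders are stand-ins for the tensors the real graph produces, so the snippet builds on its own:

import tensorflow as tf
import tensorlayer as tl

# stand-ins for the tensors the real graph produces (shapes as in this post)
logits_real = tf.placeholder('float32', [None, 1])          # SRGAN_d on real images
logits_fake = tf.placeholder('float32', [None, 1])          # SRGAN_d on generated images
gen_out = tf.placeholder('float32', [None, 384, 384, 3])    # net_g.outputs
hr_target = tf.placeholder('float32', [None, 384, 384, 3])  # t_target_image
feat_fake = tf.placeholder('float32', [None, 7, 7, 512])    # vgg_pred_emb.outputs (conv5, after pool5)
feat_real = tf.placeholder('float32', [None, 7, 7, 512])    # vgg_target_emb.outputs

d_loss = (tl.cost.sigmoid_cross_entropy(logits_real, tf.ones_like(logits_real))      # real -> 1
          + tl.cost.sigmoid_cross_entropy(logits_fake, tf.zeros_like(logits_fake)))  # fake -> 0

g_gan_loss = 1e-3 * tl.cost.sigmoid_cross_entropy(logits_fake, tf.ones_like(logits_fake))  # fool D
mse_loss = tl.cost.mean_squared_error(gen_out, hr_target, is_mean=True)                    # pixel loss
vgg_loss = 2e-6 * tl.cost.mean_squared_error(feat_fake, feat_real, is_mean=True)           # perceptual loss
g_loss = g_gan_loss + mse_loss + vgg_loss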
Training procedure:
First run 100 pretraining epochs that optimize only the generator, using tf.train.AdamOptimizer(lr_v, beta1=beta1).minimize(mse_loss, var_list=g_var).
Once the generator is pretrained, train the generator and the discriminator jointly, shrinking their losses together with the VGG19 perceptual loss.
Generator: 16 residual blocks, with one more skip connection from the input of the residual blocks to the output of the layer that follows them (a sketch of one residual block follows below).
Discriminator: a stack of convolutional layers with increasing feature-map counts.
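A single generator residual block in the TensorLayer 1.x API this code uses (the layer names here are illustrative, not the ones in model.py):

import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import InputLayer, Conv2d, BatchNormLayer, ElementwiseLayer

w_init = tf.random_normal_initializer(stddev=0.02)
g_init = tf.random_normal_initializer(1.0, 0.02)

t_image = tf.placeholder('float32', [None, 96, 96, 3])
n = InputLayer(t_image, name='in')
n = Conv2d(n, 64, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', W_init=w_init, name='c0')

# one of the 16 residual blocks: conv -> BN(ReLU) -> conv -> BN, then add the block input back
nn = Conv2d(n, 64, (3, 3), (1, 1), act=None, padding='SAME', W_init=w_init, name='res/c1')
nn = BatchNormLayer(nn, act=tf.nn.relu, is_train=True, gamma_init=g_init, name='res/b1')
nn = Conv2d(nn, 64, (3, 3), (1, 1), act=None, padding='SAME', W_init=w_init, name='res/c2')
nn = BatchNormLayer(nn, act=None, is_train=True, gamma_init=g_init, name='res/b2')
n = ElementwiseLayer([n, nn], tf.add, name='res/add')  # identity shortcut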
Code walkthrough:
Step 1: import the parameters from config into main.py.
Step 2: use tl.files.exists_or_mkdir() to create the folders for saving sample images, and define the checkpoint folder.
Step 3: use sorted(tl.files.load_file_list()) to build the image file lists, and tl.vis.read_images() to read the images in.
Step 4: build the model architecture:
    Step 1: define the input placeholders t_image = tf.placeholder('float32', [batch_size, 96, 96, 3]) and t_target_image = tf.placeholder('float32', [batch_size, 384, 384, 3]).
    Step 2: use SRGAN_g to build the generator net_g, with inputs t_image, is_train, reuse.
    Step 3: use SRGAN_d to build the discriminator, returning the network net_d and logits_real from inputs t_target_image, is_training, reuse; feed net_g.outputs in the same way to obtain logits_fake (both calls share weights via reuse; see the sketch after this step).
    Step 4: call net_g.print_params(False) and net_g.print_layers() to skip printing the parameter values but print every layer.
    Step 5: feed net_g.outputs (the generated result) and t_target_image (the target image) into Vgg_19_simple_api, obtaining net_vgg and the conv5 outputs:
        Step 1: use tf.image.resize_images() to resize both images so they match VGG19's 224 x 224 input.
        Step 2: feed the resized t_target_image into Vgg_19_simple_api, obtaining net_vgg and vgg_target_emb, the conv5 output.
        Step 3: feed the resized net_g.outputs into Vgg_19_simple_api, obtaining vgg_pred_emb, the conv5 output.
    Step 6: build net_g_test = SRGAN_g(t_image, False, True), used to generate test images during training.
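The variable-reuse pattern from Steps 3 and 6 in isolation: the discriminator is built twice (once on real images, once on generated ones), but both calls must share one set of weights. A minimal, self-contained sketch of the pattern, independent of the SRGAN code:

import tensorflow as tf
import tensorlayer as tl

def discriminator(x, reuse):
    # same pattern as SRGAN_d: one variable scope, built twice
    with tf.variable_scope('D', reuse=reuse):
        return tf.layers.dense(x, 1, name='out')

x_real = tf.placeholder('float32', [None, 8])
x_fake = tf.placeholder('float32', [None, 8])
logits_real = discriminator(x_real, reuse=False)  # first call creates the variables
logits_fake = discriminator(x_fake, reuse=True)   # second call reuses them
print(len(tl.layers.get_variables_with_name('D')))  # 2: one kernel + one bias, shared by both calls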
Step 5: build the model losses and the train ops:
    Step 1: build the losses, d_loss and g_loss:
        Step 1: build d_loss = d_loss_1 + d_loss_2:
            Step 1: d_loss_1, the discrimination loss on real images: tl.cost.sigmoid_cross_entropy(logits_real, tf.ones_like(logits_real))
            Step 2: d_loss_2, the discrimination loss on generated images: tl.cost.sigmoid_cross_entropy(logits_fake, tf.zeros_like(logits_fake))
        Step 2: build g_loss = g_gan_loss + mse_loss + vgg_loss:
            Step 1: g_gan_loss, pushing generated images to be judged real by the discriminator: 1e-3 * tl.cost.sigmoid_cross_entropy(logits_fake, tf.ones_like(logits_fake))
            Step 2: mse_loss, the pixel-wise difference between target and generated images: tl.cost.mean_squared_error(t_target_image, net_g.outputs)
            Step 3: vgg_loss, the MSE between the conv5 outputs vgg_target_emb.outputs and vgg_pred_emb.outputs
    Step 2: build the train ops: g_optim_init for pretraining, plus g_optim and d_optim:
        Step 1: g_var = tl.layers.get_variables_with_name('SRGAN_g') fetches the generator's variables
        Step 2: d_var = tl.layers.get_variables_with_name('SRGAN_d') fetches the discriminator's variables
        Step 3: inside with tf.variable_scope('learning_rate'):, define lr_v = tf.Variable(lr_init) (a variable so it can be decayed later; see the sketch after this step)
        Step 4: define the train ops g_optim_init, g_optim and d_optim:
            Step 1: g_optim_init = tf.train.AdamOptimizer(lr_v, beta1=beta1).minimize(mse_loss, var_list=g_var)
            Step 2: g_optim = tf.train.AdamOptimizer(lr_v, beta1=beta1).minimize(g_loss, var_list=g_var)
            Step 3: d_optim = tf.train.AdamOptimizer(lr_v, beta1=beta1).minimize(d_loss, var_list=d_var)
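Why lr_v is a tf.Variable rather than a plain Python float: it can then be shrunk in-session with tf.assign during the GAN stage, which is exactly what main.py does every decay_every epochs. A minimal sketch with a stand-in loss:

import tensorflow as tf

w = tf.Variable(1.0)
loss = tf.square(w - 3.0)  # stand-in for mse_loss / g_loss / d_loss

with tf.variable_scope('learning_rate'):
    lr_v = tf.Variable(1e-4, trainable=False)  # not trainable: updated by tf.assign, not by gradients

optim = tf.train.AdamOptimizer(lr_v, beta1=0.9).minimize(loss, var_list=[w])

sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(optim)                               # one normal training step
sess.run(tf.assign(lr_v, lr_v * 0.1))         # decay step, as done every decay_every epochs
print(sess.run(lr_v))                         # 1e-05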
Step 6: restore trained parameters into the session with tl.files.load_and_assign_npz():
    Step 1: create the session with tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=False))
    Step 2: initialize all variables with tl.layers.initialize_global_variables(sess)
    Step 3: use tl.files.load_and_assign_npz to load the g_{} checkpoint into net_g; if it does not exist, fall back to the g_init_{} checkpoint
    Step 4: use tl.files.load_and_assign_npz to load the d_{} checkpoint into net_d
Step 7: load the pretrained VGG19 weights and apply them to net_vgg:
    Step 1: load the parameter dict with np.load(path, encoding='latin1').item()
    Step 2: loop over sorted(npz.items()) and append each layer's weights and biases to params (see the sketch after this step)
    Step 3: use tl.files.assign_params(sess, params, net_vgg) to assign the parameters to net_vgg
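vgg19.npy is a pickled dict mapping layer names to [weights, biases] pairs; iterating in sorted name order (conv1_1 ... conv5_4, fc6, fc7, fc8) matches the layer order inside Vgg_19_simple_api, which is what tl.files.assign_params relies on. A sketch of the loading loop, assuming vgg19.npy is present (allow_pickle=True is needed on newer NumPy):

import numpy as np
import tensorlayer as tl

npz = np.load('vgg19.npy', encoding='latin1', allow_pickle=True).item()  # dict: name -> [W, b]
params = []
for name, (W, b) in sorted(npz.items()):
    print(name, np.asarray(W).shape, np.asarray(b).shape)  # e.g. conv1_1 (3, 3, 3, 64) (64,)
    params.extend([np.asarray(W), np.asarray(b)])
# tl.files.assign_params(sess, params, net_vgg)  # assigns in order to net_vgg.all_params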
Step 8: run the training:
    Step 1: pick one batch_size worth of images as a fixed sample set for monitoring:
        Step 1: use tl.prepro.threading_data with fn=crop_sub_imgs_fn to crop them to 384 x 384
        Step 2: use tl.prepro.threading_data with fn=downsample_fn to shrink them to 96 x 96 with imresize
    Step 2: run the pretraining:
        Step 1: in each iteration, take one batch and build the high-/low-resolution pair with crop_sub_imgs_fn and downsample_fn
        Step 2: run g_optim_init via sess.run to pretrain the generator
    Step 3: run the adversarial training:
        Step 1: in each iteration, take one batch and build the high-/low-resolution pair with crop_sub_imgs_fn and downsample_fn
        Step 2: run g_optim and d_optim via sess.run to train both networks (a batch-construction sketch follows)
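What one training batch looks like after the two threading_data calls, using random stand-in images (crop_sub_imgs_fn and downsample_fn are the utils.py helpers listed at the end of this post):

import numpy as np
import tensorlayer as tl
from utils import crop_sub_imgs_fn, downsample_fn  # helpers from utils.py below

imgs = [np.random.randint(0, 256, (400, 500, 3), dtype=np.uint8) for _ in range(4)]  # stand-in HR images
b_384 = tl.prepro.threading_data(imgs, fn=crop_sub_imgs_fn, is_random=True)  # random 384 x 384 crops in [-1, 1]
b_96 = tl.prepro.threading_data(b_384, fn=downsample_fn)                     # matching 96 x 96 LR inputs
print(b_384.shape, b_96.shape)  # (4, 384, 384, 3) (4, 96, 96, 3)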
Step 9: the evaluate stage, testing on images:
    Step 1: create the folder for the output images with tl.files.exists_or_mkdir
    Step 2: read the images in with tl.files.load_file_list and tl.vis.read_images
    Step 3: pick one image by index and normalize it with / 127.5 - 1
    Step 4: build the input t_image = tf.placeholder('float32', [1, None, None, 3]) (height and width left as None; see the sketch after this step)
    Step 5: build net_g with SRGAN_g(t_image, False, False)
    Step 6: create the session with tf.Session() and load the trained weights with tl.files.load_and_assign_npz(..., network=net_g)
    Step 7: obtain the generated image with sess.run([net_g.outputs], feed_dict={t_image: [valid_lr_img]})
    Step 8: save the image with tl.vis.save_images(output[0], ...)
    Step 9: use scipy.misc.imresize() to upscale the low-resolution image 4x with bicubic interpolation, as a baseline to compare against the reconstruction
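The placeholder can leave height and width as None because SRGAN_g is fully convolutional (only convolutions and subpixel upsampling), so it accepts any input size and the output is simply 4x larger. A shape-check sketch with random weights, assuming model.py is importable:

import numpy as np
import tensorflow as tf
from model import SRGAN_g  # model.py below

t_image = tf.placeholder('float32', [1, None, None, 3])
net_g = SRGAN_g(t_image, False, False)

sess = tf.Session()
sess.run(tf.global_variables_initializer())  # random weights: shape check only, no checkpoint
lr_img = np.random.uniform(-1, 1, (1, 120, 80, 3)).astype('float32')
out = sess.run(net_g.outputs, feed_dict={t_image: lr_img})
print(out.shape)  # (1, 480, 320, 3) -> 4x in both dimensions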
Code: main.py, the main program

import tensorlayer as tl
import tensorflow as tf
import numpy as np
from config import config
from model import *
from utils import *  # crop_sub_imgs_fn, downsample_fn
import os
import time
import scipy.misc

## read in the hyperparameters
batch_size = config.TRAIN.batch_size
lr_init = config.TRAIN.lr_init
beta1 = config.TRAIN.beta1
### initialize G
n_epoch_init = config.TRAIN.n_epoch_init
### adversarial learning
n_epoch = config.TRAIN.n_epoch
lr_decay = config.TRAIN.lr_decay
decay_every = config.TRAIN.decay_every
ni = int(np.sqrt(batch_size))


def train():
    # create the folders used for saving sample images
    save_dir_ginit = 'sample/{}_ginit'.format(tl.global_flag['mode'])
    save_dir_gan = 'sample/{}_gan'.format(tl.global_flag['mode'])
    tl.files.exists_or_mkdir(save_dir_ginit)
    tl.files.exists_or_mkdir(save_dir_gan)
    checkpoint = 'checkpoint'
    tl.files.exists_or_mkdir(checkpoint)

    train_hr_img_list = sorted(tl.files.load_file_list(path=config.TRAIN.hr_img_path, regx='.*.png', printable=False))
    train_lr_img_list = sorted(tl.files.load_file_list(path=config.TRAIN.lr_img_path, regx='.*.png', printable=False))
    train_hr_img = tl.vis.read_images(train_hr_img_list, path=config.TRAIN.hr_img_path, n_threads=8)
    train_lr_img = tl.vis.read_images(train_lr_img_list, path=config.TRAIN.lr_img_path, n_threads=8)

    # build the inputs
    t_image = tf.placeholder('float32', [batch_size, 96, 96, 3])
    t_target_image = tf.placeholder('float32', [batch_size, 384, 384, 3])
    # build the generator; net_g is its output network
    net_g = SRGAN_g(t_image, True, False)
    # build the discriminator over t_target_image and net_g.outputs; net_d is the whole network
    net_d, logits_real = SRGAN_d(t_target_image, True, False)
    _, logits_fake = SRGAN_d(net_g.outputs, True, True)

    net_g.print_params(False)
    net_g.print_layers()
    net_d.print_params(False)
    net_d.print_layers()

    # resize the inputs to 224 x 224 so they can be fed into VGG19
    target_image_224 = tf.image.resize_images(t_target_image, [224, 224], method=0, align_corners=False)
    pred_image_224 = tf.image.resize_images(net_g.outputs, [224, 224], method=0, align_corners=False)
    # VGG19 expects inputs in [0, 1], hence (x + 1) / 2
    net_vgg, vgg_target_emb = Vgg_19_simple_api((target_image_224 + 1) / 2, reuse=False)
    _, vgg_pred_emb = Vgg_19_simple_api((pred_image_224 + 1) / 2, reuse=True)
    # network used for testing during training
    net_g_test = SRGAN_g(t_image, False, True)

    #### ========== DEFINE TRAIN OPS ==========####
    d_loss_1 = tl.cost.sigmoid_cross_entropy(logits_real, tf.ones_like(logits_real))
    d_loss_2 = tl.cost.sigmoid_cross_entropy(logits_fake, tf.zeros_like(logits_fake))
    d_loss = d_loss_1 + d_loss_2
    g_gan_loss = 1e-3 * tl.cost.sigmoid_cross_entropy(logits_fake, tf.ones_like(logits_fake))
    mse_loss = tl.cost.mean_squared_error(net_g.outputs, t_target_image, is_mean=True)
    vgg_loss = 2e-6 * tl.cost.mean_squared_error(vgg_target_emb.outputs, vgg_pred_emb.outputs, is_mean=True)
    g_loss = g_gan_loss + mse_loss + vgg_loss

    g_var = tl.layers.get_variables_with_name('SRGAN_g', True, True)
    d_var = tl.layers.get_variables_with_name('SRGAN_d', True, True)
    with tf.variable_scope('learning_rate'):
        lr_v = tf.Variable(lr_init, trainable=False)
    g_optim_init = tf.train.AdamOptimizer(lr_v, beta1=beta1).minimize(mse_loss, var_list=g_var)
    g_optim = tf.train.AdamOptimizer(lr_v, beta1=beta1).minimize(g_loss, var_list=g_var)
    d_optim = tf.train.AdamOptimizer(lr_v, beta1=beta1).minimize(d_loss, var_list=d_var)

    ###====================== RESTORE MODEL SESS ==================###
    sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=False))
    tl.layers.initialize_global_variables(sess)
    if tl.files.load_and_assign_npz(sess, checkpoint + '/g_{}.npz'.format(tl.global_flag['mode']), network=net_g) is False:
        tl.files.load_and_assign_npz(sess, checkpoint + '/g_init_{}.npz'.format(tl.global_flag['mode']), network=net_g)
    tl.files.load_and_assign_npz(sess, checkpoint + '/d_{}.npz'.format(tl.global_flag['mode']), network=net_d)

    ### ================== load vgg params =================== ###
    vgg_npy_path = 'vgg19.npy'
    if not os.path.isfile(vgg_npy_path):
        print('Please download vgg19.npy from : https://github.com/machrisaa/tensorflow-vgg')
        exit()
    npz = np.load(vgg_npy_path, encoding='latin1').item()
    params = []
    for var in sorted(npz.items()):
        W = np.asarray(var[1][0])
        b = np.asarray(var[1][1])
        params.extend([W, b])
    tl.files.assign_params(sess, params, net_vgg)
    print('ok')

    ###======================== TRAIN =======================###
    sample_imgs = train_hr_img[0:batch_size]
    # crop to a fixed 384 x 384 (not random, so the samples stay comparable across epochs)
    sample_imgs_384 = tl.prepro.threading_data(sample_imgs, fn=crop_sub_imgs_fn, is_random=False)
    # downsample to 96 x 96
    sample_imgs_96 = tl.prepro.threading_data(sample_imgs_384, fn=downsample_fn)
    # save the sample images
    tl.vis.save_images(sample_imgs_96, [ni, ni], save_dir_ginit + '/_train_sample_96.png')
    tl.vis.save_images(sample_imgs_384, [ni, ni], save_dir_ginit + '/_train_sample_384.png')
    tl.vis.save_images(sample_imgs_96, [ni, ni], save_dir_gan + '/_train_sample_96.png')
    tl.vis.save_images(sample_imgs_384, [ni, ni], save_dir_gan + '/_train_sample_384.png')

    ###======================== initial train G =====================###
    for epoch in range(n_epoch_init):
        n_iter = 0
        init_loss_total = 0
        for idx in range(0, len(train_hr_img), batch_size):
            b_img_384 = tl.prepro.threading_data(train_hr_img[idx:idx + batch_size], fn=crop_sub_imgs_fn, is_random=True)
            b_img_96 = tl.prepro.threading_data(b_img_384, fn=downsample_fn)
            _, MSE_LOSS = sess.run([g_optim_init, mse_loss], feed_dict={t_image: b_img_96, t_target_image: b_img_384})
            init_loss_total += MSE_LOSS
            n_iter += 1
        print('[g_init] epoch %d: mse %f' % (epoch, init_loss_total / n_iter))
        if (epoch != 0) and (epoch % 10 == 0):
            out = sess.run(net_g_test.outputs, feed_dict={t_image: sample_imgs_96})
            print('[*] save image')
            tl.vis.save_images(out, [ni, ni], save_dir_ginit + '/train_%d.png' % epoch)
        if (epoch != 0) and (epoch % 10 == 0):
            tl.files.save_npz(net_g.all_params, name=checkpoint + '/g_init_{}.npz'.format(tl.global_flag['mode']))

    ### ======================== train GAN ================== ###
    for epoch in range(0, n_epoch + 1):
        if epoch != 0 and epoch % decay_every == 0:
            new_lr = lr_decay ** (epoch // decay_every)
            sess.run(tf.assign(lr_v, lr_init * new_lr))
            log = '** new learning rate: %f (for GAN)' % (lr_init * new_lr)
            print(log)
        elif epoch == 0:
            sess.run(tf.assign(lr_v, lr_init))
            log = '** init lr: %f decay_every: %d, lr_decay: %f (for GAN)' % (lr_init, decay_every, lr_decay)
            print(log)
        epoch_time = time.time()
        total_d_loss, total_g_loss, n_iter = 0, 0, 0
        for idx in range(0, len(train_hr_img), batch_size):
            b_img_384 = tl.prepro.threading_data(train_hr_img[idx:idx + batch_size], fn=crop_sub_imgs_fn, is_random=True)
            b_img_96 = tl.prepro.threading_data(b_img_384, fn=downsample_fn)
            # update D
            _, errD = sess.run([d_optim, d_loss], feed_dict={t_image: b_img_96, t_target_image: b_img_384})
            # update G
            _, errG, errM, errV, errA = sess.run([g_optim, g_loss, mse_loss, vgg_loss, g_gan_loss],
                                                 feed_dict={t_image: b_img_96, t_target_image: b_img_384})
            total_d_loss += errD
            total_g_loss += errG
            n_iter += 1
        print('[GAN] epoch %d took %fs: d_loss %f g_loss %f' % (epoch, time.time() - epoch_time,
                                                                total_d_loss / n_iter, total_g_loss / n_iter))
        if epoch != 0 and epoch % 10 == 0:
            out = sess.run(net_g_test.outputs, feed_dict={t_image: sample_imgs_96})
            print('[*] save image')
            tl.vis.save_images(out, [ni, ni], save_dir_gan + '/train_%d.png' % epoch)
        if epoch != 0 and epoch % 10 == 0:
            tl.files.save_npz(net_g.all_params, name=checkpoint + '/g_{}.npz'.format(tl.global_flag['mode']))
            tl.files.save_npz(net_d.all_params, name=checkpoint + '/d_{}.npz'.format(tl.global_flag['mode']))


def evaluate():
    save_dir = 'sample/{}'.format(tl.global_flag['mode'])
    tl.files.exists_or_mkdir(save_dir)
    checkpoint = 'checkpoint'  # must match the folder used in train()

    evaluate_hr_img_list = sorted(tl.files.load_file_list(config.VALID.hr_img_path, regx='.*.png', printable=False))
    evaluate_lr_img_list = sorted(tl.files.load_file_list(config.VALID.lr_img_path, regx='.*.png', printable=False))
    valid_lr_imgs = tl.vis.read_images(evaluate_lr_img_list, path=config.VALID.lr_img_path, n_threads=8)
    valid_hr_imgs = tl.vis.read_images(evaluate_hr_img_list, path=config.VALID.hr_img_path, n_threads=8)

    ### ==================== DEFINE MODEL =================###
    imid = 64
    valid_lr_img = valid_lr_imgs[imid]
    valid_hr_img = valid_hr_imgs[imid]
    valid_lr_img = (valid_lr_img / 127.5) - 1  # normalize to [-1, 1]

    t_image = tf.placeholder('float32', [1, None, None, 3])
    net_g = SRGAN_g(t_image, False, False)

    sess = tf.Session()
    tl.files.load_and_assign_npz(sess, checkpoint + '/g_{}.npz'.format(tl.global_flag['mode']), network=net_g)

    output = sess.run(net_g.outputs, feed_dict={t_image: [valid_lr_img]})
    # single images, so save_image instead of save_images
    tl.vis.save_image(output[0], save_dir + '/valid_gen.png')
    tl.vis.save_image(valid_lr_img, save_dir + '/valid_lr.png')
    tl.vis.save_image(valid_hr_img, save_dir + '/valid_hr.png')
    size = valid_lr_img.shape
    # bicubic 4x upscaling as a baseline for comparison
    out_bicu = scipy.misc.imresize(valid_lr_img, [size[0] * 4, size[1] * 4], interp='bicubic', mode=None)
    tl.vis.save_image(out_bicu, save_dir + '/valid_out_bicu.png')


if __name__ == '__main__':
    import argparse
    parse = argparse.ArgumentParser()
    parse.add_argument('--mode', type=str, default='srgan', help='srgan / evaluate')
    args = parse.parse_args()
    tl.global_flag['mode'] = args.mode
    if tl.global_flag['mode'] == 'srgan':
        train()
    elif tl.global_flag['mode'] == 'evaluate':
        evaluate()
model.py: building the models

import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import *
import time


def SRGAN_g(input_image, is_train, reuse):
    w_init = tf.random_normal_initializer(stddev=0.02)
    b_init = None
    g_init = tf.random_normal_initializer(1.0, 0.02)
    with tf.variable_scope('SRGAN_g', reuse=reuse):
        n = InputLayer(input_image, name='in')
        n = Conv2d(n, 64, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', W_init=w_init, name='n64s1/c')
        temp = n
        # 16 residual blocks: conv -> BN(ReLU) -> conv -> BN plus an identity shortcut
        for i in range(16):
            nn = Conv2d(n, 64, (3, 3), (1, 1), act=None, padding='SAME', W_init=w_init, name='n64s1/c1/%d' % i)
            nn = BatchNormLayer(nn, act=tf.nn.relu, is_train=is_train, gamma_init=g_init, name='n64s1/b1/%d' % i)
            nn = Conv2d(nn, 64, (3, 3), (1, 1), act=None, padding='SAME', W_init=w_init, name='n64s1/c2/%d' % i)
            nn = BatchNormLayer(nn, act=None, is_train=is_train, gamma_init=g_init, name='n64s1/b2/%d' % i)
            nn = ElementwiseLayer([n, nn], tf.add, name='b_residual_add_%d' % i)
            n = nn
        # one more skip connection across all 16 blocks
        n = Conv2d(n, 64, (3, 3), (1, 1), act=None, padding='SAME', W_init=w_init, name='n64s1/c3')
        n = BatchNormLayer(n, act=None, is_train=is_train, gamma_init=g_init, name='n64s1/b3')
        n = ElementwiseLayer([temp, n], tf.add, name='add3')
        # 4x upsampling with two subpixel (pixel-shuffle) convolutions
        n = Conv2d(n, 256, (3, 3), (1, 1), act=None, padding='SAME', W_init=w_init, name='n64s1/c4')
        n = SubpixelConv2d(n, scale=2, n_out_channel=None, act=tf.nn.relu, name='pixelshuffler2/1')
        n = Conv2d(n, 256, (3, 3), (1, 1), act=None, padding='SAME', W_init=w_init, name='n64s1/c5')
        n = SubpixelConv2d(n, scale=2, n_out_channel=None, act=tf.nn.relu, name='pixelshuffler2/2')
        # tanh output to match the [-1, 1] preprocessing of the images
        n = Conv2d(n, 3, (1, 1), (1, 1), act=tf.nn.tanh, padding='SAME', W_init=w_init, name='out')
        return n


def SRGAN_d(input_image, is_training=True, reuse=False):
    w_init = tf.random_normal_initializer(stddev=0.02)
    b_init = None
    g_init = tf.random_normal_initializer(1.0, 0.02)
    lrelu = lambda x: tl.act.lrelu(x, 0.2)
    df_dim = 64
    with tf.variable_scope('SRGAN_d', reuse=reuse):
        tl.layers.set_name_reuse(reuse)
        net_in = InputLayer(input_image, name='input/image')
        # strided convolutions with increasing feature-map counts: 64 -> 2048
        net_h0 = Conv2d(net_in, df_dim, (4, 4), (2, 2), act=lrelu, padding='SAME', W_init=w_init, name='h0/c')
        net_h1 = Conv2d(net_h0, df_dim * 2, (4, 4), (2, 2), act=None, padding='SAME', W_init=w_init, name='h1/c')
        net_h1 = BatchNormLayer(net_h1, act=lrelu, is_train=is_training, gamma_init=g_init, name='h1/bn')
        net_h2 = Conv2d(net_h1, df_dim * 4, (4, 4), (2, 2), act=None, padding='SAME', W_init=w_init, name='h2/c')
        net_h2 = BatchNormLayer(net_h2, act=lrelu, is_train=is_training, gamma_init=g_init, name='h2/bn')
        net_h3 = Conv2d(net_h2, df_dim * 8, (4, 4), (2, 2), act=None, padding='SAME', W_init=w_init, name='h3/c')
        net_h3 = BatchNormLayer(net_h3, act=lrelu, is_train=is_training, gamma_init=g_init, name='h3/bn')
        net_h4 = Conv2d(net_h3, df_dim * 16, (4, 4), (2, 2), act=None, padding='SAME', W_init=w_init, name='h4/c')
        net_h4 = BatchNormLayer(net_h4, act=lrelu, is_train=is_training, gamma_init=g_init, name='h4/bn')
        net_h5 = Conv2d(net_h4, df_dim * 32, (4, 4), (2, 2), act=None, padding='SAME', W_init=w_init, name='h5/c')
        net_h5 = BatchNormLayer(net_h5, act=lrelu, is_train=is_training, gamma_init=g_init, name='h5/bn')
        net_h6 = Conv2d(net_h5, df_dim * 16, (1, 1), (1, 1), act=None, padding='SAME', W_init=w_init, name='h6/c')
        net_h6 = BatchNormLayer(net_h6, act=lrelu, is_train=is_training, gamma_init=g_init, name='h6/bn')
        net_h7 = Conv2d(net_h6, df_dim * 8, (1, 1), (1, 1), act=None, padding='SAME', W_init=w_init, name='h7/c')
        net_h7 = BatchNormLayer(net_h7, act=lrelu, is_train=is_training, gamma_init=g_init, name='h7/bn')
        # small residual branch before the dense head
        net = Conv2d(net_h7, df_dim * 2, (1, 1), (1, 1), act=None, padding='SAME', W_init=w_init, name='res/c')
        net = BatchNormLayer(net, act=lrelu, is_train=is_training, gamma_init=g_init, name='res/bn')
        net = Conv2d(net, df_dim * 2, (3, 3), (1, 1), act=None, padding='SAME', W_init=w_init, name='res/c2')
        net = BatchNormLayer(net, act=lrelu, is_train=is_training, gamma_init=g_init, name='res/bn2')
        net = Conv2d(net, df_dim * 8, (3, 3), (1, 1), act=None, padding='SAME', W_init=w_init, name='res/c3')
        net = BatchNormLayer(net, act=lrelu, is_train=is_training, gamma_init=g_init, name='res/bn3')
        net_h8 = ElementwiseLayer([net_h7, net], tf.add, name='res/add')
        net_h8.outputs = tl.act.lrelu(net_h8.outputs, 0.2)
        net_ho = FlattenLayer(net_h8, name='ho/flatten')
        net_ho = DenseLayer(net_ho, n_units=1, act=tf.identity, W_init=w_init, name='ho/dense')
        logits = net_ho.outputs
        net_ho.outputs = tf.nn.sigmoid(net_ho.outputs)
        return net_ho, logits


def Vgg_19_simple_api(input_image, reuse):
    VGG_MEAN = [103.939, 116.779, 123.68]
    # convert the input RGB image (in [0, 1]) to mean-subtracted BGR
    with tf.variable_scope('VGG19', reuse=reuse) as vs:
        start_time = time.time()
        print('build the model')
        input_image = input_image * 255
        red, green, blue = tf.split(input_image, 3, 3)
        assert red.get_shape().as_list()[1:] == [224, 224, 1]
        assert green.get_shape().as_list()[1:] == [224, 224, 1]
        assert blue.get_shape().as_list()[1:] == [224, 224, 1]
        bgr = tf.concat([blue - VGG_MEAN[0], green - VGG_MEAN[1], red - VGG_MEAN[2]], axis=3)
        assert bgr.get_shape().as_list()[1:] == [224, 224, 3]
        net_in = InputLayer(bgr, name='input')
        # build the network
        """conv1"""
        network = Conv2d(net_in, 64, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv1_1')
        network = Conv2d(network, 64, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv1_2')
        network = MaxPool2d(network, (2, 2), (2, 2), padding='SAME', name='pool1')
        """conv2"""
        network = Conv2d(network, 128, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv2_1')
        network = Conv2d(network, 128, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv2_2')
        network = MaxPool2d(network, (2, 2), (2, 2), padding='SAME', name='pool2')
        """conv3"""
        network = Conv2d(network, 256, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv3_1')
        network = Conv2d(network, 256, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv3_2')
        network = Conv2d(network, 256, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv3_3')
        network = Conv2d(network, 256, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv3_4')
        network = MaxPool2d(network, (2, 2), (2, 2), padding='SAME', name='pool3')
        """conv4"""
        network = Conv2d(network, 512, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv4_1')
        network = Conv2d(network, 512, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv4_2')
        network = Conv2d(network, 512, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv4_3')
        network = Conv2d(network, 512, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv4_4')
        network = MaxPool2d(network, (2, 2), (2, 2), padding='SAME', name='pool4')
        """conv5"""
        network = Conv2d(network, 512, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv5_1')
        network = Conv2d(network, 512, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv5_2')
        network = Conv2d(network, 512, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv5_3')
        network = Conv2d(network, 512, (3, 3), (1, 1), act=tf.nn.relu, padding='SAME', name='conv5_4')
        network = MaxPool2d(network, (2, 2), (2, 2), padding='SAME', name='pool5')
        conv = network  # conv5 features (after pool5), used for the perceptual loss
        """fc6-8"""
        network = FlattenLayer(network, name='flatten')
        network = DenseLayer(network, n_units=4096, act=tf.nn.relu, name='fc6')
        network = DenseLayer(network, n_units=4096, act=tf.nn.relu, name='fc7')
        network = DenseLayer(network, n_units=1000, act=tf.identity, name='fc8')
        print('finish the build %fs' % (time.time() - start_time))
        return network, conv
config.py: the parameter file

from easydict import EasyDict as edict
import json

config = edict()
config.TRAIN = edict()

# Adam
config.TRAIN.batch_size = 1
config.TRAIN.lr_init = 1e-4
config.TRAIN.beta1 = 0.9
### initialize G
config.TRAIN.n_epoch_init = 100
### adversarial learning
config.TRAIN.n_epoch = 2000
config.TRAIN.lr_decay = 0.1
config.TRAIN.decay_every = int(config.TRAIN.n_epoch / 2)

## train set location
config.TRAIN.hr_img_path = r'C:\Users\qq302\Desktop\srdata\DIV2K_train_HR'
config.TRAIN.lr_img_path = r'C:\Users\qq302\Desktop\srdata\DIV2K_train_LR_bicubic\X4'

## valid set location
config.VALID = edict()
config.VALID.hr_img_path = r'C:\Users\qq302\Desktop\srdata\DIV2K_valid_HR'
config.VALID.lr_img_path = r'C:\Users\qq302\Desktop\srdata\DIV2K_valid_LR_bicubic\X4'
utils.py: helper functions

from tensorlayer.prepro import *


def crop_sub_imgs_fn(img, is_random=True):
    # random (or fixed) 384 x 384 crop, then normalize to [-1, 1]
    x = crop(img, wrg=384, hrg=384, is_random=is_random)
    x = x / 127.5 - 1
    return x


def downsample_fn(img):
    # imresize bytescales its float input back to uint8 [0, 255],
    # so the result must be normalized to [-1, 1] again afterwards
    x = imresize(img, [96, 96], interp='bicubic', mode=None)
    x = x / 127.5 - 1
    return x
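About the issue flagged in downsample_fn: its input was already normalized to [-1, 1] by crop_sub_imgs_fn, yet it divides by 127.5 again. This still works because scipy.misc.imresize bytescales any float input back to uint8 in [0, 255] before resizing, which is exactly why the second normalization is needed. A quick demonstration:

import numpy as np
from scipy.misc import imresize  # removed in SciPy >= 1.2

x = np.random.uniform(-1, 1, (384, 384, 3))   # image already normalized to [-1, 1]
y = imresize(x, [96, 96], interp='bicubic')   # bytescale: output is uint8 in [0, 255]
print(y.dtype, y.min(), y.max())              # uint8 0 255
y = y / 127.5 - 1                             # hence the second [-1, 1] normalization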