An Introduction to Generative Adversarial Networks (GANs)


Reposted from: https://wiki.pathmind.com/generative-adversarial-network-gan

Reposted from: https://wiki.pathmind.com/

Reposted from: https://zhuanlan.zhihu.com/p/42606381

Reposted from: https://zhuanlan.zhihu.com/p/33752313 (a plain-language explanation of GANs)

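Translated by the author from the original article, A Beginner's Guide to Generative Adversarial Networks (GANs), which is good introductory material on GANs. If you still don't understand GANs after reading it, feel free to bite me, hahaha.

Generative adversarial networks (GANs) are deep neural network architectures made up of two networks, pitting one against the other (hence the "adversarial").

GANs were introduced in 2014 by Ian Goodfellow and other researchers at the University of Montreal, including Yoshua Bengio. Referring to GANs, Yann LeCun called adversarial training "the most interesting idea in the last 10 years in ML."

GANs' potential is huge, because they can learn to mimic any distribution of data. That is, GANs can be taught to create worlds eerily similar to our own in any domain: images, music, speech, prose. They are robot artists in a sense, and their output is impressive, even poignant.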

In a surreal turn, Christie’s sold a portrait for $432,000 that had been generated by a GAN, based on open-source code written by Robbie Barrat of Stanford. Like most true artists, he didn’t see any of the money, which instead went to the French company, Obvious.

In 2019, DeepMind showed that variational autoencoders (VAEs) could outperform GANs on face generation.

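Generative vs. Discriminative Algorithms

To understand GANs, you should know how generative algorithms work, and for that, contrasting them with discriminative algorithms is instructive. Discriminative algorithms try to classify input data; that is, given the features of an instance of data, they predict a label or category to which that data belongs.

For example, given all the words in an email, a discriminative algorithm could predict whether the message is spam or not spam. Spam is one of the labels, and the bag of words gathered from the email makes up the features of the input data. When this problem is expressed mathematically, the label is called y and the features are called x. The formulation p(y|x) means "the probability of y given x", which in this case translates to "the probability that an email is spam given the words it contains."

So discriminative algorithms map features to labels. They are concerned solely with that correlation. One way to think about generative algorithms is that they do the opposite. Instead of predicting a label given certain features, they attempt to predict features given a certain label.

The question a generative algorithm tries to answer is: assuming this email is spam, how likely are these features? While discriminative models care about the relation between y and x, generative models care about "how you get x." They allow you to capture p(x|y), the probability of x given y, or the probability of features given a class. (That said, generative algorithms can also be used as classifiers. It just so happens that they can do more than categorize input data.)

Another way to distinguish generative from discriminative models:

  • Discriminative models learn the boundary between classes
  • Generative models model the distribution of individual classes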
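
To make the difference between p(x|y) and p(y|x) concrete, here is a minimal sketch of a generative classifier in the naive Bayes style. The four-email corpus, the labels and the helper names are all made up for illustration. The model first estimates p(word|label), the generative direction, and only then applies Bayes' rule to answer the discriminative question.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (words in the email, label).
TRAIN = [
    (["win", "money", "now"], "spam"),
    (["cheap", "money", "offer"], "spam"),
    (["meeting", "at", "noon"], "ham"),
    (["lunch", "money", "later"], "ham"),
]
VOCAB = sorted({w for words, _ in TRAIN for w in words})
LABELS = ["spam", "ham"]

def fit(train):
    """Estimate p(y) and p(word | y) with add-one smoothing."""
    label_counts = Counter(label for _, label in train)
    word_counts = {y: Counter() for y in LABELS}
    for words, label in train:
        word_counts[label].update(words)
    prior = {y: label_counts[y] / len(train) for y in LABELS}
    likelihood = {}
    for y in LABELS:
        total = sum(word_counts[y].values()) + len(VOCAB)
        likelihood[y] = {w: (word_counts[y][w] + 1) / total for w in VOCAB}
    return prior, likelihood

def p_x_given_y(words, y, likelihood):
    """Generative direction: p(x | y), treating words as independent."""
    return math.prod(likelihood[y][w] for w in words if w in likelihood[y])

def p_y_given_x(words, prior, likelihood):
    """Discriminative question, answered through Bayes' rule: p(y | x)."""
    joint = {y: prior[y] * p_x_given_y(words, y, likelihood) for y in LABELS}
    evidence = sum(joint.values())
    return {y: joint[y] / evidence for y in LABELS}

prior, likelihood = fit(TRAIN)
posterior = p_y_given_x(["cheap", "money"], prior, likelihood)
print(posterior)
```

A purely discriminative model would learn p(y|x) directly and never build the per-class word distributions.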

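How do GANs work?

One neural network, called the generator, generates new data instances, while the other, the discriminator, evaluates them for authenticity; that is, the discriminator decides whether each instance of data it reviews belongs to the actual training dataset or not.

Let's say we're trying to do something more banal than mimic the Mona Lisa. We're going to generate hand-written numerals like those found in the MNIST dataset, which is taken from the real world. The goal of the discriminator, when shown an instance from the true MNIST dataset, is to recognize it as authentic.

Meanwhile, the generator is creating new images that it passes to the discriminator. It does so in the hope that they, too, will be deemed authentic, even though they are fake. The goal of the generator is to generate passable hand-written digits, fooling that gullible discriminator without being caught. The goal of the discriminator is to identify images coming from the generator as fake.

Here are the steps a GAN takes (a bit like sparring with yourself, left hand against right):

  • The generator takes in random numbers and returns an image
  • This generated image is fed into the discriminator alongside a stream of images taken from the actual dataset
  • The discriminator takes in both real and fake images and returns probabilities, a number between 0 and 1, with 1 representing a prediction of authenticity and 0 representing fake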
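
The three steps above can be sketched with two tiny, untrained stand-in networks. Everything here is an assumption for illustration: the sizes, the random weights and the 3x3 "image" are hypothetical, and a real GAN would use trained deep networks on 28x28 MNIST digits.

```python
import math
import random

random.seed(0)

NOISE_DIM = 4    # assumed size of the random input vector
IMAGE_SIZE = 9   # a tiny 3x3 "image", flattened; MNIST would be 28 * 28

def random_matrix(rows, cols):
    return [[random.gauss(0, 0.5) for _ in range(cols)] for _ in range(rows)]

G_WEIGHTS = random_matrix(IMAGE_SIZE, NOISE_DIM)               # untrained generator
D_WEIGHTS = [random.gauss(0, 0.5) for _ in range(IMAGE_SIZE)]  # untrained discriminator

def generator(noise):
    """Step 1: take random numbers, return an image with pixels in [-1, 1]."""
    return [math.tanh(sum(w * z for w, z in zip(row, noise))) for row in G_WEIGHTS]

def discriminator(image):
    """Step 3: return a probability in (0, 1); 1 means real, 0 means fake."""
    score = sum(w * p for w, p in zip(D_WEIGHTS, image))
    return 1.0 / (1.0 + math.exp(-score))

noise = [random.gauss(0, 1) for _ in range(NOISE_DIM)]
fake_image = generator(noise)
# Stand-in for a sample drawn from the real dataset.
real_image = [random.choice([-1.0, 1.0]) for _ in range(IMAGE_SIZE)]

# Step 2: the fake image goes to the discriminator alongside real ones.
batch = [(real_image, 1), (fake_image, 0)]
predictions = [discriminator(image) for image, _ in batch]
print(predictions)
```

Because nothing has been trained yet, the two probabilities carry no meaning; training is what pushes the first towards 1 and the second towards 0.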

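So you have a double feedback loop:

  • The discriminator is in a feedback loop with the ground truth of the images
  • The generator is in a feedback loop with the discriminator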
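
The two loops are usually written as a single minimax objective. The formula below comes from the original GAN paper by Goodfellow et al. (2014) rather than from this article, so the symbols are assumptions here: D(x) is the discriminator's probability that x is real, G(z) is the image the generator produces from noise z, p_data is the distribution of real images and p_z is the noise distribution.

```latex
\min_G \max_D V(D, G) =
    \mathbb{E}_{x \sim p_{\text{data}}(x)} \big[ \log D(x) \big]
  + \mathbb{E}_{z \sim p_z(z)} \big[ \log \big( 1 - D(G(z)) \big) \big]
```

The first term is the discriminator's loop with the ground truth; the second term is where the generator and the discriminator meet.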

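You can think of a GAN as the combination of a counterfeiter and a cop in a game of cat and mouse, where the counterfeiter is learning to pass false notes, and the cop is learning to detect them. Both are dynamic; that is, the cop is in training too (perhaps the central bank is flagging the bills that slipped through), and each side comes to learn the other's methods in a constant escalation.

The discriminator network is a standard convolutional network that can categorize the images fed to it, a binary classifier labeling images as real or fake. The generator is an inverse convolutional network, in a sense: while a standard convolutional classifier takes an image and downsamples it to produce a probability, the generator takes a vector of random noise and upsamples it to an image. The first throws away data through downsampling techniques like max pooling, and the second generates new data.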
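
Here is a small sketch of what downsampling and upsampling do to an image's shape. The helper names are hypothetical, and nearest-neighbour upsampling is only a stand-in: a real generator typically learns its upsampling (for example with transposed convolutions), and the Keras code later in this article uses dense layers instead.

```python
def max_pool_2x2(image):
    """Downsample: keep only the largest value in each 2x2 block."""
    return [
        [max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
         for c in range(0, len(image[0]), 2)]
        for r in range(0, len(image), 2)
    ]

def upsample_2x(image):
    """Upsample: repeat every pixel into a 2x2 block (nearest neighbour)."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

image = [
    [1, 2, 0, 1],
    [3, 4, 1, 0],
    [0, 1, 5, 6],
    [1, 0, 7, 8],
]
pooled = max_pool_2x2(image)     # 4x4 -> 2x2, three of every four values discarded
restored = upsample_2x(pooled)   # 2x2 -> 4x4, but the discarded detail is gone
print(pooled)
print(restored)
```

Pooling cannot be undone by plain repetition, which is why the generator has to learn how to fill in plausible detail while it upsamples.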

 

If you want to learn more about generating images, Brandon Amos wrote a great post about interpreting images as samples from a probability distribution.

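GANs, Autoencoders and VAEs

It may be useful to compare generative adversarial networks to other neural networks, such as autoencoders and variational autoencoders (VAEs).

Autoencoders encode input data as vectors. They create a hidden, or compressed, representation of the raw data. They are useful in dimensionality reduction; that is, the vector serving as a hidden representation compresses the raw data into a smaller number of salient dimensions. Autoencoders can be paired with a so-called decoder, which allows you to reconstruct input data based on its hidden representation, much as you would with a restricted Boltzmann machine.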
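
As a rough illustration of encoding and decoding, the sketch below compresses 2-D points into a single number and reconstructs them. It is not a trained autoencoder: the encoder direction is fixed by hand (an assumption made so the example stays short), whereas a real autoencoder learns its weights from data.

```python
import math

# Assumed encoder direction: a unit vector along the line y = 2x.
DIRECTION = (1 / math.sqrt(5), 2 / math.sqrt(5))

def encode(point):
    """Compress a 2-D point into one number, its hidden representation."""
    return point[0] * DIRECTION[0] + point[1] * DIRECTION[1]

def decode(code):
    """Reconstruct a 2-D point from its 1-D hidden representation."""
    return (code * DIRECTION[0], code * DIRECTION[1])

def reconstruction_error(point):
    x, y = decode(encode(point))
    return math.hypot(point[0] - x, point[1] - y)

on_line = (1.0, 2.0)    # lies on y = 2x, so nothing is lost
off_line = (2.0, 1.0)   # does not, so compression loses information
errors = (reconstruction_error(on_line), reconstruction_error(off_line))
print(errors)
```

Data that lives near the salient dimension survives the round trip; everything else is what the compression throws away.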

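Variational autoencoders are generative algorithms that add an additional constraint to encoding the input data, namely that the hidden representations are normalized. Variational autoencoders are capable of both compressing data like an autoencoder and synthesizing data like a GAN. However, while GANs generate data in fine, granular detail, images generated by VAEs tend to be more blurred. Deeplearning4j's examples include both autoencoders and variational autoencoders.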
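
One common way to read "the hidden representations are normalized" is the standard VAE formulation, in which each latent code is a Gaussian that a KL-divergence penalty pulls towards N(0, 1). The text above does not spell this out, so treat the sketch below as an assumption about what is meant; the function names are hypothetical.

```python
import math
import random

def kl_to_standard_normal(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ) for a single latent dimension."""
    return 0.5 * (mu ** 2 + sigma ** 2 - 1.0 - math.log(sigma ** 2))

def sample_latent(mu, sigma, rng):
    """Reparameterised sample: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return mu + sigma * rng.gauss(0.0, 1.0)

rng = random.Random(0)
z = sample_latent(mu=0.5, sigma=0.1, rng=rng)

penalty_at_prior = kl_to_standard_normal(0.0, 1.0)   # code already matches N(0, 1)
penalty_shifted = kl_to_standard_normal(1.0, 1.0)    # code has drifted away from it
print(penalty_at_prior, penalty_shifted)
```

Because every code is kept close to the same standard normal, you can later sample z from N(0, 1) and decode it into new data, which is what lets a VAE synthesize as well as compress.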

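You can bucket generative algorithms into one of three types:

  • Given a label, they predict the associated features (Naive Bayes)
  • Given a hidden representation, they predict the associated features (VAE, GAN)
  • Given some of the features, they predict the rest (inpainting, imputation)

Tips for training a GAN:

When you train the discriminator, hold the generator's values constant; and when you train the generator, hold the discriminator's constant. Each should train against a static adversary. For example, this gives the generator a better read on the gradient it must learn by.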
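
Here is a minimal sketch of that alternation on a one-dimensional toy problem with hand-written gradients. The model shapes, the "real" distribution N(4, 0.5) and the learning rate are assumptions chosen to keep the example small, and the generator uses the -log D(G(z)) loss. The point is only the structure: each step updates one player and treats the other as a constant.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def discriminator_step(d, g, real_x, z, lr):
    """Update only the discriminator (w, c); the generator g is held fixed."""
    w, c = d
    a, b = g
    fake_x = a * z + b
    p_real = sigmoid(w * real_x + c)
    p_fake = sigmoid(w * fake_x + c)
    # Gradients of -[log D(real) + log(1 - D(fake))] with respect to w and c.
    grad_w = -(1.0 - p_real) * real_x + p_fake * fake_x
    grad_c = -(1.0 - p_real) + p_fake
    return (w - lr * grad_w, c - lr * grad_c)

def generator_step(d, g, z, lr):
    """Update only the generator (a, b); the discriminator d is held fixed."""
    w, c = d
    a, b = g
    fake_x = a * z + b
    p_fake = sigmoid(w * fake_x + c)
    # Gradient of -log D(G(z)) with respect to the fake sample, then a and b.
    grad_x = -(1.0 - p_fake) * w
    return (a - lr * grad_x * z, b - lr * grad_x)

rng = random.Random(0)
d = (0.1, 0.0)   # discriminator: D(x) = sigmoid(w * x + c)
g = (1.0, 0.0)   # generator:     G(z) = a * z + b
for step in range(200):
    real_x = rng.gauss(4.0, 0.5)   # assumed "real" data distribution
    z = rng.gauss(0.0, 1.0)
    d = discriminator_step(d, g, real_x, z, lr=0.05)   # g frozen
    g = generator_step(d, g, z, lr=0.05)               # d frozen
print(d, g)
```

In the Keras code further down, the same idea shows up as `self.discriminator.trainable = False` inside the combined model.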

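By the same token, pretraining the discriminator against MNIST before you start training the generator will establish a clearer gradient.

Each side of the GAN can overpower the other. If the discriminator is too good, it will return values so close to 0 or 1 that the generator will struggle to read the gradient. If the generator is too good, it will persistently exploit weaknesses in the discriminator that lead to false negatives. This may be mitigated by the nets' respective learning rates.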
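
To see why an overconfident discriminator starves the generator of gradient, the sketch below evaluates the derivative of two generator losses with respect to the discriminator's pre-sigmoid score (the logit) for a generated sample. The function names are hypothetical. The first loss, log(1 - D), is the one in the original minimax formulation; the second, -log D, is a widely used alternative and is included here only for comparison.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def saturating_grad(logit):
    """d/d(logit) of log(1 - D), the generator loss in the minimax form."""
    return -sigmoid(logit)

def non_saturating_grad(logit):
    """d/d(logit) of -log(D), a commonly used alternative generator loss."""
    return -(1.0 - sigmoid(logit))

# A very negative logit means the discriminator is confident the sample is fake.
for logit in (0.0, -5.0, -10.0):
    print(logit, sigmoid(logit), saturating_grad(logit), non_saturating_grad(logit))
```

As the discriminator's output approaches 0, the first gradient shrinks towards zero while the second stays close to -1, which is one reason the second form is often preferred in practice.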

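GANs take a long time to train. On a single GPU a GAN might take hours, and on a single CPU more than a day. While difficult to tune and therefore to use, GANs have stimulated a lot of interesting research and writing.

On to the code~

Below is a GAN written in Keras: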

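# Imports needed by the code below. It is written against the Keras 2.x API;
# newer Keras versions may need small changes.
import matplotlib.pyplot as plt
import numpy as np

from keras.datasets import mnist
from keras.layers import BatchNormalization, Dense, Flatten, Input, LeakyReLU, Reshape
from keras.models import Model, Sequential
from keras.optimizers import Adam


class GAN():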
    def __init__(self):
        self.img_rows = 28 
        self.img_cols = 28
        self.channels = 1
        self.img_shape = (self.img_rows, self.img_cols, self.channels)

        optimizer = Adam(0.0002, 0.5)

        # Build and compile the discriminator
        self.discriminator = self.build_discriminator()
        self.discriminator.compile(loss='binary_crossentropy', 
            optimizer=optimizer,
            metrics=['accuracy'])

        # Build and compile the generator
        self.generator = self.build_generator()
        self.generator.compile(loss='binary_crossentropy', optimizer=optimizer)

        # The generator takes noise as input and generates imgs
        z = Input(shape=(100,))
        img = self.generator(z)

        # For the combined model we will only train the generator
        self.discriminator.trainable = False

        # The discriminator takes generated images as input and determines validity
        valid = self.discriminator(img)

        # The combined model  (stacked generator and discriminator) takes
        # noise as input => generates images => determines validity 
        self.combined = Model(z, valid)
        self.combined.compile(loss='binary_crossentropy', optimizer=optimizer)

    def build_generator(self):

        noise_shape = (100,)

        model = Sequential()

        model.add(Dense(256, input_shape=noise_shape))
        model.add(LeakyReLU(alpha=0.2))
        model.add(BatchNormalization(momentum=0.8))
        model.add(Dense(512))
        model.add(LeakyReLU(alpha=0.2))
        model.add(BatchNormalization(momentum=0.8))
        model.add(Dense(1024))
        model.add(LeakyReLU(alpha=0.2))
        model.add(BatchNormalization(momentum=0.8))
        model.add(Dense(np.prod(self.img_shape), activation='tanh'))
        model.add(Reshape(self.img_shape))

        model.summary()

        noise = Input(shape=noise_shape)
        img = model(noise)

        return Model(noise, img)

    def build_discriminator(self):

        img_shape = (self.img_rows, self.img_cols, self.channels)

        model = Sequential()

        model.add(Flatten(input_shape=img_shape))
        model.add(Dense(512))
        model.add(LeakyReLU(alpha=0.2))
        model.add(Dense(256))
        model.add(LeakyReLU(alpha=0.2))
        model.add(Dense(1, activation='sigmoid'))
        model.summary()

        img = Input(shape=img_shape)
        validity = model(img)

        return Model(img, validity)

    def train(self, epochs, batch_size=128, save_interval=50):

        # Load the dataset
        (X_train, _), (_, _) = mnist.load_data()

        # Rescale -1 to 1
        X_train = (X_train.astype(np.float32) - 127.5) / 127.5
        X_train = np.expand_dims(X_train, axis=3)

        half_batch = int(batch_size / 2)

        for epoch in range(epochs):

            # ---------------------
            #  Train Discriminator
            # ---------------------

            # Select a random half batch of images
            idx = np.random.randint(0, X_train.shape[0], half_batch)
            imgs = X_train[idx]

            noise = np.random.normal(0, 1, (half_batch, 100))

            # Generate a half batch of new images
            gen_imgs = self.generator.predict(noise)

            # Train the discriminator
            d_loss_real = self.discriminator.train_on_batch(imgs, np.ones((half_batch, 1)))
            d_loss_fake = self.discriminator.train_on_batch(gen_imgs, np.zeros((half_batch, 1)))
            d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)


            # ---------------------
            #  Train Generator
            # ---------------------

            noise = np.random.normal(0, 1, (batch_size, 100))

            # The generator wants the discriminator to label the generated samples
            # as valid (ones)
            valid_y = np.array([1] * batch_size)

            # Train the generator
            g_loss = self.combined.train_on_batch(noise, valid_y)

            # Plot the progress
            print("%d [D loss: %f, acc.: %.2f%%] [G loss: %f]" % (epoch, d_loss[0], 100 * d_loss[1], g_loss))

            # If at save interval => save generated image samples
            if epoch % save_interval == 0:
                self.save_imgs(epoch)

    def save_imgs(self, epoch):
        r, c = 5, 5
        noise = np.random.normal(0, 1, (r * c, 100))
        gen_imgs = self.generator.predict(noise)

        # Rescale images 0 - 1
        gen_imgs = 0.5 * gen_imgs + 0.5

        fig, axs = plt.subplots(r, c)
        cnt = 0
        for i in range(r):
            for j in range(c):
                axs[i,j].imshow(gen_imgs[cnt, :,:,0], cmap='gray')
                axs[i,j].axis('off')
                cnt += 1
        # Assumes the gan/images/ directory already exists; savefig does not create it.
        fig.savefig("gan/images/mnist_%d.png" % epoch)
        plt.close()


if __name__ == '__main__':
    gan = GAN()
    gan.train(epochs=30000, batch_size=32, save_interval=200)

Some resources on generative networks

GAN Use Cases

Notable Papers on GANs

  • [Generative Adversarial Nets] [Paper] [Code](Ian Goodfellow’s breakthrough paper)

Unclassified Papers & Resources

  • GAN Hacks: How to Train a GAN? Tips and tricks to make GANs work
  • [Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks] [Paper][Code]
  • [Adversarial Autoencoders] [Paper][Code]
  • [Generating Images with Perceptual Similarity Metrics based on Deep Networks] [Paper]
  • [Generating images with recurrent adversarial networks] [Paper][Code]
  • [Generative Visual Manipulation on the Natural Image Manifold] [Paper][Code]
  • [Learning What and Where to Draw] [Paper][Code]
  • [Adversarial Training for Sketch Retrieval] [Paper]
  • [Generative Image Modeling using Style and Structure Adversarial Networks] [Paper][Code]
  • [Generative Adversarial Networks as Variational Training of Energy Based Models] [Paper](ICLR 2017)
  • [Synthesizing the preferred inputs for neurons in neural networks via deep generator networks] [Paper][Code]
  • [SalGAN: Visual Saliency Prediction with Generative Adversarial Networks] [Paper][Code]
  • [Adversarial Feature Learning] [Paper]

Generating High-Quality Images

  • [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks] [Paper][Code](Gan with convolutional networks)(ICLR)
  • [Generative Adversarial Text to Image Synthesis] [Paper][Code][Code]
  • [Improved Techniques for Training GANs] [Paper][Code](Goodfellow’s paper)
  • [Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space] [Paper][Code]
  • [StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks] [Paper][Code]
  • [Improved Training of Wasserstein GANs] [Paper][Code]
  • [Boundary Equilibrium Generative Adversarial Networks Implementation in Tensorflow] [Paper][Code]
  • [Progressive Growing of GANs for Improved Quality, Stability, and Variation ] [Paper][Code]

Semi-supervised learning

  • [Adversarial Training Methods for Semi-Supervised Text Classification] [Paper][Note]( Ian Goodfellow Paper)
  • [Improved Techniques for Training GANs] [Paper][Code](Goodfellow’s paper)
  • [Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks] [Paper](ICLR)
  • [Semi-Supervised QA with Generative Domain-Adaptive Nets] [Paper](ACL 2017)

Ensembles

  • [AdaGAN: Boosting Generative Models] [Paper][[Code]](Google Brain)

Clustering

  • [Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks] [Paper](ICLR)

Image blending

  • [GP-GAN: Towards Realistic High-Resolution Image Blending] [Paper][Code]

Image Inpainting

  • [Semantic Image Inpainting with Perceptual and Contextual Losses] [Paper][Code](CVPR 2017)
  • [Context Encoders: Feature Learning by Inpainting] [Paper][Code]
  • [Semi-Supervised Learning with Context-Conditional Generative Adversarial Networks] [Paper]
  • [Generative face completion] [Paper][Code](CVPR2017)
  • [Globally and Locally Consistent Image Completion] [MainPAGE](SIGGRAPH 2017)

Joint Probability

Super-Resolution

  • [Image super-resolution through deep learning ][Code](Just for face dataset)
  • [Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network] [Paper][Code](Using Deep residual network)
  • [EnhanceGAN] Docs[[Code]]

De-occlusion

  • [Robust LSTM-Autoencoders for Face De-Occlusion in the Wild] [Paper]

Semantic Segmentation

  • [Adversarial Deep Structural Networks for Mammographic Mass Segmentation] [Paper][Code]
  • [Semantic Segmentation using Adversarial Networks] [Paper](Soumith’s paper)

Object Detection

  • [Perceptual generative adversarial networks for small object detection] [Paper](CVPR 2017)
  • [A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection] [Paper][Code](CVPR2017)

RNN-GANs

  • [C-RNN-GAN: Continuous recurrent neural networks with adversarial training] [Paper][Code]

Conditional Adversarial Nets

Video Prediction & Generation

  • [Deep multi-scale video prediction beyond mean square error] [Paper][Code](Yann LeCun’s paper)
  • [Generating Videos with Scene Dynamics] [Paper][Web][Code]
  • [MoCoGAN: Decomposing Motion and Content for Video Generation] [Paper]

Texture Synthesis & Style Transfer

  • [Precomputed real-time texture synthesis with markovian generative adversarial networks] [Paper][Code](ECCV 2016)

Image Translation

  • [Unsupervised cross-domain image generation] [Paper][Code]
  • [Image-to-image translation using conditional adversarial nets] [Paper][Code][Code]
  • [Learning to Discover Cross-Domain Relations with Generative Adversarial Networks] [Paper][Code]
  • [Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks] [Paper][Code]
  • [CoGAN: Coupled Generative Adversarial Networks] [Paper][Code](NIPS 2016)
  • [Unsupervised Image-to-Image Translation with Generative Adversarial Networks] [Paper]
  • [Unsupervised Image-to-Image Translation Networks] [Paper]
  • [Triangle Generative Adversarial Networks] [Paper]

GAN Theory

  • [Energy-based generative adversarial network] [Paper][Code](Lecun paper)
  • [Improved Techniques for Training GANs] [Paper][Code](Goodfellow’s paper)
  • [Mode Regularized Generative Adversarial Networks] [Paper](Yoshua Bengio , ICLR 2017)
  • [Improving Generative Adversarial Networks with Denoising Feature Matching] [Paper][Code](Yoshua Bengio , ICLR 2017)
  • [Sampling Generative Networks] [Paper][Code]
  • [How to train Gans] [Docu]
  • [Towards Principled Methods for Training Generative Adversarial Networks] [Paper](ICLR 2017)
  • [Unrolled Generative Adversarial Networks] [Paper][Code](ICLR 2017)
  • [Least Squares Generative Adversarial Networks] [Paper][Code](ICCV 2017)
  • [Wasserstein GAN] [Paper][Code]
  • [Improved Training of Wasserstein GANs] [Paper][Code](The improve of wgan)
  • [Generalization and Equilibrium in Generative Adversarial Nets] [Paper](ICML 2017)

3-Dimensional GANs

  • [Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling] [Paper][Web][Code](2016 NIPS)
  • [Transformation-Grounded Image Generation Network for Novel 3D View Synthesis] [Web](CVPR 2017)

Music

  • [MidiNet: A Convolutional Generative Adversarial Network for Symbolic-domain Music Generation using 1D and 2D Conditions] [Paper][HOMEPAGE]

Face Generation & Editing

  • [Autoencoding beyond pixels using a learned similarity metric] [Paper][Code][Tensorflow code]
  • [Coupled Generative Adversarial Networks] [Paper][Caffe Code][Tensorflow Code](NIPS)
  • [Invertible Conditional GANs for image editing] [Paper][Code]
  • [Learning Residual Images for Face Attribute Manipulation] [Paper][Code](CVPR 2017)
  • [Neural Photo Editing with Introspective Adversarial Networks] [Paper][Code](ICLR 2017)
  • [Neural Face Editing with Intrinsic Image Disentangling] [Paper](CVPR 2017)
  • [GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data ] [Paper](BMVC 2017)[Code]
  • [Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis] [Paper](ICCV 2017)

For Discrete Distributions

  • [Maximum-Likelihood Augmented Discrete Generative Adversarial Networks] [Paper]
  • [Boundary-Seeking Generative Adversarial Networks] [Paper]
  • [GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution] [Paper]

Improving Classification & Recognition

  • [Generative OpenMax for Multi-Class Open Set Classification] [Paper](BMVC 2017)
  • [Controllable Invariance through Adversarial Feature Learning] [Paper][Code](NIPS 2017)
  • [Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro] [Paper][Code] (ICCV2017)
  • [Learning from Simulated and Unsupervised Images through Adversarial Training] [Paper][Code](Apple paper, CVPR 2017 Best Paper)

Projects

  • [cleverhans] [Code](A library for benchmarking vulnerability to adversarial examples)
  • [reset-cppn-gan-tensorflow] [Code](Using Residual Generative Adversarial Networks and Variational Auto-encoder techniques to produce high-resolution images)
  • [HyperGAN] [Code](Open source GAN focused on scale and usability)

