Defect Detection 3. CutPaste: Self-Supervised Learning for Anomaly Detection and Localization


 

 

Abstract

We aim at constructing a high performance model for defect detection that detects unknown anomalous patterns of an image without anomalous data. To this end, we propose a two-stage framework for building anomaly detectors using normal training data only. We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations. We learn representations by classifying normal data from the CutPaste, a simple data augmentation strategy that cuts an image patch and pastes at a random location of a large image. Our empirical study on MVTec anomaly detection dataset demonstrates the proposed algorithm is general to be able to detect various types of real-world defects. We bring the improvement upon previous arts by 3.1 AUCs when learning representations from scratch. By transfer learning on pretrained representations on ImageNet, we achieve a new state-of-the-art 96.6 AUC. Lastly, we extend the framework to learn and extract representations from patches to allow localizing defective areas without annotations during training.

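Summary

The goal is a high-performance defect detector that finds unknown anomalous patterns in an image without any anomalous training data. To that end, the authors use a two-stage framework trained on normal data only: first learn self-supervised deep representations, then build a generative one-class classifier on top of them. The representations are learned by classifying normal images against CutPaste-augmented ones, where CutPaste is a simple augmentation that cuts a patch from an image and pastes it back at a random location. On the MVTec anomaly detection dataset the method detects many kinds of real-world defects: it improves on prior work by 3.1 AUC when trained from scratch and reaches a new state-of-the-art 96.6 AUC with transfer learning from ImageNet-pretrained representations. Finally, the framework is extended to learn and extract patch-level representations, which makes it possible to localize defective regions without annotations during training.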

2. A Framework for Anomaly Detection

In this section, we present our anomaly detection framework for high-resolution images with defects in local regions. Following [54], we adopt a two-stage framework for building an anomaly detector, where in the first stage we learn deep representations from normal data and then construct a one-class classifier using learned representations. Subsequently, in Section 2.1, we present a novel method for learning self-supervised representations by predicting CutPaste augmentation, and extend to learning and extracting representations from local patches in Section 2.4.

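A framework for anomaly detection

This section presents the anomaly detection framework for high-resolution images whose defects sit in local regions. Following [54], it is built in two stages: deep representations are first learned from normal data, and a one-class classifier is then constructed on those representations. Section 2.1 introduces a new way to learn self-supervised representations by predicting the CutPaste augmentation, and Section 2.4 extends it to learning and extracting representations from local patches.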

 

 

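This paper has two core ideas:

 

  The first is the CutPaste data augmentation.

  The second is using probability density estimation (a simple parametric Gaussian density estimator (GDE)) to compute anomaly scores.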

 

1.0 CutPaste data augmentation

2.1 Self-Supervised Learning with CutPaste

We conjecture that geometric transformations [20, 24, 4], such as rotations and translations, are effective in learning representation of semantic concepts (e.g., objectness), but less of regularity (e.g., continuity, repetition). As shown in Figure 2(b), anomalous patterns of defect detection typically include irregularities such as cracks (bottle, wood) or twists (toothbrush, grid). Our aim is to design an augmentation strategy creating local irregular patterns. Then we train the model to identify these local irregularities with the hope that it can generalize to unseen real defects at test time.

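The authors conjecture that geometric transformations such as rotation and translation are effective for learning representations of semantic concepts (e.g., objectness) but less so for regularity (e.g., continuity, repetition). As Figure 2(b) shows, anomalous patterns in defect detection typically involve irregularities such as cracks (bottle, wood) or twists (toothbrush, grid). The aim is therefore an augmentation strategy that creates local irregular patterns; the model is then trained to identify these local irregularities, in the hope that it generalizes to unseen real defects at test time.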

To further prevent learning naive decision rules for discriminating augmented images and to encourage the model to learn to detect irregularity, we propose the CutPaste augmentation as follows:

  1. Cut a small rectangular area of variable sizes and aspect ratios from a normal training image.

  2. Optionally, we rotate or jitter pixel values in the patch.

  3. Paste a patch back to an image at a random location.

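To keep the model from learning naive decision rules for telling augmented images apart, and to encourage it to learn to detect irregularity, CutPaste is defined as follows:

      1. Cut a small rectangular region of variable size and aspect ratio from a normal training image.

      2. Optionally, rotate the patch or jitter its pixel values.

      3. Paste the patch back into the image at a random location.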

 

Code walkthrough for CutPasteNormal:

import math
import random

import torch

# Body of the augmentation call: `img` is a PIL image; `self` holds
# area_ratio, aspect_ratio and an optional colorJitter transform.
w, h = img.size  # PIL reports size as (width, height)

# 1. Cut a patch
# patch area: a random fraction of the image area, drawn between
# area_ratio[0] and area_ratio[1] (e.g. [0.02, 0.15])
ratio_area = random.uniform(self.area_ratio[0],
    self.area_ratio[1]) * w * h

# sample the aspect ratio in log space
log_ratio = torch.log(torch.tensor((self.aspect_ratio,
    1/self.aspect_ratio))) # e.g. log(0.3) and log(1/0.3)
aspect = torch.exp(
    torch.empty(1).uniform_(log_ratio[0], log_ratio[1]) # uniform between the two bounds
        ).item()

cut_w = int(round(math.sqrt(ratio_area * aspect)))
cut_h = int(round(math.sqrt(ratio_area / aspect)))

# one might also want to sample from other images. currently we only sample from the image itself
from_location_h = int(random.uniform(0, h - cut_h)) # top-left corner of the cut
from_location_w = int(random.uniform(0, w - cut_w))

box = [from_location_w, from_location_h, from_location_w + cut_w, from_location_h + cut_h]

patch = img.crop(box) # crop the patch

patch.show() # display the patch (debugging aid)

# 2. Random color jitter
if self.colorJitter:
    patch = self.colorJitter(patch)

# 3. Paste at a random location
to_location_h = int(random.uniform(0, h - cut_h))
to_location_w = int(random.uniform(0, w - cut_w))

insert_box = [to_location_w, to_location_h, to_location_w + cut_w, to_location_h + cut_h]
augmented = img.copy()
augmented.paste(patch, insert_box)

augmented.show() # display the augmented image
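The size-sampling step is the least obvious part of the snippet, so here is a minimal dependency-free sketch of it in isolation. The function name `sample_patch_size`, its default arguments, and the final clamp are illustrative assumptions, not part of the original code; the idea it shows is that the aspect ratio is drawn uniformly in log space, so a ratio r and its inverse 1/r are equally likely, and the width and height are then solved from the sampled area.

```python
import math
import random


def sample_patch_size(w, h, area_ratio=(0.02, 0.15), aspect_ratio=0.3, rng=random):
    """Return (cut_w, cut_h) for a patch cut from a w-by-h image."""
    # Patch area as a random fraction of the image area.
    area = rng.uniform(area_ratio[0], area_ratio[1]) * w * h
    # Draw the width/height ratio uniformly in log space, so that a ratio r
    # and its inverse 1/r are equally likely.
    log_lo, log_hi = math.log(aspect_ratio), math.log(1 / aspect_ratio)
    aspect = math.exp(rng.uniform(log_lo, log_hi))
    # Solve cut_w * cut_h = area and cut_w / cut_h = aspect.
    cut_w = int(round(math.sqrt(area * aspect)))
    cut_h = int(round(math.sqrt(area / aspect)))
    # Clamp to the image (an added safeguard, not in the walkthrough above).
    return min(cut_w, w), min(cut_h, h)


if __name__ == "__main__":
    random.seed(0)
    for _ in range(3):
        print(sample_patch_size(256, 256))
```

Sampling the ratio directly in [0.3, 1/0.3] would favor wide patches over tall ones; sampling in log space keeps the two symmetric.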

 

 

 

We show the CutPaste augmentation process in the orange dotted box of Figure 1 and more examples in Figure 2(e). Following the idea of rotation prediction [19], we define the training objective of the proposed self-supervised representation learning as follows:

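The CutPaste augmentation process is shown in the orange dotted box of Figure 1, with more examples in Figure 2(e). Following the idea of rotation prediction, the training objective of the proposed self-supervised representation learning is defined as follows: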
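The equation that the preceding sentence introduces is missing from this post. Given the definitions that follow it (X, CP, g, CE), it presumably has the form below, with label 0 for a normal image and label 1 for its CutPaste-augmented copy; treat it as a reconstruction rather than a verbatim quote:

```latex
\mathcal{L}_{\mathrm{CP}} = \mathbb{E}_{x \in \mathcal{X}} \Big\{ \mathrm{CE}\big(g(x), 0\big) + \mathrm{CE}\big(g(\mathrm{CP}(x)), 1\big) \Big\}
```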

where X is the set of normal data, CP(·) is a CutPaste augmentation and g is a binary classifier parameterized by deep networks. CE(·, ·) refers to a cross-entropy loss. In practice, data augmentations, such as translation or color jitter, are applied before feeding x into g or CP.

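Here X is the set of normal data, CP(·) is the CutPaste augmentation, and g is a binary classifier parameterized by a deep network. CE(·, ·) is the cross-entropy loss. In practice, data augmentations such as translation or color jitter are applied before x is fed into g or CP.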

 

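import torch
# xs: list of batches, one per class (assumed: normal images first, then each augmentation)
# logits: model output for the batches in xs concatenated along the batch dimension (assumed)
y = torch.arange(len(xs), device=device)  # one label per class: 0, 1, ...
y = y.repeat_interleave(xs[0].size(0))    # repeat each label batch_size times
loss = loss_fn(logits, y)                 # cross-entropy against these labels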
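To make the label layout concrete, here is a small dependency-free sketch of what those three lines compute. The helpers `make_labels` and `cross_entropy` are hypothetical names written for illustration; they mirror, respectively, the `torch.arange` plus `repeat_interleave` pattern and a standard mean cross-entropy over raw logits, on the assumption that the views are concatenated view by view.

```python
import math


def make_labels(num_views, batch_size):
    """Labels for views concatenated view by view: 0,0,...,1,1,...,2,2,..."""
    return [k for k in range(num_views) for _ in range(batch_size)]


def cross_entropy(logits, labels):
    """Mean cross-entropy of raw logits (one row per sample) against integer labels."""
    total = 0.0
    for row, label in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[label]
    return total / len(labels)


if __name__ == "__main__":
    # 3-way task (normal, CutPaste, CutPaste-Scar) with 2 images per view
    labels = make_labels(num_views=3, batch_size=2)
    print(labels)  # [0, 0, 1, 1, 2, 2]
    # with uniform logits the loss equals log(3)
    print(cross_entropy([[0.0, 0.0, 0.0]] * 6, labels))
```

With two views (normal and CutPaste) this reduces to the binary objective; with three views it gives the 3-way variant described in Section 2.2.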

 

2.2. CutPaste Variants

Multi-Class Classification. While CutPaste (large patch) and CutPaste-Scar share a similarity, the shapes of an image patch of two augmentations are very different. Empirically, they have their own advantages on different types of defects. To leverage the strength of both scales in the training, we formulate a finer-grained 3-way classification task among normal, CutPaste and CutPaste-Scar by treating CutPaste variants as two separate classes. Detailed study will be presented in Section 5.2.

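Multi-class classification. CutPaste (large patch) and CutPaste-Scar are similar in spirit, but the patches they produce have very different shapes. Empirically, each has its own strengths on different defect types. To exploit both scales during training, the authors set up a finer-grained 3-way classification task over normal, CutPaste and CutPaste-Scar, treating the two variants as separate classes. Details are given in Section 5.2.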

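The CutPaste code is given above; the CutPaste-Scar code follows. Because the patch is small and the paste location is random, the patch may well land outside the target region of the image.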

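import random

# Body of the augmentation call: `img` is a PIL image; `self` holds the
# width, height and rotation ranges and an optional colorJitter transform.
w, h = img.size  # PIL reports size as (width, height)

# 1. Cut a thin rectangular patch
cut_w = random.uniform(*self.width)
cut_h = random.uniform(*self.height)

from_location_h = int(random.uniform(0, h - cut_h))
from_location_w = int(random.uniform(0, w - cut_w))

box = [from_location_w, from_location_h, from_location_w + cut_w, from_location_h + cut_h]
patch = img.crop(box)

# 2. Color jitter and random rotation
if self.colorJitter:
    patch = self.colorJitter(patch)

# rotate; expand=True enlarges the canvas so the rotated patch is not clipped
rot_deg = random.uniform(*self.rotation)
patch = patch.convert("RGBA").rotate(rot_deg,expand=True)

# 3. Paste; bound the location by the rotated patch's (width, height)
patch_w, patch_h = patch.size
to_location_h = int(random.uniform(0, h - patch_h))
to_location_w = int(random.uniform(0, w - patch_w))

# the alpha channel marks the pixels that belong to the rotated patch
mask = patch.split()[-1]
patch = patch.convert("RGB")

augmented = img.copy()
augmented.paste(patch, (to_location_w, to_location_h), mask=mask)

augmented.show() # display the augmented image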
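One detail worth spelling out is why the paste bounds have to use the size of the rotated patch rather than the original cut: rotating with `expand=True` grows the canvas to the bounding box of the rotated rectangle. The sketch below computes that bounding box geometrically; `rotated_bbox` is a hypothetical helper, and PIL's own rounding may differ from it by a pixel, so read it as an approximation of the canvas size rather than PIL's exact behavior.

```python
import math


def rotated_bbox(w, h, degrees):
    """Width and height of the axis-aligned box enclosing a w-by-h rectangle
    rotated by `degrees` about its center."""
    theta = math.radians(degrees)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    return w * c + h * s, w * s + h * c


if __name__ == "__main__":
    # a thin 30x4 scar rotated by 45 degrees needs a roughly 24x24 canvas
    print(rotated_bbox(30, 4, 45))
```

A thin scar can therefore become much taller after rotation, which is also why the alpha mask is needed: most of the enlarged canvas is transparent padding that should not overwrite the image.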

2.0 Computing anomaly scores with probability density estimation (a simple parametric Gaussian density estimator (GDE)) and assessing anomaly locations

There exist various ways to compute anomaly scores via one-class classifiers. In this work, we build generative classifiers like kernel density estimator [52] or Gaussian density estimator [43], on representations f. Below, we explain how to compute anomaly scores and the trade-offs. Although nonparametric KDE is free from distribution assumptions, it requires many examples for accurate estimation [58] and could be computationally expensive. With limited normal training examples for defect detection, we consider a simple parametric Gaussian density estimator (GDE) whose log-density is computed as follows:

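There are several ways to compute anomaly scores with a one-class classifier. Here, generative classifiers such as a kernel density estimator (KDE) or a Gaussian density estimator (GDE) are built on the representations f; what follows explains how the anomaly score is computed and what the trade-offs are. Nonparametric KDE makes no distributional assumptions, but it needs many examples for an accurate estimate and can be computationally expensive. Because defect detection offers only limited normal training examples, the authors consider a simple parametric GDE, whose log-density is computed as follows: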
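The log-density formula is not reproduced in this post. For a Gaussian with mean μ and covariance Σ fitted to the representations f(x) of the normal training data, it would take the standard form below; this is a reconstruction from the textbook Gaussian log-density, not a quote:

```latex
\log p_{\mathrm{gde}}(x) = -\frac{1}{2} \big(f(x) - \mu\big)^{\top} \Sigma^{-1} \big(f(x) - \mu\big) + \mathrm{const}
```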

 

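import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

# train_embed: embeddings of normal training images; embeds: embeddings to score
# (both assumed to be arrays of shape [n_samples, n_features])

# Option A: pick the bandwidth by cross-validated grid search on the normal data
params = {'bandwidth': np.logspace(-10, 10, 50)}
grid = GridSearchCV(KernelDensity(kernel='gaussian'), params)
grid.fit(train_embed)
print("best bandwidth: {0}".format(grid.best_estimator_.bandwidth))
kde = grid.best_estimator_  # already refit on all of train_embed

# Option B: skip the search and use a fixed bandwidth instead
# kde = KernelDensity(kernel='gaussian', bandwidth=1).fit(train_embed)

# log-density of each sample; lower values are more anomalous
scores = kde.score_samples(embeds)
print(scores)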
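The snippet above covers the KDE option only. For comparison, here is a minimal sketch of the parametric Gaussian alternative (GDE). It is deliberately restricted to 2-D feature vectors so that the covariance inverse can be written out by hand without any library; the function names are hypothetical, and a real pipeline would work on high-dimensional embeddings with a linear algebra library instead.

```python
def fit_gaussian_2d(points):
    """Mean and full 2x2 covariance of a list of (x, y) feature vectors."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def anomaly_score(point, mean, cov):
    """Squared Mahalanobis distance, i.e. -2 * log-density up to a constant."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # must be nonzero
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # (dx, dy) @ inverse(cov) @ (dx, dy), with the 2x2 inverse written out
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


if __name__ == "__main__":
    normal = [(0.0, 0.0), (1.0, 0.2), (0.2, 1.0), (1.1, 0.9), (0.5, 0.4)]
    mean, cov = fit_gaussian_2d(normal)
    print(anomaly_score((0.5, 0.5), mean, cov))   # near the normal cluster
    print(anomaly_score((4.0, -3.0), mean, cov))  # far from it
```

Unlike the KDE log-density, this score grows with abnormality, so no sign flip is needed before computing AUC. Fitting requires only a mean and a covariance, which is the appeal of GDE when normal training examples are scarce.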

 

 

 

 

