IoU

Intersection over Union (IoU) is an important evaluation metric in object detection. The first figure above marks a gt box and a predicted box; IoU is the ratio of the Intersection Area $I$ to the Union Area $U$ of these two boxes A and B:

\begin{equation}
\label{IoU}
IoU = \frac{|A \cap B|}{|A \cup B|} = \frac{|I|}{|U|}
\end{equation}

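However, existing algorithms all use distance losses (e.g. the smooth_L1 loss in SSD) to optimize this metric. As the argument goes: The optimal objective for a metric is the metric itself. So IoU can be used directly as the regression loss; unfortunately, IoU cannot optimize non-overlapping bboxes.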

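Using IoU as the loss ($\mathcal{L}_{IoU} = 1 - IoU$) has two advantages and one drawback:
1. IoU effectively compares the similarity of two arbitrary shapes
2. IoU is scale invariant
3. If two shapes A and B do not overlap, IoU is always 0, so IoU cannot tell whether A and B are very close or very far apart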
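The scale invariance and the zero-overlap blind spot listed above can be checked with a few numbers. This is only a minimal sketch for axis-aligned boxes in (xmin, ymin, xmax, ymax) format; the helper name `iou_xyxy` and the sample boxes are made up for illustration.

```python
def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    # Width and height of the intersection, clamped at 0 when there is no overlap
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


a, b = (0, 0, 2, 2), (1, 1, 3, 3)
base = iou_xyxy(a, b)  # intersection 1, union 4 + 4 - 1 = 7

# Scale invariance: multiplying every coordinate by 10 leaves IoU unchanged
scaled = iou_xyxy(tuple(10 * v for v in a), tuple(10 * v for v in b))

# Blind spot: a nearby and a far-away non-overlapping box both score 0
near = iou_xyxy((0, 0, 1, 1), (1.1, 0, 2.1, 1))
far = iou_xyxy((0, 0, 1, 1), (100, 0, 101, 1))

print(base, scaled, near, far)
```

Because `near` and `far` are both exactly 0, the loss $1 - IoU$ is the same constant for the two cases, and so carries no information about which direction to move the box.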

GIoU

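GIoU, as an upgrade of IoU, inherits IoU's two advantages and also fixes its inability to measure the distance between non-overlapping boxes. On top of the IoU computation it finds the smallest convex shape $C$ that encloses both A and B; the formula is: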

\begin{equation}
\label{GIoU}
GIoU = \frac{|A \cap B|}{|A \cup B|} - \frac{|C \setminus (A \cup B)|}{|C|} = IoU - \frac{|C \setminus (A \cup B)|}{|C|}
\end{equation}

The figure below shows two detection results, bad and better; the farther the prediction is from the gt box, the larger $C$ becomes.

The loss can then be written as $\mathcal{L}_{GIoU} = 1- GIoU$. Since $GIoU \in (-1, 1]$, $\mathcal{L}_{GIoU}$ takes values in $[0, 2)$.
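A small numeric sketch of the GIoU formula, assuming axis-aligned boxes in (xmin, ymin, xmax, ymax) format and taking $C$ to be the smallest enclosing axis-aligned rectangle. The helper name `giou_xyxy` and the sample boxes are illustrative only.

```python
def giou_xyxy(a, b):
    """GIoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Area of the smallest enclosing box C (assumed axis-aligned here)
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (c_area - union) / c_area


gt = (0, 0, 1, 1)
same = giou_xyxy(gt, gt)             # identical boxes: GIoU = 1, loss = 0
near = giou_xyxy(gt, (2, 0, 3, 1))   # no overlap, C is 3x1: GIoU = -(3 - 2) / 3
far = giou_xyxy(gt, (9, 0, 10, 1))   # no overlap, C is 10x1: GIoU = -(10 - 2) / 10

for name, g in [("same", same), ("near", near), ("far", far)]:
    print(name, "GIoU =", g, "loss =", 1 - g)
```

Unlike IoU, the two non-overlapping cases now get different values, and the farther box gets the larger loss, while every loss stays inside $[0, 2)$.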

In summary, this generalization keeps the major properties of IoU while rectifying its weakness.

DIoU & CIoU

The paper points out that GIoU loss still suffers from slow convergence and inaccurate regression.

In this paper, we propose a Distance-IoU (DIoU) loss by incorporating the normalized distance between the predicted box and the target box, which converges much faster in training than IoU and GIoU losses. Furthermore, this paper summarizes three geometric factors in bounding box regression, i.e., overlap area, central point distance and aspect ratio, based on which a Complete IoU (CIoU) loss is proposed, thereby leading to faster convergence and better performance. Moreover, DIoU can be easily adopted into non-maximum suppression (NMS) to act as the criterion, further boosting performance improvement.

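Analyzing GIoU loss, the authors found that GIoU first tries to enlarge the predicted box until it overlaps the target bbox, and only then does the IoU term maximize the overlap area with the target bbox, as shown in the left figure below: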

Moreover, when one box contains the other, GIoU loss degrades to IoU loss. Aligning the bounding boxes then becomes harder and convergence is slower.
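The containment case can be verified by hand: if one box lies inside the other, the enclosing box $C$ equals the outer box, which is also the union, so the GIoU penalty term is zero. A minimal sketch, assuming axis-aligned (xmin, ymin, xmax, ymax) boxes and a hypothetical helper `iou_and_giou`:

```python
def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two axis-aligned (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    iou = inter / union
    return iou, iou - (c_area - union) / c_area


outer = (0, 0, 4, 4)
centered = iou_and_giou(outer, (1.5, 1.5, 2.5, 2.5))  # 1x1 box in the middle
corner = iou_and_giou(outer, (0, 0, 1, 1))            # 1x1 box in a corner

print(centered)  # IoU and GIoU are both 1/16
print(corner)    # same values, although the inner box moved
```

Both placements give identical IoU and GIoU, so neither loss tells the optimizer that the centered box is the better-aligned one.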

In Distance-IoU (DIoU) loss, we simply add a penalty term on IoU loss to directly minimize the normalized distance between central points of two bounding boxes, leading to much faster convergence than GIoU loss.

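The authors argue that a good bbox regression loss should consider three important geometric factors: overlap area, central point distance and aspect ratio. Combining these, they further propose a Complete IoU (CIoU) loss. DIoU can also be plugged into NMS in place of IoU, making detection more robust under occlusion.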

DIoU

Referring to the figure above, the DIoU loss is:

\begin{equation}
\label{DIoU}
\begin{split}
& \mathcal{R}_{DIoU} = \frac{\rho^2(\bf{b}, \bf{b^{gt}})}{c^2} \\
& \mathcal{L}_{DIoU} = 1 - IoU + \frac{\rho^2(\bf{b}, \bf{b^{gt}})}{c^2} \\
& \mathcal{L}_{DIoU} = 1 - IoU + \frac{d^2}{c^2}
\end{split}
\end{equation}

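Here $\bf{b}$ and $\bf{b^{gt}}$ denote the central points of the predicted box and the ground-truth box, and $\rho(\cdot)$ is the Euclidean distance, so $d = \rho(\bf{b}, \bf{b^{gt}})$ is the distance between the two central points; $c$ is the diagonal length of the smallest enclosing shape mentioned under GIoU.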

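Advantages:

Like GIoU loss, DIoU loss still provides a moving direction for the bounding box when it does not overlap the target box.
DIoU loss directly minimizes the distance between the two boxes, so it converges much faster than GIoU loss.
When one box contains the other, or the two boxes are aligned horizontally or vertically, DIoU loss makes regression very fast, whereas GIoU loss almost degrades to IoU loss.
DIoU can also replace plain IoU as the criterion in NMS, making the NMS results more reasonable and effective.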

Like $\mathcal{L}_{GIoU}$, $\mathcal{L}_{DIoU}$ also takes values in $[0, 2)$.
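To see the effect of the $d^2 / c^2$ term, here is a hedged sketch of the DIoU formula on a containment case, where IoU (and GIoU) cannot tell the two placements apart. It assumes axis-aligned (xmin, ymin, xmax, ymax) boxes; `diou_xyxy` is an illustrative name, not a function from any of the referenced repos.

```python
def diou_xyxy(a, b):
    """DIoU = IoU - d^2 / c^2 for two axis-aligned (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # d^2: squared Euclidean distance between the two box centers
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # c^2: squared diagonal of the smallest enclosing box
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return inter / union - d2 / c2


outer = (0, 0, 4, 4)
centered = diou_xyxy(outer, (1.5, 1.5, 2.5, 2.5))  # centers coincide, d^2 = 0
corner = diou_xyxy(outer, (0, 0, 1, 1))            # d^2 = 4.5, c^2 = 32

print("centered:", centered, "loss:", 1 - centered)
print("corner:  ", corner, "loss:", 1 - corner)
```

The corner placement has the same IoU as the centered one but a larger DIoU loss, so the gradient pushes the box center toward the target center.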

CIoU

$\mathcal{L}_{CIoU}$ builds on $\mathcal{L}_{DIoU}$ by additionally taking aspect ratios into account:

\begin{equation}
\label{CIoU}
\begin{split}
& \mathcal{R}_{CIoU} = \frac{\rho^2(\bf{b}, \bf{b^{gt}})}{c^2} + \alpha v \\
& v = \frac{4}{{\pi}^2}\left(\arctan \frac{w^{gt}}{h^{gt}} - \arctan \frac{w}{h}\right)^2 \\
& \alpha = \frac{v}{(1 - IoU) + v} \\
& \mathcal{L}_{CIoU} = 1 - IoU + \frac{d^2}{c^2} + \alpha v
\end{split}
\end{equation}

Well, this one... looks awfully complicated.

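Here $v$ measures the consistency of the aspect ratio, and $\alpha$ is a positive trade-off parameter that is excluded from differentiation (it is treated as a constant).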
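The two extra quantities are easier to digest with numbers. The sketch below evaluates $v$ and $\alpha$ exactly as written in the formula; the function name `ciou_aspect_terms` and the `eps` argument are assumptions added here (the formula has no epsilon; it is only a guard against 0/0 when the boxes coincide).

```python
import math


def ciou_aspect_terms(w, h, w_gt, h_gt, iou, eps=1e-9):
    """Return (v, alpha) from the CIoU formula for one box pair.

    eps is an assumed safeguard, not part of the formula: it avoids 0/0
    when iou == 1 and v == 0.
    """
    v = (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + eps)
    return v, alpha


# Same aspect ratio (2:1 vs 4:2): the aspect-ratio term vanishes
v_same, alpha_same = ciou_aspect_terms(2, 1, 4, 2, iou=0.5)

# Different aspect ratios (1:1 vs 3:1): v > 0, and alpha weights it
v_diff, alpha_diff = ciou_aspect_terms(1, 1, 3, 1, iou=0.5)

print(v_same, alpha_same)
print(v_diff, alpha_diff, "extra penalty:", alpha_diff * v_diff)
```

Since both arctangents lie in $(0, \pi/2)$ for positive widths and heights, $v$ always stays in $[0, 1)$, and $\alpha$ grows as IoU approaches 1, so the aspect-ratio term matters most once the boxes already overlap well.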

DIoU-NMS

Haven't tried this one yet, stay tuned...
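In the meantime, the idea as described earlier (use DIoU instead of IoU as the suppression criterion) could be sketched roughly as follows. This is an untested-in-practice illustration, not the paper's or any repo's implementation: the function names, the greedy loop and the default threshold are all assumptions, and boxes are taken to be xyxy NumPy arrays.

```python
import numpy as np


def diou_one_to_many(box, boxes):
    """DIoU between one xyxy box and an (N, 4) array of xyxy boxes."""
    iw = np.clip(np.minimum(box[2], boxes[:, 2]) - np.maximum(box[0], boxes[:, 0]), 0, None)
    ih = np.clip(np.minimum(box[3], boxes[:, 3]) - np.maximum(box[1], boxes[:, 1]), 0, None)
    inter = iw * ih
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area + areas - inter)
    # Squared center distance d^2 (centers are half the corner sums, hence / 4)
    d2 = (((box[0] + box[2]) - (boxes[:, 0] + boxes[:, 2])) ** 2
          + ((box[1] + box[3]) - (boxes[:, 1] + boxes[:, 3])) ** 2) / 4.0
    # Squared diagonal c^2 of the smallest enclosing box
    cw = np.maximum(box[2], boxes[:, 2]) - np.minimum(box[0], boxes[:, 0])
    ch = np.maximum(box[3], boxes[:, 3]) - np.minimum(box[1], boxes[:, 1])
    return iou - d2 / (cw ** 2 + ch ** 2)


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box when its DIoU with a kept box >= threshold."""
    order = np.argsort(-scores)  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        if rest.size == 0:
            break
        diou = diou_one_to_many(boxes[best], boxes[rest])
        order = rest[diou < threshold]  # keep only boxes not suppressed by `best`
    return keep


boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = diou_nms(boxes, scores, threshold=0.5)
print(kept)
```

With these sample boxes the second box overlaps the first heavily and has a nearly identical center, so it is suppressed, while the distant third box survives.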

Example

import numpy as np
import matplotlib.pyplot as plt
import math

epsilon = 1e-5

def IoU(box1, box2, wh=False):
    if wh:
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2

    # Width and height of the intersection
    W = min(xmax1, xmax2) - max(xmin1, xmin2)
    H = min(ymax1, ymax2) - max(ymin1, ymin2)

    # Areas of the two boxes
    SA = (xmax1 - xmin1) * (ymax1 - ymin1)
    SB = (xmax2 - xmin2) * (ymax2 - ymin2)

    cross = max(0, W) * max(0, H)  # intersection area
    iou = float(cross) / (SA + SB - cross)

    return iou

def GIoU(box1, box2, wh=False):
    if wh:
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2

    iou = IoU(box1, box2, wh)
    SC = (max(xmax1, xmax2) - min(xmin1, xmin2)) * (max(ymax1, ymax2) - min(ymin1, ymin2))

    # Width and height of the intersection
    W = min(xmax1, xmax2) - max(xmin1, xmin2)
    H = min(ymax1, ymax2) - max(ymin1, ymin2)

    # Areas of the two boxes
    SA = (xmax1 - xmin1) * (ymax1 - ymin1)
    SB = (xmax2 - xmin2) * (ymax2 - ymin2)

    cross = max(0, W) * max(0, H)  # intersection area

    add_area = SA + SB - cross  # union area of the two boxes

    end_area = (SC - add_area) / SC  # fraction of the enclosing box C covered by neither box
    giou = iou - end_area
    return giou

def DIoU(box1, box2, wh=False):
    if wh:
        inter_diag = (box1[0] - box2[0])**2 + (box1[1] - box2[1])**2
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2
        center_x1 = (xmax1 + xmin1) / 2
        center_y1 = (ymax1 + ymin1) / 2
        center_x2 = (xmax2 + xmin2) / 2
        center_y2 = (ymax2 + ymin2) / 2
        inter_diag = (center_x1 - center_x2) ** 2 + (center_y1 - center_y2) ** 2

    iou = IoU(box1, box2, wh)
    enclose1 = max(max(xmax1, xmax2)-min(xmin1, xmin2), 0.0)
    enclose2 = max(max(ymax1, ymax2)-min(ymin1, ymin2), 0.0)
    outer_diag = (enclose1 ** 2) + (enclose2 ** 2)
    diou = iou - 1.0 * inter_diag / outer_diag
    return diou

def CIoU(box1, box2, wh=False, normaled=False):
    if wh:
        w1, h1 = box1[2], box1[3]
        w2, h2 = box2[2], box2[3]
        inter_diag = (box1[0] - box2[0])**2 + (box1[1] - box2[1])**2
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2
        w1, h1 = xmax1-xmin1, ymax1-ymin1
        w2, h2 = xmax2-xmin2, ymax2-ymin2
        center_x1 = (xmax1 + xmin1) / 2
        center_y1 = (ymax1 + ymin1) / 2
        center_x2 = (xmax2 + xmin2) / 2
        center_y2 = (ymax2 + ymin2) / 2
        inter_diag = (center_x1 - center_x2) ** 2 + (center_y1 - center_y2) ** 2

    iou = IoU(box1, box2, wh)
    enclose1 = max(max(xmax1, xmax2)-min(xmin1, xmin2), 0.0)
    enclose2 = max(max(ymax1, ymax2)-min(ymin1, ymin2), 0.0)
    outer_diag = (enclose1 ** 2) + (enclose2 ** 2)
    u = (inter_diag) / outer_diag

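    arctan = math.atan(w2 / h2) - math.atan(w1 / h1)
    v = (4 / (math.pi ** 2)) * (math.atan(w2 / h2) - math.atan(w1 / h1))**2
    S = 1 - iou
    alpha = v / (S + v)
    # NOTE: like bbox_ciou_torch below, the aspect-ratio penalty is built from `ar`
    # (where w1 - w_temp == -w1) rather than from `v` as written in the CIoU formula.
    w_temp = 2 * w1
    distance = w1 ** 2 + h1 ** 2
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)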
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = np.clip(cious, a_min=-1.0, a_max=1.0)

    return cious


def bbox_giou_np(boxes1, boxes2):
    # xywh -> xyxy
    boxes1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                             boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                             boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = np.concatenate([np.minimum(boxes1[..., :2], boxes1[..., 2:]),
                             np.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = np.concatenate([np.minimum(boxes2[..., :2], boxes2[..., 2:]),
                             np.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = np.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = np.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = np.maximum(enclose_right_down - enclose_left_up, 0.0)
    # Area of the smallest enclosing box C
    enclose_area = enclose[..., 0] * enclose[..., 1]
    # GIoU according to the GIoU formula
    giou = iou - 1.0 * (enclose_area - union_area) / enclose_area

    return giou

# https://github.com/YunYang1994/TensorFlow2.0-Examples/blob/4d4a403d00e6e887ecb7229719b1407d2e132811/4-Object_Detection/YOLOV3/core/yolov3.py#L121
def bbox_giou_tf(boxes1, boxes2):
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
    # Area of the smallest enclosing box C
    enclose_area = enclose[..., 0] * enclose[..., 1]
    # GIoU according to the GIoU formula
    giou = iou - 1.0 * (enclose_area - union_area) / enclose_area

    return giou

def bbox_giou_torch(boxes1, boxes2):
    # boxes1, boxes2 = torch.tensor(boxes1, dtype=torch.float32), torch.tensor(boxes2, dtype=torch.float32)
    boxes1, boxes2 = torch.from_numpy(boxes1).float(), torch.from_numpy(boxes2).float()
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = torch.cat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], dim=-1)
    boxes2 = torch.cat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], dim=-1)

    boxes1 = torch.cat([torch.min(boxes1[..., :2], boxes1[..., 2:]),
                        torch.max(boxes1[..., :2], boxes1[..., 2:])], dim=-1)
    boxes2 = torch.cat([torch.min(boxes2[..., :2], boxes2[..., 2:]),
                        torch.max(boxes2[..., :2], boxes2[..., 2:])], dim=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = torch.max(boxes1[..., :2], boxes2[..., :2])
    right_down = torch.min(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = torch.max(right_down - left_up, torch.tensor(0.0))
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = torch.min(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = torch.max(boxes1[..., 2:], boxes2[..., 2:])
    enclose = torch.max(enclose_right_down - enclose_left_up, torch.tensor(0.0))
    # Area of the smallest enclosing box C
    enclose_area = enclose[..., 0] * enclose[..., 1]
    # GIoU according to the GIoU formula
    giou = iou - 1.0 * (enclose_area - union_area) / enclose_area

    return giou


# https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/65b68b53f73173397937d4950ff916a41545c960/utils/box/box_utils.py#L5
def bbox_diou_torch(bboxes1, bboxes2):
    bboxes1, bboxes2 = torch.from_numpy(bboxes1).float(), torch.from_numpy(bboxes2).float()
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    dious = torch.zeros((rows, cols))
    if rows * cols == 0:
        return dious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        dious = torch.zeros((cols, rows))
        exchange = True

    w1 = bboxes1[:, 2] - bboxes1[:, 0]
    h1 = bboxes1[:, 3] - bboxes1[:, 1]
    w2 = bboxes2[:, 2] - bboxes2[:, 0]
    h2 = bboxes2[:, 3] - bboxes2[:, 1]

    area1 = w1 * h1
    area2 = w2 * h2
    center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
    center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
    center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
    center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2

    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])

    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]  # intersection area
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
    union = area1 + area2 - inter_area  # union area
    dious = inter_area / union - (inter_diag) / outer_diag
    dious = torch.clamp(dious, min=-1.0, max=1.0)
    if exchange:
        dious = dious.T
    return dious

def bbox_diou_np(boxes1, boxes2, normaled=False):
    inter_diag = np.sum(np.square(boxes1[..., :2] - boxes2[..., :2]), axis=1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                             boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                             boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = np.concatenate([np.minimum(boxes1[..., :2], boxes1[..., 2:]),
                             np.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = np.concatenate([np.minimum(boxes2[..., :2], boxes2[..., 2:]),
                             np.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = np.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = np.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = np.maximum(enclose_right_down - enclose_left_up, 0.0)
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    # DIoU according to the DIoU formula
    diou = iou - 1.0 * inter_diag / outer_diag
    diou = np.clip(diou, a_min=-1.0, a_max=1.0)

    return diou

def bbox_diou_tf(boxes1, boxes2):
    inter_diag = tf.reduce_sum(tf.square(boxes1[..., :2] - boxes2[..., :2]), axis=1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    # DIoU according to the DIoU formula
    diou = iou - 1.0 * inter_diag / outer_diag
    diou = tf.clip_by_value(diou, clip_value_min=-1.0, clip_value_max=1.0)

    return diou


# https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/65b68b53f73173397937d4950ff916a41545c960/utils/box/box_utils.py#L47
def bbox_ciou_torch(bboxes1, bboxes2, normaled=False):
    bboxes1, bboxes2 = torch.from_numpy(bboxes1).float(), torch.from_numpy(bboxes2).float()
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    cious = torch.zeros((rows, cols))
    if rows * cols == 0:
        return cious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        cious = torch.zeros((cols, rows))
        exchange = True

    w1 = bboxes1[:, 2] - bboxes1[:, 0]
    h1 = bboxes1[:, 3] - bboxes1[:, 1]
    w2 = bboxes2[:, 2] - bboxes2[:, 0]
    h2 = bboxes2[:, 3] - bboxes2[:, 1]

    area1 = w1 * h1
    area2 = w2 * h2

    center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
    center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
    center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
    center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2

    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])

    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
    union = area1 + area2 - inter_area
    u = (inter_diag) / outer_diag
    iou = inter_area / union
    with torch.no_grad():
        arctan = torch.atan(w2 / h2) - torch.atan(w1 / h1)
        v = (4 / (math.pi ** 2)) * torch.pow((torch.atan(w2 / h2) - torch.atan(w1 / h1)), 2)
        S = 1 - iou
        alpha = v / (S + v)
        w_temp = 2 * w1
        distance = w1 ** 2 + h1 ** 2
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = torch.clamp(cious, min=-1.0, max=1.0)
    if exchange:
        cious = cious.T
    return cious

def bbox_ciou_np(boxes1, boxes2, normaled=False):
    w1, h1 = boxes1[..., 2], boxes1[..., 3]
    w2, h2 = boxes2[..., 2], boxes2[..., 3]
    inter_diag = np.sum(np.square(boxes1[..., :2] - boxes2[..., :2]), axis=-1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                             boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                             boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = np.concatenate([np.minimum(boxes1[..., :2], boxes1[..., 2:]),
                             np.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = np.concatenate([np.minimum(boxes2[..., :2], boxes2[..., 2:]),
                             np.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = np.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = np.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = np.maximum(enclose_right_down - enclose_left_up, 0.0)
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    u = (inter_diag) / outer_diag
    # CIoU according to the CIoU formula
    arctan = np.arctan(w2 / h2) - np.arctan(w1 / h1)
    v = (4 / (math.pi ** 2)) * np.square(np.arctan(w2 / h2) - np.arctan(w1 / h1))
    S = 1 - iou
    alpha = v / (S + v)
    w_temp = 2 * w1
    distance = w1 ** 2 + h1 ** 2
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = np.clip(cious, a_min=-1.0, a_max=1.0)

    return cious

def bbox_ciou_tf(boxes1, boxes2, normaled=False):
    w1, h1 = boxes1[..., 2], boxes1[..., 3]
    w2, h2 = boxes2[..., 2], boxes2[..., 3]
    inter_diag = tf.reduce_sum(tf.square(boxes1[..., :2] - boxes2[..., :2]), axis=-1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    u = (inter_diag) / outer_diag
    # CIoU according to the CIoU formula; epsilon guards against division by zero in w / h
    arctan = tf.atan(w2 / (h2 + epsilon)) - tf.atan(w1 / (h1 + epsilon))
    v = (4 / (math.pi ** 2)) * tf.square(tf.atan(w2 / (h2 + epsilon)) - tf.atan(w1 / (h1 + epsilon)))
    S = 1 - iou
    alpha = tf.stop_gradient(v / (S + v))
    w_temp = tf.stop_gradient(2 * w1)
    distance = tf.stop_gradient(w1 ** 2 + h1 ** 2 + epsilon)
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = tf.clip_by_value(cious, clip_value_min=-1.0, clip_value_max=1.0)

    return cious


img_width = 480.0
img_height = 320.0
gt_bboxes_xyxy = np.array([[50, 40, 200, 200], [270, 70, 400, 180]])  # xyxy
pre_bboxes_xyxy = np.array([[100, 100, 250, 300], [400, 180, 460, 300]])  # xyxy

gt_bboxes_xyxy_nomal = np.zeros(shape=gt_bboxes_xyxy.shape, dtype=float)  # np.float was removed from NumPy
pre_bboxes_xyxy_nomal = np.zeros(shape=pre_bboxes_xyxy.shape, dtype=float)
gt_bboxes_xyxy_nomal[..., 0::2] = gt_bboxes_xyxy[..., 0::2] / img_width
gt_bboxes_xyxy_nomal[..., 1::2] = gt_bboxes_xyxy[..., 1::2] / img_height
pre_bboxes_xyxy_nomal[..., 0::2] = pre_bboxes_xyxy[..., 0::2] / img_width
pre_bboxes_xyxy_nomal[..., 1::2] = pre_bboxes_xyxy[..., 1::2] / img_height

gt_bboxes_xywh = np.array([[125, 120, 150, 160], [335, 125, 130, 110]])  # xywh
pre_bboxes_xywh = np.array([[175, 200, 150, 200], [430, 240, 60, 120]])  # xywh

gt_bboxes_xywh_nomal = np.zeros(shape=gt_bboxes_xywh.shape, dtype=float)  # np.float was removed from NumPy
pre_bboxes_xywh_nomal = np.zeros(shape=pre_bboxes_xywh.shape, dtype=float)
gt_bboxes_xywh_nomal[..., 0::2] = gt_bboxes_xywh[..., 0::2] / img_width
gt_bboxes_xywh_nomal[..., 1::2] = gt_bboxes_xywh[..., 1::2] / img_height
pre_bboxes_xywh_nomal[..., 0::2] = pre_bboxes_xywh[..., 0::2] / img_width
pre_bboxes_xywh_nomal[..., 1::2] = pre_bboxes_xywh[..., 1::2] / img_height

# ================================================================ #
fig = plt.figure()
ax = fig.add_subplot(111)
currentAxis = plt.gca()
for idx, (gt, pt) in enumerate(zip(gt_bboxes_xywh, pre_bboxes_xywh)):
    iou = IoU(gt, pt, True)
    giou = GIoU(gt, pt, True)
    diou = DIoU(gt, pt, True)
    ciou = CIoU(gt, pt, True)
    currentAxis.text(gt[0] - gt[2] / 2, 20, 'iou={:.4f}, giou={:.4f}'.format(iou, giou),
                     bbox={'facecolor': 'yellow', 'alpha': 0.5})
    currentAxis.text(gt[0] - gt[2] / 2, gt[1] + gt[3] / 2 + 20, 'diou={:.4f}, ciou={:.4f}'.format(diou, ciou),
                     bbox={'facecolor': 'yellow', 'alpha': 0.5})
    currentAxis.add_patch(plt.Rectangle((gt[0]-gt[2]/2,gt[1]-gt[3]/2),gt[2],gt[3],
                                        fill=False, edgecolor='green', linewidth=2))
    currentAxis.text(gt[0]-gt[2]/2,gt[1]-gt[3]/2, 'g{}'.format(idx), bbox={'facecolor': 'green', 'alpha': 0.5})
    currentAxis.add_patch(plt.Rectangle((pt[0]-pt[2]/2, pt[1]-pt[3]/2), pt[2], pt[3],
                                        fill=False, edgecolor='red', linewidth=2))
    currentAxis.text(pt[0]-pt[2]/2, pt[1]-pt[3]/2, 'p{}'.format(idx), bbox={'facecolor': 'red', 'alpha': 0.5})


plt.xticks(np.arange(0, img_width+1, 40))
plt.yticks(np.arange(0, img_height+1, 40))
currentAxis.invert_yaxis()
plt.show()

# ================================================================ #

# NOTE: the TensorFlow parts below use the 1.x graph API (tf.placeholder, tf.Session)
import tensorflow as tf
import torch

label_bbox = tf.placeholder(dtype=tf.float32, name='label_bbox')
predic_bbox = tf.placeholder(dtype=tf.float32, name='predic_bbox')
label_bbox_normal = tf.placeholder(dtype=tf.float32, name='label_bbox_normal')
predic_bbox_normal = tf.placeholder(dtype=tf.float32, name='predic_bbox_normal')

# ================================================================ #
#                               GIoU                               #
# ================================================================ #
gious = np.expand_dims(bbox_giou_np(gt_bboxes_xywh, pre_bboxes_xywh), axis=-1)
print('numpy publish giou:                ', gious)
# ================================================================ #
gious = tf.expand_dims(bbox_giou_tf(predic_bbox, label_bbox), axis=-1)

with tf.Session() as sess:
    result = sess.run(gious, feed_dict={label_bbox: gt_bboxes_xywh,
                                       predic_bbox: pre_bboxes_xywh}
                      )
    print('tensorflow publish giou:           ', result)
# ================================================================ #
gious = bbox_giou_torch(gt_bboxes_xywh, pre_bboxes_xywh).unsqueeze(-1)
print('pytorch publish goiu:              ', gious.numpy())

# ================================================================ #
#                               DIoU                               #
# ================================================================ #
dious = np.expand_dims(bbox_diou_np(gt_bboxes_xywh, pre_bboxes_xywh), axis=-1)
print('numpy publish diou :               ', dious)
# ================================================================
dious = bbox_diou_torch(gt_bboxes_xyxy, pre_bboxes_xyxy).unsqueeze(-1)
print('pytorch publish diou:              ', dious.numpy())
# ================================================================
label_bbox = tf.placeholder(dtype=tf.float32, name='label_bbox')
predic_bbox = tf.placeholder(dtype=tf.float32, name='predic_bbox')
dious = tf.expand_dims(bbox_diou_tf(label_bbox, predic_bbox), axis=-1)
with tf.Session() as sess:
    result = sess.run(dious, feed_dict={label_bbox: gt_bboxes_xywh,
                                       predic_bbox: pre_bboxes_xywh})
    print('tensorflow publish diou:           ', result)

# ================================================================ #
#                               CIoU                               #
# ================================================================ #
cious = bbox_ciou_torch(gt_bboxes_xyxy, pre_bboxes_xyxy, False).unsqueeze(-1)
print('pytorch publish ciou unnormaled:   ', cious.numpy())

cious = bbox_ciou_torch(gt_bboxes_xyxy_nomal, pre_bboxes_xyxy_nomal, True).unsqueeze(-1)
print('pytorch publish ciou normaled:     ', cious.numpy())
# ================================================================ #
cious = np.expand_dims(bbox_ciou_np(gt_bboxes_xywh, pre_bboxes_xywh, False), axis=-1)
print('numpy publish ciou unnormaled:     ', cious)

cious = np.expand_dims(bbox_ciou_np(gt_bboxes_xywh_nomal, pre_bboxes_xywh_nomal, True), axis=-1)
print('numpy publish ciou normaled:       ', cious)
# ================================================================ #
cious = tf.expand_dims(bbox_ciou_tf(label_bbox, predic_bbox, False), axis=-1)
cious_normal = tf.expand_dims(bbox_ciou_tf(label_bbox_normal, predic_bbox_normal, True), axis=-1)
with tf.Session() as sess:
    cious_tf, cious_tf_normal = sess.run([cious, cious_normal],
                                                  feed_dict={label_bbox_normal: gt_bboxes_xywh_nomal,
                                                             predic_bbox_normal: pre_bboxes_xywh_nomal,
                                                             label_bbox: gt_bboxes_xywh,
                                                             predic_bbox: pre_bboxes_xywh})
    print('tensorflow publish ciou unnormaled:', cious_tf)
    print('tensorflow publish ciou normaled:  ', cious_tf_normal)
# ================================================================ #


numpy publish giou:                 [[ 0.07342657]
 [-0.50800915]]
tensorflow publish giou:            [[ 0.07342657]
 [-0.50800914]]
pytorch publish goiu:               [[ 0.07342657]
 [-0.50800914]]
numpy publish diou :                [[ 0.14455897]
 [-0.25      ]]
pytorch publish diou:               [[ 0.14455898]
 [-0.25      ]]
tensorflow publish diou:            [[ 0.14455898]
 [-0.25      ]]
pytorch publish ciou unnormaled:    [[ 0.14428109]
 [-0.2600825 ]]
pytorch publish ciou normaled:      [[ 0.1392411 ]
 [-0.25120372]]
numpy publish ciou unnormaled:      [[ 0.14428107]
 [-0.26008251]]
numpy publish ciou normaled:        [[ 0.13924112]
 [-0.25120372]]
tensorflow publish ciou unnormaled: [[ 0.14428109]
 [-0.2600825 ]]
tensorflow publish ciou normaled:   [[ 0.13924108]
 [-0.25120363]]

A colleague's experiments gave:

method	GIoU	DIoU	CIoU
mAP	81.37%	81.46%	82.36%
