[Deep Learning] Gradient Computation (Matrix and Vector Derivatives)


0. Shapes of derivatives between scalars, vectors, and matrices

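Derivative shapes for scalars, vectors, and matrices (rows: the quantity being differentiated; columns: the variable it is differentiated with respect to)

| | scalar $x$ (1,) | vector $\textbf x$ (n,1) | matrix $\textbf X$ (n,k) |
| --- | --- | --- | --- |
| scalar $y$ (1,) | $\frac{\partial y}{\partial x}$ (1,) | $\frac{\partial y}{\partial\textbf x}$ (1,n) | $\frac{\partial y}{\partial\textbf X}$ (k,n) |
| vector $\textbf y$ (m,1) | $\frac{\partial\textbf y}{\partial x}$ (m,1) | $\frac{\partial\textbf y}{\partial\textbf x}$ (m,n) | $\frac{\partial\textbf y}{\partial\textbf X}$ (m,k,n) |
| matrix $\textbf Y$ (m,l) | $\frac{\partial\textbf Y}{\partial x}$ (m,l) | $\frac{\partial\textbf Y}{\partial\textbf x}$ (m,l,n) | $\frac{\partial\textbf Y}{\partial\textbf X}$ (m,l,k,n) |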

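PS: Column vectors and numerator layout are assumed throughout (the numerator keeps its shape, the denominator contributes its shape transposed).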
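The shape rule in the table can be sketched as a small function. This is only an illustration: `derivative_shape` is a hypothetical helper, not a library API, and it assumes the table's notation, where `(1,)` is a scalar, `(m, 1)` a column vector, and `(m, l)` a matrix.

```python
def derivative_shape(y_shape, x_shape):
    """Return the shape of dy/dx under the numerator-layout table above.

    Hypothetical helper: (1,) is a scalar, (m, 1) a column vector,
    and (m, l) a matrix, following the table's notation.
    """
    def kind(shape):
        if shape == (1,):
            return "scalar"
        if len(shape) == 2 and shape[1] == 1:
            return "vector"
        if len(shape) == 2:
            return "matrix"
        raise ValueError(f"unsupported shape: {shape}")

    y_kind, x_kind = kind(y_shape), kind(x_shape)
    # The numerator keeps its own dimensions.
    num = {"scalar": (), "vector": y_shape[:1], "matrix": y_shape}[y_kind]
    # The denominator contributes its dimensions transposed (reversed).
    den = {"scalar": (), "vector": x_shape[:1], "matrix": x_shape[::-1]}[x_kind]
    out = num + den
    if len(out) == 0:
        # Scalar by scalar.
        return (1,)
    if len(out) == 1:
        # Keep vectors two-dimensional: a column for dy/dx with vector y,
        # a row for dy/dx with vector x.
        return out + (1,) if y_kind == "vector" else (1,) + out
    return out


m, l, n, k = 2, 3, 4, 5
print(derivative_shape((1,), (n, 1)))    # (1, 4)
print(derivative_shape((m, 1), (n, k)))  # (2, 5, 4)
print(derivative_shape((m, l), (n, k)))  # (2, 3, 5, 4)
```

In this sketch the result is always the numerator's dimensions followed by the denominator's dimensions in reverse order, which is one way to remember every cell of the table.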

1. Derivative of a scalar with respect to a vector $\frac{\partial y}{\partial\textbf x}$

  $\textbf x\left ( n,1 \right )= \begin{bmatrix}
x_{1}\\
x_{2}\\
\vdots\\
x_{n}
\end{bmatrix}$, and $y$ is a scalar,

  $\frac{\partial y}{\partial\textbf x}\left ( 1,n \right )= \begin{bmatrix}
\frac{\partial y}{\partial x_{1}} & \frac{\partial y}{\partial x_{2}} & \cdots & \frac{\partial y}{\partial x_{n}}
\end{bmatrix}$

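  PS: Differentiating a scalar with respect to a column vector gives a row vector: the scalar is differentiated with respect to each element of the vector.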
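A minimal numerical sketch of this case, assuming the illustrative function $y=\sum_i x_i^2$ (not from the original) and central finite differences in place of symbolic differentiation. The helper name `grad_row` is hypothetical.

```python
def grad_row(f, x, eps=1e-6):
    """Approximate dy/dx for a scalar-valued f as a (1, n) nested list."""
    row = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        # Central difference for the partial derivative dy/dx_i.
        row.append((f(hi) - f(lo)) / (2 * eps))
    return [row]  # one row, n columns


def y(x):
    # Illustrative scalar function: the sum of squares of the elements.
    return sum(xi * xi for xi in x)


x = [1.0, 2.0, 3.0]  # stands in for a column vector with n = 3
g = grad_row(y, x)
print(len(g), len(g[0]))            # 1 3
print([round(v, 4) for v in g[0]])  # [2.0, 4.0, 6.0]
```

The single row matches $2\textbf x^{\top}$, the numerator-layout derivative of the sum of squares.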

2. Derivative of a vector with respect to a scalar $\frac{\partial\textbf y}{\partial x}$

  $x$ is a scalar, $\textbf y\left ( m,1 \right )= \begin{bmatrix}
y_{1}\\
y_{2}\\
\vdots\\
y_{m}
\end{bmatrix}$,

  $\frac{\partial\textbf y}{\partial x}\left ( m,1 \right )= \begin{bmatrix}
\frac{\partial y_{1}}{\partial x}\\
\frac{\partial y_{2}}{\partial x}\\
\vdots\\
\frac{\partial y_{m}}{\partial x}
\end{bmatrix}$

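  PS: Differentiating a vector with respect to a scalar keeps the vector's shape: each element of the vector is differentiated with respect to the scalar.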
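As a small worked example (the choice of $\textbf y$ is illustrative, not from the original), take $m=3$ with components $x$, $x^{2}$, $x^{3}$:

  $\textbf y=\begin{bmatrix}
x\\
x^{2}\\
x^{3}
\end{bmatrix},\quad
\frac{\partial\textbf y}{\partial x}\left ( 3,1 \right )=\begin{bmatrix}
1\\
2x\\
3x^{2}
\end{bmatrix}$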

3. Derivative of a vector with respect to a vector $\frac{\partial\textbf y}{\partial\textbf x}$

  $\textbf x\left ( n,1 \right )= \begin{bmatrix}
x_{1}\\
x_{2}\\
\vdots\\
x_{n}
\end{bmatrix}$, $\textbf y\left ( m,1 \right )= \begin{bmatrix}
y_{1}\\
y_{2}\\
\vdots\\
y_{m}
\end{bmatrix}$,

  $\frac{\partial\textbf y}{\partial\textbf x}\left ( m,n \right )=
\begin{bmatrix}
\frac{\partial y_{1}}{\partial\textbf x}\\
\frac{\partial y_{2}}{\partial\textbf x}\\
\vdots\\
\frac{\partial y_{m}}{\partial\textbf x}
\end{bmatrix} =
\begin{bmatrix}
\frac{\partial y_{1}}{\partial x_{1}} & \frac{\partial y_{1}}{\partial x_{2}} & \cdots & \frac{\partial y_{1}}{\partial x_{n}}\\
\frac{\partial y_{2}}{\partial x_{1}} & \frac{\partial y_{2}}{\partial x_{2}} & \cdots & \frac{\partial y_{2}}{\partial x_{n}}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial y_{m}}{\partial x_{1}} & \frac{\partial y_{m}}{\partial x_{2}} & \cdots & \frac{\partial y_{m}}{\partial x_{n}}
\end{bmatrix}$

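  PS: The derivative of a vector with respect to a vector is a matrix (the Jacobian). Think of it as a column of scalars, each differentiated with respect to the vector, so row $i$ is the row vector $\frac{\partial y_{i}}{\partial\textbf x}$.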
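A numerical sketch of the $(m,n)$ shape, assuming the illustrative linear map $\textbf y=\textbf A\textbf x$ (the matrix `A` and the helper `jacobian` are assumptions, not from the original). For this map the derivative should come out equal to $\textbf A$ itself.

```python
def jacobian(f, x, eps=1e-6):
    """Approximate dy/dx as an (m, n) nested list; row i is dy_i/dx."""
    m = len(f(x))
    jac = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[j] += eps
        lo[j] -= eps
        f_hi, f_lo = f(hi), f(lo)
        for i in range(m):
            # Central difference for dy_i/dx_j.
            jac[i][j] = (f_hi[i] - f_lo[i]) / (2 * eps)
    return jac


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]  # illustrative constants, m = 2 and n = 3


def y(x):
    # Matrix-vector product A x written with plain lists.
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


J = jacobian(y, [0.5, -1.0, 2.0])
print(len(J), len(J[0]))                          # 2 3
print([[round(v, 4) for v in row] for row in J])  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```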

4. Derivative of a scalar with respect to a matrix $\frac{\partial y}{\partial\textbf X}$

  $\textbf X\left ( n,k \right )=
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k}\\
x_{21} & x_{22} & \cdots & x_{2k}\\
\vdots & \vdots & \ddots & \vdots\\
x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}$, and $y$ is a scalar,

  $\frac{\partial y}{\partial\textbf X}\left ( k,n \right )=
\begin{bmatrix}
\frac{\partial y}{\partial\textbf x_{:,1}} \\ \frac{\partial y}{\partial\textbf x_{:,2}} \\ \vdots \\ \frac{\partial y}{\partial\textbf x_{:,k}}
\end{bmatrix}=
\begin{bmatrix}
\frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{n1}}\\
\frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{n2}}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial y}{\partial x_{1k}} & \frac{\partial y}{\partial x_{2k}} & \cdots & \frac{\partial y}{\partial x_{nk}}
\end{bmatrix}$

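  PS: The derivative of a scalar with respect to a matrix has the shape of the transposed matrix. Think of it as differentiating the scalar with respect to each of the $k$ column vectors of $\textbf X$; each gives a $(1,n)$ row vector, and the $k$ rows are stacked.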
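As a worked example, assume constant column vectors $\textbf a\left ( n,1 \right )$ and $\textbf b\left ( k,1 \right )$ (illustrative symbols, not from the original) and the bilinear form $y=\textbf a^{\top}\textbf X\textbf b$:

  $y=\sum_{i=1}^{n}\sum_{j=1}^{k}a_{i}x_{ij}b_{j},\quad
\frac{\partial y}{\partial x_{ij}}=a_{i}b_{j}$

Numerator layout places $\frac{\partial y}{\partial x_{ij}}$ in row $j$, column $i$, so

  $\frac{\partial y}{\partial\textbf X}\left ( k,n \right )=
\begin{bmatrix}
a_{1}b_{1} & a_{2}b_{1} & \cdots & a_{n}b_{1}\\
a_{1}b_{2} & a_{2}b_{2} & \cdots & a_{n}b_{2}\\
\vdots & \vdots & \ddots & \vdots\\
a_{1}b_{k} & a_{2}b_{k} & \cdots & a_{n}b_{k}
\end{bmatrix}=\textbf b\textbf a^{\top}$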

5. Derivative of a matrix with respect to a scalar $\frac{\partial\textbf Y}{\partial x}$

  $x$ is a scalar, $\textbf Y\left ( m,l \right )=
\begin{bmatrix}
y_{11} & y_{12} & \cdots & y_{1l}\\
y_{21} & y_{22} & \cdots & y_{2l}\\
\vdots & \vdots & \ddots & \vdots\\
y_{m1} & y_{m2} & \cdots & y_{ml}
\end{bmatrix}$,

  $\frac{\partial\textbf Y}{\partial x}\left ( m,l \right )=
\begin{bmatrix}
\frac{\partial\textbf y_{:,1}}{\partial x} & \frac{\partial\textbf y_{:,2}}{\partial x} & \cdots & \frac{\partial\textbf y_{:,l}}{\partial x}
\end{bmatrix}=
\begin{bmatrix}
\frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1l}}{\partial x}\\
\frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2l}}{\partial x}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{ml}}{\partial x}
\end{bmatrix}$

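  PS: Differentiating a matrix with respect to a scalar keeps the matrix's shape. Think of it as differentiating each of the $l$ column vectors with respect to the scalar.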
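A small worked example with $m=l=2$ (the entries of $\textbf Y$ are chosen only for illustration):

  $\textbf Y=\begin{bmatrix}
x & x^{2}\\
\sin x & e^{x}
\end{bmatrix},\quad
\frac{\partial\textbf Y}{\partial x}\left ( 2,2 \right )=\begin{bmatrix}
1 & 2x\\
\cos x & e^{x}
\end{bmatrix}$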

6. Derivative of a vector with respect to a matrix $\frac{\partial\textbf y}{\partial\textbf X}$

  $\textbf X\left ( n,k \right )=
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k}\\
x_{21} & x_{22} & \cdots & x_{2k}\\
\vdots & \vdots & \ddots & \vdots\\
x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}$, $\textbf y\left ( m,1 \right )= \begin{bmatrix}
y_{1}\\
y_{2}\\
\vdots\\
y_{m}
\end{bmatrix}$,

  $\frac{\partial\textbf y}{\partial\textbf X}\left ( m,k,n \right )=
\begin{bmatrix}\frac{\partial y_{1}}{\partial\textbf X}\end{bmatrix},\begin{bmatrix}\frac{\partial y_{2}}{\partial\textbf X}\end{bmatrix},\cdots,\begin{bmatrix}\frac{\partial y_{m}}{\partial\textbf X}\end{bmatrix}=\begin{bmatrix}
\frac{\partial y_{1}}{\partial x_{11}} & \frac{\partial y_{1}}{\partial x_{21}} & \cdots & \frac{\partial y_{1}}{\partial x_{n1}}\\
\frac{\partial y_{1}}{\partial x_{12}} & \frac{\partial y_{1}}{\partial x_{22}} & \cdots & \frac{\partial y_{1}}{\partial x_{n2}}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial y_{1}}{\partial x_{1k}} & \frac{\partial y_{1}}{\partial x_{2k}} & \cdots & \frac{\partial y_{1}}{\partial x_{nk}}
\end{bmatrix},\begin{bmatrix}
\frac{\partial y_{2}}{\partial x_{11}} & \frac{\partial y_{2}}{\partial x_{21}} & \cdots & \frac{\partial y_{2}}{\partial x_{n1}}\\
\frac{\partial y_{2}}{\partial x_{12}} & \frac{\partial y_{2}}{\partial x_{22}} & \cdots & \frac{\partial y_{2}}{\partial x_{n2}}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial y_{2}}{\partial x_{1k}} & \frac{\partial y_{2}}{\partial x_{2k}} & \cdots & \frac{\partial y_{2}}{\partial x_{nk}}
\end{bmatrix},\cdots,\begin{bmatrix}
\frac{\partial y_{m}}{\partial x_{11}} & \frac{\partial y_{m}}{\partial x_{21}} & \cdots & \frac{\partial y_{m}}{\partial x_{n1}}\\
\frac{\partial y_{m}}{\partial x_{12}} & \frac{\partial y_{m}}{\partial x_{22}} & \cdots & \frac{\partial y_{m}}{\partial x_{n2}}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial y_{m}}{\partial x_{1k}} & \frac{\partial y_{m}}{\partial x_{2k}} & \cdots & \frac{\partial y_{m}}{\partial x_{nk}}
\end{bmatrix}$

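  PS: The derivative of a vector with respect to a matrix is a 3-D array. Think of it as differentiating each element of $\textbf y$ (a scalar) with respect to the matrix; the result is a stack of $m$ matrices, each of shape $k\times n$.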
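A numerical sketch of the $(m,k,n)$ shape, assuming the illustrative function $\textbf y=\textbf A\textbf X\textbf b$ with constant `A` of shape $(m,n)$ and `b` of length $k$ (both are assumptions, as is the helper name `vector_by_matrix`). For this function $\frac{\partial y_{i}}{\partial x_{pq}}=a_{ip}b_{q}$, which the finite differences should reproduce.

```python
def vector_by_matrix(f, X, eps=1e-6):
    """Approximate dy/dX as a nested list T with shape (m, k, n).

    T[i][q][p] holds dy_i/dx_pq, so slice T[i] is the (k, n)
    numerator-layout derivative of the scalar y_i with respect to X.
    """
    n, k = len(X), len(X[0])
    m = len(f(X))
    T = [[[0.0] * n for _ in range(k)] for _ in range(m)]
    for p in range(n):
        for q in range(k):
            hi = [row[:] for row in X]
            lo = [row[:] for row in X]
            hi[p][q] += eps
            lo[p][q] -= eps
            f_hi, f_lo = f(hi), f(lo)
            for i in range(m):
                # Central difference, stored with the X indices swapped.
                T[i][q][p] = (f_hi[i] - f_lo[i]) / (2 * eps)
    return T


A = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 1.0]]      # illustrative constants, shape (m, n) = (2, 3)
b = [1.0, -1.0, 2.0, 0.5]  # illustrative constants, length k = 4


def y(X):
    Xb = [sum(x * bq for x, bq in zip(row, b)) for row in X]   # X b, length n
    return [sum(a * v for a, v in zip(row, Xb)) for row in A]  # A (X b), length m


X = [[0.1 * (p + q) for q in range(4)] for p in range(3)]  # shape (n, k) = (3, 4)
T = vector_by_matrix(y, X)
print(len(T), len(T[0]), len(T[0][0]))  # 2 4 3
```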

7. Derivative of a matrix with respect to a vector $\frac{\partial\textbf Y}{\partial\textbf x}$

  $\textbf x\left ( n,1 \right )=\begin{bmatrix}
x_{1}\\
x_{2}\\
\vdots\\
x_{n}
\end{bmatrix}$, $\textbf Y\left ( m,l \right )=
\begin{bmatrix}
y_{11} & y_{12} & \cdots & y_{1l}\\
y_{21} & y_{22} & \cdots & y_{2l}\\
\vdots & \vdots & \ddots & \vdots\\
y_{m1} & y_{m2} & \cdots & y_{ml}
\end{bmatrix}$,

  $\frac{\partial\textbf Y}{\partial\textbf x}\left ( m,l,n \right )=\begin{bmatrix}
\frac{\partial y_{11}}{\partial\textbf x} & \frac{\partial y_{12}}{\partial\textbf x} & \cdots & \frac{\partial y_{1l}}{\partial\textbf x}\\
\frac{\partial y_{21}}{\partial\textbf x} & \frac{\partial y_{22}}{\partial\textbf x} & \cdots & \frac{\partial y_{2l}}{\partial\textbf x}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial y_{m1}}{\partial\textbf x} & \frac{\partial y_{m2}}{\partial\textbf x} & \cdots & \frac{\partial y_{ml}}{\partial\textbf x}
\end{bmatrix}$

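  PS: The derivative of a matrix with respect to a vector is a 3-D array. I have not fully worked this case out, but the shape follows the same pattern as above: each entry $\frac{\partial y_{ij}}{\partial\textbf x}$ is a $(1,n)$ row vector, and there are $m\times l$ such entries, giving $(m,l,n)$.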
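A small worked example may make the 3-D shape concrete. Assume $n=2$ and the outer product $\textbf Y=\textbf x\textbf x^{\top}$ (an illustrative choice, not from the original), so $m=l=2$:

  $\textbf x=\begin{bmatrix}
x_{1}\\
x_{2}
\end{bmatrix},\quad
\textbf Y=\textbf x\textbf x^{\top}=\begin{bmatrix}
x_{1}^{2} & x_{1}x_{2}\\
x_{1}x_{2} & x_{2}^{2}
\end{bmatrix}$

Each entry of $\textbf Y$ is a scalar, so its derivative with respect to $\textbf x$ is a $(1,2)$ row vector:

  $\frac{\partial\textbf Y}{\partial\textbf x}\left ( 2,2,2 \right )=\begin{bmatrix}
\begin{bmatrix}2x_{1} & 0\end{bmatrix} & \begin{bmatrix}x_{2} & x_{1}\end{bmatrix}\\
\begin{bmatrix}x_{2} & x_{1}\end{bmatrix} & \begin{bmatrix}0 & 2x_{2}\end{bmatrix}
\end{bmatrix}$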

8. Derivative of a matrix with respect to a matrix $\frac{\partial\textbf Y}{\partial\textbf X}$

  $\textbf X\left ( n,k \right )=
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k}\\
x_{21} & x_{22} & \cdots & x_{2k}\\
\vdots & \vdots & \ddots & \vdots\\
x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}$, $\textbf Y\left ( m,l \right )=
\begin{bmatrix}
y_{11} & y_{12} & \cdots & y_{1l}\\
y_{21} & y_{22} & \cdots & y_{2l}\\
\vdots & \vdots & \ddots & \vdots\\
y_{m1} & y_{m2} & \cdots & y_{ml}
\end{bmatrix}$,

  PS: To be written once I have worked it out.

