TensorFlow tf.gradients: Detailed Explanation with Concrete Examples


tf.gradients

Official definition:

tf.gradients(
    ys,
    xs,
    grad_ys=None,
    name='gradients',
    stop_gradients=None,
)

Constructs symbolic derivatives of sum of ys w.r.t. x in xs.   

ys and xs are each a Tensor or a list of tensors. grad_ys is a list of Tensor, holding the gradients received by the ys. The list must be the same length as ys.

gradients() adds ops to the graph to output the derivatives of ys with respect to xs. It returns a list of Tensor of length len(xs) where each tensor is the sum(dy/dx) for y in ys.

grad_ys is a list of tensors of the same length as ys that holds the initial gradients for each y in ys. When grad_ys is None, we fill in a tensor of '1's of the shape of y for each y in ys. A user can provide their own initial grad_ys to compute the derivatives using a different initial gradient for each y (e.g., if one wanted to weight the gradient differently for each value in each y).

stop_gradients is a Tensor or a list of tensors to be considered constant with respect to all xs. These tensors will not be backpropagated through, as though they had been explicitly disconnected using stop_gradient. Among other things, this allows computation of partial derivatives as opposed to total derivatives.

In plain terms:

1. xs and ys can each be a single tensor or a list of tensors. tf.gradients(ys, xs) computes the derivative of ys with respect to xs: if ys is a list, the derivative of the sum of all its elements is taken; if xs is a list, the derivative is computed separately with respect to each element. The return value is a list of the same length as xs.

For example, with ys=[y1,y2,y3] and xs=[x1,x2,x3,x4], tf.gradients(ys,xs) = [d(y1+y2+y3)/dx1, d(y1+y2+y3)/dx2, d(y1+y2+y3)/dx3, d(y1+y2+y3)/dx4]. See lines 16-17 of the numbered example code below.

2. grad_ys is a list of weights for ys, with the same length as ys. With grad_ys=[g1,g2,g3], tf.gradients(ys,xs,grad_ys) = [d(g1*y1+g2*y2+g3*y3)/dx1, d(g1*y1+g2*y2+g3*y3)/dx2, d(g1*y1+g2*y2+g3*y3)/dx3, d(g1*y1+g2*y2+g3*y3)/dx4]. See lines 19-21 of the numbered example code below.

3. stop_gradients makes the specified tensors be treated as constants with respect to the differentiation, i.e., they are not backpropagated through. A minimal sketch, adapted from the example in the official documentation, follows below.
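This sketch mirrors the stop_gradients example from the official documentation. With b = 2 * a, stopping gradients at a and b yields the partial derivatives [1.0, 1.0], whereas the unrestricted call returns the total derivatives [3.0, 1.0], because a + b = 3a once the dependence of b on a is followed:

import tensorflow as tf

a = tf.constant(0.)
b = 2 * a

# Treat a and b as constants: partial derivatives of (a + b)
g = tf.gradients(a + b, [a, b], stop_gradients=[a, b])

# Total derivatives: the influence of a on b is included,
# so d(a + b)/da = d(3a)/da = 3
h = tf.gradients(a + b, [a, b])

with tf.Session() as sess:
    print(sess.run(g))  # [1.0, 1.0]
    print(sess.run(h))  # [3.0, 1.0]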

 1 import tensorflow as tf
 2 w1 = tf.Variable([[1,2]])
 3 w2 = tf.Variable([[3,4]])
 4 res = tf.matmul(w1, [[2],[1]])
 5 
 6 # ys must depend on xs, otherwise sess.run raises an error:
 7 # grads = tf.gradients(res,[w1,w2])
 8 # TypeError: Fetch argument None has invalid type <class 'NoneType'>
 9 
10 # grads = tf.gradients(res,[w1])
11 # Result: [array([[2, 1]])]
12 
13 res2a = tf.matmul(w1, [[2],[1]]) + tf.matmul(w2, [[3],[5]])
14 res2b = tf.matmul(w1, [[2],[4]]) + tf.matmul(w2, [[8],[6]])
15 
16 # grads = tf.gradients([res2a,res2b],[w1,w2])
17 # Result: [array([[4, 5]]), array([[11, 11]])]
18 
19 grad_ys = [tf.Variable([[1]]), tf.Variable([[2]])]
20 grads = tf.gradients([res2a,res2b], [w1,w2], grad_ys=grad_ys)
21 # Result: [array([[6, 9]]), array([[19, 17]])]
22 
23 with tf.Session() as sess:
24     tf.global_variables_initializer().run()
25     re = sess.run(grads)
26     print(re)
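To check the reported results by hand: d(res2a)/dw1 = [2, 1] and d(res2b)/dw1 = [2, 4], so the unweighted sum on lines 16-17 is [2+2, 1+4] = [4, 5]; similarly d(res2a)/dw2 = [3, 5] and d(res2b)/dw2 = [8, 6] sum to [11, 11]. With grad_ys = [[1]], [[2]] on lines 19-21, the weighted sums become 1*[2, 1] + 2*[2, 4] = [6, 9] and 1*[3, 5] + 2*[8, 6] = [19, 17], matching the results in the comments.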

 

