1. Partial differentiation and gradients
  2. The generalization of the derivative to functions of several variables is the gradient.
  3. We find the gradient of the function f with respect to x by varying one variable at a time and keeping the others constant. The gradient is then the collection of these partial derivatives.
  4. Useful identities for computing gradients
  5. Backpropagation and automatic differentiation
  6. We can think of automatic differentiation as a set of techniques to numerically (in contrast to symbolic differentiation) evaluate the exact (up to machine precision) gradient of a function by working with intermediate variables and applying the chain rule.
  7. Forward pass in a multi-layer neural network to compute the loss
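Item 4 does not list the identities themselves; a few standard matrix-calculus results of the kind usually meant (gradients taken with respect to the vector $\mathbf{x}$) are:

```latex
\nabla_{\mathbf{x}}\,(\mathbf{a}^\top \mathbf{x}) = \mathbf{a},
\qquad
\nabla_{\mathbf{x}}\,(\mathbf{x}^\top \mathbf{A}\,\mathbf{x}) = (\mathbf{A} + \mathbf{A}^\top)\,\mathbf{x},
\qquad
\nabla_{\mathbf{x}}\,\lVert \mathbf{x} \rVert_2^2 = 2\,\mathbf{x},
\qquad
\frac{\partial}{\partial x}\,f\big(g(x)\big) = f'\big(g(x)\big)\,g'(x).
```

The last identity, the chain rule, is the one that backpropagation (item 5) applies repeatedly through the intermediate variables of a network.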
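The one-variable-at-a-time idea in items 2–3 can be sketched with central finite differences; the function `f` and the evaluation point below are illustrative assumptions, not from the notes:

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x by varying one coordinate
    at a time while keeping the others constant (central differences)."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_plus[i] += h
        x_minus = list(x)
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

# Hypothetical example: f(x, y) = x**2 + 3*y, whose gradient is (2x, 3).
f = lambda v: v[0] ** 2 + 3 * v[1]
print(numerical_gradient(f, [2.0, 1.0]))  # ≈ [4.0, 3.0]
```

Each entry of the returned list is one partial derivative; collecting them is exactly the gradient described in item 3.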
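The description in item 6 — intermediate variables plus the chain rule — can be sketched for a single scalar function; the function $f(x, y) = xy + \sin x$ is an illustrative assumption:

```python
import math

def f_with_grad(x, y):
    """Evaluate f(x, y) = x*y + sin(x) and its exact gradient by
    recording intermediate variables (forward pass) and then applying
    the chain rule from the output back to the inputs (reverse pass)."""
    # Forward pass: intermediate variables
    v1 = x * y
    v2 = math.sin(x)
    v3 = v1 + v2                       # output

    # Reverse pass: chain rule, seeded with d v3 / d v3 = 1
    dv3 = 1.0
    dv1 = dv3 * 1.0                    # d v3 / d v1 = 1
    dv2 = dv3 * 1.0                    # d v3 / d v2 = 1
    dx = dv1 * y + dv2 * math.cos(x)   # x feeds both v1 and v2: sum the paths
    dy = dv1 * x
    return v3, (dx, dy)
```

Because only derivatives of elementary operations are combined, the result is exact up to machine precision, matching the analytic gradient $(y + \cos x,\; x)$.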
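The forward pass of item 7 can be sketched as a minimal two-layer network ending in a loss; the architecture (affine → tanh → affine → squared error) and all weights below are illustrative assumptions:

```python
import math

def forward(x, params, y_true):
    """Forward pass of a small 2-layer network with a scalar output,
    followed by a squared-error loss. Pure-Python lists, no libraries."""
    W1, b1, W2, b2 = params
    # Hidden layer: z1 = W1 @ x + b1, then a1 = tanh(z1)
    z1 = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    a1 = [math.tanh(z) for z in z1]
    # Output layer: scalar y_hat = W2 @ a1 + b2
    y_hat = sum(w * a for w, a in zip(W2, a1)) + b2
    # Squared-error loss
    loss = 0.5 * (y_hat - y_true) ** 2
    return y_hat, loss

# Hypothetical parameters for a 2-input, 2-hidden-unit network.
params = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [1.0, 1.0], 0.0)
y_hat, loss = forward([0.5, -0.5], params, y_true=1.0)
```

Backpropagation (item 5) would then run this computation in reverse, applying the chain rule through `z1`, `a1`, and `y_hat` to obtain the gradient of `loss` with respect to each parameter.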
