AutoGrad.jl is an automatic differentiation package for Julia. It is a Julia port of the popular Python autograd package. It can differentiate regular Julia code that includes loops, conditionals, helper functions, closures etc. by keeping track of the primitive operations and using this execution trace to compute gradients. It uses reverse mode differentiation (a.k.a. backpropagation) so it can efficiently handle functions with array inputs and scalar outputs. It can compute gradients of gradients to handle higher order derivatives. Please see the comments in core.jl for a description of how the code works in detail.
Installation
You can install AutoGrad in Julia using:
julia> Pkg.add("AutoGrad")
In order to use it in your code start with:
using AutoGrad
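As a quick, minimal sketch of the basic API described above (grad and gradients of gradients), using only a scalar function:
g = grad(sin)        # g(x) should compute the derivative of sin at x, i.e. cos(x)
gg = grad(g)         # gradient of the gradient, i.e. -sin(x)
g(0.0), gg(0.0)      # expect cos(0.0) == 1.0 and -sin(0.0) == 0.0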
Example
Here is a linear regression example simplified from housing.jl:
using AutoGrad

function loss(w)
    global xtrn, ytrn
    ypred = w[1]*xtrn .+ w[2]                # linear prediction
    sum(abs2, ypred - ytrn) / size(ypred,2)  # mean squared error
end

function train(w; lr=.1, epochs=20)
    gradfun = grad(loss)         # gradfun(w) returns the gradient of loss at w
    for epoch = 1:epochs
        g = gradfun(w)
        for i in 1:length(w)
            w[i] -= lr * g[i]    # gradient descent update
        end
    end
    return w
end
The loss function takes parameters as input and returns the loss to be minimized. The parameter w for this example is a pair: w[1] is a weight matrix, and w[2] is a bias vector. The training data xtrn, ytrn are in global variables. ypred is the predicted output, and the last line computes the quadratic loss. The loss function is implemented in regular Julia.

The train function takes initial parameters and returns optimized parameters. grad is the only AutoGrad function used: it creates a function gradfun that takes the same arguments as loss, but returns the gradient instead. The returned gradient will have the same type and shape as the input argument. The for loop implements gradient descent, where we calculate the gradient and subtract a scaled version of it from the weights.
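For completeness, here is a hedged sketch of how the example above might be run end to end; the data dimensions (13 features, 100 examples) and the initial weights are made up for illustration:
xtrn = rand(13, 100)                # hypothetical inputs: 13 features by 100 examples
ytrn = rand(1, 100)                 # hypothetical regression targets
w = Any[0.1*randn(1,13), zeros(1)]  # w[1]: 1x13 weight matrix, w[2]: bias vector
w = train(w)                        # gradient descent using grad(loss)
println(loss(w))                    # final loss after 20 epochs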
See the examples directory for more examples, and the extensively documented core.jl for details.
Extending AutoGrad
AutoGrad can only handle a function if the primitives it uses have known gradients. You can add your own primitives with gradients as described in detail in core.jl, or by using the @primitive and @zerograd macros in util.jl. Here is an example:
@primitive hypot(x1::Number,x2::Number)::y (dy->dy*x1/y) (dy->dy*x2/y)
The @primitive macro marks the hypot(::Number,::Number) method as a new primitive, and the next two expressions define gradient functions with respect to the first and second arguments. The gradient expressions can refer to the parameters and to the return variable (indicated after the final ::) of the method declaration.
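As an unofficial sketch of exercising the new primitive (this assumes grad accepts an optional argument number as its second argument, as in the Python autograd package):
hypotgrad1 = grad(hypot)      # derivative with respect to x1, i.e. x1/hypot(x1,x2)
hypotgrad2 = grad(hypot, 2)   # derivative with respect to x2, i.e. x2/hypot(x1,x2)
hypotgrad1(3.0, 4.0), hypotgrad2(3.0, 4.0)   # expect (0.6, 0.8) since hypot(3,4) == 5
The @zerograd macro instead marks a method whose output should be treated as a constant with no gradient; sign is used here purely as a hypothetical example:
@zerograd sign(x::Number)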
Note that Julia supports multiple dispatch, i.e. a function may have multiple methods, each supporting different argument types. For example, hypot(x1::Array,x2::Array) is another hypot method. In AutoGrad.jl each method can independently be defined as a primitive and can have its own specific gradient.
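For instance, the vectorized method could be declared as its own primitive with elementwise gradients, using the same syntax as the scalar example above (a hedged sketch that assumes such an Array method of hypot exists, as it did in older Julia versions):
@primitive hypot(x1::Array,x2::Array)::y  (dy->dy.*x1./y)  (dy->dy.*x2./y)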
core.jl implements the main functionality and acts as the main documentation source. util.jl has some support functions to define and test new primitives. interfaces.jl sets up support for common data structures including Arrays, Tuples, and Dictionaries. The numerical gradients are defined in files such as base/math.jl, special/trig.jl that mirror the organization under julia/base.
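As a hedged illustration of that container support, parameters can also be kept in a Dict; the keys, sizes, and function below are made up for the sketch:
f(w) = sum(w[:W] * [1.0, 2.0]) + w[:b]   # scalar loss over a Dict of parameters
df = grad(f)
df(Dict(:W => ones(1,2), :b => 0.0))     # expect a Dict with gradients for :W and :b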
Current status and future work
The gradient coverage is still spotty; I am adding more gradients to cover the Julia base. Next steps are to make models faster by providing support for GPU operations and overwriting functions (to avoid memory allocation). I should also find out about the efficiency of closures and untyped functions in Julia, which are used extensively in the code.
Acknowledgments and references
AutoGrad.jl was written by Deniz Yuret. Large parts of the code are directly ported from the Python autograd package. I'd like to thank autograd author Dougal Maclaurin for his support. See (Baydin et al. 2015) for a general review of automatic differentiation, the autograd tutorial for some Python examples, and Dougal's PhD thesis for design principles. JuliaDiff has alternative differentiation tools for Julia. I would also like to thank my students Ozan Arkan Can and Emre Yolcu for helpful contributions.
Also see: A presentation, A demo.