Skip to content

Commit 4f7b477

Browse files
committed
Add examples
1 parent 1ca63ca commit 4f7b477

3 files changed

Lines changed: 416 additions & 0 deletions

File tree

examples/halley.ipynb

Lines changed: 264 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,264 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# The Jacobian- and Hessian-free Halley's Method\n",
8+
"\n",
9+
"Say we have a system of $n$ equations with $n$ unknowns\n",
10+
"\n",
11+
"$$\n",
12+
"f(x)=0\n",
13+
"$$\n",
14+
"\n",
15+
"and $f\\in \\mathbb R^n\\to\\mathbb R^n$ is sufficiently smooth.\n",
16+
"\n",
17+
"Given a initial guess $x_0$, Halley's method specifies a series of points approximating the solution, where each iteration is\n",
18+
"\n",
19+
"$$\n",
20+
"x^{(i+1)}=x^{(i)}+\\frac{a^{(i)}a^{(i)}}{a^{(i)}+b^{(i)}/2}\n",
21+
"$$\n",
22+
"\n",
23+
"where the vector multiplication and division $ab, a/b$ is defined in Banach algebra, and the vectors $a^{(i)}, b^{(i)}$ are defined as\n",
24+
"\n",
25+
"$$\n",
26+
"J(x^{(i)})a^{(i)} = -f(x^{(i)})\n",
27+
"$$\n",
28+
"\n",
29+
"and\n",
30+
"\n",
31+
"$$\n",
32+
"J(x^{(i)})b^{(i)} = H(x^{(i)})a^{(i)}a^{(i)}\n",
33+
"$$\n"
34+
]
35+
},
36+
{
37+
"cell_type": "markdown",
38+
"metadata": {},
39+
"source": [
40+
"The full Jacobian (a matrix) and full Hessian (a 3-tensor) representation can be avoided by using forward-mode automatic differentiation. It is well known that a forward evaluation on a dual number $(x, v)$ gives the Jacobian-vector product,\n",
41+
"\n",
42+
"$$\n",
43+
"f(x,v)=(f(x),Jv)\n",
44+
"$$\n",
45+
"\n",
46+
"and similarly a forward evaluation on a second order Taylor expansion gives the Hessian-vector-vector product,\n",
47+
"\n",
48+
"$$\n",
49+
"f(x,v,0)=f(x,Jv,Hvv)\n",
50+
"$$\n",
51+
"\n",
52+
"Below, we demonstrate this possibility with TaylorDiff.jl."
53+
]
54+
},
55+
{
56+
"cell_type": "markdown",
57+
"metadata": {},
58+
"source": [
59+
"## Jacobian-free Newton Krylov\n",
60+
"\n",
61+
"To get started we first get familiar with the JFNK:"
62+
]
63+
},
64+
{
65+
"cell_type": "code",
66+
"execution_count": 1,
67+
"metadata": {},
68+
"outputs": [
69+
{
70+
"data": {
71+
"text/plain": [
72+
"newton (generic function with 1 method)"
73+
]
74+
},
75+
"metadata": {},
76+
"output_type": "display_data"
77+
}
78+
],
79+
"source": [
80+
"# The Jacobi- and Hessian-free Halley method for solving nonlinear equations\n",
81+
"\n",
82+
"using TaylorDiff\n",
83+
"using LinearAlgebra\n",
84+
"using LinearSolve\n",
85+
"\n",
86+
"function newton(f, x0, p; tol=1e-10, maxiter=100)\n",
87+
" x = x0\n",
88+
" for i in 1:maxiter\n",
89+
" fx = f(x, p)\n",
90+
" error = norm(fx)\n",
91+
" println(\"Iteration $i: x = $x, f(x) = $fx, error = $error\")\n",
92+
" if error < tol\n",
93+
" return x\n",
94+
" end\n",
95+
" get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)\n",
96+
" operator = FunctionOperator(get_derivative, similar(x), similar(x))\n",
97+
" problem = LinearProblem(operator, -fx)\n",
98+
" sol = solve(problem, KrylovJL_GMRES())\n",
99+
" x += sol.u\n",
100+
" end\n",
101+
" return x\n",
102+
"end"
103+
]
104+
},
105+
{
106+
"cell_type": "markdown",
107+
"metadata": {},
108+
"source": [
109+
"## Jacobian- and Hessian-free Halley\n",
110+
"\n",
111+
"This naturally follows, only difference is replacing the rhs by Hessian-vector-vector product:"
112+
]
113+
},
114+
{
115+
"cell_type": "code",
116+
"execution_count": 2,
117+
"metadata": {},
118+
"outputs": [
119+
{
120+
"data": {
121+
"text/plain": [
122+
"halley (generic function with 1 method)"
123+
]
124+
},
125+
"metadata": {},
126+
"output_type": "display_data"
127+
}
128+
],
129+
"source": [
130+
"function halley(f, x0, p; tol=1e-10, maxiter=100)\n",
131+
" x = x0\n",
132+
" for i in 1:maxiter\n",
133+
" fx = f(x, p)\n",
134+
" error = norm(fx)\n",
135+
" println(\"Iteration $i: x = $x, f(x) = $fx, error = $error\")\n",
136+
" if error < tol\n",
137+
" return x\n",
138+
" end\n",
139+
" get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)\n",
140+
" operator = FunctionOperator(get_derivative, similar(x), similar(x))\n",
141+
" problem = LinearProblem(operator, -fx)\n",
142+
" a = solve(problem, KrylovJL_GMRES()).u\n",
143+
" Haa = derivative(x -> f(x, p), x, a, 2)\n",
144+
" problem2 = LinearProblem(operator, Haa)\n",
145+
" b = solve(problem2, KrylovJL_GMRES()).u\n",
146+
" x += (a .* a) ./ (a .+ b ./ 2)\n",
147+
" end\n",
148+
" return x\n",
149+
"end"
150+
]
151+
},
152+
{
153+
"cell_type": "code",
154+
"execution_count": 3,
155+
"metadata": {},
156+
"outputs": [
157+
{
158+
"data": {
159+
"text/plain": [
160+
"f (generic function with 1 method)"
161+
]
162+
},
163+
"metadata": {},
164+
"output_type": "display_data"
165+
}
166+
],
167+
"source": [
168+
"# Testing with simple examples:\n",
169+
"\n",
170+
"f(x, p) = x .* x - p"
171+
]
172+
},
173+
{
174+
"cell_type": "code",
175+
"execution_count": 4,
176+
"metadata": {},
177+
"outputs": [
178+
{
179+
"name": "stdout",
180+
"output_type": "stream",
181+
"text": [
182+
"Iteration 1: x = [1.0, 1.0], f(x) = [-1.0, -1.0], error = 1.4142135623730951\n"
183+
]
184+
},
185+
{
186+
"name": "stdout",
187+
"output_type": "stream",
188+
"text": [
189+
"Iteration 2: x = [1.5, 1.5], f(x) = [0.25, 0.25], error = 0.3535533905932738\n",
190+
"Iteration 3: x = [1.4166666666666667, 1.4166666666666667], f(x) = [0.006944444444444642, 0.006944444444444642], error = 0.009820927516480105\n",
191+
"Iteration 4: x = [1.4142156862745099, 1.4142156862745099], f(x) = [6.007304882871267e-6, 6.007304882871267e-6], error = 8.495612038666664e-6\n",
192+
"Iteration 5: x = [1.4142135623746899, 1.4142135623746899], f(x) = [4.510614104447086e-12, 4.510614104447086e-12], error = 6.378971641140442e-12\n"
193+
]
194+
},
195+
{
196+
"data": {
197+
"text/plain": [
198+
"2-element Vector{Float64}:\n",
199+
" 1.4142135623746899\n",
200+
" 1.4142135623746899"
201+
]
202+
},
203+
"metadata": {},
204+
"output_type": "display_data"
205+
}
206+
],
207+
"source": [
208+
"newton(f, [1., 1.], [2., 2.])"
209+
]
210+
},
211+
{
212+
"cell_type": "code",
213+
"execution_count": 5,
214+
"metadata": {},
215+
"outputs": [
216+
{
217+
"name": "stdout",
218+
"output_type": "stream",
219+
"text": [
220+
"Iteration 1: x = [1.0, 1.0], f(x) = [-1.0, -1.0], error = 1.4142135623730951\n"
221+
]
222+
},
223+
{
224+
"name": "stdout",
225+
"output_type": "stream",
226+
"text": [
227+
"Iteration 2: x = [1.4000000000000001, 1.4000000000000001], f(x) = [-0.03999999999999959, -0.03999999999999959], error = 0.05656854249492323\n",
228+
"Iteration 3: x = [1.4142131979695431, 1.4142131979695431], f(x) = [-1.0306887576749801e-6, -1.0306887576749801e-6], error = 1.4576140196894333e-6\n",
229+
"Iteration 4: x = [1.414213562373142, 1.414213562373142], f(x) = [1.3278267374516872e-13, 1.3278267374516872e-13], error = 1.877830580585795e-13\n"
230+
]
231+
},
232+
{
233+
"data": {
234+
"text/plain": [
235+
"2-element Vector{Float64}:\n",
236+
" 1.414213562373142\n",
237+
" 1.414213562373142"
238+
]
239+
},
240+
"metadata": {},
241+
"output_type": "display_data"
242+
}
243+
],
244+
"source": [
245+
"halley(f, [1., 1.], [2., 2.])"
246+
]
247+
}
248+
],
249+
"metadata": {
250+
"kernelspec": {
251+
"display_name": "Julia 1.10.1",
252+
"language": "julia",
253+
"name": "julia-1.10"
254+
},
255+
"language_info": {
256+
"file_extension": ".jl",
257+
"mimetype": "application/julia",
258+
"name": "julia",
259+
"version": "1.10.1"
260+
}
261+
},
262+
"nbformat": 4,
263+
"nbformat_minor": 2
264+
}

examples/halley.jl

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
# The Jacobi- and Hessian-free Halley method for solving nonlinear equations
2+
3+
using TaylorDiff
4+
using LinearAlgebra
5+
using LinearSolve
6+
7+
function newton(f, x0, p; tol=1e-10, maxiter=100)
8+
x = x0
9+
for i in 1:maxiter
10+
fx = f(x, p)
11+
error = norm(fx)
12+
println("Iteration $i: x = $x, f(x) = $fx, error = $error")
13+
if error < tol
14+
return x
15+
end
16+
get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)
17+
operator = FunctionOperator(get_derivative, similar(x), similar(x))
18+
problem = LinearProblem(operator, -fx)
19+
sol = solve(problem, KrylovJL_GMRES())
20+
x += sol.u
21+
end
22+
return x
23+
end
24+
25+
function halley(f, x0, p; tol=1e-10, maxiter=100)
26+
x = x0
27+
for i in 1:maxiter
28+
fx = f(x, p)
29+
error = norm(fx)
30+
println("Iteration $i: x = $x, f(x) = $fx, error = $error")
31+
if error < tol
32+
return x
33+
end
34+
get_derivative = (v, u, a, b) -> v .= derivative(x -> f(x, p), x, u, 1)
35+
operator = FunctionOperator(get_derivative, similar(x), similar(x))
36+
problem = LinearProblem(operator, -fx)
37+
a = solve(problem, KrylovJL_GMRES()).u
38+
Haa = derivative(x -> f(x, p), x, a, 2)
39+
problem2 = LinearProblem(operator, Haa)
40+
b = solve(problem2, KrylovJL_GMRES()).u
41+
x += (a .* a) ./ (a .+ b ./ 2)
42+
end
43+
return x
44+
end
45+
46+
f(x, p) = x .* x - p

0 commit comments

Comments
 (0)