|
1 | 1 | # What to return for non-differentiable points |
2 | 2 | !!! info "What is the short version?" |
3 | | - If the function is not-differentiable choose to return something useful rather than erroring. |
4 | | - For a branch a function is not differentiable due to e.g. a branch, like `abs`, your rule can reasonably claim the derivative at that point is the value from either branch, *or* any value in-between. |
| 3 | + If the function is not differentiable choose to return something useful rather than erroring. |
| 4 | + If a function is not differentiable due to e.g. a branch, like `abs`, your rule can reasonably claim the derivative at that point is the value from either branch, *or* any value in-between. |
5 | 5 | In particular for local optima (like in the case of `abs`) claiming the derivative is 0 is a good idea. |
6 | | - Similarly, if derivative is from one side is not defined, or is not finite, return the derivative from the other side. |
| 6 | + Similarly, if the derivative from one side is not defined, or is not finite, return the derivative from the other side. |
7 | 7 | Throwing an error, or returning `NaN` is generally the least useful option. |
8 | 8 |
|
9 | | -However, contrary to what calculus says most autodiff systems will return an answer for such functions. |
10 | | -For example for: `abs_left(x) = (x <= 0) ? -x : x`, AD will say the derivative at `x=0` is `-1`. |
11 | | -Alternatively for: `abs_right(x) = (x < 0) ? -x : x`, AD will say the derivative at `x=0` is `1`. |
| 9 | +Contrary to what calculus says most autodiff systems will return a derivative even at non-differentiable points. |
| 10 | +For example for: `abs_left(x) = x <= 0 ? -x : x`, AD will say the derivative at `x=0` is `-1`. |
| 11 | +Alternatively for: `abs_right(x) = x < 0 ? -x : x`, AD will say the derivative at `x=0` is `1`. |
12 | 12 | Those two examples are weird since they are equal at all points, but AD claims different derivatives at `x=0`. |
13 | 13 | The way to fix autodiff systems being weird is to write custom rules. |
14 | 14 | So what rule should we write for this case? |
15 | 15 |
|
16 | | -The obvious answer, would be to write a rule that throws an error if input at a point where calculus says the derivative is not defined. |
17 | | -Another option is to return some error signally value like `NaN`. |
| 16 | +The obvious answer would be to write a rule that throws an error at a point where calculus says the derivative is not defined. |
| 17 | +Another option is to return some error signaling value like `NaN`. |
18 | 18 | Which you *can* do. |
19 | 19 | However, there is nowhere to go with an error, the user still wants a derivative; so this is not useful. |
20 | 20 |
|
@@ -45,7 +45,7 @@ plot(abs) |
45 | 45 | $$\operatorname{abs}'(0) = \lim_{h \to 0^-} \dfrac{\operatorname{abs}(0)-\operatorname{abs}(0-h)}{0-h} = -1$$ |
46 | 46 | $$\operatorname{abs}'(0) = \lim_{h \to 0^+} \dfrac{\operatorname{abs}(0)-\operatorname{abs}(0-h)}{0-h} = 1$$ |
47 | 47 |
|
48 | | -Now, as discussed in the introduction, the AD system would on it's own choose either 1 or -1, depending on implementation. |
| 48 | +Now, as discussed in the introduction, the AD system would on its own choose either 1 or -1, depending on implementation. |
49 | 49 |
|
50 | 50 | We however have a potentially much nicer answer available to use: 0. |
51 | 51 |
|
@@ -116,7 +116,7 @@ Our alternatives would be to consider the derivative at `nextfloat(0.0)` or `pre |
116 | 116 | But this is more or less the same as choosing some large value -- in this case an extremely large value that will rapidly overflow. |
117 | 117 |
|
118 | 118 |
|
119 | | -### Derivative on-finite and different on both sides |
| 119 | +### Derivative nonfinite and different on both sides |
120 | 120 |
|
121 | 121 | ```@example nondiff |
122 | 122 | plot(x-> sign(x) * cbrt(x)) |
|