|
| 1 | +# Rank Estimation |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +In some applications, we may have a tenor we believe to be closely approximated by a low rank tenor, but do not know what rank to use. In this setting, use `rank_detect_factorize` rather than `factorize` without a `rank=R` keyword. The estimated rank will be added to `kwargs`, and the `decomposition` at this rank returned. |
| 6 | + |
| 7 | +```julia |
| 8 | +R = 3 |
| 9 | +T = Tucker1((10, 10, 10), R) |
| 10 | +Y = array(T) |
| 11 | +options = (model=Tucker1, curvature_method=splines) # default values |
| 12 | +decomposition, stats, kwargs, final_rel_errors = rank_detect_factorize(Y; options...) |
| 13 | + |
| 14 | +@assert kwargs[:rank] == R |
| 15 | +``` |
| 16 | + |
| 17 | +The rank will be estimated by factorizing the input tensor at every possible rank from 1 up to the maximum rank and recording the relative error. We can pick the rank which balances small error (good fit) and small rank (low complexity). Our criteria is the rank which maximizes the curvature of the error vs. rank function, or in the case of the `:breakpoints` method, a proxy for the curvature. |
| 18 | + |
| 19 | +Note `rank_detect_factorize` also returns the stats and final relative errors of each rank tested. |
| 20 | + |
| 21 | +```@docs; canonical=false |
| 22 | +rank_detect_factorize |
| 23 | +max_possible_rank |
| 24 | +``` |
| 25 | + |
| 26 | +!!! tip |
| 27 | + If you run `rank_detect_factorize` with different methods and get the same rank, you can feel more confident in the detected rank. |
| 28 | + |
| 29 | + |
| 30 | +## Estimating Curvature |
| 31 | + |
| 32 | +This package has implemented a few methods for estimating curvature. These are the following list of methods. |
| 33 | + |
| 34 | +```julia |
| 35 | +:finite_differences |
| 36 | +:circles |
| 37 | +:splines |
| 38 | +``` |
| 39 | + |
| 40 | +We can also use the following curvature-proxy methods. |
| 41 | + |
| 42 | +```julia |
| 43 | +:breakpoints |
| 44 | +``` |
| 45 | + |
| 46 | +They can be passed to `curvature` and `standard_curvature` as a `method` keyword, or `rank_detect_factorize` with the `curvature_method` keyword. |
| 47 | + |
| 48 | +```@docs; canonical=false |
| 49 | +curvature |
| 50 | +standard_curvature |
| 51 | +``` |
| 52 | + |
| 53 | +These methods have been specialized to typical relative error vs. rank curves. We assume a relative error of ``1`` at rank ``0`` since the only rank ``0`` tenor is the ``0`` tenor. And that the relative error at the maximum possible rank is ``0``, and stays zero if we try to use a larger rank. |
| 54 | + |
| 55 | +## How do these methods work? |
| 56 | + |
| 57 | +### Finite Differences |
| 58 | + |
| 59 | +We approximate the first and second derivatives separately with three point finite differences and calculate the curvature with the formula |
| 60 | + |
| 61 | +```math |
| 62 | +k(x) = \frac{y''(x)}{(1 + y'(x)^2)^{3/2}}. |
| 63 | +``` |
| 64 | + |
| 65 | +The derivatives are approximated using centred three point finite differences,[^1] |
| 66 | + |
| 67 | +[^1]: M. Abramowitz and I. A. Stegun, "Handbook of mathematical functions: with formulas, graphs and mathematical tables", Unabridged, Unaltered and corr. Republ. of the 1964 ed. in Dover books on advanced mathematics. New York: Dover publ, 1972. (Table 25.2, p. 914) |
| 68 | + |
| 69 | +```math |
| 70 | +y'(x_i) \approx \frac{1}{2\Delta x}(y_{i+1} - y_{i-1})\quad\text{and}\quad y''(x_i)\approx \frac{1}{\Delta x^2}(y_{i+1} - 2y_i + y_{i-1}). |
| 71 | +``` |
| 72 | + |
| 73 | +The end-points use forward/back-differences. For the first derivative, this is |
| 74 | + |
| 75 | +```math |
| 76 | +y'(x_1) \approx \frac{1}{2\Delta x}(-3y_{1} + 4y_2 - y_{3})\quad\text{and}\quad y'(x_I) \approx \frac{1}{2\Delta x}(-y_{I-2} + 4y_{I-1} - 3y_{I}), |
| 77 | +``` |
| 78 | + |
| 79 | +and for the second derivative, this is |
| 80 | + |
| 81 | +```math |
| 82 | +y''(x_1) \approx \frac{1}{\Delta x^2}(y_{1} - 2y_2 + y_{3})\quad\text{and}\quad y''(x_I) \approx \frac{1}{\Delta x^2}(y_{I-2} - 2y_{I-1} + y_{I}). |
| 83 | +``` |
| 84 | + |
| 85 | +```@docs; canonical=false |
| 86 | +BlockTensorFactorization.Core.d_dx |
| 87 | +BlockTensorFactorization.Core.d2_dx2 |
| 88 | +``` |
| 89 | + |
| 90 | + |
| 91 | +### Splines |
| 92 | + |
| 93 | +We calculate a third order [spline](https://en.wikipedia.org/wiki/Spline_(mathematics)) |
| 94 | + |
| 95 | +```math |
| 96 | +g_i(x) = a_i(x-x_i)^3+b_i(x-x_i)^2+c_i(x-x_i)+d_i,\quad x_1 \leq x \leq x_{i+1} |
| 97 | +``` |
| 98 | + |
| 99 | +and use the coefficients to calculate the curvature of the spline |
| 100 | + |
| 101 | +```math |
| 102 | +k(x_i) = \frac{2b_i}{(1+c_i^2)^{3/2}}. |
| 103 | +``` |
| 104 | + |
| 105 | +We assume the following boundary conditions; |
| 106 | +1) ``g_{1}(x_{1}-\Delta x)=1``, ``y``-intercept ``(0,1)`` |
| 107 | +2) ``g_{I}(x_{I}+\Delta x)=y_{I}``, repeated right end-point (``y_{I+1}=y_{I}``) |
| 108 | +3) ``g_{I}''(x_{I}+\Delta x)=0``, flat right end-point, |
| 109 | + |
| 110 | +in addition to the usual continuity and smoothness assumptions of a third order spline: |
| 111 | +1) ``g_{i}(x_{i})=y_{i}`` |
| 112 | +2) ``g_{i}(x_{i+1})=g_{i+1}(x_{i+1})`` |
| 113 | +3) ``g_{i}'(x_{i+1})=g_{i+1}'(x_{i+1})`` |
| 114 | +4) ``g_{i}''(x_{i+1})=g_{i+1}''(x_{i+1})``. |
| 115 | + |
| 116 | +We can solve for the coefficients by first solving the tri-diagonal system ``Mb=v``, where |
| 117 | + |
| 118 | +```math |
| 119 | +M=\begin{bmatrix} |
| 120 | +6 & 0 & & & \\ |
| 121 | +1 & 4 & 1 & & \\ |
| 122 | +& \ddots& \ddots& \ddots& \\ |
| 123 | +& & 1 & 4 & 1& \\ |
| 124 | + & & & 0 & 1 |
| 125 | +\end{bmatrix},\quad |
| 126 | +b =\begin{bmatrix} |
| 127 | +b_{1} \\ |
| 128 | +b_{2}\\ |
| 129 | +\vdots \\ |
| 130 | +b_{I} \\ |
| 131 | +b_{I+1} |
| 132 | +\end{bmatrix},\quad \text{and} \quad |
| 133 | +v = \frac{3}{\Delta x^2}\begin{bmatrix} |
| 134 | +1-2y_{1}+y_{2} \\ |
| 135 | +y_{3}-y_{1} \\ |
| 136 | +\vdots \\ |
| 137 | +y_{I+1}-y_{I-1} \\ |
| 138 | +0 |
| 139 | +\end{bmatrix}, |
| 140 | +``` |
| 141 | + |
| 142 | +and then calculate |
| 143 | + |
| 144 | +```math |
| 145 | +a_{i} = \frac{1}{3\Delta x}(b_{i+1}-b_{i})\quad \text{and}\quad c_{i}=\frac{{y_{i+1}-y_{i}}}{\Delta x}-\frac{\Delta x}{3}(b_{i+1}-2b_{i}). |
| 146 | +``` |
| 147 | + |
| 148 | +```@docs |
| 149 | +BlockTensorFactorization.Core.cubic_spline_coefficients |
| 150 | +BlockTensorFactorization.Core.spline_mat |
| 151 | +BlockTensorFactorization.Core.make_spline |
| 152 | +BlockTensorFactorization.Core.d_dx_and_d2_dx2_spline |
| 153 | +``` |
| 154 | + |
| 155 | +### Circles |
| 156 | + |
| 157 | +We compute the radius ``r_i`` of a circle passing through three neighbouring points ``(x_{i-1}, y_{i-1})``, ``(x_{i}, y_{i})``, and ``(x_{i+1}, y_{i+1})``. The curvature magnitude is approximately the inverse of this radius; |
| 158 | + |
| 159 | +```math |
| 160 | +\lvert k(x_i)\rvert \approx \frac{1}{r_i}. |
| 161 | +``` |
| 162 | + |
| 163 | +We assume two additional points so we can estimate the curvature at the boundary. They are |
| 164 | + |
| 165 | +```math |
| 166 | +(x_{0}, y_{0}) = (0, 1)\quad\text{and}\quad (x_{I+1}, y_{I+1}) = (x_{I}+\Delta x, 0). |
| 167 | +``` |
| 168 | + |
| 169 | +To obtain a the signed curvature ``k(x_i)``, we check if the middle point ``(x_{i}, y_{i})`` is above (negative curvature) or below (positive curvature) the line segment connecting ``(x_{i-1}, y_{i-1})`` and ``(x_{i+1}, y_{i+1})``. |
| 170 | + |
| 171 | +```@docs |
| 172 | +BlockTensorFactorization.Core.three_point_circle |
| 173 | +BlockTensorFactorization.Core.circle_curvature |
| 174 | +BlockTensorFactorization.Core.signed_circle_curvature |
| 175 | +``` |
| 176 | + |
| 177 | + |
| 178 | +### Two-line Breakpoint |
| 179 | + |
| 180 | +This method does **not** approximate the curvature of a function passing through the points ``\{(x_i, y_i)\}``, but picks the best rank based on the optimal breakpoint. The optimal breakpoint ``r`` is the breakpoint of segmented linear model that minimizes the model error.[^2] This means |
| 181 | + |
| 182 | +[^2]: J. E. Saylor, K. E. Sundell, and G. R. Sharman, "Characterizing sediment sources by non-negative matrix factorization of detrital geochronological data," Earth and Planetary Science Letters, vol. 512, pp. 46–58, Apr. 2019, doi: [10.1016/j.epsl.2019.01.044](https://doi.org/10.1016/j.epsl.2019.01.044). (Section 3.2) |
| 183 | + |
| 184 | +```math |
| 185 | +r = \argmin_{z \in \{1,\dots, R\}} \sum_{i=1}^I (f_z(x_i) - y_i)^2, |
| 186 | +``` |
| 187 | + |
| 188 | +where |
| 189 | + |
| 190 | +```math |
| 191 | +f_z(x; a_z, b_z, c_z) = a_z + b_z(\min(x,z) - z) + c_z(\max(x,z) - z) |
| 192 | +``` |
| 193 | + |
| 194 | +and the coefficients ``(a_z, b_z, c_z)`` are selected to minimize the error between the model and the data, |
| 195 | + |
| 196 | +```math |
| 197 | +(a_z, b_z, c_z) = \argmin_{a,b,c\in\mathbb{R}} \sum_{i=1}^I (f_z(x_i;a,b,c) - y_i)^2. |
| 198 | +``` |
| 199 | + |
| 200 | +These coefficients have the closed form solution |
| 201 | + |
| 202 | +```math |
| 203 | +\begin{bmatrix} |
| 204 | +a_{z} \\ |
| 205 | +b_{z} \\ |
| 206 | +c_{z} |
| 207 | +\end{bmatrix} =(M^\top M)^{-1}M^\top \begin{bmatrix} |
| 208 | +y_{1} \\ |
| 209 | +\vdots \\ |
| 210 | +y_{I} |
| 211 | +\end{bmatrix},\quad\text{where}\quad M = \begin{bmatrix} |
| 212 | +1 & \min(x_{1},z) - z&\max(x_{1},z) - z \\ |
| 213 | +\vdots & \vdots & \vdots\\ |
| 214 | +1 & \min(x_{I},z) - z&\max(x_{I},z) - z |
| 215 | +\end{bmatrix}. |
| 216 | +``` |
| 217 | + |
| 218 | +Geometrically, the breakpoint model is two half-infinite lines on ``(-\infty, z]`` and ``[z,\infty)`` that meet continuously at ``(z,a)`` with slopes ``b`` and ``c`` respectively. |
| 219 | + |
| 220 | +```@docs |
| 221 | +BlockTensorFactorization.Core.best_breakpoint |
| 222 | +BlockTensorFactorization.Core.breakpoint_error |
| 223 | +BlockTensorFactorization.Core.breakpoint_model |
| 224 | +BlockTensorFactorization.Core.breakpoint_model_coefficients |
| 225 | +BlockTensorFactorization.Core.breakpoint_curvature |
| 226 | +``` |
0 commit comments