Conversation
Previously, admm would rechunk the columns into a single chunk and then pass delayed numpy arrays to the local_update function. If the chunks along the columns were of different types, such as a numpy array and a sparse array, these would be inefficiently coerced to a single type. Now we pass a list of numpy arrays to the local_update function. If this list has more than one element, we construct a local dask.array so that operations like dot do the right thing and call two different local dot functions, one for each type.
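Sketched out, the idea is something like the following (combine_chunks is an illustrative helper, not the actual dask-glm code):

    import numpy as np
    import dask.array as da

    def combine_chunks(chunks):
        # `chunks` is the list of column blocks handed to local_update.
        # With a single block, use it directly; with several (for example a
        # numpy array next to a sparse array), wrap them in a small local
        # dask.array so that operations such as X.dot(beta) dispatch to the
        # appropriate per-block dot implementation.
        if len(chunks) == 1:
            return chunks[0]
        return da.concatenate(
            [da.from_array(c, chunks=c.shape) for c in chunks], axis=1)

    # Usage inside local_update might then look like:
    #     X = combine_chunks(Xs)
    #     X.dot(beta)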
There is a non-trivial cost to using dask.array within the function given to the solver. @moody-marlin, is this generally possible? (Please let me know if this description was not clear.)
Hmmm, I'm not sure how possible this is; it definitely won't be as easy, since the evaluation doesn't split as a simple sum on each chunk. There might be fancier ways of combining the results, or possibly even altering ADMM to take this into account, but it will require some non-trivial thinking.
The fancy way here is already handled by dask.array. I was just hoping to avoid having to recreate graphs every time. I can probably be clever here though; I'll give it a shot.
This creates a dask graph for the local_update computation once and then hands the solver a function that just plugs in a new beta and evaluates with the single-threaded scheduler. Currently this fails to converge.
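Roughly, the build-once / evaluate-many pattern being described (a minimal sketch with a toy least-squares objective; make_cached_f, the 'beta' placeholder, and the graph-surgery details are illustrative, not the actual PR code):

    import numpy as np
    import dask
    import dask.array as da

    def make_cached_f(X_blocks, y):
        # Build the dask graph for the objective once, with a named
        # single-chunk placeholder for beta, then return a plain function
        # that swaps a new beta into the graph and re-evaluates it with
        # the single-threaded (synchronous) scheduler.
        X = da.concatenate(
            [da.from_array(b, chunks=b.shape) for b in X_blocks], axis=1)
        beta = da.from_array(np.zeros(X.shape[1]), chunks=X.shape[1], name='beta')
        value = ((X.dot(beta) - y) ** 2).sum()      # toy objective
        dsk = dict(value.dask)                      # graph construction happens once
        key = (value.name,)                         # single output key of the 0-d result

        def f(new_beta):
            dsk[('beta', 0)] = new_beta             # overwrite the placeholder chunk
            return dask.get(dsk, key)               # dask.get is the synchronous scheduler

        return f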
OK, I've pushed a solution that, I think, avoids most graph-construction costs. However, my algorithm is failing to converge. @moody-marlin, if you find yourself with a free 15 minutes, can you take a look?
    print(result, gradient)
    return result, gradient
    beta, _, _ = solver(f2, beta, args=solver_args, maxiter=200, maxfun=250)
^^^ I think this line is incorrect; solver (in this case fmin_l_bfgs_b) expects the objective function f as its first argument, and the gradient needs to be passed as the fprime keyword argument.
I thought that I didn't have to specify fprime if f returned two results.
Oh, it looks like you're right; sorry, I've never called it that way. Hmm, back to the drawing board.
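For reference, scipy.optimize.fmin_l_bfgs_b accepts both conventions; a toy quadratic objective, purely to illustrate the signature:

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    def f_and_grad(beta):
        # Toy quadratic; returns the value and the gradient as a pair.
        return (beta ** 2).sum(), 2 * beta

    beta0 = np.ones(3)

    # With no fprime (and approx_grad left False), f itself must return
    # (value, gradient) -- the convention used in the PR's call to solver.
    beta, _, _ = fmin_l_bfgs_b(f_and_grad, beta0, maxiter=200, maxfun=250)

    # Equivalent call with the gradient supplied separately via fprime.
    beta, _, _ = fmin_l_bfgs_b(lambda b: (b ** 2).sum(), beta0,
                               fprime=lambda b: 2 * b,
                               maxiter=200, maxfun=250)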
    print(result, gradient)
    return result, gradient
    solver_args = ()
Why no solver_args in this case?
They are all, I think, in the task graph. My assumption is that these will not change during the call to local_update. Is this correct?
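To illustrate the point (a toy ADMM-style local objective; make_f and the variable names are hypothetical): if X, y, u, z, and rho are already captured when the cached function is built, analogous to being baked into the task graph, then the function handed to the solver depends only on beta and args=() suffices.

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    def make_f(X, y, u, z, rho):
        # X, y, u, z, rho are fixed at construction time (the closure plays
        # the role of the pre-built task graph), so the returned function
        # depends only on beta.
        def f(beta):
            resid = X.dot(beta) - y
            value = (resid ** 2).sum() + rho / 2 * ((beta - z + u) ** 2).sum()
            grad = 2 * X.T.dot(resid) + rho * (beta - z + u)
            return value, grad
        return f

    rng = np.random.RandomState(0)
    X = rng.randn(20, 3)
    y = X.dot(np.array([1.0, -2.0, 0.5]))
    u, z, rho = np.zeros(3), np.zeros(3), 1.0

    f2 = make_f(X, y, u, z, rho)
    solver_args = ()     # nothing left to pass through to the solver
    beta, _, _ = fmin_l_bfgs_b(f2, np.zeros(3), args=solver_args,
                               maxiter=200, maxfun=250)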
This currently depends on dask/dask#2272, though I may be able to avoid this dependency.