VNT Part 3: VarNamedTuple with concretized slices #1181
base: mhauru/arraylikeblock
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##  mhauru/arraylikeblock  #1181  +/- ##
=========================================
+ Coverage   80.13%   80.18%    +0.05%
=========================================
  Files          42       42
  Lines        4298     4356       +58
=========================================
+ Hits         3444     3493       +49
- Misses        854      863        +9
```
DynamicPPL.jl documentation for PR #1181 is available at:
```julia
using OffsetArrays
x = OffsetArray(Vector{Float64}(undef, 3), -3)
x[-2] ~ Normal()
0.0 ~ Normal(x[-2])
```
It strikes me that a lot of the problems we have here are actually incredibly similar to the problems of creating shadow memory in AD packages. The idea is that inside the model there is an `x`, but inside the VNT there is a shadow copy of `x` as well. cc @yebai
I think one possible general solution to this is going to be something similar to what ForwardDiff does (at least for `AbstractArray`s). Technically, it's even closer to a sparsity tracer.
When `x` is initialised as an `OffsetArray{T}`, we want the shadow of `x` (let's call it `dx`) to be initialised as a matching `OffsetArray{Dual{T}}` with the same axes. Each of these 'duals' needs to carry the value, along with a boolean mask indicating whether it has been set or not. This is of course exactly the same as how `PartialArray` has one array of values and one array of masks.
```julia
struct Dual{T}
    val::T
    has_been_set::Bool
end
Dual(val) = Dual(val, false)
```

Then, when `x[idxs...]` is set to whatever `rand(Normal())` returns, we set `dx[idxs...] = Dual(x[idxs...], true)`. Of course, the block elements will need the same workaround as before (complete with all the typejoin shenanigans).
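To make that concrete, here is a minimal sketch of initialising the shadow and recording a write, using the `Dual` type above. The names `x`/`dx` are illustrative, not an actual VNT API:

```julia
using OffsetArrays

# The user's array and its shadow share the same axes (-2:0 here).
x  = OffsetArray(Vector{Float64}(undef, 3), -3)
dx = OffsetArray(fill(Dual(0.0), 3), -3)  # all entries start as "not set"

# When x[-2] ~ Normal() fires, record the value and flip the mask.
v = randn()
x[-2]  = v
dx[-2] = Dual(v, true)

dx[-2].has_been_set  # true
dx[0].has_been_set   # false: never assigned
```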
```julia
struct BlockDual{T,I}
    val::T
    original_indices::I  # exact representation of the indices still to be decided
end
```

Then, reading from the shadow memory becomes way more consistent, because indexing into `dx` carries the same semantics as indexing into `x`.
For another example, let's say that `x` and `dx` are regular `Base.Matrix`es. Then indexing into `dx[1, 1]` will be the same as indexing into `dx[1]`. You could even make `dx[1] = ...` error when you try to modify a value that was already set. For `BlockDual`s, you could further 'resolve' or 'normalise' the indices into a standard, 1-based scheme, because you now know the type of the container (I think `Base.to_indices` does this, but I'm not sure; in any case it should be easy to figure out). That allows you to use different indexing schemes to read the data, as long as they refer to the same elements.
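For instance, Base already provides machinery for this kind of normalisation on ordinary arrays (these are standard Base functions, nothing VNT-specific):

```julia
A = zeros(3, 3)

# CartesianIndex is flattened into plain integer indices.
Base.to_indices(A, (CartesianIndex(2, 2),))  # (2, 2)

# Linear <-> cartesian conversions give a canonical scheme:
CartesianIndices(A)[4]  # CartesianIndex(1, 2)
LinearIndices(A)[1, 2]  # 4
```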
A scheme like this would also enable 'native support' not only for `OffsetArray`, but also for things like `DimensionalData.DimArray`, which probably has way more legitimate uses than `OffsetArray`.
This looks like a lot of work for VNT to do, but we are already duplicating the entire array in the VNT*, so in principle performance shouldn't be any worse. Indexing into this would still be type-stable (in the absence of blocks, but that's the same as currently), and you would still be able to do all the checks you need on whether a value has been set or not.
In particular, I think this should be fairly easy to implement for all `AbstractArray`s. (In fact, we know it must be possible, because other packages do this successfully.) Extending it to dictionaries, or generally to any type that supports `getindex`, would be more annoying: for every new type you support, you have to define the equivalent of a 'tangent type'† (see the sketch after the footnotes). It's doable, but it could be left for a later stage. I think `AbstractArray` support alone would already solve 99% of the pain points with this, the path forwards to supporting other types would be clear, and when some type isn't supported, we could error with a very clear message saying that "indexing into this type on the LHS of tilde is not supported", whereas now our battle lines are drawn quite arbitrarily.
* OK, there is one case where this would do a lot more work than VNT currently does: if the user has a very large array, say a length-100 vector, but only ever writes `x[1] ~ ...` and never touches the other elements. I don't really care about optimising performance for such cases; it is impossible to detect them without static analysis. (Note that AD has a similar characteristic: if you differentiate `f(x) = x[1]` and pass in a length-100 vector, the gradient computation will be equally unoptimised.)
† Notice that the way VNT normalises all property access into `NamedTuple`s is exactly what Mooncake's `tangent_type` does for structs.
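As a rough illustration of the dictionary point above: the 'tangent type' of a `Dict{K,V}` could plausibly just be `Dict{K,Dual{V}}`. This is a hypothetical sketch, not a committed design:

```julia
# Re-stating the Dual type from above for completeness.
struct Dual{T}
    val::T
    has_been_set::Bool
end

# Hypothetical: the shadow of a Dict{Symbol,Float64} is a Dict{Symbol,Dual{Float64}}.
x  = Dict{Symbol,Float64}()
dx = Dict{Symbol,Dual{Float64}}()

x[:a]  = 1.5
dx[:a] = Dual(1.5, true)

haskey(dx, :a)  # true: x[:a] has been set
haskey(dx, :b)  # false: x[:b] never was
```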
(Note that, since `tilde_assume!!` is currently our only 'hook' for modifying the VNT, this means you would have to pass `x` into `tilde_assume!!` as an argument. But I am not at all opposed to that. The alternative would be inspecting the IR to determine when new arrays are being created.)
I think this is a sound idea. VNT effectively carries a shadow copy for each LHS variable in tilde statements, and the dual-number mechanism used in autograd can be borrowed.
In fact, I would go even further and say that if VNT stores a copy of the array, there is no need for `VarName` concretisation at all. The point is that if someone writes `x[end]` or `x[:]`, those indices can be resolved against the stored array whenever the lookup actually happens. Now, the original problem with VarInfo was that VarInfo never stored what `x` itself was, so `end` and `:` had to be concretised eagerly inside the model.

Apart from this, I think there's genuinely no reason why the following shouldn't work:

```julia
using Turing, FlexiChains
using LinearAlgebra: I  # for the identity covariance

@model function f()
    x ~ Poisson(1.0)
    y ~ MvNormal(zeros(x + 1), I)
end

chn = sample(f(), Prior(), 100; chain_type=VNChain)
```

Now, one can argue until the cows come home about whether this is a 'meaningful' model (whatever that means). But the chain has no reason to care about this. Regardless of the model, once the chain is created, I should in principle be able to index into the chain with `chn[@varname(x[end])]` to obtain the last element of each draw. Currently, that `VarName` cannot even be constructed outside the model:

```julia
julia> using AbstractPPL; @varname(x[end])
ERROR: UndefVarError: `x` not defined in `Main`
```

The same considerations apply to `:`. This is pretty much the path I'm going down with TuringLang/AbstractPPL.jl#150.
This PR primarily adds support for concretized slices to VarNamedTuple. That is actually all that is needed to make it support (I think) all the same `:` syntax we previously supported in `@model`. Turns out we were already concretising all `VarName`s in the model body itself, so there was little to do. The PR also addresses some related issues, adds a test for `:`s (we previously had zero models in our test suite that used `:` in an assume statement), and starts a HISTORY.md entry.

If we are happy with all the new limitations on square bracket indexing, we could say that this makes VNT-for-LDF feature complete. However, adding support for dictionaries shouldn't be too hard, and maybe linear indexing wouldn't be awful either, so I would give those a day's worth of work before deciding that we just flat out ban them. It would be a shame to have a few DPPL versions for which they don't work, just to then add support a few weeks later.

To add support for the aforementioned cases (see also the HISTORY.md entry), my idea is that you would somehow provide VNT with information about which type of object this `VarName` that has an `IndexLens` is indexing, and you could then do different things with `make_leaf` and `PartialArray` depending on the type. That would solve the `Dict` and linear indexing cases. Unusually indexed arrays would be harder: I don't know how to make e.g. `OffsetArray`s work without manually writing a version of `PartialArray` that explicitly uses them. Maybe there could be some way to do this generically for any `AbstractArray` subtype, but I worry that the interface for `AbstractArray` may not be well-defined enough for that.
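For reference, here is roughly what concretisation looks like today via AbstractPPL's `concretize`. This is a sketch only; the exact call and the concrete output types may differ between versions:

```julia
using AbstractPPL

x = randn(4)
vn = @varname(x[:])                  # un-concretised: the lens still holds a plain Colon
cvn = AbstractPPL.concretize(vn, x)  # the Colon is resolved against x's actual axes
```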