106 changes: 106 additions & 0 deletions HISTORY.md
@@ -2,6 +2,112 @@

## 0.40

### Changes to indexing random variables with square brackets

0.40 internally reimplements how DynamicPPL handles random variables like `x[1]`, `x.y[2,2]`, and `x[:,1:4,5]`, i.e. ones that use indexing with square brackets.
Most of this is invisible to users, but it has some effects that show on the surface.
The gist of the changes is that any indexing by square brackets is now implicitly assumed to be indexing into a regular `Base.Array`, with 1-based indexing.
The general effect is that the rules for what is and isn't allowed are now stricter: some syntax that used to be allowed is now forbidden, but the syntax that remains allowed is guaranteed to work correctly.
(Previously there were some sharp edges around these sorts of variable names.)

#### No more linear indexing of multidimensional arrays

Previously you could do this:

```julia
x = Array{Float64,2}(undef, (2, 2))
x[1] ~ Normal()
x[1, 1] ~ Normal()
```

Now you can't: this will error.
If you first create a variable like `x[1]`, DynamicPPL from there on assumes that this variable only takes a single index (like a `Vector`).
It will then error if you try to index the same variable with any other number of indices.

The same logic also bans this, which likewise was previously allowed:

```julia
x = Array{Float64,2}(undef, (2, 2))
x[1, 1, 1] ~ Normal()
x[1, 1] ~ Normal()
```

This made use of Julia allowing trailing indices of `1`.
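For reference, this is what plain Julia permits:

```julia
A = rand(2, 2)
A[1, 1, 1] == A[1, 1]  # true: Julia accepts trailing indices of 1
```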

Note that the above models were previously quite dangerous and easy to misuse, because DynamicPPL was oblivious to the fact that e.g. `x[1]` and `x[1,1]` refer to the same element.
Both of the above examples previously created 2-dimensional models, with two distinct random variables, one of which effectively overwrote the other in the model body.

TODO(mhauru) This may cause surprising issues when using `eachindex`, which is generally encouraged, e.g.

```julia
x = Array{Float64,2}(undef, (3, 3))
for i in eachindex(x)
    x[i] ~ Normal()
end
```

Maybe we should fix linear indexing before releasing?
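In the meantime, a possible workaround (a sketch, relying only on the multidimensional indexing described above) is to loop over explicit axes instead of `eachindex`:

```julia
x = Array{Float64,2}(undef, (3, 3))
for i in axes(x, 1), j in axes(x, 2)
    x[i, j] ~ Normal()
end
```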

#### No more square bracket indexing with arbitrary keys

Previously you could do this:

```julia
x = Dict()
x["a"] ~ Normal()
```

Now you can't: this will error.
This is because DynamicPPL now assumes that if you are indexing with square brackets, you are dealing with an `Array`, for which `"a"` is not a valid index.
You can still use a dictionary on the left-hand side of a `~` statement as long as the indices are valid indices to an `Array`, e.g. integers.
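For instance, something like this should still work, since the keys are valid `Array` indices:

```julia
x = Dict()
x[1] ~ Normal()
x[2] ~ Normal()
```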

#### No more unusually indexed arrays, such as `OffsetArrays`

Previously you could do this:

```julia
using OffsetArrays
x = OffsetArray(Vector{Float64}(undef, 3), -3)
x[-2] ~ Normal()
0.0 ~ Normal(x[-2])
```

Comment on lines +69 to +74

It strikes me that a lot of the problems we have here are actually incredibly similar to the problems of creating shadow memory in AD packages. The idea is that inside the model there is an `x`, but inside the VNT there is a shadow copy of `x` as well. cc @yebai

@penelopeysm (Member), Dec 20, 2025:

I think one possible general solution to this is going to be something that's similar to what ForwardDiff does (at least for AbstractArray). Technically, it's even closer to a sparsity tracer.

When x is initialised to an OffsetArray{T}, we want the shadow of x (let's call it dx) to be initialised to the same OffsetArray{Dual{T}}. Each of these 'duals' needs to carry the value, along with the boolean mask indicating whether it was set or not. This is of course exactly the same as how PartialArray has one array of values and one array of masks.

```julia
struct Dual{T}
    val::T
    has_been_set::Bool
end
Dual(val) = Dual(val, false)
```

Then when x[idxs...] is set to whatever rand(Normal()) returns, we set dx[idxs...] = Dual(x[idxs...], true). Of course, the block elements will need the same workaround as before (complete with all the typejoin shenanigans).

```julia
struct BlockDual{T}
    val::T
    original_indices::whatever  # placeholder type, as in the original comment
end
```
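
In sketch form, the write path described above could look like this (hypothetical names, not actual DynamicPPL API):

```julia
# Record both the value and the fact that it has been set.
function shadow_setindex!(dx::AbstractArray{<:Dual}, x::AbstractArray, idxs...)
    dx[idxs...] = Dual(x[idxs...], true)
    return dx
end
```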

Then, reading from the shadow memory becomes way more consistent, because indexing into dx carries the same semantics as indexing into x.

For another example, let's say that x and dx are regular Base.Matrix. Then when you index into dx[1, 1], it will be the same as indexing into dx[1]. You could even make dx[1] = ... error when you try to modify a value that was already set. For BlockDuals, you could further 'resolve' or 'normalise' the indices into a standard, 1-based scheme because you now know the type of the container (I think Base.to_indices does this, but not sure; in any case this should be easy to figure out). That allows you to use different indexing schemes to read the data, as long as they refer to the same elements.

A scheme like this would also enable 'native support' for not only OffsetArray, but also things like DimensionalData.DimArray which probably has way more legitimate uses than OffsetArray.

This looks like a lot of work for VNT to do, but we are already duplicating the entire array in the VNT*, so in principle I don't think performance should be any worse. Indexing into this would still be type stable (in the absence of blocks, but that's the same as currently) and you would still be able to do all the checks you need on whether the value was set or not.

In particular, I think that this should be fairly easy to implement for all AbstractArrays. (In fact, we know for a fact that it must be possible, because other packages do this successfully.) Extending this to dictionaries, or generally any type that supports getindex, would be annoying (for every new type you implement, you have to define the equivalent of a 'tangent type'†); it's doable, but it could be left for a later stage. I think the AbstractArray support would already solve 99% of pain points with this, and the path forwards to supporting other types would be clear, and when some type isn't supported, you can error with a very clear error message saying that "indexing into this type on LHS of tilde is not supported", whereas now our battle lines are quite arbitrarily drawn.


* OK, there is one case where this will do a lot more work than VNT currently: if the user has a very large array, like a length-100 vector, but only x[1] ~ ... and never the other elements. I don't really care about optimising performance for such cases. It is impossible to detect these without static analysis (and note that AD has a similar characteristic: if you differentiate f(x) = x[1] and pass in a length-100 vector, the gradient calculation will be equally unoptimised).

† Notice that the way VNT normalises all property access into NamedTuples is exactly what Mooncake's tangent_type does for structs.

A follow-up comment (Member):
(Note that, since tilde_assume!! is currently our only 'hook' into modifying the VNT, this means that you have to pass x into tilde_assume!! as an argument. But I am not opposed to that at all. The alternative would be that you have to inspect the IR to determine when new arrays are being created.)

Reply (Member):
I think this is a sound idea. VNT effectively carries a shadow copy for each LHS variable in tilde statements, and the dual-number mechanism used in autograd can be borrowed.

Now you can't: this will error.
This is because DynamicPPL now assumes that if you are indexing with square brackets, you are dealing with an `Array`, for which `-2` is not a valid index.

#### The above limitations are not fundamental

These new restrictions on what sorts of variable names are allowed aren't fundamental.
With some effort we could e.g. add support for linear indexing, this time done properly, so that e.g. `x[1,1]` and `x[1]` would be the same variable.
Likewise, we could manually add structures to support indexing into dictionaries or `OffsetArrays`.
If this would be useful to you, let us know.

#### This only affects `~` statements

You can still use any arbitrary indexing within your model in statements that don't involve `~`.
For instance, you can use `OffsetArray`s, or linear indexing, as long as you don't put them on the left-hand side of a `~`.
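For example, something like this should remain fine, since the `OffsetArray` only appears on the right-hand side (an illustrative sketch):

```julia
using OffsetArrays

@model function g()
    offsets = OffsetArray([0.1, 0.2], -1)  # indices 0:1, never on the LHS of ~
    m ~ Normal()
    return 0.0 ~ Normal(m + offsets[0])
end
```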

#### Performance benefits

The upside of all these new limitations is that models that use square bracket indexing are now faster.
For instance, take the following model:

```julia
@model function f()
    x = Vector{Float64}(undef, 1000)
    for i in eachindex(x)
        x[i] ~ Normal()
    end
    return 0.0 ~ Normal(sum(x))
end
```

Evaluating the log joint for this model has gotten about 3 times faster in v0.40.
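
A sketch of how one might measure this, assuming the `LogDensityFunction` API used elsewhere in this changeset:

```julia
using BenchmarkTools
using LogDensityProblems

model = f()
vi = DynamicPPL.VarInfo(model)
ldf = DynamicPPL.LogDensityFunction(model, DynamicPPL.getlogjoint_internal, vi)
x = vi[:]
@btime LogDensityProblems.logdensity($ldf, $x)
```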

#### Robustness benefits

TODO(mhauru) Add an example here for how this improves `condition`ing, once `condition` uses `VarNamedTuple`.

## 0.39.4

Removed the internal functions `DynamicPPL.getranges`, `DynamicPPL.vector_getrange`, and `DynamicPPL.vector_getranges` (the new LogDensityFunction construction does exactly the same thing, so these specialised functions were not needed).
1 change: 1 addition & 0 deletions docs/src/internals/varnamedtuple.md
@@ -50,6 +50,7 @@ The typical use of this structure in DynamicPPL is that the user may define valu
This is also the reason why `PartialArray`, and by extension `VarNamedTuple`, do not support indexing by `Colon()`, i.e. `:`, as in `x[:]`.
A `Colon()` says that we should get or set all the values along that dimension, but a `PartialArray` does not know how many values there may be.
If `x[1]` and `x[4]` have been set, asking for `x[:]` is not a well-posed question.
Note, however, that concretising the `VarName` resolves this ambiguity and makes the `VarName` fine as a key to a `VarNamedTuple`.
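A sketch of what concretisation looks like, assuming `AbstractPPL.concretize` (the exact call may differ):

```julia
x = [1.0, 2.0, 3.0]
vn = AbstractPPL.concretize(@varname(x[:]), x)
# The Colon is now a ConcretizedSlice over Base.OneTo(3), so `vn` is a
# valid VarNamedTuple key.
```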

`PartialArray`s have other restrictions, compared to the full indexing syntax of Julia, as well:
They do not support linear indexing into multidimensional arrays (as in `rand(3,3)[8]`), nor indexing with arrays of indices (as in `rand(4)[[1,3]]`), nor indexing with boolean mask arrays (as in `rand(4)[[true, false, true, false]]`).
1 change: 1 addition & 0 deletions src/test_utils.jl
@@ -1,6 +1,7 @@
module TestUtils

using AbstractMCMC
using AbstractPPL: AbstractPPL
using DynamicPPL
using LinearAlgebra
using Distributions
111 changes: 111 additions & 0 deletions src/test_utils/models.jl
@@ -565,6 +565,71 @@ function varnames(model::Model{typeof(demo_assume_matrix_observe_matrix_index)})
    return [@varname(s), @varname(m)]
end

@model function demo_nested_colons(
    x=(; data=[(; subdata=transpose([1.5 2.0;]))]), ::Type{TV}=Array{Float64}
) where {TV}
    n = length(x.data[1].subdata)
    d = n ÷ 2
    s = (; params=[(; subparams=TV(undef, (d, 1, 2)))])
    s.params[1].subparams[:, 1, :] ~ reshape(
        product_distribution(fill(InverseGamma(2, 3), n)), d, 2
    )
    s_vec = vec(s.params[1].subparams)
    # TODO(mhauru) The below element type concretisation is because of
    # https://github.com/JuliaFolds2/BangBang.jl/issues/39
    # which causes, when this is evaluated with an untyped VarInfo, s_vec to be an
    # Array{Any}.
    s_vec = [x for x in s_vec]
    m ~ MvNormal(zeros(n), Diagonal(s_vec))

    x.data[1].subdata[:, 1] ~ MvNormal(m, Diagonal(s_vec))

    return (; s=s, m=m, x=x)
end
function logprior_true(model::Model{typeof(demo_nested_colons)}, s, m)
    n = length(model.args.x.data[1].subdata)
    # TODO(mhauru) We need to enforce a convention on whether this function gets called
    # with the parameters as the model returns them, or with the parameters "unpacked".
    # Currently different tests do different things.
    s_vec = if s isa NamedTuple
        vec(s.params[1].subparams)
    else
        vec(s)
    end
    return loglikelihood(InverseGamma(2, 3), s_vec) +
           logpdf(MvNormal(zeros(n), Diagonal(s_vec)), m)
end
function loglikelihood_true(model::Model{typeof(demo_nested_colons)}, s, m)
    # TODO(mhauru) We need to enforce a convention on whether this function gets called
    # with the parameters as the model returns them, or with the parameters "unpacked".
    # Currently different tests do different things.
    s_vec = if s isa NamedTuple
        vec(s.params[1].subparams)
    else
        vec(s)
    end
    return loglikelihood(MvNormal(m, Diagonal(s_vec)), model.args.x.data[1].subdata)
end
function logprior_true_with_logabsdet_jacobian(
    model::Model{typeof(demo_nested_colons)}, s, m
)
    return _demo_logprior_true_with_logabsdet_jacobian(model, s.params[1].subparams, m)
end
function varnames(::Model{typeof(demo_nested_colons)})
    return [
        @varname(
            s.params[1].subparams[
                AbstractPPL.ConcretizedSlice(Base.Slice(Base.OneTo(1))),
                1,
                AbstractPPL.ConcretizedSlice(Base.Slice(Base.OneTo(2))),
            ]
        ),
        # @varname(s.params[1].subparams[1,1,1]),
        # @varname(s.params[1].subparams[1,1,2]),
        @varname(m),
    ]
end

const UnivariateAssumeDemoModels = Union{
    Model{typeof(demo_assume_dot_observe)},
    Model{typeof(demo_assume_dot_observe_literal)},
@@ -701,6 +766,51 @@ function rand_prior_true(rng::Random.AbstractRNG, model::MatrixvariateAssumeDemo
    return vals
end

function posterior_mean(model::Model{typeof(demo_nested_colons)})
    # Get some containers to fill.
    vals = rand_prior_true(model)

    vals.s.params[1].subparams[1, 1, 1] = 19 / 8
    vals.m[1] = 3 / 4

    vals.s.params[1].subparams[1, 1, 2] = 8 / 3
    vals.m[2] = 1

    return vals
end
function likelihood_optima(model::Model{typeof(demo_nested_colons)})
    # Get some containers to fill.
    vals = rand_prior_true(model)

    # NOTE: These are "as close to zero as we can get".
    vals.s.params[1].subparams[1, 1, 1] = 1e-32
    vals.s.params[1].subparams[1, 1, 2] = 1e-32

    vals.m[1] = 1.5
    vals.m[2] = 2.0

    return vals
end
function posterior_optima(model::Model{typeof(demo_nested_colons)})
    # Get some containers to fill.
    vals = rand_prior_true(model)

    # TODO: Figure out exact for `s[1]`.
    vals.s.params[1].subparams[1, 1, 1] = 0.890625
    vals.s.params[1].subparams[1, 1, 2] = 1
    vals.m[1] = 3 / 4
    vals.m[2] = 1

    return vals
end
function rand_prior_true(rng::Random.AbstractRNG, ::Model{typeof(demo_nested_colons)})
    svec = rand(rng, InverseGamma(2, 3), 2)
    return (;
        s=(; params=[(; subparams=reshape(svec, (1, 1, 2)))]),
        m=rand(rng, MvNormal(zeros(2), Diagonal(svec))),
    )
end

"""
A collection of models corresponding to the posterior distribution defined by
the generative process
@@ -749,6 +859,7 @@ const DEMO_MODELS = (
    demo_dot_assume_observe_submodel(),
    demo_dot_assume_observe_matrix_index(),
    demo_assume_matrix_observe_matrix_index(),
    demo_nested_colons(),
)

"""
31 changes: 21 additions & 10 deletions src/varnamedtuple.jl
@@ -30,7 +30,7 @@ end
# The non-generated function implementations of these would be
# _has_colon(::T) where {T<:Tuple} = any(x <: Colon for x in T.parameters)
# function _is_multiindex(::T) where {T<:Tuple}
-# return any(x <: UnitRange || x <: Colon for x in T.parameters)
+# return any(x <: AbstractUnitRange || x <: Colon for x in T.parameters)
# end
# However, constant propagation sometimes fails if the index tuple is too big (e.g. length
# 4), so we play it safe and use generated functions. Constant propagating these is
@@ -39,18 +39,18 @@
@generated function _has_colon(::T) where {T<:Tuple}
    for x in T.parameters
        if x <: Colon
-           return :(true)
+           return :(return true)
        end
    end
-   return :(false)
+   return :(return false)
end
@generated function _is_multiindex(::T) where {T<:Tuple}
    for x in T.parameters
-       if x <: UnitRange || x <: Colon
-           return :(true)
+       if x <: AbstractUnitRange || x <: Colon || x <: AbstractPPL.ConcretizedSlice
+           return :(return true)
        end
    end
-   return :(false)
+   return :(return false)
end

"""
Expand All @@ -74,7 +74,10 @@ _merge_recursive(_, x2) = x2
const PARTIAL_ARRAY_DIM_GROWTH_FACTOR = 4

"""A convenience for defining method argument type bounds."""
-const INDEX_TYPES = Union{Integer,UnitRange,Colon}
+const INDEX_TYPES = Union{Integer,AbstractUnitRange,Colon,AbstractPPL.ConcretizedSlice}
+
+_unwrap_concretized_slice(cs::AbstractPPL.ConcretizedSlice) = cs.range
+_unwrap_concretized_slice(x::Union{Integer,AbstractUnitRange,Colon}) = x

"""
ArrayLikeBlock{T,I}
@@ -376,7 +379,7 @@ end

"""Return the length needed in a dimension given an index."""
_length_needed(i::Integer) = i
-_length_needed(r::UnitRange) = last(r)
+_length_needed(r::AbstractUnitRange) = last(r)

"""Take the minimum size that a dimension of a PartialArray needs to be, and return the size
we choose it to be. This size will be the smallest possible power of
@@ -447,12 +450,16 @@ function _check_index_validity(pa::PartialArray, inds::NTuple{N,INDEX_TYPES}) wh
        throw(BoundsError(pa, inds))
    end
    if _has_colon(inds)
-       throw(ArgumentError("Indexing PartialArrays with Colon is not supported"))
+       msg = """
+           Indexing PartialArrays with Colon is not supported.
+           You may need to concretise the `VarName` first."""
+       throw(ArgumentError(msg))
    end
    return nothing
end

function _getindex(pa::PartialArray, inds::Vararg{INDEX_TYPES})
+   inds = _unwrap_concretized_slice.(inds)
    _check_index_validity(pa, inds)
    if !(checkbounds(Bool, pa.mask, inds...) && all(@inbounds(getindex(pa.mask, inds...))))
        throw(BoundsError(pa, inds))
@@ -501,6 +508,7 @@ end
end

function _haskey(pa::PartialArray, inds::NTuple{N,INDEX_TYPES}) where {N}
+   inds = _unwrap_concretized_slice.(inds)
    _check_index_validity(pa, inds)
    hasall =
        checkbounds(Bool, pa.mask, inds...) && all(@inbounds(getindex(pa.mask, inds...)))
@@ -528,6 +536,7 @@ end
end

function BangBang.delete!!(pa::PartialArray, inds::Vararg{INDEX_TYPES})
+   inds = _unwrap_concretized_slice.(inds)
    _check_index_validity(pa, inds)
    if _is_multiindex(inds)
        pa.mask[inds...] .= false
@@ -537,7 +546,7 @@ end
    return pa
end

-_ensure_range(r::UnitRange) = r
+_ensure_range(r::AbstractUnitRange) = r
_ensure_range(i::Integer) = i:i

"""
@@ -580,6 +589,7 @@ function _needs_arraylikeblock(value, inds::Vararg{INDEX_TYPES})
end

function _setindex!!(pa::PartialArray, value, inds::Vararg{INDEX_TYPES})
+   inds = _unwrap_concretized_slice.(inds)
    _check_index_validity(pa, inds)
    pa = if checkbounds(Bool, pa.mask, inds...)
        pa
@@ -733,6 +743,7 @@ There are two major limitations to indexing by VarNamedTuples:

* `VarName`s with `Colon`s (e.g. `a[:]`) are not supported. This is because the meaning of
  `a[:]` is ambiguous if only some elements of `a`, say `a[1]` and `a[3]`, are defined.
  However, _concretised_ `VarName`s with `Colon`s are supported.
* Any `VarName`s with `IndexLens`es must have a consistent number of indices. That is, one
  cannot set `a[1]` and `a[1,2]` in the same `VarNamedTuple`.

6 changes: 5 additions & 1 deletion test/logdensityfunction.jl
@@ -113,7 +113,11 @@ end
            end
            ldf = DynamicPPL.LogDensityFunction(m, DynamicPPL.getlogjoint_internal, vi)
            x = vi[:]
-           @inferred LogDensityProblems.logdensity(ldf, x)
+           # The below type inference fails on v1.10.
+           @test begin
+               @inferred LogDensityProblems.logdensity(ldf, x)
+               true
+           end skip = (VERSION < v"1.11.0")
        end
    end
end
Expand Down
Loading