Hi, I would like to implement the diffusion pretraining described in the paper. My understanding of ORB diffusion pretraining is:

- Noise is added as $x_\sigma = x_0 + \sigma \epsilon$.
- The model predicts a vector that is interpreted as $\epsilon$, so it is trained to match the added noise rather than to recover the original positions (or some other target).
- $\sigma$ is sampled log-uniformly from a range.
- The loss is an MSE on $\epsilon$, weighted by $1 / \sigma^2$.
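To make sure I'm reading the scheme correctly, here is a minimal numpy sketch of my understanding. The `sigma_min`/`sigma_max` range, the `model` placeholder, and applying one `sigma` per call are my assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sigma(n, sigma_min=0.01, sigma_max=1.0):
    # Log-uniform sampling: uniform in log(sigma).
    # The [sigma_min, sigma_max] range here is a placeholder;
    # the paper does not state the values used.
    return np.exp(rng.uniform(np.log(sigma_min), np.log(sigma_max), size=n))

def denoising_loss(model, x0, sigma):
    # Add Gaussian noise scaled by sigma: x_sigma = x0 + sigma * eps.
    eps = rng.standard_normal(x0.shape)
    x_sigma = x0 + sigma * eps
    # The model predicts eps directly; note sigma is NOT passed to the
    # network, matching my reading that it is trained unconditionally.
    eps_pred = model(x_sigma)
    # MSE on eps, weighted by 1 / sigma^2.
    return np.mean((eps_pred - eps) ** 2) / sigma**2
```

For example, with a dummy model `lambda x: np.zeros_like(x)` the loss reduces to the mean squared noise divided by $\sigma^2$.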
I'm a bit unsure about the following points:

- $\sigma$ is not passed as an explicit input to the network (e.g. via a sinusoidal embedding), and the model is trained unconditionally across $\sigma$. Is this correct? I.e. a single network, with the same parameters, trained on a mixture of noise levels.
- Is the $1 / \sigma^2$ factor the only sigma-dependent normalization?
- Is $\sigma$ sampled per batch or per structure?
- The paper says $\sigma$ increases over time, but it doesn't state what range/values were used. :)
Any clarification would be very helpful.