Hi,
Thanks for sharing this great piece of software!
I have been playing around with it and noticed that when launching the training of a ViT model (specifically a ViT-M, launched with beast train ..), all modules are in eval mode:
| Name | Type | Params | Mode
------------------------------------------
0 | vit_mae | ViTMAE | 111 M | eval
------------------------------------------
111 M Trainable params
252 K Non-trainable params
111 M Total params
447.631 Total estimated model params size (MB)
0 Modules in train mode
353 Modules in eval mode

I was wondering why this is. My understanding from reading the paper and the code is that we are pretraining a transformer on unlabelled frames. But wouldn't eval mode prevent any gradient computation? The TensorBoard logs seem to indicate learning is happening, though. What am I missing?
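For what it's worth, a minimal plain-PyTorch sketch (not from this repo, just a generic illustration) suggests the two things are independent: `eval()` only flips the `training` flag that layers like dropout and batchnorm consult, while gradient computation is governed by `requires_grad` and `torch.no_grad()`. So a module summarized as "eval" can still train:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the ViT: any module works for this check.
model = nn.Linear(4, 2)
model.eval()  # switch to eval mode; does NOT freeze parameters

# Forward + backward still build a graph and produce gradients.
out = model(torch.randn(3, 4)).sum()
out.backward()

print(model.training)             # False: module is in eval mode
print(model.weight.grad is None)  # False: gradients were still computed
```

If that reasoning holds, the "eval" column in the Lightning summary would only reflect the module flag, which is consistent with the TensorBoard logs showing learning.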
Thanks for any tips, and for the work; it is very exciting research!