Open
Description
I would like to use the Grokfast algorithm, which adds an exponentially weighted mean of past gradients to the current gradients, together with GradCache. I have been able to use it without GradCache, but I am not sure where it would fit in with GradCache, as I am still learning GradCache's underlying mechanisms. Any direction on how this might be done? I am also curious whether this would be an appropriate feature for this library.
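For reference, here is a minimal sketch of the kind of filter I mean, assuming PyTorch. The function name `grokfast_ema_filter` and the defaults `alpha=0.98` and `lamb=2.0` are my own illustrative choices, not part of GradCache. My guess is that, if GradCache leaves the fully accumulated gradients in each parameter's `.grad`, the filter could be called once per batch after the GradCache step and before `optimizer.step()`. In the demo below a plain `loss.backward()` stands in for the GradCache step, so this does not show GradCache's actual API.

```python
import torch
from torch import nn


def grokfast_ema_filter(model, ema_grads=None, alpha=0.98, lamb=2.0):
    """Add lamb * EMA(past gradients) to each parameter's .grad, in place.

    Hypothetical helper for illustration. `ema_grads` is the running state
    returned by the previous call (None on the first call).
    """
    ema_grads = {} if ema_grads is None else ema_grads
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is None:
                continue
            if name not in ema_grads:
                # First time this parameter is seen: seed the average with its gradient.
                ema_grads[name] = p.grad.clone()
            else:
                # ema <- alpha * ema + (1 - alpha) * grad
                ema_grads[name].mul_(alpha).add_(p.grad, alpha=1 - alpha)
            # grad <- grad + lamb * ema
            p.grad.add_(ema_grads[name], alpha=lamb)
    return ema_grads


# Tiny demo loop. loss.backward() is only a placeholder for whatever
# step populates .grad (for example a gradient-caching step).
torch.manual_seed(0)
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
ema_state = None
for _ in range(3):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 4)).pow(2).mean()
    loss.backward()
    ema_state = grokfast_ema_filter(model, ema_state)
    optimizer.step()
```

The filter only reads and writes `.grad`, so it should not matter how the gradients were produced, as long as they are complete when it runs. Calling it once per sub-batch chunk instead of once per full batch would update the average too often, which is the part I am unsure about with GradCache.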