Todo before manuscript draft #2

@dyurovsky

Description
  • 1. Compute barycenter losses for sizes 1-5
  • 2. Use the knee (elbow) of the loss curve to pick a barycenter size (probably 5 or 6)
  • 3. Get the barycenter loss between the global barycenter estimated from hclust and every language's barycenter, for unigrams and trigrams. Use this to see whether there's more variability in the unigram curves. Some kind of baseline?
  • 4. Try doing this within and across language families
  • 5. Make a nice hclust visualization. Color languages by family. Zoom in on part?
  • 6. Finish up the MI analyses. Come to a conclusion
  • 7. Flesh out Study 2
  • 8. Discussion/Conclusion
  • 9. Reproduce Yu et al. plot
  • 10. English Wikipedia
  • 11. k-fold cross-validation splitting files
  • 12. k-fold cross-validation storing models
  • 13. k-fold cross-validation storing outputs
  • 14. Maurits, Perfors and Navarro (2010) UID deviation metric
  • 15. Centering the curves somehow
  • 16. Making the repo approachable
  • 17. If you want to replicate our pipeline, what should you do?
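
For item 2, a minimal sketch of one way to pick the knee, assuming we have one loss value per barycenter size. The issue does not say which knee metric to use; this sketch uses the "farthest point from the chord" heuristic (rescale both axes to [0, 1], draw a line from the first to the last point, pick the size farthest from it). The function name `knee_size` and the loss values in the usage example are hypothetical, not project results.

```python
from typing import Sequence


def knee_size(sizes: Sequence[float], losses: Sequence[float]) -> float:
    """Return the size at the knee of a loss curve.

    Heuristic (an assumption, not necessarily the metric intended in the
    issue): after rescaling both axes to [0, 1], pick the point with the
    largest perpendicular distance from the chord joining the first and
    last points of the curve.
    """
    if len(sizes) != len(losses) or len(sizes) < 3:
        raise ValueError("need at least 3 (size, loss) pairs of equal length")

    x_span = sizes[-1] - sizes[0]
    y_min, y_max = min(losses), max(losses)
    y_span = y_max - y_min
    if x_span == 0 or y_span == 0:
        # Flat or degenerate curve: no knee to find, fall back to first size.
        return sizes[0]

    # Rescale so that neither axis dominates the distance computation.
    xs = [(x - sizes[0]) / x_span for x in sizes]
    ys = [(y - y_min) / y_span for y in losses]

    # Chord from the first to the last rescaled point.
    dx = xs[-1] - xs[0]
    dy = ys[-1] - ys[0]
    norm = (dx * dx + dy * dy) ** 0.5

    def distance(i: int) -> float:
        # Perpendicular distance from point i to the chord.
        return abs(dy * (xs[i] - xs[0]) - dx * (ys[i] - ys[0])) / norm

    best = max(range(len(sizes)), key=distance)
    return sizes[best]


if __name__ == "__main__":
    # Hypothetical losses for barycenter sizes 1-5, for illustration only.
    example_sizes = [1, 2, 3, 4, 5]
    example_losses = [10.0, 4.0, 2.0, 1.5, 1.2]
    print(knee_size(example_sizes, example_losses))
```

The rescaling step matters: without it the result depends on the units of the loss. If the knee lands at the largest size tried, that is a hint the range of sizes should be extended before settling on one.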

Labels

enhancement (New feature or request)