Hello,
Thanks for sharing your insight and code. I tried to run your code but found that the NYT corpus dataset is no longer available at
https://catalog.ldc.upenn.edu/LDC2008T19 . Do you know of any mirror repositories where this dataset is available?
In any case, thanks for your patience and help!