I bet @ryanlovett knows how this one works! The relevant section is here: https://github.com/choldgraf/dsep_stack/blob/master/tech/data.md#larger-datasets