This early project pulls planet-traversal images from Stellarium and contains two data sets: a bit mask of the planets, and a raw image dataset. We have also added functionality to generate Fourier and convolution-kernel variants of the data.
Stellarium download: https://stellarium.org/
Run `pip install -r requirements.txt` to install all dependencies.
Edit `src/collect_data.ssc` to configure:
- The screenshot output directory. Make sure that folder exists; if it doesn't, create one and update the directory accordingly.
- The number of iterations the script will run for.
Next:
For the Python scripts, run `python <script name> -h` for detailed command-line argument help.
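The `-h` flag suggests the scripts build their interfaces with `argparse`; a minimal sketch of that pattern (the argument names below are illustrative, not the repository's actual interface):

```python
import argparse

# Hypothetical CLI skeleton -- argument names are illustrative only;
# run the real scripts with -h to see their actual options.
parser = argparse.ArgumentParser(description="Process Stellarium screenshots.")
parser.add_argument("--input-dir", default="Raw Images", help="folder of screenshots")
parser.add_argument("--expand", type=int, default=4, help="granularity expansion factor")

# Passing an explicit list here stands in for real command-line arguments.
args = parser.parse_args(["--expand", "2"])
print(args.expand)
```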
- Open Stellarium, disable satellite hints and meteor showers
- Pick a location
- Pick an FOV
- Press F12 to open the script console and load `src/collect_data.ssc`
- Run `src/collect_data.ssc`, then close the script console so it doesn't appear in the images
- Wait for the script to terminate
- Navigate to `%appdata%/Stellarium`, get `output.txt`, and place it in the same folder as the images
- Run `src/process.py`
- Run `src/helper.py` or `src/featurize_fourier.py` to augment and featurize the data. `src/helper.py` compiles all the object bit masks into one overall `.npz` array and can increase the granularity of the data. `src/featurize_fourier.py` extracts Fourier features from the output of `src/helper.py`.
- For `src/featurize_fourier.py`, change the file directory used to your personal directory containing the cleaned `.npz` files, and make sure the data files are in the same directory as `src/featurize_fourier.py`.
- Note: due to the way NumPy stores complex numbers, the Fourier versions of these files take up a lot of space (~8 GB per file), so be warned.
- The cleaned data in the `.npz` files can be downloaded from https://drive.google.com/drive/folders/1TFwxn_xk9RVnpkibTUW9kg-ef_b5GC0S?usp=sharing. This data was created using `src/helper.py` with an expand argument of 4.
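The size warning above follows from NumPy's complex storage: `np.fft.fft2` returns a `complex128` array at 16 bytes per element, versus 1 byte per element for a `uint8` bit mask. A minimal sketch (the mask shape is illustrative; the real featurizer lives in `src/featurize_fourier.py`):

```python
import numpy as np

# Illustrative bit mask: a 144x144 binary image like those in the cleaned .npz files.
mask = np.zeros((144, 144), dtype=np.uint8)
mask[60:84, 60:84] = 1  # a square "sprite" on a black background

# The 2-D FFT yields complex128: 16 bytes/element vs. 1 byte for uint8,
# so the Fourier representation is 16x larger per image before compression.
spectrum = np.fft.fft2(mask)
print(spectrum.dtype)                  # complex128
print(spectrum.nbytes // mask.nbytes)  # 16
```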
Training a model:
An example model is included inside this repository, described at the bottom of this README. You can run this model after running `src/process.py` and `src/helper.py`, or after downloading the cleaned data `.npz` files from the Google Drive link above. Place them into a folder called `combined_data_matrix` and run `src/train_basic.py`. More detailed instructions are in the section of this README titled About our Model.
| ![]() | ![]() | ![]() |
| --- | --- | --- |
| Raw Image | Bit Mask Images | Object Specific Mask Images (Jupiter) |
Inside `Raw Images/` is an example output from running `src/process.py` on a collection of 25,000 images taken from Stellarium. It contains all the outputs described in the write-up and shown in the images above, and the provided helper functions can be run on this sample data.
Tutorial - What is a variational autoencoder?: https://jaan.io/what-is-variational-autoencoder-vae-tutorial/
This blog post describes the purpose and construction of variational autoencoders. This helps build intuition for understanding our choice of using the Beta-VAE.
Understanding Disentangling in Beta-VAE: https://arxiv.org/abs/1804.03599
This paper discusses the purpose of adding a beta term in front of the KL divergence term in the loss function of a VAE.
We decided to use a VAE to reconstruct images of the bit-masked data to demonstrate the viability of using our dataset for image-to-image tasks. As an auxiliary task, we attempted to create smooth and disentangled traversals of the latent distributions. (Examples of smooth and disentangled traversals appear in the Understanding Disentangling in Beta-VAE paper.) The Beta-VAE allows us to enforce stricter disentanglement by weighting the KL divergence term more heavily.
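For a diagonal-Gaussian encoder, the KL term that the beta weight scales has a closed form: KL = ½ Σ (σ² + μ² − 1 − log σ²). A NumPy sketch of that term (an illustration of the loss component, not the repository's training code):

```python
import numpy as np

def beta_vae_kl(mu, logvar, beta=4.0):
    """Beta-weighted KL divergence between N(mu, diag(exp(logvar))) and N(0, I)."""
    # Closed-form KL for diagonal Gaussians; always non-negative.
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)
    return beta * kl

# When the posterior equals the prior (mu = 0, log-variance = 0), the KL vanishes;
# any mismatch is penalized beta times more strongly than in a plain VAE.
print(beta_vae_kl(np.zeros(10), np.zeros(10)))  # 0.0
```

Setting `beta=1.0` recovers the standard VAE objective; larger values trade reconstruction quality for disentanglement.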
Our specific model works on our bit-masked images: 144×144 pixels, grayscale.
- (Optional) Change `fpaths` in `src/train_basic.py` to select the planets to train the model on
- Run `src/train_basic.py`
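Before training, it can help to inspect one of the cleaned archives; the sketch below builds a stand-in `.npz` file and reads it back (the array key `"masks"` and the shape are assumptions, not the real format — check the actual files with `npz.files`):

```python
import os
import tempfile
import numpy as np

# Create a stand-in file mimicking a cleaned bit-mask archive.
# The key name "masks" and shape (100, 144, 144) are assumptions for illustration.
tmp = os.path.join(tempfile.mkdtemp(), "example.npz")
np.savez_compressed(tmp, masks=np.zeros((100, 144, 144), dtype=np.uint8))

with np.load(tmp) as npz:
    print(npz.files)           # the archive's array names
    print(npz["masks"].shape)  # (100, 144, 144)
```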
Reconstruction Example
The images on the top row are the input images; the images on the bottom are the reconstructed images. Each image consists of a white sprite surrounded by black.
Traversal Example
The images in the topmost row are the inputs; the images in the second row are the reconstructions. The rows after that represent a traversal of the latent distribution. We think the traversals could be smoother and more disentangled with hyperparameter tuning and different model complexities. See the Early Project Writeup for more details.



