Current .h5 dataset loading mechanism is problematic #3

@henrysky

Description

Currently, this is viewed as a low-priority, performance-related issue. It probably won't be fixed in the near future.

System information

  • Have I written custom code?: Irrelevant
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04 or Windows 10 v1709 x64): Irrelevant
  • astroNN (Build or Version): commit 29fde34
  • TensorFlow installed from (source or binary, official build?): Irrelevant
  • TensorFlow version: Irrelevant
  • Keras version: Irrelevant
  • Python version: Irrelevant
  • CUDA/cuDNN version (Only necessary if you are using Tensorflow-gpu): Irrelevant
  • GPU model and memory (Only necessary if you are using Tensorflow-gpu): Irrelevant
  • Exact command/script to reproduce (optional): Irrelevant

Describe the problem

The current .h5 dataset loading mechanism is problematic because astroNN loads the whole dataset into memory regardless of its size. This will eventually become a serious problem if the dataset is too big and there is too little memory. (It is already a minor problem when loading the APOGEE training data, ~12GB, on my 16GB RAM laptop and desktop.)

Source code / logs

Irrelevant

Suggestion

The neural network / data generator should talk to H5Loader directly, instead of H5Loader loading the whole dataset into memory and then handing it to the neural network / data generator.
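A minimal sketch of what such a lazy loader could look like, assuming `h5py` and hypothetical dataset names (`LazyH5Generator`, the `x_name`/`y_name` parameters, and the class shape are illustrative, not astroNN's actual API). The idea is that `h5py` only reads the sliced rows from disk, so memory usage stays around one batch rather than the full dataset:

```python
import h5py
import numpy as np


class LazyH5Generator:
    """Yield mini-batches directly from an HDF5 file on disk.

    Only the current batch is materialized in memory; h5py reads
    each requested slice lazily instead of loading the whole dataset.
    """

    def __init__(self, path, x_name, y_name, batch_size=32):
        self.path = path
        self.x_name = x_name  # name of the input dataset inside the file
        self.y_name = y_name  # name of the label dataset inside the file
        self.batch_size = batch_size

    def __iter__(self):
        with h5py.File(self.path, "r") as f:
            x, y = f[self.x_name], f[self.y_name]
            n = x.shape[0]
            for start in range(0, n, self.batch_size):
                stop = min(start + self.batch_size, n)
                # Slicing an h5py Dataset reads only these rows from disk
                yield x[start:stop], y[start:stop]
```

A data generator like this could be handed straight to the training loop, so H5Loader never has to build the full in-memory arrays.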
