implementing a database interface

# the issue

ScaleWiz currently manages data storage by looking for JSON files. 

It performs some bulk data mutation and serialization occasionally. 
When updates to data do occur, the entire data file must be written and read from again. This is inefficient.

Further, the lack of a centralized store means data can be duplicated 2-3 times in memory (`TestHandler`, `ProjectEditor`, `EvaluationWindow`) , with each copy pending being mutated by some UI widget.  Problematic operations like this are currently disabled on the GUI layer.

Competing edits could result in data loss, so it would be better to just use a real database.
Data security here should be worth the overhead of having to make database retrievals/insertions.

# possible solutions


The two best options appear to be [MongoDB](https://www.mongodb.com/), and [TinyDB](https://github.com/msiemens/tinydb).

MongoDB would be the more comprehensive, robust option. However, it requires talking to an actual instance of MongoDB either locally or remotely. This makes ScaleWiz harder to install/setup initially, with some potential long term benefits.

TinyDB is a less comprehensive option, but is user-friendly and easily extensible. It is pure Python, and doesn't require talking to a external process / server instance. ScaleWiz could still simply be pip installed.

Both options work by serializing data in a very dict-compatible format. MongoDB uses BSON and TinyDB uses JSON (or YAML, etc.) ScaleWiz models are already stored in JSON so migrating to it should be pretty easy.  Migrating from TinyDB to something like MongoDB should be similarly easy if the need arises.

# the work 

Resolving this issue has a couple implementation goals
1. structuring the database
2. hooking up to the database
3. port the existing data to the new database

## structure 

One of the first and most important implementation details to sort out will be how to actually find the documents once they are in the database. Right now we rely on the user navigating the filesystem to show us where the relevant chunk of JSON is. This file's path is effectively its 'universal identifier'.

A possible solution is to add to each `Project` a `uuid`, which could be stored as a hex string such as 

`1d849bddb8d14f4da6743770033e463f`

We can generate these on the fly relatively cheaply. they are almost guaranteed to be collision free.

A TinyDB instance could be defined in the __init__.py of the package for easy access. All modifications to it should be passed through the main tkinter thread though (much like the GUI updates)

getting started will be kind of awkward. right now we rely on the user to navigate the filesystem to point us to the data. instead we will need to be able to query for it reliably. we also need to be able to relay the visual representation of the available data to the user. probably with something like a [treeview](https://tkdocs.com/tutorial/tree.html)
![image](https://user-images.githubusercontent.com/1144380/121220238-0dd30380-c84a-11eb-8581-345f0476736a.png)


also see the difference between these possible schemas 


https://github.com/teauxfu/scalewiz/blob/5523105262126654d62f7c37ed70d81190302bf6/sampledb.json#L1-L42

https://github.com/teauxfu/scalewiz/blob/5523105262126654d62f7c37ed70d81190302bf6/sampledb.yaml#L1-L33

[JSON vs YAML as a format](https://stackoverflow.com/questions/1726802/what-is-the-difference-between-yaml-and-json?rq=1)

## hooking up

the current data cycle is like 

JSON file -> Python dict -> Python object (tkinter variable model) -> Python dict -> JSON

this isn't very elegant, but the tkinter variables are easy to add traces to (fire a callback when the variable is updated).
this affords some nice functionality
- makes it easier to keep certain properties up to date (eg. `Test.name` and `Project.name`)
- it is easy to update chunks of the underlying data based in UI inputs (eg. updating scores when adding blanks to a report)

related #21 

a possibly better two-way flow might be like 

database (manages its own file, we don't care what kind)  <->  Python object (tkinter vars)

a pain point in the UI previously has been keeping multiple copies of the same `Project` in memory, each with their own view, and having to do lots of file read/writing to rebuild the UI

- the `TestHandler` which must read experiment params from the `Project`
  - it occasionally appends `Tests` to the `Project`, and can create new `Project`s but never independently mutates the metadata on the `Project` nor its existing `Test`s
- the `ProjectEditor` can make new `Project`s and read/write all of a `Project` metadata. cannot mutate existing `Test`s
- the `EvaluationWindow` can read/write to the list of `Test`s in a `Project`, read/write a couple `Project` metadata fields (`plot` and `path` iirc)

the new database implementation should 

1. avoid having to duplicate the `Project` object in this way
2. read from the database to `build` the GUI
3. update the database when appropriate

updates could be done immediately, or in batches by keeping a memo of update operations and sending them all off when the user clicks save etc. often it makes sense to let the user preview changes before persisting them

## porting

Whatever the implementation ends up being, it must provide a path to migrate older data. There seems to be two options here

1. migrate all the data up front, using a custom GUI/CLI tool to dynamically build a database from old JSON files. kind of a pain but not really hard to do. more work upfront, but overall a cleaner solution.
2. try to patch over pieces of JSON on the fly somehow, maybe adding an 'Import' style button/feature to ScaleWiz. this would require less work up front but would certainly be tacky and add bloat.

The data as it is currently stored doesn't use a specific JSON schema. At the Python/tkinter layer we just hope the right fields are there, using the `dict`'s `get` method to try to find the right key while offering fallbacks. eg.
```py
data = json.load('my_data.json')
self.property.set(data.get('my_key', data.get('myKey'))
```

This is somewhat useful for kind-of correcting old JSON files on the fly, but also cumbersome. Especially as each undocumented change to the non-existent schema has to be added in as extra Python logic. 


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

implementing a database interface #24

the issue

possible solutions

the work

structure

hooking up

porting

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

implementing a database interface #24

Description

the issue

possible solutions

the work

structure

hooking up

porting

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions