README.md: 5 additions & 5 deletions
@@ -1,6 +1,6 @@
 # Data Validation Engine

-The Data Validation Engine (DVE) is a configuration driven data validation library built and utilised by NHS England.
+The Data Validation Engine (DVE) is a configuration driven data validation library built and utilised by NHS England. Currently the package has been reverted from v1.0.0 release to a 0.x as we feel the package is not yet mature enough to be considered a 1.0.0 release. So please bear this in mind if reading through the commits and references to a v1+ release when on v0.x.

 As mentioned above, the DVE is "configuration driven" which means the majority of development for you as a user will be building a JSON document to describe how the data will be validated. The JSON document is known as a `dischema` file and example files can be accessed [here](./tests/testdata/). If you'd like to learn more about JSON document and how to build one from scratch, then please read the documentation [here](./docs/).

@@ -9,7 +9,7 @@ Once a dischema file has been defined, you are ready to use the DVE. The DVE is
 || Service | Purpose |
 | -- | ------- | ------- |
 | 1. | File Transformation | This service will take submitted files and turn them into stringified parquet file(s) to ensure that a consistent data structure can be passed through the other services. |
-| 2. | Data Contract | This service will validate and peform type casting against a stringified parquet file using [pydantic models](https://docs.pydantic.dev/1.10/). |
+| 2. | Data Contract | This service will validate and perform type casting against a stringified parquet file using [pydantic models](https://docs.pydantic.dev/1.10/). |
 | 3. | Business Rules | The business rules service will perform more complex validations such as comparisons between fields and tables, aggregations, filters etc to generate new entities. |
 | 4. | Error Reports | The error reports service will take all the errors raised in previous services and surface them into a readable format for a downstream users/service. Currently, this implemented to be an excel spreadsheet but could be reconfigured to meet other requirements/use cases. |

@@ -21,7 +21,7 @@ Additionally, if you'd like to contribute a new backend implementation into the

 ## Installation and usage

-The DVE is a Python package and can be installed using `pip`. As of release v0.1.0 we currently only supports Python 3.7, with Spark version 3.2.1 and DuckDB version of 1.1.0. We are currently working on upgrading the DVE to work on Python 3.11+ and this will be made available asap with version 1.0.0 release.
+The DVE is a Python package and can be installed using `pip`. As of release v0.1.x we currently only supports Python 3.7, with Spark version 3.2.1 and DuckDB version of 1.1.0. We are currently working on upgrading the DVE to work on Python 3.10-3.11 and this will be made available with version v0.2.x release.

 In addition to a working Python 3.7+ installation you will need OpenJDK 11 installed if you're planning to use the Spark backend implementation.

@@ -33,7 +33,7 @@ To install the DVE package you can simply install using a package manager such a
-Once you have installed the DVE you are ready to use it. For guidance on how to create your dischema json document (configuration), please read the [documentation](./docs/).
+Once you have installed the DVE you are ready to use it. For guidance on how to create your dischema JSON document (configuration), please read the [documentation](./docs/).

 Please note - The long term aim is to make the DVE available via PyPi and Conda but we are not quite there yet. Once available this documentation will be updated to contain the new installation options.

@@ -49,7 +49,7 @@ Below is a list of features that we would like to implement or have been request
 | Feature | Release Version | Released? |
 | ------- | --------------- | --------- |
 | Open source release | 0.1.0 | Yes |
-| Uplift to Python 3.11 |1.0.0 |No|
+| Uplift to Python 3.11 |0.2.0 |Yes|
 | Upgrade to Pydantic 2.0 | Not yet confirmed | No |
 | Create a more user friendly interface for building and modifying dischema files | Not yet confirmed | No |
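
For readers of this diff who are unfamiliar with the "Data Contract" step described in the services table, the idea of validating and casting stringified values can be sketched as follows. This is a standard-library illustration only and is not the DVE's implementation, which the README says uses pydantic models. The `CONTRACT` mapping, the field names and the `apply_contract` helper are all hypothetical.

```python
from datetime import date

# Hypothetical contract: field name -> callable that casts a string to the target type.
CONTRACT = {
    "patient_id": int,
    "date_of_birth": date.fromisoformat,
}


def apply_contract(row):
    """Cast each stringified field in `row`; return (typed_row, errors)."""
    typed, errors = {}, []
    for field, cast in CONTRACT.items():
        try:
            typed[field] = cast(row[field])
        except (KeyError, ValueError) as exc:
            # Collect the failure instead of stopping, so that every
            # problem in the row can be reported together.
            errors.append(f"{field}: {exc!r}")
    return typed, errors


# Every value arrives as a string, as it would from a stringified file.
good, good_errors = apply_contract({"patient_id": "42", "date_of_birth": "2024-01-31"})
bad, bad_errors = apply_contract({"patient_id": "abc", "date_of_birth": "2024-01-31"})
```

Collecting errors rather than raising on the first failure mirrors the README's description of a later service that surfaces all errors raised by earlier services.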
docs/detailed_guidance/domain_types.md: 14 additions & 14 deletions
@@ -4,24 +4,24 @@ Domain types are custom defined pydantic types that solve common problems with u
 This might include Postcodes, NHS Numbers, dates with specific formats etc.

 Below is a list of defined types, their output type and any contraints. Nested beneath them are any constraints that area allowed and their default values if there are any.
-| Defined Type | Output Type | Contraints & Defaults |
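
The domain types guidance above names NHS Numbers as one kind of value a domain type might handle. As a rough illustration of the check such a type could perform, here is a minimal sketch of the standard Modulus 11 check-digit test for a 10-digit NHS Number. The function name is hypothetical and this is not taken from the DVE's code; how the DVE's own type is implemented is not shown in this diff.

```python
def is_valid_nhs_number(value: str) -> bool:
    """Return True if `value` passes the Modulus 11 check for NHS Numbers."""
    digits = value.replace(" ", "")  # tolerate the common "3 3 4" spaced layout
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        return False
    # Weight the first nine digits by 10, 9, ..., 2 and sum the products.
    total = sum(int(d) * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check digit of 10 means the number is never valid.
    return check != 10 and check == int(digits[9])
```

In a pydantic-based domain type, a function like this would typically be called from the type's validator so that an invalid value raises a validation error instead of returning `False`.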