This repository was archived by the owner on Aug 25, 2024. It is now read-only.

Commit 5cce8a9
author
John Andersen
committed
docs: Reargange
Signed-off-by: John Andersen <john.s.andersen@intel.com>
1 parent 1c94599 commit 5cce8a9

16 files changed

+319
-324
lines changed

README.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
1+
# Data Flow Facilitator for Machine Learning (dffml)
2+
3+
[![Build Status](https://travis-ci.org/intel/dffml.svg?branch=master)](https://travis-ci.org/intel/dffml) [![CII](https://bestpractices.coreinfrastructure.org/projects/2594/badge)](https://bestpractices.coreinfrastructure.org/projects/2594)
4+
5+
DFFML provides APIs for dataset generation and storage, and model definition
6+
using any machine learning framework. Both high-level and low-level use are
7+
supported.
8+
9+
The goal of DFFML is to build a community-driven library of plugins for dataset
10+
generation and model definition, so that we as developers and researchers can
11+
quickly and easily plug and play various pieces of data with various model
12+
implementations.
13+
14+
DFFML allows users to take advantage of Python's `asyncio` library in order to
15+
build applications which interact in deterministic ways with external data
16+
sources and sinks (think pub/sub, websockets, gRPC streams). Writing with
17+
`asyncio` in the loop (huh-huh) makes testing MUCH MUCH EASIER. `asyncio` usage
18+
also means that when generating datasets, we can do everything concurrently (or
19+
in parallel if you want to use an executor), making generating a dataset from
20+
scratch very fast and, best of all, providing clean error handling if things go wrong.
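
The difference between concurrent and executor-based work described above can be sketched with the standard library alone. This is a minimal illustration, not DFFML's own API: `fetch_record`, `checksum`, and `build_dataset` are hypothetical names, and the simulated latency stands in for a real data source.

```python
import asyncio
import hashlib
from concurrent.futures import ThreadPoolExecutor


async def fetch_record(key: str) -> dict:
    """Hypothetical stand-in for an I/O-bound lookup (HTTP request, query, etc.)."""
    await asyncio.sleep(0.01)  # simulated network latency
    return {"key": key, "length": len(key)}


def checksum(key: str) -> str:
    """Hypothetical stand-in for blocking work that is handed to an executor."""
    return hashlib.sha256(key.encode()).hexdigest()


async def build_dataset(keys):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor() as pool:
        # Concurrent: all coroutines are in flight on the event loop at once.
        records = await asyncio.gather(*(fetch_record(k) for k in keys))
        # Executor: blocking calls run in worker threads, off the event loop.
        digests = await asyncio.gather(
            *(loop.run_in_executor(pool, checksum, k) for k in keys)
        )
    # asyncio.gather returns results in the order the awaitables were passed.
    for record, digest in zip(records, digests):
        record["sha256"] = digest
    return records


dataset = asyncio.run(build_dataset(["repo-a", "repo-b", "repo-c"]))
```

An exception raised inside any of the coroutines propagates out of the `await asyncio.gather(...)` call, so one `try`/`except` around it can cover the whole batch. For CPU-bound work, a `ProcessPoolExecutor` could be swapped in for the thread pool.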
21+
22+
Here's a quick demo showing how DFFML can be used to train on the iris dataset.
23+
The more we build up the library of plugins (which anyone can maintain, they
24+
don't have to be contributed upstream unless you want to) the more variations on
25+
model implementations and feature data generators we all have to work with.
26+
27+
![Demo](https://github.com/intel/dffml/raw/master/docs/images/iris_demo.gif)
28+
29+
Right now we've released a wrapper around the Tensorflow DNN estimator, and a
30+
set of feature generators which gather data from git repositories.
31+
32+
## Installation
33+
34+
DFFML currently should work with Python 3.6. However, only Python 3.7 is
35+
officially supported. This is because there are a lot of nice helper methods
36+
Python 3.7 implemented that we intend to use instead of re-implementing.
37+
38+
```console
39+
python3.7 -m pip install -U dffml
40+
```
41+
42+
You can also install the Features for Git Version Control, and Models for
43+
Tensorflow Library all at once.
44+
45+
- [DFFML Features for Git Version Control](feature/git/README.md)
46+
- [DFFML Models for Tensorflow Library](model/tensorflow/README.md)
47+
48+
If you want a quick how-to on the iris dataset, head to the
49+
[DFFML Models for Tensorflow Library](model/tensorflow/README.md) repo.
50+
51+
```console
52+
python3.7 -m pip install -U "dffml[git,tensorflow]"
53+
```
54+
55+
## Usage
56+
57+
See [DFFML Models for Tensorflow Library](model/tensorflow/README.md) repo
58+
until documentation here is updated with a generic example.
59+
60+
## Documentation
61+
62+
Start with [Architecture](docs/ARCHITECHTURE.md).
63+
64+
## License
65+
66+
dffml is distributed under the [MIT License](LICENSE).
67+
68+
## Legal
69+
70+
> This software is subject to the U.S. Export Administration Regulations and
71+
> other U.S. law, and may not be exported or re-exported to certain countries
72+
> (Cuba, Iran, Crimea Region of Ukraine, North Korea, Sudan, and Syria) or to
73+
> persons or entities prohibited from receiving U.S. exports (including
74+
> Denied Parties, Specially Designated Nationals, and entities on the Bureau
75+
> of Export Administration Entity List or involved with missile technology or
76+
> nuclear, chemical or biological weapons).

README.rst

Lines changed: 0 additions & 260 deletions
This file was deleted.

docs/ABOUT.md

Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
1+
# Is DFFML Right For Me?
2+
3+
If you answer yes to any of these questions, DFFML can make your life easier.
4+
5+
- Dataset Generation
6+
7+
- Need to generate a dataset
8+
- Need to run asynchronous operations in order to gather a dataset (http
9+
requests, interaction with command line utilities, etc.)
10+
11+
- Models
12+
13+
- Want to quickly prototype how machine learning could be used on a dataset
14+
without writing a model
15+
- Need to write a finely tuned model by interacting with low level APIs of
16+
popular machine learning frameworks.
17+
18+
- Storage
19+
20+
- Need a way to use datasets which could be stored in different locations or
21+
formats.
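
As an illustration of the "interaction with command line utilities" case in the list above, the following sketch runs several commands concurrently with `asyncio` and turns their output into feature values. It is an assumption-laden example rather than DFFML code: `run_command` and `gather_features` are hypothetical helpers, and the current Python interpreter stands in for a real utility.

```python
import asyncio
import sys


async def run_command(*argv: str) -> str:
    """Run a command without blocking the event loop and return its stdout."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(
            f"{argv[0]} exited with {proc.returncode}: {stderr.decode()}"
        )
    return stdout.decode().strip()


async def gather_features(names):
    # Hypothetical "utility": the current interpreter printing a computed value.
    commands = [
        (sys.executable, "-c", f"print(len({name!r}))") for name in names
    ]
    # All subprocesses are started and awaited concurrently.
    outputs = await asyncio.gather(*(run_command(*cmd) for cmd in commands))
    return {name: int(out) for name, out in zip(names, outputs)}


features = asyncio.run(gather_features(["alpha", "be"]))
```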
22+
23+
# About
24+
25+
DFFML facilitates data generation, model creation, and use of models via
26+
services. See [Architecture](ARCHITECTURE.md) to learn how it works.
27+
28+
- Facilitates data collection, model creation, and use of models via services.
29+
- Provides plumbing to facilitate the collection of feature data to create
30+
datasets.
31+
- Allows developers to define their ML models via a standardized API.
32+
33+
- This lets users try different libraries / models to compare performance.
34+
35+
- Plugin based
36+
37+
- Features which gather feature data (Number of Git Authors, etc.)
38+
- Models which expose ML models via the standard API (Tensorflow, Scikit,
39+
etc.)
40+
- Sources which load and store feature data (CSV, JSON, MySQL, etc.)
41+
42+
The plumbing DFFML provides enables users to swap out models and features,
43+
in order to quickly prototype.
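
To make the idea of swapping plugins concrete, here is a minimal sketch of what a standardized source and model interface could look like. Every name in it (`Source`, `Model`, `MemorySource`, `MeanModel`, `MaxModel`, `evaluate`) is hypothetical and chosen for illustration; it does not reproduce DFFML's actual classes.

```python
import abc
from typing import Dict, List

Record = Dict[str, float]


class Source(abc.ABC):
    """Hypothetical interface: loads feature data from some storage."""

    @abc.abstractmethod
    def records(self) -> List[Record]:
        ...


class Model(abc.ABC):
    """Hypothetical interface: exposes an ML model through one standard API."""

    @abc.abstractmethod
    def train(self, records: List[Record], target: str) -> None:
        ...

    @abc.abstractmethod
    def predict(self, record: Record) -> float:
        ...


class MemorySource(Source):
    """Keeps records in memory; a CSV or JSON source would share the interface."""

    def __init__(self, rows: List[Record]) -> None:
        self._rows = rows

    def records(self) -> List[Record]:
        return list(self._rows)


class MeanModel(Model):
    """Predicts the mean of the target seen during training."""

    def train(self, records: List[Record], target: str) -> None:
        self._value = sum(r[target] for r in records) / len(records)

    def predict(self, record: Record) -> float:
        return self._value


class MaxModel(Model):
    """Predicts the maximum of the target seen during training."""

    def train(self, records: List[Record], target: str) -> None:
        self._value = max(r[target] for r in records)

    def predict(self, record: Record) -> float:
        return self._value


def evaluate(source: Source, model: Model, target: str) -> float:
    """Mean absolute error of the model on the source's own records."""
    rows = source.records()
    model.train(rows, target)
    return sum(abs(model.predict(r) - r[target]) for r in rows) / len(rows)


source = MemorySource(
    [{"x": 1.0, "y": 2.0}, {"x": 2.0, "y": 4.0}, {"x": 3.0, "y": 9.0}]
)
# Swapping models is a one-line change because both satisfy the same interface.
scores = {type(m).__name__: evaluate(source, m, "y") for m in (MeanModel(), MaxModel())}
```

Because `evaluate` depends only on the abstract interfaces, a different source or model can be compared without touching the evaluation code, which is the kind of quick prototyping the paragraph above describes.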
