Commit cd7542a

committed
breath ...
1 parent 618b83a commit cd7542a

57 files changed, 135273 additions and 824 deletions

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -4,6 +4,7 @@
 /dist/

 # Cache
+.cache/
 .coverage*
 .mypy_cache/
 .pytest_cache/
@@ -21,7 +22,6 @@ poetry.lock

 # Project
 /docs/*
-/outputs/*
 !**/.gitkeep

 # Python

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 # See https://pre-commit.com/hooks.html for more hooks

 default_language_version:
-  python: python3.11
+  python: python3.12
 repos:
 # commons
 - repo: https://github.com/pre-commit/pre-commit-hooks

.python-version

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-3.11
+3.12

Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -1,11 +1,11 @@
 # https://docs.docker.com/engine/reference/builder/

 # Define
-FROM python:3.11
+FROM python:3.12

 # Install
 COPY dist/*.whl .
 RUN pip install --no-cache-dir *.whl

 # Execute
-CMD ["wines", "--help"]
+CMD ["bikes", "--help"]

README.md

Lines changed: 2 additions & 2 deletions
@@ -94,7 +94,7 @@ This section details the requirements, actions, and next steps to kickstart your

 ## Prerequisites

-- [Python>=3.11](https://www.python.org/downloads/) (to benefit from [the latest features and performance improvements](https://docs.python.org/3/whatsnew/3.11.html))
+- [Python>=3.12](https://www.python.org/downloads/) (to benefit from [the latest features and performance improvements](https://docs.python.org/3/whatsnew/3.12.html))
 - [Poetry>=1.5.1](https://python-poetry.org/) (to initialize the project [virtual environment](https://docs.python.org/3/library/venv.html) and its dependencies)

 ## Installation
@@ -813,7 +813,7 @@ class InputsSchema(Schema):
     proanthocyanins: papd.Series[float] = pa.Field(gt=0, lt=10)
     color_intensity: papd.Series[float] = pa.Field(gt=0, lt=100)
     hue: papd.Series[float] = pa.Field(gt=0, lt=10)
-    od280_od315_of_diluted_wines: papd.Series[float] = pa.Field(gt=0, lt=10)
+    od280_od315_of_diluted_bikes: papd.Series[float] = pa.Field(gt=0, lt=10)
     proline: papd.Series[float] = pa.Field(gt=0, lt=10000)
 ```


confs/inference.yaml

Lines changed: 7 additions & 5 deletions
@@ -1,9 +1,11 @@
 job:
   KIND: InferenceJob
   inputs:
-    KIND: ParquetDataset
+    KIND: ParquetReader
     path: data/inputs.parquet
-  output:
-    KIND: ParquetDataset
-    path: outputs/output.parquet
-  model_path: outputs/model.joblib
+  outputs:
+    KIND: ParquetWriter
+    path: outputs/outputs.parquet
+  loader:
+    KIND: JoblibLoader
+    model_path: outputs/model.joblib

confs/training.yaml

Lines changed: 4 additions & 5 deletions
@@ -1,9 +1,8 @@
 job:
   KIND: TrainingJob
   inputs:
-    KIND: ParquetDataset
+    KIND: ParquetReader
     path: data/inputs.parquet
-  target:
-    KIND: ParquetDataset
-    path: data/target.parquet
-  output_model: outputs/model.joblib
+  saver:
+    KIND: JoblibSaver
+    path: outputs/model.joblib

confs/tuning.yaml

Lines changed: 4 additions & 5 deletions
@@ -1,9 +1,8 @@
 job:
   KIND: TuningJob
   inputs:
-    KIND: ParquetDataset
+    KIND: ParquetReader
     path: data/inputs.parquet
-  target:
-    KIND: ParquetDataset
-    path: data/target.parquet
-  output_results: outputs/results.csv
+  outputs:
+    KIND: CSVWriter
+    path: outputs/results.csv
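
The three confs above replace the generic `ParquetDataset` entries with role-specific components (`ParquetReader`, `ParquetWriter`, `CSVWriter`, `JoblibSaver`, `JoblibLoader`), each selected by a `KIND` key. The Python sources of this commit are not shown in this view, so the following is only a minimal sketch of how a `KIND` key could be dispatched to a class. The dataclasses, `REGISTRY`, and `build` are hypothetical names, and the sketch assumes `model_path` belongs to the `loader` section.

```python
from dataclasses import dataclass
from typing import Any


# Hypothetical stand-ins for the components named in the confs.
@dataclass(frozen=True)
class ParquetReader:
    path: str


@dataclass(frozen=True)
class ParquetWriter:
    path: str


@dataclass(frozen=True)
class CSVWriter:
    path: str


@dataclass(frozen=True)
class JoblibSaver:
    path: str


@dataclass(frozen=True)
class JoblibLoader:
    model_path: str


# Map each KIND value to the class it selects (assumption: KIND equals the class name).
REGISTRY = {
    cls.__name__: cls
    for cls in (ParquetReader, ParquetWriter, CSVWriter, JoblibSaver, JoblibLoader)
}


def build(section: dict[str, Any]) -> Any:
    """Instantiate the component named by the section's KIND key."""
    params = dict(section)  # copy so the caller's mapping is not mutated
    kind = params.pop("KIND")
    if kind not in REGISTRY:
        raise ValueError(f"unknown KIND: {kind!r}")
    return REGISTRY[kind](**params)


# Mirrors the new confs/inference.yaml, already parsed into a dict.
inference_job = {
    "inputs": {"KIND": "ParquetReader", "path": "data/inputs.parquet"},
    "outputs": {"KIND": "ParquetWriter", "path": "outputs/outputs.parquet"},
    "loader": {"KIND": "JoblibLoader", "model_path": "outputs/model.joblib"},
}
components = {name: build(section) for name, section in inference_job.items()}
```

With this layout, a stale value such as `ParquetDataset` fails fast with a `ValueError` instead of being silently accepted.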

data/Readme.txt

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
+==========================================
+Bike Sharing Dataset
+==========================================
+
+Hadi Fanaee-T
+
+Laboratory of Artificial Intelligence and Decision Support (LIAAD), University of Porto
+INESC Porto, Campus da FEUP
+Rua Dr. Roberto Frias, 378
+4200 - 465 Porto, Portugal
+
+https://archive.ics.uci.edu/dataset/275/bike+sharing+dataset
+
+=========================================
+Background
+=========================================
+
+Bike sharing systems are new generation of traditional bike rentals where whole process from membership, rental and return
+back has become automatic. Through these systems, user is able to easily rent a bike from a particular position and return
+back at another position. Currently, there are about over 500 bike-sharing programs around the world which is composed of
+over 500 thousands bicycles. Today, there exists great interest in these systems due to their important role in traffic,
+environmental and health issues.
+
+Apart from interesting real world applications of bike sharing systems, the characteristics of data being generated by
+these systems make them attractive for the research. Opposed to other transport services such as bus or subway, the duration
+of travel, departure and arrival position is explicitly recorded in these systems. This feature turns bike sharing system into
+a virtual sensor network that can be used for sensing mobility in the city. Hence, it is expected that most of important
+events in the city could be detected via monitoring these data.
+
+=========================================
+Data Set
+=========================================
+Bike-sharing rental process is highly correlated to the environmental and seasonal settings. For instance, weather conditions,
+precipitation, day of week, season, hour of the day, etc. can affect the rental behaviors. The core data set is related to
+the two-year historical log corresponding to years 2011 and 2012 from Capital Bikeshare system, Washington D.C., USA which is
+publicly available in http://capitalbikeshare.com/system-data. We aggregated the data on two hourly and daily basis and then
+extracted and added the corresponding weather and seasonal information. Weather information are extracted from http://www.freemeteo.com.
+
+=========================================
+Associated tasks
+=========================================
+
+- Regression:
+Predication of bike rental count hourly or daily based on the environmental and seasonal settings.
+
+- Event and Anomaly Detection:
+Count of rented bikes are also correlated to some events in the town which easily are traceable via search engines.
+For instance, query like "2012-10-30 washington d.c." in Google returns related results to Hurricane Sandy. Some of the important events are
+identified in [1]. Therefore the data can be used for validation of anomaly or event detection algorithms as well.
+
+
+=========================================
+Files
+=========================================
+
+- Readme.txt
+- hour.csv : bike sharing counts aggregated on hourly basis. Records: 17379 hours
+- day.csv - bike sharing counts aggregated on daily basis. Records: 731 days
+
+
+=========================================
+Dataset characteristics
+=========================================
+Both hour.csv and day.csv have the following fields, except hr which is not available in day.csv
+
+- instant: record index
+- dteday : date
+- season : season (1:springer, 2:summer, 3:fall, 4:winter)
+- yr : year (0: 2011, 1:2012)
+- mnth : month ( 1 to 12)
+- hr : hour (0 to 23)
+- holiday : weather day is holiday or not (extracted from http://dchr.dc.gov/page/holiday-schedule)
+- weekday : day of the week
+- workingday : if day is neither weekend nor holiday is 1, otherwise is 0.
++ weathersit :
+    - 1: Clear, Few clouds, Partly cloudy, Partly cloudy
+    - 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
+    - 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
+    - 4: Heavy Rain + Ice Pallets + Thunderstorm + Mist, Snow + Fog
+- temp : Normalized temperature in Celsius. The values are divided to 41 (max)
+- atemp: Normalized feeling temperature in Celsius. The values are divided to 50 (max)
+- hum: Normalized humidity. The values are divided to 100 (max)
+- windspeed: Normalized wind speed. The values are divided to 67 (max)
+- casual: count of casual users
+- registered: count of registered users
+- cnt: count of total rental bikes including both casual and registered
+
+=========================================
+License
+=========================================
+Use of this dataset in publications must be cited to the following publication:
+
+[1] Fanaee-T, Hadi, and Gama, Joao, "Event labeling combining ensemble detectors and background knowledge", Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg, doi:10.1007/s13748-013-0040-3.
+
+@article{
+year={2013},
+issn={2192-6352},
+journal={Progress in Artificial Intelligence},
+doi={10.1007/s13748-013-0040-3},
+title={Event labeling combining ensemble detectors and background knowledge},
+url={http://dx.doi.org/10.1007/s13748-013-0040-3},
+publisher={Springer Berlin Heidelberg},
+keywords={Event labeling; Event detection; Ensemble learning; Background knowledge},
+author={Fanaee-T, Hadi and Gama, Joao},
+pages={1-15}
+}
+
+=========================================
+Contact
+=========================================
+
+For further information about this dataset please contact Hadi Fanaee-T (hadi.fanaee@fe.up.pt)
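
The Readme describes `temp`, `atemp`, `hum` and `windspeed` as values divided by a fixed maximum (41, 50, 100 and 67), and `cnt` as the total of `casual` and `registered`. Taking that description at face value, a minimal sketch of undoing the scaling and checking the count relationship might look like this. The helper names and the sample record are made up for illustration; they are not part of the commit and not a row from `hour.csv` or `day.csv`.

```python
# Divisors as stated in data/Readme.txt ("The values are divided to N (max)").
SCALES = {"temp": 41.0, "atemp": 50.0, "hum": 100.0, "windspeed": 67.0}


def denormalize(record: dict[str, float]) -> dict[str, float]:
    """Return a copy of record with the normalized fields multiplied back by their divisor."""
    return {
        key: value * SCALES[key] if key in SCALES else value
        for key, value in record.items()
    }


def counts_are_consistent(record: dict[str, float]) -> bool:
    """Check that cnt equals casual + registered, as the Readme describes."""
    return record["cnt"] == record["casual"] + record["registered"]


# Made-up record for illustration only.
sample = {
    "temp": 0.5,
    "atemp": 0.5,
    "hum": 0.5,
    "windspeed": 0.5,
    "casual": 3,
    "registered": 13,
    "cnt": 16,
}
restored = denormalize(sample)
```

For the sample above, `restored["temp"]` is 20.5 (degrees Celsius) and `restored["hum"]` is 50.0, while the count columns pass through unchanged.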
