Commit 0163077

Merge pull request #386 from jpventura/refactor/bike-share
docs(bike-share): documentation to markdown syntax
2 parents c29aca1 + 5588144 commit 0163077

2 files changed

+101
-111
lines changed

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
1+
# Bike Sharing Dataset
2+
3+
## Author
4+
5+
[Hadi Fanaee-T](mailto:hadi.fanaee@fe.up.pt)
6+
7+
[Laboratory of Artificial Intelligence and Decision Support (LIAAD)][liad-homepage-url]
8+
9+
[INESC Porto, Campus da FEUP][feup-address-url] at [University of Porto][uporto-url]
10+
11+
12+
## Background
13+
14+
Bike sharing systems are a new generation of traditional bike rentals in which the whole process, from membership to rental and return, has become automatic. Through these systems, a user can easily rent a bike at one location and return it at another.
15+
16+
Currently, there are over 500 bike-sharing programs around the world, comprising more than 500 thousand bicycles. There is great interest in these systems today because of their important role in traffic, environmental, and health issues. Apart from the interesting real-world applications of bike sharing systems, the characteristics of the data they generate make them attractive for research.
17+
18+
Unlike other transport services such as bus or subway, these systems explicitly record the duration of travel and the departure and arrival positions. This feature turns a bike sharing system into a virtual sensor network that can be used for sensing mobility in the city.
19+
20+
Hence, it is expected that most of the important events in the city could be detected by monitoring these data.
21+
22+
## Data Set
23+
24+
The bike-sharing rental process is highly correlated with environmental and seasonal settings. For instance, weather conditions, precipitation, day of the week, season, and hour of the day can all affect rental behavior.
25+
26+
The core data set is the two-year historical log for 2011 and 2012 from the Capital Bikeshare system in Washington, D.C., USA, which is publicly available from the [Capital Bikeshare system data][capitalbikeshare-url].
27+
28+
We aggregated the data at two levels, hourly and daily, and then extracted and added the corresponding weather and seasonal information. Weather information is extracted from [Freemeteo][freemeteo-url].
29+
30+
## Associated tasks
31+
32+
- **Regression**:
33+
34+
- Prediction of the hourly or daily bike rental count based on the environmental and seasonal settings.
35+
36+
- **Event and Anomaly Detection**:
37+
38+
- The count of rented bikes is also correlated with some events in the town, which are easily traceable via search engines.
39+
- For instance, a query such as "2012-10-30 washington d.c." in Google returns results related to Hurricane Sandy.
40+
- Some of the important events have already been identified <sup>[1](#License)</sup>.
41+
- Therefore, the data can also be used to validate anomaly or event detection algorithms.
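
As an illustration of the event and anomaly detection task, the sketch below flags days whose rental count is far from the median, using a median/MAD-based modified z-score. This is a generic baseline chosen for illustration, not the method of the cited publication; the function name `flag_anomalies`, the threshold of 3.5, and the daily counts are all assumptions made up for the example.

```python
from statistics import median


def flag_anomalies(counts, threshold=3.5):
    """Return the indices whose modified z-score exceeds `threshold`.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so a single extreme day does not distort the baseline.
    """
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        # All values (or most of them) are identical: nothing to flag.
        return []
    return [
        i
        for i, c in enumerate(counts)
        if abs(0.6745 * (c - med) / mad) > threshold
    ]


# Made-up daily totals; the fifth day mimics a storm-like collapse in rentals.
daily_counts = [4500, 4700, 4600, 4800, 22, 4650, 4550]
anomalous_days = flag_anomalies(daily_counts)
print(anomalous_days)
```

With real data, `counts` would be the `cnt` column of `day.csv`; a flagged index can then be mapped back to its `dteday` value and checked against known events.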
42+
43+
## Files
44+
45+
- [`README.md`](./README.md)
46+
- [`hour.csv`](./hour.csv): bike sharing counts aggregated on an hourly basis. Records: 17379 hours
47+
- [`day.csv`](day.csv): bike sharing counts aggregated on a daily basis. Records: 731 days
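
A minimal sketch of reading one of these files with Python's standard `csv` module follows. It assumes the CSV header row uses the field names listed in this document; the two sample rows are invented so the snippet runs without the real files, and the helper names `read_records` and `total_rentals` are hypothetical.

```python
import csv
import io

# Invented two-row sample in the layout assumed for hour.csv.
SAMPLE = """instant,dteday,season,yr,mnth,hr,holiday,weekday,workingday,weathersit,temp,atemp,hum,windspeed,casual,registered,cnt
1,2011-01-01,1,0,1,0,0,6,0,1,0.25,0.5,0.75,0.0,3,13,16
2,2011-01-01,1,0,1,1,0,6,0,1,0.25,0.5,0.75,0.0,8,32,40
"""


def read_records(handle):
    """Parse an open CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def total_rentals(records):
    """Sum the `cnt` column; csv yields strings, so convert to int first."""
    return sum(int(r["cnt"]) for r in records)


records = read_records(io.StringIO(SAMPLE))
print(len(records), total_rentals(records))
```

For the real file, the same helper can be called as `read_records(open("hour.csv", newline=""))`, ideally inside a `with` block.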
48+
49+
## Dataset characteristics
50+
51+
Both [`hour.csv`](./hour.csv) and [`day.csv`](day.csv) have the following fields, except `hr`, which is not available in [`day.csv`](day.csv).
52+
53+
- `instant`: record index
54+
- `dteday`: date
55+
- `season`: season (1: spring, 2: summer, 3: fall, 4: winter)
56+
- `yr`: year (0: 2011, 1: 2012)
57+
- `mnth`: month (1 to 12)
58+
- `hr`: hour (0 to 23)
59+
- `holiday`: whether the [day is a holiday or not][holiday-schedule-url]
60+
- `weekday`: day of the week
61+
- `workingday`: 1 if the day is neither a weekend nor a holiday, otherwise 0.
62+
- `weathersit`:
63+
1. Clear, Few clouds, Partly cloudy
64+
2. Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
65+
3. Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
66+
4. Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
67+
- `temp`: Normalized temperature in Celsius. The values are divided by 41 (max)
68+
- `atemp`: Normalized feels-like temperature in Celsius. The values are divided by 50 (max)
69+
- `hum`: Normalized humidity. The values are divided by 100 (max)
70+
- `windspeed`: Normalized wind speed. The values are divided by 67 (max)
71+
- `casual`: count of casual users
72+
- `registered`: count of registered users
73+
- `cnt`: total count of rental bikes, including both casual and registered users
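
Because the weather fields are stored as values divided by a maximum, recovering the original scale is a multiplication by that maximum. The sketch below applies the four divisors quoted in the field list and also checks that `cnt` equals `casual` plus `registered`. The helper names and the sample row are invented for illustration, and the row is assumed to be a dict of strings as produced by a CSV reader.

```python
# Divisors quoted in the field list: normalized value * max = original scale.
SCALE = {"temp": 41.0, "atemp": 50.0, "hum": 100.0, "windspeed": 67.0}


def denormalize(row):
    """Return a copy of `row` with the normalized weather fields rescaled."""
    out = dict(row)
    for field, max_value in SCALE.items():
        if field in out:
            out[field] = float(out[field]) * max_value
    return out


def counts_consistent(row):
    """Check that the total equals casual plus registered rentals."""
    return int(row["casual"]) + int(row["registered"]) == int(row["cnt"])


# Invented sample row (strings, as a CSV reader would return them).
sample = {
    "temp": "0.5", "atemp": "0.5", "hum": "0.5", "windspeed": "0.0",
    "casual": "3", "registered": "13", "cnt": "16",
}
restored = denormalize(sample)
print(restored["temp"], restored["atemp"], counts_consistent(sample))
```

Fields that are not in `SCALE` are passed through unchanged, so the same function can be applied to rows from either file.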
74+
75+
## [License](#1)
76+
77+
Publications that use this dataset must cite the following publication:
78+
79+
[1] Fanaee-T, Hadi, and Gama, Joao, "Event labeling combining ensemble detectors and background knowledge", _Progress in Artificial Intelligence_ (2013): pp. 1-15, Springer Berlin Heidelberg, doi:10.1007/s13748-013-0040-3.
80+
81+
82+
@article{
83+
year={2013},
84+
issn={2192-6352},
85+
journal={Progress in Artificial Intelligence},
86+
doi={10.1007/s13748-013-0040-3},
87+
title={Event labeling combining ensemble detectors and background knowledge},
88+
url={http://dx.doi.org/10.1007/s13748-013-0040-3},
89+
publisher={Springer Berlin Heidelberg},
90+
keywords={Event labeling; Event detection; Ensemble learning; Background knowledge},
91+
author={Fanaee-T, Hadi and Gama, Joao},
92+
pages={1-15}
93+
}
94+
95+
96+
[capitalbikeshare-url]: http://capitalbikeshare.com/system-data
97+
[feup-address-url]: https://goo.gl/maps/HykYYt8ifSCPPsQb8
98+
[freemeteo-url]: http://www.freemeteo.com
99+
[holiday-schedule-url]: http://dchr.dc.gov/page/holiday-schedule
100+
[liad-homepage-url]: https://www.liad.pt
101+
[uporto-url]: https://www.up.pt

project-bikesharing/Bike-Sharing-Dataset/Readme.txt

Lines changed: 0 additions & 111 deletions
This file was deleted.
