This Data Science laboratory course was designed to be independent of any particular data science tool. However, in my opinion, the tool that currently offers the best balance between flexibility and results is Python, together with the set of packages created specifically for this context.

The course is organized into several modules covering two main topics: mining multidimensional data (tabular format) and mining time series.

Data format / Lab / Topic / Procedures

Accessory Files
    data folder
    config.py
    dslabs.mplstyle
    ds_charts.py
    ts_functions.py

Multidimensional data (tabular format)
    Lab 0: Python for DS
        Loading data with pandas
        Basic charts with matplotlib.pyplot
    Lab 1: Data profiling
        Data dimensionality
        Data distribution
        Data granularity
        Data sparsity
    Lab 2: Data preparation
        Missing values imputation
        Scaling
        Dummification
        Data balancing
    Lab 3: Classification
        Training Strategies
        Naive Bayes
        KNN
        Decision Trees
        Random Forests
        Gradient Boosting
        Neural networks (MLP)
    Lab 4: Clustering
        Clustering
        Feature Extraction
    Lab 5: Pattern Mining

Time Series
    Lab 6
        Profiling
        Transformation
        Forecasting
        Motif Discovery
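
To give a rough idea of what the first labs on tabular data involve, the following is a minimal sketch that uses pandas only. It is not taken from the course materials: the toy dataset, the variable names, and the specific choices (mean imputation, min-max scaling) are assumptions made for illustration, and the labs themselves may use different data, helper modules, and techniques.

```python
import io

import pandas as pd

# Hypothetical toy dataset, inlined so the sketch does not depend on the course's data folder.
CSV_TEXT = """age,income,city,label
25,30000,Lisbon,yes
40,,Porto,no
,52000,Lisbon,no
31,41000,Faro,yes
"""

# Lab 0: loading data with pandas.
data = pd.read_csv(io.StringIO(CSV_TEXT))

# Lab 1: data dimensionality (records x variables) and missing values per variable.
n_records, n_variables = data.shape
missing_counts = data.isna().sum()

# Lab 2: missing values imputation, here (as an assumption) with the mean of each numeric variable.
numeric_vars = ["age", "income"]
prepared = data.copy()
prepared[numeric_vars] = prepared[numeric_vars].fillna(prepared[numeric_vars].mean())

# Lab 2: scaling, here (as an assumption) min-max normalization to the [0, 1] range.
mins = prepared[numeric_vars].min()
maxs = prepared[numeric_vars].max()
prepared[numeric_vars] = (prepared[numeric_vars] - mins) / (maxs - mins)

# Lab 2: dummification (one-hot encoding) of the symbolic variable.
prepared = pd.get_dummies(prepared, columns=["city"])

print(n_records, n_variables)
print(missing_counts.to_dict())
print(prepared.columns.tolist())
```

The class variable `label` is deliberately left untouched, since preparation steps such as scaling and dummification are normally applied to the input variables only.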