Introduction to Machine Learning and Data Science

Name: Introduction to Machine Learning and Data Science
Start: 2025-10-24T09:30:00+01:00
End: 2025-10-24T12:30:00+01:00
Location: Zoom

October 24, 2025 @ 9:30 am – 12:30 pm

This event is part of the Specialist Training Series on Machine Learning and Data Science delivered by the South & East Network for Social Sciences Doctoral Training Partnership (SENSS).

SENSS Specialist Training: Introduction to Machine Learning and Data Science
Instructor: Dr Elisabetta Pellini (Bayes Business School)
Term: Autumn 2025

Module Outline and Aims

In today’s global and digital economic environment, we have unprecedented access to vast amounts of data across all types of industries. It has become crucial for researchers and professionals to possess analytical skills that enable data-driven decision-making. Data science is the practice of collecting, analysing, and interpreting data so that decision-makers can make informed choices.

This module provides both the theoretical foundations and the practical skills needed to apply machine learning and data science methods to problems in economics and finance. Theoretical concepts will be introduced in an intuitive and accessible manner. Emphasis will be placed on selecting appropriate methods for specific problems, implementing methods correctly, and presenting and interpreting analytical results.

Lectures will include practical, computer-based exercises using real-world datasets. Students will learn to use Python to carry out these tasks.

Prerequisites

Students are expected to have a working knowledge of statistics (descriptive and inferential) and basic Python, including libraries such as NumPy, pandas, and statsmodels.

Software

Participants will use Jupyter Notebooks during the sessions for hands-on modelling demonstrations. The easiest way to access Jupyter is by installing the Anaconda Distribution, which includes Python, Jupyter, and most of the required libraries. Anaconda can be downloaded here: https://www.anaconda.com/products/distribution
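Before the session, it may help to confirm that the libraries named in this outline are available in your environment. The snippet below is a minimal sketch, not an official setup check: the helper name check_libraries and the list of modules are assumptions for illustration, based on the libraries mentioned in this outline.

```python
import importlib

# Libraries named in the prerequisites and content outline.
# Note: scikit-learn is imported under the module name "sklearn".
REQUIRED = ["numpy", "pandas", "statsmodels", "sklearn"]


def check_libraries(names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Fall back to "unknown" for modules that do not expose __version__.
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in check_libraries(REQUIRED).items():
        status = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {status}")
```

Running this in a Jupyter cell or from a terminal lists each library with its installed version. Any library reported as NOT INSTALLED would need to be added before the session.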

Content Outline

Linear Regression
- Model assumptions
- Goodness of fit
- Prediction
- Diagnostic analysis
Linear Model Selection
- Bias-variance trade-off
- Cross-validation
- Subset selection
- Ridge Regression
- LASSO
Dimensionality Reduction
- Principal Component Analysis (PCA)
- PCA in regression contexts
Applications in Python
- Hands-on implementation using Python libraries such as pandas, scikit-learn, and statsmodels
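
To give a flavour of how the topics above fit together in scikit-learn, the following is a minimal sketch rather than course material. The data are synthetic, and the sample size, coefficients, penalty grid, and number of principal components are all assumptions chosen for illustration. It compares ordinary least squares, Ridge Regression, LASSO, and principal component regression using 5-fold cross-validation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LassoCV, LinearRegression, RidgeCV
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumed for illustration): 200 observations, 10 predictors,
# of which only the first three affect the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
beta = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta + rng.normal(scale=1.0, size=200)

# Candidate penalty strengths for Ridge (an arbitrary illustrative grid).
alphas = np.logspace(-3, 3, 50)

# Each pipeline standardises the predictors before fitting, since the
# penalised methods and PCA are sensitive to the scale of the inputs.
models = {
    "OLS": make_pipeline(StandardScaler(), LinearRegression()),
    "Ridge": make_pipeline(StandardScaler(), RidgeCV(alphas=alphas)),
    "LASSO": make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0)),
    "PCR (5 components)": make_pipeline(
        StandardScaler(), PCA(n_components=5), LinearRegression()
    ),
}

# Estimate out-of-sample mean squared error with 5-fold cross-validation.
cv_mse = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    cv_mse[name] = -scores.mean()
    print(f"{name}: CV MSE = {cv_mse[name]:.3f}")

# Refit LASSO on all the data and count the predictors it keeps.
lasso = models["LASSO"].fit(X, y)
n_nonzero = int(np.sum(lasso[-1].coef_ != 0))
print(f"LASSO retains {n_nonzero} of {X.shape[1]} predictors")
```

Wrapping each estimator in a pipeline means the scaling (and, for principal component regression, the PCA step) is re-estimated inside every cross-validation fold, which avoids leaking information from the held-out data into the fit.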

Learning Objectives

By the end of this module, you will be able to:

- Explain the fundamental principles of linear regression, including model assumptions, coefficient estimation, and diagnostic analysis.
- Compare and apply model selection techniques, including subset selection, Ridge Regression, and LASSO.
- Understand and implement dimensionality reduction techniques, particularly Principal Component Analysis, and assess their impact on regression models.
- Critically assess the suitability of different machine learning methods for solving specific problems and justify methodological choices.
- Use Python programming tools and libraries (e.g., pandas, scikit-learn, statsmodels) to perform data analysis and apply machine learning methods to real-world datasets.

Main References

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2023). An Introduction to Statistical Learning with Applications in Python. Springer.
McKinney, W. (2018). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython (2nd ed.). O’Reilly Media.

Please ensure you meet the prerequisites before registering.

Please direct enquiries to: trainingmanager@senss-dtp.ac.uk