- This event has passed.
Introduction to Machine Learning and Data Science
This event is part of the Specialist Training Series on Machine Learning and Data Science delivered by the South & East Network for Social Sciences Doctoral Training Partnership (SENSS).
SENSS Specialist Training: Introduction to Machine Learning and Data Science
Instructor: Dr Elisabetta Pellini (Bayes Business School)
Term: Autumn 2025
Module Outline and Aims
In today’s global and digital economic environment, we have unprecedented access to vast amounts of data across all types of industries. It has become crucial for researchers and professionals to possess analytical skills that enable data-driven decision-making. Data science is the practice of collecting, analysing, and interpreting data so that decision-makers can make informed choices.
This module provides both the theoretical foundations and the practical skills necessary to apply machine learning and data science methods to problems in economics and finance. Theoretical concepts will be introduced in an intuitive and accessible manner. Emphasis will be placed on selecting appropriate methods for specific problems, implementing methods correctly, and presenting and interpreting analytical results.
Lectures will include practical, computer-based exercises using real-world datasets. Students will learn to use Python to carry out these tasks.
Prerequisites
Students are expected to have knowledge of statistics (descriptive and inferential) and basic Python (e.g., the NumPy, pandas, and statsmodels libraries).
Software
Participants will use Jupyter Notebooks during the sessions for hands-on modelling demonstrations. The easiest way to access Jupyter is by installing the Anaconda Distribution, which includes Python, Jupyter, and most of the required libraries. Anaconda can be downloaded here: https://www.anaconda.com/products/distribution
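Once Anaconda is installed, a quick way to confirm the environment is ready is to check that the libraries named in this outline can be imported. The snippet below is a minimal sketch, not an official setup step: the `check_libraries` helper and the `REQUIRED` list are hypothetical names introduced here for illustration.

```python
import importlib

# Libraries named in this module description. Treating exactly these four as
# "required" is an assumption made for this sketch.
REQUIRED = ["numpy", "pandas", "statsmodels", "sklearn"]


def check_libraries(names):
    """Return a dict mapping each library name to its version string,
    or to None if the library cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Not every module exposes __version__, so fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in check_libraries(REQUIRED).items():
        print(f"{name}: {version if version else 'NOT INSTALLED'}")
```

Running this in a Jupyter cell or from a terminal lists each library with its version, or flags it as not installed.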
Content Outline
- Linear Regression
  - Model assumptions
  - Goodness of fit
  - Prediction
  - Diagnostic analysis
- Linear Model Selection
  - Bias-variance trade-off
  - Cross-validation
  - Subset selection
  - Ridge Regression
  - LASSO
- Dimensionality Reduction
  - Principal Component Analysis (PCA)
  - PCA in regression contexts
- Applications in Python
  - Hands-on implementation using Python libraries such as pandas, scikit-learn, and statsmodels
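To give a flavour of the topics listed above, the following is a minimal sketch of how linear regression, Ridge, LASSO, and a PCA-based regression might be compared by cross-validation in scikit-learn. It is not taken from the course materials: the synthetic dataset, the penalty strengths (`alpha=1.0`), the number of principal components, and the five-fold split are all assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption; the sessions use real-world datasets).
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=10.0, random_state=0
)

# Candidate models. Predictors are standardised inside each pipeline so that
# scaling is re-estimated within every training fold.
models = {
    "ols": make_pipeline(StandardScaler(), LinearRegression()),
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=1.0)),
    # Principal components regression: PCA followed by least squares.
    "pcr": make_pipeline(StandardScaler(), PCA(n_components=5), LinearRegression()),
}

# Five-fold cross-validation with a fixed seed for reproducibility.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

cv_mse = {}
for name, model in models.items():
    # scikit-learn reports negated MSE so that larger is better; flip the sign.
    scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
    cv_mse[name] = -scores.mean()

for name, mse in cv_mse.items():
    print(f"{name}: cross-validated MSE = {mse:.1f}")
```

In practice the penalty strength and the number of components would themselves be tuned by cross-validation rather than fixed in advance.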
Learning Objectives
By the end of this module, you will be able to:
- Explain the fundamental principles of linear regression, including model assumptions, coefficient estimation, and diagnostic analysis.
- Compare and apply model selection techniques, including subset selection, Ridge Regression, and LASSO.
- Understand and implement dimensionality reduction techniques, particularly Principal Component Analysis, and assess their impact on regression models.
- Critically assess the suitability of different machine learning methods for solving specific problems and justify methodological choices.
- Use Python programming tools and libraries (e.g., pandas, scikit-learn, statsmodels) to perform data analysis and apply machine learning methods to real-world datasets.
Main References
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2023). An Introduction to Statistical Learning with Applications in Python. Springer.
- McKinney, W. (2018). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython (2nd ed.). O’Reilly Media.
Please ensure you meet the prerequisites before registering.
Sign up form: https://essex.eu.qualtrics.com/jfe/form/SV_2n2VMaHX4Ahh7wy
Please direct enquiries to: trainingmanager@senss-dtp.ac.uk


