The Applied Social Data Science programme offers a wide range of modules that aim to introduce students to state-of-the-art quantitative methods and social science research skills. Lecture- and seminar-based modules are offered, and assessment is based on assignments and exams. The number of modules on offer, the topics covered, and whether there is any choice of module topics vary from year to year depending on student numbers. For 2023/24, students will also need to choose 3 out of 4 elective modules. Students are expected to bring their own laptop (Mac/Windows/Linux) for use in seminars and tutorials throughout the course; note that tablets are not suitable. For minimum laptop specs, please see this helpful guide.
- Computer Programming for Social Scientists (10 ECTS/CORE)
Students will become familiar with R and Python, two principal programming languages used in data science and research. This course covers basic and intermediate programming concepts, such as objects, types, functions, control flow, and debugging, in both procedural and object-oriented paradigms. Particular emphasis will be placed on data handling and analytical tasks, with a focus on problems in the social sciences. Homework will include hands-on coding exercises. In addition, students will apply their programming knowledge to a research project at the end of the module.
- Applied Statistical Analysis I (10 ECTS/CORE)
This is the first module in the quantitative methods sequence, which introduces the linear regression model as a fundamental tool in applied statistical analysis. Students will apply concepts of statistical analysis, specifically multi-variable analysis and model building, to a broad set of real-world data and problems from the social sciences. Students will cover the assumptions that underlie the linear regression model, including issues of estimation and inference, as well as methods used to diagnose and correct for violations of those assumptions. The expectation is that at the end of the semester students will be savvy readers of published research and tasteful users of linear models. Labs, problem sets, and exams will teach students to apply concepts from class while building programming skills in R, LaTeX and GitHub, which are standard tools in academia and industry.
- Research Design for the Social Sciences (10 ECTS/CORE)
A tool becomes useful when we know how to use it correctly. In order to effectively apply the data science methods that we will learn in the ASDS programme, we need to build a framework which links empirical observation to social scientific theory in a robust and logically consistent manner. This module provides students with such a framework: it introduces the key approaches to the scientific method in the social sciences, and invites students to reflect on which approach best suits their own interests. We then move on to the important skills of developing a research question and positioning it within an academic literature, and then to designing research that can successfully answer that question. On completion of the module, students should be equipped with the necessary skills to correctly apply methods to researching social science questions, in a manner consistent with the expectations of academic publication and industry best practice.
- Applied Statistical Analysis II (10 ECTS/CORE)
This module extends what students covered in the previous term by focusing on non-linear model forms for the outcome variable. These are typically called “generalized linear models” (GLMs), although for historical reasons people in the social sciences call them “maximum likelihood models”. The principle we will care about is how to adapt the standard linear model that you know so that a broader class of outcome variables can be accommodated. These include: counts, dichotomous outcomes, bounded variables, and more. There is a strong theoretical basis for the models that we will use. Also, the bulk of the learning in the course will take place outside of the classroom by reading, practicing using statistical software, replicating the work of others, and doing problem sets.
- Introduction to Machine Learning (5 ECTS/CORE)
Introduction to Machine Learning is designed to offer an introduction to the basics of ML, specifically with a hands-on curriculum aimed at developing knowledge and skills in establishing ML pipelines with state-of-the-art languages and toolkits. This module is designed for students with limited prior experience of programming. It will introduce the fundamentals of programming, with a focus on setting up an effective pipeline for processing datasets to execute common ML techniques such as Support Vector Machines and Linear Regression. Students will be assessed both on the acquired technical skills and on their ability to understand the ML pipeline and results and communicate effectively with experts and non-experts.
- Social Forecasting (5 ECTS/ELECTIVE)
This module will cover the fundamental techniques and approaches to forecasting. In particular, students will learn the main building blocks of a forecasting model, how to build their own forecasting models, and how to evaluate their performance. Techniques ranging from ARIMA models to neural networks will be applied to both time series and panel data. Students will be able to apply the approaches covered in the module to forecast political, economic, and/or social outcomes in a research report.
- Quantitative Text Analysis for Social Scientists (5 ECTS/ELECTIVE)
This module focuses on a range of computational tools—stemming from the fields of machine learning and natural language processing (NLP)—that are essential for large-scale analyses of text information. The aim is to provide students with a hands-on introduction to collecting, processing, and analyzing “text-as-data” for the purpose of answering important social science research questions. The module will also cover corpus acquisition methods as well as social media research applications. Students will apply these skills to produce a state-of-the-art research report based on a novel collection of text documents and meta-data.
- Experimental Methods for Social Scientists (5 ECTS/ELECTIVE)
This module will extend the content covered in Introduction to Machine Learning by focussing on the application of machine learning approaches to social science research questions. The core of this module will consist of discussing specific applications and papers drawn from a range of social science disciplines. Students will apply these methods in a research paper at the end of the module.
- Spatial Data Analysis (5 ECTS/ELECTIVE)
The use of spatial data has become increasingly popular in social sciences research. Micro-surveys now routinely collect GPS coordinates of households and communities, satellites provide real-time measurements of night-time luminosity, and geo-referenced historic maps are linked to outcomes both across long time spans and space. Spatial data serve in general two main purposes. First, they allow measuring outcomes that are otherwise hard to measure. Second, they aid identification by, for example, controlling for covariates, enabling the construction of instruments, or exploiting boundaries. In the first part of the course, we will discuss how recent papers are using geo-referenced data, focusing on the role of spatial data in answering research questions. The second part of the course will be hands on: we will cover basic spatial tools in ArcGIS, such as creating datasets on our own, merging spatial datasets, computing distances and the basics of map algebra.
- Dimensionality Reduction (5 ECTS/ELECTIVE)
Data are more ubiquitous than ever, but analysing wide datasets (i.e., those with many variables, or columns) requires special skills and methods. In this module, students will learn techniques to reduce the dimensionality of data, and in the process uncover the hidden (or latent) structures within it. Students will be introduced to traditional linear methods, including principal component analysis, multiple correspondence analysis and factor analysis, as well as more recent advances in non-linear approaches such as t-SNE and Isomap. Such techniques have a wide variety of applications, including improving model fit with highly collinear data in regression-based analysis, and identifying latent variables (or clusters) in exploratory research. To this latter end, particular emphasis will be placed on methods for visualising the outputs of DR techniques and communicating findings, using the principles of geometric data analysis (GDA), and the relational approach to social science developed by the French sociologist Pierre Bourdieu.
- Dissertation (MSc Students Only) (30 ECTS/CORE)
All students on the Dissertation module are required to complete a dissertation in order to qualify for the course. Each student will be supervised by a member of staff. Students will receive written feedback from their supervisors on their research proposals. Students must complete a research proposal and have it approved by the last Friday in March. The final dissertation is due by the first Friday in September.