Schedule of Actuarial Data Scientist Program: Fourth edition - Module 1 (16 CPD)

Schedule of Module 1: Foundations of Machine Learning in Actuarial Sciences

Day 1, Thursday, 7 November
16:00 - 18:00 Linear Models and conditional estimation By ANTONIO Katrien
Day 2, Thursday, 14 November
16:00 - 18:00 Programming : Foundations of actuarial learning and the organization of the training By VAN DAM Daniel, VAN ES Raymond
Day 3, Thursday, 21 November
16:00 - 18:00 Generalized Linear Models By ANTONIO Katrien
Day 4, Thursday, 28 November
16:00 - 18:00 Programming : LMs and GLMs
Day 5, Thursday, 5 December
16:00 - 18:00 Regularisation and links with support vector machines By ANTONIO Katrien
Day 6, Thursday, 12 December
16:00 - 18:00 Programming : Regularisation
Day 7, Monday, 16 December
16:00 - 18:00 Clustering methods By HAINAUT Donatien
Day 8, Thursday, 19 December
16:00 - 18:00 Programming : Clustering
Day 9, Friday, 31 January
18:00 - 23:59 Assignment after Module 1
  1. From 16:00 to 18:00

    Linear Models and conditional estimation

    By ANTONIO Katrien
    • Conditional mean estimation E[Y|X] and the iris problem.
    • Introduction to classification problems in machine learning: Linear Discriminant Analysis.
    • Introduction to regression problems: linear models and the OLS estimator (with mixed data types, e.g. a mix of continuous and discrete data).
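As a rough illustration of the OLS estimator with mixed data types, the sketch below dummy-codes one categorical feature and solves the least-squares problem with NumPy. The toy data and the column names (`area`, `zone`, `price`) are invented for illustration and are not part of the course material.

```python
import numpy as np
import pandas as pd

# Toy data (assumption): price = 10 + 2 * area + 30 * [zone == "urban"], no noise.
df = pd.DataFrame({
    "area": [50.0, 80.0, 120.0, 65.0, 95.0, 110.0],
    "zone": ["urban", "rural", "urban", "rural", "urban", "rural"],
    "price": [140.0, 170.0, 280.0, 140.0, 230.0, 230.0],
})

# Dummy-code the categorical feature; "rural" becomes the reference level.
X = pd.get_dummies(df[["area", "zone"]], drop_first=True, dtype=float)
X.insert(0, "const", 1.0)  # intercept column

# OLS estimate: the least-squares solution of X beta = y.
beta, *_ = np.linalg.lstsq(X.to_numpy(), df["price"].to_numpy(), rcond=None)
coefficients = dict(zip(X.columns, beta))
print(coefficients)
```

The coefficient on `zone_urban` is read as the shift in the conditional mean relative to the reference level, with the continuous feature held fixed.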
  2. From 16:00 to 18:00

    Programming : Foundations of actuarial learning and the organization of the training

    By VAN DAM Daniel, VAN ES Raymond
    • Introducing the trainers and the training environment, including a first introduction to git and the GitHub repos dedicated to the training, the notebooks, the available data sets, and the ways to execute the Python code.
    • Introducing the data sets that will be analyzed in the course: MTPL claim frequency and severity data, Ames Housing data set, caravan insurance data set (for a classification problem with class imbalance), data set with characteristics of vehicles (for clustering).
    • Getting to know these data sets: basic data explorations, some plotting, data manipulation and calculating summary statistics [Numpy, pandas and Matplotlib].
    • Target and feature engineering steps, including (among others) centering, scaling, dealing with NAs, handling class imbalance, and filtering out near-zero-variance features [scikit-learn: sklearn.preprocessing].
    • Data splitting and resampling methods: training vs validation vs test, k-fold cross validation [scikit-learn: sklearn.model_selection].
    • Introduction to parameter tuning, with a simple example such as K-nearest neighbours [scikit-learn: sklearn.model_selection].
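The splitting, scaling and tuning steps listed above can be sketched roughly as follows. This is a minimal example under stated assumptions: the data are synthetic (generated with `make_classification`, not one of the course data sets), and the grid of candidate values for `n_neighbors` is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a course data set (assumption).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Hold out a test set that is never used during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling sits inside the pipeline, so each CV fold is scaled
# using only its own training part.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# 5-fold cross-validation over a small, arbitrary grid of K values.
search = GridSearchCV(
    pipe,
    param_grid={"knn__n_neighbors": [1, 3, 5, 11, 21]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_k = search.best_params_["knn__n_neighbors"]
test_accuracy = search.score(X_test, y_test)  # refit best model, scored on the test set
print(best_k, round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline is a design choice that avoids leaking information from the validation folds into the preprocessing step.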
  3. From 16:00 to 18:00

    Generalized Linear Models

    By ANTONIO Katrien
    • GLMs (logistic, Poisson, gamma)
  4. From 16:00 to 18:00

    Programming : LMs and GLMs

    • Introducing the Python package statsmodels via [statsmodels.api] and the support for formulas via [statsmodels.formula.api].
    • Linear regression: describing the models, then fitting and summarizing linear regression models on the Ames Housing data (model fit and inspection, prediction, variable and model selection tools) [statsmodels: statsmodels.regression.linear_model].
    • Generalized linear regression models: fitting Poisson and gamma regression models on the MTPL data set: inspecting model fit, building predictions, evaluating model fit [statsmodels: statsmodels.genmod.generalized_linear_model].
    • We gradually build up the Poisson GLM: introducing exposure (offset), how to handle multiple types of variables (numeric, categorical).
    • Combine frequency and severity GLMs into a technical tariff. Construct technical prices for selected risk profiles.
  5. From 16:00 to 18:00

    Regularisation and links with support vector machines

    By ANTONIO Katrien

    Introduction of

    • the LASSO
    • Ridge
    • ElasticNet
    • Relation with Support Vector Machine
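For orientation, the three penalties and the link with support vector machines can be written in one common form. The notation below ($L$ for the loss, $\lambda \ge 0$ for the penalty strength, $\alpha \in [0,1]$ for the mixing weight) is an assumption borrowed from the usual glmnet-style parametrisation, not necessarily the one used in the lecture.

```latex
% Elastic net: penalised empirical loss
\hat{\beta} = \arg\min_{\beta}\;
  \frac{1}{n}\sum_{i=1}^{n} L\bigl(y_i,\, x_i^{\top}\beta\bigr)
  + \lambda\Bigl[\alpha\,\lVert\beta\rVert_1
  + \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\Bigr]
% alpha = 1: LASSO,  alpha = 0: ridge,  0 < alpha < 1: elastic net

% Linear SVM with labels y_i in {-1, +1}: hinge loss plus a ridge penalty
\min_{\beta_0,\,\beta}\;
  \frac{1}{n}\sum_{i=1}^{n}
  \max\bigl(0,\; 1 - y_i(\beta_0 + x_i^{\top}\beta)\bigr)
  + \frac{\lambda}{2}\,\lVert\beta\rVert_2^2
```

Read this way, a linear support vector machine is a ridge-penalised model in which the likelihood-based loss is replaced by the hinge loss.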
  6. From 16:00 to 18:00

    Programming : Regularisation

    • Fitting regularized (G)LMs: basic setup with [statsmodels or pyglmnet], handling different types of covariates in the regularized fit, automatic feature selection.
    • Working on a classification problem with class imbalance: the caravan insurance data set. Going from regression to classification problems: model formulation, model fit (with and without regularization), model evaluation tools (e.g. AUC).
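The comparison of a fit with and without regularization on an imbalanced classification problem might look roughly like this. The sketch uses scikit-learn rather than the packages named above, synthetic data in place of the caravan insurance data set, and an arbitrary penalty strength; the "unpenalised" fit is approximated with a very large `C`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic imbalanced data (assumption): roughly 6% positives, 30 features.
X, y = make_classification(
    n_samples=4000, n_features=30, n_informative=5,
    weights=[0.94, 0.06], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    # Very large C means an almost negligible penalty.
    "unpenalised": LogisticRegression(C=1e6, max_iter=1000),
    # L1 (LASSO) penalty; a smaller C means stronger shrinkage.
    "lasso": LogisticRegression(penalty="l1", C=0.05, solver="liblinear"),
}

auc = {}
n_selected = {}
for name, clf in models.items():
    pipe = make_pipeline(StandardScaler(), clf).fit(X_train, y_train)
    # AUC is computed from predicted probabilities, not from hard labels.
    auc[name] = roc_auc_score(y_test, pipe.predict_proba(X_test)[:, 1])
    # Number of features with a non-zero coefficient.
    n_selected[name] = int((pipe[-1].coef_ != 0).sum())

print(auc, n_selected)
```

AUC is used here instead of accuracy because, with so few positives, a model that always predicts the majority class would already score a high accuracy.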
  7. From 18:00 to 23:59

    Assignment after Module 1

    Important comment: The assignment is not a strict examination. Its purpose is to apply the concepts learned during the previous sessions.

    Deadline for handing in the assignment: 20 January 2023.

Prices

Ticket type Price
Members IA|BE € 960.00
Members ILAC € 960.00
Non-members € 1,200.00