MSc Dissertation · Strathmore University · 2026

Housing Financial Vulnerability Score

Modelling housing-based financial vulnerability and insurance risk across all 47 Kenyan counties using 2023/24 KNBS microdata. The approach mirrors core actuarial practice: inferring risk from observable household characteristics.

Households Surveyed: 21,347 (nationally representative)
Counties Covered: 47 (all Kenyan counties)
High Vulnerability: 40% (HFVS > national mean + 1σ)
XGBoost AUC: 0.989 (binary classification)
National Mean HFVS: 0.324 (σ = 0.031)
Urban Households: 44.3% (rural/urban split)

Methodology

Five risk dimensions.
One composite score.

The Housing Financial Vulnerability Score (HFVS) aggregates five independently measured risk dimensions from KHS microdata into a single household-level index. Weights are grounded in actuarial precedent and validated via PCA loading structure.

Financial Stress (30% weight): rent burden, savings capacity, expenditure barriers
Tenure Insecurity (25% weight): eviction risk, land ownership, legal protection
Physical Hazard (25% weight): flood, mudslide, hazard proximity (enumerator-observed)
Dwelling Quality (20% weight): wall, floor, roof construction materials
Utility Deprivation (20% weight): electricity, water, sanitation access

Composite formula

HFVS = 0.30·Fin_stress + 0.25·Tenure_insec + 0.25·Phys_hazard + 0.20·Dwelling_qual + 0.20·Utility_depriv
Nat. mean: 0.3236
Nat. median: 0.3246
Std deviation: 0.0313
Min / Max: 0.2751 – 0.4027
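A minimal Python sketch of the composite formula above, for readers who want to see the aggregation step spelled out. It assumes each sub-score is already scaled to the range 0 to 1 (the page does not state the scaling), and the dictionary keys and the example household are hypothetical. As printed, the five weights sum to 1.20 rather than 1.00; the sketch applies them verbatim and does not rescale.

```python
from typing import Mapping

# Weights copied from the composite formula as printed on this page.
# They sum to 1.20, not 1.00; no rescaling is applied here.
WEIGHTS = {
    "fin_stress": 0.30,
    "tenure_insec": 0.25,
    "phys_hazard": 0.25,
    "dwelling_qual": 0.20,
    "utility_depriv": 0.20,
}


def hfvs(sub_scores: Mapping[str, float]) -> float:
    """Weighted sum of the five sub-scores.

    Assumption: every sub-score lies in [0, 1].
    """
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= sub_scores[name] <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


# Hypothetical household (illustrative values, not survey data).
example = {
    "fin_stress": 0.5,
    "tenure_insec": 0.2,
    "phys_hazard": 0.1,
    "dwelling_qual": 0.4,
    "utility_depriv": 0.3,
}
score = hfvs(example)  # 0.15 + 0.05 + 0.025 + 0.08 + 0.06
```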

Key Findings

Savings capacity is the single strongest predictor
Mutual information scores confirm that savings_rate (MI = 0.142) and no_savings (MI = 0.138) dominate the HFVS model — more than any structural or geographic feature. Financial resilience precedes physical risk.
Dwelling quality signals cluster in the top 10 features
Three dwelling-material indicators (structural_durability, floor_durable and wall_durable) rank 3rd, 6th and 8th. Informal construction is simultaneously a proxy for poverty and for hazard exposure.
Arid and semi-arid (ASAL) counties dominate the high-risk tier
Tana River (0.403), Trans Nzoia (0.384) and West Pokot (0.375) lead the risk ranking. All three combine high utility deprivation with financial stress, so the two vulnerabilities compound each other.
Nairobi's 100% urbanisation masks individual-level stress
Nairobi has the lowest county HFVS (0.283) yet 19% of households exceed the high-vulnerability threshold — the largest absolute count nationally, owing to its 678-household representation.

County Risk Atlas

47 Counties Ranked by Composite HFVS

Rank  County           HFVS   Tier
1     Tana River       0.403  HIGH
2     Trans Nzoia      0.384  HIGH
3     West Pokot       0.375  HIGH
4     Samburu          0.371  HIGH
5     Wajir            0.368  HIGH
6     Mandera          0.365  HIGH
7     Turkana          0.361  HIGH
8     Garissa          0.358  HIGH
9     Marsabit         0.352  ABOVE MEAN
10    Isiolo           0.347  ABOVE MEAN
11    Kwale            0.344  ABOVE MEAN
12    Homa Bay         0.341  ABOVE MEAN
13    Migori           0.338  ABOVE MEAN
14    Busia            0.336  ABOVE MEAN
15    Taita Taveta     0.334  ABOVE MEAN
16    Kilifi           0.332  ABOVE MEAN
17    Siaya            0.330  ABOVE MEAN
18    Vihiga           0.328  ABOVE MEAN
19    Kitui            0.327  ABOVE MEAN
20    Makueni          0.325  ABOVE MEAN
21    Bungoma          0.325  ABOVE MEAN
22    Kakamega         0.324  ABOVE MEAN
23    Kisumu           0.323  BELOW MEAN
24    Bomet            0.322  BELOW MEAN
25    Machakos         0.320  BELOW MEAN
26    Kericho          0.320  BELOW MEAN
27    Nyamira          0.319  BELOW MEAN
28    Nyandarua        0.318  BELOW MEAN
29    Narok            0.316  BELOW MEAN
30    Embu             0.315  BELOW MEAN
31    Meru             0.314  BELOW MEAN
32    Elgeyo Marakwet  0.313  BELOW MEAN
33    Tharaka Nithi    0.312  BELOW MEAN
34    Baringo          0.312  BELOW MEAN
35    Nandi            0.311  BELOW MEAN
36    Kiambu           0.311  BELOW MEAN
37    Laikipia         0.310  BELOW MEAN
38    Kisii            0.309  BELOW MEAN
39    Uasin Gishu      0.308  BELOW MEAN
40    Lamu             0.307  BELOW MEAN
41    Muranga          0.305  BELOW MEAN
42    Nakuru           0.303  BELOW MEAN
43    Mombasa          0.302  BELOW MEAN
44    Kajiado          0.299  BELOW MEAN
45    Kirinyaga        0.298  BELOW MEAN
46    Nyeri            0.293  BELOW MEAN
47    Nairobi          0.283  BELOW MEAN
High vulnerability (HFVS > 0.355)
Above mean (> 0.324)
Below mean
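The tier labels in the ranking follow from two cut-offs: the national mean, and the national mean plus one standard deviation. The sketch below reproduces that rule in Python for a handful of counties from the atlas. It assumes the cut-offs use the unrounded national statistics reported on this page (mean 0.3236, standard deviation 0.0313), which is consistent with Kakamega (0.324) being labelled ABOVE MEAN; the function name `tier` is illustrative.

```python
NAT_MEAN = 0.3236  # national mean HFVS, as reported
NAT_SD = 0.0313    # national standard deviation, as reported

# Mean + 1 sigma = 0.3549, displayed on the page rounded to 0.355.
HIGH_THRESHOLD = NAT_MEAN + NAT_SD


def tier(county_hfvs: float) -> str:
    """Assign a county to one of the three legend tiers."""
    if county_hfvs > HIGH_THRESHOLD:
        return "HIGH"
    if county_hfvs > NAT_MEAN:
        return "ABOVE MEAN"
    return "BELOW MEAN"


# A few county scores copied from the ranking above.
counties = {
    "Tana River": 0.403,
    "Garissa": 0.358,
    "Marsabit": 0.352,
    "Kakamega": 0.324,
    "Kisumu": 0.323,
    "Nairobi": 0.283,
}
tiers = {name: tier(value) for name, value in counties.items()}
```

Under these assumptions the sketch returns the same labels the atlas shows for all six counties, including the two boundary cases (Garissa just above the high threshold, Kisumu just below the mean).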

Model Performance

XGBoost achieves AUC 0.989

Four model architectures were benchmarked against binary (high-vulnerability) and continuous (HFVS score) targets. Gradient boosting methods dominate across all metrics.

Model         AUC-ROC  PR-AUC  F1 Score
LightGBM      0.9892   0.9853  0.9289    ★ Best overall performer
XGBoost       0.9888   0.9851  0.9286
Logistic Reg  0.9832   0.9773  0.9072

Both LightGBM and XGBoost achieve AUC > 0.988 — near-perfect discrimination between vulnerable and non-vulnerable households. Logistic regression (AUC 0.983) confirms that even a linear model captures strong signal from the engineered features.

The near-perfect AUC reflects the highly structured nature of HFVS: the index was constructed from the same features used in prediction, so very high discrimination is expected. The result validates the internal consistency of the vulnerability index.
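For readers unfamiliar with the metric, AUC-ROC can be read as the probability that a randomly chosen vulnerable household receives a higher model score than a randomly chosen non-vulnerable one. The sketch below computes it directly from that pairwise definition in plain Python. It is an illustration of the metric only, not the dissertation's evaluation code, and the toy labels and scores are invented.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC via pairwise comparison of positives against negatives.

    Ties between a positive and a negative score count as half a win.
    Quadratic in sample size, so suitable for small illustrations only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = high vulnerability, 0 = not (invented values).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly: 0.75
```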

Train / Test Split
80 / 20 stratified split. No data leakage: HFVS sub-scores, not the composite itself, are used as features for the regression target.
Cross-validation
5-fold CV used for hyperparameter tuning. Final evaluation on held-out test set only. SHAP computed on full test fold.
Class balance
40% positive class (high vulnerability) — moderate imbalance. PR-AUC used alongside ROC-AUC as primary classification metric.
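A stratified 80 / 20 split keeps the 40% positive share the same in both partitions. The following is a minimal standard-library sketch of that idea, assuming a simple per-class shuffle; the function name, the seed and the synthetic labels are hypothetical, and the dissertation's actual splitting code is not shown on this page.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[Hashable], test_frac: float = 0.2, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx) with class shares preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label][:]
        rng.shuffle(idxs)
        # Take the same fraction from every class.
        n_test = round(len(idxs) * test_frac)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels with a 40% positive class (not survey data).
labels = [1] * 400 + [0] * 600
train_idx, test_idx = stratified_split(labels)
test_pos_share = sum(labels[i] for i in test_idx) / len(test_idx)
train_pos_share = sum(labels[i] for i in train_idx) / len(train_idx)
```

With these synthetic labels both partitions keep a positive share of 0.4, which is the property that makes PR-AUC comparable between training and held-out data.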

SHAP Feature Importance

What drives vulnerability?

Mutual information scores identify which household characteristics most strongly determine HFVS. Savings behaviour and dwelling quality dominate: structural, not geographic, factors lead.

MI contribution by dimension

Financial: 33.2%
Dwelling: 36.0%
Tenure: 17.5%
Spatial: 6.5%
Hazard: 3.6%
Utility: 3.2%
Rank  Feature                MI     Dimension
1     Savings Rate           0.142  Financial
2     No Savings             0.138  Financial
3     Structural Durability  0.124  Dwelling
4     No Land Ownership      0.108  Tenure
5     Informal Dwelling      0.090  Dwelling
6     Floor Durable          0.081  Dwelling
7     County HFVS Rank       0.079  Spatial
8     Wall Durable           0.072  Dwelling
9     Roof Durable           0.068  Dwelling
10    Log Rent               0.063  Financial
11    Rent Burden            0.058  Financial
12    No Written Lease       0.054  Tenure
13    Eviction Threat        0.049  Tenure
14    Near Flood Zone        0.044  Hazard
15    No Electricity         0.039  Utility
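The dimension shares listed under "MI contribution by dimension" appear to be each dimension's summed MI as a percentage of the total across the fifteen ranked features. The Python sketch below checks that reading using the scores exactly as listed. That the shares were derived this way is an inference from the numbers, not something the page states.

```python
from collections import defaultdict

# (feature, dimension, MI score) copied from the ranked list above.
MI_SCORES = [
    ("Savings Rate", "Financial", 0.142),
    ("No Savings", "Financial", 0.138),
    ("Structural Durability", "Dwelling", 0.124),
    ("No Land Ownership", "Tenure", 0.108),
    ("Informal Dwelling", "Dwelling", 0.090),
    ("Floor Durable", "Dwelling", 0.081),
    ("County HFVS Rank", "Spatial", 0.079),
    ("Wall Durable", "Dwelling", 0.072),
    ("Roof Durable", "Dwelling", 0.068),
    ("Log Rent", "Financial", 0.063),
    ("Rent Burden", "Financial", 0.058),
    ("No Written Lease", "Tenure", 0.054),
    ("Eviction Threat", "Tenure", 0.049),
    ("Near Flood Zone", "Hazard", 0.044),
    ("No Electricity", "Utility", 0.039),
]

# Sum MI within each dimension, then express as a share of the total.
totals = defaultdict(float)
for _feature, dimension, mi in MI_SCORES:
    totals[dimension] += mi

grand_total = sum(totals.values())
shares = {dim: round(100 * value / grand_total, 1) for dim, value in totals.items()}
```

Run on the listed scores, this yields Financial 33.2, Dwelling 36.0, Tenure 17.5, Spatial 6.5, Hazard 3.6 and Utility 3.2, matching the published shares.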
Interpreting mutual information

Mutual information (MI) measures the statistical dependency between each feature and the HFVS target, regardless of relationship type. MI = 0 means complete independence; higher values indicate stronger predictive signal. These scores complement SHAP values computed from the trained XGBoost model, which additionally capture interaction effects and directional contributions.
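To make the definition concrete, the sketch below computes MI for two discrete variables using the plug-in (empirical frequency) estimator, in nats. This is only one way to estimate MI and is not necessarily the estimator behind the scores above, which the page does not specify; continuous features such as savings rate would need binning or a different estimator. The toy variables are invented.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Plug-in estimate of I(X; Y) in nats for discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log( p(x,y) / (p(x) p(y)) ), written with raw counts.
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


# Independent toy variables: every (x, y) combination equally frequent.
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])

# Perfectly dependent toy variables: y is a copy of x.
mi_identical = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
```

The independent pair gives an MI of 0, and the identical balanced binary pair gives log 2 (about 0.693 nats), the maximum for a balanced binary variable.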

Household Risk Calculator

Estimate your HFVS score

Enter household characteristics to compute an indicative Housing Financial Vulnerability Score using the same methodology as the dissertation model.

Input groups: Financial, Tenure, Physical Hazard, Dwelling & Utilities
