LIVE · 708 DISTRICTS · NFHS-5 · ALL INDIA

Predict which districts
get sick
before they do

ML-powered non-communicable disease (NCD) risk intelligence for every Indian district. Validated across time: a model trained on NFHS-4 (2015–16) data predicted NFHS-5 (2019–21) diabetes outcomes with an R² of 0.63 (63% of variance explained). Built for insurers, pharma, and public health teams.

DISTRICTMAPS · LIVE METRICS
Cross-Sectional R²
5-fold CV · 704 districts
0.7132
✓ VALIDATED
Temporal R² · 4-Year Gap
NFHS-4 2015–16 → NFHS-5 2019–21
0.6279
✓ FORWARD PREDICTION
Districts Covered
28 states · 8 UTs · All India
708
FULL COVERAGE
Causal Features Used
Post proxy-removal · SHAP verified
78
CLEAN MODEL
HYDERABAD 73.5 composite risk · KERALA · PATHANAMTHITTA 34.7% diabetes · LAKSHADWEEP 12.1 lowest composite · TAMIL NADU HIGH obesity trend · TEMPORAL VALIDATION R² 0.6279 · ANDHRA PRADESH · EAST GODAVARI 68.1 risk · MODEL MAE 2.06% absolute error
Independent Validation

Three proofs the model works

Most health data products describe the present. Districtmaps.ai predicted the future — validated independently across districts, time, and conditions.

Cross-Sectional
71.3%
5-fold cross-validated R² across 704 districts. Lifestyle and demographic indicators explain 71% of diabetes risk variation. Proxy features removed post-SHAP analysis.
✓ 5-FOLD CV · PROXY-CLEAN
Temporal Prediction
62.8%
Trained on NFHS-4 (2015–16) features, the model predicted NFHS-5 (2019–21) diabetes outcomes and explained 63% of their variance, using only inputs collected four years before those outcomes were measured.
✓ 4-YEAR FORWARD PREDICTION
Coverage
708
Districts across 28 states and 8 Union Territories. 4 conditions tracked: diabetes, blood pressure, obesity, anaemia. The most comprehensive district NCD intelligence available.
✓ ALL INDIA · 4 CONDITIONS
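For readers who want to check what "explained variance" means in the cards above, here is a minimal sketch of the R² calculation. It is not the Districtmaps.ai pipeline; the function name and the toy prevalence values are assumptions made up for illustration.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    avg = mean(observed)
    # Total variation of the observed values around their mean.
    ss_tot = sum((y - avg) ** 2 for y in observed)
    # Variation left unexplained by the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1 - ss_res / ss_tot


# Toy district prevalence values (percent), invented for illustration only.
observed = [10.0, 14.0, 18.0, 22.0]
predicted = [11.0, 13.0, 19.0, 21.0]
score = r_squared(observed, predicted)
```

In a temporal check of the kind described above, `predicted` would come from a model fitted on the earlier survey wave and `observed` from the later wave, so the score reflects forward prediction rather than in-sample fit.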
Model Explainability · SHAP Analysis

What actually drives
district diabetes risk

Every prediction is explainable. SHAP analysis reveals the exact contribution of each feature — so technical buyers can audit the model, not just trust it.

Top Predictors · Mean SHAP Value
Female obesity (BMI ≥25) 0.572
Iron folic acid (100+ days) 0.434
Female school attendance 0.404
Children overweight (under 5) 0.293
Teen pregnancy rate 0.292
Health insurance coverage 0.255
Improved sanitation access 0.156
Tobacco use (men 15+) 0.152
Legend: Direct causal · Development proxy
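A ranking like the one above is commonly produced by averaging the absolute SHAP value of each feature across all rows. The sketch below assumes that convention (the page does not say exactly how its means were computed) and uses placeholder feature names and made-up SHAP values rather than the real model output.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value across rows (one row per district)."""
    n_rows = len(shap_rows)
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        for i, value in enumerate(row):
            # Sign is dropped: a feature matters whether it pushes risk up or down.
            totals[i] += abs(value)
    means = {name: total / n_rows for name, total in zip(feature_names, totals)}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Placeholder names and values, invented for illustration only.
features = ["feature_a", "feature_b", "feature_c"]
shap_rows = [
    [0.5, -0.25, 0.0],
    [-0.75, 0.25, 0.25],
]
ranking = rank_by_mean_abs_shap(shap_rows, features)
```

Because the sign is discarded, this ranking shows how strongly a feature moves predictions, not whether it raises or lowers predicted risk.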
What The Model Tells Us
DIRECT CAUSAL DRIVERS

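Obesity and tobacco use are the direct drivers, and obesity is by far the stronger of the two: female obesity has the highest mean SHAP value (0.572), while tobacco use among men scores 0.152. Districts where many women are overweight and tobacco use among men is high show consistently high diabetes prevalence, regardless of geography.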

DEVELOPMENT TRANSITION SIGNAL

Education, healthcare access, and insurance coverage predict diabetes through the development transition pathway — wealthier districts adopt urban diets and sedentary lifestyles before health infrastructure catches up.

INTERGENERATIONAL SIGNAL

Children's overweight status, teen pregnancy rates, and childhood anaemia reveal the intergenerational transmission of metabolic risk — high-risk households today are high-risk districts tomorrow.

7 proxy features were removed after SHAP review, including C-sections, contraceptive use, and sterilisation. Model R² impact: −0.025. Credibility impact: significant.
🔥
Highest Risk
Kerala & Tamil Nadu top the diabetes charts — high development, high obesity, high diagnosis rates. Not unhealthy districts, but the most metabolically stressed.
📈
Rising Risk
Andhra Pradesh & Telangana show the steepest development-transition risk. Urban dietary shift outpacing health infrastructure.
🎯
Pharma Opportunity
High risk + low diagnosis districts are the addressable market. Treatment gap districts in central India represent untapped commercial potential.
🛡️
Insurer Signal
Composite risk + insurance penetration together identify underpriced risk pools — districts where current premiums don't reflect true NCD burden.
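The "high risk + low diagnosis" screen described in the cards above can be sketched as a simple filter. Everything here is an assumption for illustration: the field names `composite_risk` and `diagnosis_rate`, the thresholds, and the district rows are hypothetical and are not taken from the API or from real data.

```python
def flag_treatment_gap(districts, risk_floor=60.0, diagnosis_ceiling=30.0):
    """Return names of districts with high composite risk but a low diagnosis rate."""
    return [
        d["name"]
        for d in districts
        if d["composite_risk"] >= risk_floor and d["diagnosis_rate"] < diagnosis_ceiling
    ]


# Hypothetical rows, invented for illustration only.
districts = [
    {"name": "District A", "composite_risk": 70.0, "diagnosis_rate": 20.0},
    {"name": "District B", "composite_risk": 70.0, "diagnosis_rate": 45.0},
    {"name": "District C", "composite_risk": 35.0, "diagnosis_rate": 15.0},
]
flagged = flag_treatment_gap(districts)
```

The same pattern would apply to the insurer screen by swapping the diagnosis rate for an insurance penetration figure; suitable thresholds depend on your own portfolio.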
Live District Lookup

Query any district instantly

Enter any Indian district to retrieve its full NCD risk profile. Try: Mumbai, Chennai, Patna, Hyderabad, Bengaluru.

API Reference

Integrate in minutes

RESTful JSON API. No SDK required. Returns risk scores for any Indian district. Fuzzy name matching included.

GET /risk
Returns all 4 condition risk scores plus the composite risk for a named district. Supports partial and fuzzy name matching.
Examples: /risk?district=Mumbai · /risk?district=Patna&state=Bihar
GET /top
Returns the top N highest-risk districts for a given condition. Useful for territory prioritisation and expansion planning.
Example: /top?condition=diabetes_risk&n=20
GET /state/{state}
All districts within a state ranked by composite risk. Ideal for state-level planning and regional comparison.
Examples: /state/Maharashtra · /state/Kerala
GET /districts
Full ranked list of all 708 districts. Sortable by any condition. Supports limit and pagination.
Example: /districts?sort_by=composite_risk&limit=50
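The GET endpoints above take ordinary paths and query strings, so a request URL can be assembled with the Python standard library alone. This is a sketch under stated assumptions: the base URL is a placeholder (the page does not give one), `build_url` is a hypothetical helper, and percent-encoding spaces in state names is a guess about what the service expects.

```python
from urllib.parse import quote, urlencode

# Placeholder host: the page does not state the API's base URL.
BASE_URL = "https://api.example.invalid"


def build_url(path, **params):
    """Join the base URL, a percent-encoded path, and an encoded query string."""
    url = BASE_URL + quote(path)
    if params:
        url += "?" + urlencode(params)
    return url


risk_url = build_url("/risk", district="Patna", state="Bihar")
top_url = build_url("/top", condition="diabetes_risk", n=20)
state_url = build_url("/state/Tamil Nadu")  # assumes spaces are sent as %20
districts_url = build_url("/districts", sort_by="composite_risk", limit=50)
```

Any HTTP client can fetch these URLs once you have the real host and access credentials; the response format is not documented on this page, so it is not parsed here.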
POST /predict
Live inference. Send your own district-level data in any column naming convention. We fuzzy-match to our 80 features, fill missing values with national medians, and return a live ML prediction.
Example body: { "obesity": 32.1, "tobacco": 18.4, "anaemia": 52.3 }
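A /predict call with the example body above might be prepared as follows. This is a sketch only: the base URL is a placeholder, `build_predict_request` is a hypothetical helper, and authentication is left out because the page does not describe it. The request is built but not sent.

```python
import json
from urllib.request import Request

# Placeholder host: the page does not state the API's base URL.
BASE_URL = "https://api.example.invalid"


def build_predict_request(indicators):
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    body = json.dumps(indicators).encode("utf-8")
    return Request(
        BASE_URL + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request({"obesity": 32.1, "tobacco": 18.4, "anaemia": 52.3})
# To send it, pass `request` to urllib.request.urlopen once you have a real host and access.
```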
Live Inference — Try It

Adjust the preset values, then add up to 3 of your own metrics using whatever column names you actually use. Watch the fuzzy matcher map them — and the prediction sharpen as coverage rises.

What This Means For You

Your data tells you what happened.
Our model tells you why — and what's next.

The /predict endpoint isn't a lookup. It's a live ML engine that runs on your data, in your naming convention, in under a second.

No Schema Alignment

Send column names exactly as they exist in your system — "loss_ratio", "claims_paid", "obesity_rate". Our fuzzy matcher maps them to the model automatically. No data engineering, no integration sprints.

Explainable, Not a Black Box

Every prediction comes with a match report — showing which of your columns drove the score, what confidence they matched at, and which gaps were filled with national medians. Auditable by your risk team.
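To make the matching and match-report idea concrete, here is a rough stand-in built on Python's `difflib`. It is not the service's actual matcher, which is not documented here: the canonical feature names, the national medians, the 0.6 cutoff, and the report layout are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher, get_close_matches

# Hypothetical canonical features and national medians, for illustration only.
NATIONAL_MEDIANS = {"obesity": 20.0, "tobacco_use": 30.0, "anaemia": 50.0}


def match_and_fill(payload, cutoff=0.6):
    """Map client column names to canonical features and fill gaps with medians."""
    features = {}
    report = {"matched": {}, "unmatched": [], "median_filled": []}
    for column, value in payload.items():
        candidates = get_close_matches(column, list(NATIONAL_MEDIANS), n=1, cutoff=cutoff)
        if not candidates:
            report["unmatched"].append(column)
            continue
        target = candidates[0]
        # Similarity ratio in [0, 1], reported as the match confidence.
        confidence = SequenceMatcher(None, column, target).ratio()
        features[target] = value
        report["matched"][column] = (target, round(confidence, 3))
    matched_features = len(features)  # distinct canonical features supplied by the client
    for name, median in NATIONAL_MEDIANS.items():
        if name not in features:
            features[name] = median
            report["median_filled"].append(name)
    report["coverage"] = matched_features / len(NATIONAL_MEDIANS)
    return features, report


features, report = match_and_fill({"obesity_rate": 32.1, "tobacco": 18.4, "loss_ratio": 0.8})
```

In this toy run `obesity_rate` and `tobacco` map to canonical features, `loss_ratio` is reported as unmatched, and `anaemia` is filled with its placeholder median. String similarity is a crude proxy for meaning, so a report like this is worth reviewing before a score is relied on.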

Improves With Your Data

The more columns you send, the sharper the prediction. With 5 features you get a directional signal. With 20+ you get a portfolio-grade underwriting score calibrated to your district mix.

For Insurers — Underwriting

Cross your district loss ratios against our NCD risk scores. The model surfaces which districts are structurally underpriced — where the health burden isn't yet reflected in your premium structure but will be in 3–5 years.

For Pharma — Territory Planning

Send your sales rep territory boundaries and current Rx volumes. Our model returns the latent NCD burden in each territory — where the undiagnosed patient pool is largest and growing fastest.

For Hospital Chains — Expansion

Before opening a new facility, score every shortlisted district against projected NCD patient volumes. Our temporal validation — 63% explained variance 4 years forward — makes it a credible input to your capital allocation model.

For Public Health — Prioritisation

Send your intervention budget and district population data. The API returns a ranked priority list — the districts where NCD burden is highest relative to existing healthcare infrastructure investment.

Request Access

Ready to run this
on your data?

API access is currently invite-only while we calibrate for enterprise workloads. Tell us what you're working on — we respond within 48 hours.

Starter
₹50K – ₹1L
District risk report for your operating territory. PDF + CSV. One-time.
Enterprise
₹10L+/yr
Custom model trained on your data. White-label option. Dedicated support.

We respond within 48 hours · No sales calls unless you want one