LIVE · 708 DISTRICTS · NFHS-5 · ALL INDIA

Predict which districts
get sick—
before they do

ML-powered NCD risk intelligence across every Indian district. Validated across time — trained on 2015–16 data, predicted 2019–21 outcomes with 63% explained variance. Built for insurers, pharma, and public health teams.

Try District Lookup → · View Model Evidence

DISTRICTMAPS · LIVE METRICS

Cross-Sectional R²

5-fold CV · 704 districts

0.7132

✓ VALIDATED

Temporal R² · 4-Year Gap

NFHS-4 2015–16 → NFHS-5 2019–21

0.6279

✓ FORWARD PREDICTION

Districts Covered

29 states · 8 UTs · All India

708

FULL COVERAGE

Causal Features Used

Post proxy-removal · SHAP verified

CLEAN MODEL

HYDERABAD 73.5 composite risk · KERALA · PATHANAMTHITTA 34.7% diabetes · LAKSHADWEEP 12.1 lowest composite · TAMIL NADU HIGH obesity trend · TEMPORAL VALIDATION R² 0.6279 · ANDHRA PRADESH · EAST GODAVARI 68.1 risk · MODEL MAE 2.06% absolute error

Independent Validation

Three proofs the model works

Most health data products describe the present. Districtmaps.ai predicted the future — validated independently across districts, time, and conditions.

Cross-Sectional

71.3%

5-fold cross-validated R² across 704 districts. Lifestyle and demographic indicators explain 71% of the district-level variation in diabetes risk. Proxy features were removed after SHAP analysis.
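For readers who want to see what a 5-fold cross-validated R² measures, here is a minimal pure-Python sketch. It fits a one-feature least-squares line on made-up numbers. The production model, its features, and its fold assignment are not described on this page, so every name and value below is illustrative.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x on one feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def cross_validated_r2(xs, ys, k=5):
    """Predict every row with a model that never saw it, then score once."""
    n = len(xs)
    out_of_fold = [0.0] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        intercept, slope = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        for i in test_idx:
            out_of_fold[i] = intercept + slope * xs[i]
    return r_squared(ys, out_of_fold)

# Illustrative data only: obesity % vs diabetes % for ten imaginary districts.
obesity = [12.0, 15.5, 18.0, 21.0, 24.5, 27.0, 30.0, 33.5, 36.0, 40.0]
diabetes = [8.1, 9.0, 10.2, 10.9, 12.5, 12.8, 14.1, 15.0, 16.2, 17.5]
cv_r2 = cross_validated_r2(obesity, diabetes, k=5)
print(round(cv_r2, 3))
```

Because every prediction comes from a model that never saw that district, the score reflects out-of-sample fit rather than memorisation.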

✓ 5-FOLD CV · PROXY-CLEAN

Temporal Prediction

62.8%

Trained on NFHS-4 (2015–16) features, predicted NFHS-5 (2019–21) diabetes outcomes. The model explained 63% of the variance in outcomes recorded four years after its training data.
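The forward-validation idea can be sketched in a few lines: predictions made from earlier-wave features are scored against outcomes observed in the later wave. The district labels and numbers below are invented for illustration, and the function name is hypothetical. Only the two metrics (R² and mean absolute error) correspond to figures quoted on this page.

```python
from statistics import mean

def evaluate_forward(predicted, observed):
    """Score earlier-wave predictions against later-wave outcomes."""
    avg = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - avg) ** 2 for o in observed)
    mae = mean(abs(o - p) for o, p in zip(observed, predicted))
    return {"r2": 1.0 - ss_res / ss_tot, "mae": mae}

# Hypothetical districts: predictions built from earlier-wave features,
# compared with diabetes prevalence (%) observed in the later wave.
predicted_later = {"A": 10.0, "B": 14.0, "C": 18.0, "D": 22.0}
observed_later = {"A": 11.0, "B": 13.0, "C": 20.0, "D": 20.0}

districts = sorted(predicted_later)
scores = evaluate_forward(
    [predicted_later[d] for d in districts],
    [observed_later[d] for d in districts],
)
print(scores)
```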

✓ 4-YEAR FORWARD PREDICTION

Coverage

708

Districts across 29 states and 8 Union Territories. 4 conditions tracked: diabetes, blood pressure, obesity, anaemia. The most comprehensive district NCD intelligence available.

✓ ALL INDIA · 4 CONDITIONS

Model Explainability · SHAP Analysis

What actually drives
district diabetes risk

Every prediction is explainable. SHAP analysis reveals the exact contribution of each feature — so technical buyers can audit the model, not just trust it.

Top Predictors · Mean SHAP Value

Female obesity (BMI ≥25) 0.572

Iron folic acid (100+ days) 0.434

Female school attendance 0.404

Children overweight (under 5) 0.293

Health insurance coverage 0.255

Teen pregnancy rate 0.292

Tobacco use (men 15+) 0.152

Improved sanitation access 0.156

Legend: Direct causal · Development proxy
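To audit the chart, the published values can be sorted and expressed as shares. This short sketch uses only the eight numbers listed above. The "share" column is relative to these eight features, not to the full model, which is an assumption made purely for readability.

```python
# Mean SHAP values as published in the chart above.
mean_shap = {
    "Female obesity (BMI ≥25)": 0.572,
    "Iron folic acid (100+ days)": 0.434,
    "Female school attendance": 0.404,
    "Children overweight (under 5)": 0.293,
    "Health insurance coverage": 0.255,
    "Teen pregnancy rate": 0.292,
    "Tobacco use (men 15+)": 0.152,
    "Improved sanitation access": 0.156,
}

# Rank features from largest to smallest mean SHAP value.
ranked = sorted(mean_shap.items(), key=lambda item: item[1], reverse=True)
total = sum(mean_shap.values())

for name, value in ranked:
    share = value / total  # share among these eight features only
    print(f"{name:<32} {value:.3f}  {share:5.1%}")
```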

What The Model Tells Us

DIRECT CAUSAL DRIVERS

Obesity and tobacco use are the model's direct causal drivers, with female obesity the strongest single predictor (mean SHAP 0.572). Districts where women are overweight and men smoke heavily show consistently high diabetes prevalence, regardless of geography.

DEVELOPMENT TRANSITION SIGNAL

Education, healthcare access, and insurance coverage predict diabetes through the development transition pathway — wealthier districts adopt urban diets and sedentary lifestyles before health infrastructure catches up.

INTERGENERATIONAL SIGNAL

Children's overweight status, teen pregnancy rates, and childhood anaemia reveal the intergenerational transmission of metabolic risk — high-risk households today are high-risk districts tomorrow.

7 proxy features removed after SHAP review (including C-sections, contraceptive use, and sterilisation). Model R² impact: −0.025. Credibility impact: significant.

🔥

Highest Risk

Kerala & Tamil Nadu top the diabetes charts — high development, high obesity, high diagnosis rates. Not unhealthy districts, but the most metabolically stressed.

📈

Rising Risk

Andhra Pradesh & Telangana show the steepest development-transition risk. Urban dietary shift outpacing health infrastructure.

🎯

Pharma Opportunity

High-risk, low-diagnosis districts are the addressable market. Treatment-gap districts in central India represent untapped commercial potential.

🛡️

Insurer Signal

Composite risk + insurance penetration together identify underpriced risk pools — districts where current premiums don't reflect true NCD burden.

Live District Lookup

Query any district instantly

Enter any Indian district to retrieve its full NCD risk profile. Try: Mumbai, Chennai, Patna, Hyderabad, Bengaluru.


API Reference

Integrate in minutes

RESTful JSON API. No SDK required. Returns risk scores for any Indian district. Fuzzy name matching included.

GET /risk

Returns all 4 condition risk scores + composite risk for a named district. Supports partial and fuzzy name matching.

?district=Mumbai
?district=Patna&state=Bihar
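A minimal sketch of building these /risk requests in Python. The host `api.example.com` is a placeholder (the real base URL and any authentication scheme are not published on this page), and `risk_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # placeholder; the real host is not published here

def risk_url(district, state=None):
    """Build a GET /risk URL; `state` is optional and narrows the match."""
    params = {"district": district}
    if state:
        params["state"] = state
    return f"{BASE_URL}/risk?{urlencode(params)}"

print(risk_url("Mumbai"))
print(risk_url("Patna", state="Bihar"))
```

Using `urlencode` keeps district names with spaces or special characters safe in the query string.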

GET /top

Returns top N highest-risk districts for a given condition. Useful for territory prioritisation and expansion planning.

?condition=diabetes_risk&n=20
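A call to /top might look like the sketch below. The host is a placeholder and the response shape is not documented on this page, so `fetch_top` simply returns whatever JSON comes back. Both function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.example.com"  # placeholder host

def top_url(condition, n=20):
    """Build a GET /top URL for the N highest-risk districts."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return f"{BASE_URL}/top?{urlencode({'condition': condition, 'n': n})}"

def fetch_top(condition, n=20, timeout=10):
    """Call GET /top and decode the JSON body (response shape is an assumption)."""
    with urlopen(top_url(condition, n), timeout=timeout) as response:
        return json.load(response)

print(top_url("diabetes_risk", n=20))
```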

GET /state/{state}

All districts within a state ranked by composite risk. Ideal for state-level planning and regional comparison.

/state/Maharashtra
/state/Kerala
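State names with spaces or ampersands need encoding when placed in a URL path. The sketch below assumes the API accepts percent-encoded path segments, which this page does not confirm. The host and the helper name are placeholders.

```python
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # placeholder host

def state_url(state):
    """Percent-encode the state so spaces and '&' stay inside one path segment."""
    return f"{BASE_URL}/state/{quote(state, safe='')}"

print(state_url("Maharashtra"))
print(state_url("Tamil Nadu"))
```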

GET /districts

Full ranked list of all 708 districts. Sortable by any condition. Supports limit and pagination.

?sort_by=composite_risk&limit=50

POST /predict

Live inference. Send your own district-level data in any column naming convention. We fuzzy-match to our 80 features, fill missing values with national medians, and return a live ML prediction.

{
  "obesity": 32.1,
  "tobacco": 18.4,
  "anaemia": 52.3
}

Live Inference — Try It

Adjust the preset values, then add up to 3 of your own metrics using whatever column names you actually use. Watch the fuzzy matcher map them — and the prediction sharpen as coverage rises.

Preset indicators: obesity %, tobacco %, anaemia %, insurance %, sanitation %

Add your own metrics (up to 3)


What This Means For You

Your data tells you what happened.
Our model tells you why — and what's next.

The /predict endpoint isn't a lookup. It's a live ML engine that runs on your data, in your naming convention, in under a second.
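A request to /predict could be assembled as in this sketch, which builds the POST without sending it. The host is a placeholder, the `Content-Type` header is an assumption, and any authentication the API requires is not covered on this page.

```python
import json
from urllib.request import Request

BASE_URL = "https://api.example.com"  # placeholder host

def build_predict_request(indicators):
    """Wrap district-level indicators in a JSON POST to /predict."""
    body = json.dumps(indicators).encode("utf-8")
    return Request(
        f"{BASE_URL}/predict",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )

request = build_predict_request({"obesity": 32.1, "tobacco": 18.4, "anaemia": 52.3})
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request, timeout=10)
```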

No Schema Alignment

Send column names exactly as they exist in your system — "loss_ratio", "claims_paid", "obesity_rate". Our fuzzy matcher maps them to the model automatically. No data engineering, no integration sprints.
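How a fuzzy column matcher can work is easy to illustrate with Python's standard `difflib`. This is not the matcher used by the API, whose algorithm and feature names are not published here. The five canonical names are borrowed from the preset indicators on this page, and the 0.6 cutoff is an arbitrary choice.

```python
import re
from difflib import get_close_matches

# Hypothetical canonical names, borrowed from the preset indicators on this page.
MODEL_FEATURES = ["obesity", "tobacco", "anaemia", "insurance", "sanitation"]

def normalise(column):
    """Lower-case and collapse punctuation and spaces to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", column.lower()).strip("_")

def match_columns(columns, cutoff=0.6):
    """Map each incoming column to its closest model feature, or None."""
    mapping = {}
    for column in columns:
        hits = get_close_matches(normalise(column), MODEL_FEATURES, n=1, cutoff=cutoff)
        mapping[column] = hits[0] if hits else None
    return mapping

print(match_columns(["obesity_rate", "Tobacco Use", "anemia_pct", "region_id"]))
```

Columns that fall below the cutoff map to `None`, so they can be reported back to the caller rather than silently guessed.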

Explainable, Not a Black Box

Every prediction comes with a match report — showing which of your columns drove the score, what confidence they matched at, and which gaps were filled with national medians. Auditable by your risk team.
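The shape of such a match report can be sketched as follows. The field names, the five-feature list, and the median values are all hypothetical: the medians are placeholders and not real NFHS figures, and the API's actual report format is not documented on this page.

```python
# Hypothetical medians for illustration only; not real NFHS figures.
NATIONAL_MEDIANS = {
    "obesity": 20.0,
    "tobacco": 35.0,
    "anaemia": 55.0,
    "insurance": 40.0,
    "sanitation": 70.0,
}

def build_match_report(matched):
    """`matched` maps model feature -> (source column, value, match confidence)."""
    features, report = {}, []
    for name, median in NATIONAL_MEDIANS.items():
        if name in matched:
            column, value, confidence = matched[name]
            features[name] = value
            report.append({"feature": name, "source": column,
                           "confidence": confidence, "filled": False})
        else:
            # No matching column was supplied, so fall back to the median.
            features[name] = median
            report.append({"feature": name, "source": None,
                           "confidence": None, "filled": True})
    supplied = sum(1 for row in report if not row["filled"])
    return features, report, supplied / len(report)

features, report, coverage = build_match_report({
    "obesity": ("obesity_rate", 32.1, 0.74),
    "tobacco": ("Tobacco Use", 18.4, 0.78),
})
print(coverage)
```

A risk team can then see at a glance which inputs came from their own columns and which were filled in.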

Improves With Your Data

The more columns you send, the sharper the prediction. With 5 features you get a directional signal. With 20+ you get a portfolio-grade underwriting score calibrated to your district mix.

For Insurers — Underwriting

Cross your district loss ratios against our NCD risk scores. The model surfaces which districts are structurally underpriced — where the health burden isn't yet reflected in your premium structure but will be in 3–5 years.

For Pharma — Territory Planning

Send your sales rep territory boundaries and current Rx volumes. Our model returns the latent NCD burden in each territory — where the undiagnosed patient pool is largest and growing fastest.

For Hospital Chains — Expansion

Before opening a new facility, score every shortlisted district against projected NCD patient volumes. Our temporal validation (63% explained variance 4 years forward) makes these scores a credible input to your capital allocation model.

For Public Health — Prioritisation

Send your intervention budget and district population data. The API returns a ranked priority list — the districts where NCD burden is highest relative to existing healthcare infrastructure investment.

Request Access

Ready to run this
on your data?

API access is currently invite-only while we calibrate for enterprise workloads. Tell us what you're working on — we respond within 48 hours.

Starter

₹50K – ₹1L

District risk report for your operating territory. PDF + CSV. One-time.

API Access

₹3L – ₹5L/yr

Full API access. All endpoints including /predict. Unlimited queries. SLA included.

Enterprise

₹10L+/yr

Custom model trained on your data. White-label option. Dedicated support.

Name

Organisation

Work Email

Sector

What are you trying to solve?

Tier of interest

Starter: ₹50K–₹1L (one-time report)
API Access: ₹3L–₹5L/yr
Enterprise: ₹10L+/yr (custom model)

We respond within 48 hours · No sales calls unless you want one
