Problem Statement
India has over 12 million kirana stores and informal trade retailers — the last mile of every FMCG, auto parts, and consumer goods supply chain. Collectively they move trillions of rupees in goods. Yet to the formal financial system, they do not exist.
Distributors have been the de facto credit risk assessors for decades. They extend 7–30 day credit terms to retailers they trust, based on years of behavioral observation. That trust is never converted into a financial signal.
CORE INSIGHT — ALTSCORE
Why credit is inaccessible
Banks require a Bureau score. Bureau scores require formal credit history. Formal credit history requires a previous loan or credit card. Retailers have never had either. It is a closed loop by design — and it excludes the very operators who drive India's consumer economy.
Microfinance institutions do lend, but at 24–36% annual interest, via group liability structures that aren't designed for working capital. The gap between "what's available" and "what's needed" is enormous.
Where the signal already exists
Every distributor ERP — whether Tally, SAP Business One, or a custom system — holds 3–10 years of rich behavioral data: order frequency, invoice values, payment delay patterns, seasonal demand curves, return rates, and SKU mix evolution. None of this is currently used as a credit input.
The data is not missing. The infrastructure to extract, normalize, and translate it into credit intelligence is what's missing.
Product Vision & Goals
Make Behavioral Data Legible
Extract and normalize distributor transaction data into structured retailer behavioral profiles — the AltScore data layer.
Create a Trusted Credit Signal
Train an Alternative Credit Score model validated against real NBFC repayment outcomes. Build lender trust through accuracy and explainability.
Enable Frictionless Lending
Deliver a B2B API and lender dashboard that allows NBFCs and fintechs to underwrite informal retailers at scale with minimal manual intervention.
Non-Goals (v1.0)
Users & Stakeholders
Retailer (Raju)
- Has been buying from the same distributor for 8 years; always pays eventually
- Needs ₹1–5L working capital before Diwali season, can't get a bank loan
- Uses UPI; has a smartphone but limited digital-native behavior
- Pain: Festival season ordering requires cash upfront; opportunity is lost
- Goal: Get a credit line based on his real track record, not a piece of paper
Distributor (Deepa)
- Extends informal credit to ~60% of retailers from her own capital and absorbs all bad debt personally
- Has 5+ years of order/payment data she has never been able to monetise or act on formally
- Evaluated by her principal brand on retailer network growth and wallet share, so healthier retailers improve her own performance score
- Pain: Credit exposure ceiling limits how many retailers she can support; one large default disrupts her own cash flow
- Goal: Offload credit risk to formal lenders, grow her retailer base, earn a data partnership fee, and strengthen her standing with the brand
NBFC (Lakshmi)
- Has capital to deploy but no underwriting model for informal retailers
- Bureau-only underwriting rejects 80%+ of applicants in this segment
- Needs a scored, explainable signal with a credit limit recommendation
- Pain: Cost of field verification makes small-ticket lending unviable at scale
- Goal: Underwrite 10x more retailers with same team size using AltScore API
Why Distributors Will Share Their Data
The distributor is the most critical node in the AltScore architecture. Without their data, there is no product. The question is not whether we can extract the data; technically, that is straightforward. The question is why a distributor would willingly hand over their most sensitive commercial asset. The answer is that they have more to gain than to protect.
The distributor's data does not create value sitting in a Tally ledger. It creates value when it unlocks credit that flows back into their own sales. AltScore is not asking distributors to give something up — it is offering to turn a dormant asset into compounding revenue.
CORE COMMERCIAL LOGIC — DISTRIBUTOR PARTICIPATION
When Raju gets ₹1.5L working capital, he does not go to a stranger. He goes to Deepa, the person he has bought from for 8 years. The loan is a demand injection that flows directly back into the distributor's own topline. Deepa is funding her own revenue growth by sharing data.
Deepa today extends 7–30 day informal credit from her own capital. When a retailer defaults, she takes the hit. AltScore moves that credit function to a formal lender. She retains the relationship and the revenue — without carrying the risk. Her balance sheet gets structurally cleaner.
AltScore pays distributors a per-retailer-scored partnership fee. At 500 retailers this is meaningful passive income. More critically, it reframes the ask entirely — we are not requesting a favour. We are buying the data. That changes the psychology of every onboarding conversation.
Brands like TVS, HUL, and P&G evaluate distributors on retailer network growth and outlet activation. If Deepa can demonstrate that her retailer base is growing, buying more, and getting formally financed — she becomes a preferred distributor. AltScore data sharing is a competitive moat upward in the channel, not just downward.
Deepa already knows intuitively which retailers are creditworthy. What she has never had is a tool that makes that knowledge structured, auditable, and actionable. The AltScore distributor dashboard gives her a formal risk view of her own network — useful independent of the lending product. The data sharing is a by-product; the intelligence is the value-in.
Network Effect Among Distributors
Once one distributor in a territory participates and their retailers begin receiving loans — and start growing — competing distributors face a simple choice: join or watch their retailers fall behind. Distributor A's network becomes healthier and more loyal each season. Distributor B's retailers stay cash-constrained. Over 2–3 cycles, the gap compounds and becomes a structural competitive disadvantage.
The One Objection We Must Address
Distributors fear disintermediation: that a financed retailer will use the credit to bypass them and buy directly from the brand or OEM. This fear is real and must be addressed head-on in every distributor pitch. Our response: AltScore has no interest in trade routing. The credit product is explicitly structured as working capital for existing channel purchases, not a bridge to direct OEM buying. Loan disbursals go to the retailer's account; purchase orders still flow through the distributor. The distributor's role in the supply chain is untouched. We reinforce this contractually in the data partnership agreement.
Regulatory Direction of Travel
GST mandates and e-invoicing requirements are already pushing distributor transactions into formal digital systems. ONDC is driving open trade networks. The direction is clear — trade data is moving toward structured, shareable formats regardless. Distributors who build trust with AltScore early are better positioned than those who resist and risk being disintermediated by platforms that don't offer the same partnership economics.
Distributor Value Summary
| What Distributor Gives | What Distributor Gets | Timeline |
|---|---|---|
| ERP transaction data (read-only sync) | Retailer GMV growth as financed retailers buy more | Within 1–2 purchase cycles post-disbursement |
| Retailer identity data (GST, phone, shop name) | Credit risk offloaded from distributor's own balance sheet | Immediate — from first NBFC participation |
| Ongoing incremental data sync (daily delta) | Data partnership fee (per retailer scored, monthly) | From month 1 of go-live |
| Retailer consent facilitation (trusted intro) | Distributor intelligence dashboard — own retailer risk view | Available at onboarding |
| — | Improved principal scorecard (brand/OEM evaluation) | Quarterly review cycle |
Feature Set
Layer 1 — Data Ingestion & ERP Connector
| Feature | Description | Priority | Target User |
|---|---|---|---|
| ERP Connector SDK | Pre-built connectors for Tally Prime, SAP B1, and generic CSV export. Pulls ledger, invoice, and payment data via API or file upload. | P0 | Distributor (Deepa) |
| Retailer Identity Resolution | Deduplication and entity matching across distributors using GST number, phone, shop name + locality fuzzy match. | P0 | Internal / Data Ops |
| Consent Management Portal | DPDP-compliant retailer consent flow — WhatsApp-first opt-in, revocation, and audit trail. | P0 | Retailer (Raju) |
| Data Quality Dashboard | Distributor-facing view showing data freshness, completeness score, anomaly flags, and sync status. | P1 | Distributor (Deepa) |
| Incremental Sync Engine | Daily delta sync to capture new transactions without full re-ingestion. Handles ERP version changes gracefully. | P1 | Internal Engineering |
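The Retailer Identity Resolution row above can be illustrated with a minimal sketch. It assumes exact identifiers (GST number, phone) take precedence and that the fuzzy fallback compares shop names within the same locality. The record keys, the `same_retailer` function name, and the 0.85 similarity threshold are hypothetical choices for illustration, not values from this document.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def same_retailer(a: dict, b: dict, name_threshold: float = 0.85) -> bool:
    """Decide whether two distributor records describe the same retailer.

    Keys used (hypothetical): 'gst', 'phone', 'shop_name', 'locality'.
    """
    # If both records carry a GST number, that identifier decides the match.
    if a.get("gst") and b.get("gst"):
        return a["gst"] == b["gst"]
    # Otherwise an exact phone match is treated as sufficient.
    if a.get("phone") and a.get("phone") == b.get("phone"):
        return True
    # Fuzzy fallback: same locality and sufficiently similar shop name.
    if normalize(a["locality"]) != normalize(b["locality"]):
        return False
    ratio = SequenceMatcher(
        None, normalize(a["shop_name"]), normalize(b["shop_name"])
    ).ratio()
    return ratio >= name_threshold
```

A production matcher would likely add blocking (comparing only candidate pairs) and transliteration handling; this sketch only shows the precedence order of the three signals.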
Layer 2 — AltScore Intelligence Engine
| Feature | Description | Priority | Target User |
|---|---|---|---|
| Behavioral Feature Pipeline | Automated computation of 40+ features: payment delay distribution, order frequency, basket size trend, SKU mix evolution, seasonal elasticity, return rate. | P0 | ML Platform |
| AltScore Model (v1) | XGBoost-based credit risk classifier. Output: risk band (A/B/C/D), probability of default, recommended credit limit. Trained on NBFC partner repayment data. | P0 | NBFC (Lakshmi) |
| Score Explainability Module | SHAP-based reason codes per score output. Required for RBI FLDG compliance and retailer grievance redressal. Top 3 positive + top 3 negative drivers shown. | P0 | NBFC / Retailer |
| Retailer Profile Card | Human-readable summary of retailer behavioral history — for lender use during manual review or override workflows. | P1 | NBFC (Lakshmi) |
| Model Drift Monitor | Automatic PSI/KS tracking. Alerts when score distribution shifts vs. training baseline. Triggers model refresh protocol. | P1 | ML Ops / Governance |
| Croston's Method for Irregular Retailers | For low-frequency / seasonal retailers, apply intermittent demand modeling to avoid penalizing valid but sparse purchase patterns. | P2 | ML Platform |
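The "top 3 positive + top 3 negative drivers" output of the Score Explainability Module can be sketched as a simple selection over per-feature contributions. This sketch assumes the contributions have already been computed (for example by a SHAP explainer) and oriented so that a positive value helps the retailer's score; that sign convention, the function name, and the feature names in the usage note are assumptions for illustration.

```python
from typing import Dict, List


def top_reason_codes(contributions: Dict[str, float], k: int = 3) -> Dict[str, List[str]]:
    """Pick the k strongest positive and k strongest negative drivers.

    `contributions` maps feature name -> signed contribution, where a
    positive value is assumed to improve the retailer's score.
    """
    positives = [(name, v) for name, v in contributions.items() if v > 0]
    negatives = [(name, v) for name, v in contributions.items() if v < 0]
    # Largest positive values first; most negative values first.
    positives.sort(key=lambda item: item[1], reverse=True)
    negatives.sort(key=lambda item: item[1])
    return {
        "positive": [name for name, _ in positives[:k]],
        "negative": [name for name, _ in negatives[:k]],
    }
```

For example, given contributions of `{"zero_delay_rate": 0.30, "p90_delay_days": -0.25}`, the function returns `zero_delay_rate` as the only positive driver and `p90_delay_days` as the only negative one. Mapping these feature names to plain-language explanations would be a separate lookup step.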
Layer 3 — Lender Integration & API
| Feature | Description | Priority | Target User |
|---|---|---|---|
| AltScore API (REST) | Single endpoint: submit retailer GST/phone → receive score band, PD estimate, credit limit, reason codes, and data freshness indicator. <2s SLA. | P0 | NBFC Engineering |
| Lender Dashboard | Web portal for NBFC credit officers: portfolio view, individual retailer profiles, score history, and manual override with audit log. | P1 | NBFC (Lakshmi) |
| Sandbox Environment | Fully synthetic test environment with 10,000 retailer profiles. Allows NBFC to validate integration before production access. | P1 | NBFC Engineering |
| Webhook Notifications | Push score refresh events to lender systems when retailer profile updates materially (e.g., 2+ month data refresh). | P2 | NBFC Engineering |
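To make the AltScore API row concrete, the response described there (score band, PD estimate, credit limit, reason codes, data freshness indicator) could be modeled roughly as below. Every field name, the ISO-date representation of data freshness, and the example values are hypothetical; the document does not specify a schema. The `model_version` field reflects the guardrail that a model version is stamped on every API response.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

VALID_BANDS = ("A", "B", "C", "D")


@dataclass
class ScoreResponse:
    """Hypothetical shape of a single AltScore API response."""

    risk_band: str
    probability_of_default: float
    recommended_credit_limit_inr: int
    positive_reasons: List[str]
    negative_reasons: List[str]
    data_as_of: str  # ISO date of the latest ingested transaction (assumed format)
    model_version: str

    def __post_init__(self) -> None:
        if self.risk_band not in VALID_BANDS:
            raise ValueError(f"risk_band must be one of {VALID_BANDS}")
        if not 0.0 <= self.probability_of_default <= 1.0:
            raise ValueError("probability_of_default must be within [0, 1]")
        if len(self.positive_reasons) > 3 or len(self.negative_reasons) > 3:
            raise ValueError("at most 3 reason codes per direction")

    def to_json(self) -> str:
        """Serialize the response for the REST payload."""
        return json.dumps(asdict(self), ensure_ascii=False)
```

Validating in `__post_init__` means a malformed score can never be serialized, which keeps the contract with lender systems explicit.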
System Architecture
Storage & Compute
Modeling & Serving
AI Engine — Feature Design
The AltScore model is built on behavioral features extracted from distributor transaction data. Features are organized into five signal families:
| Signal Family | Features | Credit Interpretation | Data Source |
|---|---|---|---|
| Payment Behavior | Avg. payment delay (days), P50/P90 delay, zero-delay rate, partial payment frequency, bounced payment count | Willingness to pay; liquidity stress signals | Distributor ledger |
| Order Consistency | Order frequency (weekly/monthly), inter-order gap std dev, longest order gap, active months in last 12 | Business continuity and loyalty signals | Invoice history |
| Business Trajectory | 12-month GMV trend slope, basket size YoY growth, SKU category upgrade rate, new SKU adoption velocity | Ability to repay — business health | Invoice line items |
| Seasonal Pattern | Seasonal demand elasticity, festival spike ratio, post-festival payment lag, monsoon dip depth | Predictability; enables seasonal credit structuring | Invoice + calendar |
| Relationship Depth | Distributor tenure (months), multi-distributor presence, dispute/return rate, credit term utilization | Trust signal; network embeddedness | ERP master data |
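As an illustration of the Payment Behavior family in the table above, the sketch below computes average delay, P50/P90 delay, and zero-delay rate from a list of per-invoice payment delays. The input format (delay in days per invoice, with 0 or less meaning paid on or before the due date), the nearest-rank percentile definition, and the output key names are assumptions made for this example.

```python
from statistics import mean
from typing import Dict, List


def nearest_rank_percentile(sorted_values: List[int], q: int) -> int:
    """Nearest-rank percentile (q in 1..100) over an ascending list."""
    n = len(sorted_values)
    rank = (q * n + 99) // 100  # integer ceiling of q * n / 100
    return sorted_values[max(rank, 1) - 1]


def payment_features(delays_days: List[int]) -> Dict[str, float]:
    """Summarize per-invoice payment delays into behavioral features."""
    if not delays_days:
        raise ValueError("at least one invoice is required")
    ordered = sorted(delays_days)
    on_time = sum(1 for d in ordered if d <= 0)
    return {
        "avg_delay_days": mean(ordered),
        "p50_delay_days": nearest_rank_percentile(ordered, 50),
        "p90_delay_days": nearest_rank_percentile(ordered, 90),
        "zero_delay_rate": on_time / len(ordered),
    }
```

For delays of `[0, 0, 2, 5, 10, 0, 3, 30, 1, 0]` days, this yields an average of 5.1 days, a P50 of 1 day, a P90 of 10 days, and a zero-delay rate of 0.4. The gap between P50 and P90 is what separates a retailer who is occasionally very late from one who is consistently slightly late.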
The ground truth label for model training is 90-day repayment behavior from NBFC partner portfolios — retailers who already received loans. The model learns: "retailers who look like THIS in distributor data repay loans like THAT." Cold-start retailers with <6 months of data receive a rules-based conservative limit until sufficient signal accumulates.
MODEL DESIGN PRINCIPLE
Model Selection Rationale
XGBoost Classifier
Handles missing features gracefully (irregular ERPs). Built-in feature importance. Well-understood by credit risk auditors. Fast inference (<50ms).
Croston's Method
Designed for intermittent demand. Prevents under-scoring retailers with valid but seasonal buying patterns (e.g., festival goods, agri-input shops).
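For readers unfamiliar with Croston's method, a minimal sketch follows. It smooths the size of non-zero orders and the interval between them separately, updating both only in periods where an order occurs, and forecasts demand per period as their ratio. The initialization (first non-zero order and the periods elapsed until it) and the default `alpha` are common textbook choices, assumed here; how AltScore would feed the result into scoring is not specified in this document.

```python
from typing import Sequence


def croston_forecast(demand: Sequence[float], alpha: float = 0.1) -> float:
    """Croston's method: expected demand per period for an intermittent series."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    size = None       # smoothed non-zero demand size
    interval = None   # smoothed number of periods between non-zero demands
    periods_since = 1  # periods elapsed since the last non-zero demand
    for y in demand:
        if y > 0:
            if size is None:
                # Initialize from the first observed order.
                size, interval = float(y), float(periods_since)
            else:
                size = alpha * y + (1 - alpha) * size
                interval = alpha * periods_since + (1 - alpha) * interval
            periods_since = 1
        else:
            periods_since += 1
    if size is None:
        return 0.0  # no orders observed at all
    return size / interval
```

A retailer who orders ₹100 of goods every third month gets a forecast of about ₹33.3 per month rather than being treated as inactive in two months out of three, which is the behavior the rationale above is after.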
Isolation Forest
Detects gaming behavior — sudden order inflation before scoring event. Flags synthetic payment patterns. Output fed as a feature into primary model.
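The Isolation Forest detector itself is not sketched here. As a simpler stand-in for the same idea, the rule stated in the guardrails (flag order velocity more than 3σ above the 12-month baseline) can be written as a z-score check. The function name, the use of monthly order values as the velocity measure, and the choice of sample standard deviation are assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def is_order_spike(baseline: Sequence[float], current: float, sigmas: float = 3.0) -> bool:
    """Flag `current` if it exceeds the baseline mean by more than `sigmas` std devs."""
    if len(baseline) < 2:
        raise ValueError("baseline needs at least two periods")
    mu = mean(baseline)
    sd = stdev(baseline)  # sample standard deviation
    if sd == 0:
        # A perfectly flat baseline: any increase counts as a spike.
        return current > mu
    return (current - mu) / sd > sigmas
```

With a 12-month baseline hovering around 100 (for example `[100, 110, 95, 105, 100, 98, 102, 107, 99, 101, 103, 100]`), a month at 108 is not flagged while a month at 300 is. A one-sided test is used because only inflation, not a drop in orders, is the gaming pattern of interest.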
AI Guardrails & Governance
AltScore makes decisions that directly affect livelihoods. A rejected score or an inflated credit limit can harm the retailer, the lender, or both. Guardrails are not optional — they are first-class product requirements.
Bias & Fairness
- Quarterly fairness audits segmented by state, district tier, and gender proxy (shop owner name)
- Disparate impact testing — score distributions must not deviate >15% between comparable cohorts
- No direct use of geography as a feature; use only behavioral signals that are geographically neutral
- Escalation protocol if bias is detected — model freeze until root cause resolved
Gaming & Fraud Detection
- Isolation Forest anomaly detector flags sudden order velocity spikes (>3σ from 12-month baseline)
- Return rate counter-signal: inflated orders are often partially returned
- Payment behavior cannot be gamed — delayed payments are the strongest negative signal
- 60-day look-back window for scoring event prevents short-term manipulation
Explainability & Grievance Redressal
- Every score output includes mandatory top-3 positive and top-3 negative reason codes (SHAP-derived)
- Reason codes mapped to plain-language Hindi/English explanations for retailer-facing communication
- Grievance redressal API endpoint — retailer can flag a score they believe is incorrect
- Manual review workflow for all grievances with 7-day SLA
Credit Limit Safeguards
- Credit limit capped at 30% of trailing 6-month GMV: a hard ceiling that cannot be overridden via API
- Affordability layer checks existing NBFC exposure (via bureau lookup where available)
- Seasonal structuring: higher limits in festival quarters, lower in lean season — aligned to cash flow
- Lender override requires human sign-off + audit log entry
Data Quality & Lineage
- Every data batch tagged with distributor_id, ERP_version, and ingestion_timestamp
- Completeness score computed per distributor — scores marked "low confidence" if <70% completeness
- Anomalous distributor data (e.g., bulk payment recording, duplicate invoices) quarantined before feature computation
- Data lineage graph — auditors can trace any score back to source records
Model Drift Monitoring
- PSI (Population Stability Index) computed monthly; alert if PSI > 0.2
- KS statistic tracked against holdout set — model refresh if KS drops >10 points
- Champion/challenger framework — new model versions run in shadow mode for 30 days before promotion
- Model version stamped on every API response — supports audit reconstruction
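The monthly PSI check listed above can be sketched as follows. The function compares the share of retailers in each score bucket at training time against the current month and sums (actual − expected) × ln(actual / expected) across buckets. Bucketing by risk band A/B/C/D and the small floor used to avoid division by zero are assumptions for this example.

```python
import math
from typing import Sequence


def population_stability_index(
    expected_counts: Sequence[int],
    actual_counts: Sequence[int],
    floor: float = 1e-6,
) -> float:
    """PSI between a baseline and a current distribution over the same buckets."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both distributions must use the same buckets")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    if expected_total == 0 or actual_total == 0:
        raise ValueError("distributions must not be empty")
    psi = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Floor the shares so an empty bucket does not produce log(0).
        pe = max(e / expected_total, floor)
        pa = max(a / actual_total, floor)
        psi += (pa - pe) * math.log(pa / pe)
    return psi
```

As a worked case, a training baseline spread evenly across four bands (`[25, 25, 25, 25]`) against a current month of `[10, 20, 30, 40]` gives a PSI of roughly 0.228, which would cross the 0.2 alert threshold.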
Regulatory Compliance Map
| Regulation | Requirement | AltScore Implementation |
|---|---|---|
| DPDP Act 2023 | Explicit consent for personal data processing | WhatsApp-first consent flow, revocable, audit trail stored for 7 years |
| RBI FLDG Guidelines | Reason codes mandatory for credit decisions | SHAP reason codes on every score output, mapped to RBI-recognized categories |
| RBI Account Aggregator | Consent-based financial data sharing | AA framework integration as supplementary data source (v2 roadmap) |
| PCI-DSS | No payment card data in scoring pipeline | Confirmed — AltScore uses only trade transaction data, no card data processed |
Product Development Lifecycle (PDLC)
AltScore follows an AI-specific PDLC with mandatory governance gates at each phase. No phase advances without sign-off from Product, Data Science, Legal, and Risk.
Discovery & Signal Validation
- Sign 3 pilot distributor data agreements (auto parts, FMCG, general trade)
- Extract 24 months of historical transaction data per distributor
- Build data quality scorecard — assess feature extractability
- Map ERP schema variations across Tally, SAP B1, custom systems
- Validate that behavioral features are predictive (correlation analysis with any available repayment data)
Model Development & Validation
- Source labeled training data from NBFC partner (min. 5,000 retailer loan outcomes)
- Build feature pipeline (Bronze → Silver → Gold layers)
- Train XGBoost classifier with cross-validation; baseline with logistic regression
- SHAP integration — reason code generation and validation with credit team
- Fairness audit on training cohort — test for geographic and demographic bias
- Score distribution analysis and band calibration (A/B/C/D)
Pilot — 100 Retailers (Shadow Mode)
- Score 100 retailers from pilot distributors without making credit decisions
- NBFC partner uses both Bureau score and AltScore in parallel — neither drives decision yet
- Track: score correlation with actual lender judgment, reason code acceptance rate
- Retailer consent flow live — measure consent rate and drop-off points
- Guardrail testing: inject synthetic gaming patterns, validate detection rate
Pilot — Live Lending (500 Retailers)
- AltScore drives credit limit recommendations; NBFC retains final approval authority
- Disburse loans to Band A and B retailers; monitor 30/60/90-day repayment
- Grievance redressal flow live — track complaint rate and resolution time
- Model performance monitoring — PSI, KS, actual default rate vs. predicted
- Lender NPS survey at 60 days
Scale — API Launch & Lender Onboarding
- Public API launch with sandbox environment
- Onboard 3 additional NBFC/fintech lenders
- Distributor partnership program — revenue share model for data contributors
- Expand to 10,000 retailers across 3 geographies
- Continuous model retraining pipeline (monthly cadence)
Success Metrics
North Star Metric
This metric captures the full pipeline working end-to-end — data ingested, model scored, lender integrated, loan disbursed. It can only improve if every layer functions correctly.
Leading Indicators
| Metric | Target (M6) |
|---|---|
| Retailers profiled (data ingested) | 25,000 |
| Retailers scored (AltScore issued) | 15,000 |
| Consent rate (retailer opt-in) | ≥60% |
| API latency (p95) | <2s |
| Score completeness rate | ≥85% |
| Lenders integrated (live) | 2 |
Model Quality Metrics
| Metric | Threshold |
|---|---|
| AUROC (holdout set) | ≥ 0.72 |
| KS Statistic | ≥ 0.35 |
| Actual vs. Predicted PD Delta | ≤ +2% |
| PSI (monthly) | < 0.2 |
| Reason code acceptance (lender) | ≥ 70% |
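The KS statistic in the table above measures how well the model separates defaulters from repayers: it is the maximum gap between the two groups' cumulative score distributions. A minimal sketch, assuming the inputs are predicted default probabilities for each group on the holdout set (the function and parameter names are hypothetical):

```python
from typing import Sequence


def ks_statistic(scores_defaulted: Sequence[float], scores_repaid: Sequence[float]) -> float:
    """Two-sample KS statistic: max gap between the two empirical CDFs."""
    if not scores_defaulted or not scores_repaid:
        raise ValueError("both groups must be non-empty")
    thresholds = sorted(set(scores_defaulted) | set(scores_repaid))
    n_bad, n_good = len(scores_defaulted), len(scores_repaid)
    max_gap = 0.0
    for t in thresholds:
        # Share of each group scoring at or below the threshold.
        cdf_bad = sum(1 for s in scores_defaulted if s <= t) / n_bad
        cdf_good = sum(1 for s in scores_repaid if s <= t) / n_good
        max_gap = max(max_gap, abs(cdf_bad - cdf_good))
    return max_gap
```

The result lies between 0 (no separation) and 1 (perfect separation), so the ≥ 0.35 threshold applies directly. Where this document speaks of KS dropping by "points", that reads as the same quantity on a 0–100 scale. The loop is quadratic for clarity; a sorted merge would be the usual choice at portfolio scale.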
Guardrail Health Metrics
| Metric | Threshold |
|---|---|
| Fairness cohort score divergence | < 15% |
| Gaming detection recall | ≥ 80% |
| Grievance resolution SLA (7 days) | ≥ 90% |
| Data completeness score | ≥ 70% per distributor |
| Consent revocation rate | < 5% |
Risks & Mitigations
| Risk | Category | Likelihood | Impact | Mitigation |
|---|---|---|---|---|
| Distributor data quality is too poor to generate reliable features | Data | High | High | Completeness gate before scoring; rules-based fallback model for low-quality profiles; distributor data improvement program with incentives |
| Distributor fears disintermediation — refuses to share data | Commercial | High | High | Address directly in pitch: credit is working capital for existing channel purchases, not a bridge to direct OEM buying. Reinforce contractually. Lead with revenue upside and balance sheet relief before asking for data. Distributor intelligence dashboard as immediate value-in at zero cost. |
| NBFC unwilling to use AltScore without Bureau score | Commercial | High | High | Position AltScore as supplementary, not replacement; shadow mode pilot to build trust before primary underwriting use |
| Retailer consent rate too low (<30%) | Adoption | Med | High | Distributor-led consent (trusted relationship); clear value prop communicated (loan offer, not surveillance); DPDP-compliant minimal data ask |
| Model predicts well in training but fails in production (distribution shift) | ML | Med | High | Monthly PSI/KS monitoring; champion/challenger shadow mode; rollback protocol if KS drops >10pts in 30 days |
| Regulatory classification — is AltScore a "credit information company"? | Legal / Regulatory | Med | High | Legal opinion secured before pilot launch; structure as "decisioning analytics" SaaS, not CIC; monitor RBI guidance on alternative data |
| Score gaming once model behavior becomes widely known | Adversarial | Low | Med | Isolation Forest anomaly detection; feature opacity (reason codes shown, not raw weights); 60-day look-back window; payment behavior ungameable |
| Data breach exposing retailer transaction history | Security | Low | High | Data encrypted at rest and in transit; role-based access; no PII in scoring pipeline; annual penetration testing; SOC 2 Type II certification (12-month target) |
Product Roadmap
v1.0 — Foundation
- ERP Connector (Tally + CSV)
- Consent management (WhatsApp)
- Feature pipeline (40 signals)
- AltScore v1 model (XGBoost)
- SHAP reason codes
- Score API v1 (REST)
- Distributor intelligence dashboard (v1)
- 3 pilot distributor agreements signed
- 1 NBFC pilot: 500 retailers
v2.0 — Scale
- SAP B1 native connector
- Lender dashboard (web)
- AA Framework integration
- Multi-lender API (3 NBFCs)
- Model v2 (larger training set)
- Distributor data portal (self-serve)
- Distributor revenue share program live
- 10,000 scored retailers
v3.0 — Intelligence
- Dynamic credit limit updates (monthly)
- Seasonal loan structuring engine
- Retailer-facing score card (WhatsApp)
- Cross-distributor profile merging
- Distributor competitive benchmarking report
- Embedded insurance data feed
- 1,00,000 retailer target
- SOC 2 Type II certification
Every retailer who receives their first formal loan through AltScore is a person who was invisible to the financial system before. That is the product's north star — not AUM, not API calls, not revenue. Credit access at scale, with integrity.
PRODUCT PRINCIPLE — ALTSCORE