Security Design Principles
AltScore processes sensitive commercial transaction data belonging to distributors and retailers. A breach or misuse harms more than a single company: it erodes the trust of the small businesses the platform exists to serve. Security is therefore a first-class product constraint, not an afterthought.
The security model is built around one foundational assumption: every component can be compromised. Defense is layered so that the compromise of any single layer does not result in a material breach. No single key, credential, or service is a single point of failure.
Least Privilege
Every service, user, and API consumer receives the minimum access required for their function. No shared admin credentials. Role boundaries enforced at infrastructure level, not application level.
Defense in Depth
Security controls applied at every layer — network, application, data, and identity. Compromise of one layer does not cascade. Layers include: network perimeter, WAF, application auth, field-level encryption, audit logging.
Data Minimization
Only the data necessary to compute credit features is ingested. PII is pseudonymized before entering the feature pipeline. Raw ERP data never enters the ML scoring layer.
Immutability of Raw Data
Bronze-layer raw data is append-only and immutable. Audit lineage requires the ability to reconstruct any score from source. No destructive writes permitted in the raw storage tier.
Explicit Consent as a Security Control
Consent records are cryptographically signed and stored immutably. Processing any retailer's data without a valid consent record is a system-level block, not a policy aspiration.
Assume Breach
Detection and containment capabilities are designed for the scenario where an attacker has already gained some foothold. Segment everything. Alert on lateral movement. Mean Time to Detect (MTTD) is a primary operational metric.
Authentication & Authorization
Identity Tiers
| Actor | Auth Mechanism | MFA Required | Session / Token TTL | Access Scope |
|---|---|---|---|---|
| NBFC Lender (API) | Signed JWT bearer token (RS256) + client secret | N/A (machine-to-machine) | Access token: 1 hour; refresh: 24 hours | Score API read; own portfolio data only |
| NBFC Credit Officer (Dashboard) | SSO via SAML 2.0 / OIDC (lender's IdP) | Yes — enforced by lender IdP | Session: 8 hours; idle timeout: 30 min | Own lender portfolio; no cross-lender data |
| Distributor (Portal) | Email OTP + phone OTP (two-factor) | Yes — OTP is MFA | Session: 4 hours; idle timeout: 20 min | Own retailer network data only |
| AltScore Internal (Admin) | Hardware security key (FIDO2/WebAuthn) + SSO | Yes — phishing-resistant hardware key mandatory | Session: 4 hours; re-auth for sensitive actions | Role-gated (see RBAC below) |
| ERP Connector (Service Account) | Mutual TLS (mTLS) client certificate + API key | N/A (service account) | Certificate rotation: 90 days; key rotation: 30 days | Data ingestion only; write to Bronze lake; no read of other distributors |
| ML Pipeline (Service Account) | IAM role-based (AWS/Azure managed identity) | N/A (platform-managed) | Short-lived credentials: 1 hour TTL | Read Silver; write Gold; no raw Bronze PII access |
RBAC Matrix — Internal Roles
| Internal Role | Raw ERP Data | Feature Layer | Score Store | Model Artifacts | Consent Records | Audit Logs |
|---|---|---|---|---|---|---|
| Data Engineer | Read (own dist. only) | Read/Write | No | No | No | Read |
| ML Engineer | No | Read | Read | Read/Write | No | Read |
| Credit Risk Analyst | No | Read (aggregated) | Read | Read | No | Read |
| Legal / Compliance | No | No | No | No | Read | Read |
| Security / Audit | No | No | No | No | Read | Read/Write |
| Engineering Ops | No | No | No | No | No | Read |
| Super Admin | Restricted + audit | Restricted + audit | Restricted + audit | Restricted + audit | Restricted + audit | Read/Write |
Super Admin access to raw data always requires a four-eyes approval (a second admin must approve the access request) and generates an immutable audit entry. There is no unilateral admin access to raw retailer or distributor data.
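The four-eyes rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not AltScore's implementation: `AccessRequest`, `approve`, and the in-memory `audit_log` list are hypothetical names, and the entry fields mirror those listed for the "Admin data access" audit event. A real system would write to the append-only audit store rather than a Python list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical request for Super Admin access to a raw-data resource."""
    requester: str
    resource: str
    justification_ticket: str


def approve(request, approver, audit_log):
    """Grant access only when a second, different admin approves.

    Every decision, granted or denied, appends an audit entry.
    """
    granted = approver != request.requester
    audit_log.append({
        "admin_user_id": request.requester,
        "resource_accessed": request.resource,
        "justification_ticket": request.justification_ticket,
        "approver_id": approver,
        "granted": granted,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return granted
```

Logging denied attempts as well as granted ones is a design choice in this sketch: a self-approval attempt is itself a signal worth alerting on.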
API Authorization Constraints
- Each lender API key is bound to a specific lender_id at issuance
- Score queries return only retailers that lender has been granted access to via consent
- Cross-lender data leakage is prevented at the DB query layer (row-level security), not just application logic
- API keys are non-transferable and cannot be delegated
- All JWTs validated for: signature (RS256), expiry (exp), issuer (iss), audience (aud)
- Revocation list checked on every request — no grace period after revocation
- Token claims are immutable — no server-side claim augmentation after issuance
- Refresh tokens are one-time use; rotating refresh token scheme
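The per-request claim checks listed above might look like the following sketch. It assumes the RS256 signature has already been verified by a JWT library and only covers the `iss`, `aud`, `exp`, and revocation checks. The function name `check_claims`, the use of a `jti` claim as the revocation key, and the set-based revocation list are illustrative assumptions, not details from this document.

```python
import time


def check_claims(claims, *, issuer, audience, revoked_jti, now=None):
    """Return None if the claims pass, else a short rejection reason.

    Assumes the token signature was verified before this call.
    """
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        return "bad issuer"
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        return "bad audience"
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        return "expired"
    # Checked on every request, so revocation takes effect with no grace period.
    if claims.get("jti") in revoked_jti:
        return "revoked"
    return None
```

A caller would reject the request on any non-`None` result and log the reason under the "Failed authentication" audit event.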
Encryption & Key Management
Encryption Standards by Layer
Key Management Requirements
| Key Type | Storage | Rotation Period | Access | Backup |
|---|---|---|---|---|
| API Signing Key (RS256 private) | HSM-backed KMS | 6 months | Auth service only; no human access | Geo-replicated KMS |
| Distributor CMK (data lake) | AWS KMS / Azure Key Vault | Annual | Automated only; no console access | Cross-region replication |
| ERP Connector Client Cert | Per-connector secure keystore | 90 days | Connector service only | Revocable via CRL |
| Field-Encryption DEK | Envelope-encrypted with KEK in KMS | Monthly re-wrapping | Application service account only | KEK backup in secondary region |
| Consent Signing Key (ECDSA) | HSM | Annual | Consent service only | Cold backup — not network-accessible |
| Dashboard TLS Certificate | Certificate Manager (auto-renew) | 90 days (auto) | Load balancer / CDN | Auto-renewed before expiry |
Network Security
Network Segmentation
The AltScore infrastructure is segmented into four isolated network zones. No direct connectivity between zones — all cross-zone traffic routes through an explicit gateway or service mesh with mutual authentication.
| Zone | Contents | Internet-Accessible |
|---|---|---|
| DMZ | API Gateway, WAF, Load Balancer, CDN | Yes (controlled) |
| Application | Score API, Consent Service, Dashboard Backend | No |
| Data | Data Lake, Feature Store, Score DB, Redis Cache | No |
| ML Compute | Training cluster, MLflow, Model registry | No — egress via proxy only |
Perimeter Controls
- OWASP Top 10 rule set enforced at WAF
- Rate limiting: 100 req/min per lender API key; 1000 req/min per IP at WAF
- Geo-blocking configurable at WAF — default: India only for production
- SQL injection, XSS, and path traversal patterns blocked at ingress
- Bot detection: minimum CAPTCHA score threshold for dashboard login
- Cloud-native DDoS shield (AWS Shield Advanced / Azure DDoS Standard)
- Volumetric attack mitigation at network layer — upstream from application
- API-layer throttling independent of network-layer protection
- Automatic traffic scrubbing for sustained attack patterns
ERP Connector Network Requirements
The ERP Connector SDK operates inside distributor on-premise or cloud environments. Network security must be enforced for both the outbound connection from the distributor and the inbound data ingestion endpoint at AltScore.
| Requirement | Specification | Rationale |
|---|---|---|
| Outbound-only connection | Connector initiates HTTPS; AltScore never initiates inbound connection to distributor network | No VPN or firewall hole required in distributor environment |
| IP allowlisting | AltScore publishes a static IP range for ingestion endpoints; distributors can allowlist it for connector egress | Restricts connector traffic to known AltScore endpoints, reducing phishing and man-in-the-middle risk |
| mTLS on ingestion endpoint | Client certificate required; self-signed certs not accepted | Prevents unauthenticated data injection |
| Payload size limit | Max 50MB per batch; connector enforces chunking above this threshold | Prevents resource exhaustion on ingestion endpoint |
| Retry with backoff | Exponential backoff; max 5 retries; failure logged and alerted | Prevents thundering-herd on transient network failures |
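The retry requirement in the table above can be illustrated as follows. This is a sketch, not the connector's actual code: `send_with_retry` and `backoff_delays` are hypothetical names, the 1-second base and 60-second cap are assumed values, and the random jitter is an addition of this example (the table specifies only exponential backoff with a maximum of 5 retries).

```python
import random
import time


def backoff_delays(max_retries=5, base=1.0, cap=60.0, rng=random.random):
    """Yield one jittered delay per retry: up to base * 2**attempt, capped."""
    for attempt in range(max_retries):
        yield min(cap, base * 2 ** attempt) * rng()


def send_with_retry(send, batch, *, max_retries=5, sleep=time.sleep,
                    rng=random.random):
    """Call send(batch); on ConnectionError retry up to max_retries times."""
    delays = backoff_delays(max_retries=max_retries, rng=rng)
    while True:
        try:
            return send(batch)
        except ConnectionError:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: the caller logs and alerts
            sleep(delay)
```

Jitter spreads retries from many connectors over time; without it, connectors that fail together would also retry together.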
Data Access Controls
AltScore holds sensitive commercial data from multiple distributors and retailers. Strict logical isolation between data owners is a non-negotiable requirement — not just a policy, but an engineering constraint enforced at the database level.
Data Isolation Architecture
- Each distributor's raw data partitioned in a separate S3 prefix / ADLS container with its own CMK
- Service accounts for each distributor's connector have write access only to their own partition
- Bronze-layer queries require explicit distributor_id scope — no wildcard reads
- Cross-distributor data joins only permitted in the pseudonymized Silver layer, after identity resolution
- Row-level security (RLS) enforced in Score Store PostgreSQL — every query is automatically scoped to the requesting lender_id
- Lender A cannot enumerate, infer, or access retailer scores belonging to Lender B
- API responses never include competitor lender identifiers or exposure amounts
- Lender dashboard data is server-side rendered with lender scope — no client-side filtering
- GST number and phone pseudonymized to UUID before entering Silver/Gold layers
- Identity resolution store (UUID ↔ GST/phone) is physically separate, access-restricted
- ML pipeline never accesses the identity resolution store — operates entirely on UUIDs
- Score API re-maps UUID → retailer identity only at response time, via a separate lookup service
- Feature computation jobs run in isolated compute environments per distributor batch
- Shared infrastructure (Spark, Airflow) uses resource tagging to prevent job-level data mixing
- Score Store views enforce RLS; direct table access requires separate privileged role with audit
- Redis cache keys namespaced by lender_id; no shared cache keys across lenders
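The pseudonymization step in the list above can be sketched as below. The class `IdentityResolutionStore`, the function `to_silver`, and the field names `gst_number` and `retailer_uuid` are hypothetical, and the in-memory dictionaries stand in for the physically separate, access-restricted store the requirements call for.

```python
import uuid


class IdentityResolutionStore:
    """In-memory stand-in for the separate UUID <-> GST mapping store."""

    def __init__(self):
        self._to_uuid = {}
        self._to_identity = {}

    def pseudonymize(self, gst_number):
        """Return a stable random UUID for this GST number."""
        if gst_number not in self._to_uuid:
            pseudo = str(uuid.uuid4())
            self._to_uuid[gst_number] = pseudo
            self._to_identity[pseudo] = gst_number
        return self._to_uuid[gst_number]

    def resolve(self, pseudo_id):
        """Reverse lookup; only the response-time lookup service should call this."""
        return self._to_identity[pseudo_id]


def to_silver(record, store):
    """Copy a Bronze record, replacing the GST number with its UUID."""
    silver = {k: v for k, v in record.items() if k != "gst_number"}
    silver["retailer_uuid"] = store.pseudonymize(record["gst_number"])
    return silver
```

Using a random UUID rather than a hash of the GST number means the pseudonym cannot be reversed by enumerating GST numbers; reversal requires access to the mapping store.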
Sensitive Data Handling Rules
| Data Type | Classification | Masking in Logs | Masking in Dashboard UI | Permitted Export |
|---|---|---|---|---|
| Retailer GST Number | Highly Sensitive | Last 4 chars only | Partial (XXXX...R1ZX) | No — API response only |
| Retailer Phone | Highly Sensitive | Fully masked | Last 4 digits only | No |
| Invoice Values | Sensitive | Aggregated only | Range bands, not exact | Aggregated reports only |
| AltScore Value | Sensitive | Logged (score only) | Shown in full to authorized lender | Lender's own portfolio — yes |
| Probability of Default | Sensitive | Logged (value only) | Shown to authorized lender | Lender's own portfolio — yes |
| Reason Codes | Internal | Logged | Shown to lender + retailer (translated) | Yes — per consent scope |
| Model Weights/Features | Trade Secret | Never logged | Never shown | No |
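A few of the masking rules in the table above can be expressed as small helpers. This is an illustrative sketch: the function names, the `*` mask character, and the 10,000-unit band width for invoice values are assumptions made for the example.

```python
def mask_gst_for_log(gst):
    """Logs keep only the last 4 characters of a GST number."""
    return "*" * max(len(gst) - 4, 0) + gst[-4:]


def mask_phone_for_log(phone):
    """Logs never contain any part of a phone number."""
    return "*" * len(phone)


def mask_phone_for_ui(phone):
    """The dashboard shows only the last 4 digits of a phone number."""
    return "*" * max(len(phone) - 4, 0) + phone[-4:]


def invoice_band(amount, width=10_000):
    """Show an invoice value as a range band rather than an exact figure."""
    low = int(amount // width) * width
    return f"{low}-{low + width - 1}"
```

For example, `mask_gst_for_log("27ABCDE1234F1Z5")` yields eleven mask characters followed by `F1Z5`.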
ERP Connector Security
The ERP Connector SDK is deployed inside distributor environments — a context AltScore does not control. The connector is the highest-risk ingestion surface and must be designed to be secure even when the host environment is compromised.
Connector Design Constraints
- SDK requests only SELECT permissions to relevant ERP tables — never INSERT, UPDATE, DELETE
- ERP database credentials stored in the SDK's encrypted local keystore, not in plaintext config files
- Schema is documented and reviewed; any deviation from expected schema triggers a quarantine and alert
- SQL queries are parameterized — no dynamic query construction from ERP data fields
Tamper-Resistance
- Every data batch signed by connector before transmission (HMAC-SHA256 using per-connector key)
- Signature verified at ingestion endpoint before data is written to Bronze layer
- Unsigned or invalid-signature batches are rejected and the distributor is notified
- Connector binary is code-signed; SDK updates delivered via signed package only
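Batch signing and verification with HMAC-SHA256 can be sketched with the Python standard library. The canonical JSON serialization, the function names `sign_batch` and `verify_batch`, and the hex encoding of the signature are assumptions of this example; the document specifies only HMAC-SHA256 with a per-connector key.

```python
import hashlib
import hmac
import json


def _canonical_bytes(records):
    """Serialize deterministically so both sides sign identical bytes."""
    return json.dumps(records, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_batch(records, key):
    """Connector side: return the hex HMAC-SHA256 of the batch."""
    return hmac.new(key, _canonical_bytes(records), hashlib.sha256).hexdigest()


def verify_batch(records, signature, key):
    """Ingestion side: recompute and compare in constant time."""
    expected = sign_batch(records, key)
    return hmac.compare_digest(expected, signature)
```

The ingestion endpoint would call `verify_batch` before any write to the Bronze layer and reject the batch on a `False` result.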
Data Extraction Scope Limits
| ERP Table / Module | Fields Extracted | Fields Excluded (Always) |
|---|---|---|
| Sales Ledger | Invoice date, invoice amount, retailer ID, payment date, payment amount | Bank account numbers, internal distributor margin, credit notes to third parties |
| Item / SKU Master | SKU category code, order quantity, unit price band | Distributor cost price, supplier contracts, internal SKU profitability |
| Retailer Master | Retailer GST, shop name, locality, district, active status | Personal data beyond business identity, owner's personal income/assets |
| Payment Terms | Credit days agreed, overdue flag, payment method type (UPI/cash/cheque) | Cheque numbers, bank account details, personal guarantor data |
The data extraction scope is contractually fixed in the distributor data partnership agreement. Any modification to the schema extraction list requires a new agreement version and explicit distributor consent. The SDK enforces scope technically — schema drift triggers a halt-and-alert, not a silent expansion of data collection.
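Technical scope enforcement of this kind can be illustrated with an allowlist check. The sketch below is hypothetical: the snake_case table and field names are invented renderings of two rows from the table above, and `SchemaDriftError` and `enforce_scope` are illustrative names. The point is that an unexpected field stops the batch rather than being silently dropped or forwarded.

```python
# Hypothetical allowlist derived from the extraction scope table.
ALLOWED_FIELDS = {
    "sales_ledger": {
        "invoice_date", "invoice_amount", "retailer_id",
        "payment_date", "payment_amount",
    },
    "payment_terms": {"credit_days", "overdue_flag", "payment_method"},
}


class SchemaDriftError(Exception):
    """Raised when a row falls outside the agreed extraction scope."""


def enforce_scope(table, row):
    """Return the row unchanged if it is in scope; otherwise halt."""
    allowed = ALLOWED_FIELDS.get(table)
    if allowed is None:
        raise SchemaDriftError(f"table not in scope: {table}")
    unexpected = sorted(set(row) - allowed)
    if unexpected:
        raise SchemaDriftError(f"unexpected fields in {table}: {unexpected}")
    return row
```

In this sketch the caller is expected to catch `SchemaDriftError`, quarantine the batch, and raise the alert.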
ML Pipeline Security
The ML scoring pipeline introduces attack surfaces that are distinct from traditional software — adversarial inputs, model poisoning, training data leakage, and score inversion attacks. Each is addressed as a security constraint, not just a model quality concern.
Data Provenance Enforcement
Every training data batch carries a cryptographic hash. The training pipeline validates hashes before use. Any batch with a modified hash is rejected and an alert is raised. NBFC partner data is ingested via a separate authenticated channel and stored in an isolated training store.
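A hash check of this kind might be sketched as follows, assuming SHA-256 digests recorded in a manifest at ingestion time. The manifest structure and the names `batch_digest` and `accept_batches` are assumptions for illustration; the document does not specify the hash algorithm or where the hashes are stored.

```python
import hashlib


def batch_digest(data):
    """Hex SHA-256 digest of a batch's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def accept_batches(batches, manifest):
    """Split batch IDs into (accepted, rejected) by comparing digests.

    A batch is rejected if its digest differs from the manifest entry
    or if it has no manifest entry at all.
    """
    accepted, rejected = [], []
    for batch_id, data in batches.items():
        if manifest.get(batch_id) == batch_digest(data):
            accepted.append(batch_id)
        else:
            rejected.append(batch_id)
    return accepted, rejected
```

The training job would proceed only with the accepted IDs and raise an alert for every rejected one.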
Output Minimization
API response exposes score, risk band, credit limit, and reason codes — not raw feature values. Reason codes are categorical (e.g., "payment_delay_p90_gt_15days") — not the exact numeric value that would allow reconstructing the feature. Model weights are never exposed via any interface.
Input Validation & Anomaly Detection
Score requests validated for: valid retailer_id format, authorized lender scope, and request rate per retailer (max 3 score queries per retailer per 24h per lender). Isolation Forest anomaly signal is computed before scoring — flagged retailers receive a "review" disposition, not a raw score.
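The request checks described above could be combined into a single gate, sketched below. Several details are assumptions of the example: `ScoreGate` and the disposition strings are hypothetical, `retailer_id` is assumed to be a lowercase UUID, lender scope is modeled as a dictionary of consented retailer IDs, and the Isolation Forest result is passed in as a precomputed boolean rather than computed here.

```python
import re
from collections import defaultdict, deque

UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)
DAY_SECONDS = 24 * 3600
MAX_QUERIES_PER_DAY = 3


class ScoreGate:
    def __init__(self, consented):
        # consented: lender_id -> set of retailer IDs the lender may query
        self.consented = consented
        self._queries = defaultdict(deque)

    def decide(self, lender_id, retailer_id, anomaly_flagged, now):
        """Return a rejection reason, "review", or "score"."""
        if not UUID_RE.match(retailer_id):
            return "reject:bad_retailer_id"
        if retailer_id not in self.consented.get(lender_id, set()):
            return "reject:out_of_scope"
        recent = self._queries[(lender_id, retailer_id)]
        while recent and now - recent[0] >= DAY_SECONDS:
            recent.popleft()  # drop queries older than the 24h window
        if len(recent) >= MAX_QUERIES_PER_DAY:
            return "reject:rate_limited"
        recent.append(now)
        return "review" if anomaly_flagged else "score"
```

The per-retailer cap limits how many score observations a single lender can collect on one retailer in a day, which is the property that makes repeated probing of the model expensive.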
Model Artifact Security
| Artifact | Storage | Access Control | Integrity Check |
|---|---|---|---|
| Trained model binary (XGBoost) | MLflow artifact store (encrypted S3) | ML Ops role only; no developer direct access | SHA-256 hash logged at training; verified at load time |
| SHAP explainer object | Same as model binary, versioned together | Same as model binary | Hash validated before serving |
| Feature mean/std scalers | Versioned alongside model in MLflow | Same as model binary | Hash validated; scaler version must match model version |
| Training dataset snapshot | Encrypted S3, isolated training partition | ML Engineer + Security Audit only | Immutable — S3 Object Lock (WORM) |
| Model performance logs | CloudWatch / Azure Monitor | Read: ML Ops, Security; Write: pipeline only | Append-only log group; no log deletion permitted |
Audit Logging & Monitoring
Every access to sensitive data, every score issued, and every consent event must be traceable to an identity and a timestamp. Audit logs are the forensic backbone of the platform and must be tamper-resistant.
Mandatory Audit Events
| Event | Fields Logged | Retention | Alert Trigger |
|---|---|---|---|
| Score API call | lender_id, retailer_id (UUID), timestamp, score_version, response_band, latency_ms, request_ip | 7 years | Rate > 100 req/min per lender |
| Consent grant | retailer_id, distributor_id, timestamp, consent_scope, signature, channel (WhatsApp/web) | 7 years | None (normal event) |
| Consent revocation | retailer_id, revocation_timestamp, requested_by, reason_code | 7 years | Alert to data ops team within 1 hour |
| ERP data batch ingested | distributor_id, batch_id, record_count, data_hash, ingestion_timestamp, schema_version | 7 years | Schema mismatch, hash failure |
| Admin data access | admin_user_id, resource_accessed, justification_ticket, approver_id, timestamp | 7 years | All admin accesses alert to CISO |
| Failed authentication | user/lender_id, IP, timestamp, failure_reason | 2 years | 5 failures in 10 min → account lock + alert |
| Model version promotion | model_version, promoted_by, champion_challenger_metrics, timestamp | 7 years | All promotions alert to Risk team |
| Grievance filed | retailer_id, lender_id, score_queried, grievance_type, timestamp, resolution_status | 7 years | SLA breach (7 days) → escalation alert |
Log Integrity & SIEM
- Audit logs written to append-only log groups (CloudWatch Logs with no-delete policy / Azure Immutable Blob Storage)
- Log events are hash-chained: each record includes the hash of the previous record, so any edit or deletion breaks the chain and is detectable
- Log export to cold storage (S3 Glacier / Azure Archive) after 90 days; deletion requires dual-admin approval
- Log integrity verified weekly via automated hash chain check
- All audit logs fed to SIEM (e.g., Splunk / AWS Security Hub / Microsoft Sentinel)
- Correlation rules: unusual access hours, geographic anomalies, sudden score volume spikes
- P1 alerts: paged to on-call security engineer within 5 minutes
- Weekly automated threat report generated from SIEM for CISO review
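The hash chain and its periodic verification can be sketched as below. The record layout, the all-zero genesis value, and the function names `append_event` and `verify_chain` are assumptions for illustration; a production log would live in the append-only store, not in a Python list.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first record


def _record_hash(prev_hash, event):
    """Hash the previous record's hash together with this event."""
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain, event):
    """Append an event linked to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "event": event,
        "prev_hash": prev_hash,
        "hash": _record_hash(prev_hash, event),
    })


def verify_chain(chain):
    """Recompute every link; return False at the first mismatch."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _record_hash(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

Because each hash covers the previous one, altering or removing any record invalidates every record after it, which is what the weekly automated check would detect.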
Regulatory Compliance Requirements
| Standard / Regulation | Applicable Requirement | Security Control | Status |
|---|---|---|---|
| DPDP Act 2023 | Consent, data minimization, purpose limitation, right to erasure, breach notification (72 hours) | Consent service with cryptographic records; pseudonymization pipeline; deletion workflow; breach runbook | Design complete — implementation in progress |
| RBI FLDG Guidelines | Explainability of credit decisions; audit trail for automated decisions | SHAP reason codes on every score; immutable audit log of all score decisions | Implemented in model design |
| PCI-DSS | No storage, processing, or transmission of payment card data | ERP extraction scope explicitly excludes card data; enforced at SDK level | By design — no card data in scope |
| SOC 2 Type II | Security, Availability, Confidentiality trust service criteria | 12-month audit program; continuous monitoring; vendor assessment program | Target: Month 12 — audit readiness program begins Month 1 |
| ISO 27001 | Information security management system (ISMS) | ISMS policy suite; risk register; vendor management; business continuity plan | Target: Month 18 — gap assessment Month 3 |
| RBI Account Aggregator Framework | Consent-based financial data sharing; AA licensing | AA integration for supplementary bureau data in v2; legal review of AA licensing requirements | v2 roadmap — legal review Month 4 |
Penetration Testing Schedule
| Test Type | Scope | Frequency | Performed By |
|---|---|---|---|
| External Network Pen Test | All internet-facing endpoints (API, dashboard, connector ingestion) | Annual + before each major release | Accredited third-party firm |
| Web Application Pen Test | Dashboard, consent portal, API endpoints | Annual + after major feature changes | Accredited third-party firm |
| ERP Connector Security Review | SDK binary, client certificate handling, data extraction scope | Before each SDK major version release | Internal security + third-party code review |
| ML Adversarial Testing | Score API input fuzzing; model inversion attempts; gaming simulation | Quarterly | Internal ML security team |
| Internal Red Team | Full-scope including insider threat simulation, social engineering | Annual | External red team firm |
Incident Response Plan
A documented, practiced incident response plan is a SOC 2 requirement and a DPDP Act obligation. The plan is reviewed and rehearsed quarterly via tabletop exercises.
Severity Classification
| Severity | Definition | Response Time | Notification |
|---|---|---|---|
| P1 — Critical | Confirmed data breach; unauthorized access to PII or score data; ransomware; full platform outage | 15 min to acknowledge; 1 hour to contain | CISO immediately; legal within 1 hour; regulators within 72 hours (DPDP); affected data owners notified |
| P2 — High | Suspected breach (under investigation); auth system down; model serving down; consent system down | 30 min to acknowledge; 4 hours to resolve or escalate | CISO + Engineering Lead; affected lenders if score API down >30 min |
| P3 — Medium | Anomalous access pattern detected (no confirmed breach); single component degraded; failed pen test finding | 2 hours to acknowledge; 24 hours to remediate | Security team; Engineering lead |
| P4 — Low | Policy violation; non-critical vulnerability; audit finding | Next business day | Security team internal |
Breach Response Workflow
- Isolate affected systems — network segment block at firewall
- Revoke all active API tokens and sessions immediately
- Preserve forensic state — snapshot affected instances before remediation
- Activate war room — CISO, Engineering Lead, Legal on bridge
- Disable data ingestion and score serving if breach scope unclear
- Legal counsel determines regulatory notification scope
- DPDP Act: notify Data Protection Board within 72 hours of confirmed breach
- Notify affected distributors and retailers per breach notification template (pre-approved by legal)
- Notify affected NBFC lenders if their data is in scope
- Publish incident status page update every 2 hours during active incident
- Deploy patched or clean-state infrastructure from IaC (no manual config)
- Re-issue all affected credentials — API keys, certificates, session tokens
- Verify data integrity via hash chain audit before resuming operations
- Staged traffic restoration with enhanced monitoring
- Post-incident review — root cause, detection gap, response time analysis
- Action items with owners and deadlines tracked in security risk register
- Update threat model and relevant security controls
- Board-level incident report if P1; share summary with NBFC partners