------------------------------------
| 1. DATA SOURCES                  |
| Kaggle Telco Churn, IBM CRM      |
| exports, application usage logs, |
| and support ticket transcripts   |
| form the core dataset. These     |
| sources capture behavioral and   |
| transactional signals tied to    |
| customer lifecycle stages. All   |
| inputs undergo schema validation |
| and anomaly screening to ensure  |
| consistency.                     |
------------------------------------
                 |
                 v
------------------------------------
| 2. INGESTION & PREPROCESSING     |
| Batch loaders merge CRM data     |
| with historical usage metrics,   |
| while event streams provide live |
| churn-indicating signals.        |
| Preprocessing handles nulls      |
| through demographic-aware        |
| imputation, reconstructs         |
| interaction sequences, and       |
| normalizes heterogeneous         |
| attributes to ML-ready formats.  |
------------------------------------
                 |
                 v
------------------------------------
| 3. FEATURE ENGINEERING           |
| Tenure indicators, RFM profiles, |
| contract-based risk markers,     |
| sentiment from support messages, |
| and rolling engagement windows   |
| are engineered. SHAP analysis    |
| informs feature selection by     |
| measuring predictive             |
| contribution. Features are       |
| versioned and stored for         |
| consistent training/inference    |
| usage.                           |
------------------------------------
                 |
                 v
------------------------------------
| 4. MODEL TRAINING (XGBoost +     |
| SHAP Integration)                |
| Gradient-boosted trees, trained  |
| with cross-validation, capture   |
| nonlinear churn behavior. SHAP   |
| values explain individual        |
| predictions and highlight the    |
| strongest churn drivers.         |
| Regularization and               |
| hyperparameter tuning help the   |
| model generalize across cohorts. |
------------------------------------
                 |
                 v
------------------------------------
| 5. DEPLOYMENT & INFERENCE        |
| SERVICE                          |
| A FastAPI model server exposes   |
| real-time prediction endpoints;  |
| caching and lightweight          |
| containers keep scoring latency  |
| low. The service also supports   |
| batch scoring for CRM workflows  |
| and integrates with retention    |
| dashboards used by business      |
| teams.                           |
------------------------------------
                 |
                 v
------------------------------------
| 6. MONITORING & FEEDBACK LOOP    |
| Data drift monitors track shifts |
| in customer profiles and         |
| engagement. Retraining pipelines |
| trigger when model decay is      |
| detected. Business teams feed    |
| back validated churn cases,      |
| allowing continuous learning and |
| improved targeting strategies    |
| over time.                       |
------------------------------------
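
Illustrative sketch for stage 6. The diagram does not say which drift statistic the monitors use, so this example assumes the Population Stability Index (PSI) computed on a single numeric feature such as tenure in months. The function names, the equal-width binning, and the 0.2 threshold (a commonly quoted rule of thumb) are assumptions made for illustration. Drift is used here only as a proxy signal; the pipeline's actual model-decay check may differ.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    live: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI of one numeric feature between a reference and a live sample."""
    if not reference or not live:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / n_bins

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            # Equal-width bins over the reference range; live values outside
            # that range are clamped into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm below stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref_frac = bin_fractions(reference)
    live_frac = bin_fractions(live)
    return sum((a - e) * math.log(a / e) for e, a in zip(ref_frac, live_frac))


# Assumed threshold: PSI above 0.2 is often read as a significant shift.
RETRAIN_THRESHOLD = 0.2


def needs_retraining(
    reference: Sequence[float],
    live: Sequence[float],
    threshold: float = RETRAIN_THRESHOLD,
) -> bool:
    """Hypothetical trigger: flag retraining when PSI exceeds the threshold."""
    return population_stability_index(reference, live) > threshold
```

In use, the reference sample would come from the training snapshot and the live sample from recent scoring traffic. One such check per monitored feature could feed the retraining trigger described in stage 6.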