-----------------------------------
| 1. DATA SOURCES                 |
| COCO, OpenImages, and custom    |
| CCTV video feeds supply diverse |
| visual scenes. Data includes    |
| indoor, outdoor, and low-light  |
| frames. Annotation quality is   |
| manually validated.             |
-----------------------------------
                 |
                 v
-----------------------------------
| 2. INGESTION & FRAME PROCESSING |
| Video streams are decoded into  |
| frames at target FPS. Frames    |
| are resized, normalized, and    |
| deduped. Metadata is preserved. |
-----------------------------------
                 |
                 v
------------------------------------
| 3. ROI EXTRACTION & AUGMENTATION |
| Regions of interest (ROIs) are   |
| identified and cropped for       |
| targeted augmentation. Techniques|
| enhance model robustness.        |
------------------------------------
                 |
                 v
-----------------------------------
| 4. MODEL TRAINING (YOLOv8       |
| Optimization)                   |
| YOLOv8, which is anchor-free,   |
| is trained with multi-scale     |
| heads and augmented datasets.   |
| Regularization and fine-tuning  |
| help stabilize training.        |
-----------------------------------
                 |
                 v
-----------------------------------
| 5. INFERENCE & TRACKING         |
| The deployed model performs     |
| real-time detection on GPU/edge |
| accelerators. Multi-object      |
| tracking assigns persistent     |
| identities across frames.       |
-----------------------------------
                 |
                 v
-----------------------------------
| 6. VISUALIZATION & FEEDBACK     |
| LOOP                            |
| Detection outputs are streamed  |
| to dashboards with bounding     |
| boxes and confidence scores.    |
| Security teams provide feedback.|
-----------------------------------
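
Stage 5 says that multi-object tracking assigns persistent identities across frames. The diagram does not name the tracker used, so the following is only a minimal sketch of one simple way this could work: greedy matching of each frame's detection boxes to the previous frame's tracks by intersection-over-union (IoU). The names `iou`, `GreedyIouTracker`, and `iou_threshold`, and the 0.3 default, are hypothetical and chosen for illustration. Boxes are assumed to be (x1, y1, x2, y2) tuples in pixels.

```python
from itertools import count
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Hypothetical tracker: keeps only the last box seen for each track ID."""

    def __init__(self, iou_threshold: float = 0.3) -> None:
        self.iou_threshold = iou_threshold   # illustrative default, not from the source
        self.tracks: Dict[int, Box] = {}     # track ID -> box from the previous frame
        self._next_id = count(1)

    def update(self, boxes: List[Box]) -> List[int]:
        """Return one track ID per input box, in the same order as `boxes`."""
        # Score every (track, detection) pair, best overlap first.
        pairs = sorted(
            ((iou(tbox, box), tid, i)
             for tid, tbox in self.tracks.items()
             for i, box in enumerate(boxes)),
            reverse=True,
        )
        assigned: Dict[int, int] = {}        # detection index -> track ID
        used_tracks = set()
        for score, tid, i in pairs:
            if score < self.iou_threshold:
                break                        # remaining pairs overlap too little
            if tid in used_tracks or i in assigned:
                continue                     # track or detection already matched
            assigned[i] = tid
            used_tracks.add(tid)
        # Any detection left unmatched starts a new track.
        for i in range(len(boxes)):
            if i not in assigned:
                assigned[i] = next(self._next_id)
        # Tracks with no match this frame are dropped immediately (a simplification).
        self.tracks = {assigned[i]: boxes[i] for i in range(len(boxes))}
        return [assigned[i] for i in range(len(boxes))]


tracker = GreedyIouTracker()
ids_frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
ids_frame2 = tracker.update([(51, 51, 61, 61), (1, 1, 11, 11)])  # same objects, reordered
print(ids_frame1, ids_frame2)  # [1, 2] [2, 1]
```

In the usage lines, the two objects keep their IDs in the second frame even though the detector reports them in a different order. This sketch forgets a track as soon as it misses one frame; established trackers such as SORT or ByteTrack add motion prediction and tolerate short gaps, which is likely closer to what a deployed CCTV pipeline would need.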