Work

Anonymized case studies showcasing real-world optimizations and deployments on constrained systems.

Low-Latency Edge Audio Enhancement for Multi-Mic Device

Smart Devices

Challenge

The client needed real-time noise suppression and beamforming on a battery-powered device with less than 256MB of RAM and a strict end-to-end latency budget (<20ms).

Solution

Implemented a fixed-point adaptive beamformer with an INT16 processing pipeline, optimized using ARM NEON intrinsics. Deployed on an RTOS with a dedicated audio thread running at the highest priority.
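As a rough illustration of what INT16 fixed-point processing involves, the sketch below combines several microphone channels with Q15 weights, a wide accumulator, and saturation back to 16 bits. It is a simplified filter-and-sum model in plain Python, not the deployed code: the weight adaptation step and the NEON vectorization are omitted, and the names `to_q15`, `saturate16`, and `beamform_q15` are hypothetical.

```python
Q15_ONE = 1 << 15                      # 1.0 in Q15; the largest storable value is 32767
INT16_MIN, INT16_MAX = -32768, 32767


def saturate16(x: int) -> int:
    """Clamp to the INT16 range instead of wrapping around on overflow."""
    return max(INT16_MIN, min(INT16_MAX, x))


def to_q15(x: float) -> int:
    """Convert a float in roughly [-1.0, 1.0) to a Q15 integer."""
    return saturate16(int(round(x * Q15_ONE)))


def beamform_q15(mics, weights):
    """Weighted sum across channels.

    mics:    list of channels, each a list of INT16 samples (assumed layout)
    weights: one Q15 weight per channel
    """
    out = []
    for i in range(len(mics[0])):
        acc = 0                        # stands in for a 32-bit accumulator on target
        for channel, w in zip(mics, weights):
            acc += channel[i] * w      # INT16 x Q15 product
        # Add half an LSB for rounding, shift back to INT16 scale, then saturate.
        out.append(saturate16((acc + (1 << 14)) >> 15))
    return out


# Two channels averaged with weights of 0.5 each.
half = to_q15(0.5)
frame = beamform_q15([[1000, -2000], [3000, -2000]], [half, half])
```

Saturating rather than wrapping matters for audio: a wrapped overflow flips the sign of the sample and is heard as a click, while a saturated one only clips.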

Impact

  • Latency: 12ms end-to-end (40% below target)
  • Memory: 180MB peak (30% headroom)
  • Power: 2.5x battery life improvement vs. baseline
  • SNR improvement: 15dB in noisy environments
DSP · Audio · RTOS · ARM

On-Device Vision Model Optimization for UAV Payload

Defence & UAV

Challenge

An autonomous UAV needed real-time object detection under tight compute, power, and thermal constraints. Intermittent connectivity meant inference had to run offline-first.

Solution

Quantized YOLOv8 to INT8 using quantization-aware training (QAT). Deployed on the NPU through a custom TFLite delegate. Implemented on-device model versioning and A/B testing.
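For readers unfamiliar with INT8 quantization, the sketch below shows the affine mapping between real values and 8-bit integers, plus the quantize-then-dequantize round trip that quantization-aware training typically simulates in the forward pass so the network learns to tolerate rounding error. It is a generic plain-Python illustration, not the project's training code; the function names and the example range are assumptions.

```python
def qparams(xmin: float, xmax: float, qmin: int = -128, qmax: int = 127):
    """Derive scale and zero point for an observed real-valued range."""
    xmin, xmax = min(xmin, 0.0), max(xmax, 0.0)   # range must contain 0.0
    scale = (xmax - xmin) / (qmax - qmin)
    if scale == 0.0:                              # degenerate all-zero tensor
        scale = 1.0
    zero_point = int(round(qmin - xmin / scale))
    return scale, max(qmin, min(qmax, zero_point))


def quantize(x: float, scale: float, zero_point: int,
             qmin: int = -128, qmax: int = 127) -> int:
    """Real value -> INT8 code, clamped to the representable range."""
    return max(qmin, min(qmax, int(round(x / scale)) + zero_point))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """INT8 code -> approximate real value."""
    return (q - zero_point) * scale


def fake_quant(x: float, scale: float, zero_point: int) -> float:
    """Round trip used to expose a float model to INT8 rounding error."""
    return dequantize(quantize(x, scale, zero_point), scale, zero_point)


# Hypothetical activation range of [0.0, 2.55].
scale, zp = qparams(0.0, 2.55)
approx = fake_quant(1.0, scale, zp)   # within scale / 2 of 1.0
```

Values inside the calibrated range come back within half a quantization step; values outside it are clamped, which is why the range chosen during training or calibration affects accuracy.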

Impact

  • Inference time: 35ms per frame (1920x1080)
  • Accuracy: 92% mAP (vs 94% FP32 baseline)
  • Model size: 4.2MB (16x compression)
  • Power draw: 1.8W inference (within thermal envelope)
Edge AI · Computer Vision · Quantization · UAV

Quantized Inference Pipeline for Constrained Embedded Platform

Industrial IoT

Challenge

An industrial sensor platform needed predictive-maintenance ML on a Cortex-M4 MCU with 512KB flash and 128KB RAM.

Solution

Trained a compact anomaly detection model. Applied aggressive quantization (INT8) and pruning (70% sparsity). Used the TFLite Micro runtime with custom ops.
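To make "70% sparsity" concrete, the sketch below applies global magnitude pruning to a flat list of weights: the smallest 70% by absolute value are set to zero. Magnitude pruning is one common way to reach a target sparsity and is assumed here for illustration; the source does not say which pruning method was used, and `magnitude_prune` is a hypothetical helper.

```python
def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude fraction `sparsity` of the weights."""
    n_zero = round(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_zero]:
        pruned[i] = 0.0
    return pruned


w = [0.5, -0.01, 0.2, -0.9, 0.03, 0.0004, -0.3, 0.07, 0.6, -0.002]
pruned = magnitude_prune(w, 0.70)
# Only the three largest-magnitude weights survive: 0.5, -0.9, 0.6
```

In practice pruning is usually followed by fine-tuning to recover accuracy, and the zeros only reduce flash usage if the weights are stored in a format that exploits them.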

Impact

  • Model fits in 48KB flash
  • Inference: 80ms at 168MHz
  • Anomaly detection: 88% F1 score
  • Deployed to 10,000+ devices via OTA
Embedded ML · MCU · TinyML · Predictive Maintenance