AI-Based Pest Detection System
The Colorado potato beetle lays its eggs on the underside of leaves, where drones rarely look. Thrips are visible only at 10x magnification. Spider mites leave marks that agronomists confuse with sunburn. Pest detection is technically one of the most challenging tasks in agro-CV: the objects are small, camouflaged, and often obscured by foliage.
Why Standard CV Approaches Don't Work Here
Object Scale Problem
Aphids on a leaf are 1–2 mm long. On a drone image taken at 10 m altitude with a ground sampling distance (GSD) of 2 mm/pixel, an aphid occupies literally one pixel. No YOLO variant can detect that.
Practical conclusion: small pests require close-up imaging, via automated systems with plant-level cameras (robotic platforms, conveyor systems), trap cameras (sticky-trap monitoring), or macro smartphone photos taken in the app.
For large pests (locusts, Colorado potato beetles, caterpillars), drones work if the GSD is below 0.5 mm/pixel (flight altitude 3–5 m).
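The pixel-budget arithmetic above is worth making explicit. A minimal sketch, using the standard pinhole relation GSD = altitude × pixel pitch / focal length; the camera parameters in the usage example (2.4 µm pitch, 8.8 mm lens) are assumed values, not from this article:

```python
def object_size_px(object_mm, gsd_mm_per_px):
    """How many pixels an object of a given size spans at a given GSD."""
    return object_mm / gsd_mm_per_px


def altitude_for_gsd(target_gsd_mm, focal_len_mm, pixel_pitch_um):
    """Flight altitude (m) that yields the target GSD for a given camera.

    Pinhole model: GSD = altitude * pixel_pitch / focal_length,
    so altitude = GSD * focal_length / pixel_pitch (all in meters).
    """
    return (target_gsd_mm / 1000) * (focal_len_mm / 1000) / (pixel_pitch_um / 1e6)


# A 2 mm aphid at 2 mm/px GSD spans exactly one pixel:
print(object_size_px(2, 2))
# Altitude for 0.5 mm/px with an assumed 2.4 um pitch, 8.8 mm lens:
print(round(altitude_for_gsd(0.5, 8.8, 2.4), 2))
```

With these assumed sensor values, 0.5 mm/px corresponds to roughly 1.8 m altitude; a longer lens at the same pitch lets the drone fly in the 3–5 m range cited above while keeping the same GSD.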
Small Object Detector
YOLOv8 and most standard detectors perform poorly on objects smaller than 32×32 pixels. Use one of several approaches depending on the task:
Tile-based inference: the image is split into 640×640 patches with 20% overlap, and each patch is processed separately. SAHI (Slicing Aided Hyper Inference) implements this on top of any YOLO model without changing the weights.
Specialized small-object architectures: RFLA (Gaussian Receptive Field based Label Assignment), QueryDet, or a custom FPN with an additional high-resolution P2 output.
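The tiling step is easy to get subtly wrong at image borders. A minimal sketch of the slicing logic only (not the SAHI API): compute overlapping tile coordinates, snapping edge tiles inward so every tile keeps the full 640×640 size; each box would then be cropped, run through the detector, and the per-tile detections merged with NMS.

```python
def slice_boxes(img_w, img_h, tile=640, overlap=0.2):
    """Return (x1, y1, x2, y2) coordinates of overlapping tiles.

    Edge tiles are shifted inward so each tile is exactly `tile` wide/tall
    (unless the image itself is smaller than one tile).
    """
    step = int(tile * (1 - overlap))  # 512 px stride at 20% overlap
    boxes = []
    y = 0
    while True:
        y2 = min(y + tile, img_h)
        y1 = max(0, y2 - tile)  # snap the last row inward
        x = 0
        while True:
            x2 = min(x + tile, img_w)
            x1 = max(0, x2 - tile)  # snap the last column inward
            boxes.append((x1, y1, x2, y2))
            if x2 >= img_w:
                break
            x += step
        if y2 >= img_h:
            break
        y += step
    return boxes


# A 4000x3000 trap photo yields an 8x6 grid of 640 px tiles:
print(len(slice_boxes(4000, 3000)))  # → 48
```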
On whitefly counting on yellow sticky traps, YOLOv8n with SAHI at tile_size=640 achieved mAP50 = 0.79, versus only 0.52 for standard inference on the full 4000×3000 image.
| Approach | mAP50 (small objects) | Inference Speed |
|---|---|---|
| YOLOv8n standard | 0.52 | 15 ms |
| YOLOv8n + SAHI | 0.79 | 180 ms |
| YOLOv8m + SAHI | 0.84 | 310 ms |
| QueryDet | 0.81 | 95 ms |
Pest Counting Is a Separate Task
A yes/no detection is insufficient for treatment decisions; what is needed is a quantitative count per unit area. For dense colonies (aphids, thrips), bounding-box detection stops working and the task becomes density estimation: instead of YOLO, use CSRNet or DM-Count, which predict a density map whose sum is the predicted insect count.
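The "sum the density map" idea can be shown with the standard ground-truth construction used to train such models: place one normalized Gaussian per annotated insect, so the map integrates to the true count. A minimal NumPy sketch (kernel size and sigma are illustrative choices):

```python
import numpy as np


def gaussian_kernel(size=15, sigma=3.0):
    """2D Gaussian normalized to sum to 1, so each one represents one insect."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()


def density_map(points, h, w, size=15, sigma=3.0):
    """Ground-truth density map: one normalized Gaussian per (x, y) point."""
    dm = np.zeros((h, w), dtype=np.float64)
    k = gaussian_kernel(size, sigma)
    r = size // 2
    for x, y in points:
        # clip the kernel at image borders
        x0, x1 = max(0, x - r), min(w, x + r + 1)
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        kx0, ky0 = x0 - (x - r), y0 - (y - r)
        dm[y0:y1, x0:x1] += k[ky0:ky0 + (y1 - y0), kx0:kx0 + (x1 - x0)]
    return dm


def count_from_density(dm):
    """Predicted count = integral (sum) of the density map."""
    return float(dm.sum())


# Three annotated aphids -> the map sums to ~3.0:
dm = density_map([(50, 50), (100, 80), (20, 30)], 200, 200)
print(round(count_from_density(dm), 4))
```

At inference time the model (CSRNet, DM-Count) regresses this map directly, and the same summation gives the count even when individual insects overlap too much to separate with boxes.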
Trap Monitoring with Automatic Recognition
One practical and economically justified format is a smart pheromone trap with a camera (e.g., a delta trap plus a Raspberry Pi Camera Module 3, or a ready-made device like Trapview). The camera takes a photo every 2–4 hours, the model counts the insects on the sticky surface, and the data is sent to the cloud.
For such a system, MobileNetV3-Small or EfficientNet-Lite0 with INT8 quantization suffices; it runs on a Raspberry Pi Zero 2 W at under 2 W power consumption. Counting accuracy is about ±15% at densities up to 200 insects per trap.
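The cloud-upload step reduces to a small telemetry payload per capture cycle. A sketch of what that record might look like; the field names and the `build_report` helper are hypothetical, not a real Trapview API:

```python
import json
import time


def build_report(trap_id, count, species, ts=None):
    """Build one JSON telemetry record for a capture cycle.

    Hypothetical schema for illustration: trap id, Unix timestamp,
    target species, and the on-device insect count.
    """
    return json.dumps({
        "trap_id": trap_id,
        "timestamp": ts if ts is not None else int(time.time()),
        "species": species,
        "count": count,
    })


print(build_report("T-01", 37, "codling_moth", ts=1700000000))
```

Keeping the payload this small matters on a duty-cycled trap: the radio is the dominant power cost, so a few hundred bytes every 2–4 hours fits comfortably inside the sub-2 W budget.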
Development Process
Data collection. The main difficulty is the variety of lighting conditions (morning/noon/overcast) and pest development stages (eggs, larvae, and adults all look different). Collect a minimum of 300–500 annotated specimens of each pest at each stage.
Annotation. For traps: bounding box plus count. For field images: polygon segmentation for precise separation from the background. Use Label Studio with a custom insect-detection template.
Training. Transfer learning from COCO weights (insects are underrepresented there, but low-level features transfer). Focal loss with gamma = 1.5 to compensate for the background/object imbalance.
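To make the gamma = 1.5 choice concrete, here is a minimal NumPy sketch of binary focal loss (Lin et al.); the alpha = 0.25 default is an assumption borrowed from the original paper, not from this article. The (1 − p_t)^gamma factor down-weights the abundant, easily classified background pixels so the rare insect pixels dominate the gradient.

```python
import numpy as np


def binary_focal_loss(p, y, gamma=1.5, alpha=0.25):
    """Mean binary focal loss.

    p: predicted probabilities of the positive (insect) class.
    y: binary labels (1 = insect, 0 = background).
    Easy examples (p_t near 1) are scaled down by (1 - p_t)**gamma.
    """
    p = np.clip(p, 1e-7, 1 - 1e-7)
    p_t = np.where(y == 1, p, 1 - p)          # prob. of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))


# A confident correct prediction contributes far less than a hard one:
print(binary_focal_loss(np.array([0.99]), np.array([1])))
print(binary_focal_loss(np.array([0.30]), np.array([1])))
```

With gamma = 0 the expression reduces to alpha-weighted cross-entropy; raising gamma pushes the loss mass further toward hard examples.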
Production monitoring. Automatic notification when the pest count exceeds a threshold (the economic threshold differs per crop and pest and is set by the agronomist). Integration with precision-agriculture systems.
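The alerting rule itself is a simple comparison of per-pest counts against agronomist-set thresholds. A minimal sketch; the threshold numbers in the example are purely illustrative, real economic thresholds depend on the crop and pest:

```python
def check_thresholds(counts, thresholds):
    """Return pests whose count meets or exceeds its economic threshold.

    counts: {pest_name: observed count per trap or per unit area}
    thresholds: {pest_name: agronomist-set economic threshold}
    Pests without a configured threshold never trigger an alert.
    """
    return [pest for pest, n in counts.items()
            if n >= thresholds.get(pest, float("inf"))]


# Illustrative numbers only: moth count over threshold, aphids under it.
alerts = check_thresholds(
    {"codling_moth": 12, "aphid": 40},
    {"codling_moth": 5, "aphid": 100},
)
print(alerts)  # → ['codling_moth']
```

In production this check runs server-side on each incoming trap report, and a positive result fans out to the notification channel and the precision-agriculture system's treatment planner.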
Timeline
A system for 1–3 pest species on traps: 4–6 weeks. A multi-species field platform with a mobile app and API: 2–4 months.