AI-Powered Crop Disease Lab
Training, Deployment and Monitoring
Overview
A production-style MLOps stack for plant-disease classification. PyTorch and TensorFlow models are trained through Kubeflow Pipelines, tracked and versioned in MLflow, and promoted from the registry to a served endpoint. The lab covers the full lifecycle (reproducible training, model registry, deployment, and drift/accuracy monitoring) so that new disease classes can be added without rebuilding the stack.
The Problem
Crop-disease models usually live in one-off notebooks with no path to reproducible retraining, versioning, or monitored deployment. As new diseases and regions appear, the model must be retrained and redeployed safely, which ad-hoc workflows cannot support.
The Approach
Training runs as Kubeflow pipelines that log metrics, parameters, and artifacts to MLflow. Promoting a model is a registry tag change, and the served endpoint syncs to whichever version the tag points at. Both PyTorch and TensorFlow backbones are supported. Deployed inference is monitored for accuracy drift, closing the loop back to retraining.
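The promotion step can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `ModelRegistry` and `ServedEndpoint` are hypothetical in-memory stand-ins, the `production` tag name and the artifact paths are placeholders, and a real setup would go through the MLflow registry rather than a dictionary. It only shows the idea that moving a tag is the whole promotion, and that the endpoint reloads when the tag moves.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ModelRegistry:
    """Hypothetical in-memory stand-in for a model registry."""
    versions: Dict[int, str] = field(default_factory=dict)  # version -> artifact path
    aliases: Dict[str, int] = field(default_factory=dict)   # tag -> version

    def register(self, artifact_path: str) -> int:
        """Store a new model version and return its version number."""
        version = len(self.versions) + 1
        self.versions[version] = artifact_path
        return version

    def set_alias(self, alias: str, version: int) -> None:
        """Promotion: point a tag such as 'production' at a version."""
        if version not in self.versions:
            raise KeyError(f"unknown model version: {version}")
        self.aliases[alias] = version

    def resolve(self, alias: str) -> Tuple[int, str]:
        version = self.aliases[alias]
        return version, self.versions[version]


@dataclass
class ServedEndpoint:
    """Hypothetical endpoint that serves whatever the tag points at."""
    registry: ModelRegistry
    alias: str = "production"
    loaded_version: Optional[int] = None
    loaded_path: Optional[str] = None

    def sync(self) -> bool:
        """Reload if the tag moved; return True when a new version was loaded."""
        version, path = self.registry.resolve(self.alias)
        if version == self.loaded_version:
            return False
        self.loaded_version, self.loaded_path = version, path
        return True


registry = ModelRegistry()
v1 = registry.register("artifacts/leaf-classifier/run-a")  # placeholder paths
v2 = registry.register("artifacts/leaf-classifier/run-b")

registry.set_alias("production", v1)
endpoint = ServedEndpoint(registry)
first_sync = endpoint.sync()    # endpoint loads version 1

registry.set_alias("production", v2)  # promote: only the tag changes
second_sync = endpoint.sync()   # endpoint picks up version 2
third_sync = endpoint.sync()    # nothing changed, so no reload
```

Keeping promotion to a tag change means a rollback is the same operation in reverse: point the tag back at the earlier version and let the endpoint sync.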
Results
Planned. Targets: reproducible training runs, one-command model promotion, monitored inference, and the ability to add disease classes without rebuilding the pipeline.
Process & Timeline
- Phase 1: Dataset & tracking
  Version the disease dataset and wire experiment tracking and a model registry in MLflow.
- Phase 2: Training orchestration
  Build Kubeflow pipelines for reproducible PyTorch/TensorFlow training.
- Phase 3: Deployment
  Promote registry models to a served inference endpoint.
- Phase 4: Monitoring
  Add accuracy/drift monitoring that triggers retraining.
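As a rough illustration of the Phase 4 idea, the sketch below tracks accuracy over a rolling window of labelled predictions and flags retraining when it falls too far below a baseline. Everything here is an assumption made for the example: the `AccuracyDriftMonitor` class, the window size, the 5-point drop threshold, and the class names are hypothetical. It also assumes ground-truth labels eventually arrive for served predictions, which the project description does not specify.

```python
from collections import deque
from typing import Optional


class AccuracyDriftMonitor:
    """Rolling-window accuracy check; thresholds are illustrative assumptions."""

    def __init__(self, baseline_accuracy: float, window: int = 200,
                 max_drop: float = 0.05) -> None:
        self.baseline_accuracy = baseline_accuracy
        self.max_drop = max_drop
        # 1 = correct prediction, 0 = wrong; old entries fall off automatically
        self.outcomes = deque(maxlen=window)

    def record(self, predicted: str, actual: str) -> None:
        """Store whether one served prediction matched its later-confirmed label."""
        self.outcomes.append(1 if predicted == actual else 0)

    def rolling_accuracy(self) -> Optional[float]:
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def should_retrain(self) -> bool:
        # Wait for a full window so a few early mistakes do not trigger retraining
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.baseline_accuracy - self.rolling_accuracy() > self.max_drop


monitor = AccuracyDriftMonitor(baseline_accuracy=0.90, window=10, max_drop=0.05)

# Nine correct predictions and one miss: accuracy matches the baseline
for _ in range(9):
    monitor.record("leaf_rust", "leaf_rust")
monitor.record("leaf_rust", "healthy")
stable = monitor.should_retrain()

# Three more misses push the oldest correct results out of the window
for _ in range(3):
    monitor.record("healthy", "leaf_rust")
drifted = monitor.should_retrain()
```

In a pipeline setting, a `True` result from `should_retrain()` would be the signal that kicks off a new training run. How that trigger is wired up is left open here.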