Phi-3 Agricultural Analyst
A domain-specific language model fine-tuned for agricultural analysis tasks. This model provides structured insights on farm operations, investment opportunities, risk assessment, and regional agricultural profiling.
Project Scope
This release focuses on 8 major US agricultural counties as a proof-of-concept:
- Fresno, CA (Tree nuts, Grapes)
- Kern, CA (Almonds, Pistachios)
- Lancaster, PA (Dairy, Corn)
- Sioux, IA (Corn, Soybeans)
- Yakima, WA (Apples, Hops)
- Imperial, CA (Alfalfa, Lettuce)
- Monterey, CA (Strawberries, Lettuce)
- Deaf Smith, TX (Cattle, Wheat)
Upcoming releases will include:
- Full US county coverage (3,000+ counties)
- Specialized models for crop yield prediction
- Harvest timing optimization
- Soil health analysis
- Weather impact assessment
Performance Benchmarks
Evaluated against base Phi-3 Mini on agricultural analysis tasks:
Content Coverage
Base Model |████████████████████                    | 50.0%
Fine-Tuned |█████████████████████████               | 63.3% (+26.7%)
Structure Quality
Base Model |██████████████████████                  | 56.7%
Fine-Tuned |█████████████████████████████████       | 83.3% (+47.1%)
Inference Speed
Base Model |████████████████████████████████████████| 101.9s
Fine-Tuned |████████████████████████                | 62.4s (38.8% faster)
Summary Table
| Metric | Base Model | Fine-Tuned | Relative Improvement |
|---|---|---|---|
| Content Coverage | 50.0% | 63.3% | +26.7% |
| Structure Quality | 56.7% | 83.3% | +47.1% |
| Avg Inference Time | 101.9s | 62.4s | 38.8% faster |
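The last column of the table reports relative change, not percentage-point differences. The sketch below reproduces that arithmetic from the rounded figures in the table; the helper name `relative_change` is illustrative and not part of the model's evaluation code. Recomputing from the rounded values gives roughly 26.6% and 46.9%, slightly below the reported 26.7% and 47.1%, which were presumably derived from unrounded scores.

```python
def relative_change(base: float, tuned: float) -> float:
    """Return the relative change from base to tuned, in percent."""
    return (tuned - base) / base * 100.0


# Rounded figures copied from the summary table.
coverage_gain = relative_change(50.0, 63.3)
structure_gain = relative_change(56.7, 83.3)
# For inference time, lower is better, so the sign is flipped to report a reduction.
time_reduction = -relative_change(101.9, 62.4)

print(f"Content coverage:  +{coverage_gain:.1f}%")
print(f"Structure quality: +{structure_gain:.1f}%")
print(f"Inference time:    {time_reduction:.1f}% faster")
```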
Training Loss Curve
Epoch 1: ████████████████████████████████████████ Loss: 1.605 → 0.115
Epoch 2: ████████████ Loss: 0.115 → 0.020
Epoch 3: ████ Loss: 0.020 → 0.019
Compared with the base model, the fine-tuned model covers more domain-relevant content (63.3% vs. 50.0%), produces better-structured analytical output (83.3% vs. 56.7%), and responds about 39% faster on average.
Training Configuration
| Parameter | Value |
|---|---|
| Base Model | microsoft/Phi-3-mini-4k-instruct |
| Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Examples | 900 |
| Validation Examples | 100 |
| Epochs | 3 |
| Batch Size | 16 |
| Learning Rate | 2e-4 |
| Precision | BF16 |
| Training Time | ~51 minutes |
| Final Eval Loss | 0.0186 |
| Token Accuracy | 99.3% |
Hardware: NVIDIA DGX Spark (Blackwell GPU, 128 GB unified memory)
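Two quantities follow directly from the configuration table: the LoRA scaling factor and the approximate number of optimizer steps. The sketch below is a back-of-the-envelope calculation, not the training script. It assumes the standard LoRA scaling of alpha / rank, that the batch size of 16 is the effective batch per optimizer step (no extra gradient accumulation), and that the final partial batch of each epoch is kept. None of these details are stated in this card.

```python
import math

# Values copied from the training configuration table.
lora_rank = 64
lora_alpha = 128
train_examples = 900
batch_size = 16
epochs = 3
training_minutes = 51  # approximate, as reported

# Standard LoRA multiplies the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank

# Assumption: one optimizer step per batch, last partial batch included.
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs

# Rough wall-clock cost per step under the assumptions above.
seconds_per_step = training_minutes * 60 / total_steps

print(f"LoRA scaling (alpha / rank): {lora_scaling}")
print(f"Steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
print(f"Approx. seconds per step: {seconds_per_step:.1f}")
```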
Use Cases
- Farm investment analysis
- Regional agricultural profiling
- Risk factor identification
- Technology adoption recommendations
- Irrigation and water sustainability assessment
- Crop expansion planning
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
# Load model
model_name = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, "sarathi-balakrishnan/phi3-agricultural-analyst")
# Create prompt
prompt = """You are an expert agricultural analyst. Analyze the following query using the provided data context.
### Query
What are the key investment opportunities in Fresno County agriculture?
### Data Context
County: Fresno, CA
Operators: 5,847
Average Farm Size: 423 acres
Irrigation Coverage: 89%
Revenue per Acre: $2,340
### Analysis"""
# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,  # pass input_ids together with the attention mask
    max_new_tokens=500,
    temperature=0.7,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
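The prompt above follows a fixed layout: an instruction line, a `### Query` section, a `### Data Context` section of `key: value` lines, and a trailing `### Analysis` header. A small helper can assemble it for other counties. This is a sketch only: `build_prompt` and `INSTRUCTION` are hypothetical names, and the whitespace mirrors the layout as shown here, which may differ from the exact formatting used in the training data.

```python
INSTRUCTION = (
    "You are an expert agricultural analyst. "
    "Analyze the following query using the provided data context."
)


def build_prompt(query: str, context: dict[str, str]) -> str:
    """Assemble a prompt in the layout used by the usage example."""
    context_lines = [f"{key}: {value}" for key, value in context.items()]
    parts = [
        INSTRUCTION,
        "### Query",
        query,
        "### Data Context",
        *context_lines,
        "### Analysis",
    ]
    return "\n".join(parts)


# Rebuild the Fresno example from the usage section.
prompt = build_prompt(
    "What are the key investment opportunities in Fresno County agriculture?",
    {
        "County": "Fresno, CA",
        "Operators": "5,847",
        "Average Farm Size": "423 acres",
        "Irrigation Coverage": "89%",
        "Revenue per Acre": "$2,340",
    },
)
print(prompt)
```

Keeping the data context as a dictionary makes it easy to swap in figures for another county without retyping the section headers.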
Output Format
The model generates structured analysis with:
- Summary of the agricultural region
- Key insights based on data patterns
- Investment/growth opportunities
- Risk factors and constraints
- Confidence level
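Because the usage example decodes the full sequence, the printed text begins with the prompt itself. The sketch below isolates the generated analysis, assuming the decoded text still contains the `### Analysis` marker from the prompt. The name `extract_analysis` is illustrative, and the sample string is made up to exercise the helper. The exact headings the model emits for the sections listed above are not specified in this card, so no further parsing is attempted.

```python
ANALYSIS_MARKER = "### Analysis"


def extract_analysis(decoded: str, marker: str = ANALYSIS_MARKER) -> str:
    """Return the text after the first marker, or the whole text if it is absent."""
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()


# Hypothetical decoded output, used only to demonstrate the helper.
sample = "### Query\nExample question\n### Analysis\nExample generated text."
analysis = extract_analysis(sample)
print(analysis)
```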
Limitations
- Currently covers 8 US counties (expansion planned)
- Training data is synthetically generated
- Should not replace professional agricultural consulting
- Performance may vary for regions outside training distribution
Technical Notes
- Uses eager attention for Phi-3 compatibility
- Optimized for BF16 inference on modern GPUs
- LoRA adapters can be merged into the base weights for deployment:
merged_model = model.merge_and_unload()  # returns the base model with the adapter merged in
License
MIT License - Free for commercial and research use.
Contact
For questions about this model or collaboration on agricultural AI:
- HuggingFace: @sarathi-balakrishnan