
Agricultural B2B Intelligence Model


A production-grade language model fine-tuned for agricultural business intelligence, built with multi-teacher knowledge distillation from Claude Sonnet 4 and GPT-4.1.

Model Card | Usage | Training Journey | Metrics | API


Executive Summary

This model distills the combined knowledge of two frontier AI models (Claude Sonnet 4 and GPT-4.1) into a compact, deployable 8B-parameter model. After 55+ hours of training on 10,000 teacher-generated examples, it delivers expert-level agricultural business intelligence for B2B companies serving farmers and ranchers across all 50 US states.

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    MULTI-TEACHER KNOWLEDGE DISTILLATION                      β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                              β”‚
β”‚    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”            β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”             β”‚
β”‚    β”‚    TEACHER 1         β”‚            β”‚    TEACHER 2         β”‚             β”‚
β”‚    β”‚    Claude Sonnet 4   β”‚            β”‚    GPT-4.1           β”‚             β”‚
β”‚    β”‚    5,007 responses   β”‚            β”‚    4,993 responses   β”‚             β”‚
β”‚    β”‚    ~$100 API cost    β”‚            β”‚    ~$80 API cost     β”‚             β”‚
β”‚    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜            β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜             β”‚
β”‚               β”‚                                   β”‚                          β”‚
β”‚               β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                          β”‚
β”‚                             β”‚                                                β”‚
β”‚                             β–Ό                                                β”‚
β”‚               β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                                   β”‚
β”‚               β”‚     STUDENT MODEL       β”‚                                   β”‚
β”‚               β”‚  Llama 3.1 8B Instruct  β”‚                                   β”‚
β”‚               β”‚     LoRA Fine-tuned     β”‚                                   β”‚
β”‚               β”‚    55 hours training    β”‚                                   β”‚
β”‚               β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                                   β”‚
β”‚                                                                              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

The Journey: From Phi-3 to Llama 3.1

The Evolution of Our Student Model Selection

This project went through several iterations before arriving at the optimal architecture. Here's the complete story:

Phase 1: Phi-3 Mini (Abandoned)

Initial Consideration: Microsoft's Phi-3 Mini (3.8B parameters)

  • Pros: Small, fast, efficient
  • Cons: Limited context window, struggled with complex agricultural terminology
  • Decision: Too small for the domain complexity required

Phase 2: Qwen 2.5 7B (Temporary)

Second Attempt: Alibaba's Qwen 2.5-7B-Instruct

  • Pros: Open weights, good multilingual support, strong reasoning
  • Cons: Less optimized for English-only use case, licensing considerations for commercial use
  • Status: Used temporarily while awaiting Llama access approval
  • Why we moved on: Meta's Llama offered better ecosystem support and commercial licensing

Phase 3: Llama 3.1 8B Instruct (Final Selection)

Final Choice: Meta's Llama 3.1-8B-Instruct

  • Pros:
    • Excellent instruction following
    • Strong reasoning capabilities
    • Permissive license for commercial use
    • Large community and ecosystem
    • Optimized for chat/instruct use cases
    • Native support for long contexts (128K)
  • Cons: Gated model requiring license approval
  • Decision: Best balance of capability, licensing, and community support

Why Multi-Teacher Distillation?

Traditional fine-tuning draws on a single data source. We instead used a multi-teacher approach:

| Aspect | Single Teacher | Multi-Teacher (Our Approach) |
|---|---|---|
| Diversity | Limited perspective | Complementary viewpoints |
| Robustness | May inherit biases | Cross-validated knowledge |
| Coverage | Gaps in knowledge | Comprehensive coverage |
| Quality | Single style | Best of both worlds |

Claude Sonnet 4 excels at:

  • Structured, methodical analysis
  • Nuanced risk assessment
  • Detailed data interpretation

GPT-4.1 excels at:

  • Creative market insights
  • Trend identification
  • Actionable recommendations

By combining both, our student model inherits the strengths of each teacher while mitigating individual weaknesses.
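
The combination step itself is simple. Below is a minimal sketch of how the two teachers' outputs can be pooled into a single distillation dataset; the file names and record fields are hypothetical, since the actual data-preparation script is not included here.

import json
import random

def load_jsonl(path):
    """Load one teacher's (query, response) pairs from a JSONL file."""
    with open(path) as f:
        return [json.loads(line) for line in f]

# Hypothetical file names; each record: {"query": ..., "response": ...}
claude_examples = load_jsonl("teacher_claude_sonnet4.jsonl")
gpt_examples = load_jsonl("teacher_gpt41.jsonl")

# Tag provenance so the teacher balance can be audited later
for ex in claude_examples:
    ex["teacher"] = "claude-sonnet-4"
for ex in gpt_examples:
    ex["teacher"] = "gpt-4.1"

# Pool and shuffle so each training batch mixes both teachers' styles
combined = claude_examples + gpt_examples
random.seed(42)
random.shuffle(combined)

with open("distillation_dataset.jsonl", "w") as f:
    for ex in combined:
        f.write(json.dumps(ex) + "\n")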


Training Infrastructure

Hardware Configuration

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              NVIDIA DGX SPARK WORKSTATION               β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  GPU:     NVIDIA GB10 (Blackwell Architecture)         β”‚
β”‚  VRAM:    128 GB Unified Memory                         β”‚
β”‚  RAM:     128 GB System Memory                          β”‚
β”‚  Storage: 4 TB NVMe SSD                                 β”‚
β”‚  OS:      Ubuntu Linux                                  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Why Blackwell GPU?

The NVIDIA Blackwell GB10 represents the cutting edge of AI training hardware:

  • BF16 Native Support: Optimal precision for LLM training (see the check after this list)
  • Unified Memory: 128GB allows full model + gradients in memory
  • Tensor Cores: 5th generation for maximum throughput
  • Energy Efficiency: Lower power consumption than previous generations
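
If you are reproducing this setup on other hardware, a quick check with standard PyTorch APIs confirms native BF16 support before committing to a multi-day run:

import torch

# Sanity-check the device before launching a long training job
assert torch.cuda.is_available(), "No CUDA device found"
print("Device:", torch.cuda.get_device_name(0))
print("BF16 supported:", torch.cuda.is_bf16_supported())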

Complete Pipeline Timeline

Total Project Duration: ~70 Hours

Phase                          Duration    Status
─────────────────────────────────────────────────
1. Environment Setup           30 min      βœ… Complete
2. Knowledge Base Creation     45 min      βœ… Complete
3. Query Generation            20 min      βœ… Complete
4. Teacher Response Gen        8 hours     βœ… Complete
   β”œβ”€ Claude Sonnet 4          4.5 hrs     (5,007 responses)
   └─ GPT-4.1                  3.5 hrs     (4,993 responses)
5. Dataset Preparation         15 min      βœ… Complete
6. Model Fine-tuning           55 hours    βœ… Complete
   β”œβ”€ Epoch 1                  18.5 hrs
   β”œβ”€ Epoch 2                  18.5 hrs
   └─ Epoch 3                  18.0 hrs
7. Benchmarking                30 min      ⏳ Pending
8. HuggingFace Upload          10 min      ⏳ Pending
─────────────────────────────────────────────────
TOTAL                          ~70 hours

Training Metrics

Loss Progression Across Epochs

Loss
β”‚
1.8 ─ ●
    β”‚  β•²
1.6 ─   β•²
    β”‚    β•²
1.4 ─     β•²
    β”‚      β•²
1.2 ─       ●
    β”‚        β•²
1.0 ─         ●──●
    β”‚             β•²
0.8 ─              ●──●──●──●──●──●──●──●
    β”‚                    EPOCH 1 β”‚ EPOCH 2 β”‚ EPOCH 3
0.6 ─
    β”‚
0.4 ┼────┬────┬────┬────┬────┬────┬────┬────┬────
    0   100  200  300  400  500  600  700  800
                        Steps

Detailed Epoch Metrics

| Metric | Epoch 1 | Epoch 2 | Epoch 3 | Improvement |
|---|---|---|---|---|
| Training Loss | 0.89 | 0.77 | ~0.70 | -21% |
| Eval Loss | 0.87 | 0.82 | ~0.78 | -10% |
| Steps | 266 | 532 | 798 | - |
| Duration | 18.5 hrs | 18.5 hrs | 18 hrs | - |
| Learning Rate | 5e-5 → 4.1e-5 | 4.1e-5 → 1.6e-5 | 1.6e-5 → 0 | Cosine decay |
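
The learning-rate column follows the cosine-with-warmup schedule configured below. A small standalone sketch (the same shape as transformers' `get_cosine_schedule_with_warmup`) approximately reproduces the logged per-epoch boundary values:

import math

def cosine_lr(step, total_steps=798, warmup_ratio=0.1, peak_lr=5e-5):
    """Cosine decay with linear warmup, matching the training config."""
    warmup = int(total_steps * warmup_ratio)  # ~80 steps
    if step < warmup:
        return peak_lr * step / warmup
    progress = (step - warmup) / (total_steps - warmup)
    return 0.5 * (1.0 + math.cos(math.pi * progress)) * peak_lr

for step in (266, 532, 798):
    print(f"step {step}: lr ≈ {cosine_lr(step):.1e}")
# step 266: lr ≈ 4.2e-05  (logged: 4.1e-5)
# step 532: lr ≈ 1.5e-05  (logged: 1.6e-5)
# step 798: lr ≈ 0.0e+00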

Training Configuration

# LoRA Configuration
LORA_CONFIG = {
    "r": 128,                    # High rank for complex domain
    "lora_alpha": 256,           # 2x rank for stable training
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ],
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM"
}

# Training Configuration
TRAINING_CONFIG = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 16,  # Effective batch = 32
    "learning_rate": 5e-5,
    "warmup_ratio": 0.1,
    "lr_scheduler_type": "cosine",
    "bf16": True,
    "gradient_checkpointing": True,
    "max_seq_length": 4096
}
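
These dictionaries plug into the Hugging Face PEFT stack in the usual way. A hedged sketch of the wiring (the actual training script is not included, and `max_seq_length` is consumed by the SFT data pipeline rather than by `TrainingArguments`):

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the frozen base model in BF16
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Wrap it with LoRA adapters built from LORA_CONFIG above
model = get_peft_model(base, LoraConfig(**LORA_CONFIG))
model.print_trainable_parameters()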

Parameter Efficiency

| Metric | Value |
|---|---|
| Base Model Parameters | 8,030,261,248 (8.03B) |
| LoRA Trainable Parameters | 167,772,160 (168M) |
| Trainable Ratio | 2.09% |
| Memory Usage | ~45 GB VRAM |
| Training Throughput | ~4 min/step |
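
The trainable ratio can be recomputed from any loaded PEFT model; continuing from the sketch in the previous section:

# Count parameters that receive gradients vs. the full parameter count
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"{trainable:,} trainable / {total:,} total = {100 * trainable / total:.2f}%")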

Dataset Composition

Knowledge Base Coverage

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              COMPREHENSIVE US AGRICULTURAL DATA          β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                          β”‚
β”‚  πŸ—ΊοΈ  Geographic Coverage                                β”‚
β”‚  β”œβ”€β”€ States:        50 (All US states)                  β”‚
β”‚  β”œβ”€β”€ Counties:      3,142 (99.9% coverage)              β”‚
β”‚  β”œβ”€β”€ Zipcodes:      2,000 (Key agricultural areas)      β”‚
β”‚  └── Regions:       All USDA Farm Resource Regions      β”‚
β”‚                                                          β”‚
β”‚  🌾  Agricultural Data                                   β”‚
β”‚  β”œβ”€β”€ Crops:         12 major commodities                β”‚
β”‚  β”œβ”€β”€ Livestock:     8 categories                        β”‚
β”‚  └── Markets:       15 commodity markets                β”‚
β”‚                                                          β”‚
β”‚  🏒  Industry Coverage                                   β”‚
β”‚  └── B2B Sectors:   8 specialized industries            β”‚
β”‚                                                          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Query Distribution (10,000 Total)

| Category | Count | Percentage | Description |
|---|---|---|---|
| County Intelligence | 3,000 | 30% | Deep county-level analysis |
| Zipcode Analysis | 1,500 | 15% | Granular local insights |
| Market Intelligence | 1,000 | 10% | Market trends and opportunities |
| State Analysis | 1,000 | 10% | State-wide agricultural overview |
| Risk Assessment | 800 | 8% | Risk evaluation and mitigation |
| B2B Marketing | 800 | 8% | Go-to-market strategies |
| Predictions | 600 | 6% | Future trend forecasting |
| Historical Trends | 600 | 6% | Historical data analysis |
| Commodity Analysis | 400 | 4% | Commodity-specific insights |
| Comparative Analysis | 300 | 3% | Cross-region comparisons |
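
The query-generation script itself is not included; the sketch below (with hypothetical category keys) only illustrates how the mix above could drive a generation plan:

import random

# Target counts per category, from the table above (sums to 10,000)
QUERY_MIX = {
    "county_intelligence": 3000,
    "zipcode_analysis": 1500,
    "market_intelligence": 1000,
    "state_analysis": 1000,
    "risk_assessment": 800,
    "b2b_marketing": 800,
    "predictions": 600,
    "historical_trends": 600,
    "commodity_analysis": 400,
    "comparative_analysis": 300,
}
assert sum(QUERY_MIX.values()) == 10_000

# Expand into a flat, shuffled plan so categories are interleaved
plan = [cat for cat, n in QUERY_MIX.items() for _ in range(n)]
random.seed(42)
random.shuffle(plan)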

Teacher Response Statistics

| Teacher | Responses | Avg Length | Avg Time | Total Cost |
|---|---|---|---|---|
| Claude Sonnet 4 | 5,007 | ~1,200 tokens | 4.2s | ~$100 |
| GPT-4.1 | 4,993 | ~1,100 tokens | 2.8s | ~$80 |
| Combined | 10,000 | ~1,150 tokens | 3.5s | ~$180 |

Data Split

Training:    8,500 examples (85%)  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘
Validation:  1,000 examples (10%)  β–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘
Test:          500 examples (5%)   β–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘
─────────────────────────────────────────────────────
Total:      10,000 examples
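
An 85/10/5 split like this is straightforward to reproduce with the `datasets` library. A minimal sketch, reusing the hypothetical `distillation_dataset.jsonl` file from the pooling sketch earlier:

from datasets import load_dataset

dataset = load_dataset("json", data_files="distillation_dataset.jsonl")["train"]

# 85/10/5: first carve off 15%, then split that holdout into val + test
split = dataset.train_test_split(test_size=0.15, seed=42)
holdout = split["test"].train_test_split(test_size=1/3, seed=42)

train_ds, val_ds, test_ds = split["train"], holdout["train"], holdout["test"]
print(len(train_ds), len(val_ds), len(test_ds))  # 8500, 1000, 500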

Target Industries & Use Cases

Supported B2B Sectors

| Industry | Primary Use Cases | Key Metrics Provided |
|---|---|---|
| Crop Insurance | Risk assessment, premium pricing, loss prediction | Historical loss ratios, weather risk scores, yield variability |
| Farm Equipment | Market sizing, dealer network, territory planning | Equipment penetration, farm size distribution, mechanization rates |
| Seed & Genetics | Variety placement, market penetration, climate zones | Seed market share, variety performance, adoption curves |
| Fertilizer & Soil | Demand forecasting, logistics, pricing | Soil types, nutrient needs, application rates |
| Pesticides | Application timing, resistance patterns, compliance | Pest pressure maps, resistance tracking, regulatory status |
| Irrigation | Water management, system sizing, ROI analysis | Water availability, irrigation penetration, efficiency metrics |
| Agricultural Lending | Farm credit risk, land valuation, cash flow | Debt ratios, land values, income stability |
| Land Brokerage | Parcel analysis, comparable sales, investment returns | Price per acre trends, transaction volumes, cap rates |

Model Capabilities

What This Model Can Do

βœ… Geographic Analysis

  • State-level agricultural overviews
  • County-level deep dives
  • Zipcode-level granular insights
  • Cross-region comparisons

βœ… Market Intelligence

  • Market entry analysis
  • Competitive landscape mapping
  • Opportunity identification
  • Risk factor assessment

βœ… Business Strategy

  • Go-to-market recommendations
  • Territory planning
  • Customer segmentation
  • Pricing strategy insights

βœ… Data Synthesis

  • USDA data interpretation
  • Trend analysis
  • Predictive insights
  • Historical pattern recognition

Output Format

The model generates structured analysis following this format:

## Executive Summary
[2-3 sentence high-level overview]

## Geographic/Market Profile
[Key statistics and characteristics]

## Analysis
[Detailed data-driven insights with specific metrics]

## Opportunities
[Specific actionable opportunities ranked by potential]

## Risk Factors
[Challenges, constraints, and mitigation strategies]

## Recommendations
[Prioritized action items with implementation guidance]

## Confidence Level
[High/Medium/Low with reasoning and data quality notes]

## Data Sources
[Referenced USDA and industry sources]
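
Because the section headings are emitted consistently, downstream applications can split a response into structured fields. A minimal, dependency-free sketch:

def parse_sections(text):
    """Split a model response into {section_title: body} by '## ' headings."""
    sections = {}
    current, buf = None, []
    for line in text.splitlines():
        if line.startswith("## "):
            if current is not None:
                sections[current] = "\n".join(buf).strip()
            current, buf = line[3:].strip(), []
        elif current is not None:
            buf.append(line)
    if current is not None:
        sections[current] = "\n".join(buf).strip()
    return sections

# Example: sections = parse_sections(response)
#          print(sections.get("Executive Summary"))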

Usage

Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load model
base_model_name = "meta-llama/Llama-3.1-8B-Instruct"
adapter_name = "sarathi-balakrishnan/llama-agri-b2b-intelligence"

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_name)

# Generate analysis
messages = [
    {
        "role": "system",
        "content": "You are an expert agricultural business intelligence analyst."
    },
    {
        "role": "user",
        "content": "Analyze Iowa's corn market for a seed company evaluating expansion."
    }
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,  # passes input_ids and attention_mask together
    max_new_tokens=1500,
    temperature=0.7,
    do_sample=True
)

response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)

Merging LoRA Weights (Optional)

For faster inference, merge the LoRA adapters:

# Merge and save
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./llama-agri-merged")
tokenizer.save_pretrained("./llama-agri-merged")
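
Once saved, the merged checkpoint loads like any standalone model, with no PEFT dependency:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

merged = AutoModelForCausalLM.from_pretrained(
    "./llama-agri-merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
merged_tokenizer = AutoTokenizer.from_pretrained("./llama-agri-merged")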

Recommended Generation Parameters

generation_config = {
    "max_new_tokens": 1500,      # Sufficient for detailed analysis
    "temperature": 0.7,          # Balanced creativity/consistency
    "top_p": 0.9,                # Nucleus sampling
    "repetition_penalty": 1.1,   # Reduce repetition
    "do_sample": True
}
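
These unpack directly into `generate()`; continuing from the Quick Start example:

# Apply the recommended parameters in one call
outputs = model.generate(**inputs, **generation_config)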

Example Outputs

Example 1: County-Level Analysis

Query: "Analyze agricultural opportunities in Fresno County, California for a precision agriculture company."

Response:

## Executive Summary
Fresno County represents one of the nation's most valuable agricultural markets,
with over $7.5 billion in annual agricultural output. The combination of diverse
crop production, large farm operations, and water management challenges creates
significant opportunities for precision agriculture solutions.

## Geographic Profile
- **Location:** Central San Joaquin Valley, California
- **Agricultural Acres:** 1.9 million acres
- **Top Crops:** Almonds, grapes, tomatoes, cotton, citrus
- **Farm Count:** ~6,000 operations
- **Average Farm Size:** 315 acres

## Analysis
Fresno County leads California in agricultural production value. Key factors:

1. **Crop Diversity:** 350+ commodities grown commercially
2. **Irrigation Dependency:** 95% of farmland irrigated
3. **Water Stress:** Ongoing groundwater sustainability concerns
4. **Technology Adoption:** 35% precision ag penetration (below potential)

## Opportunities
1. **Water Management Solutions** (High Priority)
   - Soil moisture monitoring: $15M addressable market
   - Variable rate irrigation: Growing 25% annually

2. **Specialty Crop Analytics**
   - Tree nut optimization: 500,000+ almond acres
   - Vineyard management: 180,000 grape acres

## Risk Factors
- Water availability regulations (SGMA compliance)
- Labor cost pressures driving automation
- Climate variability affecting crop selection

## Recommendations
1. Partner with irrigation districts for pilot programs
2. Focus initial sales on almond and grape operations (highest ROI)
3. Develop Spanish-language support (40% Hispanic farm operators)

## Confidence Level
**High** - Based on USDA Census of Agriculture, California Department of
Food and Agriculture data, and established market trends.

Limitations

Known Limitations

  1. Geographic Scope: Focused exclusively on US agricultural markets
  2. Data Currency: Training data reflects knowledge up to early 2025
  3. Synthetic Training: Responses generated by AI teachers, not human experts
  4. Specificity: May lack hyperlocal details for small rural communities
  5. Numerical Precision: Statistics should be verified against official sources

When NOT to Use This Model

  • For official regulatory compliance decisions
  • As sole source for financial investment decisions
  • For real-time commodity trading signals
  • To replace professional agricultural consultants for high-stakes decisions

Recommended Usage

This model is best used as:

  • A starting point for market research
  • A tool for generating initial analysis frameworks
  • A complement to (not replacement for) professional expertise
  • A rapid prototyping tool for agricultural B2B applications

Technical Specifications

Model Architecture

| Component | Specification |
|---|---|
| Base Model | Llama 3.1 8B Instruct |
| Architecture | Transformer (decoder-only) |
| Parameters | 8.03B (base) + 168M (LoRA) |
| Context Length | 4,096 tokens (training) / 128K (inference) |
| Vocabulary | 128,256 tokens |
| Precision | BF16 |

LoRA Adapter Details

| Parameter | Value |
|---|---|
| Rank (r) | 128 |
| Alpha | 256 |
| Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Trainable Parameters | 167,772,160 |

Files Included

llama-agri-b2b-intelligence/
β”œβ”€β”€ README.md                 # This file
β”œβ”€β”€ adapter_config.json       # LoRA configuration
β”œβ”€β”€ adapter_model.safetensors # LoRA weights
β”œβ”€β”€ tokenizer_config.json     # Tokenizer settings
β”œβ”€β”€ special_tokens_map.json   # Special tokens
└── tokenizer.json            # Full tokenizer

Data Sources Referenced in Training

The training data incorporates knowledge from:

  • USDA NASS - National Agricultural Statistics Service
  • USDA ERS - Economic Research Service
  • FSA - Farm Service Agency (subsidies, programs)
  • RMA - Risk Management Agency (crop insurance)
  • SSURGO - Soil Survey Geographic Database
  • CDL - Cropland Data Layer
  • CME Group - Commodity futures data

Citation

@misc{llama-agri-b2b-intelligence-2024,
  title={Agricultural B2B Intelligence: Multi-Teacher Knowledge Distillation
         for Domain-Specific Large Language Models},
  author={Sarathi Balakrishnan},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/sarathi-balakrishnan/llama-agri-b2b-intelligence},
  note={Fine-tuned using Claude Sonnet 4 and GPT-4.1 as teachers on
        NVIDIA DGX Spark with Blackwell GPU}
}

License

MIT License: free for commercial and research use.

This model is released under the MIT license. The base Llama 3.1 model is subject to Meta's Llama 3.1 Community License Agreement.


Acknowledgments

  • Meta AI - For the Llama 3.1 base model
  • Anthropic - For Claude Sonnet 4 teacher responses
  • OpenAI - For GPT-4.1 teacher responses
  • NVIDIA - For DGX Spark infrastructure
  • Hugging Face - For PEFT library and model hosting

Contact & Support


Built with multi-teacher knowledge distillation

Transforming frontier AI capabilities into deployable agricultural intelligence
