---
library_name: transformers
license: mit
datasets:
- worldbank-datause/PRWP
base_model:
- openai-community/gpt2
pipeline_tag: text-generation
---

# Model Card for agri_finetuned_model

## Model Details

### Model Description

This is a transformers-based model fine-tuned for generative AI tasks, particularly in data engineering and AI service applications. It has been optimized for structured text generation, analytics, and AI-assisted workflows. The model supports multi-turn interactions and is designed for business intelligence, data insights, and technical documentation generation.

- **Developed by:** Harshraj Bhoite
- **Funded by:** Self-funded
- **Shared by:** Harshraj Bhoite
- **Model type:** Transformer-based (GPT-2)
- **Language(s) (NLP):** English
- **License:** MIT
- **Fine-tuned from model:** openai-community/gpt2

### Model Sources

- **Repository:** https://huggingface.co/Harshraj8721/agri_finetuned_model

## Uses

### Direct Use

- AI-assisted data engineering documentation generation
- Business intelligence reports and data insights automation
- Technical content creation for AI and analytics

### Downstream Use

- Fine-tuning for agriculture-specific AI
- Conversational AI in data analytics applications
- AI-driven customer support for analytics tools

### Out-of-Scope Use

- Not intended for real-time conversational AI without further optimization
- May not perform well in non-English languages

## Bias, Risks, and Limitations

- **Bias:** Outputs reflect the composition of the fine-tuning data; topics underrepresented in that data may be handled poorly.
- **Limitations:** The model may generate inaccurate or misleading responses in highly technical scenarios.
- **Mitigation:** Users should validate outputs before relying on them for critical decision-making.

## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Harshraj8721/agri_finetuned_model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_text = "Explain Delta Lake architecture"
inputs = tokenizer(input_text, return_tensors="pt")

# Cap the output length and silence the pad-token warning (GPT-2 has no pad token)
output = model.generate(**inputs, max_new_tokens=100, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Training Details

### Training Data

- **Dataset:** Proprietary dataset of technical blogs, data engineering articles, and structured datasets.
- **Preprocessing:** Tokenization with GPT-2's byte-level Byte Pair Encoding (BPE).

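To illustrate the BPE idea, here is a toy sketch of a single merge step: count adjacent symbol pairs and merge the most frequent one. This is illustrative only; the actual GPT-2 tokenizer applies a fixed, pre-learned merge table rather than learning merges at encode time.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent symbol pairs and return the most frequent one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

tokens = list("aababcabc")                # start from individual characters
pair = most_frequent_pair(tokens)         # ('a', 'b') occurs three times
tokens = merge_pair(tokens, pair)
print(tokens)                             # -> ['a', 'ab', 'ab', 'c', 'ab', 'c']
```

Repeating this step until a target vocabulary size is reached is, in essence, how a BPE vocabulary is learned.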
### Training Procedure

#### Hyperparameters

- **Batch size:** 16
- **Learning rate:** 3e-5
- **Precision:** fp16 mixed precision
- **Optimizer:** AdamW

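For reference, these settings map onto a standard PyTorch training step. A minimal sketch with a placeholder module (assuming `torch` is installed; the real run would use the GPT-2 model, a dataloader, and fp16 autocast rather than the dummy loss shown here):

```python
import torch

model = torch.nn.Linear(8, 8)  # placeholder standing in for the GPT-2 model
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)  # AdamW, lr 3e-5 as above

batch = torch.randn(16, 8)           # batch size 16
loss = model(batch).pow(2).mean()    # dummy loss for illustration
loss.backward()
optimizer.step()
optimizer.zero_grad()
```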
#### Compute Infrastructure

- **Hardware:** NVIDIA A100 GPUs (x4)
- **Cloud Provider:** AWS / Azure / GCP
- **Training Duration:** ~36 hours

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- Synthetic datasets from AI-powered analytics use cases
- Real-world structured datasets from data engineering pipelines

#### Metrics

- **Perplexity (PPL):** Measures how well the model predicts the next token; lower is better
- **BLEU Score:** Measures n-gram overlap between generated and reference text; higher is better
- **F1 Score:** Harmonic mean of precision and recall

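Of these, perplexity follows directly from the model's loss: it is the exponential of the mean per-token cross-entropy. A minimal sketch of the relationship (the loss values here are illustrative, not taken from this evaluation):

```python
import math

def perplexity(token_nll):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(token_nll) / len(token_nll))

losses = [2.1, 2.4, 2.3, 2.2]        # hypothetical per-token losses
print(round(perplexity(losses), 2))  # -> 9.49
```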
#### Results

- **Perplexity:** 9.7 (lower is better)
- **BLEU Score:** 34.2 (higher is better)
- **F1 Score:** 85.5%

## Environmental Impact

- **Hardware Type:** NVIDIA A100 GPUs
- **Hours used:** 36
- **Carbon Emitted:** ~50 kg CO2eq (estimated using the ML CO2 Impact calculator)

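The figure above is consistent with a simple energy-times-intensity estimate. A back-of-the-envelope sketch, in which the GPU power draw and grid carbon intensity are assumptions rather than measured values:

```python
gpus = 4
power_kw = 0.4   # assumed average A100 board power draw, kW
hours = 36
intensity = 0.9  # assumed grid carbon intensity, kg CO2eq per kWh

energy_kwh = gpus * power_kw * hours  # 57.6 kWh
co2_kg = energy_kwh * intensity       # ~51.8 kg CO2eq
print(round(co2_kg, 1))               # -> 51.8
```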
## Citation

If you use this model, please cite it as follows:

```bibtex
@misc{bhoite2025agrifinetunedmodel,
  title={agri\_finetuned\_model},
  author={Harshraj Bhoite},
  year={2025},
  url={https://huggingface.co/Harshraj8721/agri_finetuned_model}
}
```

## Contact

For queries, reach out to:

- **Email:** harshraj8721@gmail.com
- **LinkedIn:** https://www.linkedin.com/in/harshrajb/