---
library_name: transformers
license: mit
datasets:
- worldbank-datause/PRWP
base_model:
- openai-community/gpt2
pipeline_tag: text-generation
---
# Model Card for agri_finetuned_model

## Model Details

### Model Description
This is a transformers-based model fine-tuned for generative AI tasks, particularly in data engineering and AI service applications. It has been optimized for structured text generation, analytics, and AI-assisted workflows. The model supports multi-turn interactions and is designed for business intelligence, data insights, and technical documentation generation.
- **Developed by:** Harshraj Bhoite
- **Funded by:** Self-funded
- **Shared by:** Harshraj Bhoite
- **Model type:** Transformer-based (GPT-2)
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** openai-community/gpt2

### Model Sources

- **Repository:** https://huggingface.co/Harshraj8721/agri_finetuned_model
## Uses

### Direct Use

- AI-assisted data engineering documentation generation
- Business intelligence reports and data insights automation
- Technical content creation for AI and analytics

### Downstream Use

- Fine-tuning for agriculture-specific AI applications
- Conversational AI in data analytics applications
- AI-driven customer support for analytics tools

### Out-of-Scope Use

- Not intended for real-time conversational AI without further optimization
- May not perform well in non-English languages
## Bias, Risks, and Limitations

- **Bias:** Model performance may be influenced by the composition of the fine-tuning dataset.
- **Limitations:** The model may generate inaccurate or misleading responses in highly technical scenarios.
- **Mitigation:** Users should validate outputs before relying on them for critical decision-making.
## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Harshraj8721/agri_finetuned_model"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Encode the prompt and generate a continuation
input_text = "Explain Delta Lake architecture"
inputs = tokenizer(input_text, return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=100,                   # cap the length of the continuation
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
## Training Details

### Training Data

- **Dataset:** Proprietary dataset of technical blogs, data engineering articles, and structured datasets.
- **Preprocessing:** Tokenization with byte-level Byte Pair Encoding (BPE), as used by GPT-2.
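To illustrate the idea behind BPE (not the actual GPT-2 tokenizer, which is learned over bytes at much larger scale), here is a minimal pure-Python sketch of one merge step on a toy corpus: count adjacent symbol pairs, then merge the most frequent pair into a new symbol.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across a corpus of (symbols -> frequency)."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: words split into characters, with frequencies
words = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2}
pair = most_frequent_pair(words)  # ("l", "o"), seen 7 times
words = merge_pair(words, pair)
```

Repeating this merge step until a target vocabulary size is reached yields the tokenizer's merge table.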
### Training Procedure

#### Hyperparameters

- **Batch size:** 16
- **Learning rate:** 3e-5
- **Precision:** fp16 mixed precision
- **Optimizer:** AdamW
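As a sketch, the hyperparameters above would map onto a Hugging Face `TrainingArguments`-style configuration roughly as follows. Field names mirror `TrainingArguments`; values not stated in this card (epochs, warmup) are assumptions.

```python
# Sketch of the fine-tuning configuration implied by the hyperparameters above.
# Only batch size, learning rate, precision, and optimizer come from the card;
# the remaining values are assumptions for illustration.
training_config = {
    "per_device_train_batch_size": 16,  # batch size from the card
    "learning_rate": 3e-5,              # AdamW learning rate from the card
    "fp16": True,                       # mixed-precision training
    "optim": "adamw_torch",             # AdamW optimizer
    "num_train_epochs": 3,              # assumption: not stated in the card
    "warmup_ratio": 0.05,               # assumption: not stated in the card
}
```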
#### Compute Infrastructure

- **Hardware:** 4x NVIDIA A100 GPUs
- **Cloud Provider:** AWS / Azure / GCP
- **Training Duration:** ~36 hours
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- Synthetic datasets from AI-powered analytics use cases
- Real-world structured datasets from data engineering pipelines

#### Metrics

- **Perplexity (PPL):** Measures how well the model predicts held-out text
- **BLEU Score:** Evaluates generated text against reference text
- **F1 Score:** Balances precision and recall

### Results

- **Perplexity:** 9.7 (lower is better)
- **BLEU Score:** 34.2 (higher is better)
- **F1 Score:** 85.5%
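For reference, perplexity is simply the exponential of the mean per-token cross-entropy (negative log-likelihood) on held-out text. A minimal sketch, using made-up per-token losses rather than values from this evaluation:

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Hypothetical per-token losses from a held-out set (illustrative only)
losses = [2.1, 2.4, 2.3, 2.2]
ppl = perplexity(losses)  # exp(2.25) ≈ 9.49
```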
## Environmental Impact

- **Hardware Type:** NVIDIA A100 GPUs
- **Hours used:** 36 hours
- **Carbon Emitted:** ~50 kg CO2eq (estimated using the ML CO2 Impact calculator)
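The ~50 kg figure can be sanity-checked with the usual power x time x grid-intensity formula. Every input below is an illustrative assumption, not a measured value from the training run:

```python
# Rough sanity check of the carbon estimate. All inputs are assumptions.
gpu_count = 4               # from the card: 4x NVIDIA A100
gpu_power_kw = 0.4          # assumption: ~400 W per A100 under load
pue = 1.5                   # assumption: datacenter power usage effectiveness
hours = 36                  # from the card
grid_kgco2_per_kwh = 0.57   # assumption: grid carbon intensity

energy_kwh = gpu_count * gpu_power_kw * pue * hours  # 86.4 kWh
co2_kg = energy_kwh * grid_kgco2_per_kwh             # ≈ 49 kg CO2eq
```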
## Citation

If you use this model, please cite it as follows:

```bibtex
@misc{bhoite2025agrifinetunedmodel,
  title  = {agri_finetuned_model},
  author = {Harshraj Bhoite},
  year   = {2025},
  url    = {https://huggingface.co/Harshraj8721/agri_finetuned_model}
}
```
## Contact

For queries, reach out to:

- **Email:** harshraj8721@gmail.com
- **LinkedIn:** https://www.linkedin.com/in/harshrajb/