---
library_name: transformers
tags:
  - llama-3.2
  - llama
  - text-generation
  - conversational
  - fine-tuned
  - lora
  - qlora
  - generated_from_trainer
  - it-support
  - synthetic-data
base_model: meta-llama/Llama-3.2-3B-Instruct
license: llama3.2
language:
  - en
datasets:
  - NotSure123/grumpy-it-dataset
---

# Model Card for Grumpy-IT-Llama-3.2

## Model Details

### Model Description

Grumpy-IT-Llama-3.2 is a specialized fine-tune of the Llama-3.2-3B-Instruct model, designed to simulate a highly competent but socially exhausted Systems Administrator.

The model was trained using Persona Steering techniques to prioritize technical accuracy and brevity while strictly refusing non-technical "waste-of-time" requests (e.g., fixing chairs, coffee machines) with a sarcastic or direct tone. It serves as a demonstration of controlling LLM personality alignment using synthetic data and QLoRA.
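The QLoRA setup described above can be sketched as follows. This is an illustrative reconstruction, not the author's actual training configuration: the ranks, dropout, and target modules in `QLORA_HPARAMS` are typical defaults and are assumptions.

```python
# Illustrative QLoRA configuration for a Llama-3.2-3B persona fine-tune.
# All hyperparameters below are assumptions, not the published recipe.
QLORA_HPARAMS = {
    "r": 16,                # LoRA rank
    "lora_alpha": 32,       # LoRA scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

def make_peft_model(base_id: str = "meta-llama/Llama-3.2-3B-Instruct"):
    """Load the base model in 4-bit and attach LoRA adapters (QLoRA)."""
    # Heavy imports are kept local so the config above can be inspected
    # without a GPU or the full dependency stack installed.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    bnb = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        base_id, quantization_config=bnb, device_map="auto"
    )
    return get_peft_model(model, LoraConfig(task_type="CAUSAL_LM", **QLORA_HPARAMS))
```

The adapter-wrapped model can then be passed to any standard trainer (e.g. `trl.SFTTrainer`) over the synthetic persona dataset.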

- **Developed by:** Ashwath Srinivasan
- **Model type:** Causal Language Model (QLoRA Fine-tune)
- **Language(s) (NLP):** English (en)
- **License:** Llama 3.2 Community License
- **Finetuned from model:** meta-llama/Llama-3.2-3B-Instruct

### Model Sources

## Uses

### Direct Use

The model is intended for:

1. **Simulation & Testing:** Testing how users interact with "difficult" or "direct" AI personalities.
2. **IT Triage:** Automatically identifying and filtering out non-technical requests in a support queue context.
3. **Entertainment:** As a chatbot that provides a humorous, cynical take on tech support.

### Out-of-Scope Use

- **General Purpose Assistance:** This model is not a helpful assistant. It will likely refuse to write poems, summarize general news, or be polite.
- **Mental Health/Sensitive Contexts:** The model's abrasive tone makes it unsuitable for sensitive user interactions.

## Bias, Risks, and Limitations

This model is intentionally biased to be disagreeable and sarcastic.

- **Tone:** It may produce output that users find rude or offensive. This is a design feature, not a bug.
- **Hallucination:** Like all small LLMs (3B parameters), it may hallucinate technical commands, though the training data prioritized accurate CLI commands.
- **Safety:** While it adheres to Llama 3.2 safety guardrails, its "mean" persona should not be deployed in customer-facing enterprise environments without a filtering layer.
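The filtering layer mentioned above could be as simple as a pattern check on the model's replies before they reach an end user. This is a hypothetical sketch; the phrase list is illustrative only and not derived from the model's actual outputs.

```python
import re

# Hypothetical downstream filter: flag replies whose tone is too abrasive
# for a customer-facing channel. Extend the pattern list (or swap in a
# toxicity classifier) for real deployments.
ABRASIVE_PATTERNS = [
    r"\bnot my (job|problem)\b",
    r"\bwaste of (my )?time\b",
    r"\bdid you (even )?try turning it off and on\b",
]

def needs_softening(reply: str) -> bool:
    """Return True when a reply matches one of the abrasive phrases."""
    return any(re.search(p, reply, re.IGNORECASE) for p in ABRASIVE_PATTERNS)
```

Flagged replies can then be rewritten by a politer model or routed to a human agent.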

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

## How to Get Started with the Model

See the GitHub repository.
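A minimal inference sketch with `transformers` is shown below. The model id is an assumption inferred from the repo name (`NotSure123/grumpy-llama-3.2-3B`), and the system prompt is illustrative — the fine-tune may already bake the persona in, so adjust both as needed.

```python
# Assumed model id; adjust to the actual Hub repo.
MODEL_ID = "NotSure123/grumpy-llama-3.2-3B"

SYSTEM_PROMPT = (
    "You are a grumpy but highly competent systems administrator. "
    "Answer technical questions accurately and briefly."
)

def build_messages(user_prompt: str) -> list:
    """Wrap a request in the chat format Llama-3.2-Instruct models expect."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]

def ask(user_prompt: str, model_id: str = MODEL_ID) -> str:
    # Heavy import kept local so build_messages() works without a GPU.
    from transformers import pipeline

    chat = pipeline("text-generation", model=model_id, device_map="auto")
    out = chat(build_messages(user_prompt), max_new_tokens=128)
    # Chat pipelines return the full message list; the last entry is the
    # assistant's reply.
    return out[0]["generated_text"][-1]["content"]
```

For example, `ask("My printer says PC LOAD LETTER.")` should return a terse, sarcastic but technically sound reply.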

## Training Details

### Training Data

[More Information Needed]

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]