LLMWriter-8B

LLMWriter-8B is a fine-tuned version of Llama 3.1 8B designed to improve writing quality, instruction following, content generation, and structured response creation.

  • Developed by: aldenirsrv
  • License: Apache 2.0
  • Finetuned from model: unsloth/Llama-3.1-8B

This model was trained with LoRA (Low-Rank Adaptation) using the Unsloth framework and Hugging Face's TRL library. According to Unsloth, this combination trains about 2x faster.


Model Description

LLMWriter-8B is focused on:

  • Content creation
  • Technical writing
  • Documentation generation
  • Blog writing
  • Social media content
  • Instruction following
  • Writing assistance
  • General-purpose text generation

The model was trained on a curated instruction-following dataset of 10,931 examples: 10,384 for training and 547 held out for testing.


Training Dataset

DatasetDict({
    train: Dataset({
        features: ['instruction', 'output', 'source', 'score'],
        num_rows: 10384
    })
    test: Dataset({
        features: ['instruction', 'output', 'source', 'score'],
        num_rows: 547
    })
})

Split    Samples
Train    10,384
Test     547
Total    10,931
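
The split sizes are consistent with a 5% holdout. The card does not say how the split was made, so the sketch below only checks the arithmetic, assuming a 0.05 test fraction and the usual convention of rounding the test set up and the train set down.

```python
import math

TOTAL_EXAMPLES = 10_931   # train + test, from the split counts above
TEST_FRACTION = 0.05      # assumption: not stated in the model card


def split_sizes(total: int, test_fraction: float) -> tuple[int, int]:
    """Return (train, test) sizes: test rounds up, train rounds down."""
    n_test = math.ceil(total * test_fraction)
    n_train = math.floor(total * (1 - test_fraction))
    return n_train, n_test


train_rows, test_rows = split_sizes(TOTAL_EXAMPLES, TEST_FRACTION)
print(train_rows, test_rows)
```

Under these assumptions the result matches the reported 10,384 / 547 split.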

Training Configuration

Parameter                Value
Base Model               Llama 3.1 8B
Fine-Tuning Method       LoRA
Framework                Unsloth
Epochs                   3
GPUs                     1
Batch Size               2
Gradient Accumulation    8
Effective Batch Size     16
Trainable Parameters     83,886,080
Total Parameters         8,114,147,328
Percentage Trained       1.03%
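
The trainable-parameter count can be cross-checked against the architecture. The sketch below uses the published Llama 3.1 8B dimensions (hidden size 4096, MLP size 14336, 32 layers, 8 key/value heads of dimension 128) and assumes LoRA rank 32 applied to all seven attention and MLP projections. The rank and target modules are not stated in this card; they are simply one configuration that reproduces the reported number.

```python
# Llama 3.1 8B dimensions (from the published architecture).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 8 * 128          # 8 key/value heads x head dimension 128
NUM_LAYERS = 32

# Assumption: the LoRA rank and target modules are NOT given in the model card.
LORA_RANK = 32
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank: int) -> int:
    """Each adapted (d_in, d_out) layer adds A (rank x d_in) and B (d_out x rank)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in TARGETS.values())
    return per_layer * NUM_LAYERS


trainable = lora_params(LORA_RANK)
total = 8_114_147_328     # total parameters reported above
print(f"{trainable:,} trainable ({100 * trainable / total:.2f}% of total)")
```

With these assumed settings the script prints 83,886,080 trainable parameters, or 1.03% of the total, in line with the figures above.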

Training Summary:

Num examples = 10,384
Num Epochs = 3
Total steps = 1,947

Trainable parameters = 83,886,080 of 8,114,147,328 (1.03% trained)
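
The step count follows directly from the dataset size and the batch settings. A minimal check, assuming one optimizer step per effective batch and any partial final batch rounded up (here 10,384 divides evenly by 16, so no rounding occurs):

```python
import math

num_examples = 10_384
per_device_batch = 2
grad_accum = 8
num_gpus = 1
epochs = 3

# Examples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum * num_gpus
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs
print(effective_batch, steps_per_epoch, total_steps)
```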

Training Results

Optimization Metrics

Metric                   Value
Initial Loss             1.2151
Final Loss               0.2187
Average Training Loss    0.5334
Initial Learning Rate    3e-4
Final Learning Rate      1.54e-7
Min Gradient Norm        0.1587
Max Gradient Norm        2.3887

Key Observations

✅ Training loss fell from 1.2151 to 0.2187 over the run

✅ Gradient norms stayed between 0.1587 and 2.3887, with no gradient explosion

✅ Learning rate decayed from 3e-4 to 1.54e-7 by the end of training

✅ Smooth convergence after three epochs

✅ Stable LoRA fine-tuning process
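
The reported final learning rate is consistent with a linear decay to zero over the 1,947 training steps. The scheduler type and warmup length are not stated in this card, so the sketch below assumes a linear schedule with no warmup and evaluates it one step before the schedule reaches zero.

```python
INITIAL_LR = 3e-4
TOTAL_STEPS = 1_947


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Linear warmup to INITIAL_LR, then linear decay to zero at TOTAL_STEPS."""
    if step < warmup_steps:
        return INITIAL_LR * step / max(1, warmup_steps)
    remaining = TOTAL_STEPS - step
    return INITIAL_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


# Assumption: the last logged value is the rate one step before the end.
final_lr = linear_lr(TOTAL_STEPS - 1)
print(f"{final_lr:.2e}")
```

Under this assumption the value rounds to 1.54e-07, matching the table. A short warmup would shift the result only slightly.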


Performance

Metric                Value
Training Time         ~1h41m
Total FLOPs           ~7.66e17
Samples per Second    ~5.1
Steps per Second      ~0.32
Time per Step         ~3.1s

The run completed all 1,947 steps and converged without signs of optimization instability.
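
The throughput figures can be reproduced from the training time and the step count. A quick check, taking the approximate 1h41m wall-clock time at face value:

```python
train_seconds = 1 * 3600 + 41 * 60      # ~1h41m
total_steps = 1_947
samples_seen = 10_384 * 3               # training examples x epochs

steps_per_second = total_steps / train_seconds
samples_per_second = samples_seen / train_seconds
seconds_per_step = train_seconds / total_steps

print(f"{samples_per_second:.1f} samples/s, "
      f"{steps_per_second:.2f} steps/s, "
      f"{seconds_per_step:.1f} s/step")
```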


Intended Use

LLMWriter-8B is intended for:

  • Writing assistance
  • Content generation
  • Blog creation
  • Documentation drafting
  • Technical writing
  • Knowledge articles
  • Social media posts
  • Structured responses
  • General instruction-following tasks

Example Prompt

Write a professional LinkedIn post explaining why Small Language Models (SLMs) are becoming important for enterprise AI adoption.

Usage

vLLM

vllm serve aldenirsrv/LLMWriter-8B \
  --host 0.0.0.0 \
  --port 8888

Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "aldenirsrv/LLMWriter-8B"

tokenizer = AutoTokenizer.from_pretrained(model_name)

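# torch_dtype="auto" loads the weights in the checkpoint's stored precision
# instead of upcasting to float32.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)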

prompt = "Write a blog post introduction about AI governance."

inputs = tokenizer(
    prompt,
    return_tensors="pt"
).to(model.device)

# Sampling must be enabled explicitly for temperature to take effect.
outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,
    temperature=0.7
)

print(
    tokenizer.decode(
        outputs[0],
        skip_special_tokens=True
    )
)

OpenAI-Compatible API (vLLM)

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8888/v1",
    api_key="dummy"
)

response = client.chat.completions.create(
    model="aldenirsrv/LLMWriter-8B",
    messages=[
        {
            "role": "user",
            "content": "Write a blog post about AI governance."
        }
    ]
)

print(response.choices[0].message.content)
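
When the openai package is not installed, the same endpoint can be reached with the Python standard library. The sketch below assumes the vLLM server from the earlier command is listening on localhost:8888. The helper names build_chat_request and send_chat_request and the sampling values are illustrative, not part of this model card.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8888/v1"   # matches the vllm serve command above
MODEL = "aldenirsrv/LLMWriter-8B"


def build_chat_request(prompt: str, max_tokens: int = 300,
                       temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat completion payload (illustrative defaults)."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def send_chat_request(payload: dict) -> str:
    """POST the payload to the server and return the generated text."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (requires the vLLM server to be running):
# print(send_chat_request(build_chat_request("Write a blog post about AI governance.")))
```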

Limitations

  • The model may generate inaccurate information.
  • Outputs should be reviewed before publication.
  • Not intended for legal, medical, or financial advice.
  • Performance depends on prompt quality and task complexity.
  • The model has not been evaluated on standardized benchmark suites.

Future Work

Planned improvements include:

  • Human preference evaluation
  • Benchmark comparisons against the base model
  • Additional instruction tuning
  • Domain-specific fine-tuning
  • Expanded evaluation datasets
  • Quantized deployment variants

Acknowledgements

This model was fine-tuned using:

  • Unsloth
  • Hugging Face Transformers
  • Hugging Face TRL
  • PEFT (LoRA)
  • PyTorch
  • Comet ML

Special thanks to the teams behind Llama, Hugging Face, and Unsloth for enabling efficient open-source model development.


Author

Aldenir Flauzino

Software Engineer specializing in:

  • Distributed Systems
  • Platform Engineering
  • AI Infrastructure
  • Retrieval-Augmented Generation (RAG)
  • Multi-Agent Systems
  • Production AI Platforms

GitHub: https://github.com/aldenirsrv

Website: https://aldenir.me
