
Custom LLM Model

A small custom-built transformer language model trained on 10 example sentences, ranging from small talk to AI and machine learning topics.

Model Description

This is a demonstration model built to show how to create a custom model and publish it to Hugging Face. It is a transformer-based language model with the following configuration:

  • Architecture: Transformer decoder
  • Vocabulary Size: 40 (character-level)
  • Hidden Size: 256
  • Number of Layers: 4
  • Number of Attention Heads: 8
  • Feedforward Size: 1024
  • Max Sequence Length: 64
  • Parameters: ~3.2M
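
The parameter figure can be sanity-checked from the other values in the list. The sketch below is an estimate under assumptions that this card does not confirm: a GPT-style block layout with biases on every linear layer, learned positional embeddings, two LayerNorms per block plus a final one, and an output head tied to the token embedding. The function name `count_parameters` is made up for this example. The number of attention heads does not change the count, because the heads split the hidden size rather than add to it.

```python
def count_parameters(vocab_size=40, hidden=256, layers=4, ff=1024, max_len=64):
    """Estimate trainable parameters for a GPT-style decoder (assumed layout)."""
    # Token embeddings plus learned positional embeddings (assumption).
    embeddings = vocab_size * hidden + max_len * hidden
    # Self-attention: Q, K, V and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Feedforward: two linear layers with biases.
    feedforward = (hidden * ff + ff) + (ff * hidden + hidden)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * hidden)
    per_layer = attention + feedforward + layer_norms
    # Final LayerNorm; the output head is assumed tied to the token embedding.
    final_norm = 2 * hidden
    return embeddings + layers * per_layer + final_norm


total = count_parameters()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
# prints: 3,186,176 parameters (~3.2M)
```

Under these assumptions the estimate agrees with the ~3.2M listed above. An untied output head would add another 40 × 256 weights, which does not change the rounded figure.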

Training Data

The model was trained on a small custom dataset containing 10 example sentences about:

  • Greetings and small talk
  • Weather descriptions
  • Machine learning concepts
  • Deep learning and transformers
  • Natural language processing
  • Model publishing and sharing
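
A character-level vocabulary like the 40-entry one listed above could be derived from a small corpus roughly as follows. This is only a sketch: the sentences are invented for illustration and are not the model's actual training data, and the names `corpus`, `encode`, and `decode` are hypothetical rather than part of the published tokenizer.

```python
# Hypothetical sentences for illustration; not the model's actual training data.
corpus = [
    "hello, how are you?",
    "the weather is sunny today.",
    "transformers use attention.",
]

# Collect every distinct character and assign ids in sorted order.
chars = sorted(set("".join(corpus)))
char_to_id = {ch: i for i, ch in enumerate(chars)}
id_to_char = {i: ch for ch, i in char_to_id.items()}


def encode(text):
    """Map a string to character ids (raises KeyError on unseen characters)."""
    return [char_to_id[ch] for ch in text]


def decode(ids):
    """Map a list of character ids back to a string."""
    return "".join(id_to_char[i] for i in ids)


ids = encode("hello")
print(len(chars), ids, decode(ids))
```

A real tokenizer would also reserve ids for special tokens such as padding and end-of-sequence, and would need a fallback for characters that never appear in the corpus.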

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "your-username/custom-llm-model"  # Replace with your HF username
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

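# Generate text
def generate_text(prompt, max_length=50, temperature=0.8):
    """Sample a continuation of `prompt`; `max_length` counts prompt tokens too."""
    inputs = tokenizer.encode(prompt, return_tensors="pt")

    # Inference only, so skip gradient tracking
    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_length=max_length,  # keep at or below the 64-token max sequence length
            temperature=temperature,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )

    # Decode the first (only) generated sequence back to text
    return tokenizer.decode(outputs[0], skip_special_tokens=True)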

# Example usage
print(generate_text("Hello"))
print(generate_text("The weather"))
print(generate_text("Deep learning"))
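
The `temperature` argument above controls how sharply sampling favours the highest-scoring token. The sketch below shows the usual idea in plain Python: logits are divided by the temperature before the softmax. The function name `softmax_with_temperature` and the logit values are made up for illustration; this is not a copy of the library's internal sampling code.

```python
import math


def softmax_with_temperature(logits, temperature=0.8):
    """Convert raw scores to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (0.5, 0.8, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top token and give more repetitive output, while higher temperatures flatten the distribution and give more varied output.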

Limitations

This is a small demonstration model trained on very limited data (10 sentences), so it is not suitable for real use. For serious applications, consider:

  • Using larger datasets
  • Training for more epochs
  • Using larger model architectures
  • Using subword tokenization (BPE, WordPiece, etc.) instead of a character-level vocabulary
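
To give a feel for the tokenization point, here is a deliberately simplified sketch of a single BPE training step: count adjacent symbol pairs, then merge the most frequent pair into one symbol. The helper names `most_frequent_pair` and `merge_pair` and the example words are hypothetical. Real BPE implementations also weight pairs by word frequency, handle word boundaries, and repeat the step until a target vocabulary size is reached.

```python
from collections import Counter


def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words and return the most common one."""
    pairs = Counter()
    for symbols in words:
        pairs.update(zip(symbols, symbols[1:]))
    return pairs.most_common(1)[0][0]


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = []
    for symbols in words:
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged.append(out)
    return merged


# Made-up words, each starting as a list of single characters.
words = [list("learning"), list("learned"), list("deep")]
pair = most_frequent_pair(words)
words = merge_pair(words, pair)
print(pair, words)
```

In practice a library such as Hugging Face `tokenizers` would be used instead of hand-written merges.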

License

This model is released under the MIT License.

Citation

@misc{custom_llm_model,
  author = {Your Name},
  title = {Custom LLM Model},
  year = {2026},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  doi = {10.57967/hf/0000}
}