---
license: mit
---
## Model Summary

**OpenCelestial_1** is a compact, efficient language model fine-tuned on a greeting dataset. It demonstrates that small LLMs can deliver solid conversational behavior even when trained on consumer-grade hardware.

Based on the GPT-2 architecture, OpenCelestial_1 is optimized for clear, polite, and structured responses, making it suitable for use cases such as:
- Chatbots
- Instruction-following assistants
- Lightweight deployments on limited hardware
## Model Training

- **Base model:** openai-community/gpt2
- **Dataset:** custom greeting dataset with structured "User" and "AI" dialogue pairs
- **Hardware:** fine-tuned on a single NVIDIA RTX 3060
- **Optimization:** LoRA (Low-Rank Adaptation) for memory-efficient fine-tuning
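The card does not publish the exact training template, but the prompt format used by the inference script suggests each "User"/"AI" pair was rendered into a single training string roughly like this (a sketch; the `format_example` helper and exact wording are assumptions, not the released preprocessing code):

```python
# Hypothetical preprocessing sketch: render one "User"/"AI" dialogue pair
# into the instruction/response template that the inference script expects.
SYSTEM_PROMPT = (
    "You are an intelligent AI assistant that will answer every question "
    "to the best of your ability. Be clear and polite with your answers."
)

def format_example(user_text: str, ai_text: str) -> str:
    """Render one dialogue pair as a single training string."""
    return (
        f"{SYSTEM_PROMPT}\n"
        f"### Instruction:\n{user_text}\n"
        f"### Response:\n{ai_text}"
    )

print(format_example("Hello there!", "Hello! How can I help you today?"))
```

Because the response always follows the `### Response:` marker, the inference script can recover just the model's answer by splitting on that marker.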
## Usage Example

Install the dependencies:

```shell
pip install transformers torch
```

Then run the following Python script to interact with OpenCelestial_1:
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

# Load the model and tokenizer
model_path = "theaithinker/OpenCelestial_1"
model = GPT2LMHeadModel.from_pretrained(model_path)
tokenizer = GPT2Tokenizer.from_pretrained(model_path)

# Set the pad token to the EOS token if not already set
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# System prompt prepended to every turn
system_prompt = (
    "You are an intelligent AI assistant that will answer every question "
    "to the best of your ability. Be clear and polite with your answers."
)

print("Chatbot is ready! Type 'exit' to quit.")

while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        print("Chatbot: Goodbye!")
        break

    # Build the full prompt in the model's instruction/response format
    prompt = f"{system_prompt}\n### Instruction:\n{user_input}\n### Response:"

    # Tokenize the input
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=1024,
    )
    input_ids = inputs.input_ids.to(model.device)
    attention_mask = inputs.attention_mask.to(model.device)

    # Generate the response
    with torch.no_grad():
        outputs = model.generate(
            input_ids=input_ids,
            attention_mask=attention_mask,
            max_new_tokens=150,
            pad_token_id=tokenizer.eos_token_id,
            do_sample=True,
            temperature=0.7,
            top_k=50,
            top_p=0.95,
        )

    # Decode and keep only the text after the "### Response:" marker
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    clean_response = response.split("### Response:")[-1].strip()
    print(f"Chatbot: {clean_response}")
```
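For intuition, the sampling settings passed to `generate` (temperature 0.7, top-k 50, top-p 0.95) can be sketched in plain NumPy. This toy `sample_logits` helper is illustrative only, not transformers' actual implementation:

```python
import numpy as np

def sample_logits(logits, temperature=0.7, top_k=50, top_p=0.95, seed=0):
    """Toy version of temperature + top-k + nucleus (top-p) sampling."""
    rng = np.random.default_rng(seed)
    logits = np.asarray(logits, dtype=np.float64) / temperature

    # Top-k: drop everything below the k-th largest logit
    if top_k < logits.size:
        kth_largest = np.sort(logits)[-top_k]
        logits = np.where(logits < kth_largest, -np.inf, logits)

    # Softmax over the surviving logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top-p: keep the smallest set of tokens whose cumulative mass reaches top_p
    order = np.argsort(probs)[::-1]
    cutoff = int(np.searchsorted(np.cumsum(probs[order]), top_p)) + 1
    mask = np.zeros_like(probs)
    mask[order[:cutoff]] = probs[order[:cutoff]]
    mask /= mask.sum()

    return int(rng.choice(len(probs), p=mask))
```

Lower temperature sharpens the distribution, top-k caps the candidate pool at a fixed size, and top-p trims it further to the smallest set covering 95% of the probability mass.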
## Example Outputs

**Prompt:** Hello there!
**Response:** Hello there! I am just an AI assistant, but I’m here to help you with anything you need.

**Prompt:** Can you tell me a joke?
**Response:** Sure! Why don’t skeletons fight each other? Because they don’t have the guts!

**Prompt:** What is the capital of France?
**Response:** The capital of France is Paris.
## Training Details

**LoRA configuration:**

- Rank (r): 4
- Alpha: 16
- Dropout: 0.1
- Target modules: GPT-2’s attention layers (`attn.c_attn`)

**Training arguments:**

- Mixed precision: enabled (fp16)
- Epochs: 3
- Batch size: 2 (to fit GPU memory)
- Learning rate: 5e-5
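For intuition, LoRA with these settings replaces full fine-tuning of a weight matrix with a trainable rank-r update while the pretrained weight stays frozen. A minimal NumPy sketch (the dimensions assume GPT-2's `attn.c_attn`, which projects 768 features to 2304):

```python
import numpy as np

d_in, d_out = 768, 2304   # GPT-2 attn.c_attn: 768 -> 2304 (q, k, v stacked)
r, alpha = 4, 16          # rank and alpha from the configuration above

rng = np.random.default_rng(0)
W0 = rng.standard_normal((d_out, d_in))    # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero-init

# Effective weight during fine-tuning: W = W0 + (alpha / r) * B @ A
# Zero-initializing B means the update starts at exactly zero.
W = W0 + (alpha / r) * (B @ A)

trainable = A.size + B.size                # r * (d_in + d_out)
frozen = W0.size                           # d_in * d_out
print(f"trainable: {trainable} ({100 * trainable / frozen:.2f}% of the full matrix)")
```

With r = 4, only about 0.7% of the matrix's parameters are trained, which is what lets a 12 GB card like the RTX 3060 handle the job.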
## Performance

OpenCelestial_1 demonstrates:

- Clear conversational ability with polite, structured responses
- Low resource requirements, suitable for mid-range GPUs such as the RTX 3060
- Consistent behavior on instruction-following tasks
## Intended Use

This model is designed for:

- Conversational AI applications
- Instruction-based assistants that respond politely and clearly
- Lightweight deployments for hobbyists, small-scale developers, and educational purposes
## Limitations

- Responses may contain hallucinations or factual inaccuracies.
- Performance is bounded by the dataset’s scope and GPT-2’s inherent capabilities.
## Citation

If you use OpenCelestial_1 in your work, please consider citing:

```bibtex
@misc{OpenCelestial_1,
  author       = {Your Name or Organization},
  title        = {OpenCelestial_1: A Compact GPT-2 Fine-Tuned Model},
  year         = {2024},
  howpublished = {\url{https://huggingface.co/your_username/OpenCelestial_1}},
}
```
## Acknowledgments

- Base model: openai-community/gpt2
- Fine-tuned with LoRA for efficient memory usage
- Developed on a single NVIDIA RTX 3060 GPU