Recursive Language Model - 48M (Instruction-Tuned)

An instruction-tuned version of the Recursive Language Model, fine-tuned on the Alpaca dataset for question answering and instruction following.

Model Description

This model is a fine-tuned version of recursive-language-model-48m trained specifically for:

  • ✅ Question answering
  • ✅ Following instructions
  • ✅ Providing direct, relevant answers
  • ✅ General knowledge tasks

Base Model: Recursive Language Model with adaptive depth processing
Fine-tuning Dataset: Alpaca instruction dataset (10,000 samples)
Training Method: Instruction tuning for 3 epochs

Quick Start

Installation

pip install transformers torch

Basic Usage

from transformers import AutoModelForCausalLM, GPT2Tokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Girinath11/recursive-language-model-48m-instruct",
    trust_remote_code=True
)
tokenizer = GPT2Tokenizer.from_pretrained(
    "Girinath11/recursive-language-model-48m-instruct"
)

# Ask a question
question = "What is the capital of France?"
prompt = f"Question: {question}\nAnswer:"

input_ids = tokenizer.encode(prompt, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=50, temperature=0.7, do_sample=True)

answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(answer)

Output:

Question: What is the capital of France?
Answer: The capital of France is Paris.

How to Use

Simple Question Answering

from transformers import AutoModelForCausalLM, GPT2Tokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Girinath11/recursive-language-model-48m-instruct",
    trust_remote_code=True
)
tokenizer = GPT2Tokenizer.from_pretrained(
    "Girinath11/recursive-language-model-48m-instruct"
)

def ask_question(question):
    prompt = f"Question: {question}\nAnswer:"
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    
    outputs = model.generate(
        input_ids,
        max_new_tokens=100,
        temperature=0.7,
        do_sample=True
    )
    
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    answer = response.split("Answer:")[-1].strip()
    
    return answer

# Example usage
print(ask_question("What is artificial intelligence?"))
print(ask_question("How do you make tea?"))
print(ask_question("Who invented the telephone?"))

Interactive Q&A

from transformers import AutoModelForCausalLM, GPT2Tokenizer

# Load model
model = AutoModelForCausalLM.from_pretrained(
    "Girinath11/recursive-language-model-48m-instruct",
    trust_remote_code=True
)
tokenizer = GPT2Tokenizer.from_pretrained(
    "Girinath11/recursive-language-model-48m-instruct"
)

# Interactive loop
print("Ask me anything! (Type 'quit' to exit)")

while True:
    question = input("\nYour question: ").strip()
    
    if question.lower() in ['quit', 'exit', 'q']:
        break
    
    prompt = f"Question: {question}\nAnswer:"
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    
    outputs = model.generate(input_ids, max_new_tokens=100, temperature=0.7, do_sample=True)
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    answer = response.split("Answer:")[-1].strip()
    
    print(f"Answer: {answer}")

Batch Questions

from transformers import AutoModelForCausalLM, GPT2Tokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Girinath11/recursive-language-model-48m-instruct",
    trust_remote_code=True
)
tokenizer = GPT2Tokenizer.from_pretrained(
    "Girinath11/recursive-language-model-48m-instruct"
)

questions = [
    "What is the largest planet in our solar system?",
    "How does photosynthesis work?",
    "What is the speed of light?",
    "Who wrote Romeo and Juliet?",
    "What causes earthquakes?"
]

for question in questions:
    prompt = f"Question: {question}\nAnswer:"
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    
    outputs = model.generate(input_ids, max_new_tokens=80, temperature=0.7, do_sample=True)
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    answer = response.split("Answer:")[-1].strip()
    
    print(f"Q: {question}")
    print(f"A: {answer}\n")

Controlling Response Length and Style

from transformers import AutoModelForCausalLM, GPT2Tokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Girinath11/recursive-language-model-48m-instruct",
    trust_remote_code=True
)
tokenizer = GPT2Tokenizer.from_pretrained(
    "Girinath11/recursive-language-model-48m-instruct"
)

question = "What is climate change?"
prompt = f"Question: {question}\nAnswer:"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Short answer
short_output = model.generate(input_ids, max_new_tokens=30, temperature=0.5, do_sample=True)
short_answer = tokenizer.decode(short_output[0], skip_special_tokens=True)

# Detailed answer
detailed_output = model.generate(input_ids, max_new_tokens=150, temperature=0.7, do_sample=True)
detailed_answer = tokenizer.decode(detailed_output[0], skip_special_tokens=True)

print("Short answer:", short_answer.split("Answer:")[-1].strip())
print("\nDetailed answer:", detailed_answer.split("Answer:")[-1].strip())

Model Details

Architecture

Component           Value
Base Model          Recursive Language Model 48M
Parameters          47,931,907 (~48M)
Vocabulary          50,257 tokens (GPT-2)
Context Length      256 tokens
Fine-tuning Method  Instruction tuning

Fine-tuning Details

Dataset:

  • Source: Alpaca instruction dataset
  • Training samples: 9,500
  • Validation samples: 500
  • Total instructions: 10,000
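
Each Alpaca record pairs an instruction (optionally with an extra input field) and a target output. Since this card's inference format is "Question: ... Answer:", a plausible preprocessing step is to map records into that format; the sketch below is an assumption about the mapping, not the verified training code:

```python
def to_qa_example(record):
    """Map one Alpaca record into the Question/Answer format this card
    uses at inference time. Field names ("instruction", "input",
    "output") follow the Alpaca dataset; the exact mapping used during
    fine-tuning is an assumption."""
    question = record["instruction"]
    if record.get("input"):  # some Alpaca records carry extra context
        question = f"{question}\n{record['input']}"
    return f"Question: {question}\nAnswer: {record['output']}"

print(to_qa_example({
    "instruction": "Name the capital of France.",
    "input": "",
    "output": "Paris.",
}))
```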

Training Configuration:

Base Model: recursive-language-model-48m
Batch Size: 16
Epochs: 3
Learning Rate: 2e-5
Optimizer: AdamW
Sequence Length: 256
Training Time: ~13.5 minutes
Hardware: NVIDIA T4 GPU

Training Results:

  • Training completed successfully in 3 epochs
  • Model converged without overfitting
  • Smooth loss decrease across epochs

Example Outputs

General Knowledge

Input: "Question: What is the capital of India?\nAnswer:"
Output: "The capital of India is New Delhi."

How-to Questions

Input: "Question: How do you make coffee?\nAnswer:"
Output: "To make coffee, you need coffee beans, water, and a coffee maker. Grind the beans, add them to the filter, pour hot water, and brew."

Explanations

Input: "Question: What is machine learning?\nAnswer:"
Output: "Machine learning is a subset of artificial intelligence where computers learn from data to make predictions or decisions without being explicitly programmed."

Limitations

  1. Short Context: the 256-token context window cannot accommodate very long questions or contexts
  2. Knowledge Cutoff: Limited to training data knowledge (Alpaca dataset)
  3. Factual Accuracy: May generate plausible but incorrect information
  4. Complex Reasoning: Limited capability for multi-step reasoning tasks
  5. Language: Primarily English, limited multilingual support

Recommended Use Cases

Good for:

  • Simple question answering
  • Educational Q&A systems
  • General knowledge queries
  • Quick information lookup
  • Prototyping chatbots
  • Learning about instruction tuning

Not recommended for:

  • Medical or legal advice
  • Financial decision making
  • Safety-critical applications
  • Long-form content generation
  • Complex multi-step reasoning
  • Real-time production systems

Usage Tips

Best Prompt Format

Always use this format for best results:

Question: [Your question here]
Answer:
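
The examples above repeat the same prompt-building and answer-extraction logic; the two helpers below (hypothetical names, not part of the model's API) factor it out:

```python
def build_prompt(question):
    """Wrap a question in the Question/Answer format the model was tuned on."""
    return f"Question: {question}\nAnswer:"

def extract_answer(decoded):
    """Return the text after the last 'Answer:' marker in the decoded
    output, mirroring the split used in the examples above."""
    return decoded.split("Answer:")[-1].strip()

print(extract_answer("Question: What is 2+2?\nAnswer: 4"))  # 4
```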

Temperature Settings

Temperature only takes effect when sampling is enabled (do_sample=True); greedy decoding ignores it.

  • temperature=0.5 - More focused, deterministic answers
  • temperature=0.7 - Balanced (recommended)
  • temperature=1.0 - More creative, varied responses

Token Length

  • Short answers: max_new_tokens=30-50
  • Medium answers: max_new_tokens=80-100
  • Detailed answers: max_new_tokens=150-200
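
The temperature and token-length guidance above can be bundled into generation presets; the values below are illustrative picks from the recommended ranges, not settings the model was evaluated with:

```python
def generation_kwargs(style="medium"):
    """Map an answer style to generate() keyword arguments, following
    the temperature and token-length guidance above."""
    presets = {
        "short":    {"max_new_tokens": 40,  "temperature": 0.5},
        "medium":   {"max_new_tokens": 90,  "temperature": 0.7},
        "detailed": {"max_new_tokens": 180, "temperature": 0.7},
    }
    kwargs = dict(presets[style])
    kwargs["do_sample"] = True  # temperature only applies when sampling
    return kwargs

# e.g. outputs = model.generate(input_ids, **generation_kwargs("short"))
print(generation_kwargs("short"))
```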

Comparison with Base Model

Feature   Base Model          Instruction-Tuned
Task      Text completion     Question answering
Format    Continues text      Provides answers
Use case  General generation  Instruction following
Training  Pre-training        Fine-tuning

Citation

@misc{girinath2025recursive_instruct,
  author = {Girinath V},
  title = {Recursive Language Model 48M - Instruction Tuned},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Girinath11/recursive-language-model-48m-instruct}}
}

License

Apache 2.0 License

Model Card Authors

Girinath V (@Girinath11)

Contact

For questions or feedback:


Model Version: 1.0
Release Date: January 2025
Status: Stable
Framework: PyTorch 2.0+
Transformers: 4.35+
