Sancara – Instruction-Tuned Text Generation Model
This repository contains the full Sancara text generation model, exported as a standard Hugging Face Transformers checkpoint (model.safetensors + tokenizer).
The model is optimized for instruction following, chat-style dialogue, question answering, and general-purpose text generation.
Model overview
- Repository: Sachin21112004/Sancara_text_generation
- Model type: Causal language model (decoder-only) for text generation
- Language: English
- License: SRL (others)
- Status: Merged, standalone model (not only a LoRA adapter)
The repo includes both:
- A merged full model in model.safetensors, and
- An adapter file adapter_model.safetensors from a previous LoRA-based phase.
For most users, loading model.safetensors via AutoModelForCausalLM is the recommended way to use Sancara.
Files in this repository
Key files:
- model.safetensors – full model weights (~2.84 GB)
- config.json – model architecture and configuration
- generation_config.json – default generation parameters
- tokenizer.json, tokenizer_config.json, vocab.json, merges.txt – tokenizer and BPE merges
- special_tokens_map.json, added_tokens.json – definitions of special and extra tokens
- adapter_model.safetensors – LoRA adapter weights (optional use)
- training_args.bin – serialized Hugging Face Trainer arguments
- checkpoint-12000/, checkpoint-12992/ – intermediate training checkpoints
If you just want to run the model, you only need the main repo id: Sachin21112004/Sancara_text_generation.
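The vocab.json and merges.txt files listed above drive the tokenizer's byte-pair encoding: a word is split into characters and adjacent pairs are greedily merged according to an ordered rule list. The sketch below illustrates the idea with invented toy merge rules, not the model's real merge table:

```python
def bpe_apply(word, merges):
    """Greedily apply BPE merge rules (highest-priority first) to a word
    split into single characters. `merges` plays the role of merges.txt;
    the rules here are toy examples, not Sancara's real merge table."""
    symbols = list(word)
    ranks = {pair: i for i, pair in enumerate(merges)}
    while True:
        # Find the adjacent pair with the best (lowest) merge rank.
        best, best_idx = None, None
        for i in range(len(symbols) - 1):
            pair = (symbols[i], symbols[i + 1])
            if pair in ranks and (best is None or ranks[pair] < ranks[best]):
                best, best_idx = pair, i
        if best is None:
            return symbols
        # Replace the winning pair with its merged symbol.
        symbols[best_idx:best_idx + 2] = [best[0] + best[1]]

toy_merges = [("l", "o"), ("lo", "w"), ("e", "r")]
print(bpe_apply("lower", toy_merges))  # ['low', 'er']
```

The real tokenizer applies the same greedy-merge loop, just with the tens of thousands of rules stored in merges.txt.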
Intended use
Direct use
The model is intended for:
- Instruction following (task-style prompts with clear instructions)
- Chatbots and conversational agents
- Question answering and explanation-style responses
- General lightweight reasoning and text generation
Example applications:
- Personal AI assistants
- Educational or coding helpers
- Internal tools that need a natural language interface
Out-of-scope use
This model is not suitable for:
- Medical, legal, financial, or other professional advice
- High-risk decision-making without human supervision
- Generating harmful, abusive, or disallowed content
Always keep a human in the loop for any sensitive or production-critical usage.
Quick start (inference)
Basic text generation
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_id = "Sachin21112004/Sancara_text_generation"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16, # or float16/float32 depending on hardware
device_map="auto",
)
prompt = "Explain how transformers-based large language models work in simple terms."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
**inputs,
max_new_tokens=256,
temperature=0.7,
top_p=0.9,
do_sample=True,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
You can override generation parameters in the code above, or rely on the defaults shipped with the model in generation_config.json.
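The temperature and top_p parameters used above shape the next-token distribution before sampling. As a rough, pure-Python illustration (toy vocabulary and logits, not the transformers implementation): temperature rescales the logits before the softmax, and nucleus (top-p) sampling keeps only the smallest set of tokens whose cumulative probability reaches top_p.

```python
import math
import random

def sample_next_token(logits, temperature=0.7, top_p=0.9, rng=random):
    """Toy illustration of temperature + nucleus (top-p) sampling over a
    {token: logit} dict. Not the transformers implementation."""
    # Temperature: divide logits before softmax (lower -> sharper).
    scaled = {t: l / temperature for t, l in logits.items()}
    m = max(scaled.values())
    exp = {t: math.exp(l - m) for t, l in scaled.items()}
    z = sum(exp.values())
    probs = sorted(((t, e / z) for t, e in exp.items()),
                   key=lambda kv: kv[1], reverse=True)
    # Nucleus: keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for t, p in probs:
        kept.append((t, p))
        mass += p
        if mass >= top_p:
            break
    # Renormalize over the nucleus and draw one token.
    total = sum(p for _, p in kept)
    r = rng.random() * total
    for t, p in kept:
        r -= p
        if r <= 0:
            return t
    return kept[-1][0]

toy_logits = {"the": 4.0, "a": 3.0, "cat": 1.0, "zzz": -5.0}
print(sample_next_token(toy_logits))
```

With these toy logits, low-probability tokens like "zzz" fall outside the nucleus and can never be sampled, which is why top_p is a common guard against incoherent tails.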
Using an intermediate checkpoint
If you want to inspect or continue training from a specific checkpoint:
from transformers import AutoModelForCausalLM, AutoTokenizer
base_id = "Sachin21112004/Sancara_text_generation"

tokenizer = AutoTokenizer.from_pretrained(base_id)
# Checkpoint folders inside a repo are loaded via `subfolder`,
# not by appending the folder name to the repo id.
model = AutoModelForCausalLM.from_pretrained(base_id, subfolder="checkpoint-12992")
(Optional) Using the LoRA adapter
The repository still contains adapter_model.safetensors from a LoRA fine-tuning stage.
If you want to reproduce an adapter-based setup instead of the merged full model, you can:
- Load the original base model (e.g. microsoft/phi-2 or your chosen base).
- Load the LoRA adapter with peft and apply it on top.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base_model_id = "microsoft/phi-2" # or the base you originally used
adapter_repo = "Sachin21112004/Sancara_text_generation"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
base_model_id,
torch_dtype="auto",
device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_repo)
Most users can ignore this and just use the merged model.safetensors.
Training and data
The final Sancara model was trained with Hugging Face's Trainer, with arguments stored in training_args.bin.
Training was performed as supervised fine-tuning for instruction following and chat, on high-quality conversational and instruction-style datasets such as:
- HuggingFaceH4/ultrachat_200k
- databricks/databricks-dolly-15k
High-level training setup:
- Objective: Causal language modeling (next token prediction)
- Format: Instruction–response pairs and multi-turn chats
- Infrastructure: Standard Transformers + Trainer pipeline
- Checkpoints: Saved periodically (e.g. checkpoint-12000, checkpoint-12992), then merged into model.safetensors
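For supervised fine-tuning, each instruction–response pair or multi-turn chat is serialized into a single training string. The sketch below uses a hypothetical `<|user|>`/`<|assistant|>` template purely for illustration; the actual format is whatever chat template the Sancara tokenizer defines.

```python
def format_chat(turns, system=None):
    """Serialize a multi-turn chat into one training string using a
    hypothetical <|system|>/<|user|>/<|assistant|> template. The real
    format depends on the tokenizer's chat template."""
    parts = []
    if system:
        parts.append(f"<|system|>\n{system}")
    for role, text in turns:
        parts.append(f"<|{role}|>\n{text}")
    # End-of-text marker so the model learns where a sample stops.
    parts.append("<|endoftext|>")
    return "\n".join(parts)

example = format_chat(
    [("user", "What is BPE?"),
     ("assistant", "Byte-pair encoding, a subword tokenization scheme.")],
    system="You are a helpful assistant.",
)
print(example)
```

In practice you would use tokenizer.apply_chat_template rather than hand-rolling a formatter, so that training and inference use the identical serialization.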
If you want to continue training, you can load one of the checkpoints as initialization and reuse training_args.bin or your own training script.
Limitations and risks
- The model can hallucinate facts, dates, and citations.
- Outputs may reflect biases or stereotypes from training data.
- It may produce toxic, offensive, or otherwise undesirable content if prompted directly.
Recommended mitigations:
- Use prompt filtering and output moderation in downstream applications.
- Keep humans in the loop for any important or high-impact use.
- Evaluate on your own tasks and domains before deploying in production.
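The output-moderation step mentioned above can start as simple as a keyword screen on generated text. The example below is a deliberately naive, hypothetical blocklist; production systems should use a dedicated moderation model or service instead of pattern matching.

```python
import re

# Hypothetical blocklist for illustration only; real deployments
# should rely on a dedicated moderation model or service.
BLOCKED_PATTERNS = [r"\bssn\b", r"\bcredit card number\b"]

def screen_output(text):
    """Return (allowed, matched_pattern) for a generated string."""
    lowered = text.lower()
    for pat in BLOCKED_PATTERNS:
        if re.search(pat, lowered):
            return False, pat
    return True, None

print(screen_output("Here is a summary of the article."))  # (True, None)
print(screen_output("Please send me your SSN."))
```

A screen like this sits between model.generate and the user-facing response, with blocked outputs logged for human review.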
How to cite / attribution
If you use this model in your work, please credit:
Sancara – Instruction-Tuned Text Generation Model, by Sachin (Sachin21112004 on Hugging Face).
And link to the model card:
https://huggingface.co/Sachin21112004/Sancara_text_generation