---
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
tags:
- mistral
- chatbot
---
# Roy-v1
**Roy** is a personal AI assistant model created and fine-tuned by **Souvik Pramanick**.
It is designed to be helpful, conversational, and practical for everyday tasks such as learning, coding, problem-solving, and general assistance.
---
## Creator
**Founder & Trainer:**
**Souvik Pramanick**
GitHub: https://github.com/Souvik18p
HuggingFace: https://huggingface.co/souvik18
Roy is an independent project built with the vision of creating a smart, friendly, and customizable AI assistant.
---
## What Roy Can Do
Roy is capable of:
- Natural conversation and assistance
- Answering general knowledge questions
- Solving math and logical problems
- Helping with coding and debugging
- Writing emails, stories, and content
- Explaining concepts in simple language
- Brainstorming ideas and learning support
---
## Model Details
- **Model Name:** Roy-v1
- **Parameters:** 7B
- **Architecture:** Mistral-based (LLaMA-family)
- **Tensor Type:** F16
- **Format:** Safetensors
- **License:** Apache 2.0
- **Model ID:** `souvik18/Roy-v1`
## Quantized Versions (Community)
Thanks to [@mradermacher](https://huggingface.co/mradermacher) for providing GGUF quants of Roy-v1:
**https://huggingface.co/mradermacher/Roy-v1-GGUF**
These versions allow Roy to run on:
- CPU-only systems
- Low-VRAM GPUs
- Mobile / local apps via llama.cpp, Ollama, KoboldCpp
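
As a sketch, one way to run a GGUF quant locally with llama.cpp (the quant filename below is an assumption — pick an actual file from the GGUF repo's file list):

```shell
# Download one quantized file from the community GGUF repo
# (filename is illustrative; check the repo for available quants)
huggingface-cli download mradermacher/Roy-v1-GGUF Roy-v1.Q4_K_M.gguf --local-dir .

# Run a single prompt with llama.cpp's CLI, using the same [INST] format
llama-cli -m Roy-v1.Q4_K_M.gguf -p "[INST] Hello Roy [/INST]" -n 200
```

Smaller quants (e.g. Q4) trade some quality for lower memory use; larger ones (e.g. Q8) stay closer to the F16 original.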
## Quick Usage
### Using HuggingFace Transformers
```python
# First: pip install -U transformers accelerate bitsandbytes
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

MODEL_ID = "souvik18/Roy-v1"

# 4-bit quantization config – fits on low-VRAM GPUs (e.g. Kaggle/Colab)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.pad_token = tokenizer.eos_token

print("Loading model (4-bit)...")
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
print("\nRoy-v1 loaded successfully!")

# Simple interactive chat loop
while True:
    text = input("You: ")
    if text.lower() in ("exit", "quit"):
        break
    prompt = f"[INST] {text} [/INST]"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=200,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
        )
    # Decode only the newly generated tokens, skipping the prompt
    reply = tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    print("Roy:", reply.strip())
```
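
The loop above formats a single turn with Mistral's `[INST]` tags. For multi-turn chat, a minimal prompt builder might look like this (a sketch assuming Mistral's instruction format; if the tokenizer ships a chat template, prefer `tokenizer.apply_chat_template` instead):

```python
def build_prompt(history, user_msg):
    """Assemble a Mistral-style [INST] prompt from prior (user, assistant) turns."""
    parts = []
    for user, assistant in history:
        # Each completed turn ends with the end-of-sequence marker
        parts.append(f"[INST] {user} [/INST] {assistant}</s>")
    # The new user message is left open for the model to complete
    parts.append(f"[INST] {user_msg} [/INST]")
    return " ".join(parts)

print(build_prompt([("Hi", "Hello!")], "What can you do?"))
# → [INST] Hi [/INST] Hello!</s> [INST] What can you do? [/INST]
```

Pass the returned string as `prompt` in the chat loop above to keep conversation context across turns.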