---
language: en
license: apache-2.0
library_name: transformers
tags:
- text-generation
- competitive-programming
- code-generation
- cpp
- unsloth
- llama
- trl
base_model: unsloth/llama-3.1-8b-instruct-bnb-4bit
datasets:
- Redhanuman/soltra-codeforces-cpp-elite-10k
---

## Soltra: A Llama-3.1 8B Fine-tune for Elite Competitive Programming
_This isn't your average "hello world" code-gen model._

Ever been stuck on a Codeforces problem and wished you had a buddy who's already grinded thousands of them? That's Soltra. This model was fine-tuned with a very specific mission: to be a high-level thought partner for **elite-level competitive programming (1800-2600 rating)**, focusing exclusively on **C++**.

It's designed to help you break down complex problems, generate solid C++ implementations, and understand the underlying algorithmic patterns.

---

### The Secret Sauce: Curated, No-Fluff Data 🧠

We all know the rule: garbage in, garbage out. The reason Soltra performs well is the data it was trained on. This wasn't a random scrape; I built a custom pipeline to create a dataset of gold-standard solutions.

The model was trained on my **[soltra-codeforces-cpp-elite-10k](https://huggingface.co/datasets/Redhanuman/soltra-codeforces-cpp-elite-10k)** dataset, which was filtered with these strict rules:
* **Problem Rating:** Only problems between **1800 and 2600**. No beginner stuff.
* **Verdict:** Only submissions with a **`verdict: 'OK'`**. This model only learns from code that works.
* **Language:** C++ only, sourced from a massive pool of 3 million submissions.
* **Rich Context:** Each entry includes the problem statement, rating, and tags, teaching the model to connect the problem description to the solution structure.
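
The filtering rules above are easy to picture as code. Here's a minimal sketch of that kind of pipeline; the field names (`rating`, `verdict`, `language`) are illustrative assumptions, not the exact schema of the real pipeline:

```python
# Illustrative filter over raw Codeforces submissions.
# Field names are assumptions for this sketch, not the real pipeline's schema.

def keep_submission(sub: dict) -> bool:
    """Apply the three hard filters: rating band, accepted verdict, C++ only."""
    return (
        1800 <= sub["rating"] <= 2600
        and sub["verdict"] == "OK"
        and sub["language"].startswith("C++")
    )

raw = [
    {"rating": 1500, "verdict": "OK", "language": "C++17"},           # too easy
    {"rating": 2000, "verdict": "WRONG_ANSWER", "language": "C++17"}, # not accepted
    {"rating": 2000, "verdict": "OK", "language": "Python 3"},        # wrong language
    {"rating": 2000, "verdict": "OK", "language": "C++20"},           # kept
]

elite = [s for s in raw if keep_submission(s)]
print(len(elite))  # 1
```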

---

### Under the Hood: The Tech Stack 🛠️

This was a classic solo-dev project running on a tight budget. Here’s what’s powering Soltra:

* **Base Model:** `unsloth/llama-3.1-8b-instruct-bnb-4bit`, a powerful and modern foundation.
* **Fine-tuning:** [Unsloth](https://github.com/unslothai/unsloth) + Hugging Face's TRL for a 2x speed boost and 50%+ less memory usage.
* **Quantization:** 4-bit quantization, making it possible to train and run this on a single free Colab T4 GPU.
* **Hardware:** One motivated developer and one T4 GPU. That's it.
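
A rough back-of-envelope shows why 4-bit quantization is what makes a single T4 viable here (approximate numbers; LoRA weights, optimizer state, and activations add some overhead on top of the base weights):

```python
# Approximate VRAM needed just to hold the base model weights.
params = 8e9                   # Llama-3.1 8B
fp16_gb = params * 2.0 / 1e9   # 2 bytes/param at fp16
int4_gb = params * 0.5 / 1e9   # 0.5 bytes/param at 4-bit

t4_vram_gb = 16
print(fp16_gb <= t4_vram_gb)   # fp16 weights alone already fill the whole card
print(int4_gb, "GB for 4-bit weights, leaving headroom for LoRA training state")
```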

---

### How to Use Soltra 🚀

Alright, enough talk. Here's the boilerplate to get this running. Since these are LoRA adapters, you first load the base model and then apply the fine-tuned weights on top.

```python
from unsloth import FastLanguageModel
from peft import PeftModel
import torch

# The original base model
base_model_name = "unsloth/llama-3.1-8b-instruct-bnb-4bit"

# STEP 1: Load the base model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = base_model_name,
    max_seq_length = 4096,
    dtype = None,
    load_in_4bit = True,
)

# STEP 2: Apply the fine-tuned LoRA adapters from the Hub
# This is where you load Soltra's brain
model = PeftModel.from_pretrained(
    model,
    "Redhanuman/soltra-llama-3.1-8b-cpp-adapters",  # adapter repo on the Hub
)

# Switch Unsloth into its optimized inference mode
FastLanguageModel.for_inference(model)

# --- Now, run inference ---
# The prompt must be in the same format the model was trained on.
prompt = """<|begin_of_text|><|start_header_id|>user<|end_header_id|>

Solve this competitive programming problem by providing a step-by-step thought process and then the final code.

**Problem:** C. Registration System
**Rating:** 1500
**Tags:** data structures, strings, maps

**Problem Statement:**
A new user registration system is being developed. When a new user wants to register, they enter a desired username. If this name is not already in the database, it's added, and the user receives an "OK" message. If the name is already taken, the system appends a number to the name to make it unique. The first time a name is duplicated, it appends '1', the second time '2', and so on. Given a sequence of username registration attempts, output the system's response for each.

**Provide:**
1. **Thought Process:** A brief explanation of the logic, data structures, and algorithm used.
2. **C++ Solution:** An efficient and correct solution in C++.<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""

inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

# Generate the response
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True)
response = tokenizer.batch_decode(outputs)

# Print only the newly generated part of the response
print(response[0].split("<|start_header_id|>assistant<|end_header_id|>")[1].replace("<|eot_id|>", "").strip())
```
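
For reference, the example problem in the prompt is a classic hash-map exercise. The expected logic counts how many times each name has been requested and appends that count on a collision; sketched here in Python for brevity (Soltra itself is prompted for C++):

```python
def register(names):
    """Return the registration system's response for each attempt."""
    counts = {}  # name -> how many times it has collided so far
    responses = []
    for name in names:
        if name not in counts:
            counts[name] = 0
            responses.append("OK")
        else:
            counts[name] += 1
            responses.append(f"{name}{counts[name]}")
    return responses

print(register(["abacaba", "acaba", "abacaba", "acab"]))
# ['OK', 'OK', 'abacaba1', 'OK']
```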

### Intended Use & Limitations ⚠️

* **Use it for:** Getting ideas on difficult problems, understanding common C++ patterns for certain tags (like DP or graphs), and generating boilerplate code for standard algorithms.
* **Don't use it for:** Blindly copy-pasting into a contest. The model might not always produce the most optimal solution, and it might hallucinate on edge cases it hasn't seen.

_Think of Soltra as a highly skilled coding buddy, not a replacement for your own brain._

---

Built by Redhanuman.