	---
	base_model: unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit
	library_name: peft
	pipeline_tag: text-generation
	tags:
	- base_model:adapter:unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit
	- lora
	- sft
	- transformers
	- trl
	- unsloth
	license: apache-2.0
	datasets:
	- open-r1/codeforces-cots
	---

	# Model Card for SaffalPoosh/reasoning_cpp_llm

	<!-- Provide a quick summary of what the model is/does. -->

	This is a QLoRA adapter for `unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit`, fine-tuned on the `open-r1/codeforces-cots` dataset for reasoning-based C++ code generation. It targets algorithmic problems: the model is trained to reason step by step and then produce an optimized C++ solution.

	## Example Usage

	### Problem Example

	```python
	example_problem = """
	A robot is situated at the top-left corner of an m x n grid. The robot can only move either down or right at any point in time. It wants to reach the bottom-right corner of the grid. Some cells in the grid are blocked by obstacles. How many unique paths can the robot take to reach the destination?

	Constraints:
	Time limit per test: 2.0 seconds
	Memory limit per test: 256.0 megabytes
	1 ≤ m, n ≤ 100
	Grid cells are either 0 (empty) or 1 (obstacle).

	Input Format:
	The first line contains two integers m and n — the dimensions of the grid.
	The next m lines each contain n integers (0 or 1) representing the grid.

	Output Format:
	Print a single integer — the number of unique paths.

	Example:
	Input:
	3 3
	0 0 0
	0 1 0
	0 0 0
	"""
	```
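To check the model's answer on this example, a small reference solution is handy. The sketch below is not part of the adapter: `count_paths` is a hypothetical helper that applies the standard dynamic-programming recurrence (the number of paths into a cell is the sum of the paths from the cell above and the cell to the left, and zero for a blocked cell).

```python
def count_paths(grid):
    """Count right/down paths from top-left to bottom-right, avoiding cells marked 1."""
    if not grid or not grid[0]:
        return 0
    n = len(grid[0])
    # dp[j] holds the number of ways to reach column j of the row being processed.
    dp = [0] * n
    dp[0] = 1 if grid[0][0] == 0 else 0
    for row in grid:
        for j in range(n):
            if row[j] == 1:
                dp[j] = 0            # blocked cells cannot be entered
            elif j > 0:
                dp[j] += dp[j - 1]   # from above (old dp[j]) plus from the left
    return dp[-1]


example_grid = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(count_paths(example_grid))  # 2
```

Python integers do not overflow, so this works for the full 100 x 100 range. The problem statement gives no modulus, and the count for a large open grid exceeds 64 bits, so a C++ solution would need either big-integer arithmetic or an agreed modulus.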

	### Model Loading and Inference

	```python
	from unsloth import FastLanguageModel
	from transformers import TextIteratorStreamer
	from threading import Thread

	# Model configuration
	model_path = "SaffalPoosh/reasoning_cpp_llm"
	max_seq_length = 16000
	dtype = None  # None = auto-detect
	load_in_4bit = True

	# Load model and tokenizer.
	# This downloads the base model and then applies the LoRA adapter on top of it.
	model, tokenizer = FastLanguageModel.from_pretrained(
	    model_name=model_path,
	    max_seq_length=max_seq_length,
	    dtype=dtype,
	    load_in_4bit=load_in_4bit,
	    local_files_only=False,
	)

	# Switch the model to inference mode
	FastLanguageModel.for_inference(model)

	# Prepare input data (example_problem is defined in the block above)
	input_text = example_problem
	inputs = tokenizer(input_text, return_tensors="pt")
	inputs = {k: v.to("cuda") for k, v in inputs.items()}

	# Initialize the text streamer
	text_streamer = TextIteratorStreamer(tokenizer, skip_special_tokens=False)

	# Run generation in a background thread so the main thread can consume the stream
	stream_catcher = Thread(
	    target=model.generate,
	    kwargs={
	        **inputs,
	        "do_sample": True,
	        "streamer": text_streamer,
	        "max_new_tokens": 10000,
	    },
	)

	stream_catcher.start()

	# Stream output to console and file
	with open("output.txt", "w") as f:
	    for token in text_streamer:
	        print(token, end="", flush=True)
	        f.write(token)

	stream_catcher.join()
	```

	## Model Details

	- Model Type: QLoRA Fine-tuned Language Model
	- Base Model: `unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit`
	- Training Focus: C++ algorithmic problem solving with reasoning
	- Max Sequence Length: 16,000 tokens
	- Quantization: 4-bit loading supported
	- Hardware Requirements: CUDA-compatible GPU recommended

	## Training Details

	- Training Method: QLoRA (Quantized Low-Rank Adaptation), supervised fine-tuning with TRL and Unsloth
	- Dataset: `open-r1/codeforces-cots` (C++ coding tasks with reasoning annotations)
	- Task Type: Code generation with step-by-step reasoning
	- Optimization: Focused on algorithmic problem solving

	## Usage Notes

	- The model generates reasoning-based solutions for C++ programming problems
	- Supports streaming inference for real-time output
	- The inference example writes everything it streams (prompt, reasoning, and solution) to `output.txt`
	- Designed to handle competitive programming style problems with constraints
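For competitive-programming problems like these, it can be convenient to assemble the prompt from its parts rather than writing it by hand. The sketch below mirrors the layout of the problem example earlier in this card. `build_problem_prompt` and its parameters are hypothetical names, and whether the adapter expects exactly this layout is an assumption.

```python
def build_problem_prompt(statement, time_limit_s, memory_limit_mb,
                         input_format, output_format, example_input,
                         extra_constraints=()):
    """Assemble a problem description in the same layout as the example problem."""
    constraint_lines = [
        f"Time limit per test: {time_limit_s} seconds",
        f"Memory limit per test: {memory_limit_mb} megabytes",
        *extra_constraints,  # e.g. bounds on the input sizes
    ]
    sections = [
        statement.strip(),
        "Constraints:\n" + "\n".join(constraint_lines),
        "Input Format:\n" + input_format.strip(),
        "Output Format:\n" + output_format.strip(),
        "Example:\nInput:\n" + example_input.strip(),
    ]
    # Blank line between sections, trailing newline at the end
    return "\n\n".join(sections) + "\n"


prompt = build_problem_prompt(
    statement="Given n integers, print their sum.",
    time_limit_s=1.0,
    memory_limit_mb=256.0,
    input_format="The first line contains n. The second line contains n integers.",
    output_format="Print a single integer.",
    example_input="3\n1 2 3",
    extra_constraints=["1 <= n <= 100000"],
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of `example_problem`.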

	## Output Format

	The model typically generates:
	1. Problem analysis and reasoning
	2. Algorithm explanation
	3. Complete C++ implementation
	4. Time and space complexity analysis
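Because the C++ implementation is embedded in a longer response, it usually has to be pulled out before it can be compiled. The sketch below assumes the model wraps its code in a Markdown fence tagged `cpp` or `c++`. That convention is an assumption and should be checked against real output. `extract_cpp_blocks` is a hypothetical helper, not part of any library.

```python
import re

# Build the fence marker programmatically so this snippet nests safely in Markdown.
FENCE = "`" * 3

# Assumed output convention: fenced blocks tagged cpp or c++ (case-insensitive).
_CPP_BLOCK = re.compile(
    FENCE + r"(?:cpp|c\+\+)[ \t]*\n(.*?)" + FENCE,
    re.DOTALL | re.IGNORECASE,
)


def extract_cpp_blocks(generated_text):
    """Return the contents of every C++ fenced block found in the text."""
    return [block.strip() for block in _CPP_BLOCK.findall(generated_text)]


sample = (
    "We use dynamic programming.\n"
    + FENCE + "cpp\nint main() { return 0; }\n" + FENCE
    + "\nTime complexity: O(1)."
)
print(extract_cpp_blocks(sample))  # ['int main() { return 0; }']
```

With the inference example, the same function can be applied to the contents of `output.txt`. If the response contains several blocks, the last one is often the final solution, but that is worth verifying case by case.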

	## Requirements

	```bash
	pip install unsloth transformers torch
	```
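After installing, it can help to confirm which versions ended up in the environment, since the card's tags also list `peft` and `trl` and the Framework versions section records PEFT 0.17.1. The sketch below uses only the standard library. `installed_versions` is a hypothetical helper name and the default package list is an assumption based on the card's tags.

```python
from importlib import metadata


def installed_versions(packages=("unsloth", "transformers", "torch", "peft", "trl")):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```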

	## Hardware Requirements

	- GPU: CUDA-compatible GPU (recommended)
	- Memory: Sufficient VRAM for 4-bit quantized model
	- Storage: Space for base model download and adapter weights
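As a rough guide to what "sufficient VRAM" means, the weights alone can be estimated from the parameter count and the bits per parameter. The sketch below assumes about 7 billion parameters, read off the "7B" in the base model name. The exact count may differ, and the estimate ignores activations, the KV cache, and quantization overhead, so real usage will be higher, especially with long generations.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# Assumption: roughly 7e9 parameters, inferred from the "7B" in the model name.
approx_params = 7e9

four_bit = weight_memory_gib(approx_params, 4)
sixteen_bit = weight_memory_gib(approx_params, 16)

print(f"4-bit weights:  {four_bit:.1f} GiB")    # 3.3 GiB
print(f"16-bit weights: {sixteen_bit:.1f} GiB")  # 13.0 GiB
```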




	## Framework versions

	- PEFT 0.17.1