README.md · prithivMLmods/PocketThinker-QwQ-3B-Instruct at main

	---
	library_name: transformers
	base_model:
	- Qwen/Qwen2.5-3B-Instruct
	license: apache-2.0
	datasets:
	- amphora/QwQ-LongCoT-130K
	- amphora/QwQ-LongCoT-130K-2
	- amphora/verfiable-25k
	- amphora/m-math500
	language:
	- en
	- zh
	pipeline_tag: text-generation
	tags:
	- Math
	- Code
	- Thinker
	- Reasoning
	- 3B
	- QwQ
	- Mini
	- text-generation-inference
	- SFT
	- trl
	---

	![8.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Faj50x1HAODJAIy_R94se.png)

	# PocketThinker-QwQ-3B-Instruct

	> PocketThinker-QwQ-3B-Instruct is fine-tuned from Qwen2.5-3B-Instruct and designed as a lightweight, efficient reasoning assistant. It is the pocket-sized counterpart of QwQ-LCoT-7B-Instruct, optimized for fast inference while retaining strong problem-solving and computational capabilities. Fine-tuning targets structured reasoning, low token waste, and high-quality technical responses.

	## Key Improvements
	1. Optimized for Coding: Specializes in generating structured, efficient code with minimal redundancy for smooth execution.
	2. Compact yet Powerful: Maintains strong problem-solving capabilities within a smaller 3B parameter architecture, ensuring accessibility on resource-limited devices.
	3. Advanced Reasoning Capabilities: Excels in algorithmic problem-solving, mathematical reasoning, and structured technical explanations.
	4. Efficient Memory Utilization: Reduces computational overhead while maintaining high-quality outputs.
	5. Focused Output Generation: Avoids unnecessary token generation, ensuring concise and relevant responses.

	## Quickstart with transformers

	The snippet below loads the model and tokenizer, formats the conversation with `apply_chat_template`, and generates a response:

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "prithivMLmods/PocketThinker-QwQ-3B-Instruct"

	# Load the model; dtype and device placement are chosen automatically.
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype="auto",
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	prompt = "Write a Python function to find the Fibonacci sequence."
	messages = [
	    {"role": "system", "content": "You are an advanced coding assistant."},
	    {"role": "user", "content": prompt},
	]
	# Render the messages with the chat template and append the assistant turn marker.
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	generated_ids = model.generate(
	    **model_inputs,
	    max_new_tokens=6090,
	)
	# Keep only the newly generated tokens by dropping the echoed prompt.
	generated_ids = [
	    output_ids[len(input_ids):]
	    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]

	response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(response)
	```
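
The slicing step near the end of the snippet above relies on `generate` returning each prompt followed by its newly generated tokens. A minimal sketch of that step, with plain lists standing in for tensors; the token ids are made up for illustration, and `strip_prompt` is a hypothetical helper name, not part of transformers:

```python
# Toy stand-ins for model_inputs.input_ids and the output of model.generate.
# The token ids are invented; real ids come from the tokenizer.
prompt_ids = [[101, 7, 8, 9]]
full_output_ids = [[101, 7, 8, 9, 42, 43, 44]]


def strip_prompt(input_batch, output_batch):
    """Drop the echoed prompt tokens from each generated sequence."""
    return [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(input_batch, output_batch)
    ]


new_token_ids = strip_prompt(prompt_ids, full_output_ids)
print(new_token_ids)
```

Decoding only the stripped ids is what keeps the system and user messages out of the printed response.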

	## Intended Use
	1. Code Generation & Optimization:
	Supports developers in writing, refining, and optimizing code across multiple programming languages.
	2. Algorithm & Mathematical Problem Solving:
	Delivers precise solutions and structured explanations for complex problems.
	3. Technical Documentation & Explanation:
	Assists in generating well-structured documentation for libraries, APIs, and coding concepts.
	4. Debugging Assistance:
	Helps identify and correct errors in code snippets.
	5. Educational Support:
	Simplifies programming topics for students and learners with clear explanations.
	6. Structured Data Processing:
	Generates structured outputs like JSON, XML, and tables for data science applications.
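
For the structured-output use case, it can help to check that a response actually parses before passing it downstream. A minimal sketch, assuming the model may surround the JSON with prose; `extract_json` is a hypothetical helper and the sample response is made up, neither comes from this model or from transformers:

```python
import json


def extract_json(response: str):
    """Return the first JSON object found in response, or None if none parses."""
    decoder = json.JSONDecoder()
    # Try every '{' as a possible start of a JSON object.
    for start, char in enumerate(response):
        if char != "{":
            continue
        try:
            obj, _end = decoder.raw_decode(response, start)
        except json.JSONDecodeError:
            continue
        return obj
    return None


# Invented response text, used only to exercise the helper.
sample = 'Sure, here is the result: {"name": "fib", "args": [10]} Hope this helps.'
parsed = extract_json(sample)
print(parsed)
```

Returning `None` rather than raising lets the caller decide whether to retry with a stricter prompt.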

	## Limitations
	1. Hardware Constraints:
	Although lighter than larger models, it still requires a moderately powerful GPU or TPU for optimal performance.
	2. Potential Bias in Responses:
	Outputs may reflect biases present in training data.
	3. Limited Creativity:
	Results may be inconsistent on non-technical, creative tasks.
	4. No Real-Time Awareness:
	Has no knowledge of events after its training cutoff.
	5. Error Propagation in Long Responses:
	Minor mistakes in early outputs may affect overall coherence in lengthy responses.
	6. Prompt Sensitivity:
	The effectiveness of responses depends on well-structured prompts.
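
To make the hardware constraint more concrete, weight memory can be estimated from the parameter count and the bytes per parameter. This is a rough sketch, assuming about 3 billion parameters (taken from the "3B" in the model name; the exact count may differ) and ignoring activations, the KV cache, and framework overhead:

```python
# Rough weight-memory estimate. APPROX_PARAMS is an assumption based on the
# "3B" label, not a measured parameter count.
APPROX_PARAMS = 3_000_000_000

# Storage size of one parameter for a few common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gib(APPROX_PARAMS, dtype):.1f} GiB")
```

Actual usage during generation will be higher, especially with long outputs such as the `max_new_tokens=6090` setting in the quickstart.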