---
license: apache-2.0
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
- math
- moderately abliterated
- abliterated
- code
- R1
- RL
---

# **Sombrero-R1-14B-Elite13**

> Sombrero-R1-14B-Elite13 is a fine-tuned variant of DeepSeek-R1-Distill-Qwen-14B, enhanced through reinforcement learning to serve as a high-performance reasoning assistant. It excels at both mathematical problem solving and general-purpose conversation, combining distilled efficiency with refined instruction following to balance speed, capability, and coherence in complex interactive tasks.

### Key Enhancements

1. **Reinforcement Learning Fine-Tuning**
   Trained with reinforcement learning objectives to optimize for alignment, reward-guided reasoning, and conversational helpfulness.

2. **Mathematical Reasoning Proficiency**
   Delivers accurate solutions and step-by-step breakdowns for algebra, calculus, number theory, logic puzzles, and applied mathematics.

3. **Instruction Adherence**
   Understands and follows multi-part instructions, including structured tasks and iterative refinement prompts.

4. **Expanded Context Handling**
   Supports up to 128K tokens of context with output lengths up to 8K tokens, ideal for technical and educational use cases (see the streaming sketch after this list).

5. **Cross-Domain Knowledge**
   Offers broad general knowledge, making it suitable for tutoring, research, and exploratory conversation across topics.
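
The long-output behavior in item 4 is easiest to inspect with token streaming. The snippet below is a minimal sketch, not part of the official model card: it assumes `transformers`' built-in `TextStreamer`, and the prompt and generation length are illustrative placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_name = "prithivMLmods/Sombrero-R1-14B-Elite13"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Print tokens to stdout as they are generated; skip the echoed prompt and special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

messages = [{"role": "user", "content": "Prove that the sum of the first n odd numbers is n^2."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Long derivations can run toward the model's stated 8K-token output ceiling.
model.generate(**inputs, max_new_tokens=8192, streamer=streamer)
```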

---

# **Quickstart with Transformers**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Sombrero-R1-14B-Elite13"

# Load in native precision and place layers across available devices automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Solve: Integrate (x^2 * e^x) dx"
messages = [
    {"role": "system", "content": "You are a helpful AI assistant skilled in math and reasoning."},
    {"role": "user", "content": prompt}
]

# Render the conversation into the model's chat prompt format.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)

# Drop the prompt tokens so only the newly generated answer is decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
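
For quick experiments, recent versions of `transformers` can also route chat-style messages through the high-level `pipeline` API, which applies the chat template automatically. This is a minimal sketch under that assumption, reusing the Quickstart prompt:

```python
from transformers import pipeline

# torch_dtype and device_map are forwarded to the underlying model loading call.
pipe = pipeline(
    "text-generation",
    model="prithivMLmods/Sombrero-R1-14B-Elite13",
    torch_dtype="auto",
    device_map="auto",
)
messages = [{"role": "user", "content": "Solve: Integrate (x^2 * e^x) dx"}]
result = pipe(messages, max_new_tokens=512)
# With chat-style input, generated_text holds the whole conversation; the last turn is the reply.
print(result[0]["generated_text"][-1]["content"])
```

For reference, the sample prompt has the closed form (x^2 - 2x + 2)e^x + C, obtained by integrating by parts twice, so a correct response should arrive at that expression.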

---

# **Intended Use Cases**

1. **Mathematics Problem Solving**
   Ideal for step-by-step derivations, symbolic computation, numerical explanations, and LaTeX-formatted output.

2. **Educational and Instructional Support**
   Useful in classrooms and learning platforms, offering guided explanations for students and instructors.

3. **Chat-Based Reasoning**
   Designed for coherent, context-aware dialogue with structured logic and continuity.

4. **Multilingual Knowledge Assistance**
   Supports 29+ languages, including English, Chinese, French, German, and Arabic, for multilingual learning.

5. **Document and Code Explanation**
   Explains complex documents, code snippets, and structured logic flows in natural language (see the sketch after this list).
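
As an illustration of use case 5, the hypothetical snippet below asks the model to walk through a small code sample, reusing the loading code from the Quickstart; the `fib` snippet and system prompt are placeholders, not prescribed usage.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Sombrero-R1-14B-Elite13"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# A small code sample for the model to explain.
snippet = '''
def fib(n, memo={}):
    if n < 2:
        return n
    if n not in memo:
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]
'''

messages = [
    {"role": "system", "content": "You explain code clearly and concisely."},
    {"role": "user", "content": f"Explain what this function does and its time complexity:\n{snippet}"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, as in the Quickstart.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```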

---

# **Known Limitations**

1. **Compute Intensive**
   Requires high-memory hardware (e.g., ≥48 GB VRAM) to fully exploit the context length and generation capacity; a quantization workaround is sketched after this list.

2. **Potential for Bias and Hallucinations**
   Although tuned for alignment, responses may still carry pretraining biases or inaccuracies in edge cases.

3. **Drift in Long Responses**
   Output structure and accuracy may occasionally degrade over very long generations.

4. **Static Knowledge**
   Has no real-time awareness of events or research developments after its training cutoff.

5. **Creative Task Variability**
   Optimized for logic, so performance on narrative or subjective content may be inconsistent.
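
Where the hardware requirement in item 1 is out of reach, 4-bit quantization is a common workaround. The following is a minimal sketch, assuming the `bitsandbytes` package and a CUDA GPU; it is not an officially validated configuration and may reduce output quality.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 4-bit quantization roughly quarters the weight memory footprint,
# at some cost in generation quality; evaluate results accordingly.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "prithivMLmods/Sombrero-R1-14B-Elite13"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```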