| | ---
|
| | license: mit
|
| | language:
|
| | - zho
|
| | - eng
|
| | - fra
|
| | - spa
|
| | - por
|
| | - deu
|
| | - ita
|
| | - rus
|
| | - jpn
|
| | - kor
|
| | - vie
|
| | - tha
|
| | - ara
|
| | base_model:
|
| | - Qwen/Qwen2.5-32B-Instruct
|
| | pipeline_tag: text-generation
|
| | ---
|
# Apollo Model

Apollo is an experimental hybrid reasoning model built on Qwen2.5-32B-Instruct.
|
# GGUF

GGUF quantizations are available at [mradermacher/Apollo-v3-32B-GGUF](https://huggingface.co/mradermacher/Apollo-v3-32B-GGUF).

Thanks to mradermacher for providing these quants.
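
If you want to run a quant locally, a minimal sketch using llama-cpp-python might look like this (the quant filename pattern, context length, and prompt are assumptions; pick whichever quant file the repo actually provides):

```python
# Minimal sketch: loading a quant from the GGUF repo with llama-cpp-python.
# The filename pattern and n_ctx below are assumptions, not part of this card.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="mradermacher/Apollo-v3-32B-GGUF",
    filename="*Q4_K_M.gguf",  # assumed quant; use any file present in the repo
    n_ctx=8192,               # assumed context window
)

out = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Think deeper and step by step: How many r's are in the word strawberry?"}
    ],
    max_tokens=1024,
)
print(out["choices"][0]["message"]["content"])
```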
|
### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, with [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) as the base.
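
For intuition only, the sketch below shows a simplified reading of the Model Stock interpolation rule from the paper: the fine-tuned weights are averaged, then pulled toward the base model by a ratio derived from how well the fine-tuned deltas agree. This is not the exact recipe or tooling used to produce this model, and the checkpoints in the usage comment are placeholders.

```python
import torch
import torch.nn.functional as F

def model_stock_merge(w_base: torch.Tensor, w_finetuned: list[torch.Tensor]) -> torch.Tensor:
    """Merge one weight tensor as t * mean(fine-tuned) + (1 - t) * base,
    where t grows with the agreement (cosine) between the fine-tuned deltas."""
    k = len(w_finetuned)
    deltas = [(w - w_base).flatten() for w in w_finetuned]
    # Average pairwise cosine similarity between the fine-tuned deltas
    cos_vals = [
        F.cosine_similarity(deltas[i], deltas[j], dim=0)
        for i in range(k) for j in range(i + 1, k)
    ]
    cos_theta = torch.stack(cos_vals).mean() if cos_vals else torch.tensor(1.0)
    # Interpolation ratio from the paper: t = k*cos(theta) / (1 + (k-1)*cos(theta))
    t = k * cos_theta / (1 + (k - 1) * cos_theta)
    w_avg = torch.stack(w_finetuned).mean(dim=0)
    return t * w_avg + (1 - t) * w_base

# Hypothetical usage on a single tensor taken from each checkpoint:
# merged = model_stock_merge(base_tensor, [finetune_a_tensor, finetune_b_tensor])
```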
|
### Enable reasoning

To enable reasoning, prompt the model to "think deeper and step by step", as in the snippet below.
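
A minimal sketch of what that looks like in a chat message (the exact wording around the trigger is just an illustration):

```python
# Prepend the reasoning trigger to the user message; everything beyond
# "think deeper and step by step" is illustrative phrasing.
messages = [
    {"role": "user", "content": "Think deeper and step by step: How many r's are in the word strawberry?"}
]
```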
|
### Example code
|
| | ```
|
| |
|
| | from transformers import AutoModelForCausalLM, AutoTokenizer
|
| |
|
| | model_name = "rootxhacker/Apollo-v3-32B"
|
| |
|
| | model = AutoModelForCausalLM.from_pretrained(
|
| | model_name,
|
| | torch_dtype="auto",
|
| | device_map="auto"
|
| | )
|
| | tokenizer = AutoTokenizer.from_pretrained(model_name)
|
| |
|
| | prompt = "How many r's are in the word strawberry"
|
| | messages = [
|
| | {"role": "user", "content": prompt}
|
| | ]
|
| | text = tokenizer.apply_chat_template(
|
| | messages,
|
| | tokenize=False,
|
| | add_generation_prompt=True
|
| | )
|
| |
|
| | model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
|
| |
|
| | generated_ids = model.generate(
|
| | **model_inputs,
|
| | max_new_tokens=32768
|
| | )
|
| | generated_ids = [
|
| | output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
|
| | ]
|
| |
|
| | response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
|
| | print(response)
|
| |
|
| | ``` |