Fine-Tuned LLaMA 3.1 Model on Stack Exchange Math Dataset
This repository contains a LLaMA 3.1 model fine-tuned with LoRA on a dataset collected from Stack Exchange Math. The model is intended to answer mathematical questions in the style of Stack Exchange answers.
Model Details
- Base Model: Meta-Llama-3.1-8B
- Fine-Tuned Model: math-stackexchange
- Dataset: stackexchange-math-sample
- Training Environment:
- Framework: PyTorch with Transformers
- Platform: Google Colab
- Hardware: 1 x T4 GPU (15GB)
Data Preparation
The dataset used for fine-tuning includes 1000 samples collected from Stack Exchange Math. Each sample consists of a question and its accepted answer.
Preprocessing
The data was preprocessed using the following steps:
- Loading the dataset from Hugging Face.
- Shuffling the dataset and selecting 1000 samples.
- Formatting the data into a chat template suitable for training.
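To make the last step concrete, here is a minimal sketch of what one formatted training sample might look like. It assumes the ChatML layout that TRL's `setup_chat_format` applies by default. `render_chatml` and the sample row are hypothetical, written only for illustration; in the actual pipeline the tokenizer's `apply_chat_template` does this work.

```python
def render_chatml(messages):
    """Render a list of {"role", "content"} dicts as ChatML text (illustrative only)."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    return "".join(parts)


# Hypothetical row using the two dataset fields referenced in this card.
row = {
    "question_body": "What is the derivative of sin(x)?",
    "accepted_answer": "The derivative of sin(x) is cos(x).",
}

# One user turn (the question) followed by one assistant turn (the accepted answer).
messages = [
    {"role": "user", "content": row["question_body"]},
    {"role": "assistant", "content": row["accepted_answer"]},
]

text = render_chatml(messages)
print(text)
```

Each question and answer pair becomes a single string, which is what the trainer later reads from the `text` field.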
Training Details
Libraries and Dependencies
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments, pipeline, logging
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training, get_peft_model
from google.colab import drive, userdata
import os, torch, wandb
from trl import SFTTrainer, setup_chat_format
from huggingface_hub import login
Loading Data and Model
model_name = "meta-llama/Meta-Llama-3.1-8B"
dataset_name = "blesspearl/stackexchange-math-sample"
torch_dtype = torch.float16
attn_implementation = "eager"
wandb.login(key=userdata.get("WANDB_API_KEY"))
run = wandb.init(
    project='Fine tunning LLama-3.1-8b on math-stack-exchange',
    job_type="training",
    anonymous="allow"
)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    attn_implementation=attn_implementation
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model, tokenizer = setup_chat_format(model, tokenizer)
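A rough back-of-the-envelope estimate helps explain why the model is loaded in 4-bit on a single T4. The sketch below counts weight storage only. The parameter count is an approximation, and the estimate ignores activations, LoRA weights, optimizer state, quantization metadata, and any layers left unquantized, so treat the numbers as indicative rather than measured.

```python
# Assumption: approximate parameter count of Llama 3.1 8B.
NUM_PARAMS = 8.03e9


def weight_gib(num_params, bits_per_param):
    """Storage for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


fp16_gib = weight_gib(NUM_PARAMS, 16)  # half-precision weights
nf4_gib = weight_gib(NUM_PARAMS, 4)    # 4-bit quantized weights

print(f"fp16 weights:  ~{fp16_gib:.1f} GiB")
print(f"4-bit weights: ~{nf4_gib:.1f} GiB")
```

Under these assumptions the half-precision weights alone would leave essentially no headroom on a 15 GB card, while the 4-bit weights take about a quarter of that space.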
LoRA Configuration
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['up_proj', 'down_proj', 'gate_proj', 'k_proj', 'q_proj', 'v_proj', 'o_proj']
)
model = get_peft_model(model, peft_config)
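To get a feel for how small the trainable part is, the following sketch counts the LoRA parameters implied by this configuration. The layer dimensions are assumptions taken from the published Llama 3.1 8B architecture and are not stated in this card. On a real model, `model.print_trainable_parameters()` reports the actual figure.

```python
# Assumed Llama 3.1 8B dimensions (not stated in this card).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024      # 8 key/value heads x 128 head dimension
NUM_LAYERS = 32

# Values from the LoRA configuration above.
R = 16
LORA_ALPHA = 32

# (in_features, out_features) of each targeted projection.
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# Each adapter adds two low-rank matrices: A (r x in) and B (out x r).
per_layer = sum(R * (fan_in + fan_out) for fan_in, fan_out in shapes.values())
total = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / R  # factor applied to the adapter output

print(f"LoRA parameters per layer: {per_layer:,}")
print(f"LoRA parameters in total:  {total:,}")
print(f"Scaling factor (alpha/r):  {scaling}")
```

Under these assumptions the adapters add about 42 million trainable parameters, well under 1% of the roughly 8 billion frozen base weights.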
Data Preparation
dataset = load_dataset(dataset_name, split="all")
dataset = dataset.shuffle(seed=65).select(range(1000))
def format_chat_template(row):
    row_json = [{"role": "user", "content": row["question_body"]},
                {"role": "assistant", "content": row["accepted_answer"]}]
    row["text"] = tokenizer.apply_chat_template(row_json, tokenize=False)
    return row
dataset = dataset.map(format_chat_template, num_proc=4)
dataset = dataset.train_test_split(test_size=0.2)
dataset = dataset.remove_columns(["question_body", "accepted_answer"])
Training Configuration
training_arguments = TrainingArguments(
    output_dir="math-stackexchange",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,
    optim="paged_adamw_32bit",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=0.2,
    logging_steps=1,
    warmup_steps=10,
    logging_strategy="steps",
    learning_rate=2e-4,
    fp16=False,
    bf16=False,
    group_by_length=True,
    report_to="wandb"
)
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    peft_config=peft_config,
    max_seq_length=512,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)
trainer.train()
wandb.finish()
model.config.use_cache = True
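The training settings above determine how many optimizer steps the run takes and how often evaluation happens. The arithmetic can be sketched as follows, assuming a single GPU (as listed under Model Details) and that a fractional `eval_steps` is interpreted as a ratio of the total training steps.

```python
import math

# Values taken from the data preparation and training configuration above.
num_samples = 1000
test_size = 0.2
per_device_train_batch_size = 1
gradient_accumulation_steps = 2
num_train_epochs = 1
eval_steps_ratio = 0.2
warmup_steps = 10
num_gpus = 1  # assumption: single T4

# 80/20 train/test split.
num_test = round(num_samples * test_size)
num_train = num_samples - num_test

# Samples consumed per optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)

# Micro-batches per epoch, then optimizer updates per epoch.
micro_batches = math.ceil(num_train / (per_device_train_batch_size * num_gpus))
steps_per_epoch = micro_batches // gradient_accumulation_steps
total_steps = steps_per_epoch * num_train_epochs

# Evaluation interval when eval_steps is given as a ratio.
eval_every = math.ceil(total_steps * eval_steps_ratio)

print(f"train/test samples:   {num_train}/{num_test}")
print(f"effective batch size: {effective_batch_size}")
print(f"optimizer steps:      {total_steps}")
print(f"evaluate every:       {eval_every} steps")
print(f"warmup share:         {warmup_steps / total_steps:.1%}")
```

With these settings the run is short: a few hundred optimizer updates, a handful of evaluation passes, and a warmup that covers only the first few percent of training.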
Model and Dataset
- Model: math-stackexchange
- Dataset: stackexchange-math-sample
Usage
To run inference with the fine-tuned model, load it with the Hugging Face Transformers library and pass your question to it as a text prompt.
Example Code
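from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "blesspearl/math-stackexchange"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def answer_question(question, max_new_tokens=256):
    # Tokenize the question and move the tensors to the model's device.
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    # Without an explicit limit, generate() falls back to a short default length.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The decoded text contains the prompt followed by the model's continuation.
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

question = "What is the derivative of sin(x)?"
answer = answer_question(question)
print(answer)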
Conclusion
This model card gives an overview of how the LLaMA 3.1 model was fine-tuned with LoRA on the Stack Exchange Math dataset. The model and dataset are available on Hugging Face for further use and exploration.
For any questions or issues, feel free to open an issue on the model repository.