Dataset used to train this model: LimYeri/LeetCode_Python_Solutions_Data
How to use LimYeri/CodeMind-Llama3.1-8B-unsloth with Transformers:

# Load model directly
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LimYeri/CodeMind-Llama3.1-8B-unsloth")
model = AutoModelForCausalLM.from_pretrained("LimYeri/CodeMind-Llama3.1-8B-unsloth", torch_dtype="auto")
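For a quick end-to-end check, the same checkpoint can also be driven through the Transformers text-generation pipeline. This is a minimal sketch, not part of the original card; the messages and generation settings are illustrative assumptions:

from transformers import pipeline

# Minimal smoke test; prompt and settings are illustrative
generator = pipeline(
    "text-generation",
    model="LimYeri/CodeMind-Llama3.1-8B-unsloth",
    torch_dtype="auto",
    device_map="auto",  # requires accelerate
)
messages = [
    {"role": "system", "content": "You are a kind coding test teacher."},
    {"role": "user", "content": "Explain the two-pointer technique with a short Python example."},
]
result = generator(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply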
How to use LimYeri/CodeMind-Llama3.1-8B-unsloth with Unsloth Studio:

On Linux/macOS:

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for LimYeri/CodeMind-Llama3.1-8B-unsloth to start chatting
On Windows (PowerShell):

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for LimYeri/CodeMind-Llama3.1-8B-unsloth to start chatting
In the browser:

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for LimYeri/CodeMind-Llama3.1-8B-unsloth to start chatting
How to use LimYeri/CodeMind-Llama3.1-8B-unsloth with Unsloth:

pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
model_name="LimYeri/CodeMind-Llama3.1-8B-unsloth",
max_seq_length=2048,
)

The CodeMind project is a language model developed to assist with solving and learning from coding-test problems. The model is fine-tuned on posts written by LeetCode users, with the goal of providing answers specialized for coding tests.
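The training data referenced above is published as the LimYeri/LeetCode_Python_Solutions_Data dataset on the Hugging Face Hub. A minimal sketch for inspecting it with the datasets library follows; the split name is an assumption:

from datasets import load_dataset

# "train" split is assumed; print the dataset to see the actual splits and columns
ds = load_dataset("LimYeri/LeetCode_Python_Solutions_Data", split="train")
print(ds)
print(ds[0])  # inspect one example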
The model is fine-tuned from meta-llama/Meta-Llama-3.1-8B-Instruct, via the unsloth/Meta-Llama-3.1-8B-Instruct checkpoint. It is accessible through Hugging Face's model hub and can be integrated into applications through the API. It is designed to generate explanations, code snippets, and guides for coding problems and programming-related questions.
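As one way to integrate the model through an API, the huggingface_hub InferenceClient can target it by model id. This is a sketch under the assumption that a serverless or self-hosted inference endpoint is available for this model:

from huggingface_hub import InferenceClient

# Assumes an inference endpoint serves this model id;
# pass token="hf_..." if authentication is required
client = InferenceClient(model="LimYeri/CodeMind-Llama3.1-8B-unsloth")
response = client.chat_completion(
    messages=[{"role": "user", "content": "How do I reverse a linked list in Python?"}],
    max_tokens=512,
)
print(response.choices[0].message.content)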
# For details, see demo-Llama3.1.ipynb
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template
from IPython.display import display, Markdown
max_seq_length = 2048  # context length for inference; adjust as needed
dtype = None           # None lets Unsloth pick float16/bfloat16 automatically
load_in_4bit = True    # load in 4-bit to reduce VRAM usage

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "LimYeri/CodeMind-Llama3.1-8B-unsloth", # the fine-tuned model
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
tokenizer = get_chat_template(
tokenizer,
chat_template = "llama-3.1",
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
messages = [
{"role": "system", "content": "You are a kind coding test teacher."},
{"role": "user", "content": "Enter your coding problem or question here."},
]
inputs = tokenizer.apply_chat_template(
messages,
tokenize = True,
add_generation_prompt = True, # Must add for generation
return_tensors = "pt",
).to("cuda")
outputs = model.generate(
    input_ids = inputs,
    max_new_tokens = 3000,
    use_cache = True,
    temperature = 0.5,  # feel free to adjust the temperature and min_p
    min_p = 0.3,
)
text = tokenizer.batch_decode(outputs)[0].split('assistant<|end_header_id|>\n\n')[1].strip()  # keep only the assistant's reply
display(Markdown(text))
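As a usage example, the placeholder user message above can be replaced with a concrete problem statement; the one below (LeetCode's Two Sum) is illustrative, and the templating, generation, and decoding steps stay unchanged:

messages = [
    {"role": "system", "content": "You are a kind coding test teacher."},
    {"role": "user", "content": "Given an array of integers nums and an integer target, "
                                "return the indices of the two numbers that add up to target. "
                                "Solve it in Python and explain your approach."},
]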
Open LLM Leaderboard evaluation results:

| Metric | Value |
|---|---|
| Average | 22.17 |
| IFEval | 64.9 |
| BBH | 24.19 |
| MATH Lvl 5 | 9.97 |
| GPQA | 1.9 |
| MUSR | 6.04 |
| MMLU-PRO | 26 |
Detailed fine-tuning code and settings can be found in the CodeMind-Extended GitHub repository.