LLMJP3-13B-IT2
Overview
LLMJP3-13B-IT2 is a language model fine-tuned from the "llm-jp/llm-jp-3-13b" base model for Japanese text generation and understanding tasks. Training used Unsloth and TRL for speed, and the inference example below loads the model in 4-bit to reduce memory use.
Key Features
- Base Model: llm-jp/llm-jp-3-13b
- Fine-tuned Dataset: DeL-TaiseiOzaki/Tengentoppa-sft-v1.0
- Training Acceleration: Trained with Unsloth and Hugging Face's TRL library for roughly 2x faster training.
- Developer: tshyk
- License: Apache-2.0
Dataset
The model was fine-tuned using the Tengentoppa-sft-v1.0 dataset, which is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).
License
This project is distributed under the Apache License 2.0. Please review the license terms before using the model.
How to Use
Installation and Setup
Follow the steps below to set up the environment and run inference using the model.
Colab Setup
# -*- coding: utf-8 -*-
"""myModel_Inference_Template_unsloth.ipynb"""
# Install dependencies
!pip install unsloth
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
from unsloth import FastLanguageModel
import torch
import json
model_name = "tshyk/llmjp3-13b-it2"
# Model Configuration
max_seq_length = 2048
dtype = None
load_in_4bit = True
# Load Model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    token="your_huggingface_token",
)
FastLanguageModel.for_inference(model)
from google.colab import drive
drive.mount('/content/drive')
# Load Dataset
datasets = []
with open("/content/drive/MyDrive/elyza100_assignment/elyza-tasks-100-TV_0.jsonl", "r", encoding="utf-8") as f:
    item = ""
    for line in f:
        # Accumulate lines until a complete JSON object has been read
        line = line.strip()
        item += line
        if item.endswith("}"):
            datasets.append(json.loads(item))
            item = ""
from tqdm import tqdm
# Inference
results = []
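for dt in tqdm(datasets):
    # Named `instruction` rather than `input` to avoid shadowing the built-in
    instruction = dt["input"]
    # Prompt template: "### 指示" (instruction) header, the task text, then "### 回答" (answer) header
    prompt = f"""### 指示
{instruction}
### 回答
"""
    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True, do_sample=False, repetition_penalty=1.2)
    # Keep only the text generated after the answer header
    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### 回答')[-1]
    results.append({
        "task_id": dt["task_id"],
        "input": instruction,
        "output": prediction
    })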
with open("/content/model_output.jsonl", 'w', encoding='utf-8') as f:
    for result in results:
        json.dump(result, f, ensure_ascii=False)
        f.write('\n')
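The prompt construction and answer extraction used in the inference loop can be factored into two small helpers, which makes the template easier to reuse and test without loading the model. This is a minimal sketch, assuming the answer header is `### 回答` as in the template above; the names `build_prompt` and `extract_answer` are hypothetical and not part of the original notebook, and the final `strip()` is an addition that removes the leading newline the loop leaves in place.

```python
# Hypothetical helpers (not in the original notebook) mirroring the loop's template.
INSTRUCTION_HEADER = "### 指示"  # "instruction"
ANSWER_HEADER = "### 回答"       # "answer"


def build_prompt(instruction: str) -> str:
    """Wrap a task instruction in the instruction/answer template."""
    return f"{INSTRUCTION_HEADER}\n{instruction}\n{ANSWER_HEADER}\n"


def extract_answer(decoded: str) -> str:
    """Return the text that follows the last answer header in the decoded output."""
    # Assumption: stripping surrounding whitespace is acceptable for the output file.
    return decoded.split(f"\n{ANSWER_HEADER}")[-1].strip()
```

Inside the loop, `prompt = build_prompt(dt["input"])` and `prediction = extract_answer(tokenizer.decode(outputs[0], skip_special_tokens=True))` would replace the inline f-string and `split` call.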
Notes
- Replace your_huggingface_token with your Hugging Face token.
- Ensure the dataset files are properly mounted on Colab or available in your local environment.
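Both notes can be enforced in code before the model is loaded. The sketch below reads the token from an environment variable instead of hard-coding it and fails early if the dataset file is missing. It assumes the token is stored in `HF_TOKEN` and that the file holds one JSON object per line (the loader in the notebook also tolerates objects split across lines); `get_hf_token`, `parse_jsonl_lines`, and `load_jsonl` are hypothetical names introduced for illustration.

```python
import json
import os
from pathlib import Path


def get_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Read the Hugging Face token from an environment variable (assumed name)."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your Hugging Face token.")
    return token


def parse_jsonl_lines(lines):
    """Parse an iterable of text lines, one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def load_jsonl(path):
    """Load a JSONL file, raising a clear error if it is not where expected."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found: {path}. Is the drive mounted?")
    with path.open(encoding="utf-8") as f:
        return parse_jsonl_lines(f)
```

With these in place, `token=get_hf_token()` can be passed to `FastLanguageModel.from_pretrained`, and `datasets = load_jsonl(...)` can stand in for the manual loading loop when the file is strictly one object per line.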
Citation
If you use this model, please cite as follows:
@misc{tshyk2024llmjp,
author = {tshyk},
title = {LLMJP3-13B-IT2},
year = {2024},
url = {https://huggingface.co/tshyk/llmjp3-13b-it2},
note = {Fine-tuned using the Tengentoppa-sft-v1.0 dataset.}
}
For further inquiries or contributions, feel free to contact tshyk via Hugging Face.