Created based on the code provided in LoRA_template_20241127.ipynb from the Large Language Model Course FALL 2024.

Model Card for Model ID: llm-jp-3-13b-finetune

This is a Japanese language model fine-tuned on a specific dataset to respond to the Japanese benchmark ELYZA-Task-100-TV.

Model Details

Model Description

Developed by: Satoru
Model type: Transformer
Language(s) (NLP): 日本語
License: CC-BY-NC-SA 4.0
Finetuned from model [optional]: llm-jp/llm-jp-3-13b

Model Sources [optional]

Repository: Hugging Face Model Repository

Uses

Direct Use

Bias, Risks, and Limitations

There are biases, risks, and limitations that may be inherent in LLMs.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

Model loading

# python 3.10.12
!pip install -U pip
!pip install -U transformers
!pip install -U bitsandbytes
!pip install -U accelerate
!pip install -U datasets
!pip install -U peft
!pip install -U trl
!pip install -U wandb
!pip install ipywidgets --upgrade

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    logging,
)
from peft import (
    LoraConfig,
    PeftModel,
    get_peft_model,
)
import os, torch, gc
from datasets import load_dataset
import bitsandbytes as bnb
from trl import SFTTrainer

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)

Evaluation

from tqdm import tqdm

results = []
for data in tqdm(datasets):

  input = data["input"]

  prompt = f"""### 指示
  {input}
  ### 回答
  """

  tokenized_input = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
  attention_mask = torch.ones_like(tokenized_input)

  with torch.no_grad():
      outputs = model.generate(
          tokenized_input,
          attention_mask=attention_mask,
          max_new_tokens=100,
          do_sample=False,
          repetition_penalty=1.2,
          pad_token_id=tokenizer.eos_token_id
      )[0]
  output = tokenizer.decode(outputs[tokenized_input.size(1):], skip_special_tokens=True)

  results.append({"task_id": data["task_id"], "input": input, "output": output})

Model Card Contact

Satoru

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

** Created based on the code provided in LoRA_template_20241127.ipynb from the Large Language Model Course FALL 2024. **