hiraki committed
Commit 1b2eae3 · verified · 1 Parent(s): 0429d43

Update README.md

Files changed (1): README.md (+216 −190)
README.md CHANGED
@@ -3,200 +3,226 @@ base_model: google/gemma-2-9b-it
  library_name: peft
  ---
 
- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-
- ## Model Details
-
- ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
-
- ### Model Sources [optional]
-
- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
-
- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
- ### Out-of-Scope Use
-
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
-
- ## Training Details
-
- ### Training Data
-
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
-
- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
-
- ## Evaluation
-
- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]
-
- ### Results
-
- [More Information Needed]
-
- #### Summary
-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- [More Information Needed]
-
- ### Compute Infrastructure
-
- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
- #### Software
-
- [More Information Needed]
-
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]
-
- ## Model Card Contact
-
- [More Information Needed]
- ### Framework versions
-
- - PEFT 0.14.0
 
+ # Gemma-9B Instruction Tuned Model Inference Guide
+
+ This guide walks through downloading the hiraki/gemma9b-it-sft model from Hugging Face and running inference with it.
+
+ ## Installing the Packages
+
+ First, install the required packages:
+
+ ```bash
+ # Upgrade the core dependencies
+ pip install --upgrade datasets bitsandbytes trl peft
+ pip install --upgrade torch
+
+ # Other required packages
+ pip install transformers tqdm
+ ```
+
+ ## Environment Requirements
+
+ - A CUDA-capable GPU (at least 16 GB of VRAM recommended)
+ - Python 3.8 or later
+ - Sufficient free storage for downloading the model
+
+ ## Verifying the Installation
+
+ After installing, you can check your environment with the following code:
+
+ ```python
+ import torch
+ print(f"PyTorch version: {torch.__version__}")
+ print(f"CUDA available: {torch.cuda.is_available()}")
+ print(f"CUDA device count: {torch.cuda.device_count()}")
+ ```
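+
+ To also confirm the 16 GB VRAM recommendation above, a minimal sketch using only standard `torch.cuda` calls:
+
+ ```python
+ import torch
+
+ # Report the name and total memory of the first CUDA device
+ if torch.cuda.is_available():
+     props = torch.cuda.get_device_properties(0)
+     print(f"GPU: {props.name}")
+     print(f"Total VRAM: {props.total_memory / 1024**3:.1f} GB")
+ else:
+     print("No CUDA device found")
+ ```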
+
+ ## Steps
+
+ ### 1. Hugging Face Setup
+
+ ```python
+ import os
+
+ # Set your Hugging Face token
+ os.environ["HF_TOKEN"] = "your_token_here"  # replace with your own token
+ ```
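+
+ Alternatively, you can authenticate through `huggingface_hub` (installed as a dependency of `transformers`); this is an equivalent option, not an extra required step:
+
+ ```python
+ from huggingface_hub import login
+
+ # Registers the token for all subsequent Hub downloads
+ login(token="your_token_here")  # replace with your own token
+ ```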
+
+ ### 2. Creating the Inference Script
+
+ Create `inference.py` with the following contents:
+
+ ```python
+ import os
+ import json
+ import torch
+ from tqdm import tqdm
+ from peft import AutoPeftModelForCausalLM
+ from transformers import AutoTokenizer
+
+ def run_inference(model_path, input_file, output_file):
+     # Load the model and tokenizer
+     print("Loading the model...")
+     model = AutoPeftModelForCausalLM.from_pretrained(
+         model_path,
+         device_map={"": "cuda"},
+         torch_dtype=torch.float16,
+     )
+
+     tokenizer = AutoTokenizer.from_pretrained(model_path)
+
+     if tokenizer.pad_token is None:
+         tokenizer.pad_token = tokenizer.eos_token
+         tokenizer.padding_side = "right"
+
+     # Load the input data
+     print("Loading the input data...")
+     data = []
+     with open(input_file, 'r', encoding='utf-8') as f:
+         for line in f:
+             data.append(json.loads(line))
+
+     # Run inference
+     results = []
+     print("Running inference...")
+     for dt in tqdm(data, desc="Processing"):
+         task_id = dt["task_id"]
+         input_text = dt["input"]
+
+         # Build the prompt; keep the Japanese markers as-is
+         # (指示 = instruction, 回答 = answer)
+         prompt = f"### 指示\n{input_text}\n### 回答\n"
+
+         # Tokenize the input
+         inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
+
+         # Generate the output
+         # Note: with do_sample=False decoding is greedy, so the
+         # temperature and top_p values below have no effect
+         outputs = model.generate(
+             inputs.input_ids,
+             attention_mask=inputs.attention_mask,
+             max_new_tokens=512,
+             temperature=0.7,
+             top_p=0.9,
+             repetition_penalty=1.2,
+             do_sample=False
+         )
+
+         # Decode the output, keeping only the text after the answer marker
+         prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('### 回答\n')[-1].strip()
+
+         # Store the result
+         results.append({
+             "task_id": task_id,
+             "input": input_text,
+             "output": prediction
+         })
+
+     # Save the results
+     print(f"Saving results to: {output_file}")
+     with open(output_file, 'w', encoding='utf-8') as f:
+         for result in results:
+             json.dump(result, f, ensure_ascii=False)
+             f.write('\n')
+
+     print("Processing complete")
+     return results
+
+ if __name__ == "__main__":
+     model_path = "hiraki/gemma9b-it-sft"
+     input_file = "elyza-tasks-100-TV_0.jsonl"  # input file name
+     output_file = "output.jsonl"  # output file name
+
+     results = run_inference(model_path, input_file, output_file)
+
+     # Print the first result for a quick check
+     print("\nExample of the first result:")
+     print(json.dumps(results[0], ensure_ascii=False, indent=2))
+ ```
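+
+ Because this checkpoint is a PEFT adapter, you can optionally merge it into the base weights after loading to drop the adapter indirection during generation. A minimal sketch using the standard `peft` API (applies to mergeable adapter types such as LoRA; optional, not required by the steps above):
+
+ ```python
+ # Fold the adapter weights into the base model and
+ # return a plain transformers model for generation
+ model = model.merge_and_unload()
+ model.eval()
+ ```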
+
+ ### 3. Preparing the Input Data
+
+ The input JSONL file must contain records of the following form (in the actual file, each record sits on a single line):
+
+ ```json
+ {
+   "task_id": "unique identifier for the task",
+   "input": "input text"
+ }
+ ```
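+
+ For a quick smoke test you can generate a small input file yourself. A minimal sketch (the file name `sample_input.jsonl` and the two tasks are made-up examples; point `input_file` in `inference.py` at whatever file you create):
+
+ ```python
+ import json
+
+ # Two hypothetical example tasks
+ samples = [
+     {"task_id": "0", "input": "日本の首都はどこですか?"},
+     {"task_id": "1", "input": "簡単な自己紹介文を書いてください。"},
+ ]
+
+ # JSONL convention: one JSON object per line, UTF-8 encoded
+ with open("sample_input.jsonl", "w", encoding="utf-8") as f:
+     for s in samples:
+         f.write(json.dumps(s, ensure_ascii=False) + "\n")
+ ```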
+
+ ### 4. Running the Script
+
+ ```bash
+ python inference.py
+ ```
+
+ ## Output Format
+
+ The output JSONL file has the following format:
+
+ ```json
+ {
+   "task_id": "task ID",
+   "input": "input text",
+   "output": "generated text"
+ }
+ ```
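+
+ To spot-check a finished run, you can read the file back using the same one-object-per-line convention (a minimal sketch):
+
+ ```python
+ import json
+
+ # Load every record from the output file
+ with open("output.jsonl", encoding="utf-8") as f:
+     records = [json.loads(line) for line in f if line.strip()]
+
+ print(f"{len(records)} records")
+ print(records[0]["output"])  # first generated answer
+ ```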
+
+ ## Customizing the Generation Parameters
+
+ You can customize the generated output by adjusting the following parameters in the script:
+
+ - `max_new_tokens`: maximum number of tokens to generate (default: 512)
+ - `temperature`: diversity of the generation (default: 0.7)
+ - `top_p`: nucleus-sampling threshold (default: 0.9)
+ - `repetition_penalty`: strength of the repetition penalty (default: 1.2)
+ - `do_sample`: enables/disables random sampling (default: False)
+
+ Note that `temperature` and `top_p` only take effect when `do_sample=True`; with the default `do_sample=False` the model decodes greedily (see the sketch below).
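+
+ A sketch of the `model.generate` call from `inference.py` with sampling enabled (the script defaults to greedy decoding):
+
+ ```python
+ outputs = model.generate(
+     inputs.input_ids,
+     attention_mask=inputs.attention_mask,
+     max_new_tokens=512,
+     do_sample=True,  # enable sampling so temperature/top_p apply
+     temperature=0.7,
+     top_p=0.9,
+     repetition_penalty=1.2,
+ )
+ ```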
+
+ ## Troubleshooting
+
+ ### Common Errors
+
+ 1. CUDA out of memory
+    ```
+    RuntimeError: CUDA out of memory
+    ```
+    - Solutions:
+      - Free up GPU memory
+      - Lower the `max_new_tokens` value
+      - Use a smaller batch size
+      - Load the model quantized (see the sketch below)
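+
+    For the quantized option, a minimal sketch using `bitsandbytes` (already in the install list above); 4-bit loading roughly quarters the weight memory, possibly at some cost in output quality:
+
+    ```python
+    import torch
+    from transformers import BitsAndBytesConfig
+    from peft import AutoPeftModelForCausalLM
+
+    # Quantize the base weights to 4-bit, computing in float16
+    bnb_config = BitsAndBytesConfig(
+        load_in_4bit=True,
+        bnb_4bit_compute_dtype=torch.float16,
+    )
+
+    model = AutoPeftModelForCausalLM.from_pretrained(
+        "hiraki/gemma9b-it-sft",
+        device_map={"": "cuda"},
+        quantization_config=bnb_config,
+    )
+    ```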
+
+ 2. Package version mismatches
+    ```
+    ImportError: Cannot import name 'X' from 'Y'
+    ```
+    - Solutions:
+      - Upgrade all packages to their latest versions
+      - Re-run the `pip install --upgrade` commands
+
+ 3. Model download errors
+    ```
+    OSError: Incorrect Hugging Face token
+    ```
+    - Solutions:
+      - Check that HF_TOKEN is set correctly
+      - Check your internet connection
+
+ 4. Input file format errors
+    ```
+    JSONDecodeError: Expecting value
+    ```
+    - Solutions:
+      - Check the format of the input JSONL file
+      - Make sure the file encoding is UTF-8
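+
+ To pinpoint a malformed line quickly, a small sketch (the helper name `check_jsonl` is just illustrative):
+
+ ```python
+ import json
+
+ def check_jsonl(path):
+     # Print the line number and parse error for every bad line
+     with open(path, encoding="utf-8") as f:
+         for lineno, line in enumerate(f, start=1):
+             if not line.strip():
+                 continue
+             try:
+                 json.loads(line)
+             except json.JSONDecodeError as e:
+                 print(f"line {lineno}: {e}")
+
+ check_jsonl("elyza-tasks-100-TV_0.jsonl")
+ ```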
+
+ ## Notes
+
+ - Inference requires sufficient GPU memory
+ - When processing large amounts of data, the progress bar shows the current status
+ - The output file is overwritten automatically, so create a backup beforehand if needed (one workaround is sketched below)
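+
+ One simple way to avoid accidental overwrites is a timestamped output name, for example (a sketch; adapt the pattern as needed):
+
+ ```python
+ from datetime import datetime
+
+ # e.g. output_20250101_120000.jsonl
+ output_file = f"output_{datetime.now():%Y%m%d_%H%M%S}.jsonl"
+ ```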
+
+ ## Support
+
+ If you run into problems, check the following:
+ 1. The Hugging Face model card: [hiraki/gemma9b-it-sft](https://huggingface.co/hiraki/gemma9b-it-sft)
+ 2. PEFT GitHub: [huggingface/peft](https://github.com/huggingface/peft)
+ 3. Transformers GitHub: [huggingface/transformers](https://github.com/huggingface/transformers)