Update README.md

README.md CHANGED
@@ -36,8 +36,18 @@ If you want to run it on your own local computer, you will need approximately 8.
 必要なライブラリのインストール
 Installation of required libraries
 ```
-
-
 ```

 サンプルスクリプト
@@ -61,14 +71,13 @@ def trans(my_str):

     # Translation
     generated_ids = model.generate(input_ids=input_ids,
-                                   num_beams=
                                    use_cache=True,
                                    prompt_lookup_num_tokens=10
                                    )
     full_outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
     return full_outputs[0].split("### Answer:\n")[-1].strip()

-
 ret = trans("""
 ### Instructions:
 Translate Japanese to English.
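The `prompt_lookup_num_tokens=10` argument in the `generate` call above turns on prompt lookup decoding: draft tokens are proposed by matching the newest n-gram of the sequence against an earlier occurrence in the prompt, then verified by the model in a single forward pass. This tends to help translation, where the output overlaps the input heavily. A minimal pure-Python sketch of the candidate-matching step (`find_candidate_tokens` is an illustrative stand-in, not a transformers API):

```python
def find_candidate_tokens(input_ids, ngram_size=3, num_pred_tokens=10):
    """Propose draft tokens by matching the trailing n-gram of input_ids
    against an earlier occurrence in the same sequence (prompt lookup).
    Returns [] when no earlier match exists."""
    tail = input_ids[-ngram_size:]
    # Scan earlier positions, preferring the most recent match of the n-gram.
    for start in range(len(input_ids) - ngram_size - 1, -1, -1):
        if input_ids[start:start + ngram_size] == tail:
            continuation = input_ids[start + ngram_size:
                                     start + ngram_size + num_pred_tokens]
            if continuation:
                return continuation
    return []

# [5, 6, 7] occurred earlier, so the tokens that followed it are drafted.
print(find_candidate_tokens([5, 6, 7, 8, 5, 6, 7]))  # -> [8, 5, 6, 7]
```

In real generation the drafted tokens are not trusted as-is: the model scores them in one forward pass and keeps only the prefix it agrees with, so the output is identical to ordinary greedy decoding, just faster.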
@@ -90,16 +99,20 @@ There are two types of instructions: "Translate Japanese to English." and "Trans
 実験的な試みとして、インフォーマルな場面を想定した翻訳を行う際にsubculture文脈指定ができるようになっています。
 As an experiment, subculture context can be specified when translating for informal situations.

 Translate English to Japanese within the context of subculture.


 ## 留意事項 Attention
-**Do not save this adapter merged with the base model**, as there exists a bug that reduces performance when saving this adapter merged with the model.

-

 ### 利用規約 Terms of Use
 基本的にはgemmaと同じライセンスです
 Basically the same license as gemma.

@@ -112,13 +125,14 @@ Our previous model, ALMA-7B-Ja-V2, has over 150K downloads, but we have no idea
 そのため、使用した後は[Googleフォームに感想や今後期待する方向性、気が付いた誤訳の例などを記入](https://forms.gle/Ycr9nWumvGamiNma9)してください。
 So, after you use it, please [fill out the Google form with your impressions, future directions you expect us to take, and examples of mistranslations you have noticed](https://forms.gle/Ycr9nWumvGamiNma9).

-
-We do not collect personal information, so please feel free to fill out the form!

 どんなご意見でも感謝します!
 Any feedback would be appreciated!

 ### 謝辞 Acknowledgment
 Original Base Model
 google/gemma-7b
 https://huggingface.co/google/gemma-7b
@@ -131,7 +145,6 @@ QLoRA Adapter
 webbigdata/C3TR-Adapter
 https://huggingface.co/webbigdata/C3TR-Adapter

-
 This adapter was trained with Unsloth.
 https://github.com/unslothai/unsloth

 必要なライブラリのインストール
 Installation of required libraries
 ```
+# First install PyTorch. Check the official documentation:
+# https://pytorch.org/get-started/locally/#start-locally
+
+# Example for Linux users:
+# pip3 install torch torchvision torchaudio
+
+# Example for Windows users:
+# pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
+
+pip install transformers==4.38.2
+pip install peft==0.9.0
+pip install bitsandbytes==0.42.0
 ```

 サンプルスクリプト

     # Translation
     generated_ids = model.generate(input_ids=input_ids,
+                                   num_beams=1, max_new_tokens=800,
                                    use_cache=True,
                                    prompt_lookup_num_tokens=10
                                    )
     full_outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
     return full_outputs[0].split("### Answer:\n")[-1].strip()

 ret = trans("""
 ### Instructions:
 Translate Japanese to English.
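The `trans` helper above recovers the translation by splitting the decoded text on `### Answer:\n`: everything before that header is just the prompt echoed back, and everything after it is what the model generated. A minimal pure-Python sketch of that convention (`build_prompt` is an illustrative helper assumed here, not part of the sample script; only the `### Instructions:` and `### Answer:` sections shown in this README are used):

```python
def build_prompt(instruction, source_text):
    # Illustrative layout: an instruction section, the text to translate,
    # and a trailing "### Answer:" header for the model to complete.
    return (f"### Instructions:\n{instruction}\n\n"
            f"{source_text}\n\n### Answer:\n")

def extract_answer(decoded_output):
    # Mirrors full_outputs[0].split("### Answer:\n")[-1].strip() above:
    # keep only the text generated after the final Answer header.
    return decoded_output.split("### Answer:\n")[-1].strip()

prompt = build_prompt("Translate Japanese to English.", "こんにちは")
decoded = prompt + "Hello\n"   # stand-in for tokenizer.batch_decode output
print(extract_answer(decoded))  # -> Hello
```

Using `[-1]` rather than `[1]` keeps the extraction correct even if the source text itself happened to contain the header string.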
 実験的な試みとして、インフォーマルな場面を想定した翻訳を行う際にsubculture文脈指定ができるようになっています。
 As an experiment, subculture context can be specified when translating for informal situations.

+e.g.:
 Translate English to Japanese within the context of subculture.


 ## 留意事項 Attention

+このアダプターをモデルとマージして保存すると性能が下がってしまう不具合が存在するため、**ベースモデル(gemma-7b-bnb-4bit)とアダプターをマージして保存しないでください**
+**Do not save this adapter merged with the base model (gemma-7b-bnb-4bit)**, as there exists a bug that reduces performance when saving this adapter merged with the model.

+どうしてもマージしたい場合は必ずPerplexityではなく、翻訳ベンチマークで性能を確認してから使うようにしてください
+If you must merge, be sure to check performance with a translation benchmark, not Perplexity!

 ### 利用規約 Terms of Use
+
 基本的にはgemmaと同じライセンスです
 Basically the same license as gemma.

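Since the context specifier only changes the instruction line of the prompt, both variants can be produced by one small helper. This is a sketch of the convention described above; `make_instruction` and its parameters are assumptions for illustration, not part of the README's script:

```python
def make_instruction(source_lang, target_lang, context=None):
    # Base form:    "Translate Japanese to English."
    # With context: "... within the context of subculture."
    line = f"Translate {source_lang} to {target_lang}"
    if context:
        line += f" within the context of {context}"
    return line + "."

print(make_instruction("Japanese", "English"))
# -> Translate Japanese to English.
print(make_instruction("English", "Japanese", context="subculture"))
# -> Translate English to Japanese within the context of subculture.
```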
 そのため、使用した後は[Googleフォームに感想や今後期待する方向性、気が付いた誤訳の例などを記入](https://forms.gle/Ycr9nWumvGamiNma9)してください。
 So, after you use it, please [fill out the Google form with your impressions, future directions you expect us to take, and examples of mistranslations you have noticed](https://forms.gle/Ycr9nWumvGamiNma9).

+個人情報やメールアドレスは収集しないので、気軽にご記入をお願いします
+We do not collect personal information or email addresses, so please feel free to fill out the form!

 どんなご意見でも感謝します!
 Any feedback would be appreciated!

 ### 謝辞 Acknowledgment
+
 Original Base Model
 google/gemma-7b
 https://huggingface.co/google/gemma-7b
 webbigdata/C3TR-Adapter
 https://huggingface.co/webbigdata/C3TR-Adapter

 This adapter was trained with Unsloth.
 https://github.com/unslothai/unsloth
