webbigdata/C3TR-Adapter_hqq

@@ -98,6 +98,74 @@ Plastic sheets: 4,500 pieces
 Sleeping mats: 8,500 pieces<eos>
 ```
+### Sample code for high-speed inference (for NVIDIA Ampere or later GPUs, such as the A100 or RTX 3090)
+```python
+import os
+
+import torch
+from hqq.engine.hf import AutoTokenizer
+from hqq.core.quantize import BaseQuantizeConfig, HQQBackend, HQQLinear
+from hqq.models.hf.base import AutoHQQHFModel
+from hqq.utils.patching import (patch_add_quant_config, patch_linearlayers,
+                                prepare_for_inference)
+
+model_id = "webbigdata/C3TR-Adapter_hqq"
+os.environ["TOKENIZERS_PARALLELISM"] = "1"
+
+# Allow TF32 matrix multiplications (available on Ampere and later GPUs).
+torch.backends.cuda.matmul.allow_tf32 = True
+torch.backends.cudnn.allow_tf32 = True
+
+# Load the pre-quantized weights; bfloat16 is used as the compute dtype.
+compute_dtype = torch.bfloat16
+model = AutoHQQHFModel.from_quantized(model_id, compute_dtype=compute_dtype)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+# Attach the quantization config to the quantized linear layers.
+patch_linearlayers(
+    model, patch_add_quant_config,
+    BaseQuantizeConfig(nbits=4, group_size=64, quant_scale=False, quant_zero=False, axis=1))
+HQQLinear.set_backend(HQQBackend.PYTORCH)
+model.eval()
+
+# Patch the quantized layers to use the torchao int4 backend.
+prepare_for_inference(model, backend="torchao_int4")
+prompt_text = """You are a highly skilled professional Japanese-English and English-Japanese translator. Translate the given text accurately, taking into account the context and specific instructions provided. Steps may include hints enclosed in square brackets [] with the key and value separated by a colon:. Only when the subject is specified in the Japanese sentence, the subject will be added when translating into English. If no additional instructions or context are provided, use your expertise to consider what the most appropriate context is and provide a natural translation that aligns with that context. When translating, strive to faithfully reflect the meaning and tone of the original text, pay attention to cultural nuances and differences in language usage, and ensure that the translation is grammatically correct and easy to read. After completing the translation, review it once more to check for errors or unnatural expressions. For technical terms and proper nouns, either leave them in the original language or use appropriate translations as necessary. Take a deep breath, calm down, and start translating.
+### Instruction:
+Translate Japanese to English.
+When translating, please use the following hints:
+[writing_style: formal]
+[米津玄師: Kenshi YONEZU]
+[吉野源三郎: Genzaburo YOSHINO]
+### Input:
+「私自身、訳が分からない」
+「おそらく、訳が分からなかったことでしょう。私自身、訳が分からないところがありました」。
+　2023年2月下旬、東京都内のスタジオで上映された、「君たちはどう生きるか」の初号試写。米津玄師の歌うピアノバラードが流れ、エンドロールが終わった瞬間、灯りが点き、宮崎駿監督のコメントが読み上げられた。
+　客席から軽い笑い声が漏れた。私もその一人だった。あまりの展開の速さと、盛り込むだけ盛り込まれた情報を消化しきれず、茫然と座り込んでいたが、その言葉で我に返った。
+　これは「宮崎アニメ」の集大成なのか、吉野源三郎の著書『君たちはどう生きるか』の再解釈なのか。とにかく、1回見ただけではとても全容を把握できなかった。
+「自分のことをやるしかない」
+　今回の作品は、公開前のプロモーションも、メディア関係者向けの試写も一切ないまま公開日を迎えた。異例の態勢の中、内容は無論、見たことすら口外無用のキャスト・スタッフ向け試写に、なぜ私と両親が呼ばれたのかといえば、父が『君たちはどう生きるか』の著者・吉野源三郎の長男で、私が孫にあたるからだ。
+　その5年ほど前の2017年11月、父と私は東京・小金井のスタジオジブリに招かれ、宮崎監督と対面していた。さらにさかのぼること半月ほど前、とあるイベントで宮崎監督が突然、次回作のタイトルが「君たちはどう生きるか」だと明らかにし、ニュースなどで話題になっていた。親族としては寝耳に水だったのでかなり驚いたのだが、宮崎監督は「うっかり喋ってしまいました」と詫びた上で、作品について語り始めた。
+### Response:
+"""
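+# Tokenize the prompt (truncated to 1600 tokens) and move the input IDs to the first GPU.
+tokens = tokenizer(prompt_text, return_tensors="pt",
+        padding=True, max_length=1600, truncation=True).to("cuda:0").input_ids
+# Beam search with sampling: 3 beams, low temperature and top_p.
+output = model.generate(
+        input_ids=tokens,
+        max_new_tokens=800,
+        do_sample=True,
+        num_beams=3, temperature=0.5, top_p=0.3,
+        repetition_penalty=1.0)
+# The decoded string contains the prompt followed by the translation.
+print(tokenizer.decode(output[0]))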
+```
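The prompt in the sample above follows a fixed layout: a system paragraph, then `### Instruction:`, optional bracketed hints, `### Input:`, and `### Response:`. The following is a minimal sketch of how that layout could be assembled and how the translation could be separated from the decoded output. The helper names `build_prompt` and `extract_response` are hypothetical and not part of this repository or of hqq. The `<eos>` marker is assumed from the sample output shown earlier in this README.

```python
from typing import Mapping, Optional

RESPONSE_MARKER = "### Response:"


def build_prompt(system_prompt: str, direction: str, text: str,
                 hints: Optional[Mapping[str, str]] = None) -> str:
    """Assemble a prompt in the Instruction / Input / Response layout (hypothetical helper)."""
    lines = [system_prompt, "### Instruction:", direction]
    if hints:
        lines.append("When translating, please use the following hints:")
        # Each hint is written as [key: value], one per line.
        lines.extend(f"[{key}: {value}]" for key, value in hints.items())
    # The trailing empty string makes the prompt end with a newline after the marker.
    lines += ["### Input:", text, RESPONSE_MARKER, ""]
    return "\n".join(lines)


def extract_response(decoded: str, eos_token: str = "<eos>") -> str:
    """Return the text after the last response marker, without the EOS token."""
    # If the marker is absent, rpartition returns the whole string as the last element.
    _, _, answer = decoded.rpartition(RESPONSE_MARKER)
    return answer.replace(eos_token, "").strip()


prompt = build_prompt(
    "You are a highly skilled professional translator.",  # shortened placeholder, not the full system prompt
    "Translate Japanese to English.",
    "「私自身、訳が分からない」",
    hints={"米津玄師": "Kenshi YONEZU"},
)
```

With these helpers, `extract_response(tokenizer.decode(output[0]))` would return only the translation instead of the prompt followed by the translation. Use the full system prompt from the sample in place of the shortened placeholder.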
 ### See also
 For details, see [C3TR-Adapter](https://huggingface.co/webbigdata/C3TR-Adapter).