| | --- |
| | base_model: |
| | - swiss-ai/Apertus-8B-2509 |
| | library_name: peft |
| | license: apache-2.0 |
| | tags: |
| | - finance |
| | - financialcrime |
| | - compliance |
| | --- |
| | |
| | # Model Card for Apertus-8B-Instruct-OFAC-FAQ |
| |
|
| | A model fine-tuned on sanctions- and AML-related OFAC FAQ questions. The Swiss AI Apertus 8B Instruct model
| | was fine-tuned first, then used as a teacher and distilled into TinyLlama 1.1B, so the resulting model is roughly 6-7x smaller than the original. Quantization to INT8 should allow CPU inference even in low-memory
| | deployments, provided model latency is not a primary concern. A PEFT LoRA adapter is included for use with the base model.
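
As a rough sanity check on the size and memory claims above, the arithmetic can be sketched as follows. The parameter counts are nominal values read off the model names (an assumption, not measured checkpoint sizes), and the estimate covers weights only, not activations or the KV cache.

```python
# Nominal parameter counts taken from the model names (assumptions, not measured).
TEACHER_PARAMS = 8.0e9  # Apertus 8B
STUDENT_PARAMS = 1.1e9  # TinyLlama 1.1B

# Bytes needed to store one weight at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# How many times smaller the student is than the teacher.
size_ratio = TEACHER_PARAMS / STUDENT_PARAMS

print(f"teacher/student size ratio: {size_ratio:.1f}x")
for dtype in BYTES_PER_PARAM:
    print(f"student weights in {dtype}: {weight_memory_gb(STUDENT_PARAMS, dtype):.1f} GB")
```

With these nominal counts the student needs on the order of 1 GB for INT8 weights, which is why CPU-only deployment looks plausible; actual memory use will be higher once runtime buffers are included.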
| |
|
| |
|
| | ## Model Details |
| |
|
| | ### Model Description |
| |
|
| | The model includes INT8 quantized weights for CPU inference and a LoRA adapter for GPU inference with
| | the matching base model.
| |
|
| |
|
| | - **Developed by:** Soteria Initiative |
| | - **Funded by:** Soteria Initiative |
| | - **Shared by:** Soteria Initiative |
| | - **Model type:** Text generation, LlamaForCausalLM, context length 2048 |
| | - **Language(s) (NLP):** English, Others |
| | - **License:** Apache-2.0 |
| | - **Finetuned from model:** Apertus 8B Instruct |
| |
|
| | ### Model Sources |
| |
|
| |
|
| | - **Repository:** https://huggingface.co/SoteriaInitiative/Apertus-8B-Instruct-OFAC-FAQ |
| | - **Demo:** _WIP_ |
| |
|
| | ## Uses |
| |
|
| | Use for chat or assistant applications where compliance or financial crime analysts need
| | answers on FATF or OFAC FAQ matters.
| |
|
| | ### Direct Use |
| |
|
| | This model can be used directly with the FCCAssistant (https://github.com/SoteriaInitiative/fccassistant)
| | once a model endpoint has been deployed.
| |
|
| |
|
| | ### Out-of-Scope Use |
| |
|
| | This model is not intended for production deployment. |
| |
|
| | ## Bias, Risks, and Limitations |
| |
|
| | The model is fine-tuned only on FATF and OFAC FAQ material, so its use should be restricted to
| | questions in that domain.
| |
|
| |
|
| | ### Recommendations |
| |
|
| | Perform model quality evaluation before use. |
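
One lightweight way to start such an evaluation is to compare generated answers against reference FAQ answers with a token-overlap F1 score. The sketch below is an illustrative assumption, not a metric used by the model authors; `token_f1` and `mean_f1` are hypothetical helper names, and a serious evaluation of compliance answers would also need expert review.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a generated answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mean_f1(pairs) -> float:
    """Average token F1 over (prediction, reference) pairs."""
    scores = [token_f1(pred, ref) for pred, ref in pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

In practice, the predictions would come from the model and the references from held-out OFAC FAQ answers that were not part of the training set.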
| |
|
| | ## How to Get Started with the Model |
| |
|
| | For a comprehensive overview, use the Jupyter notebook linked under **Demo** in Model Sources.
| |
|
| | For a quick start try: |
| | ```python
| | import torch
| | from transformers import AutoModelForCausalLM, AutoTokenizer
| | from peft import PeftModel
| | 
| | BASE = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
| | ADAPTER = "./peft"  # or "org/repo-name" if pushed to HF
| | 
| | # Tokenizer (includes the chat template)
| | tokenizer = AutoTokenizer.from_pretrained(BASE)
| | 
| | # Base model (GPU, 8-bit; requires the bitsandbytes package).
| | # For CPU, remove load_in_8bit and device_map.
| | model = AutoModelForCausalLM.from_pretrained(
| |     BASE,
| |     device_map="auto",
| |     load_in_8bit=True,
| | )
| | # Attach the LoRA adapter to the base model and switch to inference mode
| | model = PeftModel.from_pretrained(model, ADAPTER)
| | model.eval()
| | 
| | # Chat prompt via tokenizer's chat_template
| | messages = [
| |     {"role": "system", "content": "You are a helpful assistant for sanctions/AML."},
| |     {"role": "user", "content": "Summarize the key OFAC FAQ topics."},
| | ]
| | inputs = tokenizer.apply_chat_template(
| |     messages, add_generation_prompt=True, return_tensors="pt"
| | ).to(model.device)
| | 
| | with torch.inference_mode():
| |     out = model.generate(
| |         inputs,
| |         max_new_tokens=256,
| |         temperature=0.7,
| |         top_p=0.9,
| |         do_sample=True,
| |         pad_token_id=tokenizer.eos_token_id,
| |     )
| | 
| | # The decoded text contains the prompt followed by the generated answer
| | print(tokenizer.decode(out[0], skip_special_tokens=True))
| | ```
| |
|
| | Notes: |
| |
|
| | - GPU 8-bit loading is shown. For CPU-only use, drop load_in_8bit=True and device_map="auto", then call model.to("cpu").
| | - If you plan to export a merged model, load the base in full precision and then call
| | model = model.merge_and_unload() (optional, not needed for standard PEFT inference).
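
The two notes above can be combined into a single helper. This is a minimal sketch, assuming `torch`, `transformers` and `peft` are installed; `load_merged_on_cpu` is a hypothetical name, and the heavy imports are deferred into the function so that nothing is downloaded until it is called.

```python
def load_merged_on_cpu(base_id: str, adapter_path: str):
    """Load the base model in full precision on CPU and merge the LoRA adapter."""
    # Deferred imports: the heavy libraries are only needed when this runs.
    import torch
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # No load_in_8bit and no device_map: weights stay in float32 on the CPU.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float32)
    model = PeftModel.from_pretrained(base, adapter_path)

    # Fold the LoRA weights into the base weights and drop the PEFT wrapper.
    merged = model.merge_and_unload()
    merged.eval()
    return merged


# Example usage ("./merged" is a hypothetical output directory):
# merged = load_merged_on_cpu("TinyLlama/TinyLlama-1.1B-Chat-v1.0", "./peft")
# merged.save_pretrained("./merged")
```

Merging is only worthwhile when the model is exported or served without PEFT; for ordinary adapter inference, the quick start code is sufficient.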
| | |
| | ## Training Details |
| | |
| | ### Training Data |
| | The following sources were used for fine-tuning:
| | |
| | - OFAC FAQ: https://ofac.treasury.gov/faqs |
| | - FATF Recommendations: https://www.fatf-gafi.org/content/dam/fatf-gafi/recommendations/FATF%20Recommendations%202012.pdf.coredownload.inline.pdf |
| | |
| | |
| | ### Training Procedure |
| | |
| | Supervised fine-tuning was applied to the Apertus 8B Instruct model using a training dataset
| | of OFAC FAQ question/answer pairs and FATF recommendation title/text pairs.
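
To illustrate what such pairs might look like as training records, the sketch below converts (prompt, response) pairs into chat-style messages. The record layout, the helper names `to_chat_example` and `build_dataset`, and the reuse of the quick start system prompt are assumptions for illustration; the actual training format is not documented here. The sample strings are placeholders, not real FAQ content.

```python
# System prompt reused from the quick start example (an assumption for training).
SYSTEM_PROMPT = "You are a helpful assistant for sanctions/AML."


def to_chat_example(prompt: str, response: str, system_prompt: str = SYSTEM_PROMPT) -> dict:
    """Wrap one (prompt, response) pair as a chat-format training record."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt.strip()},
            {"role": "assistant", "content": response.strip()},
        ]
    }


def build_dataset(pairs) -> list:
    """Convert pairs to records, skipping pairs with an empty side."""
    return [to_chat_example(p, r) for p, r in pairs if p.strip() and r.strip()]


# Placeholder pairs: one FAQ-style and one FATF-style entry.
pairs = [
    ("<OFAC FAQ question text>", "<OFAC FAQ answer text>"),
    ("<FATF recommendation title>", "<FATF recommendation text>"),
    ("<question with no answer>", "   "),  # dropped by build_dataset
]
dataset = build_dataset(pairs)
```

Records in this shape can be rendered into model input with a tokenizer's `apply_chat_template`, as in the quick start example.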
| | |
| | |
| | |
| | ## Evaluation |
| | |
| | Model evaluation has NOT been performed yet! |
| | |
| | |
| | ### Framework versions
| | 
| | - PEFT 0.13.2