---
license: apache-2.0
base_model:
- ibm-granite/granite-3.3-8b-instruct
---

# Micro-G3.3-8B-Instruct-1B
|
|
**Model Summary:**
Micro-G3.3-8B-Instruct-1B is a 1-billion-parameter micro language model fine-tuned for reasoning and instruction following. It is built on top of Granite-3.3-8B-Instruct but reduced to only 3 hidden layers, and it is trained to maximize performance and hardware compatibility at minimal compute cost.
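As a quick sanity check, you can confirm the reduced depth from the model configuration (a minimal sketch; it assumes the config exposes the standard `num_hidden_layers` field):

```python
from transformers import AutoConfig

# Fetch the configuration without downloading the model weights.
config = AutoConfig.from_pretrained("ibm-ai-platform/micro-g3.3-8b-instruct-1b")
print(config.num_hidden_layers)  # expected: 3, per the summary above
```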
|
|
**Generation:**
This is a simple example of how to use the Micro-G3.3-8B-Instruct-1B model.
|
|
Install the following libraries:
|
|
```shell
pip install torch torchvision torchaudio
pip install accelerate
pip install transformers
```
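Optionally, verify the install and check for a GPU first (a minimal sketch; the generation snippet below assumes a CUDA device is available):

```python
import torch

# Confirm the install and check whether a CUDA device is present;
# if not, set device = "cpu" in the generation snippet below.
print(torch.__version__)
print(torch.cuda.is_available())
```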
Then, run the following snippet:
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
import torch

model_path = "ibm-ai-platform/micro-g3.3-8b-instruct-1b"
device = "cuda"  # set to "cpu" if no GPU is available

# Load the model in bfloat16 and place it on the target device.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map=device,
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Build a single-turn conversation and apply the chat template,
# enabling the model's thinking (reasoning) mode.
conv = [{"role": "user", "content": "What is your favorite color?"}]
inputs = tokenizer.apply_chat_template(
    conv,
    return_tensors="pt",
    thinking=True,
    return_dict=True,
    add_generation_prompt=True,
).to(device)

set_seed(42)  # fix the seed for reproducible output
output = model.generate(
    **inputs,
    max_new_tokens=8,
)

# Decode only the newly generated tokens, skipping the prompt.
prediction = tokenizer.decode(output[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(prediction)
```
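The call above decodes greedily and is capped at 8 new tokens, so the reply is deterministic but short. Continuing from the snippet above, a sampled variation like the following produces longer output (the sampling values are illustrative, not tuned defaults for this model):

```python
set_seed(42)
output = model.generate(
    **inputs,
    max_new_tokens=128,  # allow a longer completion
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.7,     # illustrative values, not tuned defaults
    top_p=0.9,
)
print(tokenizer.decode(output[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```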
|
|
|
|