---
license: llama3.2
---
# Vector AI

**A fine-tuned general-purpose chat model built on Meta's Llama 3.2 3B Instruct.**

Developed by [LiquidVizion](https://liquidvizion.me) · Based on `meta-llama/Llama-3.2-3B-Instruct`

* * *

## Model Overview

Vector AI is a conversational language model fine-tuned from Llama 3.2 3B Instruct using QLoRA (4-bit quantized low-rank adaptation). It is designed for general chat and assistant tasks, with training focused on improving conversational coherence, instruction following, and response quality at the 3B parameter scale.

| Property | Value |
| --- | --- |
| Base Model | meta-llama/Llama-3.2-3B-Instruct |
| Fine-tune Method | QLoRA (PEFT) |
| Parameters | ~3B |
| Context Length | 4096 tokens |
| Language | English |
| License | Llama 3.2 Community License |

* * *

## Repository Contents

This repository includes four files for different use cases:

| File | Description | Use case |
| --- | --- | --- |
| `model.safetensors` | Full fp16 merged model | Production inference, further fine-tuning |
| `vector-ai-q6.gguf` | Q6_K GGUF quantization | High-quality local inference (llama.cpp, LM Studio) |
| `vector-ai-q4.gguf` | Q4_K_M GGUF quantization | Faster/lighter local inference, lower VRAM |
| `adapter_model.safetensors` | PEFT LoRA adapter weights | Apply on top of the base Llama 3.2 3B Instruct |

* * *

## Quickstart

### PEFT adapter (load on top of base model)

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel
    import torch
    
    base_model_id = "meta-llama/Llama-3.2-3B-Instruct"
    adapter_id    = "liquidvizion/vector-ai"  # update with your HF repo path
    
    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.float16,
        device_map="auto",
    )
    
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model = model.merge_and_unload()  # optional: merge for faster inference
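
Once the model and tokenizer are loaded, a reply can be generated through the tokenizer's built-in chat template. The following is a minimal sketch, not the author's own inference script: `build_messages` and `generate_reply` are hypothetical helper names, and the default system prompt is borrowed from the Chat Template section of this card.

```python
def build_messages(user_text, system_text="You are Vector AI, a helpful assistant."):
    """Return a chat in the role/content format expected by apply_chat_template."""
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]


def generate_reply(model, tokenizer, user_text, **gen_kwargs):
    """Generate one assistant turn (hypothetical helper, not part of any library)."""
    inputs = tokenizer.apply_chat_template(
        build_messages(user_text),
        add_generation_prompt=True,  # append the assistant header so the model answers
        return_dict=True,            # get input_ids and attention_mask together
        return_tensors="pt",
    ).to(model.device)
    gen_kwargs.setdefault("max_new_tokens", 512)
    output_ids = model.generate(**inputs, **gen_kwargs)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With the `model` and `tokenizer` objects created above, `print(generate_reply(model, tokenizer, "Hello!"))` should print the assistant's text. Sampling options such as `temperature` can be passed as extra keyword arguments.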

### llama.cpp / LM Studio (GGUF)

Download either GGUF file and load it directly in LM Studio, Ollama, or llama.cpp. With the llama.cpp CLI:

    # Q6 — recommended for quality (requires ~3.5 GB RAM)
    llama-cli -m vector-ai-q6.gguf -p "You are Vector AI." --chat-template llama3
    
    # Q4 — recommended for speed / lower memory (~2.5 GB RAM)
    llama-cli -m vector-ai-q4.gguf -p "You are Vector AI." --chat-template llama3
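
The memory figures in the comments above can be sanity-checked with rough arithmetic: weight size is approximately the parameter count times the bits per weight, divided by 8. The sketch below assumes about 3.2 billion parameters and ballpark bits-per-weight averages for llama.cpp's Q6_K (6.5625) and Q4_K_M (about 4.85). Neither number comes from this card, and real files differ somewhat because some tensors are stored at other precisions.

```python
def weight_size_gb(n_params, bits_per_weight):
    """Approximate size of the quantized weights in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 3.2e9  # assumption: approximate parameter count of a "3B" Llama 3.2 model

# Assumed average bits per weight; treat these as ballpark values.
BITS_PER_WEIGHT = {"Q6_K": 6.5625, "Q4_K_M": 4.85}

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{weight_size_gb(N_PARAMS, bpw):.1f} GB of weights")
```

These are weight-only estimates. The gap between them and the RAM figures quoted above is taken up by the KV cache, compute buffers, and runtime overhead, all of which grow with context length.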

* * *

## Chat Template

Vector AI uses the standard Llama 3.2 Instruct chat template:

    <|begin_of_text|><|start_header_id|>system<|end_header_id|>
    You are Vector AI, a helpful assistant.<|eot_id|>
    <|start_header_id|>user<|end_header_id|>
    {your message here}<|eot_id|>
    <|start_header_id|>assistant<|end_header_id|>
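
For runtimes that take a raw prompt string, the template can be assembled by hand. The sketch below follows the Llama 3 prompt format as commonly documented, in which each header is followed by a blank line and turns are concatenated without extra newlines. That whitespace detail and the function name `format_prompt` are assumptions rather than something this card specifies. With `transformers`, `tokenizer.apply_chat_template` is the safer route.

```python
def format_prompt(user_text, system_text="You are Vector AI, a helpful assistant."):
    """Build a single-turn Llama 3 style prompt that ends at the assistant header."""
    def turn(role, content):
        return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"

    return (
        "<|begin_of_text|>"
        + turn("system", system_text)
        + turn("user", user_text)
        # Leave the assistant turn open so the model writes the reply.
        + "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )
```

Some runtimes prepend `<|begin_of_text|>` automatically. If yours does, drop it from the string to avoid a duplicate token.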

* * *

## Training Details

| Setting | Value |
| --- | --- |
| Base model | meta-llama/Llama-3.2-3B-Instruct |
| Fine-tune method | QLoRA |
| Quantization (training) | 4-bit NF4 (bitsandbytes) |
| Training framework | Unsloth + HuggingFace PEFT |
| Training data | General conversation / chat |
| Hardware | NVIDIA RTX 4060 8GB |

* * *

## Recommended Inference Settings

These settings work well for general chat use:

    temperature  = 0.7    # balanced creativity vs coherence
    top_p        = 0.9    # nucleus sampling
    top_k        = 50     # vocabulary diversity
    repetition_penalty = 1.1   # reduces looping
    max_new_tokens = 512

For more deterministic / factual responses, lower temperature to `0.3–0.5`.
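
In `transformers`, these values map onto keyword arguments of `model.generate`. A minimal sketch follows; `CHAT_DEFAULTS` and `factual_settings` are hypothetical names, and `do_sample=True` is an addition here because the sampling parameters are ignored under greedy decoding.

```python
CHAT_DEFAULTS = {
    "do_sample": True,          # required for temperature/top_p/top_k to take effect
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 50,
    "repetition_penalty": 1.1,
    "max_new_tokens": 512,
}


def factual_settings(temperature=0.4):
    """Return a copy of the defaults with a lower temperature (0.3 to 0.5 suggested)."""
    if not 0.0 < temperature <= 0.7:
        raise ValueError("temperature should be positive and no higher than the chat default")
    return {**CHAT_DEFAULTS, "temperature": temperature}
```

Either dictionary can be unpacked into the call, for example `model.generate(**inputs, **CHAT_DEFAULTS)`, where `inputs` is the tokenized prompt.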

* * *

## Limitations

* English only. Performance on other languages is untested.
* 3B parameter scale: larger models will outperform it on complex reasoning tasks.
* Not trained for code generation, mathematics, or domain-specific professional tasks.
* Like all language models, Vector AI can produce inaccurate or hallucinated responses. Always verify important information.
* Not aligned for safety-critical or high-stakes applications.

* * *

## License

This model is released under the **[Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE)**. Use is subject to Meta's acceptable use policy. Commercial use is permitted under the terms of that license.

Base model: © Meta Platforms, Inc.

Fine-tune and adapter weights: © LiquidVizion

* * *

## About LiquidVizion

[LiquidVizion](https://liquidvizion.me) is a creative AI and design studio publishing open models, LoRA adapters, and generative AI tools. Find more models and resources on [HuggingFace](https://huggingface.co/liquidvizion) and [CivitAI](https://civitai.com).