Welcome to A3ON-1B, the enhanced version of the A3ON AI assistant! With 1.1 billion parameters, this model is designed to provide significantly improved capabilities over the original 124M-parameter model. Whether you need help with conversational tasks or code generation, A3ON-1B is here to assist you!
## Key Features

- **Enhanced Intelligence:** With 1.1B parameters, A3ON-1B offers more sophisticated understanding and responses.
- **Code Generation:** Get advanced programming assistance and code completion.
- **Conversational Intelligence:** Engage in natural dialogue with seamless understanding and response generation.
- **Context Awareness:** Maintains context across multi-turn conversations for a more coherent interaction.
- **Smart Response Detection:** Automatically distinguishes between coding and general knowledge requests.
## Technical Specifications

| Specification | Details |
|---|---|
| Architecture | Transformer-based neural network |
| Model Type | Causal language model |
| Parameters | 1.1 billion (1,137,207,296) |
| Vocabulary Size | 49,152 tokens |
| Context Length | 8,192 tokens |
| Precision | FP32/FP16 support |
## Developer Information

- **AI Name:** A3ON-1B
- **Developer:** Kaiiddo
- **Founder:** Aryan Rathod
- **Organization:** Kaiiddo
- **Location:** Gujarat, India
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("kaiiddo/A3ON-1B")
model = AutoModelForCausalLM.from_pretrained("kaiiddo/A3ON-1B")

# Set pad_token_id to eos_token_id to avoid warnings
model.config.pad_token_id = model.config.eos_token_id

# Generate text with adjusted sampling parameters
inputs = tokenizer("Hello, how can I help you today?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=500,
    do_sample=True,
    temperature=0.7,
    top_k=50,
)

# Decode the output and print it line by line
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
for line in response.split("\n"):
    print(line)
```
## Model Parameter Count

| Parameter Type | Count |
|---|---|
| Total Parameters | 1.1B (1,137,207,296) |
| Trainable Parameters | 1.1B (1,137,207,296) |
| Non-Trainable Parameters | 0 |
## Model Architecture

| Architecture Detail | Value |
|---|---|
| Model Type | GPTBigCodeForCausalLM |
| Context Length | 8,192 tokens |
| Vocabulary Size | 49,152 tokens |
| Embedding Dimension | 2048 |
| Number of Layers | 24 |
| Number of Attention Heads | 16 |
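The total of 1,137,207,296 parameters can be reproduced from the architecture details above. The sketch below assumes GPTBigCode defaults that the card does not state explicitly (multi-query attention with a single shared key/value head, learned positional embeddings over the 8,192-token context, and input/output embeddings tied); under those assumptions the arithmetic lands exactly on the published count.

```python
# Reconstruct A3ON-1B's parameter count from its architecture table.
# Assumes GPTBigCode defaults (multi-query attention, tied embeddings);
# these are plausible but not confirmed by the card itself.
vocab, d_model, n_layers, n_heads, n_positions = 49_152, 2048, 24, 16, 8192
head_dim = d_model // n_heads  # 128

# Token embeddings + learned positional embeddings
embeddings = vocab * d_model + n_positions * d_model

# Attention: fused QKV projection (one shared K/V head under multi-query)
# plus the output projection, both with biases
c_attn = d_model * (d_model + 2 * head_dim) + (d_model + 2 * head_dim)
attn_out = d_model * d_model + d_model

# MLP with the usual 4x expansion, with biases
mlp_up = d_model * 4 * d_model + 4 * d_model
mlp_down = 4 * d_model * d_model + d_model

# Two LayerNorms per block, each with weight and bias
norms = 2 * 2 * d_model

per_layer = c_attn + attn_out + mlp_up + mlp_down + norms
total = embeddings + n_layers * per_layer + 2 * d_model  # + final LayerNorm

print(f"{total:,}")  # 1,137,207,296
```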
## Memory Information

| Memory Detail | Value |
|---|---|
| Device | cuda:0 |
| Estimated Memory Usage | 4.24 GB (FP32) |
| GPU | Tesla T4 |
| GPU Memory | 14.7 GB |
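The 4.24 GB estimate follows directly from the parameter count: each FP32 weight takes 4 bytes. The sanity check below also derives the FP16 footprint implied by the card's FP16 support.

```python
# Weight memory implied by parameter count and numeric precision.
params = 1_137_207_296

bytes_fp32 = params * 4  # 4 bytes per FP32 parameter
bytes_fp16 = params * 2  # 2 bytes per FP16 parameter

gib = 2 ** 30  # the card's "GB" figures are binary gigabytes
print(f"FP32: {bytes_fp32 / gib:.2f} GB")  # FP32: 4.24 GB
print(f"FP16: {bytes_fp16 / gib:.2f} GB")  # FP16: 2.12 GB
```

These figures cover weights only; generation adds activations and the KV cache on top, but either precision fits comfortably within the Tesla T4's 14.7 GB.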
## Model Category

**Category:** Massive Model (1B+)
A3ON-1B is proudly developed in India, tailored to excel in coding assistance and beyond.