Welcome to A3ON-1B, the enhanced version of the A3ON AI assistant! With 1.1 billion parameters, this model is designed to provide significantly improved capabilities over the original 124M-parameter model. Whether you need help with conversational tasks or code generation, A3ON-1B is here to assist you!
## Key Features

- **Enhanced Intelligence:** With 1.1B parameters, A3ON-1B offers more sophisticated understanding and responses.
- **Code Generation:** Get advanced programming assistance and code completion.
- **Conversational Intelligence:** Engage in natural dialogue with seamless understanding and response generation.
- **Context Awareness:** Maintains context across multi-turn conversations for a more coherent interaction.
- **Smart Response Detection:** Automatically distinguishes between coding and general knowledge requests.
## Technical Specifications

| Specification | Details |
|---|---|
| Architecture | Transformer-based neural network |
| Model Type | Causal language model |
| Parameters | 1.1 billion (1,137,207,296) |
| Vocabulary Size | 49,152 tokens |
| Context Length | 8,192 tokens |
| Precision | FP32/FP16 support |
## Developer Information

- **AI Name:** A3ON-1B
- **Developer:** Kaiiddo
- **Founder:** Aryan Rathod
- **Organization:** Kaiiddo
- **Location:** Gujarat, India
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("kaiiddo/A3ON-1B")
model = AutoModelForCausalLM.from_pretrained("kaiiddo/A3ON-1B")

# Set pad_token_id to eos_token_id to avoid warnings
model.config.pad_token_id = model.config.eos_token_id

# Generate text with adjusted sampling parameters
inputs = tokenizer("Hello, how can I help you today?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=500,
    do_sample=True,
    temperature=0.7,
    top_k=50,
)

# Decode the output and print it line by line
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
for line in response.split("\n"):
    print(line)
```
## Model Parameter Count

| Parameter Type | Count |
|---|---|
| Total Parameters | 1.1B (1,137,207,296) |
| Trainable Parameters | 1.1B (1,137,207,296) |
| Non-Trainable Parameters | 0 |
## Model Architecture

| Architecture Detail | Value |
|---|---|
| Model Type | GPTBigCodeForCausalLM |
| Context Length | 8,192 tokens |
| Vocabulary Size | 49,152 tokens |
| Embedding Dimension | 2048 |
| Number of Layers | 24 |
| Number of Attention Heads | 16 |
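The total of 1,137,207,296 parameters can be reproduced from the architecture details above. The sketch below assumes GPTBigCode defaults that the card does not state explicitly (multi-query attention with a single shared key/value head, learned positional embeddings over the 8,192-token context, and input/output embeddings tied); under those assumptions the arithmetic lands exactly on the published count.

```python
# Reconstruct A3ON-1B's parameter count from its architecture table.
# Assumes GPTBigCode defaults (multi-query attention, tied embeddings);
# these are plausible but not confirmed by the card itself.
vocab, d_model, n_layers, n_heads, n_positions = 49_152, 2048, 24, 16, 8192
head_dim = d_model // n_heads  # 128

# Token embeddings + learned positional embeddings
embeddings = vocab * d_model + n_positions * d_model

# Attention: fused QKV projection (one shared K/V head under multi-query)
# plus the output projection, both with biases
c_attn = d_model * (d_model + 2 * head_dim) + (d_model + 2 * head_dim)
attn_out = d_model * d_model + d_model

# MLP with the usual 4x expansion, with biases
mlp_up = d_model * 4 * d_model + 4 * d_model
mlp_down = 4 * d_model * d_model + d_model

# Two LayerNorms per block, each with weight and bias
norms = 2 * 2 * d_model

per_layer = c_attn + attn_out + mlp_up + mlp_down + norms
total = embeddings + n_layers * per_layer + 2 * d_model  # + final LayerNorm

print(f"{total:,}")  # 1,137,207,296
```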
## Memory Information

| Memory Detail | Value |
|---|---|
| Device | cuda:0 |
| Estimated Memory Usage | 4.24 GB (FP32) |
| GPU | Tesla T4 |
| GPU Memory | 14.7 GB |
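The 4.24 GB estimate follows directly from the parameter count: each FP32 weight takes 4 bytes. The sanity check below also derives the FP16 footprint implied by the card's FP16 support.

```python
# Weight memory implied by parameter count and numeric precision.
params = 1_137_207_296

bytes_fp32 = params * 4  # 4 bytes per FP32 parameter
bytes_fp16 = params * 2  # 2 bytes per FP16 parameter

gib = 2 ** 30  # the card's "GB" figures are binary gigabytes
print(f"FP32: {bytes_fp32 / gib:.2f} GB")  # FP32: 4.24 GB
print(f"FP16: {bytes_fp16 / gib:.2f} GB")  # FP16: 2.12 GB
```

These figures cover weights only; generation adds activations and the KV cache on top, but either precision fits comfortably within the Tesla T4's 14.7 GB.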
## Model Category

**Category:** Massive Model (1B+)
A3ON-1B is proudly developed in India, tailored to excel in coding assistance and beyond.