# AgGPT-16
A very lightweight language model that can be scaled and improved easily. Built with advanced attention mechanisms, context awareness, and quality-control features to deliver coherent and contextually relevant responses.
## Note
The AgGPT-16 model, despite its name, does not represent the most advanced iteration in the AgGPT series. Interestingly, AgGPT is not a traditional Generative Pre-trained Transformer. Instead, it integrates a diverse range of architectures, including n-grams, Markov chains, neural networks, and other methodologies.
Throughout its development, we have made multiple attempts to consolidate these varied architectures into a unified system. This endeavour was particularly evident in AgGPT-14. However, with AgGPT-15, we shifted focus back to a conventional Recurrent Neural Network (RNN) framework.
In AgGPT-16, we introduced a new .feather save system alongside an innovative n-gram approach. Unfortunately, this new n-gram method has not yet demonstrated optimal efficiency. Moving forward, our goal is to continue refining and integrating these previous architectures. Through this process, we aim to develop a fully functional and exceptionally powerful model within the AgGPT series.
## Quick Start
### Basic Usage
```python
from AgGPT16 import ask

response = ask("Hello, how are you today?")
print(response)
```
## Configuration Options
```python
from AgGPT16 import AgGPT16

ai = AgGPT16(
    model_file='custom_model.feather',  # Model save location
    max_n=5,                            # Maximum n-gram size
    output_length=150                   # Max response length
)
```
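Once configured, the instance can be queried directly. The sketch below assumes the class exposes an `ask` method mirroring the module-level `ask` helper; the method name is an assumption, so check the `AgGPT16` class for the actual interface.

```python
# Minimal usage sketch for a configured instance.
# NOTE: `ai.ask(...)` is assumed to mirror the module-level ask() helper;
# the actual method name on the class may differ.
response = ai.ask("What can you tell me about n-gram models?")
print(response)
```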
## Training Data Format
The model expects conversation data in this format:
```text
user: [user message]
ai: [ai response] <|endoftext|>
```
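As an illustration, here is a minimal sketch of writing conversation turns into a corpus file in this format; the `turns` list and the `training_data.txt` file name are placeholders, not part of the project.

```python
# Minimal sketch: write (user, ai) turn pairs in the expected training format.
# The turns list and output file name are placeholders for illustration.
turns = [
    ("Hello, how are you today?", "I'm doing well, thanks for asking!"),
    ("What is AgGPT?", "AgGPT is a lightweight experimental language model."),
]

with open("training_data.txt", "w", encoding="utf-8") as f:
    for user_msg, ai_msg in turns:
        f.write(f"user: {user_msg}\n")
        f.write(f"ai: {ai_msg} <|endoftext|>\n")
```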
## Limitations
- Training time scales with corpus size
- Memory usage increases with vocabulary size
- Response quality depends on training data quality
- No external knowledge beyond training corpus
## Contributing
This is an educational/research project. Feel free to experiment and improve upon the architecture!
## License
Open source - feel free to use and modify.