---
language:
- en
---

## ChatGCLM-330M

<img src="./banner.png" alt="ChatGCLM Hero" width="600">

<strong>A high-performance language model architecture.</strong>


---

## Overview

**ChatGCLM** is a generative language model that deviates from the traditional Transformer architecture by utilizing a hybrid approach of **Local** and **Global Convolutions**. By leveraging Fast Fourier Transforms (FFT) for global context, ChatGCLM achieves a massive receptive field at a fraction of the computational overhead of standard attention mechanisms.

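To illustrate why the FFT helps: convolving a length-L sequence with a length-L kernel directly costs O(L²), but going through the frequency domain costs O(L log L). The sketch below is an illustrative assumption, not code from this repository; the function name `global_conv_fft` is hypothetical.

```python
import numpy as np

def global_conv_fft(x, k):
    """Causal convolution of sequence x with kernel k in O(L log L) via FFT.

    Zero-padding to 2*L avoids the circular wraparound of a plain FFT
    convolution; truncating to the first L outputs gives the causal result.
    """
    L = x.shape[-1]
    n = 2 * L
    y = np.fft.irfft(np.fft.rfft(x, n=n) * np.fft.rfft(k, n=n), n=n)
    return y[..., :L]
```

Because the kernel can span the whole sequence, every output position can depend on every earlier input, which is what gives the model its global receptive field.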
The architecture is designed for efficiency, speed, and high-quality generation, featuring a custom vocabulary reduction system that optimizes the embedding space for specific datasets.

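One simple way such a vocabulary reduction could work is to keep only the symbols that actually occur in the training corpus, shrinking the embedding table accordingly. This is a sketch under that assumption, not the repository's actual implementation; `build_reduced_vocab`, `encode`, and `decode` are hypothetical names.

```python
def build_reduced_vocab(texts):
    # Restrict the character vocabulary to symbols that actually appear
    # in the corpus, so the embedding table has no unused rows.
    chars = sorted(set().union(*(set(t) for t in texts)))
    stoi = {ch: i for i, ch in enumerate(chars)}   # string -> index
    itos = {i: ch for ch, i in stoi.items()}       # index -> string
    return stoi, itos

def encode(text, stoi):
    return [stoi[ch] for ch in text]

def decode(ids, itos):
    return "".join(itos[i] for i in ids)
```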
## 📦 Installation

Download this repository and extract it.


---

## Usage

### 1. Training the Model
Place your `.txt` data files in the `data/` directory and run:
```bash
python train_chatgclm.py
```
This script will build the vocabulary and train the foundation model.

### 2. Sampling from the Model
Run `sample.py` to generate text with the trained model:
```bash
python sample.py
```

---

## Fine-tuning

You can fine-tune the model by resuming training from a checkpoint, optionally on a different dataset and with adjusted hyperparameters such as the learning rate and batch size.

<p align="center">
Built with ❤️ by AG
</p>