---
title: AI Executive System
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "6.5.0"
python_version: "3.11"
app_file: app/app.py
pinned: false
---
# AI Executive System

A production-ready AI chatbot system that replicates the communication style, reasoning patterns, and personality of **Ryouken Okuni, CEO of Akatsuki AI Technologies**. The system uses a dual-LLM architecture with open-source models, designed for white-label deployment across multiple enterprise clients.
## Architecture

```
                        AI EXECUTIVE SYSTEM

[User Query]
     │
     ▼
┌──────────────────────────────────────────────────────────────────┐
│ LLM 1: VOICE MODEL (Llama-3.1-8B-Instruct + LoRA)                │
│ - Fine-tuned on CEO's blog content                               │
│ - Captures authentic reasoning patterns & communication style    │
│ - Generates CEO-style draft response                             │
└──────────────────────────────────────────────────────────────────┘
     │
     ▼
┌──────────────────────────────────────────────────────────────────┐
│ LLM 2: REFINEMENT MODEL (Llama-3.1-8B-Instruct)                  │
│ - No fine-tuning required (prompt-based)                         │
│ - Polishes grammar, clarity, professional formatting             │
│ - Improves logical flow and argument coherence                   │
│ - Ensures cultural appropriateness for Japanese business context │
│ - Preserves voice authenticity while improving quality           │
└──────────────────────────────────────────────────────────────────┘
     │
     ▼
[Final Response to User]
```
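The two-stage flow above is a thin orchestration layer: stage 1 drafts in the CEO's voice, stage 2 refines the draft. A minimal sketch, not the actual implementation in `src/inference/`; the stub lambdas stand in for the two model calls:

```python
from typing import Callable

def run_pipeline(query: str,
                 voice_fn: Callable[[str], str],
                 refine_fn: Callable[[str], str]) -> str:
    """Stage 1 drafts in the CEO's voice; stage 2 polishes the draft."""
    draft = voice_fn(query)    # LLM 1: fine-tuned voice model
    return refine_fn(draft)    # LLM 2: prompt-based refinement model

# Stand-in callables for demonstration only; the real ones would wrap
# the loaded voice model and refinement model.
draft_stub = lambda q: f"[draft] {q}"
refine_stub = lambda d: d.replace("[draft]", "[final]")

print(run_pipeline("What is our AI strategy?", draft_stub, refine_stub))
# → [final] What is our AI strategy?
```

Keeping the two stages behind plain callables makes it easy to swap either model without touching the pipeline wiring.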
## Features

- **Dual-LLM Pipeline**: Voice model for authentic CEO communication + refinement model for polish
- **QLoRA Fine-tuning**: Efficient training with 4-bit quantization
- **Japanese Business Culture Awareness**: Culturally appropriate responses
- **Hugging Face Integration**: Models stored and loaded from HF Hub
- **Gradio Interface**: Professional chat UI with custom branding
- **Comprehensive Evaluation**: Voice authenticity and factual accuracy metrics
## Project Structure

```
ai-executive/
├── data/
│   ├── raw/              # Original blog posts (blogs.txt)
│   ├── processed/        # Cleaned and segmented content
│   └── training/         # Final JSONL training datasets
│
├── src/
│   ├── data_processing/  # Blog parsing, Q&A generation
│   ├── training/         # QLoRA/LoRA fine-tuning scripts
│   ├── inference/        # Dual-LLM pipeline
│   └── evaluation/       # Voice and accuracy metrics
│
├── app/                  # Gradio application
├── scripts/              # CLI tools
└── notebooks/            # Jupyter notebooks for experiments
```
## Quick Start

### 1. Installation

```bash
pip install -r requirements.txt
```

### 2. Prepare Data

Place your CEO's blog content in `data/raw/blogs.txt` in the following format:

```
=== BLOG START ===
[Title of first blog post]
[Content of first blog post...]
=== BLOG END ===

=== BLOG START ===
[Title of second blog post]
[Content of second blog post...]
=== BLOG END ===
```
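A parser for this marker format can be quite small. The function below is an illustrative sketch; the real logic lives in `scripts/process_blogs.py`, whose internals are not shown here:

```python
import re

def parse_blogs(text: str) -> list[dict]:
    """Split a blogs.txt dump into {title, content} records using the
    === BLOG START === / === BLOG END === markers. The first line of
    each block is treated as the title."""
    posts = []
    blocks = re.findall(r"=== BLOG START ===\n(.*?)\n=== BLOG END ===",
                        text, flags=re.S)
    for block in blocks:
        title, _, body = block.partition("\n")
        posts.append({"title": title.strip(), "content": body.strip()})
    return posts

sample = """=== BLOG START ===
Why We Build in the Open
Our philosophy is simple...
=== BLOG END ==="""
print(parse_blogs(sample))
# → [{'title': 'Why We Build in the Open', 'content': 'Our philosophy is simple...'}]
```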
### 3. Process Blogs

```bash
python scripts/process_blogs.py --input data/raw/blogs.txt --output data/processed/
```

### 4. Generate Training Data

```bash
# Set your API key
export ANTHROPIC_API_KEY=your_key_here
# or
export OPENAI_API_KEY=your_key_here

# Generate Q&A pairs
python scripts/generate_training_data.py \
    --input data/processed/ \
    --output data/training/ \
    --num-pairs 500
```
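The generated files are JSONL: one JSON object per line. The field names below (`question`/`answer`) are assumed for illustration; check the output of `generate_training_data.py` for the actual schema:

```python
import json

# Hypothetical record shape for data/training/train.jsonl; the real
# schema is whatever generate_training_data.py emits.
pair = {
    "question": "How should startups approach AI adoption?",
    "answer": "Start small, measure carefully, and scale what works.",
}

# JSONL stores one object per line and round-trips losslessly.
line = json.dumps(pair, ensure_ascii=False)
print(json.loads(line) == pair)
# → True
```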
### 5. Fine-tune the Model

```bash
# Run on Hugging Face infrastructure
python scripts/train_model.py \
    --dataset data/training/train.jsonl \
    --base-model meta-llama/Llama-3.1-8B-Instruct \
    --output-repo your-username/ceo-voice-model \
    --epochs 3
```
### 6. Run the Chatbot

```bash
python app/app.py
```

Or deploy to Hugging Face Spaces:

```bash
python scripts/push_to_hub.py --space your-username/ai-executive-chatbot
```
## Configuration

### Environment Variables

Create a `.env` file in the project root:

```env
# API keys for Q&A generation
ANTHROPIC_API_KEY=your_anthropic_key
OPENAI_API_KEY=your_openai_key

# Hugging Face
HF_TOKEN=your_huggingface_token
HF_USERNAME=your_username

# Model configuration
VOICE_MODEL_REPO=your-username/ceo-voice-model
REFINEMENT_MODEL=meta-llama/Llama-3.1-8B-Instruct
```
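In application code these variables are typically read through `os.environ`. A minimal loader sketch, assuming the variable names above (the fallback defaults are placeholders, not real repos):

```python
import os

def load_model_config() -> dict:
    """Read model-related settings from the environment, falling back
    to placeholder defaults when a variable is unset."""
    return {
        "voice_model": os.environ.get(
            "VOICE_MODEL_REPO", "your-username/ceo-voice-model"),
        "refinement_model": os.environ.get(
            "REFINEMENT_MODEL", "meta-llama/Llama-3.1-8B-Instruct"),
    }

cfg = load_model_config()
print(sorted(cfg))
# → ['refinement_model', 'voice_model']
```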
### Training Configuration

Key hyperparameters in `src/training/train_qlora.py`:

| Parameter | Default | Description |
|-----------|---------|-------------|
| LoRA rank | 64 | Rank of the LoRA matrices |
| LoRA alpha | 128 | Scaling factor |
| Learning rate | 2e-4 | Training learning rate |
| Batch size | 4 | Per-device batch size |
| Gradient accumulation | 4 | Steps before each optimizer update |
| Max sequence length | 2048 | Maximum tokens per example |
| Epochs | 3-5 | Training epochs |
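With the defaults above, the effective batch size is the per-device batch size times the gradient-accumulation steps: 4 × 4 = 16 examples per optimizer update. A small sketch mirroring the table (names are illustrative, not the actual variables in `train_qlora.py`):

```python
from dataclasses import dataclass

@dataclass
class QLoRAConfig:
    # Defaults mirror the hyperparameter table above.
    lora_rank: int = 64
    lora_alpha: int = 128
    learning_rate: float = 2e-4
    per_device_batch_size: int = 4
    gradient_accumulation_steps: int = 4
    max_seq_length: int = 2048
    epochs: int = 3

    @property
    def effective_batch_size(self) -> int:
        # Each optimizer step accumulates gradients over this many examples.
        return self.per_device_batch_size * self.gradient_accumulation_steps

print(QLoRAConfig().effective_batch_size)
# → 16
```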
## Evaluation

Run the evaluation suite:

```bash
python scripts/evaluate_model.py \
    --model your-username/ceo-voice-model \
    --test-set data/training/validation.jsonl
```

### Metrics

- **Vocabulary Overlap**: Jaccard similarity with the blog corpus (target: >0.7)
- **Embedding Similarity**: Topic coherence with the source material (target: >0.8)
- **Factual Accuracy**: Claims verified against the source (target: >95%)
- **Unique Phrase Preservation**: CEO's signature phrases detected
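The Vocabulary Overlap metric is a plain Jaccard similarity over word sets; a minimal sketch with crude whitespace tokenization (the real metric in `src/evaluation/` may tokenize differently):

```python
def vocabulary_overlap(generated: str, corpus: str) -> float:
    """Jaccard similarity between the word sets of a generated response
    and the blog corpus: |A ∩ B| / |A ∪ B|."""
    a = set(generated.lower().split())
    b = set(corpus.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

# {ai, changes, everything} vs {ai, changes, nothing}: 2 shared / 4 total
print(vocabulary_overlap("AI changes everything", "AI changes nothing"))
# → 0.5
```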
## Deployment

### Hugging Face Spaces

1. Create a new Space on Hugging Face
2. Push the application:

```bash
python scripts/push_to_hub.py --space your-username/ai-executive-chatbot
```

### Requirements

- GPU: NVIDIA A10G or T4 recommended
- VRAM: 16 GB+ for inference
- Storage: 20 GB+ for models
## License

Proprietary - Akatsuki AI Technologies

## Contact

For questions or support, contact the Akatsuki AI Technologies team.