---
title: AI_Agent_Script_Builder
app_file: src/app.py
sdk: gradio
sdk_version: 5.13.1
---
# AI Agent Script Builder
[License: MIT](https://opensource.org/licenses/MIT)
[Python](https://www.python.org/downloads/)
[Code style: black](https://github.com/psf/black)
[PRs welcome](http://makeapullrequest.com)
[Hugging Face Space](https://huggingface.co/spaces/rogeliorichman/AI_Script_Generator)
> Transform transcripts and PDFs into timed, structured teaching scripts using an autonomous AI agent
AI Agent Script Builder is an autonomous agent that converts PDF transcripts, raw text, and conversational content into well-structured teaching scripts. It extracts and analyzes the input, then produces organized, pedagogically sound scripts with time markers. It is designed for educators, students, content creators, and anyone who wants to turn information into clear explanations.
## AI Agent Architecture
AI Agent Script Builder functions as a **specialized AI agent** that autonomously processes and transforms content with minimal human intervention:
### Agent Capabilities
- **Autonomous Processing**: Independently analyzes content, determines structure, and generates complete scripts
- **Decision Making**: Intelligently allocates time, prioritizes topics, and structures content based on input analysis
- **Contextual Adaptation**: Adjusts to different languages, styles, and requirements through guiding prompts
- **Obstacle Management**: Implements progressive retry strategies when facing API quota limitations
- **Goal-Oriented Operation**: Consistently works toward transforming unstructured information into coherent educational scripts
### Agent Limitations
- **Domain Specificity**: Specialized for educational script generation rather than general-purpose tasks
- **External API Dependency**: Relies on third-party language models (Gemini/OpenAI) for core reasoning
- **No Continuous Learning**: Does not improve through experience or previous interactions
This architecture enables the system to function autonomously within its specialized domain while maintaining high-quality output and resilience to common obstacles.
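The progressive retry behavior listed under Agent Capabilities can be illustrated with a minimal sketch. This is not the project's actual code: `QuotaExceededError` and `call_with_retries` are hypothetical names, and doubling the delay after each failure is an assumed strategy.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaExceededError(Exception):
    """Hypothetical error standing in for a provider's quota-limit response."""


def call_with_retries(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, waiting progressively longer after each quota error."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except QuotaExceededError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # The delay doubles each time: base, 2 * base, 4 * base, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test, because a test can record the delays instead of actually waiting.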
## Live Demo
Try it out: [AI Agent Script Builder on Hugging Face Spaces](https://huggingface.co/spaces/rogeliorichman/AI_Script_Generator)
## Features
- PDF transcript and raw text processing
- AI-powered content transformation
- Structured teaching script generation
- Coherent topic organization
- Support for multiple AI providers (Gemini/OpenAI)
- Time-marked sections for pacing
- Multilingual interface (English/Spanish) with flag selector
- Generation in ANY language through the guiding prompt (not limited to UI languages)
- Autonomous decision-making for content organization and pacing
- Self-healing capabilities with progressive retry strategies for API limitations
## Output Format
The generated scripts follow a structured format:
### Time Markers
- Each section includes time markers (e.g., `[11:45]`) to help pace delivery
- Customizable duration: from 2 to 60 minutes, with section timing adjusted accordingly
### Structure
- Introduction with learning objectives
- Time-marked content sections
- Examples and practical applications
- Interactive elements (questions, exercises)
- Recap and key takeaways
For example:
```
[00:00] Introduction to Topic
- Learning objectives
- Key concepts overview
[11:45] Main Concept Explanation
- Detailed explanation
- Practical example
- Student interaction point
[23:30] Advanced Applications
...
```
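Because every section starts with an `[MM:SS]` marker, the generated text is easy to post-process. The sketch below assumes the markers always appear at the start of a line, as in the example above. `parse_sections` and `strip_markers` are hypothetical helper names, not functions from this repository.

```python
import re
from typing import List, Tuple

# Matches a section line such as "[11:45] Main Concept Explanation".
MARKER_LINE = re.compile(r"^\[(\d{1,2}):([0-5]\d)\][ \t]*(.*)$")
# Matches only the marker prefix, for removing it from each line.
MARKER_PREFIX = re.compile(r"^\[\d{1,2}:[0-5]\d\][ \t]*", flags=re.MULTILINE)


def parse_sections(script: str) -> List[Tuple[int, str]]:
    """Return (offset_in_seconds, title) for each time-marked line."""
    sections = []
    for line in script.splitlines():
        match = MARKER_LINE.match(line.strip())
        if match:
            minutes, seconds, title = match.groups()
            sections.append((int(minutes) * 60 + int(seconds), title))
    return sections


def strip_markers(script: str) -> str:
    """Return the script with time markers removed, keeping all other text."""
    return MARKER_PREFIX.sub("", script)
```

`parse_sections` is useful for checking that section offsets increase and fit the requested duration. `strip_markers` shows one way a show/hide toggle for timestamps could work on the output text.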
## Quick Start
### Prerequisites
- Python 3.8 or higher
- Virtual environment (recommended)
- Gemini API key (or OpenAI API key)
### Installation
```bash
# Clone the repository
git clone https://github.com/RogelioRichmanAstronaut/AI-Script-Generator.git
cd AI-Script-Generator
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # On Windows: .\venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment variables (choose one API key based on your preference)
export GEMINI_API_KEY='your-gemini-api-key' # Primary option
# OR
export OPENAI_API_KEY='your-openai-api-key' # Alternative option
# On Windows use:
# set GEMINI_API_KEY=your-gemini-api-key
# set OPENAI_API_KEY=your-openai-api-key
```
### Usage
```bash
# Run with Python path set
PYTHONPATH=$PYTHONPATH:. python src/app.py
# Access the web interface
# Open http://localhost:7860 in your browser
```
## Technical Approach
### Prompt Engineering Strategy
Our system uses a sophisticated multi-stage prompting approach:
1. **Content Analysis & Chunking**
   - Smart text segmentation for handling large documents (9000+ words)
   - Contextual overlap between chunks to maintain coherence
   - Key topic and concept extraction from each segment
2. **Structure Generation**
   - Time-based sectioning (customizable from 2-60 minutes)
   - Educational flow design with clear progression
   - Integration of pedagogical elements (examples, exercises, questions)
3. **Educational Enhancement**
   - Transformation of casual content into formal teaching script
   - Addition of practical examples and case studies
   - Integration of interaction points and reflection questions
   - Time markers for pacing guidance
4. **Coherence Validation**
   - Cross-reference checking between sections
   - Verification of topic flow and progression
   - Consistency check for terminology and concepts
   - Quality assessment of educational elements
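The chunking step in stage 1 can be sketched as a sliding window over words. This is a simplified illustration under stated assumptions: splitting on whitespace, the `chunk_words` name, and the default sizes are all hypothetical, and the real implementation may segment by tokens or sentences instead.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 1500, overlap: int = 150) -> List[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The shared words at each boundary give the model the end of the previous segment as context, which is the purpose of the contextual overlap described above.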
### Challenges & Solutions
1. **Context Length Management**
   - Challenge: Handling documents beyond model context limits
   - Solution: Implemented sliding window chunking with overlap
   - Result: Processes documents of 9,000+ words and can be extended to longer inputs
2. **Educational Structure**
   - Challenge: Converting conversational text to teaching format
   - Solution:
     - Structured templating system for different time formats (2-60 min)
     - Integration of pedagogical elements (examples, exercises)
     - Time-based sectioning with clear progression
   - Result: Coherent, time-marked teaching scripts with interactive elements
3. **Content Coherence**
   - Challenge: Maintaining narrative flow across chunked content
   - Solution:
     - Contextual overlap between chunks
     - Topic tracking across sections
     - Cross-reference validation system
   - Result: Seamless content flow with consistent terminology
4. **Educational Quality**
   - Challenge: Ensuring high pedagogical value
   - Solution:
     - Integration of learning objectives
     - Strategic placement of examples and exercises
     - Addition of reflection questions
     - Time-appropriate pacing markers
   - Result: Engaging, structured learning materials
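To make time-based sectioning concrete, here is a minimal sketch that spreads section start times evenly over a requested duration. The even split is an assumption chosen for simplicity, since the agent itself decides how much time each topic deserves. `format_marker` and `plan_markers` are hypothetical names.

```python
from typing import List


def format_marker(seconds: int) -> str:
    """Format an offset in seconds as an [MM:SS] marker."""
    minutes, secs = divmod(seconds, 60)
    return f"[{minutes:02d}:{secs:02d}]"


def plan_markers(titles: List[str], total_minutes: int) -> List[str]:
    """Spread section start times evenly over the requested duration."""
    if not 2 <= total_minutes <= 60:
        raise ValueError("duration must be between 2 and 60 minutes")
    if not titles:
        return []
    slot = total_minutes * 60 // len(titles)  # whole seconds per section
    return [f"{format_marker(i * slot)} {title}" for i, title in enumerate(titles)]
```

A weighted version could replace the fixed `slot` with per-section weights, for example giving the main explanation more time than the recap.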
### Core Components
1. **PDF Processing**: Extracts and cleans text from PDF transcripts
2. **Text Processing**: Handles direct text input and cleans/structures it
3. **Content Analysis**: Uses AI to understand and structure the content
4. **Script Generation**: Transforms content into educational format
### Implementation Details
1. **PDF/Text Handling**
   - Robust PDF text extraction
   - Raw text input processing
   - Clean-up of extracted content
2. **AI Processing**
   - Integration with Gemini API (primary)
   - OpenAI API support (alternative)
   - Structured prompt system for consistent output
3. **Output Generation**
   - Organized teaching scripts
   - Clear section structure
   - Learning points and key concepts
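Since Gemini is the primary provider and OpenAI the alternative, provider selection can be sketched as a lookup over the two environment variables from the installation steps. The `select_provider` function and its return shape are hypothetical, and the app may resolve providers differently.

```python
import os
from typing import Mapping, Optional, Tuple

# Checked in order, so Gemini wins when both keys are set.
PROVIDER_VARIABLES = (
    ("gemini", "GEMINI_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
)


def select_provider(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (provider_name, api_key), preferring Gemini over OpenAI."""
    env = os.environ if env is None else env
    for name, variable in PROVIDER_VARIABLES:
        key = env.get(variable, "").strip()
        if key:
            return name, key
    raise RuntimeError("Set GEMINI_API_KEY or OPENAI_API_KEY before starting the app")
```

Failing at startup with a clear message is usually easier to diagnose than an authentication error on the first generation request.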
### Architecture
The system follows a modular agent-based design:
- PDF/text processing module (Perception)
- Text analysis component (Cognition)
- AI integration layer (Decision-making)
- Output formatting system (Action)
- Error handling system (Self-correction)
This agent architecture enables autonomous processing from raw input to final output with built-in adaptation to errors and limitations.
## Contributing
Contributions are what make the open source community amazing! Any contributions you make are **greatly appreciated**.
1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
## License
Distributed under the MIT License. See `LICENSE` for more information.
## Acknowledgments
- Special thanks to the Gemini and OpenAI teams for their amazing APIs
- Inspired by educators and communicators worldwide who make learning engaging
## Contact
Project Link: [https://github.com/RogelioRichmanAstronaut/AI-Script-Generator](https://github.com/RogelioRichmanAstronaut/AI-Script-Generator)
## Roadmap
- [ ] Support for multiple output formats (PDF, PPTX)
- [ ] Interactive elements generation
- [ ] Custom templating system
- [ ] Copy to clipboard button for generated content
- [x] Multilingual capabilities
  - [x] Content generation in any language via guiding prompt
  - [x] UI language support
    - [x] English
    - [x] Spanish
    - [ ] French
    - [ ] German
- [ ] Integration with LMS platforms
- [x] Timestamp toggle - ability to show/hide time markers in the output text
---
<p align="center">Made with ❤️ for educators, students, and communicators everywhere</p>