---
title: AI_Agent_Script_Builder
app_file: src/app.py
sdk: gradio
sdk_version: 5.13.1
---
# 🎓 AI Agent Script Builder

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](http://makeapullrequest.com)
[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/rogeliorichman/AI_Script_Generator)

> Transform transcripts and PDFs into timed, structured teaching scripts using an autonomous AI agent

AI Agent Script Builder is an autonomous agent that converts PDF transcripts, raw text, and conversational content into well-structured teaching scripts. It extracts and analyzes the input, then produces organized, pedagogically sound scripts with time markers. It is designed for educators, students, content creators, and anyone who wants to turn information into clear explanations.

## 🤖 AI Agent Architecture

AI Agent Script Builder functions as a **specialized AI agent** that autonomously processes and transforms content with minimal human intervention:

### Agent Capabilities
- **Autonomous Processing**: Independently analyzes content, determines structure, and generates complete scripts
- **Decision Making**: Intelligently allocates time, prioritizes topics, and structures content based on input analysis
- **Contextual Adaptation**: Adjusts to different languages, styles, and requirements through guiding prompts
- **Obstacle Management**: Implements progressive retry strategies when facing API quota limitations
- **Goal-Oriented Operation**: Consistently works toward transforming unstructured information into coherent educational scripts

### Agent Limitations
- **Domain Specificity**: Specialized for educational script generation rather than general-purpose tasks
- **External API Dependency**: Relies on third-party language models (Gemini/OpenAI) for core reasoning
- **No Continuous Learning**: Does not improve through experience or previous interactions

This architecture enables the system to function autonomously within its specialized domain while maintaining high-quality output and resilience to common obstacles.
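
The progressive retry strategy mentioned under Obstacle Management could look roughly like the sketch below. It is an illustration only: `QuotaError`, `call_with_retries`, and the delay values are hypothetical names and numbers, not taken from this project's code, and a real integration would catch the provider SDK's own rate-limit exception instead.

```python
import time


class QuotaError(Exception):
    """Hypothetical stand-in for a provider's rate-limit or quota error."""


def call_with_retries(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on QuotaError wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except QuotaError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... with the defaults
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the requested delays instead of actually waiting.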

## 🔗 Live Demo

Try it out: [AI Agent Script Builder on Hugging Face Spaces](https://huggingface.co/spaces/rogeliorichman/AI_Script_Generator)

## ✨ Features

- 🤖 PDF transcript and raw text processing
- 🤖 AI-powered content transformation
- 📚 Structured teaching script generation
- 🔄 Coherent topic organization
- 🔌 Support for multiple AI providers (Gemini/OpenAI)
- ⏱️ Time-marked sections for pacing
- 🌐 Multilingual interface (English/Spanish) with flag selector
- 🌍 Generation in ANY language through the guiding prompt (not limited to UI languages)
- 🧠 Autonomous decision-making for content organization and pacing
- 🛡️ Self-healing capabilities with progressive retry strategies for API limitations

## Output Format

The generated scripts follow a structured format:

### Time Markers
- Each section includes time markers (e.g., `[11:45]`) to help pace delivery
- Customizable duration: from 2 to 60 minutes, with timing adjusted accordingly

### Structure
- Introduction with learning objectives
- Time-marked content sections
- Examples and practical applications
- Interactive elements (questions, exercises)
- Recap and key takeaways

For example:
```
[00:00] Introduction to Topic
- Learning objectives
- Key concepts overview

[11:45] Main Concept Explanation
- Detailed explanation
- Practical example
- Student interaction point

[23:30] Advanced Applications
...
```
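
Because every section starts with a `[MM:SS]` marker, a script in this format is straightforward to post-process. The sketch below assumes the layout shown in the example above; `parse_sections` and `MARKER` are hypothetical helpers, not part of the application.

```python
import re

# Matches lines such as "[11:45] Main Concept Explanation".
MARKER = re.compile(r"^\[(\d{1,2}):(\d{2})\]\s*(.*)$")


def parse_sections(script):
    """Return (start_seconds, title) pairs for each time-marked line."""
    sections = []
    for line in script.splitlines():
        match = MARKER.match(line.strip())
        if match:
            minutes, seconds, title = match.groups()
            sections.append((int(minutes) * 60 + int(seconds), title))
    return sections


sample = """[00:00] Introduction to Topic
- Learning objectives

[11:45] Main Concept Explanation
- Detailed explanation

[23:30] Advanced Applications
"""

for start, title in parse_sections(sample):
    print(start, title)
```

For the sample above, the section starts come out as 0, 705, and 1410 seconds. Bullet lines are skipped because they do not begin with a marker.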

## 🚀 Quick Start

### Prerequisites

- Python 3.8 or higher
- Virtual environment (recommended)
- Gemini API key (or OpenAI API key)

### Installation

```bash
# Clone the repository
git clone https://github.com/RogelioRichmanAstronaut/AI-Script-Generator.git
cd AI-Script-Generator

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set up environment variables (choose one API key based on your preference)
export GEMINI_API_KEY='your-gemini-api-key'  # Primary option
# OR
export OPENAI_API_KEY='your-openai-api-key'  # Alternative option

# On Windows use:
# set GEMINI_API_KEY=your-gemini-api-key
# set OPENAI_API_KEY=your-openai-api-key
```
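
To check which key will be picked up before launching, a small helper along these lines can be useful. It is a sketch under assumptions: `select_provider` is a hypothetical function, and preferring Gemini when both keys are set simply mirrors the "primary/alternative" wording above rather than the app's confirmed behavior.

```python
import os


def select_provider(env=None):
    """Return (provider_name, api_key) for the first configured provider."""
    env = os.environ if env is None else env
    if env.get("GEMINI_API_KEY"):  # primary option
        return "gemini", env["GEMINI_API_KEY"]
    if env.get("OPENAI_API_KEY"):  # alternative option
        return "openai", env["OPENAI_API_KEY"]
    raise RuntimeError("Set GEMINI_API_KEY or OPENAI_API_KEY before starting the app.")
```

Calling `select_provider()` with no arguments reads the real environment; passing a plain dict makes it easy to try out without touching your shell.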

### Usage

```bash
# Run with Python path set
PYTHONPATH=$PYTHONPATH:. python src/app.py

# Access the web interface
# Open http://localhost:7860 in your browser
```

## 🛠️ Technical Approach

### Prompt Engineering Strategy

The system uses a multi-stage prompting approach:

1. **Content Analysis & Chunking**
   - Smart text segmentation for handling large documents (9000+ words)
   - Contextual overlap between chunks to maintain coherence
   - Key topic and concept extraction from each segment

2. **Structure Generation**
   - Time-based sectioning (customizable from 2-60 minutes)
   - Educational flow design with clear progression
   - Integration of pedagogical elements (examples, exercises, questions)

3. **Educational Enhancement**
   - Transformation of casual content into formal teaching script
   - Addition of practical examples and case studies
   - Integration of interaction points and reflection questions
   - Time markers for pacing guidance

4. **Coherence Validation**
   - Cross-reference checking between sections
   - Verification of topic flow and progression
   - Consistency check for terminology and concepts
   - Quality assessment of educational elements
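
The chunking step with contextual overlap can be sketched as follows. This is a minimal word-based version for illustration; `chunk_words` and its default sizes are assumptions, and the project may segment by tokens, sentences, or other boundaries.

```python
def chunk_words(text, chunk_size=1500, overlap=200):
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


print(chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1))
# ['a b c d', 'd e f g', 'g h i j']
```

The shared words at each boundary give the model some of the preceding context when it processes the next chunk, which is the purpose of the overlap described above.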

### Challenges & Solutions

1. **Context Length Management**
   - Challenge: Handling documents beyond model context limits
   - Solution: Implemented sliding window chunking with overlap
   - Result: Processes documents of 9000+ words and can be extended to handle longer ones

2. **Educational Structure**
   - Challenge: Converting conversational text to teaching format
   - Solution: 
     - Structured templating system for different time formats (2-60 min)
     - Integration of pedagogical elements (examples, exercises)
     - Time-based sectioning with clear progression
   - Result: Coherent, time-marked teaching scripts with interactive elements

3. **Content Coherence**
   - Challenge: Maintaining narrative flow across chunked content
   - Solution: 
     - Contextual overlap between chunks
     - Topic tracking across sections
     - Cross-reference validation system
   - Result: Seamless content flow with consistent terminology

4. **Educational Quality**
   - Challenge: Ensuring high pedagogical value
   - Solution:
     - Integration of learning objectives
     - Strategic placement of examples and exercises
     - Addition of reflection questions
     - Time-appropriate pacing markers
   - Result: Engaging, structured learning materials
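
As a rough illustration of time-based sectioning, the sketch below splits a target duration across sections in proportion to a weight and prints the resulting markers. In the application the pacing is presumably decided by the language model; `build_markers` and the weights are hypothetical and only show the arithmetic behind the markers.

```python
def build_markers(titles, total_minutes, weights=None):
    """Return "[MM:SS] title" lines whose start times split total_minutes by weight."""
    if not 2 <= total_minutes <= 60:
        raise ValueError("total_minutes must be between 2 and 60")
    weights = weights or [1] * len(titles)
    total_seconds = total_minutes * 60
    lines, elapsed = [], 0.0
    for title, weight in zip(titles, weights):
        start = int(round(elapsed))
        lines.append(f"[{start // 60:02d}:{start % 60:02d}] {title}")
        elapsed += total_seconds * weight / sum(weights)
    return lines


print(build_markers(["Introduction", "Main Concept", "Recap"], 30, [1, 4, 1]))
# ['[00:00] Introduction', '[05:00] Main Concept', '[25:00] Recap']
```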

### Core Components

1. **PDF Processing**: Extracts and cleans text from PDF transcripts
2. **Text Processing**: Handles direct text input and cleans/structures it
3. **Content Analysis**: Uses AI to understand and structure the content
4. **Script Generation**: Transforms content into educational format

### Implementation Details

1. **PDF/Text Handling**
   - Robust PDF text extraction
   - Raw text input processing
   - Clean-up of extracted content

2. **AI Processing**
   - Integration with Gemini API (primary)
   - OpenAI API support (alternative)
   - Structured prompt system for consistent output

3. **Output Generation**
   - Organized teaching scripts
   - Clear section structure
   - Learning points and key concepts

### Architecture

The system follows a modular agent-based design:

- 📄 PDF/text processing module (Perception)
- 🔍 Text analysis component (Cognition)
- 🤖 AI integration layer (Decision-making)
- 📝 Output formatting system (Action)
- 🔄 Error handling system (Self-correction)

This agent architecture enables autonomous processing from raw input to final output with built-in adaptation to errors and limitations.
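
One way to picture how these modules fit together is a simple stage pipeline with an error hook. Everything in the sketch is a placeholder: the stage functions stand in for the real modules, no model is called, and `run_pipeline` is a hypothetical name rather than the project's API.

```python
from typing import Callable, List


def run_pipeline(raw: str, stages: List[Callable[[str], str]],
                 on_error: Callable[[Exception], str]) -> str:
    """Pass raw input through each stage in order; delegate failures to on_error."""
    data = raw
    try:
        for stage in stages:
            data = stage(data)
    except Exception as exc:  # self-correction hook; real code would catch narrower errors
        return on_error(exc)
    return data


# Placeholder stages standing in for the modules listed above.
def perceive(raw: str) -> str:   # PDF/text processing
    return raw.strip()


def analyze(text: str) -> str:   # text analysis
    return text.lower()


def decide(text: str) -> str:    # AI integration layer (no model call here)
    return f"topic: {text}"


def act(plan: str) -> str:       # output formatting
    return f"[00:00] {plan}"


print(run_pipeline("  Photosynthesis ", [perceive, analyze, decide, act], on_error=str))
# [00:00] topic: photosynthesis
```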

## 🤝 Contributing

Contributions are what make the open source community amazing! Any contributions you make are **greatly appreciated**.

1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.

## 📝 License

Distributed under the MIT License. See `LICENSE` for more information.

## 🌟 Acknowledgments

- Special thanks to the Gemini and OpenAI teams for their amazing APIs
- Inspired by educators and communicators worldwide who make learning engaging

## 📧 Contact

Project Link: [https://github.com/RogelioRichmanAstronaut/AI-Script-Generator](https://github.com/RogelioRichmanAstronaut/AI-Script-Generator) 

## 🔮 Roadmap

- [ ] Support for multiple output formats (PDF, PPTX)
- [ ] Interactive elements generation
- [ ] Custom templating system
- [ ] Copy to clipboard button for generated content
- [x] Multilingual capabilities
  - [x] Content generation in any language via guiding prompt
  - [x] UI language support
    - [x] English
    - [x] Spanish
    - [ ] French
    - [ ] German
- [ ] Integration with LMS platforms
- [x] Timestamp toggle - ability to show/hide time markers in the output text

---

<p align="center">Made with ❤️ for educators, students, and communicators everywhere</p>