rogeliorichman committed on
Commit 92c68e3 · verified · 1 Parent(s): 4d3449c

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+data/sample2.pdf filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,62 @@
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# Virtual Environment
+venv/
+ENV/
+env/
+.env
+
+# IDE
+.idea/
+.vscode/
+*.swp
+*.swo
+.DS_Store
+
+# Testing
+.coverage
+htmlcov/
+.tox/
+.nox/
+.pytest_cache/
+
+# Logs
+*.log
+logs/
+
+# Local development
+.env.local
+.env.development.local
+.env.test.local
+.env.production.local
+
+# API Keys
+.env
+*.pem
+*.key
+
+# Gradio
+.gradio/
+
+# private file
+/data/sample3.pdf
CONTRIBUTING.md ADDED
@@ -0,0 +1,99 @@
+# Contributing to AI LectureForge
+
+First off, thank you for considering contributing to AI LectureForge! It's people like you that make AI LectureForge such a great tool.
+
+## Code of Conduct
+
+By participating in this project, you are expected to uphold our Code of Conduct:
+
+- Use welcoming and inclusive language
+- Be respectful of differing viewpoints and experiences
+- Gracefully accept constructive criticism
+- Focus on what is best for the community
+- Show empathy towards other community members
+
+## How Can I Contribute?
+
+### Reporting Bugs
+
+Before creating a bug report, please check the issue list, as an existing issue may already cover it. When you create a bug report, please include as many details as possible:
+
+* Use a clear and descriptive title
+* Describe the exact steps which reproduce the problem
+* Provide specific examples to demonstrate the steps
+* Describe the behavior you observed after following the steps
+* Explain which behavior you expected to see instead and why
+* Include screenshots if possible
+
+### Suggesting Enhancements
+
+If you have a suggestion for the project, we'd love to hear it. Enhancement suggestions are tracked as GitHub issues. When creating an enhancement suggestion, please include:
+
+* A clear and descriptive title
+* A detailed description of the proposed enhancement
+* Examples of how the enhancement would be used
+* Any potential drawbacks or challenges
+
+### Pull Requests
+
+1. Fork the repo and create your branch from `main`
+2. If you've added code that should be tested, add tests
+3. If you've changed APIs, update the documentation
+4. Ensure the test suite passes
+5. Make sure your code follows the existing style
+6. Issue that pull request!
+
+## Development Process
+
+1. Create a new branch:
+   ```bash
+   git checkout -b feature/my-feature
+   # or
+   git checkout -b bugfix/my-bugfix
+   ```
+
+2. Make your changes and commit:
+   ```bash
+   git add .
+   git commit -m "Description of changes"
+   ```
+
+3. Push to your fork:
+   ```bash
+   git push origin feature/my-feature
+   ```
+
+### Style Guidelines
+
+- Follow the PEP 8 style guide for Python code
+- Use descriptive variable names
+- Comment your code when necessary
+- Keep functions focused and modular
+- Use type hints where possible
+
+### Testing
+
+- Write unit tests for new features
+- Ensure all tests pass before submitting a PR
+- Include both positive and negative test cases
+
+## Project Structure
+
+```
+transcript_transformer/
+├── src/
+│   ├── core/            # Core transformation logic
+│   ├── utils/           # Utility functions
+│   └── app.py           # Main application
+├── tests/               # Test files
+└── requirements.txt     # Project dependencies
+```
+
+## Getting Help
+
+If you need help, you can:
+- Open an issue with your question
+- Reach out to the maintainers
+- Check the documentation
+
+Thank you for contributing to AI LectureForge! 🎓✨
README.md CHANGED
@@ -1,12 +1,265 @@
 ---
-title: AI Agent Script Builder
-emoji: 😻
-colorFrom: blue
-colorTo: indigo
 sdk: gradio
-sdk_version: 5.23.3
-app_file: app.py
-pinned: false
 ---

-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 ---
+title: AI_Agent_Script_Builder
+app_file: src/app.py
 sdk: gradio
+sdk_version: 5.13.1
 ---
+# 🎓 AI Agent Script Builder
+
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
+[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
+[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](http://makeapullrequest.com)
+[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/rogeliorichman/AI_Script_Generator)
+
+> Transform transcripts and PDFs into timed, structured teaching scripts using an autonomous AI agent
+
+AI Agent Script Builder is an advanced autonomous agent that converts PDF transcripts, raw text, and conversational content into well-structured teaching scripts. It processes inputs end to end, extracting and analyzing the content to create organized, pedagogically sound scripts with time markers. It is designed for educators, students, content creators, and anyone looking to transform information into clear explanations.
+
+## 🤖 AI Agent Architecture
+
+AI Agent Script Builder functions as a **specialized AI agent** that autonomously processes and transforms content with minimal human intervention:
+
+### Agent Capabilities
+- **Autonomous Processing**: Independently analyzes content, determines structure, and generates complete scripts
+- **Decision Making**: Intelligently allocates time, prioritizes topics, and structures content based on input analysis
+- **Contextual Adaptation**: Adjusts to different languages, styles, and requirements through guiding prompts
+- **Obstacle Management**: Implements progressive retry strategies when facing API quota limitations
+- **Goal-Oriented Operation**: Consistently works toward transforming unstructured information into coherent educational scripts
+
+### Agent Limitations
+- **Domain Specificity**: Specialized for educational script generation rather than general-purpose tasks
+- **External API Dependency**: Relies on third-party language models (Gemini/OpenAI) for core reasoning
+- **No Continuous Learning**: Does not improve through experience or previous interactions
+
+This architecture enables the system to function autonomously within its specialized domain while maintaining high-quality output and resilience to common obstacles.
+
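The "progressive retry strategies" above can be sketched as exponential backoff around the model call. This is an illustrative sketch, not the repository's actual implementation; the function name, delays, and broad `Exception` catch are assumptions:

```python
import time

def call_with_backoff(fn, max_retries=4, base_delay=1.0):
    """Retry fn with exponentially growing delays, a common way to
    ride out transient API quota errors. Illustrative sketch only."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:  # in practice, catch the provider's quota error type
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

A real implementation would narrow the exception to the provider's rate-limit error and cap the total wait time.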
+## 🔗 Live Demo
+
+Try it out: [AI Agent Script Builder on Hugging Face Spaces](https://huggingface.co/spaces/rogeliorichman/AI_Script_Generator)
+
+## ✨ Features
+
+- 🤖 PDF transcript and raw text processing
+- 🤖 AI-powered content transformation
+- 📚 Structured teaching script generation
+- 🔄 Coherent topic organization
+- 🔌 Support for multiple AI providers (Gemini/OpenAI)
+- ⏱️ Time-marked sections for pacing
+- 🌐 Multilingual interface (English/Spanish) with flag selector
+- 🌍 Generation in ANY language through the guiding prompt (not limited to the UI languages)
+- 🧠 Autonomous decision-making for content organization and pacing
+- 🛡️ Self-healing capabilities with progressive retry strategies for API limitations
+
+## Output Format
+
+The generated scripts follow a structured format:
+
+### Time Markers
+- Each section includes time markers (e.g., `[11:45]`) to help pace delivery
+- Customizable duration: from as short as 2 minutes up to 60 minutes, with timing adjusted accordingly
+
+### Structure
+- Introduction with learning objectives
+- Time-marked content sections
+- Examples and practical applications
+- Interactive elements (questions, exercises)
+- Recap and key takeaways
+
+For example:
+```
+[00:00] Introduction to Topic
+- Learning objectives
+- Key concepts overview
+
+[11:45] Main Concept Explanation
+- Detailed explanation
+- Practical example
+- Student interaction point
+
+[23:30] Advanced Applications
+...
+```
+
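The timestamp toggle in the UI strips these markers with a regular expression; the helper below mirrors the `remove_timestamps` method in `src/app.py`, which matches `[MM:SS]` or `[HH:MM:SS]`:

```python
import re

def remove_timestamps(text: str) -> str:
    """Strip [MM:SS] or [HH:MM:SS] markers, as the app's timestamp toggle does."""
    return re.sub(r'\[\d{1,2}:\d{2}(:\d{2})?\]', '', text)
```

Note that only the bracketed marker is removed; any surrounding whitespace is left in place.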
+## 🚀 Quick Start
+
+### Prerequisites
+
+- Python 3.8 or higher
+- Virtual environment (recommended)
+- Gemini API key (or OpenAI API key)
+
+### Installation
+
+```bash
+# Clone the repository
+git clone https://github.com/RogelioRichmanAstronaut/AI-Script-Generator.git
+cd AI-Script-Generator
+
+# Create and activate a virtual environment
+python -m venv venv
+source venv/bin/activate  # On Windows: .\venv\Scripts\activate
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Set up environment variables (choose one API key based on your preference)
+export GEMINI_API_KEY='your-gemini-api-key'  # Primary option
+# OR
+export OPENAI_API_KEY='your-openai-api-key'  # Alternative option
+
+# On Windows use:
+# set GEMINI_API_KEY=your-gemini-api-key
+# set OPENAI_API_KEY=your-openai-api-key
+```
+
+### Usage
+
+```bash
+# Run with the Python path set
+PYTHONPATH=$PYTHONPATH:. python src/app.py
+
+# Access the web interface
+# Open http://localhost:7860 in your browser
+```
+
+## 🛠️ Technical Approach
+
+### Prompt Engineering Strategy
+
+Our system uses a sophisticated multi-stage prompting approach:
+
+1. **Content Analysis & Chunking**
+   - Smart text segmentation for handling large documents (9000+ words)
+   - Contextual overlap between chunks to maintain coherence
+   - Key topic and concept extraction from each segment
+
+2. **Structure Generation**
+   - Time-based sectioning (customizable from 2-60 minutes)
+   - Educational flow design with clear progression
+   - Integration of pedagogical elements (examples, exercises, questions)
+
+3. **Educational Enhancement**
+   - Transformation of casual content into a formal teaching script
+   - Addition of practical examples and case studies
+   - Integration of interaction points and reflection questions
+   - Time markers for pacing guidance
+
+4. **Coherence Validation**
+   - Cross-reference checking between sections
+   - Verification of topic flow and progression
+   - Consistency check for terminology and concepts
+   - Quality assessment of educational elements
+
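The segmentation step in stage 1 can be sketched as a sliding window with overlap. This is a minimal illustration; the chunk sizes and function name are assumptions, not taken from the repository:

```python
def chunk_with_overlap(words, chunk_size=1500, overlap=200):
    """Split a word list into windows that share `overlap` words with their
    neighbor, so each chunk carries context from the previous one.
    Illustrative sketch; real sizes depend on the model's context limit."""
    chunks, start = [], 0
    step = chunk_size - overlap
    while start < len(words):
        chunks.append(words[start:start + chunk_size])
        start += step
    return chunks
```

Because consecutive windows share the `overlap` region, a topic split across a chunk boundary still appears whole in at least one window.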
+### Challenges & Solutions
+
+1. **Context Length Management**
+   - Challenge: Handling documents beyond model context limits
+   - Solution: Implemented sliding-window chunking with overlap
+   - Result: Successfully processes documents of 9000+ words, with room to extend further
+
+2. **Educational Structure**
+   - Challenge: Converting conversational text to a teaching format
+   - Solution:
+     - Structured templating system for different time formats (2-60 min)
+     - Integration of pedagogical elements (examples, exercises)
+     - Time-based sectioning with clear progression
+   - Result: Coherent, time-marked teaching scripts with interactive elements
+
+3. **Content Coherence**
+   - Challenge: Maintaining narrative flow across chunked content
+   - Solution:
+     - Contextual overlap between chunks
+     - Topic tracking across sections
+     - Cross-reference validation system
+   - Result: Seamless content flow with consistent terminology
+
+4. **Educational Quality**
+   - Challenge: Ensuring high pedagogical value
+   - Solution:
+     - Integration of learning objectives
+     - Strategic placement of examples and exercises
+     - Addition of reflection questions
+     - Time-appropriate pacing markers
+   - Result: Engaging, structured learning materials
+
+### Core Components
+
+1. **PDF Processing**: Extracts and cleans text from PDF transcripts
+2. **Text Processing**: Handles direct text input and cleans/structures it
+3. **Content Analysis**: Uses AI to understand and structure the content
+4. **Script Generation**: Transforms content into an educational format
+
+### Implementation Details
+
+1. **PDF/Text Handling**
+   - Robust PDF text extraction
+   - Raw text input processing
+   - Clean-up of extracted content
+
+2. **AI Processing**
+   - Integration with the Gemini API (primary)
+   - OpenAI API support (alternative)
+   - Structured prompt system for consistent output
+
+3. **Output Generation**
+   - Organized teaching scripts
+   - Clear section structure
+   - Learning points and key concepts
+
+### Architecture
+
+The system follows a modular agent-based design:
+
+- 📄 PDF/text processing module (Perception)
+- 🔍 Text analysis component (Cognition)
+- 🤖 AI integration layer (Decision-making)
+- 📝 Output formatting system (Action)
+- 🔄 Error handling system (Self-correction)
+
+This agent architecture enables autonomous processing from raw input to final output, with built-in adaptation to errors and limitations.
+
+## 🤝 Contributing
+
+Contributions are what make the open source community amazing! Any contributions you make are **greatly appreciated**.
+
+1. Fork the Project
+2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
+3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
+4. Push to the Branch (`git push origin feature/AmazingFeature`)
+5. Open a Pull Request
+
+See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
+
+## 📝 License
+
+Distributed under the MIT License. See `LICENSE` for more information.
+
+## 🌟 Acknowledgments
+
+- Special thanks to the Gemini and OpenAI teams for their amazing APIs
+- Inspired by educators and communicators worldwide who make learning engaging
+
+## 📧 Contact
+
+Project Link: [https://github.com/RogelioRichmanAstronaut/AI-Script-Generator](https://github.com/RogelioRichmanAstronaut/AI-Script-Generator)
+
+## 🔮 Roadmap
+
+- [ ] Support for multiple output formats (PDF, PPTX)
+- [ ] Interactive elements generation
+- [ ] Custom templating system
+- [ ] Copy-to-clipboard button for generated content
+- [x] Multilingual capabilities
+  - [x] Content generation in any language via the guiding prompt
+  - [x] UI language support
+    - [x] English
+    - [x] Spanish
+    - [ ] French
+    - [ ] German
+- [ ] Integration with LMS platforms
+- [x] Timestamp toggle - ability to show/hide time markers in the output text
+
+---
+
+<p align="center">Made with ❤️ for educators, students, and communicators everywhere</p>
data/sample2.pdf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5bf0997942205ed54293dd5b2a480b6d5efcc7d4146548dd68c20a9d7e3f7318
+size 155966
requirements.txt ADDED
@@ -0,0 +1,9 @@
+gradio>=4.0.0
+transformers>=4.30.0
+torch>=2.0.0
+pypdf2>=3.0.0
+python-dotenv>=0.19.0
+numpy>=1.21.0
+tqdm>=4.65.0
+openai>=1.0.0
+tiktoken>=0.5.0
setup.py ADDED
@@ -0,0 +1,13 @@
+from setuptools import setup, find_packages
+
+setup(
+    name="transcript_transformer",
+    version="0.1.0",
+    packages=find_packages(),
+    install_requires=[
+        line.strip()
+        for line in open("requirements.txt")
+        if line.strip() and not line.startswith("#")
+    ],
+    python_requires=">=3.8",
+)
src/__init__.py ADDED
File without changes
src/app.py ADDED
@@ -0,0 +1,383 @@
1
+ import os
2
+ import gradio as gr
3
+ import re
4
+ from dotenv import load_dotenv
5
+ from src.core.transformer import TranscriptTransformer
6
+ from src.utils.pdf_processor import PDFProcessor
7
+ from src.utils.text_processor import TextProcessor
8
+
9
+ load_dotenv()
10
+
11
+ # Translations dictionary for UI elements
12
+ TRANSLATIONS = {
13
+ "en": {
14
+ "title": "AI Script Generator",
15
+ "subtitle": "Transform transcripts and PDFs into timed, structured teaching scripts using AI",
16
+ "input_type_label": "Input Type",
17
+ "input_type_options": ["PDF", "Raw Text"],
18
+ "upload_pdf_label": "Upload Transcript (PDF)",
19
+ "paste_text_label": "Paste Transcript Text",
20
+ "paste_text_placeholder": "Paste your transcript text here...",
21
+ "guiding_prompt_label": "Guiding Prompt (Optional)",
22
+ "guiding_prompt_placeholder": "Additional instructions to customize the output. Examples: 'Use a more informal tone', 'Focus only on section X', 'Generate the content in Spanish', 'Include more practical programming examples', etc.",
23
+ "guiding_prompt_info": "The Guiding Prompt allows you to provide specific instructions to modify the generated content, like output/desired LANGUAGE. You can use it to change the tone, style, focus ONLY on specific sections of the text, specify the output language (e.g., 'Generate in Spanish/French/German'), or give any other instruction that helps personalize the final result.",
24
+ "duration_label": "Target Lecture Duration (minutes)",
25
+ "examples_label": "Include Practical Examples",
26
+ "thinking_model_label": "Use Experimental Thinking Model (Gemini Only)",
27
+ "submit_button": "Transform Transcript",
28
+ "output_label": "Generated Teaching Transcript",
29
+ "error_no_pdf": "Error: No PDF file uploaded",
30
+ "error_no_text": "Error: No text provided",
31
+ "error_prefix": "Error processing transcript: ",
32
+ "language_selector": "Language / Idioma",
33
+ "show_timestamps": "Show Timestamps",
34
+ "hide_timestamps": "Hide Timestamps"
35
+ },
36
+ "es": {
37
+ "title": "Generador de Guiones IA",
38
+ "subtitle": "Transforma transcripciones y PDFs en guiones de enseñanza estructurados y cronometrados usando IA",
39
+ "input_type_label": "Tipo de Entrada",
40
+ "input_type_options": ["PDF", "Texto"],
41
+ "upload_pdf_label": "Subir Transcripción (PDF)",
42
+ "paste_text_label": "Pegar Texto de Transcripción",
43
+ "paste_text_placeholder": "Pega tu texto de transcripción aquí...",
44
+ "guiding_prompt_label": "Instrucciones Guía (Opcional)",
45
+ "guiding_prompt_placeholder": "Instrucciones adicionales para personalizar el resultado. Ejemplos: 'Usa un tono más informal', 'Enfócate solo en la sección X', 'Genera el contenido en inglés', 'Incluye más ejemplos prácticos de programación', etc.",
46
+ "guiding_prompt_info": "Las Instrucciones Guía te permiten proporcionar indicaciones específicas para modificar el contenido generado, como el IDIOMA deseado. Puedes usarlas para cambiar el tono, estilo, enfocarte SOLO en secciones específicas del texto, especificar el idioma de salida (ej., 'Generar en inglés/francés/alemán'), o dar cualquier otra instrucción que ayude a personalizar el resultado final.",
47
+ "duration_label": "Duración Objetivo de la Clase (minutos)",
48
+ "examples_label": "Incluir Ejemplos Prácticos",
49
+ "thinking_model_label": "Usar Modelo de Pensamiento Experimental (Solo Gemini)",
50
+ "submit_button": "Transformar Transcripción",
51
+ "output_label": "Guión de Enseñanza Generado",
52
+ "error_no_pdf": "Error: No se ha subido ningún archivo PDF",
53
+ "error_no_text": "Error: No se ha proporcionado texto",
54
+ "error_prefix": "Error al procesar la transcripción: ",
55
+ "language_selector": "Language / Idioma",
56
+ "show_timestamps": "Mostrar Marcas de Tiempo",
57
+ "hide_timestamps": "Ocultar Marcas de Tiempo"
58
+ }
59
+ }
60
+
61
+ # Language-specific prompt suffixes to append automatically
62
+ LANGUAGE_PROMPTS = {
63
+ "en": "", # Default language doesn't need special instructions
64
+ "es": "Generate the content in Spanish. Genera todo el contenido en español."
65
+ }
66
+
67
+ class TranscriptTransformerApp:
68
+ def __init__(self):
69
+ self.pdf_processor = PDFProcessor()
70
+ self.text_processor = TextProcessor()
71
+ self.current_language = "en" # Default language
72
+ self.last_generated_content = "" # Store the last generated content
73
+ self.content_with_timestamps = "" # Store content with timestamps
74
+ self.content_without_timestamps = "" # Store content without timestamps
75
+
76
+ def process_transcript(self,
77
+ language: str,
78
+ input_type: str,
79
+ file_obj: gr.File = None,
80
+ raw_text_input: str = "",
81
+ initial_prompt: str = "",
82
+ target_duration: int = 30,
83
+ include_examples: bool = True,
84
+ use_gemini: bool = True,
85
+ use_thinking_model: bool = False) -> str:
86
+ """
87
+ Process uploaded transcript and transform it into a teaching transcript
88
+
89
+ Args:
90
+ language: Selected UI language
91
+ input_type: Type of input (PDF or Raw Text)
92
+ file_obj: Uploaded PDF file (if input_type is PDF)
93
+ raw_text_input: Raw text input (if input_type is Raw Text)
94
+ initial_prompt: Additional guiding instructions for the content generation
95
+ target_duration: Target lecture duration in minutes
96
+ include_examples: Whether to include practical examples
97
+ use_gemini: Whether to use Gemini API instead of OpenAI
98
+ use_thinking_model: Requires use_gemini=True
99
+
100
+ Returns:
101
+ str: Generated teaching transcript
102
+ """
103
+ try:
104
+ # Force enable Gemini if thinking model is selected
105
+ if use_thinking_model:
106
+ use_gemini = True
107
+
108
+ self.transformer = TranscriptTransformer(
109
+ use_gemini=use_gemini,
110
+ use_thinking_model=use_thinking_model
111
+ )
112
+
113
+ # Get text based on input type
114
+ if input_type == TRANSLATIONS[language]["input_type_options"][0]: # PDF
115
+ if file_obj is None:
116
+ return TRANSLATIONS[language]["error_no_pdf"]
117
+ raw_text = self.pdf_processor.extract_text(file_obj.name)
118
+ else: # Raw Text
119
+ if not raw_text_input.strip():
120
+ return TRANSLATIONS[language]["error_no_text"]
121
+ raw_text = raw_text_input
122
+
123
+ # Modify initial prompt based on language if no explicit language instruction is given
124
+ modified_prompt = initial_prompt
125
+
126
+ # Check if user has specified a language in the prompt
127
+ language_keywords = ["spanish", "español", "english", "inglés", "french", "francés", "german", "alemán"]
128
+ user_specified_language = any(keyword in initial_prompt.lower() for keyword in language_keywords)
129
+
130
+ # Only append language instruction if user hasn't specified one and we have a non-default language
131
+ if not user_specified_language and language in LANGUAGE_PROMPTS and LANGUAGE_PROMPTS[language]:
132
+ if modified_prompt:
133
+ modified_prompt += " " + LANGUAGE_PROMPTS[language]
134
+ else:
135
+ modified_prompt = LANGUAGE_PROMPTS[language]
136
+
137
+ # Transform to teaching transcript with user guidance
138
+ lecture_transcript = self.transformer.transform_to_lecture(
139
+ text=raw_text,
140
+ target_duration=target_duration,
141
+ include_examples=include_examples,
142
+ initial_prompt=modified_prompt
143
+ )
144
+
145
+ # Store the generated content
146
+ self.content_with_timestamps = lecture_transcript
147
+
148
+ # Create a version without timestamps
149
+ self.content_without_timestamps = self.remove_timestamps(lecture_transcript)
150
+
151
+ # Default: show content with timestamps
152
+ self.last_generated_content = lecture_transcript
153
+
154
+ return lecture_transcript
155
+
156
+ except Exception as e:
157
+ return f"{TRANSLATIONS[language]['error_prefix']}{str(e)}"
158
+
159
+ def remove_timestamps(self, text):
160
+ """Remove all timestamps (e.g., [00:00]) from the text"""
161
+ # Regex to match the timestamp pattern [MM:SS] or [HH:MM:SS]
162
+ return re.sub(r'\[\d{1,2}:\d{2}(:\d{2})?\]', '', text)
163
+
164
+ def toggle_timestamps(self, show_timestamps):
165
+ """Toggle visibility of timestamps in output"""
166
+ if show_timestamps:
167
+ return self.content_with_timestamps
168
+ else:
169
+ return self.content_without_timestamps
170
+
171
+ def update_ui_language(self, language):
172
+ """Update UI elements based on selected language"""
173
+ self.current_language = language
174
+
175
+ translations = TRANSLATIONS[language]
176
+
177
+ return [
178
+ translations["title"],
179
+ translations["subtitle"],
180
+ translations["input_type_label"],
181
+ gr.update(choices=translations["input_type_options"], value=translations["input_type_options"][0]),
182
+ translations["upload_pdf_label"],
183
+ translations["paste_text_label"],
184
+ translations["paste_text_placeholder"],
185
+ translations["guiding_prompt_label"],
186
+ translations["guiding_prompt_placeholder"],
187
+ translations["guiding_prompt_info"],
188
+ translations["duration_label"],
189
+ translations["examples_label"],
190
+ translations["thinking_model_label"],
191
+ translations["submit_button"],
192
+ translations["output_label"]
193
+ ]
194
+
195
+ def launch(self):
196
+ """Launch the Gradio interface"""
197
+ # Get the path to the example PDF
198
+ example_pdf = os.path.join(os.path.dirname(os.path.dirname(__file__)), "data", "sample2.pdf")
199
+
200
+ with gr.Blocks(title=TRANSLATIONS["en"]["title"]) as interface:
201
+ # Header with title and language selector side by side
202
+ with gr.Row():
203
+ with gr.Column(scale=4):
204
+ title_md = gr.Markdown("# " + TRANSLATIONS["en"]["title"])
205
+ with gr.Column(scale=1):
206
+ language_selector = gr.Dropdown(
207
+ choices=["🇺🇸 English", "🇪🇸 Español"],
208
+ value="🇺🇸 English",
209
+ label=TRANSLATIONS["en"]["language_selector"],
210
+ elem_id="language-selector",
211
+ interactive=True
212
+ )
213
+
214
+ # Subtitle
215
+ subtitle_md = gr.Markdown(TRANSLATIONS["en"]["subtitle"])
216
+
217
+ # Input type row
218
+ with gr.Row():
219
+ input_type = gr.Radio(
220
+ choices=TRANSLATIONS["en"]["input_type_options"],
221
+ label=TRANSLATIONS["en"]["input_type_label"],
222
+ value=TRANSLATIONS["en"]["input_type_options"][0]
223
+ )
224
+
225
+ # File/text input columns
226
+ with gr.Row():
227
+ with gr.Column(visible=True) as pdf_column:
228
+ file_input = gr.File(
229
+ label=TRANSLATIONS["en"]["upload_pdf_label"],
230
+ file_types=[".pdf"]
231
+ )
232
+
233
+ with gr.Column(visible=False) as text_column:
234
+ text_input = gr.Textbox(
235
+ label=TRANSLATIONS["en"]["paste_text_label"],
236
+ lines=10,
237
+ placeholder=TRANSLATIONS["en"]["paste_text_placeholder"]
238
+ )
239
+
240
+ # Guiding prompt
241
+ with gr.Row():
242
+ initial_prompt = gr.Textbox(
243
+ label=TRANSLATIONS["en"]["guiding_prompt_label"],
244
+ lines=3,
245
+ value="",
246
+ placeholder=TRANSLATIONS["en"]["guiding_prompt_placeholder"],
247
+ info=TRANSLATIONS["en"]["guiding_prompt_info"]
248
+ )
249
+
250
+ # Settings row
251
+ with gr.Row():
252
+ target_duration = gr.Number(
253
+ label=TRANSLATIONS["en"]["duration_label"],
254
+ value=30,
255
+ minimum=2,
256
+ maximum=60,
257
+ step=1
258
+ )
259
+
260
+ include_examples = gr.Checkbox(
261
+ label=TRANSLATIONS["en"]["examples_label"],
262
+ value=True
263
+ )
264
+
265
+ use_thinking_model = gr.Checkbox(
266
+ label=TRANSLATIONS["en"]["thinking_model_label"],
267
+ value=True
268
+ )
269
+
270
+ # Submit button
271
+ with gr.Row():
272
+ submit_btn = gr.Button(TRANSLATIONS["en"]["submit_button"])
273
+
274
+ # Output area
275
+ output = gr.Textbox(
276
+ label=TRANSLATIONS["en"]["output_label"],
277
+ lines=25
278
+ )
279
+
280
+ # Toggle timestamps button and Copy button
281
+ with gr.Row():
282
+ timestamps_checkbox = gr.Checkbox(
283
+ label=TRANSLATIONS["en"]["show_timestamps"],
284
+ value=True,
285
+ interactive=True
286
+ )
287
+
288
+ # Map language dropdown values to language codes
289
+ lang_map = {
290
+ "🇺🇸 English": "en",
291
+ "🇪🇸 Español": "es"
292
+ }
293
+
294
+ # Handle visibility of input columns based on selection
295
+ def update_input_visibility(language_display, choice):
296
+ language = lang_map.get(language_display, "en")
297
+ return [
298
+ gr.update(visible=(choice == TRANSLATIONS[language]["input_type_options"][0])), # pdf_column
299
+ gr.update(visible=(choice == TRANSLATIONS[language]["input_type_options"][1])) # text_column
300
+ ]
301
+
302
+ # Get language code from display value
303
+ def get_language_code(language_display):
304
+ return lang_map.get(language_display, "en")
305
+
306
+ # Update UI elements when language changes
307
+ def update_ui_with_display(language_display):
308
+ language = get_language_code(language_display)
309
+ self.current_language = language
310
+
311
+ translations = TRANSLATIONS[language]
312
+
313
+ return [
314
+ "# " + translations["title"], # Title with markdown formatting
315
+ translations["subtitle"],
316
+ translations["input_type_label"],
317
+ gr.update(choices=translations["input_type_options"], value=translations["input_type_options"][0], label=translations["input_type_label"]),
318
+ gr.update(label=translations["upload_pdf_label"]),
319
+ gr.update(label=translations["paste_text_label"], placeholder=translations["paste_text_placeholder"]),
320
+ gr.update(label=translations["guiding_prompt_label"], placeholder=translations["guiding_prompt_placeholder"], info=translations["guiding_prompt_info"]),
321
+ gr.update(label=translations["duration_label"]),
322
+ gr.update(label=translations["examples_label"]),
323
+ gr.update(label=translations["thinking_model_label"]),
324
+ translations["submit_button"],
325
+ gr.update(label=translations["output_label"]),
326
+ gr.update(label=translations["show_timestamps"])
327
+ ]
328
+
329
+ input_type.change(
330
+ fn=update_input_visibility,
331
+ inputs=[language_selector, input_type],
332
+ outputs=[pdf_column, text_column]
333
+ )
334
+
335
+ # Language change event
336
+ language_selector.change(
337
+ fn=update_ui_with_display,
338
+ inputs=language_selector,
339
+ outputs=[
340
+ title_md, subtitle_md,
341
+ input_type, input_type,
342
+ file_input, text_input,
343
+ initial_prompt,
344
+ target_duration, include_examples, use_thinking_model,
345
+ submit_btn, output,
346
+ timestamps_checkbox
347
+ ]
348
+ )
349
+
350
+ # Toggle timestamps event
351
+ timestamps_checkbox.change(
352
+ fn=self.toggle_timestamps,
353
+ inputs=[timestamps_checkbox],
354
+ outputs=[output]
355
+ )
356
+
357
+ # Set up submission logic with language code conversion
358
+ submit_btn.click(
359
+ fn=lambda lang_display, *args: self.process_transcript(get_language_code(lang_display), *args),
360
+ inputs=[
361
+ language_selector,
362
+ input_type,
363
+ file_input,
364
+ text_input,
365
+ initial_prompt,
366
+ target_duration,
367
+ include_examples,
368
+ use_thinking_model
369
+ ],
370
+ outputs=output
371
+ )
372
+
373
+ # Example for PDF input
374
+ gr.Examples(
375
+ examples=[[example_pdf, "", "", 30, True, True]],
376
+ inputs=[file_input, text_input, initial_prompt, target_duration, include_examples, use_thinking_model]
377
+ )
378
+
379
+ interface.launch(share=True)
380
+
381
+ if __name__ == "__main__":
382
+ app = TranscriptTransformerApp()
383
+ app.launch()
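The display-name-to-language-code mapping wired into the UI above can be exercised in isolation. A minimal sketch — the `TRANSLATIONS` entries here are trimmed-down illustrative stand-ins, not the app's full table:

```python
# Hypothetical, trimmed-down stand-ins for the app's TRANSLATIONS and lang_map.
TRANSLATIONS = {
    "en": {"submit_button": "Generate"},
    "es": {"submit_button": "Generar"},
}
LANG_MAP = {"🇺🇸 English": "en", "🇪🇸 Español": "es"}

def get_language_code(language_display: str) -> str:
    # Unknown display values fall back to English, mirroring the app.
    return LANG_MAP.get(language_display, "en")

def submit_label(language_display: str) -> str:
    # Look up a translated UI label for the selected language.
    return TRANSLATIONS[get_language_code(language_display)]["submit_button"]
```

The same lookup pattern backs every `gr.update(...)` call in `update_ui_with_display`: resolve the language code once, then index into the translation table.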
src/core/__init__.py ADDED
File without changes
src/core/transformer.py ADDED
@@ -0,0 +1,698 @@
1
+ import os
2
+ import logging
3
+ import json
4
+ import time
5
+ from typing import List, Dict, Optional, Callable, Any
6
+ import openai
7
+ from src.utils.text_processor import TextProcessor
8
+
9
+ # Configure logging
10
+ logging.basicConfig(level=logging.INFO)
11
+ logger = logging.getLogger(__name__)
12
+
13
+ class WordCountError(Exception):
14
+ """Raised when word count requirements are not met"""
15
+ pass
16
+
17
+ class TranscriptTransformer:
18
+ """Transforms conversational transcripts into teaching material using LLM"""
19
+
20
+ MAX_RETRIES = 3 # Initial retries for content generation
21
+ EXTENDED_RETRIES = 3 # Additional retries with longer waits
22
+ EXTENDED_RETRY_DELAYS = [5, 10, 15] # Wait times in seconds for extended retries
23
+ CHUNK_SIZE = 6000 # Target words per chunk
24
+ LARGE_DEVIATION_THRESHOLD = 0.20 # 20% maximum deviation
25
+ MAX_TOKENS = 64000 # Absolute cap based on the 64k output-token limit
26
+
27
+ def __init__(self, use_gemini: bool = True, use_thinking_model: bool = False):
28
+ """Initialize the transformer with selected LLM client"""
29
+ self.text_processor = TextProcessor()
30
+ self.use_gemini = use_gemini
31
+ self.use_thinking_model = use_thinking_model
32
+
33
+ if use_thinking_model:
34
+ if not use_gemini:
35
+ raise ValueError("Thinking model requires use_gemini=True")
36
+
37
+ logger.info("Initializing with Gemini Flash Thinking API")
38
+ self.openai_client = openai.OpenAI(
39
+ api_key=os.getenv('GEMINI_API_KEY'),
40
+ base_url="https://generativelanguage.googleapis.com/v1alpha"
41
+ )
42
+ self.model_name = "gemini-2.0-flash-thinking-exp-01-21"
43
+ elif use_gemini:
44
+ logger.info("Initializing with Gemini API")
45
+ self.openai_client = openai.OpenAI(
46
+ api_key=os.getenv('GEMINI_API_KEY'),
47
+ base_url="https://generativelanguage.googleapis.com/v1beta"
48
+ )
49
+ self.model_name = "gemini-2.0-flash-exp"
50
+ else:
51
+ logger.info("Initializing with OpenAI API")
52
+ self.openai_client = openai.OpenAI(
53
+ api_key=os.getenv('OPENAI_API_KEY')
54
+ )
55
+ self.model_name = "gpt-3.5-turbo"
56
+
57
+ # Target word counts
58
+ self.words_per_minute = 130 # Average speaking rate
59
+
60
+ def _api_call_with_enhanced_retries(self, call_func: Callable[[], Any]) -> Any:
61
+ """
62
+ Wrapper function for API calls with enhanced retry logic
63
+
64
+ Args:
65
+ call_func: Function that makes the actual API call
66
+
67
+ Returns:
68
+ The result of the successful API call
69
+
70
+ Raises:
71
+ Exception: If all retries fail
72
+ """
73
+ # Initial retries (already handled by openai client)
74
+ try:
75
+ return call_func()
76
+ except Exception as e:
77
+ error_str = str(e)
78
+
79
+ # Check if it's a quota error (429)
80
+ if "429" in error_str or "Too Many Requests" in error_str or "RESOURCE_EXHAUSTED" in error_str:
81
+ logger.warning(f"Quota error detected: {error_str}")
82
+ logger.info("Starting extended retries with longer waits...")
83
+
84
+ # Extended retries with longer waits
85
+ for i in range(self.EXTENDED_RETRIES):
86
+ wait_time = self.EXTENDED_RETRY_DELAYS[i]
87
+ logger.info(f"Extended retry {i+1}/{self.EXTENDED_RETRIES}: Waiting {wait_time} seconds before retry")
88
+ time.sleep(wait_time)
89
+
90
+ try:
91
+ return call_func()
92
+ except Exception as retry_error:
93
+ # If last retry, re-raise
94
+ if i == self.EXTENDED_RETRIES - 1:
95
+ logger.error(f"All extended retries failed: {str(retry_error)}")
96
+ raise
97
+ # Otherwise log and continue to next retry
98
+ logger.warning(f"Extended retry {i+1} failed: {str(retry_error)}")
99
+ else:
100
+ # Not a quota error, re-raise
101
+ raise
102
+
103
+ def _validate_word_count(self, total_words: int, target_words: int, min_words: int, max_words: int) -> None:
104
+ """Validate word count with flexible thresholds and log warnings/errors"""
105
+ deviation = abs(total_words - target_words) / target_words
106
+
107
+ if deviation > self.LARGE_DEVIATION_THRESHOLD:
108
+ logger.error(
109
+ f"Word count {total_words} significantly outside target range "
110
+ f"({min_words}-{max_words}). Deviation: {deviation:.2%}"
111
+ )
112
+ elif total_words < min_words or total_words > max_words:
113
+ logger.warning(
114
+ f"Word count {total_words} slightly outside target range "
115
+ f"({min_words}-{max_words}). Deviation: {deviation:.2%}"
116
+ )
117
+
118
+ def transform_to_lecture(self,
119
+ text: str,
120
+ target_duration: int = 30,
121
+ include_examples: bool = True,
122
+ initial_prompt: Optional[str] = None) -> str:
123
+ """
124
+ Transform input text into a structured teaching transcript
125
+
126
+ Args:
127
+ text: Input transcript text
128
+ target_duration: Target lecture duration in minutes
129
+ include_examples: Whether to include practical examples
130
+ initial_prompt: Additional user instructions to guide the generation
131
+
132
+ Returns:
133
+ str: Generated teaching transcript, regardless of word count validation
134
+ """
135
+ logger.info(f"Starting transformation for {target_duration} minute lecture")
136
+
137
+ # Clean and preprocess text
138
+ cleaned_text = self.text_processor.clean_text(text)
139
+ input_words = self.text_processor.count_words(cleaned_text)
140
+ logger.info(f"Input text cleaned. Word count: {input_words}")
141
+
142
+ # Calculate target word count
143
+ target_words = self.words_per_minute * target_duration
144
+ min_words = int(target_words * 0.95) # Minimum 95% of target
145
+ max_words = int(target_words * 1.05) # Maximum 105% of target
146
+
147
+ logger.info(f"Target word count: {target_words} (min: {min_words}, max: {max_words})")
148
+
149
+ # Generate detailed lecture structure with topics
150
+ structure_data = self._generate_detailed_structure(
151
+ text=cleaned_text,
152
+ target_duration=target_duration,
153
+ initial_prompt=initial_prompt
154
+ )
155
+ logger.info("Detailed lecture structure generated")
156
+ logger.info(f"Topics identified: {[t['title'] for t in structure_data['topics']]}")
157
+
158
+ # Calculate section word counts
159
+ section_words = {
160
+ 'intro': int(target_words * 0.1),
161
+ 'main': int(target_words * 0.7),
162
+ 'practical': int(target_words * 0.15),
163
+ 'summary': int(target_words * 0.05)
164
+ }
165
+
166
+ try:
167
+ logger.info("Generating content by sections with topic tracking")
168
+
169
+ # Introduction with learning objectives and topic preview
170
+ intro = self._generate_section(
171
+ 'introduction',
172
+ structure_data,
173
+ cleaned_text,
174
+ section_words['intro'],
175
+ include_examples,
176
+ is_first=True,
177
+ initial_prompt=initial_prompt
178
+ )
179
+ intro_words = self.text_processor.count_words(intro)
180
+ logger.info(f"Introduction generated: {intro_words} words")
181
+
182
+ # Track context for coherence
183
+ context = {
184
+ 'current_section': 'introduction',
185
+ 'covered_topics': [],
186
+ 'pending_topics': [t['title'] for t in structure_data['topics']],
187
+ 'key_terms': set(),
188
+ 'current_narrative': intro[-1000:], # Last 1000 characters for context
189
+ 'learning_objectives': structure_data['learning_objectives']
190
+ }
191
+
192
+ # Main content with topic progression
193
+ main_content = self._generate_main_content(
194
+ structure_data,
195
+ cleaned_text,
196
+ section_words['main'],
197
+ include_examples,
198
+ context,
199
+ initial_prompt=initial_prompt
200
+ )
201
+ main_words = self.text_processor.count_words(main_content)
202
+ logger.info(f"Main content generated: {main_words} words")
203
+
204
+ # Update context after main content
205
+ context['current_section'] = 'main'
206
+ context['current_narrative'] = main_content[-1000:]
207
+
208
+ # Practical applications tied to main topics
209
+ practical = self._generate_section(
210
+ 'practical',
211
+ structure_data,
212
+ cleaned_text,
213
+ section_words['practical'],
214
+ include_examples,
215
+ context=context,
216
+ initial_prompt=initial_prompt
217
+ )
218
+ practical_words = self.text_processor.count_words(practical)
219
+ logger.info(f"Practical section generated: {practical_words} words")
220
+
221
+ # Update context for summary
222
+ context['current_section'] = 'practical'
223
+ context['current_narrative'] = practical[-500:]
224
+
225
+ # Summary with topic reinforcement
226
+ summary = self._generate_section(
227
+ 'summary',
228
+ structure_data,
229
+ cleaned_text,
230
+ section_words['summary'],
231
+ include_examples,
232
+ is_last=True,
233
+ context=context,
234
+ initial_prompt=initial_prompt
235
+ )
236
+ summary_words = self.text_processor.count_words(summary)
237
+ logger.info(f"Summary generated: {summary_words} words")
238
+
239
+ # Combine all sections
240
+ full_content = f"{intro}\n\n{main_content}\n\n{practical}\n\n{summary}"
241
+ total_words = self.text_processor.count_words(full_content)
242
+ logger.info(f"Total content generated: {total_words} words")
243
+
244
+ # Log warnings/errors but don't raise exceptions
245
+ self._validate_word_count(total_words, target_words, min_words, max_words)
246
+
247
+ # Validate coherence
248
+ self._validate_coherence(full_content, structure_data)
249
+ logger.info("Content coherence validated")
250
+
251
+ return full_content
252
+
253
+ except Exception as e:
254
+ logger.error(f"Error during content generation: {str(e)}")
255
+ # If we have partial content, return it
256
+ if 'full_content' in locals():
257
+ logger.warning("Returning partial content despite errors")
258
+ return full_content
259
+ raise # Re-raise only if we have no content at all
260
+
261
+ def _generate_detailed_structure(self,
262
+ text: str,
263
+ target_duration: int,
264
+ initial_prompt: Optional[str] = None) -> Dict:
265
+ """Generate detailed lecture structure with topics and objectives"""
266
+ logger.info("Generating detailed lecture structure")
267
+
268
+ user_instructions = f"\nAdditional user instructions:\n{initial_prompt}\n" if initial_prompt else ""
269
+
270
+ prompt = f"""
271
+ You are an expert educator creating a detailed lecture outline.
272
+ {user_instructions}
273
+ Analyze this transcript and create a structured JSON output with the following:
274
+
275
+ 1. Title of the lecture
276
+ 2. 3-5 clear learning objectives
277
+ 3. 3-4 main topics, each with:
278
+ - Title
279
+ - Key concepts
280
+ - Subtopics
281
+ - Time allocation (in minutes)
282
+ - Connection to learning objectives
283
+ 4. Practical application ideas
284
+ 5. Key terms to track
285
+
286
+ IMPORTANT: Response MUST be valid JSON. Format exactly like this, with no additional text:
287
+ {{
288
+ "title": "string",
289
+ "learning_objectives": ["string"],
290
+ "topics": [
291
+ {{
292
+ "title": "string",
293
+ "key_concepts": ["string"],
294
+ "subtopics": ["string"],
295
+ "duration_minutes": number,
296
+ "objective_links": [number]
297
+ }}
298
+ ],
299
+ "practical_applications": ["string"],
300
+ "key_terms": ["string"]
301
+ }}
302
+
303
+ Target duration: {target_duration} minutes
304
+
305
+ Transcript excerpt:
306
+ {text[:2000]}
307
+ """
308
+
309
+ try:
310
+ # Common parameters
311
+ params = {
312
+ "model": self.model_name,
313
+ "messages": [
314
+ {"role": "system", "content": "You are an expert educator. Output ONLY valid JSON, no other text."},
315
+ {"role": "user", "content": prompt}
316
+ ],
317
+ "temperature": 0.7,
318
+ "max_tokens": self.MAX_TOKENS if self.use_thinking_model else 4000
319
+ }
320
+
321
+ # Add thinking config if using experimental model
322
+ if self.use_thinking_model:
323
+ params["extra_body"] = {
324
+ "thinking_config": {
325
+ "include_thoughts": True
326
+ }
327
+ }
328
+
329
+ # Use the enhanced retry wrapper for API call
330
+ def api_call():
331
+ return self.openai_client.chat.completions.create(**params)
332
+
333
+ response = self._api_call_with_enhanced_retries(api_call)
334
+ content = response.choices[0].message.content.strip()
335
+ logger.debug(f"Raw structure response: {content}")
336
+
337
+ try:
338
+ structure_data = json.loads(content)
339
+ logger.info("Structure data parsed successfully")
340
+ return structure_data
341
+ except json.JSONDecodeError as e:
342
+ logger.warning(f"Failed to parse JSON directly: {str(e)}")
343
+
344
+ # Try to extract JSON if it's wrapped in other text
345
+ import re
346
+ json_match = re.search(r'({[\s\S]*})', content)
347
+ if json_match:
348
+ try:
349
+ structure_data = json.loads(json_match.group(1))
350
+ logger.info("Structure data extracted and parsed successfully")
351
+ return structure_data
352
+ except json.JSONDecodeError:
353
+ logger.warning("Failed to parse extracted JSON")
354
+
355
+ # If both attempts fail, use fallback structure
356
+ logger.warning("Using fallback structure")
357
+ return self._generate_fallback_structure(text, target_duration)
358
+
359
+ except Exception as e:
360
+ logger.error(f"Error generating structure: {str(e)}")
361
+ # Fallback in case of any error
362
+ return self._generate_fallback_structure(text, target_duration)
363
+
364
+ def _generate_fallback_structure(self, text: str, target_duration: int) -> Dict:
365
+ """Generate a simplified fallback structure in case of parsing failures"""
366
+ logger.info("Generating fallback structure")
367
+
368
+ params = {
369
+ "model": self.model_name,
370
+ "messages": [
371
+ {"role": "system", "content": "You are an expert educator. Output ONLY valid JSON, no other text."},
372
+ {"role": "user", "content": f"""
373
+ Create a simplified lecture outline based on this transcript.
374
+ Format as JSON with:
375
+ - title
376
+ - 3 learning objectives
377
+ - 2 main topics with title, key concepts, subtopics
378
+ - 2 practical applications
379
+ - 3 key terms
380
+
381
+ Target duration: {target_duration} minutes
382
+
383
+ Transcript excerpt:
384
+ {text[:2000]}
385
+ """}
386
+ ],
387
+ "temperature": 0.5,
388
+ "max_tokens": 2000
389
+ }
390
+
391
+ try:
392
+ # Use the enhanced retry wrapper for API call
393
+ def api_call():
394
+ return self.openai_client.chat.completions.create(**params)
395
+
396
+ response = self._api_call_with_enhanced_retries(api_call)
397
+ content = response.choices[0].message.content.strip()
398
+
399
+ try:
400
+ return json.loads(content)
401
+ except json.JSONDecodeError:
402
+ # Last resort fallback if everything fails
403
+ return {
404
+ "title": "Lecture on Transcript Topic",
405
+ "learning_objectives": ["Understand key concepts", "Apply knowledge", "Evaluate outcomes"],
406
+ "topics": [
407
+ {
408
+ "title": "Main Topic 1",
409
+ "key_concepts": ["Concept 1", "Concept 2"],
410
+ "subtopics": ["Subtopic 1", "Subtopic 2"],
411
+ "duration_minutes": target_duration // 2,
412
+ "objective_links": [1, 2]
413
+ },
414
+ {
415
+ "title": "Main Topic 2",
416
+ "key_concepts": ["Concept 3", "Concept 4"],
417
+ "subtopics": ["Subtopic 3", "Subtopic 4"],
418
+ "duration_minutes": target_duration // 2,
419
+ "objective_links": [2, 3]
420
+ }
421
+ ],
422
+ "practical_applications": ["Application 1", "Application 2"],
423
+ "key_terms": ["Term 1", "Term 2", "Term 3"]
424
+ }
425
+ except Exception as e:
426
+ logger.error(f"Error generating fallback structure: {str(e)}")
427
+ # Hardcoded last resort fallback
428
+ return {
429
+ "title": "Lecture on Transcript Topic",
430
+ "learning_objectives": ["Understand key concepts", "Apply knowledge", "Evaluate outcomes"],
431
+ "topics": [
432
+ {
433
+ "title": "Main Topic 1",
434
+ "key_concepts": ["Concept 1", "Concept 2"],
435
+ "subtopics": ["Subtopic 1", "Subtopic 2"],
436
+ "duration_minutes": target_duration // 2,
437
+ "objective_links": [1, 2]
438
+ },
439
+ {
440
+ "title": "Main Topic 2",
441
+ "key_concepts": ["Concept 3", "Concept 4"],
442
+ "subtopics": ["Subtopic 3", "Subtopic 4"],
443
+ "duration_minutes": target_duration // 2,
444
+ "objective_links": [2, 3]
445
+ }
446
+ ],
447
+ "practical_applications": ["Application 1", "Application 2"],
448
+ "key_terms": ["Term 1", "Term 2", "Term 3"]
449
+ }
450
+
451
+ def _generate_section(self,
452
+ section_type: str,
453
+ structure_data: Dict,
454
+ original_text: str,
455
+ target_words: int,
456
+ include_examples: bool,
457
+ context: Dict = None,
458
+ is_first: bool = False,
459
+ is_last: bool = False,
460
+ initial_prompt: Optional[str] = None) -> str:
461
+ """Generate a specific section of the lecture"""
462
+ logger.info(f"Generating {section_type} section (target: {target_words} words)")
463
+
464
+ # Calculate timing markers
465
+ if section_type == 'introduction':
466
+ time_marker = '[00:00]'
467
+ elif section_type == 'summary':
468
+ duration_mins = sum(topic.get('duration_minutes', 5) for topic in structure_data['topics'])
469
+ # Ensure the marker minute value is an integer and never below 5
470
+ adjusted_mins = max(5, int(duration_mins - 5))
471
+ time_marker = f'[{adjusted_mins:02d}:00]'
472
+ else:
473
+ # For other sections, use appropriate time markers
474
+ time_marker = '[XX:XX]' # Will be replaced within the prompt
475
+
476
+ user_instructions = f"\nAdditional user instructions:\n{initial_prompt}\n" if initial_prompt else ""
477
+
478
+ # Base prompt with context-specific formatting
479
+ prompt = f"""
480
+ You are creating a {section_type} section for a {time_marker} teaching lecture on "{structure_data['title']}".
481
+ {user_instructions}
482
+ Target word count: {target_words} words (very important)
483
+
484
+ Learning objectives:
485
+ {', '.join(structure_data['learning_objectives'])}
486
+
487
+ Key terms:
488
+ {', '.join(structure_data['key_terms'])}
489
+
490
+ Original source:
491
+ {original_text[:500]}...
492
+ """
493
+
494
+ # Section-specific instructions
495
+ if section_type == 'introduction':
496
+ prompt += """
497
+ - Start with an engaging hook
498
+ - Present clear learning objectives
499
+ - Preview main topics
500
+ - Set expectations for the lecture
501
+ """
502
+ elif section_type == 'main':
503
+ prompt += f"""
504
+ Discuss one main topic in depth.
505
+
506
+ Topic: {context['current_topic']['title']}
507
+ Key concepts: {', '.join(context['current_topic']['key_concepts'])}
508
+ Subtopics: {', '.join(context['current_topic']['subtopics'])}
509
+
510
+ - Start with appropriate time marker
511
+ - Explain key concepts clearly
512
+ - Include real-world examples
513
+ - Connect to learning objectives
514
+ - Use appropriate time markers within the section
515
+ """
516
+ elif section_type == 'practical':
517
+ prompt += f"""
518
+ Create a practical applications section with:
519
+
520
+ - Start with appropriate time marker
521
+ - 2-3 practical examples or case studies
522
+ - Clear connections to the main topics
523
+ - Interactive elements (questions, exercises)
524
+
525
+ Practical applications to cover:
526
+ {', '.join(structure_data['practical_applications'])}
527
+ """
528
+ elif section_type == 'summary':
529
+ prompt += """
530
+ Create a concise summary:
531
+
532
+ - Start with appropriate time marker
533
+ - Reinforce key learning points
534
+ - Brief recap of main topics
535
+ - Call to action or follow-up suggestions
536
+ """
537
+
538
+ # Context-specific content
539
+ if context:
540
+ prompt += f"""
541
+
542
+ Previously covered topics:
543
+ {', '.join(context['covered_topics'])}
544
+
545
+ Pending topics:
546
+ {', '.join(context['pending_topics'])}
547
+
548
+ Recent narrative context:
549
+ {context['current_narrative']}
550
+ """
551
+
552
+ # First/last section specific instructions
553
+ if is_first:
554
+ prompt += """
555
+
556
+ This is the FIRST section of the lecture. Make it engaging and set the tone.
557
+ """
558
+ elif is_last:
559
+ prompt += """
560
+
561
+ This is the FINAL section of the lecture. Ensure proper closure and reinforcement.
562
+ """
563
+
564
+ # Add section-specific time markers for formatted output
565
+ if section_type != 'introduction':
566
+ prompt += """
567
+
568
+ IMPORTANT: Include appropriate time markers [MM:SS] throughout the section.
569
+ """
570
+
571
+ try:
572
+ # Prepare API call parameters
573
+ params = {
574
+ "model": self.model_name,
575
+ "messages": [
576
+ {"role": "system", "content": "You are an expert educator creating a teaching script."},
577
+ {"role": "user", "content": prompt}
578
+ ],
579
+ "temperature": 0.7,
580
+ "max_tokens": self._calculate_max_tokens(section_type, target_words)
581
+ }
582
+
583
+ # Add thinking config if using experimental model
584
+ if self.use_thinking_model:
585
+ params["extra_body"] = {
586
+ "thinking_config": {
587
+ "include_thoughts": True
588
+ }
589
+ }
590
+
591
+ # Use the enhanced retry wrapper for API call
592
+ def api_call():
593
+ return self.openai_client.chat.completions.create(**params)
594
+
595
+ response = self._api_call_with_enhanced_retries(api_call)
596
+ content = response.choices[0].message.content.strip()
597
+
598
+ # Validate output length
599
+ content_words = self.text_processor.count_words(content)
600
+ logger.info(f"Section generated: {content_words} words")
601
+
602
+ return content
603
+
604
+ except Exception as e:
605
+ logger.error(f"Error during content generation: {str(e)}")
606
+ # Provide a minimal fallback content to avoid complete failure
607
+ return f"{time_marker} {section_type.capitalize()} (Error during generation)\n\nWe apologize, but there was an error generating this section."
608
+
609
+ def _calculate_max_tokens(self, section_type: str, target_words: int) -> int:
610
+ """Calculate appropriate max_tokens based on section and model"""
611
+ # 1 token ≈ 4 characters (1 word ≈ 1.33 tokens)
612
+ base_tokens = int(target_words * 1.5) # Headroom for formatting
613
+
614
+ if self.use_thinking_model:
615
+ # Allow up to 64k tokens, but cap each section
616
+ section_limits = {
617
+ 'introduction': 8000,
618
+ 'main': 32000,
619
+ 'practical': 16000,
620
+ 'summary': 8000
621
+ }
622
+ return min(base_tokens * 2, section_limits.get(section_type, 16000))
623
+
624
+ # Limits for the other models
625
+ return min(base_tokens + 1000, self.MAX_TOKENS)
626
+
627
+ def _generate_main_content(self,
628
+ structure_data: Dict,
629
+ original_text: str,
630
+ target_words: int,
631
+ include_examples: bool,
632
+ context: Dict,
633
+ initial_prompt: Optional[str] = None) -> str:
634
+ """Generate main content with topic progression"""
635
+ logger.info(f"Generating main content (target: {target_words} words)")
636
+
637
+ # Calculate words per topic based on their duration ratios
638
+ total_duration = sum(t['duration_minutes'] for t in structure_data['topics'])
639
+ # Avoid division by zero
640
+ total_duration = total_duration if total_duration > 0 else 1
641
+
642
+ topic_words = {}
643
+
644
+ for topic in structure_data['topics']:
645
+ ratio = topic['duration_minutes'] / total_duration
646
+ topic_words[topic['title']] = int(target_words * ratio)
647
+
648
+ logger.info(f"Topic word allocations: {topic_words}")
649
+
650
+ # Generate content for each topic
651
+ topic_contents = []
652
+
653
+ for topic in structure_data['topics']:
654
+ topic_target = topic_words[topic['title']]
655
+
656
+ # Update context for topic
657
+ context['current_topic'] = topic
658
+ if topic['title'] in context['pending_topics']:
659
+ context['covered_topics'].append(topic['title'])
660
+ context['pending_topics'].remove(topic['title'])
661
+ context['key_terms'].update(topic['key_concepts'])
662
+
663
+ # Generate topic content
664
+ topic_content = self._generate_section(
665
+ f"main_topic_{topic['title']}",
666
+ structure_data,
667
+ original_text,
668
+ topic_target,
669
+ include_examples,
670
+ context=context,
671
+ initial_prompt=initial_prompt
672
+ )
673
+
674
+ topic_contents.append(topic_content)
675
+ context['current_narrative'] = topic_content[-1000:]
676
+
677
+ return "\n\n".join(topic_contents)
678
+
679
+ def _validate_coherence(self, content: str, structure_data: Dict):
680
+ """Validate content coherence against structure"""
681
+ logger.info("Validating content coherence")
682
+
683
+ # Check for learning objectives
684
+ for objective in structure_data['learning_objectives']:
685
+ if not any(term.lower() in content.lower() for term in objective.split()):
686
+ logger.warning(f"Learning objective not well covered: {objective}")
687
+
688
+ # Check for key terms
689
+ for term in structure_data['key_terms']:
690
+ if content.lower().count(term.lower()) < 2:
691
+ logger.warning(f"Key term underutilized: {term}")
692
+
693
+ # Check topic coverage
694
+ for topic in structure_data['topics']:
695
+ if not any(concept.lower() in content.lower() for concept in topic['key_concepts']):
696
+ logger.warning(f"Topic concepts not well covered: {topic['title']}")
697
+
698
+ logger.info("Coherence validation complete")
src/utils/__init__.py ADDED
File without changes
src/utils/pdf_processor.py ADDED
@@ -0,0 +1,59 @@
1
+ import PyPDF2
2
+ from typing import Optional
3
+
4
+ class PDFProcessor:
5
+ """Handles PDF file processing and text extraction"""
6
+
7
+ def __init__(self):
8
+ """Initialize PDF processor"""
9
+ pass
10
+
11
+ def extract_text(self, pdf_path: str) -> str:
12
+ """
13
+ Extract text content from a PDF file
14
+
15
+ Args:
16
+ pdf_path: Path to the PDF file
17
+
18
+ Returns:
19
+ str: Extracted text content
20
+
21
+ Raises:
22
+ FileNotFoundError: If PDF file doesn't exist
23
+ PyPDF2.errors.PdfReadError: If PDF file is invalid or corrupted
24
+ """
25
+ try:
26
+ with open(pdf_path, 'rb') as file:
27
+ # Create PDF reader object
28
+ reader = PyPDF2.PdfReader(file)
29
+
30
+ # Extract text from all pages
31
+ text = ""
32
+ for page in reader.pages:
33
+ text += page.extract_text() + "\n"
34
+
35
+ return text.strip()
36
+
37
+ except FileNotFoundError:
38
+ raise FileNotFoundError(f"PDF file not found: {pdf_path}")
39
+ except PyPDF2.errors.PdfReadError as e:
40
+ raise PyPDF2.errors.PdfReadError(f"Error reading PDF file: {str(e)}")
41
+ except Exception as e:
42
+ raise Exception(f"Unexpected error processing PDF: {str(e)}")
43
+
44
+ def get_metadata(self, pdf_path: str) -> dict:
45
+ """
46
+ Extract metadata from PDF file
47
+
48
+ Args:
49
+ pdf_path: Path to the PDF file
50
+
51
+ Returns:
52
+ dict: PDF metadata
53
+ """
54
+ try:
55
+ with open(pdf_path, 'rb') as file:
56
+ reader = PyPDF2.PdfReader(file)
57
+ return reader.metadata
58
+ except Exception as e:
59
+ return {"error": str(e)}
src/utils/text_processor.py ADDED
@@ -0,0 +1,84 @@
1
+ import re
2
+ from typing import List, Optional
3
+
4
+ class TextProcessor:
5
+ """Handles text preprocessing and cleaning"""
6
+
7
+ def __init__(self):
8
+ """Initialize text processor"""
9
+ self.sentence_endings = r'[.!?]'
10
+ self.word_pattern = r'\b\w+\b'
11
+
12
+ def clean_text(self, text: str) -> str:
13
+ """
14
+ Clean and normalize text
15
+
16
+ Args:
17
+ text: Input text to clean
18
+
19
+ Returns:
20
+ str: Cleaned text
21
+ """
22
+ # Remove extra whitespace
23
+ text = ' '.join(text.split())
24
+
25
+ # Fix common OCR errors
26
+ text = self._fix_ocr_errors(text)
27
+
28
+ # Normalize punctuation
29
+ text = self._normalize_punctuation(text)
30
+
31
+ return text.strip()
32
+
33
+ def split_into_sections(self, text: str) -> List[str]:
34
+ """
35
+ Split text into logical sections based on content
36
+
37
+ Args:
38
+ text: Input text to split
39
+
40
+ Returns:
41
+ List[str]: List of text sections
42
+ """
43
+ # Split on double newlines or section markers
44
+ sections = re.split(r'\n\s*\n|\n(?=[A-Z][^a-z]*:)', text)
45
+ return [s.strip() for s in sections if s.strip()]
46
+
47
+ def count_words(self, text: str) -> int:
48
+ """
49
+ Count words in text
50
+
51
+ Args:
52
+ text: Input text
53
+
54
+ Returns:
55
+ int: Word count
56
+ """
57
+ words = re.findall(self.word_pattern, text)
58
+ return len(words)
59
+
60
+ def _fix_ocr_errors(self, text: str) -> str:
61
+ """Fix common OCR errors"""
62
+ replacements = {
63
+ r'[|]': 'I', # Vertical bar to I
64
+ r'0': 'O', # Zero to O where appropriate
65
+ r'1': 'l', # One to l where appropriate
66
+ r'\s+': ' ' # Multiple spaces to single space
67
+ }
68
+
69
+ for pattern, replacement in replacements.items():
70
+ text = re.sub(pattern, replacement, text)
71
+ return text
72
+
73
+ def _normalize_punctuation(self, text: str) -> str:
74
+ """Normalize punctuation marks"""
75
+ # Replace multiple periods with single period
76
+ text = re.sub(r'\.{2,}', '.', text)
77
+
78
+ # Add space after punctuation if missing
79
+ text = re.sub(r'([.!?])([A-Z])', r'\1 \2', text)
80
+
81
+ # Fix spacing around punctuation
82
+ text = re.sub(r'\s+([.!?,])', r'\1', text)
83
+
84
+ return text
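As a sanity check, the three normalization substitutions above compose like this — a standalone sketch of the same regexes, alongside the word-count pattern `TextProcessor` uses:

```python
import re

def normalize_punctuation(text: str) -> str:
    # Collapse runs of two or more periods into one.
    text = re.sub(r'\.{2,}', '.', text)
    # Insert a missing space after sentence-ending punctuation before a capital.
    text = re.sub(r'([.!?])([A-Z])', r'\1 \2', text)
    # Remove stray whitespace before punctuation.
    text = re.sub(r'\s+([.!?,])', r'\1', text)
    return text

def count_words(text: str) -> int:
    # Same \b\w+\b pattern TextProcessor uses for word counting.
    return len(re.findall(r'\b\w+\b', text))
```

Order matters: collapsing dot runs first means the space-insertion rule only ever sees single periods.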