rogeliorichman committed on
Commit b2b4dfa · verified · 1 Parent(s): 2eea992

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+data/sample2.pdf filter=lfs diff=lfs merge=lfs -text
.github/workflows/update_space.yml ADDED
@@ -0,0 +1,28 @@
+name: Run Python script
+
+on:
+  push:
+    branches:
+      - main
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v2
+
+      - name: Set up Python
+        uses: actions/setup-python@v2
+        with:
+          python-version: '3.9'
+
+      - name: Install Gradio
+        run: python -m pip install gradio
+
+      - name: Log in to Hugging Face
+        run: python -c 'import huggingface_hub; huggingface_hub.login(token="${{ secrets.hf_token }}")'
+
+      - name: Deploy to Spaces
+        run: gradio deploy
.gitignore ADDED
@@ -0,0 +1,62 @@
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# Virtual Environment
+venv/
+ENV/
+env/
+.env
+
+# IDE
+.idea/
+.vscode/
+*.swp
+*.swo
+.DS_Store
+
+# Testing
+.coverage
+htmlcov/
+.tox/
+.nox/
+.pytest_cache/
+
+# Logs
+*.log
+logs/
+
+# Local development
+.env.local
+.env.development.local
+.env.test.local
+.env.production.local
+
+# API Keys
+.env
+*.pem
+*.key
+
+# Gradio
+.gradio/
+
+# private file
+/data/sample3.pdf
CONTRIBUTING.md ADDED
@@ -0,0 +1,99 @@
+# Contributing to AI LectureForge
+
+First off, thank you for considering contributing to AI LectureForge! It's people like you that make AI LectureForge such a great tool.
+
+## Code of Conduct
+
+By participating in this project, you are expected to uphold our Code of Conduct:
+
+- Use welcoming and inclusive language
+- Be respectful of differing viewpoints and experiences
+- Gracefully accept constructive criticism
+- Focus on what is best for the community
+- Show empathy towards other community members
+
+## How Can I Contribute?
+
+### Reporting Bugs
+
+Before creating a bug report, please check the existing issues; the problem may already be reported. When you do create a bug report, please include as many details as possible:
+
+* Use a clear and descriptive title
+* Describe the exact steps which reproduce the problem
+* Provide specific examples to demonstrate the steps
+* Describe the behavior you observed after following the steps
+* Explain which behavior you expected to see instead and why
+* Include screenshots if possible
+
+### Suggesting Enhancements
+
+If you have a suggestion for the project, we'd love to hear it. Enhancement suggestions are tracked as GitHub issues. When creating an enhancement suggestion, please include:
+
+* A clear and descriptive title
+* A detailed description of the proposed enhancement
+* Examples of how the enhancement would be used
+* Any potential drawbacks or challenges
+
+### Pull Requests
+
+1. Fork the repo and create your branch from `main`
+2. If you've added code that should be tested, add tests
+3. If you've changed APIs, update the documentation
+4. Ensure the test suite passes
+5. Make sure your code follows the existing style
+6. Issue that pull request!
+
+## Development Process
+
+1. Create a new branch:
+   ```bash
+   git checkout -b feature/my-feature
+   # or
+   git checkout -b bugfix/my-bugfix
+   ```
+
+2. Make your changes and commit:
+   ```bash
+   git add .
+   git commit -m "Description of changes"
+   ```
+
+3. Push to your fork:
+   ```bash
+   git push origin feature/my-feature
+   ```
+
+### Style Guidelines
+
+- Follow the PEP 8 style guide for Python code
+- Use descriptive variable names
+- Comment your code when necessary
+- Keep functions focused and modular
+- Use type hints where possible
+
+### Testing
+
+- Write unit tests for new features
+- Ensure all tests pass before submitting a PR
+- Include both positive and negative test cases
+
+## Project Structure
+
+```
+transcript_transformer/
+├── src/
+│   ├── core/          # Core transformation logic
+│   ├── utils/         # Utility functions
+│   └── app.py         # Main application
+├── tests/             # Test files
+└── requirements.txt   # Project dependencies
+```
+
+## Getting Help
+
+If you need help, you can:
+- Open an issue with your question
+- Reach out to the maintainers
+- Check the documentation
+
+Thank you for contributing to AI LectureForge! 🎓✨
README.md CHANGED
@@ -1,12 +1,228 @@
 ---
-title: AI Script Generator
-emoji: 🐨
-colorFrom: yellow
-colorTo: blue
+title: AI_Script_Generator
+app_file: src/app.py
 sdk: gradio
-sdk_version: 5.18.0
-app_file: app.py
-pinned: false
+sdk_version: 5.13.1
 ---
+# 🎓 AI Script Generator
 
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
+[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
+[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](http://makeapullrequest.com)
+
+> Transform transcripts and PDFs into timed, structured teaching scripts using AI
+
+AI Script Generator is an AI system that converts PDF transcripts, raw text, and conversational content into well-structured teaching scripts. It extracts and analyzes the input content to create organized, pedagogically sound scripts with time markers. It is designed for educators, students, content creators, and anyone looking to turn information into clear explanations.
+
+## ✨ Features
+
+- 📄 PDF transcript and raw text processing
+- 🤖 AI-powered content transformation
+- 📚 Structured teaching script generation
+- 🔄 Coherent topic organization
+- 🔌 Support for multiple AI providers (Gemini/OpenAI)
+- ⏱️ Time-marked sections for pacing
+
+## Output Format
+
+The generated scripts follow a structured format:
+
+### Time Markers
+- Each section includes time markers (e.g., `[11:45]`) to help pace delivery
+- Customizable duration: from 2 to 60 minutes, with timing adjusted accordingly
+
+### Structure
+- Introduction with learning objectives
+- Time-marked content sections
+- Examples and practical applications
+- Interactive elements (questions, exercises)
+- Recap and key takeaways
+
+For example:
+```
+[00:00] Introduction to Topic
+- Learning objectives
+- Key concepts overview
+
+[11:45] Main Concept Explanation
+- Detailed explanation
+- Practical example
+- Student interaction point
+
+[23:30] Advanced Applications
+...
+```
+
+## 🚀 Quick Start
+
+### Prerequisites
+
+- Python 3.8 or higher
+- Virtual environment (recommended)
+- Gemini API key (or OpenAI API key)
+
+### Installation
+
+```bash
+# Clone the repository
+git clone https://github.com/RogelioRichmanAstronaut/AI-Script-Generator.git
+cd AI-Script-Generator
+
+# Create and activate virtual environment
+python -m venv venv
+source venv/bin/activate  # On Windows: .\venv\Scripts\activate
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Set up environment variables (choose one API key based on your preference)
+export GEMINI_API_KEY='your-gemini-api-key'  # Primary option
+# OR
+export OPENAI_API_KEY='your-openai-api-key'  # Alternative option
+
+# On Windows use:
+# set GEMINI_API_KEY=your-gemini-api-key
+# set OPENAI_API_KEY=your-openai-api-key
+```
+
+### Usage
+
+```bash
+# Run with the Python path set
+PYTHONPATH=$PYTHONPATH:. python src/app.py
+
+# Access the web interface
+# Open http://localhost:7860 in your browser
+```
+
+## 🛠️ Technical Approach
+
+### Prompt Engineering Strategy
+
+Our system uses a multi-stage prompting approach:
+
+1. **Content Analysis & Chunking**
+   - Smart text segmentation for handling large documents (9000+ words)
+   - Contextual overlap between chunks to maintain coherence
+   - Key topic and concept extraction from each segment
+
+2. **Structure Generation**
+   - Time-based sectioning (customizable from 2-60 minutes)
+   - Educational flow design with clear progression
+   - Integration of pedagogical elements (examples, exercises, questions)
+
+3. **Educational Enhancement**
+   - Transformation of casual content into a formal teaching script
+   - Addition of practical examples and case studies
+   - Integration of interaction points and reflection questions
+   - Time markers for pacing guidance
+
+4. **Coherence Validation**
+   - Cross-reference checking between sections
+   - Verification of topic flow and progression
+   - Consistency check for terminology and concepts
+   - Quality assessment of educational elements
+
+### Challenges & Solutions
+
+1. **Context Length Management**
+   - Challenge: Handling documents beyond model context limits
+   - Solution: Implemented sliding-window chunking with overlap
+   - Result: Successfully processes documents of 9000+ words, with room to extend further
+
+2. **Educational Structure**
+   - Challenge: Converting conversational text to a teaching format
+   - Solution:
+     - Structured templating system for different time formats (2-60 min)
+     - Integration of pedagogical elements (examples, exercises)
+     - Time-based sectioning with clear progression
+   - Result: Coherent, time-marked teaching scripts with interactive elements
+
+3. **Content Coherence**
+   - Challenge: Maintaining narrative flow across chunked content
+   - Solution:
+     - Contextual overlap between chunks
+     - Topic tracking across sections
+     - Cross-reference validation system
+   - Result: Seamless content flow with consistent terminology
+
+4. **Educational Quality**
+   - Challenge: Ensuring high pedagogical value
+   - Solution:
+     - Integration of learning objectives
+     - Strategic placement of examples and exercises
+     - Addition of reflection questions
+     - Time-appropriate pacing markers
+   - Result: Engaging, structured learning materials
+
+### Core Components
+
+1. **PDF Processing**: Extracts and cleans text from PDF transcripts
+2. **Text Processing**: Handles direct text input and cleans/structures it
+3. **Content Analysis**: Uses AI to understand and structure the content
+4. **Script Generation**: Transforms content into an educational format
+
+### Implementation Details
+
+1. **PDF/Text Handling**
+   - Robust PDF text extraction
+   - Raw text input processing
+   - Clean-up of extracted content
+
+2. **AI Processing**
+   - Integration with the Gemini API (primary)
+   - OpenAI API support (alternative)
+   - Structured prompt system for consistent output
+
+3. **Output Generation**
+   - Organized teaching scripts
+   - Clear section structure
+   - Learning points and key concepts
+
+### Architecture
+
+The system follows a modular design:
+
+- 📄 PDF/text processing module
+- 🔍 Text analysis component
+- 🤖 AI integration layer
+- 📝 Output formatting system
+
+## 🤝 Contributing
+
+Contributions are what make the open source community amazing! Any contributions you make are **greatly appreciated**.
+
+1. Fork the Project
+2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
+3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
+4. Push to the Branch (`git push origin feature/AmazingFeature`)
+5. Open a Pull Request
+
+See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
+
+## 📝 License
+
+Distributed under the MIT License. See `LICENSE` for more information.
+
+## 🌟 Acknowledgments
+
+- Thanks to all contributors who have helped shape AI Script Generator
+- Special thanks to the Gemini and OpenAI teams for their APIs
+- Inspired by educators and communicators worldwide who make learning engaging
+
+## 📧 Contact
+
+Project Link: [https://github.com/RogelioRichmanAstronaut/AI-Script-Generator](https://github.com/RogelioRichmanAstronaut/AI-Script-Generator)
+
+## 🔮 Roadmap
+
+- [ ] Support for multiple output formats (PDF, PPTX)
+- [ ] Interactive elements generation
+- [ ] Custom templating system
+- [ ] Multi-language support
+- [ ] Integration with LMS platforms
+
+---
+
+<p align="center">Made with ❤️ for educators, students, and communicators everywhere</p>
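The "sliding-window chunking with overlap" the README describes can be sketched roughly as follows. The 6000-word chunk size matches `CHUNK_SIZE` in `src/core/transformer.py`; the overlap size and the function name are illustrative assumptions, not part of the repository.

```python
from typing import List

def chunk_words(text: str, chunk_size: int = 6000, overlap: int = 200) -> List[str]:
    """Split text into word-based chunks, re-reading `overlap` words of
    context from the end of each chunk at the start of the next one."""
    words = text.split()
    if len(words) <= chunk_size:
        return [text]
    chunks = []
    step = chunk_size - overlap  # advance by less than a full chunk
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reached the end of the text
    return chunks
```

The overlap is what lets the structure-generation stage keep terminology consistent across chunk boundaries: each prompt sees the tail of the previous segment.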
data/sample2.pdf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5bf0997942205ed54293dd5b2a480b6d5efcc7d4146548dd68c20a9d7e3f7318
+size 155966
requirements.txt ADDED
@@ -0,0 +1,9 @@
+gradio>=4.0.0
+transformers>=4.30.0
+torch>=2.0.0
+pypdf2>=3.0.0
+python-dotenv>=0.19.0
+numpy>=1.21.0
+tqdm>=4.65.0
+openai>=1.0.0
+tiktoken>=0.5.0
setup.py ADDED
@@ -0,0 +1,13 @@
+from setuptools import setup, find_packages
+
+setup(
+    name="transcript_transformer",
+    version="0.1.0",
+    packages=find_packages(),
+    install_requires=[
+        line.strip()
+        for line in open("requirements.txt")
+        if line.strip() and not line.startswith("#")
+    ],
+    python_requires=">=3.8",
+)
src/__init__.py ADDED
File without changes
src/app.py ADDED
@@ -0,0 +1,177 @@
+import os
+import gradio as gr
+from dotenv import load_dotenv
+from src.core.transformer import TranscriptTransformer
+from src.utils.pdf_processor import PDFProcessor
+from src.utils.text_processor import TextProcessor
+
+load_dotenv()
+
+class TranscriptTransformerApp:
+    def __init__(self):
+        self.pdf_processor = PDFProcessor()
+        self.text_processor = TextProcessor()
+
+    def process_transcript(self,
+                           input_type: str,
+                           file_obj: gr.File = None,
+                           raw_text_input: str = "",
+                           initial_prompt: str = "",
+                           target_duration: int = 30,
+                           include_examples: bool = True,
+                           use_thinking_model: bool = False,
+                           use_gemini: bool = True) -> str:
+        """
+        Process an uploaded transcript and transform it into a teaching transcript
+
+        Args:
+            input_type: Type of input (PDF or Raw Text)
+            file_obj: Uploaded PDF file (if input_type is PDF)
+            raw_text_input: Raw text input (if input_type is Raw Text)
+            initial_prompt: Additional guiding instructions for the content generation
+            target_duration: Target lecture duration in minutes
+            include_examples: Whether to include practical examples
+            use_thinking_model: Use the experimental thinking model (requires Gemini)
+            use_gemini: Whether to use the Gemini API instead of OpenAI
+
+        Note: the Gradio click handler passes its seven inputs positionally, so
+        use_thinking_model must come before use_gemini in this signature.
+
+        Returns:
+            str: Generated teaching transcript
+        """
+        try:
+            # Force-enable Gemini if the thinking model is selected
+            if use_thinking_model:
+                use_gemini = True
+
+            self.transformer = TranscriptTransformer(
+                use_gemini=use_gemini,
+                use_thinking_model=use_thinking_model
+            )
+
+            # Get text based on input type
+            if input_type == "PDF":
+                if file_obj is None:
+                    return "Error: No PDF file uploaded"
+                raw_text = self.pdf_processor.extract_text(file_obj.name)
+            else:  # Raw Text
+                if not raw_text_input.strip():
+                    return "Error: No text provided"
+                raw_text = raw_text_input
+
+            # Transform to teaching transcript with user guidance
+            lecture_transcript = self.transformer.transform_to_lecture(
+                text=raw_text,
+                target_duration=target_duration,
+                include_examples=include_examples,
+                initial_prompt=initial_prompt
+            )
+
+            return lecture_transcript
+
+        except Exception as e:
+            return f"Error processing transcript: {str(e)}"
+
+    def launch(self):
+        """Launch the Gradio interface"""
+        # Get the path to the example PDF
+        example_pdf = os.path.join(os.path.dirname(os.path.dirname(__file__)), "data", "sample2.pdf")
+
+        with gr.Blocks(title="AI Script Generator") as interface:
+            gr.Markdown("# AI Script Generator")
+            gr.Markdown("Transform transcripts and PDFs into timed, structured teaching scripts using AI")
+
+            with gr.Row():
+                input_type = gr.Radio(
+                    choices=["PDF", "Raw Text"],
+                    label="Input Type",
+                    value="PDF"
+                )
+
+            with gr.Row():
+                with gr.Column(visible=True) as pdf_column:
+                    file_input = gr.File(
+                        label="Upload Transcript (PDF)",
+                        file_types=[".pdf"]
+                    )
+
+                with gr.Column(visible=False) as text_column:
+                    text_input = gr.Textbox(
+                        label="Paste Transcript Text",
+                        lines=10,
+                        placeholder="Paste your transcript text here..."
+                    )
+
+            with gr.Row():
+                initial_prompt = gr.Textbox(
+                    label="Guiding Prompt (Optional)",
+                    lines=3,
+                    value="",
+                    placeholder="Additional instructions to customize the output. Examples: 'Use a more informal tone', 'Focus only on section X', 'Generate the content in Spanish', 'Include more practical programming examples', etc.",
+                    info="The Guiding Prompt lets you provide specific instructions to modify the generated content, including the output language. Use it to change the tone or style, focus only on specific sections of the text, specify the output language (e.g., 'Generate in Spanish/French/German'), or give any other instruction that personalizes the final result."
+                )
+
+            with gr.Row():
+                target_duration = gr.Number(
+                    label="Target Lecture Duration (minutes)",
+                    value=30,
+                    minimum=2,
+                    maximum=60,
+                    step=1
+                )
+
+                include_examples = gr.Checkbox(
+                    label="Include Practical Examples",
+                    value=True
+                )
+
+                use_thinking_model = gr.Checkbox(
+                    label="Use Experimental Thinking Model (Gemini Only)",
+                    value=True
+                )
+
+            with gr.Row():
+                submit_btn = gr.Button("Transform Transcript")
+
+            output = gr.Textbox(
+                label="Generated Teaching Transcript",
+                lines=25
+            )
+
+            # Handle visibility of input columns based on selection
+            def update_input_visibility(choice):
+                return [
+                    gr.update(visible=(choice == "PDF")),      # pdf_column
+                    gr.update(visible=(choice == "Raw Text"))  # text_column
+                ]
+
+            input_type.change(
+                fn=update_input_visibility,
+                inputs=input_type,
+                outputs=[pdf_column, text_column]
+            )
+
+            # Set up submission logic (inputs bind positionally to process_transcript)
+            submit_btn.click(
+                fn=self.process_transcript,
+                inputs=[
+                    input_type,
+                    file_input,
+                    text_input,
+                    initial_prompt,
+                    target_duration,
+                    include_examples,
+                    use_thinking_model
+                ],
+                outputs=output
+            )
+
+            # Example for PDF input
+            gr.Examples(
+                examples=[[example_pdf, "", "", 30, True, True]],
+                inputs=[file_input, text_input, initial_prompt, target_duration, include_examples, use_thinking_model]
+            )
+
+        interface.launch(share=True)
+
+if __name__ == "__main__":
+    app = TranscriptTransformerApp()
+    app.launch()
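The `target_duration` the UI collects above becomes a word budget downstream: `src/core/transformer.py` multiplies it by a 130 words-per-minute speaking rate, allows a 95-105% band, and splits the total 10/70/15/5 across intro, main, practical, and summary sections. A minimal sketch of that arithmetic (using integer math for determinism, where the source uses `int(target * 0.95)`-style float truncation; the function name is ours):

```python
WORDS_PER_MINUTE = 130  # average speaking rate used by TranscriptTransformer

def word_budget(target_duration: int) -> dict:
    """Derive the word targets the transformer computes from a duration in minutes."""
    target = WORDS_PER_MINUTE * target_duration
    return {
        "target": target,
        "min": target * 95 // 100,    # 95% lower bound
        "max": target * 105 // 100,   # 105% upper bound
        "sections": {                 # intro/main/practical/summary split
            "intro": target * 10 // 100,
            "main": target * 70 // 100,
            "practical": target * 15 // 100,
            "summary": target * 5 // 100,
        },
    }
```

So the default 30-minute lecture targets 3,900 words, with the main body allotted 2,730 of them.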
src/core/__init__.py ADDED
File without changes
src/core/transformer.py ADDED
@@ -0,0 +1,580 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+ import logging
3
+ import json
4
+ from typing import List, Dict, Optional
5
+ import openai
6
+ from src.utils.text_processor import TextProcessor
7
+
8
+ # Configure logging
9
+ logging.basicConfig(level=logging.INFO)
10
+ logger = logging.getLogger(__name__)
11
+
12
+ class WordCountError(Exception):
13
+ """Raised when word count requirements are not met"""
14
+ pass
15
+
16
+ class TranscriptTransformer:
17
+ """Transforms conversational transcripts into teaching material using LLM"""
18
+
19
+ MAX_RETRIES = 3 # Maximum retries for content generation
20
+ CHUNK_SIZE = 6000 # Target words per chunk
21
+ LARGE_DEVIATION_THRESHOLD = 0.20 # 20% maximum deviation
22
+ MAX_TOKENS = 64000 # Nuevo límite absoluto basado en 64k tokens de salida
23
+
24
+ def __init__(self, use_gemini: bool = True, use_thinking_model: bool = False):
25
+ """Initialize the transformer with selected LLM client"""
26
+ self.text_processor = TextProcessor()
27
+ self.use_gemini = use_gemini
28
+ self.use_thinking_model = use_thinking_model
29
+
30
+ if use_thinking_model:
31
+ if not use_gemini:
32
+ raise ValueError("Thinking model requires use_gemini=True")
33
+
34
+ logger.info("Initializing with Gemini Flash Thinking API")
35
+ self.openai_client = openai.OpenAI(
36
+ api_key=os.getenv('GEMINI_API_KEY'),
37
+ base_url="https://generativelanguage.googleapis.com/v1alpha"
38
+ )
39
+ self.model_name = "gemini-2.0-flash-thinking-exp-01-21"
40
+ elif use_gemini:
41
+ logger.info("Initializing with Gemini API")
42
+ self.openai_client = openai.OpenAI(
43
+ api_key=os.getenv('GEMINI_API_KEY'),
44
+ base_url="https://generativelanguage.googleapis.com/v1beta"
45
+ )
46
+ self.model_name = "gemini-2.0-flash-exp"
47
+ else:
48
+ logger.info("Initializing with OpenAI API")
49
+ self.openai_client = openai.OpenAI(
50
+ api_key=os.getenv('OPENAI_API_KEY')
51
+ )
52
+ self.model_name = "gpt-3.5-turbo"
53
+
54
+ # Target word counts
55
+ self.words_per_minute = 130 # Average speaking rate
56
+
57
+ def _validate_word_count(self, total_words: int, target_words: int, min_words: int, max_words: int) -> None:
58
+ """Validate word count with flexible thresholds and log warnings/errors"""
59
+ deviation = abs(total_words - target_words) / target_words
60
+
61
+ if deviation > self.LARGE_DEVIATION_THRESHOLD:
62
+ logger.error(
63
+ f"Word count {total_words} significantly outside target range "
64
+ f"({min_words}-{max_words}). Deviation: {deviation:.2%}"
65
+ )
66
+ elif total_words < min_words or total_words > max_words:
67
+ logger.warning(
68
+ f"Word count {total_words} slightly outside target range "
69
+ f"({min_words}-{max_words}). Deviation: {deviation:.2%}"
70
+ )
71
+
72
+ def transform_to_lecture(self,
73
+ text: str,
74
+ target_duration: int = 30,
75
+ include_examples: bool = True,
76
+ initial_prompt: Optional[str] = None) -> str:
77
+ """
78
+ Transform input text into a structured teaching transcript
79
+
80
+ Args:
81
+ text: Input transcript text
82
+ target_duration: Target lecture duration in minutes
83
+ include_examples: Whether to include practical examples
84
+ initial_prompt: Additional user instructions to guide the generation
85
+
86
+ Returns:
87
+ str: Generated teaching transcript, regardless of word count validation
88
+ """
89
+ logger.info(f"Starting transformation for {target_duration} minute lecture")
90
+
91
+ # Clean and preprocess text
92
+ cleaned_text = self.text_processor.clean_text(text)
93
+ input_words = self.text_processor.count_words(cleaned_text)
94
+ logger.info(f"Input text cleaned. Word count: {input_words}")
95
+
96
+ # Calculate target word count
97
+ target_words = self.words_per_minute * target_duration
98
+ min_words = int(target_words * 0.95) # Minimum 95% of target
99
+ max_words = int(target_words * 1.05) # Maximum 105% of target
100
+
101
+ logger.info(f"Target word count: {target_words} (min: {min_words}, max: {max_words})")
102
+
103
+ # Generate detailed lecture structure with topics
104
+ structure_data = self._generate_detailed_structure(
105
+ text=cleaned_text,
106
+ target_duration=target_duration,
107
+ initial_prompt=initial_prompt
108
+ )
109
+ logger.info("Detailed lecture structure generated")
110
+ logger.info(f"Topics identified: {[t['title'] for t in structure_data['topics']]}")
111
+
112
+ # Calculate section word counts
113
+ section_words = {
114
+ 'intro': int(target_words * 0.1),
115
+ 'main': int(target_words * 0.7),
116
+ 'practical': int(target_words * 0.15),
117
+ 'summary': int(target_words * 0.05)
118
+ }
119
+
120
+ try:
121
+ logger.info("Generating content by sections with topic tracking")
122
+
123
+ # Introduction with learning objectives and topic preview
124
+ intro = self._generate_section(
125
+ 'introduction',
126
+ structure_data,
127
+ cleaned_text,
128
+ section_words['intro'],
129
+ include_examples,
130
+ is_first=True,
131
+ initial_prompt=initial_prompt
132
+ )
133
+ intro_words = self.text_processor.count_words(intro)
134
+ logger.info(f"Introduction generated: {intro_words} words")
135
+
136
+ # Track context for coherence
137
+ context = {
138
+ 'current_section': 'introduction',
139
+ 'covered_topics': [],
140
+ 'pending_topics': [t['title'] for t in structure_data['topics']],
141
+ 'key_terms': set(),
142
+ 'current_narrative': intro[-1000:], # Last 1000 words for context
143
+ 'learning_objectives': structure_data['learning_objectives']
144
+ }
145
+
146
+ # Main content with topic progression
147
+ main_content = self._generate_main_content(
148
+ structure_data,
149
+ cleaned_text,
150
+ section_words['main'],
151
+ include_examples,
152
+ context,
153
+ initial_prompt=initial_prompt
154
+ )
155
+ main_words = self.text_processor.count_words(main_content)
156
+ logger.info(f"Main content generated: {main_words} words")
157
+
158
+ # Update context after main content
159
+ context['current_section'] = 'main'
160
+ context['current_narrative'] = main_content[-1000:]
161
+
162
+ # Practical applications tied to main topics
163
+ practical = self._generate_section(
164
+ 'practical',
165
+ structure_data,
166
+ cleaned_text,
167
+ section_words['practical'],
168
+ include_examples,
169
+ context=context,
170
+ initial_prompt=initial_prompt
171
+ )
172
+ practical_words = self.text_processor.count_words(practical)
173
+ logger.info(f"Practical section generated: {practical_words} words")
174
+
175
+ # Update context for summary
176
+ context['current_section'] = 'practical'
177
+ context['current_narrative'] = practical[-500:]
178
+
179
+ # Summary with topic reinforcement
180
+ summary = self._generate_section(
181
+ 'summary',
182
+ structure_data,
183
+ cleaned_text,
184
+ section_words['summary'],
185
+ include_examples,
186
+ is_last=True,
187
+ context=context,
188
+ initial_prompt=initial_prompt
189
+ )
190
+ summary_words = self.text_processor.count_words(summary)
191
+ logger.info(f"Summary generated: {summary_words} words")
192
+
193
+ # Combine all sections
194
+ full_content = f"{intro}\n\n{main_content}\n\n{practical}\n\n{summary}"
195
+ total_words = self.text_processor.count_words(full_content)
196
+ logger.info(f"Total content generated: {total_words} words")
197
+
198
+ # Log warnings/errors but don't raise exceptions
199
+ self._validate_word_count(total_words, target_words, min_words, max_words)
200
+
201
+ # Validate coherence
202
+ self._validate_coherence(full_content, structure_data)
203
+ logger.info("Content coherence validated")
204
+
205
+ return full_content
206
+
207
+ except Exception as e:
208
+ logger.error(f"Error during content generation: {str(e)}")
209
+ # If we have partial content, return it
210
+ if 'full_content' in locals():
211
+ logger.warning("Returning partial content despite errors")
212
+ return full_content
213
+ raise # Re-raise only if we have no content at all
214
+
215
+     def _generate_detailed_structure(self,
+                                      text: str,
+                                      target_duration: int,
+                                      initial_prompt: Optional[str] = None) -> Dict:
+         """Generate detailed lecture structure with topics and objectives"""
+         logger.info("Generating detailed lecture structure")
+
+         user_instructions = f"\nAdditional user instructions:\n{initial_prompt}\n" if initial_prompt else ""
+
+         prompt = f"""
+ You are an expert educator creating a detailed lecture outline.
+ {user_instructions}
+ Analyze this transcript and create a structured JSON output with the following:
+
+ 1. Title of the lecture
+ 2. 3-5 clear learning objectives
+ 3. 3-4 main topics, each with:
+    - Title
+    - Key concepts
+    - Subtopics
+    - Time allocation (in minutes)
+    - Connection to learning objectives
+ 4. Practical application ideas
+ 5. Key terms to track
+
+ IMPORTANT: Response MUST be valid JSON. Format exactly like this, with no additional text:
+ {{
+     "title": "string",
+     "learning_objectives": ["string"],
+     "topics": [
+         {{
+             "title": "string",
+             "key_concepts": ["string"],
+             "subtopics": ["string"],
+             "duration_minutes": number,
+             "objective_links": [number]
+         }}
+     ],
+     "practical_applications": ["string"],
+     "key_terms": ["string"]
+ }}
+
+ Target duration: {target_duration} minutes
+
+ Transcript excerpt:
+ {text[:2000]}
+ """
+
+         try:
+             # Common parameters
+             params = {
+                 "model": self.model_name,
+                 "messages": [
+                     {"role": "system", "content": "You are an expert educator. Output ONLY valid JSON, no other text."},
+                     {"role": "user", "content": prompt}
+                 ],
+                 "temperature": 0.7,
+                 "max_tokens": self.MAX_TOKENS if self.use_thinking_model else 4000
+             }
+
+             # Add thinking config if using experimental model
+             if self.use_thinking_model:
+                 params["extra_body"] = {
+                     "thinking_config": {
+                         "include_thoughts": True
+                     }
+                 }
+
+             response = self.openai_client.chat.completions.create(**params)
+             content = response.choices[0].message.content.strip()
+             logger.debug(f"Raw structure response: {content}")
+
+             try:
+                 structure_data = json.loads(content)
+                 logger.info("Structure data parsed successfully")
+                 return structure_data
+             except json.JSONDecodeError as e:
+                 logger.warning(f"Failed to parse JSON directly: {str(e)}")
+
+                 # Try to extract JSON if it's wrapped in other text
+                 import re
+                 json_match = re.search(r'({[\s\S]*})', content)
+                 if json_match:
+                     try:
+                         structure_data = json.loads(json_match.group(1))
+                         logger.info("Structure data extracted and parsed successfully")
+                         return structure_data
+                     except json.JSONDecodeError:
+                         logger.warning("Failed to parse extracted JSON")
+
+                 # If both attempts fail, use fallback structure
+                 logger.warning("Using fallback structure")
+                 return self._generate_fallback_structure(text, target_duration)
+
+         except Exception as e:
+             logger.error(f"Error generating structure: {str(e)}")
+             return self._generate_fallback_structure(text, target_duration)
+
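The two-step parsing strategy above (direct `json.loads`, then a brace-delimited regex rescue) can be sketched in isolation; `extract_json` is a hypothetical helper name, not part of the module:

```python
import json
import re
from typing import Optional


def extract_json(content: str) -> Optional[dict]:
    """Parse a JSON object, tolerating surrounding prose (same fallback as above)."""
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        # Grab everything from the first '{' to the last '}' and retry
        match = re.search(r'({[\s\S]*})', content)
        if match:
            try:
                return json.loads(match.group(1))
            except json.JSONDecodeError:
                pass
    return None
```

Because the regex is greedy, it handles model chatter before and after the object, but it will still fail if the reply contains two separate JSON objects with prose between them — hence the structured fallback that follows.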
+     def _generate_fallback_structure(self, text: str, target_duration: int) -> Dict:
+         """Generate a basic fallback structure when JSON parsing fails"""
+         logger.info("Generating fallback structure")
+
+         # Generate a simpler structure prompt
+         prompt = f"""
+ Analyze this text and provide:
+ 1. A title (one line)
+ 2. Three learning objectives (one per line)
+ 3. Three main topics (one per line)
+ 4. Three key terms (one per line)
+
+ Text: {text[:1000]}
+ """
+
+         try:
+             response = self.openai_client.chat.completions.create(
+                 model=self.model_name,
+                 messages=[
+                     {"role": "system", "content": "You are an expert educator. Provide concise, line-by-line responses."},
+                     {"role": "user", "content": prompt}
+                 ],
+                 temperature=0.7,
+                 max_tokens=1000
+             )
+
+             lines = response.choices[0].message.content.strip().split('\n')
+             lines = [line.strip() for line in lines if line.strip()]
+
+             # Extract components from lines
+             title = lines[0] if lines else "Lecture"
+             objectives = [obj for obj in lines[1:4] if obj][:3]
+             topics = [topic for topic in lines[4:7] if topic][:3]
+             terms = [term for term in lines[7:10] if term][:3]
+
+             # Calculate minutes per topic
+             main_time = int(target_duration * 0.7)  # 70% for main content
+             topic_minutes = main_time // len(topics) if topics else main_time
+
+             # Create fallback structure
+             return {
+                 "title": title,
+                 "learning_objectives": objectives,
+                 "topics": [
+                     {
+                         "title": topic,
+                         "key_concepts": [topic],  # Use topic as key concept
+                         "subtopics": ["Overview", "Details", "Examples"],
+                         "duration_minutes": topic_minutes,
+                         "objective_links": [1]  # Link to first objective
+                     }
+                     for topic in topics
+                 ],
+                 "practical_applications": [
+                     "Real-world application example",
+                     "Interactive exercise",
+                     "Case study"
+                 ],
+                 "key_terms": terms
+             }
+
+         except Exception as e:
+             logger.error(f"Error generating fallback structure: {str(e)}")
+             # Return minimal valid structure
+             return {
+                 "title": "Lecture Overview",
+                 "learning_objectives": ["Understand key concepts", "Apply knowledge", "Analyze examples"],
+                 "topics": [
+                     {
+                         "title": "Main Topic",
+                         "key_concepts": ["Core concept"],
+                         "subtopics": ["Overview"],
+                         "duration_minutes": target_duration // 2,
+                         "objective_links": [1]
+                     }
+                 ],
+                 "practical_applications": ["Practical example"],
+                 "key_terms": ["Key term"]
+             }
+
+     def _generate_section(self,
+                           section_type: str,
+                           structure_data: Dict,
+                           original_text: str,
+                           target_words: int,
+                           include_examples: bool,
+                           context: Optional[Dict] = None,
+                           is_first: bool = False,
+                           is_last: bool = False,
+                           initial_prompt: Optional[str] = None) -> str:
+         """Generate content for a specific section with coherence tracking"""
+         logger.info(f"Generating {section_type} section (target: {target_words} words)")
+
+         user_instructions = f"\nUser's guiding instructions:\n{initial_prompt}\n" if initial_prompt else ""
+
+         # Base prompt with structure
+         prompt = f"""
+ You are an expert educator creating a detailed lecture transcript.
+ {user_instructions}
+ Generate the {section_type} section with EXACTLY {target_words} words.
+
+ Lecture Title: {structure_data['title']}
+ Learning Objectives: {', '.join(structure_data['learning_objectives'])}
+
+ Current section purpose:
+ """
+
+         # Add section-specific guidance
+         if section_type == 'introduction':
+             prompt += """
+ - Start with an engaging hook
+ - Present clear learning objectives
+ - Preview main topics
+ - Set expectations for the lecture
+ """
+         elif section_type == 'main':
+             prompt += f"""
+ - Cover these topics: {', '.join(t['title'] for t in structure_data['topics'])}
+ - Build progressively on concepts
+ - Include clear transitions
+ - Reference previous concepts
+ """
+         elif section_type == 'practical':
+             prompt += """
+ - Apply concepts to real-world scenarios
+ - Connect to previous topics
+ - Include interactive elements
+ - Reinforce key learning points
+ """
+         elif section_type == 'summary':
+             prompt += """
+ - Reinforce key takeaways
+ - Connect back to objectives
+ - Provide next steps
+ - End with a strong conclusion
+ """
+
+         # Add context if available
+         if context:
+             prompt += f"""
+
+ Context:
+ - Covered topics: {', '.join(context['covered_topics'])}
+ - Pending topics: {', '.join(context['pending_topics'])}
+ - Key terms used: {', '.join(context['key_terms'])}
+ - Recent narrative: {context['current_narrative']}
+ """
+
+         # Add requirements
+         prompt += f"""
+
+ Requirements:
+ 1. STRICT word count: Generate EXACTLY {target_words} words
+ 2. Include practical examples: {include_examples}
+ 3. Use clear transitions
+ 4. Include engagement points
+ 5. Use time markers [MM:SS]
+ 6. Reference specific content from transcript
+ 7. Maintain narrative flow
+ 8. Use key terms consistently
+ """
+
+         response = self.openai_client.chat.completions.create(
+             model=self.model_name,
+             messages=[
+                 {"role": "system", "content": "You are an expert educator creating a coherent lecture transcript."},
+                 {"role": "user", "content": prompt}
+             ],
+             temperature=0.7,
+             max_tokens=self._calculate_max_tokens(section_type, target_words)
+         )
+
+         content = response.choices[0].message.content
+         word_count = self.text_processor.count_words(content)
+         logger.info(f"Section generated: {word_count} words")
+
+         return content
+
+     def _calculate_max_tokens(self, section_type: str, target_words: int) -> int:
+         """Calculate appropriate max_tokens based on section and model"""
+         # 1 token ≈ 4 characters (1 word ≈ 1.33 tokens)
+         base_tokens = int(target_words * 1.5)  # Margin for formatting
+
+         if self.use_thinking_model:
+             # Allows up to 64k tokens, but caps each section
+             section_limits = {
+                 'introduction': 8000,
+                 'main': 32000,
+                 'practical': 16000,
+                 'summary': 8000
+             }
+             return min(base_tokens * 2, section_limits.get(section_type, 16000))
+
+         # Limits for other models
+         return min(base_tokens + 1000, self.MAX_TOKENS)
+
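The token budgeting above can be exercised as a standalone sketch; `max_tokens_for` is a hypothetical free-function version, and the `MAX_TOKENS` constant stands in for the class attribute, which is not shown in this diff:

```python
MAX_TOKENS = 4000  # assumed fallback cap; the real class attribute is defined elsewhere


def max_tokens_for(section_type: str, target_words: int, use_thinking_model: bool) -> int:
    """Mirror of _calculate_max_tokens: ~1.5 tokens/word with per-section caps."""
    base_tokens = int(target_words * 1.5)  # headroom over the ~1.33 tokens/word average
    if use_thinking_model:
        # Thinking models allow far more tokens, but each section is still capped
        section_limits = {'introduction': 8000, 'main': 32000,
                          'practical': 16000, 'summary': 8000}
        return min(base_tokens * 2, section_limits.get(section_type, 16000))
    return min(base_tokens + 1000, MAX_TOKENS)
```

For example, a 2,000-word main section on the thinking model budgets 6,000 tokens, while a 20,000-word request is clipped to the 32,000-token section cap.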
+     def _generate_main_content(self,
+                                structure_data: Dict,
+                                original_text: str,
+                                target_words: int,
+                                include_examples: bool,
+                                context: Dict,
+                                initial_prompt: Optional[str] = None) -> str:
+         """Generate main content with topic progression"""
+         logger.info(f"Generating main content (target: {target_words} words)")
+
+         # Calculate words per topic based on their duration ratios
+         total_duration = sum(t['duration_minutes'] for t in structure_data['topics'])
+         # Avoid division by zero
+         total_duration = total_duration if total_duration > 0 else 1
+
+         topic_words = {}
+
+         for topic in structure_data['topics']:
+             ratio = topic['duration_minutes'] / total_duration
+             topic_words[topic['title']] = int(target_words * ratio)
+
+         logger.info(f"Topic word allocations: {topic_words}")
+
+         # Generate content for each topic
+         topic_contents = []
+
+         for topic in structure_data['topics']:
+             topic_target = topic_words[topic['title']]
+
+             # Update context for topic
+             context['current_topic'] = topic['title']
+             if topic['title'] in context['pending_topics']:
+                 context['covered_topics'].append(topic['title'])
+                 context['pending_topics'].remove(topic['title'])
+             context['key_terms'].update(topic['key_concepts'])
+
+             # Generate topic content
+             topic_content = self._generate_section(
+                 f"main_topic_{topic['title']}",
+                 structure_data,
+                 original_text,
+                 topic_target,
+                 include_examples,
+                 context=context,
+                 initial_prompt=initial_prompt
+             )
+
+             topic_contents.append(topic_content)
+             context['current_narrative'] = topic_content[-1000:]
+
+         return "\n\n".join(topic_contents)
+
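The proportional word budgeting used above reduces to a few lines; `allocate_words` is a hypothetical standalone version, and the topic dictionaries follow the structure JSON's shape with illustrative numbers:

```python
def allocate_words(topics: list, target_words: int) -> dict:
    """Split target_words across topics in proportion to their planned minutes."""
    total = sum(t['duration_minutes'] for t in topics) or 1  # guard against zero
    return {t['title']: int(target_words * t['duration_minutes'] / total)
            for t in topics}


topics = [{'title': 'Basics', 'duration_minutes': 10},
          {'title': 'Advanced', 'duration_minutes': 20}]
# allocate_words(topics, 3000) gives Basics a third of the budget, Advanced two thirds
```

Note that `int()` truncation means the per-topic allocations can sum to slightly less than `target_words`; the surrounding word-count validation only logs such deviations rather than raising.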
+     def _validate_coherence(self, content: str, structure_data: Dict):
+         """Validate content coherence against structure"""
+         logger.info("Validating content coherence")
+
+         # Check for learning objectives
+         for objective in structure_data['learning_objectives']:
+             if not any(term.lower() in content.lower() for term in objective.split()):
+                 logger.warning(f"Learning objective not well covered: {objective}")
+
+         # Check for key terms
+         for term in structure_data['key_terms']:
+             if content.lower().count(term.lower()) < 2:
+                 logger.warning(f"Key term underutilized: {term}")
+
+         # Check topic coverage
+         for topic in structure_data['topics']:
+             if not any(concept.lower() in content.lower() for concept in topic['key_concepts']):
+                 logger.warning(f"Topic concepts not well covered: {topic['title']}")
+
+         logger.info("Coherence validation complete")
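The key-term check above is plain case-insensitive substring counting; a minimal standalone version (the name `underused_terms` is hypothetical) makes the behavior easy to verify:

```python
def underused_terms(content: str, key_terms: list, min_mentions: int = 2) -> list:
    """Return key terms mentioned fewer than min_mentions times (case-insensitive)."""
    lowered = content.lower()
    return [term for term in key_terms if lowered.count(term.lower()) < min_mentions]
```

Because this is substring matching, "graphs" counts as a mention of "graph"; that leniency is usually desirable here, but a term like "OS" would also match inside unrelated words.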
src/utils/__init__.py ADDED
File without changes
src/utils/pdf_processor.py ADDED
@@ -0,0 +1,59 @@
+ import PyPDF2
+ from PyPDF2.errors import PdfReadError
+
+ class PDFProcessor:
+     """Handles PDF file processing and text extraction"""
+
+     def __init__(self):
+         """Initialize PDF processor"""
+         pass
+
+     def extract_text(self, pdf_path: str) -> str:
+         """
+         Extract text content from a PDF file
+
+         Args:
+             pdf_path: Path to the PDF file
+
+         Returns:
+             str: Extracted text content
+
+         Raises:
+             FileNotFoundError: If the PDF file doesn't exist
+             PdfReadError: If the PDF file is invalid or corrupted
+         """
+         try:
+             with open(pdf_path, 'rb') as file:
+                 # Create PDF reader object
+                 reader = PyPDF2.PdfReader(file)
+
+                 # Extract text from all pages
+                 text = ""
+                 for page in reader.pages:
+                     text += page.extract_text() + "\n"
+
+                 return text.strip()
+
+         except FileNotFoundError:
+             raise FileNotFoundError(f"PDF file not found: {pdf_path}")
+         except PdfReadError as e:
+             raise PdfReadError(f"Error reading PDF file: {str(e)}") from e
+         except Exception as e:
+             raise RuntimeError(f"Unexpected error processing PDF: {str(e)}") from e
+
+     def get_metadata(self, pdf_path: str) -> dict:
+         """
+         Extract metadata from PDF file
+
+         Args:
+             pdf_path: Path to the PDF file
+
+         Returns:
+             dict: PDF metadata
+         """
+         try:
+             with open(pdf_path, 'rb') as file:
+                 reader = PyPDF2.PdfReader(file)
+                 return reader.metadata
+         except Exception as e:
+             return {"error": str(e)}
src/utils/text_processor.py ADDED
@@ -0,0 +1,84 @@
+ import re
+ from typing import List, Optional
+
+ class TextProcessor:
+     """Handles text preprocessing and cleaning"""
+
+     def __init__(self):
+         """Initialize text processor"""
+         self.sentence_endings = r'[.!?]'
+         self.word_pattern = r'\b\w+\b'
+
+     def clean_text(self, text: str) -> str:
+         """
+         Clean and normalize text
+
+         Args:
+             text: Input text to clean
+
+         Returns:
+             str: Cleaned text
+         """
+         # Remove extra whitespace
+         text = ' '.join(text.split())
+
+         # Fix common OCR errors
+         text = self._fix_ocr_errors(text)
+
+         # Normalize punctuation
+         text = self._normalize_punctuation(text)
+
+         return text.strip()
+
+     def split_into_sections(self, text: str) -> List[str]:
+         """
+         Split text into logical sections based on content
+
+         Args:
+             text: Input text to split
+
+         Returns:
+             List[str]: List of text sections
+         """
+         # Split on double newlines or section markers
+         sections = re.split(r'\n\s*\n|\n(?=[A-Z][^a-z]*:)', text)
+         return [s.strip() for s in sections if s.strip()]
+
+     def count_words(self, text: str) -> int:
+         """
+         Count words in text
+
+         Args:
+             text: Input text
+
+         Returns:
+             int: Word count
+         """
+         words = re.findall(self.word_pattern, text)
+         return len(words)
+
+     def _fix_ocr_errors(self, text: str) -> str:
+         """Fix common OCR errors"""
+         # Digit substitutions use lookarounds so they only apply inside words;
+         # standalone numbers are left untouched
+         replacements = {
+             r'[|]': 'I',                         # Vertical bar to I
+             r'(?<=[A-Za-z])0(?=[A-Za-z])': 'O',  # Zero to O only inside a word
+             r'(?<=[A-Za-z])1(?=[A-Za-z])': 'l',  # One to l only inside a word
+             r'\s+': ' '                          # Multiple spaces to single space
+         }
+
+         for pattern, replacement in replacements.items():
+             text = re.sub(pattern, replacement, text)
+         return text
+
+     def _normalize_punctuation(self, text: str) -> str:
+         """Normalize punctuation marks"""
+         # Replace multiple periods with single period
+         text = re.sub(r'\.{2,}', '.', text)
+
+         # Add space after punctuation if missing
+         text = re.sub(r'([.!?])([A-Z])', r'\1 \2', text)
+
+         # Fix spacing around punctuation
+         text = re.sub(r'\s+([.!?,])', r'\1', text)
+
+         return text
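A quick usage sketch of the helpers above, with the regexes copied from the class (the free-function names are hypothetical stand-ins for the methods):

```python
import re

WORD_PATTERN = r'\b\w+\b'  # same pattern the class stores in self.word_pattern


def count_words(text: str) -> int:
    """Count word tokens the same way TextProcessor.count_words does."""
    return len(re.findall(WORD_PATTERN, text))


def normalize_punctuation(text: str) -> str:
    """Apply the same three passes as TextProcessor._normalize_punctuation."""
    text = re.sub(r'\.{2,}', '.', text)               # collapse runs of periods
    text = re.sub(r'([.!?])([A-Z])', r'\1 \2', text)  # add space after sentence end
    text = re.sub(r'\s+([.!?,])', r'\1', text)        # drop space before punctuation
    return text
```

Note the pass ordering matters: collapsing `..` to `.` first means the space-insertion pass sees a single sentence-ending period, so `"Hello ..World"` comes out as `"Hello. World"`.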