matsuap committed · Commit 792ad00 · verified · 1 Parent(s): 1ef8ad7

Upload folder using huggingface_hub
.dockerignore ADDED
@@ -0,0 +1,16 @@
+ .git
+ .github
+ .env
+ __pycache__
+ *.pyc
+ *.pyo
+ *.pyd
+ .db
+ temp.db
+ .vscode
+ .idea
+ venv
+ .venv
+ node_modules
+ .ipynb_checkpoints
+ *.log
.gitattributes CHANGED
@@ -1,35 +1 @@
- *.7z filter=lfs diff=lfs merge=lfs -text
- *.arrow filter=lfs diff=lfs merge=lfs -text
- *.bin filter=lfs diff=lfs merge=lfs -text
- *.bz2 filter=lfs diff=lfs merge=lfs -text
- *.ckpt filter=lfs diff=lfs merge=lfs -text
- *.ftz filter=lfs diff=lfs merge=lfs -text
- *.gz filter=lfs diff=lfs merge=lfs -text
- *.h5 filter=lfs diff=lfs merge=lfs -text
- *.joblib filter=lfs diff=lfs merge=lfs -text
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
- *.model filter=lfs diff=lfs merge=lfs -text
- *.msgpack filter=lfs diff=lfs merge=lfs -text
- *.npy filter=lfs diff=lfs merge=lfs -text
- *.npz filter=lfs diff=lfs merge=lfs -text
- *.onnx filter=lfs diff=lfs merge=lfs -text
- *.ot filter=lfs diff=lfs merge=lfs -text
- *.parquet filter=lfs diff=lfs merge=lfs -text
- *.pb filter=lfs diff=lfs merge=lfs -text
- *.pickle filter=lfs diff=lfs merge=lfs -text
- *.pkl filter=lfs diff=lfs merge=lfs -text
- *.pt filter=lfs diff=lfs merge=lfs -text
- *.pth filter=lfs diff=lfs merge=lfs -text
- *.rar filter=lfs diff=lfs merge=lfs -text
- *.safetensors filter=lfs diff=lfs merge=lfs -text
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
- *.tar.* filter=lfs diff=lfs merge=lfs -text
- *.tar filter=lfs diff=lfs merge=lfs -text
- *.tflite filter=lfs diff=lfs merge=lfs -text
- *.tgz filter=lfs diff=lfs merge=lfs -text
- *.wasm filter=lfs diff=lfs merge=lfs -text
- *.xz filter=lfs diff=lfs merge=lfs -text
- *.zip filter=lfs diff=lfs merge=lfs -text
- *.zst filter=lfs diff=lfs merge=lfs -text
- *tfevents* filter=lfs diff=lfs merge=lfs -text
+ assets/bgm/*.mp3 filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,6 @@
+ .env
+ .env.example
+
+ __pycache__
+ Temp
+ requirements.md
Dockerfile ADDED
@@ -0,0 +1,42 @@
+ FROM python:3.11-slim
+ 
+ # Install system dependencies
+ # ffmpeg for audio/video processing
+ # poppler-utils for pdf2image
+ # default-libmysqlclient-dev or unixodbc-dev if needed for DB
+ RUN apt-get update && apt-get install -y \
+     ffmpeg \
+     poppler-utils \
+     libvips-dev \
+     curl \
+     git \
+     && rm -rf /var/lib/apt/lists/*
+ 
+ # Set the working directory
+ WORKDIR /code
+ 
+ # Copy requirements file first for better caching
+ COPY requirements.txt .
+ 
+ # Install dependencies
+ RUN pip install --no-cache-dir -r requirements.txt
+ 
+ # Create a non-root user (Hugging Face requirement for security)
+ RUN useradd -m -u 1000 user
+ USER user
+ ENV HOME=/home/user \
+     PATH=/home/user/.local/bin:$PATH
+ 
+ WORKDIR $HOME/app
+ 
+ # Copy the rest of the application code
+ COPY --chown=user . $HOME/app
+ 
+ # Fix permissions for any local storage if needed
+ RUN mkdir -p $HOME/app/assets && chmod 777 $HOME/app/assets
+ 
+ # Use port 7860 as it's the default for Hugging Face Spaces
+ EXPOSE 7860
+ 
+ # Command to run the application
+ CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,10 +1,118 @@
  ---
- title: Creatorstudio Ai Backend
- emoji: 🏆
- colorFrom: purple
- colorTo: red
+ title: CreatorStudio AI Backend
+ emoji: 🚀
+ colorFrom: blue
+ colorTo: indigo
  sdk: docker
  pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # CreatorStudio AI Backend
+ 
+ CreatorStudio AI is a powerful, enterprise-grade AI content generation and study platform. It leverages state-of-the-art Large Language Models (LLMs) and cloud infrastructure to transform documents and media into a wide array of educational and creative content.
+ 
+ ## 🚀 Features
+ 
+ - **🔐 Robust Authentication**: Secure JWT-based authentication system with password hashing and user-scoped data access.
+ - **📂 Source Management**: Integrated AWS S3 file management for seamless document uploads, listing, and storage.
+ - **🧠 Advanced RAG (Retrieval-Augmented Generation)**: Chat with your uploaded documents using Azure AI Search and OpenAI/Gemini, enabling high-precision context-aware interactions.
+ - **🎙️ Podcast Generation**: Automatically transform text and documents into professional podcast scripts and audio segments.
+ - **🎬 Video Generation**: Create engaging video summaries and slide-based videos from static content using MoviePy and FFmpeg.
+ - **📝 Interactive Study Tools**:
+   - **Flashcards**: AI-generated flashcards tailored to your source material.
+   - **Quizzes**: Customizable quizzes with multiple-choice questions, hints, and detailed explanations.
+   - **Mind Maps**: Visualize complex relationships with auto-generated Mermaid.js mind maps.
+ - **📊 Smart Reports**: Generate structured, professional reports and summaries from various source materials.
+ 
+ ## 🛠️ Tech Stack
+ 
+ - **Framework**: FastAPI (Python 3.x)
+ - **Database**: SQLAlchemy ORM with support for relational databases (e.g., MSSQL/PostgreSQL).
+ - **AI Infrastructure**:
+   - **LLMs**: OpenAI GPT-4o, Google Gemini Pro.
+   - **RAG**: Azure AI Search, Azure OpenAI Embeddings.
+ - **Cloud & Processing**:
+   - **Storage**: AWS S3, Azure Blob Storage.
+   - **Media**: MoviePy, FFmpeg, Pydub for audio/video processing.
+   - **Documents**: PyPDF2, pdf2image, Pillow for comprehensive document handling.
+ 
+ ## 📁 Project Structure
+ 
+ ```text
+ CreatorStudio AI/
+ ├── api/                          # FastAPI routers and endpoint logic
+ │   ├── auth.py                   # Authentication & User management
+ │   ├── sources.py                # S3 Source file management
+ │   ├── rag.py                    # Azure RAG indexing and querying
+ │   ├── podcast.py                # Podcast generation endpoints
+ │   ├── flashcards.py             # Flashcard generation logic
+ │   └── ...                       # Quizzes, Mindmaps, Reports, Video Gen
+ ├── core/                         # Core application configuration
+ │   ├── config.py                 # Pydantic settings & Environment management
+ │   ├── database.py               # DB connection & Session management
+ │   ├── prompts.py                # Centralized AI prompt templates
+ │   └── security.py               # JWT & Password hashing utilities
+ ├── models/                       # Data models
+ │   ├── db_models.py              # SQLAlchemy database models (User, Source, RAG, etc.)
+ │   └── schemas.py                # Pydantic request/response schemas for API
+ ├── services/                     # Business logic & 3rd party integrations
+ │   ├── s3_service.py             # AWS S3 integration
+ │   ├── rag_service.py            # Azure AI Search & RAG logic
+ │   ├── podcast_service.py        # Podcast creation & script logic
+ │   ├── video_generator_service.py # Video processing
+ │   └── ...                       # Specialized services for all features
+ ├── main.py                       # Application entry point & Router inclusion
+ └── requirements.txt              # Project dependencies
+ ```
+ 
+ ## ⚙️ Setup & Installation
+ 
+ 1. **Clone the repository**
+ 2. **Create a Virtual Environment**:
+    ```bash
+    python -m venv venv
+    source venv/bin/activate  # On Windows: venv\Scripts\activate
+    ```
+ 3. **Install Dependencies**:
+    ```bash
+    pip install -r requirements.txt
+    ```
+ 4. **Configure Environment Variables**:
+    Create a `.env` file in the root directory based on the following template:
+    ```env
+    # AWS Configuration
+    AWS_ACCESS_KEY_ID=your_aws_key
+    AWS_SECRET_ACCESS_KEY=your_aws_secret
+    AWS_S3_BUCKET=your_bucket_name
+ 
+    # Azure RAG & OpenAI
+    AZURE_SEARCH_ENDPOINT=your_endpoint
+    AZURE_SEARCH_KEY=your_search_key
+    AZURE_OPENAI_API_KEY=your_openai_key
+    AZURE_OPENAI_ENDPOINT=your_openai_endpoint
+ 
+    # LLM Keys
+    OPENAI_API_KEY=your_openai_key
+    GEMINI_API_KEY=your_gemini_key
+ 
+    # Database & Security
+    DATABASE_URL=your_db_connection_string
+    SECRET_KEY=your_jwt_secret_key
+    ```
+ 5. **Run the Server**:
+    ```bash
+    python main.py
+    ```
+    The API will be available at `http://localhost:8000`. Access the interactive documentation at `http://localhost:8000/docs`.
+ 
+ ## 📖 API Documentation
+ 
+ The backend provides a fully interactive Swagger UI at `/docs` for testing and exploration.
+ 
+ - **Auth**: `/api/auth/register`, `/api/auth/login`
+ - **Sources**: `/api/sources/upload`, `/api/sources/list`, `/api/sources/{id}`
+ - **RAG**: `/api/rag/index`, `/api/rag/query`
+ - **Content Generation**: `/api/podcast`, `/api/flashcards`, `/api/mindmaps`, `/api/quizzes`, `/api/reports`, `/api/video_generator`
+ 
+ ---
+ © 2026 CreatorStudio AI Team. All rights reserved.
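
The `.env` template above is consumed by the application's settings layer (`core/config.py`). As a rough, stdlib-only sketch of how such a file maps to key/value settings — the real project presumably uses Pydantic settings, and the parser and sample values here are invented for illustration:

```python
def parse_env(text: str) -> dict:
    """Tiny .env parser: KEY=VALUE lines; '#' comments and blanks are skipped."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        env[key.strip()] = value.strip()
    return env

sample = """
# Database & Security
DATABASE_URL=sqlite:///./temp.db
SECRET_KEY=change-me
"""
cfg = parse_env(sample)
```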
api/__init__.py ADDED
File without changes
api/auth.py ADDED
@@ -0,0 +1,101 @@
+ from typing import Optional
+ from fastapi import APIRouter, Depends, HTTPException, status, Request, Form, Body
+ from fastapi.security import OAuth2PasswordBearer, OAuth2PasswordRequestForm
+ from jose import JWTError, jwt
+ from sqlalchemy.orm import Session
+ from core.security import create_access_token, verify_password, get_password_hash
+ from core.config import settings
+ from core.database import get_db
+ from models.schemas import UserCreate, Token, TokenData, UserResponse, UserLogin
+ from models import db_models
+ 
+ router = APIRouter(prefix="/api/auth", tags=["auth"])
+ 
+ oauth2_scheme = OAuth2PasswordBearer(tokenUrl="/api/auth/login")
+ 
+ async def get_current_user(token: str = Depends(oauth2_scheme), db: Session = Depends(get_db)):
+     credentials_exception = HTTPException(
+         status_code=status.HTTP_401_UNAUTHORIZED,
+         detail="Could not validate credentials",
+         headers={"WWW-Authenticate": "Bearer"},
+     )
+     try:
+         payload = jwt.decode(token, settings.SECRET_KEY, algorithms=[settings.ALGORITHM])
+         email: str = payload.get("sub")
+         if email is None:
+             raise credentials_exception
+         token_data = TokenData(email=email)
+     except JWTError:
+         raise credentials_exception
+ 
+     user = db.query(db_models.User).filter(db_models.User.email == token_data.email).first()
+     if user is None:
+         raise credentials_exception
+     return user
+ 
+ @router.post("/register", response_model=UserResponse)
+ async def register(user_in: UserCreate, db: Session = Depends(get_db)):
+     db_user = db.query(db_models.User).filter(db_models.User.email == user_in.email).first()
+     if db_user:
+         raise HTTPException(
+             status_code=400,
+             detail="The user with this email already exists in the system.",
+         )
+ 
+     hashed_password = get_password_hash(user_in.password)
+     new_user = db_models.User(
+         email=user_in.email,
+         hashed_password=hashed_password,
+         is_active=True
+     )
+     db.add(new_user)
+     db.commit()
+     db.refresh(new_user)
+     return new_user
+ 
+ @router.post("/login", response_model=Token)
+ async def login(
+     request: Request,
+     email: Optional[str] = Body(None),
+     password: Optional[str] = Body(None),
+     username: Optional[str] = Form(None),
+     password_form: Optional[str] = Form(None, alias="password"),
+     db: Session = Depends(get_db)):
+     """
+     Unified Login:
+     - For Web App: Send JSON {"email": "...", "password": "..."}
+     - For Swagger Popup: Enter Email in 'username' box.
+     """
+     final_email = email or username
+     final_password = password or password_form
+ 
+     if not final_email:
+         try:
+             if "application/json" in request.headers.get("content-type", ""):
+                 body = await request.json()
+                 final_email = body.get("email")
+                 final_password = body.get("password")
+             else:
+                 form_data = await request.form()
+                 final_email = form_data.get("username") or form_data.get("email")
+                 final_password = form_data.get("password")
+         except Exception:
+             pass
+ 
+     if not final_email or not final_password:
+         raise HTTPException(
+             status_code=422,
+             detail="Email and password are required. (In Swagger Popup, put email in 'username' box)"
+         )
+ 
+     user = db.query(db_models.User).filter(db_models.User.email == final_email).first()
+     if not user or not verify_password(final_password, user.hashed_password):
+         raise HTTPException(
+             status_code=status.HTTP_401_UNAUTHORIZED,
+             detail="Incorrect email or password",
+             headers={"WWW-Authenticate": "Bearer"},
+         )
+ 
+     access_token = create_access_token(data={"sub": user.email})
+     return {"access_token": access_token, "token_type": "bearer"}
+ 
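
For readers unfamiliar with the token format used above: `create_access_token` and `jwt.decode` delegate to `python-jose`, but the underlying HS256 JWT mechanics can be sketched with the standard library alone. The secret and claims below are placeholders, and this sketch omits claims such as `exp` that a production token would carry:

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = "example-secret"  # stand-in for settings.SECRET_KEY

def _b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def encode_hs256(payload: dict) -> str:
    """Build a signed token the way jose.jwt.encode(..., algorithm='HS256') would."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = _b64url(hmac.new(SECRET_KEY.encode(), signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def decode_hs256(token: str) -> dict:
    """Verify the signature and return the claims, like jose.jwt.decode."""
    header, body, sig = token.split(".")
    signing_input = f"{header}.{body}".encode()
    expected = _b64url(hmac.new(SECRET_KEY.encode(), signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("Invalid signature")
    padded = body + "=" * (-len(body) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))

token = encode_hs256({"sub": "user@example.com"})
claims = decode_hs256(token)
```

In the router above, the `"sub"` claim recovered here is what `get_current_user` looks up in the users table.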
api/chat.py ADDED
@@ -0,0 +1,141 @@
+ from fastapi import APIRouter, Depends, HTTPException
+ from sqlalchemy.orm import Session
+ from typing import List
+ import logging
+ from openai import OpenAI
+ 
+ from core.database import get_db
+ from models import db_models, schemas
+ from api.auth import get_current_user
+ from services.rag_service import rag_service
+ from core.config import settings
+ 
+ router = APIRouter(prefix="/api/chat", tags=["AI Conversation"])
+ logger = logging.getLogger(__name__)
+ 
+ @router.get("/history", response_model=List[schemas.ChatMessageResponse])
+ async def get_chat_history(
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Retrieves the full AI conversation history for the current user.
+     """
+     messages = db.query(db_models.ChatMessage).filter(
+         db_models.ChatMessage.user_id == current_user.id
+     ).order_by(db_models.ChatMessage.created_at.asc()).all()
+     return messages
+ 
+ @router.delete("/history")
+ async def clear_chat_history(
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Wipes the conversation history clean (Fresh Start).
+     """
+     db.query(db_models.ChatMessage).filter(
+         db_models.ChatMessage.user_id == current_user.id
+     ).delete()
+     db.commit()
+     return {"message": "All AI conversation history has been cleared."}
+ 
+ @router.post("/query", response_model=schemas.ChatMessageResponse)
+ async def ask_ai(
+     message_in: schemas.ChatMessageCreate,
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Unified AI Endpoint:
+     - Use this for general chat.
+     - Use this for PDF/Document specific questions (by providing rag_doc_id).
+ 
+     It automatically manages conversation history and RAG context retrieval.
+     """
+     try:
+         openai_client = OpenAI(api_key=settings.OPENAI_API_KEY)
+ 
+         # 1. Load the last 10 messages of conversation history
+         history = db.query(db_models.ChatMessage).filter(
+             db_models.ChatMessage.user_id == current_user.id
+         ).order_by(db_models.ChatMessage.id.desc()).limit(10).all()
+         history.reverse()  # Sort to chronological [oldest -> newest]
+ 
+         # 2. Save current user query to database
+         user_msg = db_models.ChatMessage(
+             user_id=current_user.id,
+             role="user",
+             content=message_in.query,
+             rag_doc_id=message_in.rag_doc_id
+         )
+         db.add(user_msg)
+         db.commit()
+ 
+         # 3. Context Retrieval (RAG)
+         context = ""
+         doc_filename = ""
+         if message_in.rag_doc_id:
+             rag_doc = db.query(db_models.RAGDocument).filter(
+                 db_models.RAGDocument.id == message_in.rag_doc_id,
+                 db_models.RAGDocument.user_id == current_user.id
+             ).first()
+             if rag_doc:
+                 doc_filename = rag_doc.filename
+                 results = rag_service.search_document(
+                     query=message_in.query,
+                     doc_id=rag_doc.azure_doc_id,
+                     user_id=current_user.id,
+                     top_k=5
+                 )
+                 context = "\n\n".join([r["content"] for r in results])
+ 
+         # 4. Build LLM Messages
+         llm_messages = [
+             {
+                 "role": "system",
+                 "content": (
+                     "You are a helpful AI assistant on the CreatorStudio platform. "
+                     "Use the provided conversation history and document context to answer the user. "
+                     "If the user refers to 'last message' or 'previous context', look at the history provided below."
+                 )
+             }
+         ]
+ 
+         # Add past messages (conversation history)
+         for msg in history:
+             llm_messages.append({"role": msg.role, "content": msg.content})
+ 
+         # Add RAG Knowledge if available
+         if context:
+             llm_messages.append({
+                 "role": "system",
+                 "content": f"REFERENTIAL KNOWLEDGE FROM DOCUMENT '{doc_filename}':\n\n{context}"
+             })
+ 
+         # Add current user query
+         llm_messages.append({"role": "user", "content": message_in.query})
+ 
+         # 5. Get AI Response
+         response = openai_client.chat.completions.create(
+             model="gpt-4o-mini",
+             messages=llm_messages,
+             temperature=0.7
+         )
+         ai_response_text = response.choices[0].message.content
+ 
+         # 6. Save assistant response to database
+         assistant_msg = db_models.ChatMessage(
+             user_id=current_user.id,
+             role="assistant",
+             content=ai_response_text,
+             rag_doc_id=message_in.rag_doc_id
+         )
+         db.add(assistant_msg)
+         db.commit()
+         db.refresh(assistant_msg)
+ 
+         return assistant_msg
+ 
+     except Exception as e:
+         logger.error(f"Unified AI Query failed: {e}")
+         raise HTTPException(status_code=500, detail=f"AI Error: {str(e)}")
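
The message-assembly steps in `ask_ai` (system prompt, chronological history, optional document context, then the new query) are easiest to see as a pure function. This is an illustrative sketch, not code from the repository, and the sample messages are invented:

```python
def build_llm_messages(history, query, context="", doc_filename=""):
    """Mirror the assembly order in /api/chat/query: system prompt first,
    then chronological history, optional RAG context, then the new query."""
    messages = [{
        "role": "system",
        "content": "You are a helpful AI assistant on the CreatorStudio platform."
    }]
    # History arrives newest-first from the DB query; replay it oldest-first.
    for role, content in reversed(history):
        messages.append({"role": role, "content": content})
    if context:
        messages.append({
            "role": "system",
            "content": f"REFERENTIAL KNOWLEDGE FROM DOCUMENT '{doc_filename}':\n\n{context}"
        })
    messages.append({"role": "user", "content": query})
    return messages

msgs = build_llm_messages(
    history=[("assistant", "Hi!"), ("user", "Hello")],  # newest first
    query="Summarize page 2",
    context="Page 2 covers pricing.",
    doc_filename="report.pdf",
)
```

Keeping the RAG excerpt in a second system message, rather than appending it to the user turn, keeps the retrieved text clearly separated from what the user actually typed.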
api/flashcards.py ADDED
@@ -0,0 +1,166 @@
+ import logging
+ from fastapi import APIRouter, Depends, HTTPException
+ from sqlalchemy.orm import Session
+ from typing import List, Optional
+ 
+ from api.auth import get_current_user
+ from models import db_models
+ from models.schemas import FlashcardGenerateRequest, FlashcardSetResponse, FlashcardResponse
+ from core.database import get_db
+ from services.flashcard_service import flashcard_service
+ from core import constants
+ 
+ router = APIRouter(prefix="/api/flashcards", tags=["flashcards"])
+ logger = logging.getLogger(__name__)
+ 
+ @router.get("/config")
+ async def get_flashcard_config():
+     """Returns available difficulties, quantities, and languages for flashcards."""
+     return {
+         "difficulties": constants.DIFFICULTIES,
+         "quantities": constants.FLASHCARD_QUANTITIES,
+         "languages": constants.LANGUAGES
+     }
+ 
+ @router.post("/generate", response_model=FlashcardSetResponse)
+ async def generate_flashcards(
+     request: FlashcardGenerateRequest,
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Generates a set of flashcards and saves them to the database.
+     """
+     try:
+         source_id = None
+         if request.file_key:
+             # Verify file ownership
+             source = db.query(db_models.Source).filter(
+                 db_models.Source.s3_key == request.file_key,
+                 db_models.Source.user_id == current_user.id
+             ).first()
+             if not source:
+                 raise HTTPException(status_code=403, detail="Not authorized to access this file")
+             source_id = source.id
+ 
+         # 1. Generate Flashcards from AI
+         cards_data = await flashcard_service.generate_flashcards(
+             file_key=request.file_key,
+             text_input=request.text_input,
+             difficulty=request.difficulty,
+             quantity=request.quantity,
+             topic=request.topic,
+             language=request.language
+         )
+ 
+         if not cards_data:
+             raise HTTPException(status_code=500, detail="Failed to generate flashcards")
+ 
+         # 2. Save Flashcard Set to DB
+         title = request.topic if request.topic else f"Flashcards {len(cards_data)}"
+         db_set = db_models.FlashcardSet(
+             title=title,
+             difficulty=request.difficulty,
+             user_id=current_user.id,
+             source_id=source_id
+         )
+         db.add(db_set)
+         db.commit()
+         db.refresh(db_set)
+ 
+         # 3. Save individual flashcards
+         for item in cards_data:
+             db_card = db_models.Flashcard(
+                 flashcard_set_id=db_set.id,
+                 question=item.get("question", ""),
+                 answer=item.get("answer", "")
+             )
+             db.add(db_card)
+ 
+         db.commit()
+         db.refresh(db_set)
+ 
+         return db_set
+ 
+     except HTTPException:
+         raise
+     except Exception as e:
+         logger.error(f"Flashcard generation endpoint failed: {e}")
+         raise HTTPException(status_code=500, detail=str(e))
+ 
+ @router.get("/sets", response_model=List[FlashcardSetResponse])
+ async def list_flashcard_sets(
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Lists all flashcard sets for the current user.
+     """
+     try:
+         sets = db.query(db_models.FlashcardSet).filter(
+             db_models.FlashcardSet.user_id == current_user.id
+         ).order_by(db_models.FlashcardSet.created_at.desc()).all()
+         return sets
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
+ 
+ @router.get("/set/{set_id}", response_model=FlashcardSetResponse)
+ async def get_flashcard_set(
+     set_id: int,
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Retrieves a specific flashcard set.
+     """
+     db_set = db.query(db_models.FlashcardSet).filter(
+         db_models.FlashcardSet.id == set_id,
+         db_models.FlashcardSet.user_id == current_user.id
+     ).first()
+ 
+     if not db_set:
+         raise HTTPException(status_code=404, detail="Flashcard set not found")
+ 
+     return db_set
+ 
+ @router.post("/explain")
+ async def explain_flashcard(
+     question: str,
+     file_key: Optional[str] = None,
+     language: str = "English",
+     current_user: db_models.User = Depends(get_current_user)
+ ):
+     """
+     Provides a detailed explanation for a specific question.
+     """
+     try:
+         explanation = await flashcard_service.generate_explanation(
+             question=question,
+             file_key=file_key,
+             language=language
+         )
+         return {"explanation": explanation}
+     except Exception as e:
+         logger.error(f"Explanation failed: {e}")
+         raise HTTPException(status_code=500, detail=str(e))
+ 
+ @router.delete("/set/{set_id}")
+ async def delete_flashcard_set(
+     set_id: int,
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Deletes a specific flashcard set and all its cards.
+     """
+     db_set = db.query(db_models.FlashcardSet).filter(
+         db_models.FlashcardSet.id == set_id,
+         db_models.FlashcardSet.user_id == current_user.id
+     ).first()
+ 
+     if not db_set:
+         raise HTTPException(status_code=404, detail="Flashcard set not found")
+ 
+     db.delete(db_set)
+     db.commit()
+     return {"message": "Flashcard set and all associated cards deleted successfully"}
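
The persistence step in `/generate` above applies two small defaulting rules: the set title falls back to a count-based name, and missing card keys become empty strings. A sketch of just that logic (hypothetical helper, not part of the repository):

```python
def prepare_flashcard_rows(cards_data, topic=None):
    """Mirror /api/flashcards/generate persistence: derive a set title, then
    map each AI-generated item to (question, answer) with safe defaults."""
    title = topic if topic else f"Flashcards {len(cards_data)}"
    rows = [(item.get("question", ""), item.get("answer", "")) for item in cards_data]
    return title, rows

title, rows = prepare_flashcard_rows(
    [{"question": "What is RAG?", "answer": "Retrieval-Augmented Generation"}, {}],
)
```

The `.get(..., "")` defaults matter because the card list comes back from an LLM and individual items may be missing keys.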
api/mindmaps.py ADDED
@@ -0,0 +1,111 @@
+ import logging
+ from fastapi import APIRouter, Depends, HTTPException
+ from sqlalchemy.orm import Session
+ from typing import List
+ 
+ from api.auth import get_current_user
+ from models import db_models
+ from models.schemas import MindMapGenerateRequest, MindMapResponse
+ from core.database import get_db
+ from services.mindmap_service import mindmap_service
+ 
+ router = APIRouter(prefix="/api/mindmaps", tags=["mindmaps"])
+ logger = logging.getLogger(__name__)
+ 
+ @router.post("/generate", response_model=MindMapResponse)
+ async def generate_mindmap(
+     request: MindMapGenerateRequest,
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Generates a mind map in Mermaid format and saves it to the database.
+     """
+     try:
+         source_id = None
+         if request.file_key:
+             # Verify file ownership
+             source = db.query(db_models.Source).filter(
+                 db_models.Source.s3_key == request.file_key,
+                 db_models.Source.user_id == current_user.id
+             ).first()
+             if not source:
+                 raise HTTPException(status_code=403, detail="Not authorized to access this file")
+             source_id = source.id
+ 
+         # 1. Generate Mind Map from AI
+         mermaid_code = await mindmap_service.generate_mindmap(
+             file_key=request.file_key,
+             text_input=request.text_input
+         )
+ 
+         if not mermaid_code:
+             raise HTTPException(status_code=500, detail="Failed to generate mind map")
+ 
+         # 2. Save to DB
+         title = request.title if request.title else (request.file_key.split('/')[-1] if request.file_key else "Untitled Mind Map")
+         db_mindmap = db_models.MindMap(
+             title=title,
+             mermaid_code=mermaid_code,
+             user_id=current_user.id,
+             source_id=source_id
+         )
+         db.add(db_mindmap)
+         db.commit()
+         db.refresh(db_mindmap)
+ 
+         return MindMapResponse(
+             title=db_mindmap.title,
+             mermaid_code=db_mindmap.mermaid_code,
+             message="Mind map generated successfully"
+         )
+ 
+     except HTTPException:
+         raise
+     except Exception as e:
+         logger.error(f"Mind map generation endpoint failed: {e}")
+         raise HTTPException(status_code=500, detail=str(e))
+ 
+ @router.get("/list", response_model=List[MindMapResponse])
+ async def list_mindmaps(
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Lists all mind maps for the current user.
+     """
+     try:
+         mindmaps = db.query(db_models.MindMap).filter(
+             db_models.MindMap.user_id == current_user.id
+         ).order_by(db_models.MindMap.created_at.desc()).all()
+ 
+         return [
+             MindMapResponse(
+                 title=m.title,
+                 mermaid_code=m.mermaid_code,
+                 message="Retrieved successfully"
+             ) for m in mindmaps
+         ]
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
+ 
+ @router.delete("/{mindmap_id}")
+ async def delete_mindmap(
+     mindmap_id: int,
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Deletes a specific mind map.
+     """
+     mindmap = db.query(db_models.MindMap).filter(
+         db_models.MindMap.id == mindmap_id,
+         db_models.MindMap.user_id == current_user.id
+     ).first()
+ 
+     if not mindmap:
+         raise HTTPException(status_code=404, detail="Mind map not found")
+ 
+     db.delete(mindmap)
+     db.commit()
+     return {"message": "Mind map deleted successfully"}
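
The title fallback in `/generate` above picks, in order: the request's explicit title, the basename of `file_key`, or a generic placeholder. Sketched as a standalone helper (illustrative only, not code from the repository):

```python
def derive_mindmap_title(title=None, file_key=None):
    """Same precedence as /api/mindmaps/generate: explicit title wins,
    then the uploaded file's basename, then a generic fallback."""
    if title:
        return title
    if file_key:
        return file_key.split("/")[-1]  # S3 keys use '/' separators
    return "Untitled Mind Map"
```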
api/podcast.py ADDED
@@ -0,0 +1,207 @@
+ import os
+ import logging
+ from datetime import datetime
+ from fastapi import APIRouter, Depends, HTTPException
+ from sqlalchemy.orm import Session
+ from typing import Dict, List
+ 
+ from api.auth import get_current_user
+ from models.schemas import PodcastAnalyzeRequest, PodcastGenerateRequest
+ from models import db_models
+ from core.database import get_db
+ from services.podcast_service import podcast_service
+ from services.s3_service import s3_service
+ from core import constants
+ 
+ router = APIRouter(prefix="/api/podcast", tags=["podcast"])
+ logger = logging.getLogger(__name__)
+ 
+ @router.get("/config")
+ async def get_podcast_config():
+     """Returns available voices, BGM, and formats for podcast generation."""
+     return {
+         "voices": constants.PODCAST_VOICES,
+         "bgm": constants.PODCAST_BGM,
+         "formats": constants.PODCAST_FORMATS
+     }
+ 
+ @router.post("/analyze")
+ async def analyze_source(
+     request: PodcastAnalyzeRequest,
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)):
+     """
+     Analyzes a source file from S3 and proposes podcast structures.
+     """
+     try:
+         # Verify file ownership via DB
+         source = db.query(db_models.Source).filter(
+             db_models.Source.s3_key == request.file_key,
+             db_models.Source.user_id == current_user.id
+         ).first()
+ 
+         if not source:
+             raise HTTPException(status_code=403, detail="Not authorized to access this file or file does not exist")
+ 
+         analysis = await podcast_service.analyze_pdf(
+             file_key=request.file_key,
+             duration_minutes=request.duration_minutes
+         )
+         return {"analysis": analysis}
+     except HTTPException:
+         raise
+     except Exception as e:
+         logger.error(f"Analysis failed: {e}")
+         raise HTTPException(status_code=500, detail=str(e))
+ 
+ @router.post("/generate")
+ async def generate_podcast(
+     request: PodcastGenerateRequest,
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Generates a podcast script and then the audio.
+     Saves metadata to DB and returns the generated info.
+     """
+     try:
+         # 1. Verify file ownership if provided
+         if request.file_key:
+             source = db.query(db_models.Source).filter(
+                 db_models.Source.s3_key == request.file_key,
+                 db_models.Source.user_id == current_user.id
+             ).first()
+             if not source:
+                 raise HTTPException(status_code=403, detail="Not authorized to access this file")
+ 
+         # 2. Generate Script
+         script = await podcast_service.generate_script(
+             user_prompt=request.user_prompt,
+             model=request.model,
+             duration_minutes=request.duration_minutes,
+             podcast_format=request.podcast_format,
+             pdf_suggestions=request.pdf_suggestions,
+             file_key=request.file_key
+         )
+ 
+         if not script:
+             raise HTTPException(status_code=500, detail="Failed to generate script")
+ 
+         # 3. Generate Audio
+         audio_path = await podcast_service.generate_full_audio(
+             script=script,
+             tts_model=request.tts_model,
+             spk1_voice=request.spk1_voice,
+             spk2_voice=request.spk2_voice,
+             temperature=request.temperature,
+             bgm_choice=request.bgm_choice
+         )
+ 
+         if not audio_path:
+             raise HTTPException(status_code=500, detail="Failed to generate audio")
+ 
+         # 4. Upload to S3
+         filename = os.path.basename(audio_path)
+         with open(audio_path, "rb") as f:
+             content = f.read()
+ 
+         s3_key = f"users/{current_user.id}/outputs/podcasts/{filename}"
+ 
+         import boto3
+         from core.config import settings
+         s3_client = boto3.client('s3',
+             aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
+             aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
+             region_name=settings.AWS_REGION)
+         s3_client.put_object(Bucket=settings.AWS_S3_BUCKET, Key=s3_key, Body=content)
+ 
+         public_url = s3_service.get_public_url(s3_key)
+         private_url = s3_service.get_presigned_url(s3_key)
+ 
+         # 5. Save to DB
+         db_podcast = db_models.Podcast(
+             title=f"Podcast {datetime.utcnow().strftime('%Y-%m-%d %H:%M')}",
+             s3_key=s3_key,
+             s3_url=public_url,
+             script=script,
+             user_id=current_user.id
+         )
+         db.add(db_podcast)
+         db.commit()
+         db.refresh(db_podcast)
+ 
+         # Clean up local file
+         os.remove(audio_path)
+ 
+         return {
+             "id": db_podcast.id,
+             "message": "Podcast generated successfully",
+             "script": script,
+             "public_url": public_url,
+             "private_url": private_url
+         }
+ 
+     except HTTPException:
+         raise
+     except Exception as e:
+         logger.error(f"Podcast generation failed: {e}")
+         raise HTTPException(status_code=500, detail=str(e))
+ 
+ @router.get("/list")
+ async def list_podcasts(
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Lists all podcasts for the current user.
+     """
+     try:
+         podcasts = db.query(db_models.Podcast).filter(
+             db_models.Podcast.user_id == current_user.id
+         ).order_by(db_models.Podcast.created_at.desc()).all()
+ 
+         return [
+             {
+                 "id": p.id,
+                 "title": p.title,
+                 "s3_key": p.s3_key,
+                 "public_url": p.s3_url,
+                 "private_url": s3_service.get_presigned_url(p.s3_key),
170
+ "script_preview": (p.script[:200] + "...") if p.script else "",
171
+ "created_at": p.created_at
172
+ }
173
+ for p in podcasts
174
+ ]
175
+ except Exception as e:
176
+ raise HTTPException(status_code=500, detail=str(e))
177
+
178
+ @router.delete("/{podcast_id}")
179
+ async def delete_podcast(
180
+ podcast_id: int,
181
+ current_user: db_models.User = Depends(get_current_user),
182
+ db: Session = Depends(get_db)
183
+ ):
184
+ """
185
+ Deletes a specific podcast from database and S3.
186
+ """
187
+ podcast = db.query(db_models.Podcast).filter(
188
+ db_models.Podcast.id == podcast_id,
189
+ db_models.Podcast.user_id == current_user.id
190
+ ).first()
191
+
192
+ if not podcast:
193
+ raise HTTPException(status_code=404, detail="Podcast not found")
194
+
195
+ try:
196
+ # 1. Delete from S3
197
+ await s3_service.delete_file(podcast.s3_key)
198
+
199
+ # 2. Delete from DB
200
+ db.delete(podcast)
201
+ db.commit()
202
+
203
+ return {"message": "Podcast and associated audio file deleted successfully"}
204
+ except Exception as e:
205
+ db.rollback()
206
+ logger.error(f"Failed to delete podcast: {e}")
207
+ raise HTTPException(status_code=500, detail=f"Deletion failed: {str(e)}")
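The `/list` endpoint trims each stored script to a 200-character preview. A minimal standalone sketch of that truncation rule (the helper name `script_preview` is ours, not the API's):

```python
def script_preview(script, limit=200):
    # Mirrors the /list endpoint: truncate stored scripts for the listing,
    # returning "" when no script is present. Note the ellipsis is appended
    # whenever a script exists, even if it is shorter than the limit.
    return (script[:limit] + "...") if script else ""

preview = script_preview("a" * 500)
```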
api/quizzes.py ADDED
@@ -0,0 +1,146 @@
+import logging
+from fastapi import APIRouter, Depends, HTTPException
+from sqlalchemy.orm import Session
+from typing import List
+
+from api.auth import get_current_user
+from models import db_models
+from models.schemas import QuizGenerateRequest, QuizSetResponse
+from core.database import get_db
+from services.quiz_service import quiz_service
+from core import constants
+
+router = APIRouter(prefix="/api/quizzes", tags=["quizzes"])
+logger = logging.getLogger(__name__)
+
+@router.get("/config")
+async def get_quiz_config():
+    """Returns available difficulties, count options, and languages for quizzes."""
+    return {
+        "difficulties": constants.DIFFICULTIES,
+        "counts": constants.QUIZ_COUNTS,
+        "languages": constants.LANGUAGES
+    }
+
+@router.post("/generate", response_model=QuizSetResponse)
+async def generate_quiz(
+    request: QuizGenerateRequest,
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    """
+    Generates a set of quiz questions and saves them to the database.
+    """
+    try:
+        source_id = None
+        if request.file_key:
+            source = db.query(db_models.Source).filter(
+                db_models.Source.s3_key == request.file_key,
+                db_models.Source.user_id == current_user.id
+            ).first()
+            if not source:
+                raise HTTPException(status_code=403, detail="Not authorized to access this file")
+            source_id = source.id
+
+        # 1. Generate Quiz from AI
+        quizzes_data = await quiz_service.generate_quiz(
+            file_key=request.file_key,
+            text_input=request.text_input,
+            difficulty=request.difficulty,
+            topic=request.topic,
+            language=request.language,
+            count_mode=request.count
+        )
+
+        if not quizzes_data:
+            raise HTTPException(status_code=500, detail="Failed to generate quiz")
+
+        # 2. Save Quiz Set
+        title = request.topic if request.topic else f"Quiz {len(quizzes_data)}"
+        db_set = db_models.QuizSet(
+            title=title,
+            difficulty=request.difficulty,
+            user_id=current_user.id,
+            source_id=source_id
+        )
+        db.add(db_set)
+        db.commit()
+        db.refresh(db_set)
+
+        # 3. Save Questions
+        for item in quizzes_data:
+            db_question = db_models.QuizQuestion(
+                quiz_set_id=db_set.id,
+                question=item.get("question", ""),
+                hint=item.get("hint", ""),
+                choices=item.get("choices", {}),
+                answer=item.get("answer", "1"),
+                explanation=item.get("explanation", "")
+            )
+            db.add(db_question)
+
+        db.commit()
+        db.refresh(db_set)
+
+        return db_set
+
+    except HTTPException:
+        raise
+    except Exception as e:
+        logger.error(f"Quiz generation endpoint failed: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
+@router.get("/sets", response_model=List[QuizSetResponse])
+async def list_quiz_sets(
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    """
+    Lists all quiz sets for the current user.
+    """
+    try:
+        sets = db.query(db_models.QuizSet).filter(
+            db_models.QuizSet.user_id == current_user.id
+        ).order_by(db_models.QuizSet.created_at.desc()).all()
+        return sets
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+@router.get("/set/{set_id}", response_model=QuizSetResponse)
+async def get_quiz_set(
+    set_id: int,
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    """
+    Retrieves a specific quiz set.
+    """
+    db_set = db.query(db_models.QuizSet).filter(
+        db_models.QuizSet.id == set_id,
+        db_models.QuizSet.user_id == current_user.id
+    ).first()
+
+    if not db_set:
+        raise HTTPException(status_code=404, detail="Quiz set not found")
+
+    return db_set
+
+@router.delete("/set/{set_id}")
+async def delete_quiz_set(
+    set_id: int,
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)):
+    """
+    Deletes a specific quiz set and all its questions.
+    """
+    db_set = db.query(db_models.QuizSet).filter(
+        db_models.QuizSet.id == set_id,
+        db_models.QuizSet.user_id == current_user.id
+    ).first()
+
+    if not db_set:
+        raise HTTPException(status_code=404, detail="Quiz set not found")
+
+    db.delete(db_set)
+    db.commit()
+    return {"message": "Quiz set and all associated questions deleted successfully"}
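The insert loop in `/generate` reads each AI-generated item defensively with `.get()` defaults, so a partially filled item cannot break the save. A small illustration with a hypothetical item:

```python
# Hypothetical quiz item as the AI service might return it; the .get()
# defaults mirror the endpoint's insert loop in /generate.
item = {"question": "What is 2 + 2?", "choices": {"1": "3", "2": "4"}, "answer": "2"}

row = {
    "question": item.get("question", ""),
    "hint": item.get("hint", ""),            # missing key -> empty hint
    "choices": item.get("choices", {}),
    "answer": item.get("answer", "1"),       # missing key -> default choice "1"
    "explanation": item.get("explanation", ""),
}
```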
api/rag.py ADDED
@@ -0,0 +1,280 @@
+from fastapi import APIRouter, Depends, HTTPException
+from sqlalchemy.orm import Session
+from pydantic import BaseModel
+from typing import List, Optional
+import logging
+import PyPDF2
+import io
+import uuid
+
+from core.database import get_db
+from models import db_models
+from services.rag_service import rag_service
+from services.s3_service import s3_service
+from api.auth import get_current_user
+from core.config import settings
+from openai import OpenAI
+
+router = APIRouter(prefix="/api/rag", tags=["RAG Document Management"])
+logger = logging.getLogger(__name__)
+
+# Request/Response Models
+class RAGIndexRequest(BaseModel):
+    file_key: str  # S3 key of source file to index
+
+class RAGIndexResponse(BaseModel):
+    id: int
+    filename: str
+    azure_doc_id: str
+    chunk_count: int
+    message: str
+
+class RAGDocumentResponse(BaseModel):
+    id: int
+    filename: str
+    azure_doc_id: str
+    chunk_count: int
+    source_id: Optional[int]
+    created_at: str
+
+    class Config:
+        from_attributes = True
+
+def extract_text_from_pdf(file_content: bytes) -> str:
+    """Extract text from PDF file."""
+    try:
+        pdf_reader = PyPDF2.PdfReader(io.BytesIO(file_content))
+        text = ""
+        for page in pdf_reader.pages:
+            text += page.extract_text() + "\n"
+        return text.strip()
+    except Exception as e:
+        logger.error(f"Error extracting PDF text: {e}")
+        raise HTTPException(status_code=400, detail=f"Failed to extract text: {str(e)}")
+
+def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
+    """Split text into overlapping chunks."""
+    chunks = []
+    start = 0
+    while start < len(text):
+        end = start + chunk_size
+        chunks.append(text[start:end])
+        start += (chunk_size - overlap)
+    return chunks
+
+@router.post("/index", response_model=RAGIndexResponse)
+async def index_document(
+    request: RAGIndexRequest,
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)):
+    """
+    Index a document for AI search (one-time operation).
+    Downloads from S3, extracts text, generates embeddings, stores in Azure Search.
+    """
+    try:
+        # 1. Verify file ownership
+        source = db.query(db_models.Source).filter(
+            db_models.Source.s3_key == request.file_key,
+            db_models.Source.user_id == current_user.id
+        ).first()
+
+        if not source:
+            raise HTTPException(status_code=404, detail="File not found")
+
+        # 2. Check if already indexed
+        existing = db.query(db_models.RAGDocument).filter(
+            db_models.RAGDocument.source_id == source.id,
+            db_models.RAGDocument.user_id == current_user.id
+        ).first()
+
+        if existing:
+            return RAGIndexResponse(
+                id=existing.id,
+                filename=existing.filename,
+                azure_doc_id=existing.azure_doc_id,
+                chunk_count=existing.chunk_count,
+                message="Document already indexed"
+            )
+
+        # 3. Download from S3
+        logger.info(f"Downloading {request.file_key}...")
+
+        # Create temp local path
+        import tempfile
+        import os
+        with tempfile.NamedTemporaryFile(delete=False, suffix=os.path.splitext(source.filename)[1]) as tmp:
+            temp_file = tmp.name
+
+        s3_service.s3_client.download_file(
+            settings.AWS_S3_BUCKET,
+            request.file_key,
+            temp_file
+        )
+
+        # 4. Extract text
+        try:
+            with open(temp_file, "rb") as f:
+                file_content = f.read()
+
+            if source.filename.lower().endswith('.pdf'):
+                text = extract_text_from_pdf(file_content)
+            elif source.filename.lower().endswith('.txt'):
+                text = file_content.decode('utf-8')
+            else:
+                raise HTTPException(status_code=400, detail="Only PDF and TXT supported")
+
+            if len(text) < 10:
+                raise HTTPException(status_code=400, detail="No readable text content found in file")
+
+            # 5. Chunk text
+            chunks = chunk_text(text)
+            logger.info(f"Created {len(chunks)} chunks")
+
+            # 6. Generate doc ID and index in Azure Search
+            doc_id = str(uuid.uuid4())
+            chunk_count = rag_service.index_document(
+                chunks=chunks,
+                filename=source.filename,
+                user_id=current_user.id,
+                doc_id=doc_id
+            )
+
+            # 7. Save to database
+            rag_doc = db_models.RAGDocument(
+                filename=source.filename,
+                azure_doc_id=doc_id,
+                chunk_count=chunk_count,
+                user_id=current_user.id,
+                source_id=source.id
+            )
+            db.add(rag_doc)
+            db.commit()
+            db.refresh(rag_doc)
+
+            logger.info(f"Successfully indexed {source.filename}")
+
+            return RAGIndexResponse(
+                id=rag_doc.id,
+                filename=rag_doc.filename,
+                azure_doc_id=rag_doc.azure_doc_id,
+                chunk_count=rag_doc.chunk_count,
+                message="Document indexed successfully for AI conversation"
+            )
+        finally:
+            if os.path.exists(temp_file):
+                os.remove(temp_file)
+
+    except HTTPException:
+        raise
+    except Exception as e:
+        logger.error(f"Error indexing document: {e}", exc_info=True)
+        raise HTTPException(status_code=500, detail=f"Indexing failed: {str(e)}")
+
+@router.get("/documents", response_model=List[RAGDocumentResponse])
+async def list_indexed_documents(
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    """List all documents that have been processed and are ready for chatting."""
+    documents = db.query(db_models.RAGDocument).filter(
+        db_models.RAGDocument.user_id == current_user.id
+    ).order_by(db_models.RAGDocument.created_at.desc()).all()
+
+    return [
+        RAGDocumentResponse(
+            id=doc.id,
+            filename=doc.filename,
+            azure_doc_id=doc.azure_doc_id,
+            chunk_count=doc.chunk_count,
+            source_id=doc.source_id,
+            created_at=doc.created_at.isoformat()
+        )
+        for doc in documents
+    ]
+
+@router.delete("/documents/{doc_id}")
+async def delete_indexed_document(
+    doc_id: str,  # Azure doc ID
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    """Remove a document from the AI search index."""
+    # Find document
+    rag_doc = db.query(db_models.RAGDocument).filter(
+        db_models.RAGDocument.azure_doc_id == doc_id,
+        db_models.RAGDocument.user_id == current_user.id
+    ).first()
+
+    if not rag_doc:
+        raise HTTPException(status_code=404, detail="Document index entry not found")
+
+    try:
+        # Delete from Azure Search
+        rag_service.delete_document(doc_id)
+
+        # Delete from database
+        db.delete(rag_doc)
+        db.commit()
+
+        return {"message": "AI index for document deleted successfully"}
+
+    except Exception as e:
+        logger.error(f"Error deleting document index: {e}")
+        raise HTTPException(status_code=500, detail=f"Deletion failed: {str(e)}")
+
+class RAGSummaryRequest(BaseModel):
+    rag_doc_id: int
+
+@router.post("/summary")
+async def generate_document_summary(
+    request: RAGSummaryRequest,
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    """
+    Generate an on-the-fly summary for an indexed document.
+    No data is stored in the database.
+    """
+    try:
+        # 1. Verify existence and ownership
+        rag_doc = db.query(db_models.RAGDocument).filter(
+            db_models.RAGDocument.id == request.rag_doc_id,
+            db_models.RAGDocument.user_id == current_user.id
+        ).first()
+
+        if not rag_doc:
+            raise HTTPException(status_code=404, detail="Document not found")
+
+        # 2. Fetch top chunks to build a summary
+        # We search with a generic prompt to get a representative spread of content
+        results = rag_service.search_document(
+            query="Give me a general overview and executive summary of this document.",
+            doc_id=rag_doc.azure_doc_id,
+            user_id=current_user.id,
+            top_k=8  # Fetch more context for a better summary
+        )
+
+        if not results:
+            return {"summary": "No content found to summarize."}
+
+        context = "\n\n".join([r["content"] for r in results])
+
+        # 3. Generate summary using OpenAI
+        openai_client = OpenAI(api_key=settings.OPENAI_API_KEY)
+        response = openai_client.chat.completions.create(
+            model="gpt-4o-mini",
+            messages=[
+                {
+                    "role": "system",
+                    "content": "You are a professional document analyst. Provide a concise, high-level summary (3-5 sentences) of the document based on the provided context."
+                },
+                {"role": "user", "content": f"Context from '{rag_doc.filename}':\n\n{context}"}
+            ],
+            temperature=0.5
+        )
+
+        return {"summary": response.choices[0].message.content}
+
+    except HTTPException:
+        raise
+    except Exception as e:
+        logger.error(f"Summary generation failed: {e}")
+        raise HTTPException(status_code=500, detail=f"Failed to generate summary: {str(e)}")
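The `chunk_text` helper above slides a fixed-size window with a fixed step of `chunk_size - overlap`, so consecutive chunks share their trailing/leading `overlap` characters. A quick standalone check of that behaviour:

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    # Same windowing logic as the helper in api/rag.py: fixed-size chunks,
    # each starting (chunk_size - overlap) characters after the previous one.
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += (chunk_size - overlap)
    return chunks

# Window starts for 2500 characters land at 0, 800, 1600, 2400 ->
# four chunks, the last one short.
chunks = chunk_text("x" * 2500)
```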
api/reports.py ADDED
@@ -0,0 +1,152 @@
+import logging
+from fastapi import APIRouter, Depends, HTTPException
+from sqlalchemy.orm import Session
+from typing import List, Optional
+
+from api.auth import get_current_user
+from models import db_models
+from models.schemas import ReportGenerateRequest, ReportResponse, ReportFormatSuggestionResponse
+from core.database import get_db
+from services.report_service import report_service
+from core import constants
+
+router = APIRouter(prefix="/api/reports", tags=["reports"])
+logger = logging.getLogger(__name__)
+
+@router.get("/config")
+async def get_report_config():
+    """Returns available formats and languages for report generation."""
+    return {
+        "formats": constants.REPORT_FORMAT_OPTIONS,
+        "languages": constants.LANGUAGES
+    }
+
+@router.post("/suggest-formats", response_model=ReportFormatSuggestionResponse)
+async def suggest_formats(
+    file_key: Optional[str] = None,
+    text_input: Optional[str] = None,
+    language: str = "Japanese",
+    current_user: db_models.User = Depends(get_current_user)
+):
+    """
+    Get 4 AI-suggested report formats based on content.
+    """
+    suggestions = await report_service.generate_format_suggestions(
+        file_key=file_key,
+        text_input=text_input,
+        language=language
+    )
+    return {"suggestions": suggestions}
+
+@router.post("/generate", response_model=ReportResponse)
+async def generate_report(
+    request: ReportGenerateRequest,
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    """
+    Generates a full report and saves it to the database.
+    """
+    try:
+        source_id = None
+        if request.file_key:
+            source = db.query(db_models.Source).filter(
+                db_models.Source.s3_key == request.file_key,
+                db_models.Source.user_id == current_user.id
+            ).first()
+            if not source:
+                raise HTTPException(status_code=403, detail="Not authorized to access this file")
+            source_id = source.id
+
+        # 1. Generate Report from AI
+        content = await report_service.generate_report(
+            file_key=request.file_key,
+            text_input=request.text_input,
+            format_key=request.format_key,
+            custom_prompt=request.custom_prompt,
+            language=request.language
+        )
+
+        if not content:
+            raise HTTPException(status_code=500, detail="Failed to generate report")
+
+        # 2. Extract title (usually the first line)
+        title = content.split('\n')[0].replace('#', '').strip()
+        if not title or len(title) < 3:
+            title = f"Report {request.format_key}"
+
+        # 3. Save to DB
+        db_report = db_models.Report(
+            title=title,
+            content=content,
+            format_key=request.format_key,
+            user_id=current_user.id,
+            source_id=source_id
+        )
+        db.add(db_report)
+        db.commit()
+        db.refresh(db_report)
+
+        return db_report
+
+    except HTTPException:
+        raise
+    except Exception as e:
+        logger.error(f"Report generation endpoint failed: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
+@router.get("/list", response_model=List[ReportResponse])
+async def list_reports(
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    """
+    Lists all reports for the current user.
+    """
+    try:
+        reports = db.query(db_models.Report).filter(
+            db_models.Report.user_id == current_user.id
+        ).order_by(db_models.Report.created_at.desc()).all()
+        return reports
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+@router.get("/{report_id}", response_model=ReportResponse)
+async def get_report(
+    report_id: int,
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    """
+    Retrieves a specific report.
+    """
+    report = db.query(db_models.Report).filter(
+        db_models.Report.id == report_id,
+        db_models.Report.user_id == current_user.id
+    ).first()
+
+    if not report:
+        raise HTTPException(status_code=404, detail="Report not found")
+
+    return report
+
+@router.delete("/{report_id}")
+async def delete_report(
+    report_id: int,
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    """
+    Deletes a specific report.
+    """
+    report = db.query(db_models.Report).filter(
+        db_models.Report.id == report_id,
+        db_models.Report.user_id == current_user.id
+    ).first()
+
+    if not report:
+        raise HTTPException(status_code=404, detail="Report not found")
+
+    db.delete(report)
+    db.commit()
+    return {"message": "Report deleted successfully"}
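`/generate` derives the report title from the first line of the Markdown content, falling back to a generic label when that line is empty or too short. A self-contained sketch of that rule (the helper name and fallback string are ours):

```python
def extract_title(content, fallback):
    # First Markdown line, stripped of '#' marks and whitespace; use the
    # fallback when the result is empty or shorter than three characters.
    title = content.split('\n')[0].replace('#', '').strip()
    return title if title and len(title) >= 3 else fallback

title = extract_title("# Quarterly Review\n\nBody text...", "Report summary")
```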
api/sources.py ADDED
@@ -0,0 +1,144 @@
+from fastapi import APIRouter, Depends, UploadFile, File, HTTPException
+from typing import List
+from sqlalchemy.orm import Session
+from services.s3_service import s3_service
+from api.auth import get_current_user
+from core.database import get_db
+from models import db_models
+from models.schemas import SourceFileResponse
+from services.rag_service import rag_service
+
+router = APIRouter(prefix="/api/sources", tags=["sources"])
+
+@router.post("/upload", response_model=dict)
+async def upload_source(
+    file: UploadFile = File(...),
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    try:
+        content = await file.read()
+        file_info = await s3_service.upload_file(
+            file_content=content,
+            filename=file.filename,
+            user_id=str(current_user.id)
+        )
+
+        # Save metadata to database
+        db_source = db_models.Source(
+            filename=file.filename,
+            s3_key=file_info["key"],
+            s3_url=file_info["public_url"],  # Store public URL in DB
+            size=len(content),
+            user_id=current_user.id
+        )
+        db.add(db_source)
+        db.commit()
+        db.refresh(db_source)
+
+        return {
+            "id": db_source.id,
+            "filename": file.filename,
+            "key": file_info["key"],
+            "public_url": file_info["public_url"],
+            "private_url": file_info["private_url"],
+            "message": "Upload successful"
+        }
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+@router.get("/list", response_model=List[SourceFileResponse])
+async def list_sources(
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    try:
+        # Join Source with RAGDocument to get indexing info if it exists
+        results = db.query(
+            db_models.Source,
+            db_models.RAGDocument.id.label("rag_id"),
+            db_models.RAGDocument.azure_doc_id
+        ).outerjoin(
+            db_models.RAGDocument,
+            db_models.Source.id == db_models.RAGDocument.source_id
+        ).filter(
+            db_models.Source.user_id == current_user.id
+        ).all()
+
+        response_sources = []
+        for source, rag_id, azure_doc_id in results:
+            response_sources.append({
+                "id": source.id,
+                "filename": source.filename,
+                "s3_key": source.s3_key,
+                "public_url": source.s3_url,
+                "private_url": s3_service.get_presigned_url(source.s3_key),
+                "size": source.size,
+                "created_at": source.created_at,
+                "rag_id": rag_id,
+                "azure_doc_id": azure_doc_id
+            })
+        return response_sources
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+@router.delete("/{source_id}")
+async def delete_source(
+    source_id: int,
+    current_user: db_models.User = Depends(get_current_user),
+    db: Session = Depends(get_db)
+):
+    source = db.query(db_models.Source).filter(
+        db_models.Source.id == source_id,
+        db_models.Source.user_id == current_user.id
+    ).first()
+
+    if not source:
+        raise HTTPException(status_code=404, detail="Source not found")
+
+    try:
+        # 1. Handle RAG Document (Delete completely as it's useless without the source)
+        rag_doc = db.query(db_models.RAGDocument).filter(
+            db_models.RAGDocument.source_id == source.id
+        ).first()
+
+        if rag_doc:
+            # Delete from Azure Search
+            rag_service.delete_document(rag_doc.azure_doc_id)
+            # Delete from DB
+            db.delete(rag_doc)
+
+        # 2. Handle other dependencies (Delete everything linked to this source)
+        # We must delete children (Flashcards, Questions) before parents (Sets) because of SQL constraints
+
+        # Delete Flashcards
+        flashcard_set_ids = [s.id for s in db.query(db_models.FlashcardSet).filter(db_models.FlashcardSet.source_id == source.id).all()]
+        if flashcard_set_ids:
+            db.query(db_models.Flashcard).filter(db_models.Flashcard.flashcard_set_id.in_(flashcard_set_ids)).delete(synchronize_session=False)
+
+        # Delete Quiz Questions
+        quiz_set_ids = [s.id for s in db.query(db_models.QuizSet).filter(db_models.QuizSet.source_id == source.id).all()]
+        if quiz_set_ids:
+            db.query(db_models.QuizQuestion).filter(db_models.QuizQuestion.quiz_set_id.in_(quiz_set_ids)).delete(synchronize_session=False)
+
+        # Now delete the sets and other items
+        db.query(db_models.MindMap).filter(db_models.MindMap.source_id == source.id).delete()
+        db.query(db_models.FlashcardSet).filter(db_models.FlashcardSet.source_id == source.id).delete()
+        db.query(db_models.QuizSet).filter(db_models.QuizSet.source_id == source.id).delete()
+        db.query(db_models.Report).filter(db_models.Report.source_id == source.id).delete()
+        db.query(db_models.VideoSummary).filter(db_models.VideoSummary.source_id == source.id).delete()
+
+        db.commit()  # Commit deletions
+
+        # 3. Delete from S3
+        await s3_service.delete_file(source.s3_key)
+
+        # 4. Delete the Source itself from Database
+        db.delete(source)
+        db.commit()
+
+        return {"message": "Source and all associated generated content (mind maps, quizzes, etc.) deleted successfully."}
+
+    except Exception as e:
+        db.rollback()
+        raise HTTPException(status_code=500, detail=f"Failed to delete source: {str(e)}")
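The comment in `delete_source` notes that children must be deleted before their parent sets. With foreign keys enforced, deleting a parent row that still has referencing children is rejected; a minimal sqlite3 illustration (toy schema, not the app's actual models):

```python
import sqlite3

# Toy schema demonstrating why quiz questions must be deleted before
# their quiz set when foreign key constraints are enforced.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE quiz_sets (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE quiz_questions (id INTEGER PRIMARY KEY, "
    "quiz_set_id INTEGER REFERENCES quiz_sets(id))"
)
conn.execute("INSERT INTO quiz_sets (id) VALUES (1)")
conn.execute("INSERT INTO quiz_questions (id, quiz_set_id) VALUES (1, 1)")

try:
    conn.execute("DELETE FROM quiz_sets WHERE id = 1")  # parent first: rejected
except sqlite3.IntegrityError as e:
    print("blocked:", e)

conn.execute("DELETE FROM quiz_questions WHERE quiz_set_id = 1")  # children first
conn.execute("DELETE FROM quiz_sets WHERE id = 1")                # now allowed
```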
api/video_generator.py ADDED
@@ -0,0 +1,133 @@
1
+ import logging
2
+ from fastapi import APIRouter, Depends, HTTPException
3
+ from sqlalchemy.orm import Session
4
+ from typing import List
5
+
6
+ from api.auth import get_current_user
7
+ from models import db_models
8
+ from models.schemas import VideoSummaryGenerateRequest, VideoSummaryResponse
9
+ from core.database import get_db
10
+ from services.video_generator_service import video_generator_service
11
+ from services.slides_video_service import slides_video_service
12
+ from services.s3_service import s3_service
13
+
14
+ router = APIRouter(prefix="/api/videos", tags=["video-generator"])
15
+ logger = logging.getLogger(__name__)
16
+
17
+ @router.post("/generate", response_model=VideoSummaryResponse)
18
+ async def generate_video_summary(
19
+ request: VideoSummaryGenerateRequest,
20
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Analyzes a PDF and generates a narrated video summary.
+     """
+     try:
+         # Check source ownership
+         source = db.query(db_models.Source).filter(
+             db_models.Source.s3_key == request.file_key,
+             db_models.Source.user_id == current_user.id
+         ).first()
+
+         if not source:
+             raise HTTPException(status_code=403, detail="Not authorized to access this file")
+
+         if request.use_slides_transformation:
+             # Full PDF -> Slides -> Video pipeline
+             result = await slides_video_service.generate_transformed_video_summary(
+                 file_key=request.file_key,
+                 language=request.language,
+                 voice_name=request.voice_name,
+                 custom_prompt=request.custom_prompt
+             )
+         else:
+             # Standard PDF -> Video pipeline (high-fidelity version)
+             result = await video_generator_service.generate_video_summary(
+                 file_key=request.file_key,
+                 language=request.language,
+                 voice_name=request.voice_name
+             )
+
+         # Save to DB
+         db_summary = db_models.VideoSummary(
+             title=result["title"],
+             s3_key=result["s3_key"],
+             s3_url=result["s3_url"],
+             user_id=current_user.id,
+             source_id=source.id
+         )
+         db.add(db_summary)
+         db.commit()
+         db.refresh(db_summary)
+
+         return {
+             "id": db_summary.id,
+             "title": db_summary.title,
+             "s3_key": db_summary.s3_key,
+             "public_url": db_summary.s3_url,
+             "private_url": s3_service.get_presigned_url(db_summary.s3_key),
+             "created_at": db_summary.created_at
+         }
+
+     except Exception as e:
+         logger.error(f"Video summary endpoint failed: {e}")
+         raise HTTPException(status_code=500, detail=str(e))
+
+ @router.get("/list", response_model=List[VideoSummaryResponse])
+ async def list_video_summaries(
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Lists all generated video summaries for the current user.
+     """
+     try:
+         summaries = db.query(db_models.VideoSummary).filter(
+             db_models.VideoSummary.user_id == current_user.id
+         ).order_by(db_models.VideoSummary.created_at.desc()).all()
+
+         return [
+             {
+                 "id": s.id,
+                 "title": s.title,
+                 "s3_key": s.s3_key,
+                 "public_url": s.s3_url,
+                 "private_url": s3_service.get_presigned_url(s.s3_key),
+                 "created_at": s.created_at
+             }
+             for s in summaries
+         ]
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
+
+ @router.delete("/{video_id}")
+ async def delete_video_summary(
+     video_id: int,
+     current_user: db_models.User = Depends(get_current_user),
+     db: Session = Depends(get_db)
+ ):
+     """
+     Deletes a specific video summary from the database and S3.
+     """
+     summary = db.query(db_models.VideoSummary).filter(
+         db_models.VideoSummary.id == video_id,
+         db_models.VideoSummary.user_id == current_user.id
+     ).first()
+
+     if not summary:
+         raise HTTPException(status_code=404, detail="Video summary not found")
+
+     try:
+         # 1. Delete from S3
+         await s3_service.delete_file(summary.s3_key)
+
+         # 2. Delete from DB
+         db.delete(summary)
+         db.commit()
+
+         return {"message": "Video summary and associated S3 file deleted successfully"}
+     except Exception as e:
+         db.rollback()
+         logger.error(f"Failed to delete video summary: {e}")
+         raise HTTPException(status_code=500, detail=f"Deletion failed: {str(e)}")
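The delete endpoint above removes the S3 object first, then the DB row, and rolls the session back if anything fails, so an orphaned DB record never points at a deleted file that was only half-cleaned. A minimal in-memory sketch of that ordering — `FakeS3` and `FakeDB` are hypothetical stand-ins for the real services, not the project's classes:

```python
class FakeS3:
    """Stand-in for the S3 service: a dict of key -> bytes, with a failure switch."""
    def __init__(self):
        self.files = {"videos/1.mp4": b"..."}
        self.fail = False

    def delete_file(self, key):
        if self.fail:
            raise RuntimeError("S3 unavailable")
        self.files.pop(key, None)

class FakeDB:
    """Stand-in for the SQLAlchemy session: delete() stages, commit() applies."""
    def __init__(self):
        self.rows = {1: "videos/1.mp4"}
        self.pending = None

    def delete(self, video_id):
        self.pending = video_id

    def commit(self):
        self.rows.pop(self.pending, None)
        self.pending = None

    def rollback(self):
        self.pending = None

def delete_video_summary(video_id, s3, db):
    key = db.rows.get(video_id)
    if key is None:
        return {"error": 404}
    try:
        s3.delete_file(key)   # 1. remove the object first
        db.delete(video_id)   # 2. then stage the DB row deletion
        db.commit()
        return {"message": "deleted"}
    except Exception:
        db.rollback()         # DB row survives if the S3 deletion failed
        return {"error": 500}

s3, db = FakeS3(), FakeDB()
print(delete_video_summary(1, s3, db))   # {'message': 'deleted'}
```

Because the S3 call comes first, a failed deletion leaves the row in place and the operation can simply be retried.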
assets/bgm/BGM_1.mp3 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:246a19adcdc9adacdfe15ba2883848a386e62d25a2cd53c4114b5ebecd4f8b98
+ size 3681612
assets/bgm/BGM_2.mp3 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ca72c65ec30957233829d9b55715d3b7bf69ad19ae7c236abeed920c15745f12
+ size 4039174
assets/bgm/BGM_3.mp3 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e597d3338700f45f8869287ff3327df6e8fde3580c37ffa1bca916fb3b5f3ff
+ size 4910897
core/__init__.py ADDED
File without changes
core/config.py ADDED
@@ -0,0 +1,50 @@
+ from pydantic_settings import BaseSettings, SettingsConfigDict
+ from typing import Optional
+
+ class Settings(BaseSettings):
+     # AWS Settings
+     AWS_ACCESS_KEY_ID: Optional[str] = None
+     AWS_SECRET_ACCESS_KEY: Optional[str] = None
+     AWS_REGION: str = "us-east-1"
+     AWS_S3_BUCKET: Optional[str] = None
+
+     # Security
+     SECRET_KEY: str = "supersecret-placeholder"
+     ALGORITHM: str = "HS256"
+     ACCESS_TOKEN_EXPIRE_MINUTES: int = 30
+
+     # LLM Keys
+     OPENAI_API_KEY: Optional[str] = None
+     GEMINI_API_KEY: Optional[str] = None
+
+     # Database
+     DATABASE_URL: Optional[str] = None
+
+     # Google / Transformation Settings
+     GOOGLE_OAUTH_CLIENT_SECRETS: Optional[str] = None
+     GOOGLE_OAUTH_TOKEN: Optional[str] = None
+     GOOGLE_OAUTH_TOKEN_JSON: Optional[str] = None
+     GCS_BUCKET: Optional[str] = None
+     GCP_SA_JSON: Optional[str] = None
+     GEMINI_USE_VERTEX: Optional[str] = None
+     GCP_PROJECT: Optional[str] = None
+     GCP_LOCATION: Optional[str] = None
+     DRIVE_FOLDER_ID: Optional[str] = None
+
+     # Azure RAG Settings
+     AZURE_SEARCH_KEY: Optional[str] = None
+     AZURE_SEARCH_INDEX_NAME: Optional[str] = None
+     BLOB_CONNECTION_STRING: Optional[str] = None
+     BLOB_CONTAINER_NAME: Optional[str] = None
+     AZURE_SEARCH_ENDPOINT: Optional[str] = None
+     AZURE_OPENAI_ENDPOINT: Optional[str] = None
+     AZURE_OPENAI_API_KEY: Optional[str] = None
+     AZURE_OPENAI_DEPLOYMENT_NAME: Optional[str] = None
+     AZURE_OPENAI_API_VERSION: Optional[str] = None
+
+     model_config = SettingsConfigDict(
+         env_file=".env",
+         extra="ignore"
+     )
+
+ settings = Settings()
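With `env_file=".env"`, pydantic-settings resolves each field by precedence: a real environment variable wins over a `.env` entry, which wins over the coded default. A stdlib-only sketch of that precedence (my own toy `.env` parser, not the library's implementation, which also handles quoting and more):

```python
import os
import tempfile

def load_env_file(path):
    """Parse simple KEY=VALUE lines from a .env file (comments and blanks skipped)."""
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

def resolve_setting(name, env_file_values, default=None):
    """Precedence: process environment variable > .env file entry > coded default."""
    if name in os.environ:
        return os.environ[name]
    return env_file_values.get(name, default)

# Demo: the .env supplies the bucket and a region, but the process env wins for the region.
for var in ("AWS_S3_BUCKET", "SECRET_KEY"):
    os.environ.pop(var, None)  # keep the demo deterministic

with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write("AWS_S3_BUCKET=my-bucket\nAWS_REGION=eu-west-1\n")
    env_path = f.name

os.environ["AWS_REGION"] = "us-east-1"  # process env overrides the file
file_values = load_env_file(env_path)

print(resolve_setting("AWS_REGION", file_values, "us-east-1"))   # us-east-1
print(resolve_setting("AWS_S3_BUCKET", file_values))             # my-bucket
print(resolve_setting("SECRET_KEY", file_values, "supersecret-placeholder"))
```

`extra="ignore"` means any unrecognized keys in `.env` are silently dropped rather than raising a validation error.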
core/constants.py ADDED
@@ -0,0 +1,106 @@
+ # --- Common ---
+ LANGUAGES = [
+     {"value": "English", "label": "English"},
+     {"value": "Japanese", "label": "Japanese"}
+ ]
+
+ DIFFICULTIES = [
+     {"value": "easy", "label": "Easy"},
+     {"value": "medium", "label": "Medium"},
+     {"value": "hard", "label": "Hard"}
+ ]
+
+ # --- Podcast ---
+ PODCAST_VOICES = [
+     {"value": "Zephyr", "label": "Zephyr"},
+     {"value": "Puck", "label": "Puck"},
+     {"value": "Charon", "label": "Charon"},
+     {"value": "Kore", "label": "Kore"},
+     {"value": "Fenrir", "label": "Fenrir"},
+     {"value": "Leda", "label": "Leda"},
+     {"value": "Orus", "label": "Orus"},
+     {"value": "Aoede", "label": "Aoede"},
+     {"value": "Callirrhoe", "label": "Callirrhoe"},
+     {"value": "Autonoe", "label": "Autonoe"},
+     {"value": "Enceladus", "label": "Enceladus"},
+     {"value": "Iapetus", "label": "Iapetus"},
+     {"value": "Umbriel", "label": "Umbriel"},
+     {"value": "Algieba", "label": "Algieba"},
+     {"value": "Despina", "label": "Despina"},
+     {"value": "Erinome", "label": "Erinome"},
+     {"value": "Algenib", "label": "Algenib"},
+     {"value": "Rasalgethi", "label": "Rasalgethi"},
+     {"value": "Laomedeia", "label": "Laomedeia"},
+     {"value": "Achernar", "label": "Achernar"},
+     {"value": "Alnilam", "label": "Alnilam"},
+     {"value": "Schedar", "label": "Schedar"},
+     {"value": "Gacrux", "label": "Gacrux"},
+     {"value": "Pulcherrima", "label": "Pulcherrima"},
+     {"value": "Achird", "label": "Achird"},
+     {"value": "Zubenelgenubi", "label": "Zubenelgenubi"},
+     {"value": "Vindemiatrix", "label": "Vindemiatrix"},
+     {"value": "Sadachbia", "label": "Sadachbia"},
+     {"value": "Sadaltager", "label": "Sadaltager"},
+     {"value": "Sulafat", "label": "Sulafat"}
+ ]
+
+ PODCAST_BGM = [
+     {"value": "No BGM", "label": "No BGM"},
+     {"value": "BGM 1", "label": "Background Music 1"},
+     {"value": "BGM 2", "label": "Background Music 2"},
+     {"value": "BGM 3", "label": "Background Music 3"}
+ ]
+
+ PODCAST_FORMATS = [
+     {"value": "deep dive", "label": "Deep Dive"},
+     {"value": "debate", "label": "Debate"},
+     {"value": "summary", "label": "Summary"},
+     {"value": "tutorial", "label": "Tutorial"},
+     {"value": "interview", "label": "Interview"}
+ ]
+
+ # --- Flashcards ---
+ FLASHCARD_QUANTITIES = [
+     {"value": "fewer", "label": "Fewer (15-20)"},
+     {"value": "standard", "label": "Standard (35-40)"},
+     {"value": "more", "label": "More (55-70)"}
+ ]
+
+ # --- Quizzes ---
+ QUIZ_COUNTS = [
+     {"value": "FEWER", "label": "Fewer (5 Questions)"},
+     {"value": "STANDARD", "label": "Standard (10 Questions)"},
+     {"value": "MORE", "label": "More (20 Questions)"}
+ ]
+
+ # --- Reports ---
+ REPORT_FORMAT_OPTIONS = [
+     {
+         "value": "briefing_doc",
+         "label": "Briefing Document",
+         "description": "Overview of your sources featuring key insights and quotes.",
+         "prompt": "Create a comprehensive briefing document that synthesizes the main themes and ideas from the sources. Start with a concise Executive Summary that presents the most critical takeaways upfront. The body of the document must provide a detailed and thorough examination of the main themes, evidence, and conclusions found in the sources. This analysis should be structured logically with headings and bullet points to ensure clarity. The tone must be objective and incisive.",
+         "prompt_jp": "提供されたソースから主要なテーマとアイデアを統合した包括的なブリーフィング文書を作成してください。最も重要な要点を最初に提示する簡潔なエグゼクティブサマリーから始めてください。文書の本文では、ソースで見つかった主要なテーマ、証拠、結論の詳細で徹底的な検討を提供する必要があります。この分析は、明確さを確保するために見出しと箇条書きで論理的に構成される必要があります。トーンは客観的で鋭いものでなければなりません。"
+     },
+     {
+         "value": "study_guide",
+         "label": "Study Guide",
+         "description": "Short-answer quiz, suggested essay questions, and glossary of key terms.",
+         "prompt": "You are a highly capable research assistant and tutor. Create a detailed study guide designed to review understanding of the sources. Create a quiz with ten short-answer questions (2-3 sentences each) and include a separate answer key. Suggest five essay format questions, but do not supply answers. Also conclude with a comprehensive glossary of key terms with definitions.",
+         "prompt_jp": "あなたは非常に有能な研究助手兼家庭教師です。ソースの理解を復習するために設計された詳細な学習ガイドを作成してください。10問の短答式クイズ(各2-3文)を作成し、別途解答キーを含めてください。5つのエッセイ形式の質問を提案しますが、答えは提供しないでください。また、定義付きの主要用語の包括的な用語集で締めくくってください。"
+     },
+     {
+         "value": "blog_post",
+         "label": "Blog Post",
+         "description": "Insightful takeaways distilled into a highly readable article.",
+         "prompt": "Act as a thoughtful writer and synthesizer of ideas, tasked with creating an engaging and readable blog post for a popular online publishing platform known for its clean aesthetic and insightful content. Your goal is to distill the top most surprising, counter-intuitive, or impactful takeaways from the provided source materials into a compelling listicle. The writing style should be clean, accessible, and highly scannable, employing a conversational yet intelligent tone. Craft a compelling, click-worthy headline. Begin the article with a short introduction that hooks the reader by establishing a relatable problem or curiosity, then present each of the takeaway points as a distinct section with a clear, bolded subheading. Within each section, use short paragraphs to explain the concept clearly, and don't just summarize; offer a brief analysis or a reflection on why this point is so interesting or important, and if a powerful quote exists in the sources, feature it in a blockquote for emphasis. Conclude the post with a brief, forward-looking summary that leaves the reader with a final thought-provoking question or a powerful takeaway to ponder.",
+         "prompt_jp": "清潔な美学と洞察に富んだコンテンツで知られる人気のオンライン出版プラットフォーム向けに、魅力的で読みやすいブログ記事を作成することを任された、思慮深いライター兼アイデアの統合者として行動してください。あなたの目標は、提供されたソース資料から最も驚くべき、直感に反する、または影響力のある要点を、魅力的なリスト記事に蒸留することです。文章スタイルは清潔で、親しみやすく、非常にスキャンしやすいもので、会話的でありながら知的なトーンを採用してください。魅力的でクリックしたくなる見出しを作成してください。読者を引き込む短い紹介で記事を始め、親しみやすい問題や好奇心を確立し、その後、各要点を明確で太字の小見出しを持つ別個のセクションとして提示してください。各セクション内では、短い段落を使用して概念を明確に説明し、単に要約するだけでなく、なぜこの点がそれほど興味深いのか、または重要なのかについての簡潔な分析や考察を提供し、ソースに強力な引用が存在する場合は、強調のためにブロッククォートで紹介してください。読者に最終的な思考を促す質問や強力な要点を残す簡潔で前向きな要約で記事を締めくくってください。"
+     },
+     {
+         "value": "custom",
+         "label": "Custom Prompt",
+         "description": "Generate a report based on your specific instructions.",
+         "prompt": "",
+         "prompt_jp": ""
+     }
+ ]
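Lists like `PODCAST_VOICES` and `LANGUAGES` double as whitelists: a request parameter can be checked against the `value` fields before any generation work starts. A small sketch of that check — the lists here are trimmed copies for illustration, not the full module:

```python
# Trimmed copies of the constants, just for the demo.
PODCAST_VOICES = [
    {"value": "Zephyr", "label": "Zephyr"},
    {"value": "Puck", "label": "Puck"},
]
LANGUAGES = [
    {"value": "English", "label": "English"},
    {"value": "Japanese", "label": "Japanese"},
]

def validate_choice(value, options, field):
    """Reject a request value that is not one of the allowed option values."""
    allowed = {o["value"] for o in options}
    if value not in allowed:
        raise ValueError(f"{field} must be one of {sorted(allowed)}, got {value!r}")
    return value

print(validate_choice("Puck", PODCAST_VOICES, "voice_name"))   # Puck
print(validate_choice("Japanese", LANGUAGES, "language"))      # Japanese
```

Keeping the `value`/`label` split means the UI labels can change freely without touching whatever the backend actually matches on.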
core/database.py ADDED
@@ -0,0 +1,32 @@
+ from sqlalchemy import create_engine
+ from sqlalchemy.orm import declarative_base, sessionmaker
+ from .config import settings
+
+ # For Azure SQL, DATABASE_URL comes from .env.
+ # If it is not provided, fall back to a local SQLite file for safety.
+ SQLALCHEMY_DATABASE_URL = settings.DATABASE_URL or "sqlite:///./temp.db"
+
+ # Create the engine with stability settings for Azure SQL
+ engine = create_engine(
+     SQLALCHEMY_DATABASE_URL,
+     pool_pre_ping=True,   # Check connection health on every checkout
+     pool_recycle=300,     # Refresh connections every 5 minutes
+     pool_size=10,         # Maintain up to 10 connections
+     max_overflow=20       # Allow 20 extra connections under load
+ )
+
+ SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
+
+ Base = declarative_base()
+
+ def get_db():
+     db = SessionLocal()
+     try:
+         yield db
+     finally:
+         db.close()
+
+ def init_db():
+     import models.db_models  # Ensure models are registered on Base
+     Base.metadata.create_all(bind=engine)
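`get_db` is a generator dependency: FastAPI advances it once to obtain the session for the endpoint, then resumes it after the response so the `finally` block always closes the session. The same mechanics can be shown with a plain generator and a fake session (no FastAPI or SQLAlchemy needed for the sketch):

```python
class FakeSession:
    """Stand-in for a SQLAlchemy session; only tracks whether close() ran."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

def get_db():
    db = FakeSession()
    try:
        yield db
    finally:
        db.close()  # runs whether the request succeeded or raised

# What the framework effectively does around each request:
gen = get_db()
db = next(gen)        # dependency value handed to the endpoint
assert not db.closed  # session is open while the endpoint runs
gen.close()           # after the response, the generator's finally runs
assert db.closed
```

Because cleanup lives in `finally`, the session is also closed when the endpoint raises, which is why the route handlers above can `raise HTTPException` freely without leaking connections.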
core/prompts.py ADDED
@@ -0,0 +1,573 @@
+ SYSTEM_PROMPT = """
+ You are a professional podcast scriptwriter creating a natural, engaging Japanese podcast conversation.
+
+ ────────────────────────
+ 1. Speaker Roles (CRITICAL)
+ ────────────────────────
+ - Use ONLY:
+   - Speaker 1: Curious host and listener representative
+   - Speaker 2: Calm expert and explainer
+ - Speakers must strictly alternate.
+ - Turn length must vary:
+   - Some turns: 1-2 sentences (reactions, confirmations)
+   - Some turns: 4-6 sentences (explanations)
+ - Do NOT make all turns similar in length.
+ - Speaker 1 asks questions, reacts emotionally, summarizes, and paraphrases.
+ - Speaker 2 explains concepts, gives background, adds practical context, and avoids lecturing.
+
+ ────────────────────────
+ 1.5 Conversational Dynamics (MANDATORY)
+ ────────────────────────
+ - Speaker 1 must occasionally:
+   - Misinterpret a concept slightly
+   - Ask a naive or overly simplified question
+   - React emotionally before fully understanding
+ - Speaker 2 must:
+   - Gently correct or reframe Speaker 1's understanding
+   - Use analogies or metaphors when concepts get abstract
+ - At least once per major topic:
+   - Speaker 1 interrupts with a short reaction (1-2 sentences)
+   - Speaker 2 adjusts the explanation in response
+
+ ────────────────────────
+ 2. Length & Coverage
+ ────────────────────────
+ - Total length MUST be {target_words} Japanese words (±10%).
+ - Do NOT summarize the PDF.
+ - Expand content with background, examples, implications, and real-world context.
+ - Include as much detail from the PDF as possible.
+ - Do NOT mention page numbers.
+ - If the source content is too large, split it into multiple parts and fully complete each part.
+
+ ────────────────────────
+ 3. Conversation Flow (MANDATORY)
+ ────────────────────────
+ Follow this flow naturally (do NOT label sections):
+
+ 1. Friendly greetings and a clear statement of today's topic
+ 2. Introduction of “Today's Talk Topics”
+ 3. For each topic:
+    - Why it matters (social or practical background)
+    - What it is (definitions or structure)
+    - How it works in practice (real examples, field usage)
+    - Challenges, trade-offs, or side effects
+    - Why it remains important
+ 4. Gentle recap of key ideas
+ 5. Short teaser for the next episode
+
+ ────────────────────────
+ 4. Podcast Style & Tone
+ ────────────────────────
+ - Use fillers thoughtfully and naturally:
+   “um,” “well,” “you know,” “for example”
+ - Add light laughter, empathy, and warmth when appropriate:
+   “(laughs),” “I get that,” “that happens a lot”
+ - Avoid strong assertions; prefer:
+   “you could say,” “one aspect is,” “it seems that”
+ - Speaker 1 should occasionally paraphrase Speaker 2:
+   “So basically, you're saying that…?”
+
+ ────────────────────────
+ 5. Restrictions
+ ────────────────────────
+ - No URLs, no bullet points, no metadata, no code.
+ - Output ONLY the podcast script text.
+ - Keep the tone friendly, polite, and suitable for audio listening.
+
+ ────────────────────────
+ 6. Source Material
+ ────────────────────────
+ - Use {pdf_suggestions} as inspiration and factual grounding.
+ - Podcast format: {podcast_format}
+
+ Output example:
+ Speaker 1: Hello everyone, today we're talking about...
+ Speaker 2: That's a great topic. Well, if we look at the background...
+ """
+
+
+ ANALYSIS_PROMPT = """
+ Please analyze the content of this PDF file and generate podcast episode proposals.
+
+ IMPORTANT: The target podcast duration is {duration_minutes} minutes. Please structure the program accordingly:
+ - For {duration_minutes} minutes, plan the total word count at roughly 500 words per minute
+ - Adjust the depth and detail of each section based on the available time
+ - Ensure the program structure fits comfortably within the {duration_minutes} minute timeframe
+
+ Analysis & Output Requirements
+ 1. Dynamic Program Structures
+ - Based on the PDF content, suggest up to 3 different podcast episode structures (introduction, main, summary).
+ - Choose the structure that best fits the user's time requirement.
+
+ 2. Podcast Scripts
+ - For each suggested program structure, generate a full podcast script.
+ - The script length should correspond to the user's time requirement.
+ - The script must always include exactly two speakers:
+   - Speaker 1
+   - Speaker 2
+ - The script should be conversational, engaging, and podcast-ready.
+
+ 3. Output Requirements
+ - Output must be in Japanese.
+ - Provide 2-3 different podcast episode proposals.
+ - Each proposal must include both a program structure and a complete script.
+ - Use the structured response format with a "proposals" array containing the episode suggestions.
+
+ 4. Constraints
+ - Maximum 3 suggestions only.
+ - Always provide both Program Structure and Script for each suggestion.
+ - Ensure Script includes only Speaker 1 and Speaker 2 (no additional speakers).
+ - Use natural Japanese conversation style.
+ - Just return the structured output, with no other text, comments, or explanation.
+ """
+
+
+ def get_flashcard_system_prompt(
+     difficulty: str = "medium",
+     quantity: str = "standard",
+     language: str = "Japanese"
+ ) -> str:
+     # Language-specific instructions
+     if language == "Japanese":
+         language_instruction = """
+ LANGUAGE: JAPANESE
+ - Generate all flashcards in Japanese
+ - Use appropriate Japanese terminology and expressions
+ - Ensure questions and answers are natural and clear in Japanese
+ - Use polite form (です/ます) for formal educational content"""
+     else:  # English
+         language_instruction = """
+ LANGUAGE: ENGLISH
+ - Generate all flashcards in English
+ - Use clear, professional English terminology
+ - Ensure questions and answers are grammatically correct and natural
+ - Use appropriate academic language for educational content"""
+
+     # Core instructions for flashcard generation
+     base_prompt = f"""You are an expert educational content creator specializing in creating high-quality flashcards from PDF documents. Your task is to analyze the uploaded PDF and create flashcards that help users learn and retain information effectively.
+
+ {language_instruction}
+
+ IMPORTANT INSTRUCTIONS:
+ 1. Read and analyze the entire PDF document thoroughly
+ 2. Extract key concepts, definitions, facts, and important information
+ 3. Create flashcards that follow the question-answer format
+ 4. Ensure questions are clear, specific, and test understanding
+ 5. Provide concise but complete answers
+ 6. Cover the most important topics from the document
+ 7. Return ONLY a JSON array of flashcards in the exact format specified below
+
+ REQUIRED JSON FORMAT:
+ [
+   {{
+     "question": "Your question here",
+     "answer": "Your answer here"
+   }},
+   {{
+     "question": "Another question",
+     "answer": "Another answer"
+   }}
+ ]
+
+ DO NOT include any text before or after the JSON array. Return ONLY the JSON."""
+
+     # Difficulty-specific instructions based on user selection
+     if difficulty == "easy":
+         difficulty_instructions = """
+
+ DIFFICULTY LEVEL: EASY
+ - Create simple, straightforward questions
+ - Focus on basic facts, definitions, and key terms
+ - Use simple language and avoid complex concepts
+ - Questions should test recall and basic understanding
+ - Answers should be concise (1-2 sentences maximum)"""
+
+     elif difficulty == "hard":
+         difficulty_instructions = """
+
+ DIFFICULTY LEVEL: HARD
+ - Create challenging, analytical questions
+ - Focus on complex concepts, relationships, and applications
+ - Test deep understanding and critical thinking
+ - Include scenario-based and comparative questions
+ - Answers can be more detailed (2-4 sentences)"""
+
+     else:  # medium (default)
+         difficulty_instructions = """
+
+ DIFFICULTY LEVEL: MEDIUM
+ - Create balanced questions that test both recall and understanding
+ - Mix factual questions with conceptual ones
+ - Include some application-based questions
+ - Use moderate complexity in language and concepts
+ - Answers should be informative but concise (1-3 sentences)"""
+
+     # Quantity-specific instructions based on user selection
+     if quantity == "fewer":
+         quantity_instructions = """
+
+ QUANTITY: FEWER (15-20 flashcards)
+ - Focus on the most essential and fundamental concepts
+ - Prioritize the core topics that users must know
+ - Create comprehensive coverage of key themes
+ - Ensure each flashcard covers critical information"""
+
+     elif quantity == "more":
+         quantity_instructions = """
+
+ QUANTITY: MORE (55-70 flashcards)
+ - Create comprehensive coverage of the document
+ - Include both major and minor concepts
+ - Cover details, examples, and supporting information
+ - Create flashcards for specific facts, dates, names, and procedures
+ - Ensure thorough coverage of all important topics"""
+
+     else:  # standard (default)
+         quantity_instructions = """
+
+ QUANTITY: STANDARD (35-40 flashcards)
+ - Provide balanced coverage of important topics
+ - Include both core concepts and important details
+ - Mix fundamental and intermediate-level questions
+ - Cover the most significant information comprehensively"""
+
+     return base_prompt + difficulty_instructions + quantity_instructions
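The flashcard prompt insists on a bare JSON array, but models sometimes wrap their output in a Markdown code fence anyway. A small, defensive parser for the expected response shape — this helper is my own sketch of how the response might be consumed, not code from the repository:

```python
import json

def parse_flashcards(raw):
    """Parse a JSON array of {question, answer} dicts, tolerating a Markdown fence."""
    text = raw.strip()
    if text.startswith("```"):
        text = text.strip("`")
        # drop an optional leading language tag like 'json'
        text = text.split("\n", 1)[1] if "\n" in text else text
    cards = json.loads(text)
    # keep only well-formed cards with non-empty question and answer
    return [c for c in cards if c.get("question") and c.get("answer")]

plain = '[{"question": "What is X?", "answer": "Y."}]'
fenced = '```json\n[{"question": "What is X?", "answer": "Y."}]\n```'
print(parse_flashcards(plain))
print(parse_flashcards(fenced))
```

Filtering out malformed entries keeps a single bad card from failing the whole deck.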
+
+
+ def get_flashcard_topic_prompt(topic: str) -> str:
+     if not topic or topic.strip() == "":
+         return ""
+
+     return f"""
+
+ TOPIC FOCUS: {topic}
+ - Prioritize flashcards related to the specified topic: "{topic}"
+ - Ensure at least 70% of flashcards directly relate to this topic
+ - If the topic is not well covered in the document, focus on the most relevant related concepts
+ - Maintain the specified difficulty and quantity requirements"""
+
+
+ def get_flashcard_explanation_prompt(question: str, language: str = "Japanese") -> str:
+     # Language-specific instructions for explanations
+     if language == "Japanese":
+         language_instruction = """
+ LANGUAGE: JAPANESE
+ - Provide the explanation in Japanese
+ - Use appropriate Japanese terminology and expressions
+ - Ensure the explanation is natural and clear in Japanese
+ - Use polite form (です/ます) for formal educational content"""
+     else:  # English
+         language_instruction = """
+ LANGUAGE: ENGLISH
+ - Provide the explanation in English
+ - Use clear, professional English terminology
+ - Ensure the explanation is grammatically correct and natural
+ - Use appropriate academic language for educational content"""
+
+     # Comprehensive explanation prompt with PDF context
+     return f"""You are an expert tutor. Based on the uploaded PDF document, provide a detailed explanation for the following question:
+
+ {language_instruction}
+
+ Question: {question}
+
+ Please provide:
+ 1. A clear, comprehensive explanation that helps the student understand the concept
+ 2. Context from the PDF document that supports the answer
+ 3. Additional relevant information that enhances understanding
+ 4. Examples or analogies if helpful
+
+ Keep the explanation educational and detailed, drawing specifically from the PDF content."""
+
+
+ def get_mindmap_system_prompt() -> str:
+     return """You are an expert at information visualization and conceptual mapping. Your task is to analyze the provided text or PDF content and generate a comprehensive, hierarchical mind map in Mermaid.js 'mindmap' format.
+
+ INSTRUCTIONS:
+ 1. Identify the central theme and use it as the root node.
+ 2. Extract major categories as first-level branches.
+ 3. Add detailed sub-topics and key facts as supporting branches.
+ 4. Keep node text concise (1-4 words).
+ 5. Ensure the hierarchy is logical and easy to follow.
+ 6. Use Mermaid 'mindmap' syntax.
+
+ EXAMPLE FORMAT:
+ mindmap
+   root((Central Topic))
+     Topic A
+       Subtopic A1
+       Subtopic A2
+     Topic B
+       Subtopic B1
+
+ IMPORTANT:
+ - Return ONLY the Mermaid code block starting with 'mindmap'.
+ - Do NOT include any introductory or concluding text.
+ - Use indentation (2 spaces) to define hierarchy.
+ - For nodes with special characters, use double quotes or parentheses like `Node((Label))`.
+ """
+
+
+ def get_quiz_system_prompt(language: str = "Japanese") -> str:
+     if language.lower() == "japanese":
+         return """
+ あなたは優秀なクイズ作成AIです。アップロードされた内容を分析し、指定された「難易度」や「トピック」に基づいて日本語でクイズを作成してください。
+
+ 絶対条件(厳守):
+ - 出力は常に下記のJSON形式のみ。
+ - 全ての問題の「answer」は、"1"〜"4" ができるだけ均等に出現するようにします。
+ - 同じ番号が3問以上連続しないようにしてください。
+
+ 出力形式(この形のみ):
+ {
+   "quizzes": [
+     {
+       "question": "問題文",
+       "hint": "ヒント",
+       "choices": { "1": "選択肢1", "2": "選択肢2", "3": "選択肢3", "4": "選択肢4" },
+       "answer": "1|2|3|4 のいずれか",
+       "explanation": "正解の詳細な説明"
+     }
+   ]
+ }
+
+ 作成方針:
+ 1) 各設問について、内容に基づく正解を決め、その正解の内容をランダムな番号の位置に置く。他の選択肢は紛らわしいが誤りの内容にする。
+ 2) explanation には根拠と理由を記載。
+ 3) hint は正解を直接言わずに、考えさせるような内容にする。
+ 4) 質問文は明確かつ簡潔に、選択肢は適切な長さに。
+
+ JSON 以外は一切出力しないでください。
+ """
+     else:
+         return """
+ You are an excellent quiz-creation AI. Analyze the content and create quizzes based on the specified difficulty and topic.
+
+ Hard requirements:
+ - Output ONLY the JSON structure below.
+ - Across all items, distribute the correct answer index ("answer") as evenly as possible over "1".."4".
+ - Do NOT allow the same answer index to appear 3+ times in a row.
+
+ Output format (and nothing else):
+ {
+   "quizzes": [
+     {
+       "question": "Question",
+       "hint": "Hint",
+       "choices": { "1": "Choice 1", "2": "Choice 2", "3": "Choice 3", "4": "Choice 4" },
+       "answer": "1|2|3|4",
+       "explanation": "Detailed reasoning for why this is correct"
+     }
+   ]
+ }
+
+ Creation protocol:
+ 1) For each quiz, determine the correct content and place it at a random position from 1-4, adjusting the other distractors accordingly.
+ 2) explanation must include reasoning grounded in the source content.
+ 3) hint should be helpful without giving away the answer directly.
+ 4) Keep questions clear; choices concise.
+
+ Do not output anything except the JSON.
+ """
+
+
+ from core import constants
+
+ def get_report_prompt(format_key: str, custom_prompt: str = "", language: str = "Japanese") -> str:
+     if format_key == "custom":
+         return custom_prompt
+
+     # Search in constants
+     for option in constants.REPORT_FORMAT_OPTIONS:
+         if option["value"] == format_key:
+             if language == "Japanese":
+                 return option["prompt_jp"]
+             else:
+                 return option["prompt"]
+
+     return custom_prompt
+
+ def get_report_suggestion_prompt(language: str = "Japanese") -> str:
+     if language == "Japanese":
+         return FORMAT_SUGGESTION_PROMPT_JP + "\n\n重要: すべての提案とプロンプトは日本語で書いてください。"
+     else:
+         return FORMAT_SUGGESTION_PROMPT + "\n\nIMPORTANT: Write all suggestions and prompts in English."
+
+
+ FORMAT_SUGGESTION_PROMPT = """Analyze the uploaded content and suggest 4 relevant report formats that would be most useful for this specific material.
+
+ For each suggested format, provide:
+ 1. A descriptive name (2-4 words)
+ 2. A brief description of what the report would contain
+ 3. A detailed prompt for generating that specific report
+
+ Return the response as a JSON object with this structure:
+ {
+   "suggestions": [
+     {
+       "name": "Format Name",
+       "description": "Brief description",
+       "prompt": "Detailed prompt for generating this report"
+     }
+   ]
+ }"""
+
+ FORMAT_SUGGESTION_PROMPT_JP = """アップロードされた内容を分析し、この特定の資料に最も有用な4つの関連レポート形式を提案してください。
+
+ 各提案された形式について、以下を提供してください:
+ 1. 説明的な名前(2-4語)
+ 2. レポートに含まれる内容の簡潔な説明
+ 3. その特定のレポートを生成するための詳細なプロンプト
+
+ 以下の構造のJSONオブジェクトとして応答を返してください:
+ {
+   "suggestions": [
+     {
+       "name": "形式名",
+       "description": "簡潔な説明",
+       "prompt": "このレポートを生成するための詳細なプロンプト"
+     }
+   ]
+ }"""
+
+
+ def get_pdf_text_extraction_prompt() -> str:
+     return """You are an expert text extraction assistant. You have been provided with a PDF document.
+
+ **Task**: Extract all text content from this PDF document.
+
+ **Requirements**:
+ 1. Extract all text content from the PDF in a structured manner
+ 2. Preserve the logical flow and hierarchy of information
+ 3. Maintain section headers, main topics, and subtopics
+
+ **Output Format**:
+ Return the extracted text as plain text with proper formatting:
+ - Use clear paragraph breaks
+ - Maintain heading structure
+ - Keep bullet points or numbered lists intact
+ - Preserve important formatting that conveys meaning
+
+ **Important**:
+ - Do NOT add any additional commentary or explanations
+ - Do NOT summarize - extract the full content
+ - Just return the extracted text content
+ - Make sure the text is complete and can be used for presentation generation"""
+
+
+ def get_video_script_prompt(language: str, total_pages: int) -> str:
+     """
+     Generate the high-fidelity prompt for PDF script generation.
+     """
+     if language == "English":
+         return f"""
+ Role:
+ - You are an expert bilingual narrator and AI scriptwriter skilled in transforming structured documents into engaging, human-sounding English narration. Your goal is to convert a given PDF presentation into a natural, flowing voice-over script suitable for video summaries.
+
+ Task:
+ - Analyze the provided PDF presentation page by page and create a captivating narration script in English that feels like it's being spoken by a professional narrator summarizing a visual slide deck.
+
+ Guidelines:
+ - Carefully read each page's main content and summarize it.
+ - Create a natural, flowing narration script that doesn't sound robotic.
+ - Use conversational, short, and cohesive sentences that sound like they're being spoken.
+ - Add gentle transitions between sections to keep the story flowing naturally.
+ - Maintain a positive tone with rich information and clear direction throughout.
+ - All text (including page titles and key points) should be in English.
+ - Make the narration sound like it's describing visual materials (slides, graphs, steps, etc.) to the listener.
+ - Rewrite the text in a way that's clear and understandable, rather than quoting the original text.
+
+ Output Format (strict JSON only):
+ {{
+   "total_pages": {total_pages},
+   "scripts": [
+     {{
+       "page_number": 1,
+       "page_title": "",
+       "script_text": "",
+       "key_points": [],
+       "duration_estimate": ""
+     }}
+   ],
+   "total_duration_estimate": "about 3-4 minutes"
+ }}
+
+ Important Notes:
+ - Output must be valid JSON only, no extra commentary or Markdown.
+ - Each script_text must be written naturally in English, using a polite, smooth narration tone.
+ - duration_estimate values should be realistic for natural speech.
+ """
+     else:  # Japanese
+         return f"""
+ 役割:
+ - あなたはバイリンガルのナレーター兼AIスクリプトライターであり、構造化されたドキュメントを魅力的で自然な日本語のナレーションに変換できます。目標は、提供されたPDFプレゼンテーションを、動画に適した自然で流れるようなナレーションスクリプトに変換することです。
+
+ タスク:
+ - 提供されたPDFプレゼンテーションをページごとに分析し、理解しやすい日本語のナレーションスクリプトを作成してください。
+
+ ガイドライン:
+ - 各ページの主要コンテンツを注意深く読みます。
+ - ロボットのように聞こえない、自然で流れるようなナレーションスクリプトを作成します。
+ - 会話的で、簡潔で、一貫性のあるトーンで、理解しやすいようにします。
+ - 全体の流れを維持するために、セクション間のスムーズな移行を含めます。
+ - 肯定的で、情報を提供し、明確なトーンを維持します。
+ - すべてのテキスト(ページタイトルと重要なポイントを含む)は日本語で記述する必要があります。
+ - 視聴者がスライド、グラフ、手順などを見ているかのように、視覚的な要素を説明します。
+ - 原文を逐語的に引用することは避けてください。明確で自然な書き方に書き換えてください。
+
+ 出力フォーマット(厳密なJSONのみ):
+ {{
+   "total_pages": {total_pages},
+   "scripts": [
+     {{
+       "page_number": 1,
+       "page_title": "",
+       "script_text": "",
528
+ "key_points": [],
529
+ "duration_estimate": ""
530
+ }}
531
+ ],
532
+ "total_duration_estimate": "約3〜4分"
533
+ }}
534
+
535
+ 重要事項:
536
+ - 出力は有効なJSON形式のみで、不要なコメントやMarkdown形式を含めないでください。
537
+ - すべてのscript_textは、自然で丁寧な日本語のナレーションスタイルで記述してください。
538
+ - duration_estimate を実際のナレーションに近い現実的な長さに設定します。
539
+ """
540
+
541
+
542
+ def get_outline_prompt(template_yaml_text: str, source_text: str, custom_prompt: str = "", language: str = "Japanese") -> str:
543
+ """アウトライン生成用のプロンプト文を構築する。"""
544
+ extra = (custom_prompt or "").strip()
545
+ if language == "English":
546
+ return (
547
+ "You are an assistant that generates presentation materials from textbook text.\n"
548
+ "You will be given the following 2 items:\n\n"
549
+ "1. `TEMPLATE_YAML`: Slide template definitions\n"
550
+ "2. `SOURCE_TEXT`: Plain text from textbooks or educational materials\n\n"
551
+ "## Objective\n\n"
552
+ "* Read `SOURCE_TEXT` and design an overall outline.\n"
553
+ "* Generate text to fill the placeholders for each selected template.\n"
554
+ "* **IMPORTANT: All generated content in the 'fields' must be written in English language.**\n"
555
+ "* Return in JSON format only.\n\n"
556
+ "## Output Format (Strict)\n\n"
557
+ "{\n \"slides\": [\n {\n \"template\": \"cover|hook|compare|statement|section|define|key|steps|bullets|quote\",\n \"fields\": { \"<PLACEHOLDER>\": \"string\", \"...\": \"...\" }\n }\n ]\n}\n\n"
558
+ + ("## Additional Instructions\n\n" + extra + "\n\n" if extra else "")
559
+ + "## Input\n\n* TEMPLATE_YAML:\n\n" + template_yaml_text + "\n\n* SOURCE_TEXT:\n\n" + source_text
560
+ )
561
+ else:
562
+ return (
563
+ "あなたは「教科書テキストからプレゼン資料を自動生成する」アシスタントです。\n"
564
+ "## 目的\n\n"
565
+ "* `SOURCE_TEXT`を読み、全体のアウトラインを設計。\n"
566
+ "* 各ページで選んだテンプレのプレースホルダーに入れるテキストを生成。\n"
567
+ "* **重要: 'fields' 内の全ての生成コンテンツは日本語で記述すること。**\n"
568
+ "* JSONで返す。\n\n"
569
+ "## 出力フォーマット(厳守)\n\n"
570
+ "{\n \"slides\": [\n {\n \"template\": \"cover|hook|compare|statement|section|define|key|steps|bullets|quote\",\n \"fields\": { \"<PLACEHOLDER>\": \"string\", \"...\": \"...\" }\n }\n ]\n}\n\n"
571
+ + ("## 追加指示\n\n" + extra + "\n\n" if extra else "")
572
+ + "## 入力\n\n* TEMPLATE_YAML:\n\n" + template_yaml_text + "\n\n* SOURCE_TEXT:\n\n" + source_text
573
+ )
core/security.py ADDED
@@ -0,0 +1,27 @@
+ import bcrypt
+ from datetime import datetime, timedelta
+ from typing import Optional, Any
+ from jose import jwt
+ from core.config import settings
+
+ def create_access_token(data: dict, expires_delta: Optional[timedelta] = None) -> str:
+     to_encode = data.copy()
+     if expires_delta:
+         expire = datetime.utcnow() + expires_delta
+     else:
+         expire = datetime.utcnow() + timedelta(minutes=settings.ACCESS_TOKEN_EXPIRE_MINUTES)
+     to_encode.update({"exp": expire})
+     encoded_jwt = jwt.encode(to_encode, settings.SECRET_KEY, algorithm=settings.ALGORITHM)
+     return encoded_jwt
+
+ def verify_password(plain_password: str, hashed_password: str) -> bool:
+     return bcrypt.checkpw(
+         plain_password.encode('utf-8'),
+         hashed_password.encode('utf-8')
+     )
+
+ def get_password_hash(password: str) -> str:
+     pwd_bytes = password.encode('utf-8')
+     salt = bcrypt.gensalt()
+     hashed = bcrypt.hashpw(pwd_bytes, salt)
+     return hashed.decode('utf-8')
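`get_password_hash` / `verify_password` follow the standard salted one-way-hash pattern: a random salt is generated per password, folded into the stored value, and recomputed on verification. The same round-trip can be sketched with only the standard library, using PBKDF2 as a stand-in for bcrypt (`hash_password` and `check_password` are illustrative names, not the repo's API):

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> str:
    # Random per-password salt + one-way hash, stored together,
    # mirroring the shape of get_password_hash() above.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt.hex() + ":" + digest.hex()

def check_password(password: str, stored: str) -> bool:
    # Recompute the digest with the stored salt and compare in constant time.
    salt_hex, digest_hex = stored.split(":")
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return hmac.compare_digest(digest.hex(), digest_hex)

stored = hash_password("s3cret")
print(check_password("s3cret", stored))  # → True
print(check_password("wrong", stored))   # → False
```

Because the salt is random, hashing the same password twice yields different stored values, which is exactly why verification must recompute rather than re-hash and compare.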
core/templates/eng_slide_template.yaml ADDED
@@ -0,0 +1,69 @@
+ slides_template:
+   - name: cover
+     description: Cover slide. Displays the presentation title large and centered.
+     fields:
+       COVER.MAIN: { type: text, max_chars: 40, note: "A short title is recommended; about 20-30 characters if there is no subtitle." }
+
+   - name: hook
+     description: A question or problem statement to engage the audience.
+     fields:
+       HOOK.MAIN: { type: text, max_chars: 60, note: "1-2 lines. A question or an impactful expression." }
+
+   - name: compare
+     description: Compare and contrast in two columns.
+     fields:
+       COMPARE.LEFT: { type: text, max_chars: 80, note: "The left-hand argument or example. Up to 2-3 lines." }
+       COMPARE.RIGHT: { type: text, max_chars: 80, note: "The right-hand argument or example. Up to 2-3 lines." }
+
+   - name: statement
+     description: Presents a key message in one sentence.
+     fields:
+       STATEMENT.MAIN: { type: text, max_chars: 50, note: "A short sentence. One line carrying the key message." }
+
+   - name: section
+     description: Section divider. Number and main heading.
+     fields:
+       SECTION.NUMBER: { type: text, max_chars: 3, note: "A number such as 1, 2, 3." }
+       SECTION.MAIN: { type: text, max_chars: 30, note: "Section title." }
+
+   - name: define
+     description: Defines a term or concept.
+     fields:
+       DEFINE.TITLE: { type: text, max_chars: 10, note: "Title of the definition." }
+       DEFINE.MAIN: { type: text, max_chars: 80, note: "The definition, stated concisely in 1-2 sentences." }
+
+   - name: key
+     description: A slide that emphasizes the key message.
+     fields:
+       KEY.MAIN: { type: text, max_chars: 70, note: "1-2 sentences. Clear, strong wording." }
+
+   - name: steps
+     description: Explains a three-step process.
+     fields:
+       STEPS.TITLE: { type: text, max_chars: 40 }
+       STEPS.TITLE1: { type: text, max_chars: 25 }
+       STEPS.TITLE2: { type: text, max_chars: 25 }
+       STEPS.TITLE3: { type: text, max_chars: 25 }
+       STEPS.TEXT1: { type: text, max_chars: 60 }
+       STEPS.TEXT2: { type: text, max_chars: 60 }
+       STEPS.TEXT3: { type: text, max_chars: 60 }
+     note: "Each step should be a short phrase."
+
+   - name: bullets
+     description: A bulleted list.
+     fields:
+       BULLETS.TITLE: { type: text, max_chars: 30 }
+       BULLETS.TEXT1: { type: text, max_chars: 40 }
+       BULLETS.TEXT2: { type: text, max_chars: 40 }
+       BULLETS.TEXT3: { type: text, max_chars: 40 }
+     note: "Easier to read when kept to 3-4 items."
+
+   - name: quote
+     description: A quote or a sentence to emphasize.
+     fields:
+       QUOTE.MAIN: { type: text, max_chars: 50, note: "A one-line quote. Use white space generously." }
+
+   - name: logo
+     description: Displays the company logo.
+     fields:
+       LOGO.MAIN: { type: text, fixed_value: "app.at-peak.jp" }
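Each placeholder in the template declares a `max_chars` budget that generated text should respect. A caller could enforce those budgets on the model's `fields` output with a few lines of stdlib Python; the `LIMITS` dict below is a hand-copied subset of the template and `over_limit` is a hypothetical helper, not part of the repo:

```python
# Subset of max_chars limits copied from the template above (assumption:
# a real implementation would parse them out of the YAML instead).
LIMITS = {"COVER.MAIN": 40, "HOOK.MAIN": 60, "SECTION.NUMBER": 3}

def over_limit(fields: dict) -> list:
    """Return the placeholder names whose text exceeds its max_chars budget."""
    return [name for name, text in fields.items()
            if name in LIMITS and len(text) > LIMITS[name]]

fields = {"COVER.MAIN": "A short title", "SECTION.NUMBER": "1234"}
print(over_limit(fields))  # → ['SECTION.NUMBER']
```

Violations can then be truncated or sent back to the model for a shorter rewrite.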
core/templates/ja_slide_template.yaml ADDED
@@ -0,0 +1,69 @@
+ slides_template:
+   - name: cover
+     description: 表紙スライド。プレゼンタイトルを大きく中央に表示。
+     fields:
+       COVER.MAIN: { type: text, max_chars: 40, note: "短いタイトルを推奨。副題なしなら20〜30文字程度。" }
+
+   - name: hook
+     description: 聴衆を引き込む質問や問題提起。
+     fields:
+       HOOK.MAIN: { type: text, max_chars: 60, note: "1〜2行。疑問形やインパクトある表現。" }
+
+   - name: compare
+     description: 左右2カラムで比較・対比。
+     fields:
+       COMPARE.LEFT: { type: text, max_chars: 80, note: "左側の主張や例。2〜3行まで。" }
+       COMPARE.RIGHT: { type: text, max_chars: 80, note: "右側の主張や例。2〜3行まで。" }
+
+   - name: statement
+     description: 重要メッセージを1文で提示。
+     fields:
+       STATEMENT.MAIN: { type: text, max_chars: 50, note: "短文。キーメッセージ1行。" }
+
+   - name: section
+     description: セクション切替用。番号と大見出し。
+     fields:
+       SECTION.NUMBER: { type: text, max_chars: 3, note: "1, 2, 3など番号。" }
+       SECTION.MAIN: { type: text, max_chars: 30, note: "章タイトル。" }
+
+   - name: define
+     description: 用語や概念を定義。
+     fields:
+       DEFINE.TITLE: { type: text, max_chars: 10, note: "定義のタイトル。" }
+       DEFINE.MAIN: { type: text, max_chars: 80, note: "定義文。1〜2文で簡潔に。" }
+
+   - name: key
+     description: キーメッセージを強調するスライド。
+     fields:
+       KEY.MAIN: { type: text, max_chars: 70, note: "1〜2文。明確で力強い表現。" }
+
+   - name: steps
+     description: 3ステップのプロセスを説明。
+     fields:
+       STEPS.TITLE: { type: text, max_chars: 40 }
+       STEPS.TITLE1: { type: text, max_chars: 25 }
+       STEPS.TITLE2: { type: text, max_chars: 25 }
+       STEPS.TITLE3: { type: text, max_chars: 25 }
+       STEPS.TEXT1: { type: text, max_chars: 60 }
+       STEPS.TEXT2: { type: text, max_chars: 60 }
+       STEPS.TEXT3: { type: text, max_chars: 60 }
+     note: "各ステップは短いフレーズ推奨。"
+
+   - name: bullets
+     description: 箇条書きリスト。
+     fields:
+       BULLETS.TITLE: { type: text, max_chars: 30 }
+       BULLETS.TEXT1: { type: text, max_chars: 40 }
+       BULLETS.TEXT2: { type: text, max_chars: 40 }
+       BULLETS.TEXT3: { type: text, max_chars: 40 }
+     note: "3〜4項目以内に収めると見やすい。"
+
+   - name: quote
+     description: 名言や強調したい一文を吹き出し表示。
+     fields:
+       QUOTE.MAIN: { type: text, max_chars: 50, note: "引用1行。余白を広く使う。" }
+
+   - name: logo
+     description: 会社ロゴ表示。
+     fields:
+       LOGO.MAIN: { type: text, fixed_value: "app.at-peak.jp" }
main.py ADDED
@@ -0,0 +1,42 @@
+ from fastapi import FastAPI
+ from fastapi.middleware.cors import CORSMiddleware
+ from core.database import init_db
+ from api import auth, sources, podcast, flashcards, mindmaps, quizzes, reports, video_generator, rag, chat
+
+ # Initialize database tables
+ init_db()
+
+ app = FastAPI(
+     title="CreatorStudio AI API",
+     description="Backend for CreatorStudio AI - Podcast and Content Creation Platform",
+     version="0.1.0"
+ )
+
+ # CORS configuration
+ # NOTE: combining allow_credentials=True with wildcard origins may be rejected
+ # by browsers; prefer listing explicit origins in production.
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_credentials=True,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+ # Include routers
+ app.include_router(auth.router)
+ app.include_router(sources.router)
+ app.include_router(podcast.router)
+ app.include_router(flashcards.router)
+ app.include_router(mindmaps.router)
+ app.include_router(quizzes.router)
+ app.include_router(reports.router)
+ app.include_router(video_generator.router)
+ app.include_router(rag.router)
+ app.include_router(chat.router)
+
+ @app.get("/")
+ async def root():
+     return {"message": "Welcome to CreatorStudio AI API. Head to /docs for API documentation."}
+
+ if __name__ == "__main__":
+     import uvicorn
+     uvicorn.run("main:app", host="0.0.0.0", port=8000, reload=True)
models/__init__.py ADDED
File without changes
models/db_models.py ADDED
@@ -0,0 +1,161 @@
+ from sqlalchemy import Column, Integer, String, Boolean, DateTime, ForeignKey, Float, Text, JSON, Unicode, UnicodeText
+ from sqlalchemy.orm import relationship
+ from sqlalchemy.sql import func
+ from core.database import Base
+
+ class User(Base):
+     __tablename__ = "users"
+
+     id = Column(Integer, primary_key=True, index=True)
+     email = Column(Unicode(255), unique=True, index=True, nullable=False)
+     hashed_password = Column(String(255), nullable=False)
+     is_active = Column(Boolean, default=True)
+     created_at = Column(DateTime(timezone=True), server_default=func.now())
+
+     sources = relationship("Source", back_populates="owner")
+     podcasts = relationship("Podcast", back_populates="owner")
+     flashcard_sets = relationship("FlashcardSet", back_populates="owner")
+     mind_maps = relationship("MindMap", back_populates="owner")
+     quiz_sets = relationship("QuizSet", back_populates="owner")
+     reports = relationship("Report", back_populates="owner")
+     video_summaries = relationship("VideoSummary", back_populates="owner")
+     rag_documents = relationship("RAGDocument", back_populates="owner")
+     chat_messages = relationship("ChatMessage", back_populates="owner", cascade="all, delete-orphan")
+
+ class Source(Base):
+     __tablename__ = "sources"
+
+     id = Column(Integer, primary_key=True, index=True)
+     filename = Column(Unicode(255), nullable=False)
+     s3_key = Column(String(512), nullable=False)
+     s3_url = Column(String(1024), nullable=False)
+     size = Column(Integer)
+     user_id = Column(Integer, ForeignKey("users.id"))
+     created_at = Column(DateTime(timezone=True), server_default=func.now())
+
+     owner = relationship("User", back_populates="sources")
+
+ class Podcast(Base):
+     __tablename__ = "podcasts"
+
+     id = Column(Integer, primary_key=True, index=True)
+     title = Column(Unicode(255))
+     s3_key = Column(String(512), nullable=False)
+     s3_url = Column(String(1024), nullable=False)
+     script = Column(UnicodeText)
+     user_id = Column(Integer, ForeignKey("users.id"))
+     created_at = Column(DateTime(timezone=True), server_default=func.now())
+
+     owner = relationship("User", back_populates="podcasts")
+
+ class FlashcardSet(Base):
+     __tablename__ = "flashcard_sets"
+
+     id = Column(Integer, primary_key=True, index=True)
+     title = Column(Unicode(255))
+     difficulty = Column(String(50))
+     user_id = Column(Integer, ForeignKey("users.id"))
+     source_id = Column(Integer, ForeignKey("sources.id"), nullable=True)
+     created_at = Column(DateTime(timezone=True), server_default=func.now())
+
+     owner = relationship("User", back_populates="flashcard_sets")
+     flashcards = relationship("Flashcard", back_populates="flashcard_set", cascade="all, delete-orphan")
+
+ class MindMap(Base):
+     __tablename__ = "mind_maps"
+
+     id = Column(Integer, primary_key=True, index=True)
+     title = Column(Unicode(255))
+     mermaid_code = Column(UnicodeText, nullable=False)
+     user_id = Column(Integer, ForeignKey("users.id"))
+     source_id = Column(Integer, ForeignKey("sources.id"), nullable=True)
+     created_at = Column(DateTime(timezone=True), server_default=func.now())
+
+     owner = relationship("User", back_populates="mind_maps")
+
+ class QuizSet(Base):
+     __tablename__ = "quiz_sets"
+
+     id = Column(Integer, primary_key=True, index=True)
+     title = Column(Unicode(255))
+     difficulty = Column(String(50))
+     user_id = Column(Integer, ForeignKey("users.id"))
+     source_id = Column(Integer, ForeignKey("sources.id"), nullable=True)
+     created_at = Column(DateTime(timezone=True), server_default=func.now())
+
+     owner = relationship("User", back_populates="quiz_sets")
+     questions = relationship("QuizQuestion", back_populates="quiz_set", cascade="all, delete-orphan")
+
+ class QuizQuestion(Base):
+     __tablename__ = "quiz_questions"
+
+     id = Column(Integer, primary_key=True, index=True)
+     quiz_set_id = Column(Integer, ForeignKey("quiz_sets.id"))
+     question = Column(UnicodeText, nullable=False)
+     hint = Column(UnicodeText)
+     choices = Column(JSON, nullable=False)  # Store choices as a JSON object {"1": "...", "2": "...", ...}
+     answer = Column(String(10), nullable=False)  # Storing "1", "2", "3", or "4"
+     explanation = Column(UnicodeText)
+
+     quiz_set = relationship("QuizSet", back_populates="questions")
+
+ class Report(Base):
+     __tablename__ = "reports"
+
+     id = Column(Integer, primary_key=True, index=True)
+     title = Column(Unicode(255))
+     content = Column(UnicodeText, nullable=False)
+     format_key = Column(String(100))
+     user_id = Column(Integer, ForeignKey("users.id"))
+     source_id = Column(Integer, ForeignKey("sources.id"), nullable=True)
+     created_at = Column(DateTime(timezone=True), server_default=func.now())
+
+     owner = relationship("User", back_populates="reports")
+
+ class VideoSummary(Base):
+     __tablename__ = "video_summaries"
+
+     id = Column(Integer, primary_key=True, index=True)
+     title = Column(Unicode(255))
+     s3_key = Column(String(512), nullable=False)
+     s3_url = Column(String(1024), nullable=False)
+     user_id = Column(Integer, ForeignKey("users.id"))
+     source_id = Column(Integer, ForeignKey("sources.id"), nullable=True)
+     created_at = Column(DateTime(timezone=True), server_default=func.now())
+
+     owner = relationship("User", back_populates="video_summaries")
+
+ class Flashcard(Base):
+     __tablename__ = "flashcards"
+
+     id = Column(Integer, primary_key=True, index=True)
+     flashcard_set_id = Column(Integer, ForeignKey("flashcard_sets.id"))
+     question = Column(UnicodeText, nullable=False)
+     answer = Column(UnicodeText, nullable=False)
+
+     flashcard_set = relationship("FlashcardSet", back_populates="flashcards")
+
+ class RAGDocument(Base):
+     __tablename__ = "rag_documents"
+
+     id = Column(Integer, primary_key=True, index=True)
+     filename = Column(Unicode(255), nullable=False)
+     azure_doc_id = Column(String(255), unique=True, index=True)
+     chunk_count = Column(Integer, default=0)
+     user_id = Column(Integer, ForeignKey("users.id"))
+     source_id = Column(Integer, ForeignKey("sources.id"), nullable=True)
+     created_at = Column(DateTime(timezone=True), server_default=func.now())
+
+     owner = relationship("User", back_populates="rag_documents")
+
+ class ChatMessage(Base):
+     __tablename__ = "chat_messages"
+
+     id = Column(Integer, primary_key=True, index=True)
+     user_id = Column(Integer, ForeignKey("users.id"))
+     rag_doc_id = Column(Integer, ForeignKey("rag_documents.id"), nullable=True)
+     role = Column(String(50))  # 'user' or 'assistant'
+     content = Column(UnicodeText, nullable=False)
+     created_at = Column(DateTime(timezone=True), server_default=func.now())
+
+     owner = relationship("User", back_populates="chat_messages")
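Every content table above hangs off `users` through a `user_id` foreign key, with `relationship`/`back_populates` pairs exposing both directions. The underlying table shape can be sketched with stdlib `sqlite3` for the `users` → `sources` pair (columns trimmed for brevity; the schema here is a hand-written illustration, not what SQLAlchemy emits verbatim):

```python
import sqlite3

# In-memory sketch of the users -> sources one-to-many declared above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    hashed_password TEXT NOT NULL
);
CREATE TABLE sources (
    id INTEGER PRIMARY KEY,
    filename TEXT NOT NULL,
    user_id INTEGER REFERENCES users(id)
);
""")
conn.execute("INSERT INTO users (email, hashed_password) VALUES ('a@b.c', 'x')")
conn.execute("INSERT INTO sources (filename, user_id) VALUES ('doc.pdf', 1)")
row = conn.execute(
    "SELECT u.email, s.filename FROM sources s JOIN users u ON s.user_id = u.id"
).fetchone()
print(row)  # → ('a@b.c', 'doc.pdf')
```

In the ORM, the same join is what `user.sources` / `source.owner` traverse lazily.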
models/schemas.py ADDED
@@ -0,0 +1,223 @@
+ from pydantic import BaseModel, EmailStr
+ from typing import List, Optional, Dict, Any
+ from datetime import datetime
+
+ # User Schemas
+ class UserBase(BaseModel):
+     email: EmailStr
+
+ class UserCreate(UserBase):
+     password: str
+
+ class UserLogin(BaseModel):
+     email: EmailStr
+     password: str
+
+ class UserResponse(UserBase):
+     id: int
+     is_active: bool = True
+
+     class Config:
+         from_attributes = True
+
+ # Token Schemas
+ class Token(BaseModel):
+     access_token: str
+     token_type: str
+
+ class TokenData(BaseModel):
+     email: Optional[str] = None
+
+ # Source Schemas
+ class SourceFileResponse(BaseModel):
+     id: int
+     filename: str
+     s3_key: str
+     public_url: str
+     private_url: Optional[str] = None
+     size: int
+     created_at: datetime
+     rag_id: Optional[int] = None
+     azure_doc_id: Optional[str] = None
+
+     class Config:
+         from_attributes = True
+
+ # Podcast Schemas
+ class PodcastAnalyzeRequest(BaseModel):
+     file_key: str
+     duration_minutes: int = 10
+
+ class PodcastGenerateRequest(BaseModel):
+     user_prompt: str
+     model: str = "gpt-4o"
+     duration_minutes: int = 10
+     podcast_format: str = "deep dive"
+     pdf_suggestions: str = ""
+     file_key: Optional[str] = None
+     tts_model: str = "gemini-2.5-flash-preview-tts"
+     spk1_voice: str = "Zephyr"
+     spk2_voice: str = "Charon"
+     bgm_choice: str = "No BGM"
+     temperature: float = 1.0
+
+ # Flashcard Schemas
+ class FlashcardItem(BaseModel):
+     question: str
+     answer: str
+
+ class FlashcardGenerateRequest(BaseModel):
+     file_key: Optional[str] = None
+     text_input: Optional[str] = None
+     difficulty: str = "medium"
+     quantity: str = "standard"
+     topic: Optional[str] = None
+     language: str = "English"
+
+ class FlashcardResponse(BaseModel):
+     id: int
+     question: str
+     answer: str
+
+ class FlashcardSetResponse(BaseModel):
+     id: int
+     title: Optional[str]
+     difficulty: str
+     created_at: datetime
+     flashcards: List[FlashcardResponse]
+
+     class Config:
+         from_attributes = True
+
+ # Mind Map Schemas
+ class MindMapGenerateRequest(BaseModel):
+     file_key: Optional[str] = None
+     text_input: Optional[str] = None
+     title: Optional[str] = None
+
+ class MindMapResponse(BaseModel):
+     title: str
+     mermaid_code: str
+     message: str
+
+ # Quiz Schemas
+ class QuizGenerateRequest(BaseModel):
+     file_key: Optional[str] = None
+     text_input: Optional[str] = None
+     difficulty: str = "medium"
+     topic: Optional[str] = None
+     language: str = "English"
+     count: str = "STANDARD"  # FEWER, STANDARD, MORE
+
+ class QuizQuestionResponse(BaseModel):
+     id: int
+     question: str
+     hint: Optional[str]
+     choices: dict
+     answer: str
+     explanation: Optional[str]
+
+ class QuizSetResponse(BaseModel):
+     id: int
+     title: Optional[str]
+     difficulty: str
+     created_at: datetime
+     questions: List[QuizQuestionResponse]
+
+     class Config:
+         from_attributes = True
+
+ # Report Schemas
+ class ReportFormatSuggestion(BaseModel):
+     name: str
+     description: str
+     prompt: str
+
+ class ReportFormatSuggestionResponse(BaseModel):
+     suggestions: List[ReportFormatSuggestion]
+
+ class ReportGenerateRequest(BaseModel):
+     file_key: Optional[str] = None
+     text_input: Optional[str] = None
+     format_key: str  # briefing_doc, study_guide, blog_post, custom, or suggested_X
+     custom_prompt: Optional[str] = None
+     language: str = "Japanese"
+
+ class ReportResponse(BaseModel):
+     id: int
+     title: str
+     content: str
+     format_key: str
+     created_at: datetime
+
+     class Config:
+         from_attributes = True
+
+ # Video Summary Schemas
+ class VideoSummaryGenerateRequest(BaseModel):
+     file_key: str
+     language: str = "Japanese"
+     voice_name: str = "Kore"  # Kore, Fenrir, etc.
+     use_slides_transformation: bool = True
+     custom_prompt: Optional[str] = ""
+
+ class VideoSummaryResponse(BaseModel):
+     id: int
+     title: str
+     s3_key: str
+     public_url: str
+     private_url: Optional[str] = None
+     created_at: datetime
+
+     class Config:
+         from_attributes = True
+
+ # RAG Schemas
+ class RAGDocumentUploadRequest(BaseModel):
+     source_id: Optional[int] = None  # Link to existing source file
+
+ class RAGSearchRequest(BaseModel):
+     query: str
+     top_k: int = 5
+
+ class RAGDocumentResponse(BaseModel):
+     id: int
+     filename: str
+     azure_doc_id: str
+     blob_url: Optional[str]
+     content_preview: Optional[str]
+     chunk_count: int
+     created_at: datetime
+
+     class Config:
+         from_attributes = True
+
+ class RAGSearchResult(BaseModel):
+     content: str
+     score: float
+     source: str
+     metadata: Dict[str, Any] = {}
+
+ class RAGSearchResponse(BaseModel):
+     results: List[RAGSearchResult]
+     answer: Optional[str] = None
+
+ # RAG Query Request (Simplified)
+ class RAGQueryRequest(BaseModel):
+     file_key: str  # S3 key of the source file
+     query: str
+     top_k: int = 3  # Number of relevant chunks to use
+
+ # Chat Schemas
+ class ChatMessageCreate(BaseModel):
+     query: str  # The user's question or message
+     rag_doc_id: Optional[int] = None  # Optional: Link to a specific document for context
+
+ class ChatMessageResponse(BaseModel):
+     id: int
+     role: str
+     content: str  # keeping 'content' here as it represents the stored/returned textual data
+     rag_doc_id: Optional[int]
+     created_at: datetime
+
+     class Config:
+         from_attributes = True
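Several generate-request schemas (flashcards, mind maps, quizzes, reports) accept both `file_key` and `text_input` as optional, and the services enforce "at least one of them" at call time. That cross-field rule can be sketched with a stdlib dataclass (`FlashcardRequest` below is an illustrative stand-in, not the pydantic model itself, which could express the same rule with a model validator):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FlashcardRequest:
    # Stand-in for FlashcardGenerateRequest: at least one content
    # source must be supplied, mirroring the service-layer check.
    file_key: Optional[str] = None
    text_input: Optional[str] = None

    def __post_init__(self):
        if not self.file_key and not self.text_input:
            raise ValueError("Either file_key or text_input must be provided")

ok = FlashcardRequest(text_input="Photosynthesis notes")
print(ok.text_input)  # → Photosynthesis notes
```

Validating this at the schema boundary would turn the service's runtime `ValueError` into a 422 response before any S3 or OpenAI call is made.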
requirements.txt ADDED
@@ -0,0 +1,27 @@
+ fastapi[all]
+ uvicorn
+ python-multipart
+ boto3
+ python-jose[cryptography]
+ bcrypt
+ pydantic-settings
+ python-dotenv
+ openai
+ google-genai
+ google-api-python-client
+ google-auth
+ google-auth-httplib2
+ google-auth-oauthlib
+ pydub
+ ffmpeg-python
+ sqlalchemy
+ pyodbc
+ moviepy
+ pdf2image
+ Pillow
+ azure-search-documents
+ azure-storage-blob
+ azure-identity
+ PyPDF2
+ tiktoken
+ numpy
services/__init__.py ADDED
File without changes
services/flashcard_service.py ADDED
@@ -0,0 +1,164 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import json
2
+ import logging
3
+ import os
4
+ import tempfile
5
+ from typing import List, Dict, Optional, Any
6
+ import openai
7
+ from botocore.exceptions import ClientError
8
+
9
+ from core.config import settings
10
+ from core.prompts import get_flashcard_system_prompt, get_flashcard_topic_prompt, get_flashcard_explanation_prompt
11
+ from services.s3_service import s3_service
12
+
13
+ logger = logging.getLogger(__name__)
14
+
15
+ class FlashcardService:
16
+ def __init__(self):
17
+ self.openai_client = openai.OpenAI(api_key=settings.OPENAI_API_KEY)
18
+
19
+ async def generate_flashcards(
20
+ self,
21
+ file_key: Optional[str] = None,
22
+ text_input: Optional[str] = None,
23
+ difficulty: str = "medium",
24
+ quantity: str = "standard",
25
+ topic: Optional[str] = None,
26
+ language: str = "English"
27
+ ) -> List[Dict[str, str]]:
28
+ """
29
+ Generates flashcards from either an S3 PDF or direct text input.
30
+ """
31
+ try:
32
+ system_prompt = get_flashcard_system_prompt(difficulty, quantity, language)
33
+ if topic:
34
+ system_prompt += get_flashcard_topic_prompt(topic)
35
+
36
+ content_to_analyze = ""
37
+
38
+ if file_key:
39
+ # Download PDF from S3
40
+ tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".pdf")
41
+ tmp_path = tmp.name
42
+ tmp.close() # Close handle so other processes can access on Windows
43
+
44
+ try:
45
+ s3_service.s3_client.download_file(
46
+ settings.AWS_S3_BUCKET,
47
+ file_key,
48
+ tmp_path
49
+ )
50
+
51
+ with open(tmp_path, "rb") as f:
52
+ uploaded_file = self.openai_client.files.create(
53
+ file=f,
54
+ purpose="assistants"
55
+ )
56
+
57
+ messages = [
58
+ {"role": "system", "content": system_prompt},
59
+ {
60
+ "role": "user",
61
+ "content": [
62
+ {
63
+ "type": "file",
64
+ "file": {"file_id": uploaded_file.id}
65
+ }
66
+ ]
67
+ }
68
+ ]
69
+
70
+ response = self.openai_client.chat.completions.create(
71
+ model="gpt-4o-mini", # Using 4o-mini for efficiency
72
+ messages=messages,
73
+ temperature=0.7
74
+ )
75
+
76
+ # Clean up OpenAI file
77
+ self.openai_client.files.delete(uploaded_file.id)
78
+
79
+ raw_content = response.choices[0].message.content
80
+
81
+ finally:
82
+ if os.path.exists(tmp_path):
83
+ os.remove(tmp_path)
84
+
85
+ elif text_input:
86
+ messages = [
87
+ {"role": "system", "content": system_prompt},
88
+ {"role": "user", "content": text_input}
89
+ ]
90
+ response = self.openai_client.chat.completions.create(
91
+ model="gpt-4o-mini",
92
+ messages=messages,
93
+ temperature=0.7
94
+ )
95
+ raw_content = response.choices[0].message.content
96
+
97
+ else:
98
+ raise ValueError("Either file_key or text_input must be provided")
99
+
100
+ # Parse JSON
101
+            # Remove markdown code blocks if present
+            if "```json" in raw_content:
+                raw_content = raw_content.split("```json")[1].split("```")[0].strip()
+            elif "```" in raw_content:
+                raw_content = raw_content.split("```")[1].split("```")[0].strip()
+
+            flashcards = json.loads(raw_content)
+            return flashcards
+
+        except Exception as e:
+            logger.error(f"Flashcard generation failed: {e}")
+            raise
+
+    async def generate_explanation(self, question: str, file_key: Optional[str] = None, language: str = "English") -> str:
+        """
+        Generates a detailed explanation for a flashcard question.
+        """
+        try:
+            explanation_prompt = get_flashcard_explanation_prompt(question, language)
+
+            if file_key:
+                # Similar logic to generation if PDF context is needed
+                tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".pdf")
+                tmp_path = tmp.name
+                tmp.close()
+
+                try:
+                    s3_service.s3_client.download_file(
+                        settings.AWS_S3_BUCKET,
+                        file_key,
+                        tmp_path
+                    )
+                    with open(tmp_path, "rb") as f:
+                        uploaded_file = self.openai_client.files.create(file=f, purpose="assistants")
+
+                    messages = [
+                        {"role": "system", "content": explanation_prompt},
+                        {"role": "user", "content": [{"type": "file", "file": {"file_id": uploaded_file.id}}]}
+                    ]
+                    response = self.openai_client.chat.completions.create(
+                        model="gpt-4o-mini",
+                        messages=messages
+                    )
+                    self.openai_client.files.delete(uploaded_file.id)
+                    return response.choices[0].message.content
+                finally:
+                    if os.path.exists(tmp_path):
+                        os.remove(tmp_path)
+            else:
+                messages = [
+                    {"role": "system", "content": explanation_prompt},
+                    {"role": "user", "content": f"Please explain the question: {question}"}
+                ]
+                response = self.openai_client.chat.completions.create(
+                    model="gpt-4o-mini",
+                    messages=messages
+                )
+                return response.choices[0].message.content
+
+        except Exception as e:
+            logger.error(f"Explanation generation failed: {e}")
+            raise
+
+flashcard_service = FlashcardService()
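The fence-stripping step above (pulling JSON out of a ```json code block before `json.loads`) can be checked in isolation. This is a minimal sketch; the helper name `strip_code_fences` and the sample reply are ours, not part of the diff.

```python
import json

def strip_code_fences(raw_content: str) -> str:
    # Prefer an explicit ```json fence, then fall back to any fence,
    # mirroring the branch order in generate_flashcards.
    if "```json" in raw_content:
        return raw_content.split("```json")[1].split("```")[0].strip()
    elif "```" in raw_content:
        return raw_content.split("```")[1].split("```")[0].strip()
    return raw_content

reply = "Here are your cards:\n```json\n[{\"q\": \"What is S3?\", \"a\": \"Object storage\"}]\n```"
cards = json.loads(strip_code_fences(reply))
print(cards[0]["q"])  # → What is S3?
```

Note this split-based approach keeps only the first fenced block and would misbehave on nested fences, which is acceptable for single-answer model replies.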
services/mindmap_service.py ADDED
@@ -0,0 +1,107 @@
+import logging
+import os
+import tempfile
+from typing import Optional
+import openai
+from core.config import settings
+from core.prompts import get_mindmap_system_prompt
+from services.s3_service import s3_service
+
+logger = logging.getLogger(__name__)
+
+class MindMapService:
+    def __init__(self):
+        self.openai_client = openai.OpenAI(api_key=settings.OPENAI_API_KEY)
+
+    async def generate_mindmap(
+        self,
+        file_key: Optional[str] = None,
+        text_input: Optional[str] = None
+    ) -> str:
+        """
+        Generates a Mermaid mindmap from either an S3 PDF or direct text input.
+        """
+        try:
+            system_prompt = get_mindmap_system_prompt()
+
+            if file_key:
+                # Download PDF from S3
+                tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".pdf")
+                tmp_path = tmp.name
+                tmp.close()
+
+                try:
+                    s3_service.s3_client.download_file(
+                        settings.AWS_S3_BUCKET,
+                        file_key,
+                        tmp_path
+                    )
+
+                    with open(tmp_path, "rb") as f:
+                        uploaded_file = self.openai_client.files.create(
+                            file=f,
+                            purpose="assistants"
+                        )
+
+                    messages = [
+                        {"role": "system", "content": system_prompt},
+                        {
+                            "role": "user",
+                            "content": [
+                                {
+                                    "type": "file",
+                                    "file": {"file_id": uploaded_file.id}
+                                }
+                            ]
+                        }
+                    ]
+
+                    response = self.openai_client.chat.completions.create(
+                        model="gpt-4o-mini",
+                        messages=messages,
+                        temperature=0.7
+                    )
+
+                    # Clean up OpenAI file
+                    self.openai_client.files.delete(uploaded_file.id)
+
+                    raw_content = response.choices[0].message.content
+
+                finally:
+                    if os.path.exists(tmp_path):
+                        os.remove(tmp_path)
+
+            elif text_input:
+                messages = [
+                    {"role": "system", "content": system_prompt},
+                    {"role": "user", "content": text_input}
+                ]
+                response = self.openai_client.chat.completions.create(
+                    model="gpt-4o-mini",
+                    messages=messages,
+                    temperature=0.7
+                )
+                raw_content = response.choices[0].message.content
+
+            else:
+                raise ValueError("Either file_key or text_input must be provided")
+
+            # Clean up the output
+            if "```mermaid" in raw_content:
+                raw_content = raw_content.split("```mermaid")[1].split("```")[0].strip()
+            elif "```" in raw_content:
+                raw_content = raw_content.split("```")[1].split("```")[0].strip()
+
+            # Ensure it starts with 'mindmap'
+            if "mindmap" not in raw_content.lower():
+                # If the AI missed the header, we might need to handle it,
+                # but usually the prompt is strong.
+                pass
+
+            return raw_content.strip()
+
+        except Exception as e:
+            logger.error(f"Mind map generation failed: {e}")
+            raise
+
+mindmap_service = MindMapService()
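The service leaves the missing-`mindmap`-header case as a no-op `pass`. One possible defensive fallback (our own suggestion, not in the diff) is to prepend a synthetic root so Mermaid still accepts the output:

```python
def ensure_mindmap_header(mermaid_src: str) -> str:
    # Hypothetical helper: prepend a 'mindmap' header if the model omitted it.
    stripped = mermaid_src.strip()
    if not stripped.lower().startswith("mindmap"):
        # Indent the existing lines one level so they nest under the header.
        body = "\n".join("  " + line for line in stripped.splitlines())
        return "mindmap\n" + body
    return stripped

print(ensure_mindmap_header("root((Topic))\n  Child"))
```

Whether indenting under a synthetic header is the right recovery depends on what the prompt guarantees, so the service's current choice to trust the prompt is also defensible.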
services/podcast_service.py ADDED
@@ -0,0 +1,249 @@
+import re
+import os
+import json
+import time
+import struct
+import logging
+import mimetypes
+from datetime import datetime
+from concurrent.futures import ThreadPoolExecutor, as_completed
+from typing import List, Tuple, Optional, Dict
+
+import openai
+from google import genai
+from google.genai import types
+from pydantic import BaseModel
+from pydub import AudioSegment
+
+from core.config import settings
+from core.prompts import SYSTEM_PROMPT, ANALYSIS_PROMPT
+from services.s3_service import s3_service
+from core import constants
+
+logger = logging.getLogger(__name__)
+
+class AnalysisOutput(BaseModel):
+    program_structure: str
+    script: str
+
+class MultiProposalOutput(BaseModel):
+    proposals: List[AnalysisOutput]
+
+# Automatically generate voice choices from constants
+VOICE_CHOICES = {v["value"]: v["value"] for v in constants.PODCAST_VOICES}
+
+BGM_CHOICES = {
+    "No BGM": None,
+    "BGM 1": "assets/bgm/BGM_1.mp3",
+    "BGM 2": "assets/bgm/BGM_2.mp3",
+    "BGM 3": "assets/bgm/BGM_3.mp3"
+}
+
+class PodcastService:
+    def __init__(self):
+        self.openai_client = openai.OpenAI(api_key=settings.OPENAI_API_KEY)
+        self.genai_client = genai.Client(api_key=settings.GEMINI_API_KEY)
+
+    def compute_script_targets(self, duration_minutes: int) -> int:
+        if duration_minutes <= 5: return 2000
+        elif duration_minutes <= 10: return 3000
+        elif duration_minutes <= 15: return 4000
+        else: return 5000
+
+    async def analyze_pdf(self, file_key: str, duration_minutes: int, model: str = "gpt-4o"):
+        # 1. Get file from S3
+        # Since openai files.create needs a file, we download it temporarily
+        temp_path = f"temp_{int(time.time())}.pdf"
+        try:
+            import boto3
+            s3 = boto3.client('s3',
+                              aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
+                              aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
+                              region_name=settings.AWS_REGION)
+            s3.download_file(settings.AWS_S3_BUCKET, file_key, temp_path)
+
+            # 2. Upload to OpenAI
+            with open(temp_path, "rb") as f:
+                file_response = self.openai_client.files.create(file=f, purpose="assistants")
+
+            # 3. Analyze
+            formatted_prompt = ANALYSIS_PROMPT.format(duration_minutes=duration_minutes)
+
+            response = self.openai_client.chat.completions.parse(
+                model=model,
+                messages=[
+                    {"role": "system", "content": formatted_prompt},
+                    {"role": "user", "content": [{"type": "file", "file": {"file_id": file_response.id}}]}
+                ],
+                temperature=1.0,
+                response_format=MultiProposalOutput
+            )
+            return response.choices[0].message.content
+        finally:
+            if os.path.exists(temp_path):
+                os.remove(temp_path)
+
+    async def generate_script(self, user_prompt: str, model: str, duration_minutes: int,
+                              podcast_format: str, pdf_suggestions: str, file_key: Optional[str] = None):
+        target_words = self.compute_script_targets(duration_minutes)
+        formatted_system = SYSTEM_PROMPT.format(
+            target_words=target_words,
+            podcast_format=podcast_format,
+            pdf_suggestions=pdf_suggestions
+        )
+
+        messages = [{"role": "system", "content": formatted_system}]
+
+        temp_path = None
+        if file_key:
+            temp_path = f"temp_gen_{int(time.time())}.pdf"
+            import boto3
+            s3 = boto3.client('s3',
+                              aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
+                              aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
+                              region_name=settings.AWS_REGION)
+            s3.download_file(settings.AWS_S3_BUCKET, file_key, temp_path)
+
+            with open(temp_path, "rb") as f:
+                file_response = self.openai_client.files.create(file=f, purpose="assistants")
+
+            messages.append({
+                "role": "user",
+                "content": [
+                    {"type": "file", "file": {"file_id": file_response.id}},
+                    {"type": "text", "text": user_prompt}
+                ]
+            })
+        else:
+            messages.append({"role": "user", "content": user_prompt})
+
+        try:
+            response = self.openai_client.chat.completions.create(
+                model=model,
+                messages=messages,
+                temperature=1.0,
+                max_completion_tokens=100000
+            )
+            return response.choices[0].message.content
+        finally:
+            if temp_path and os.path.exists(temp_path):
+                os.remove(temp_path)
+
+    def parse_script(self, script: str) -> List[Tuple[str, str]]:
+        dialogs = []
+        # Accept both ASCII ':' and full-width '：' after the speaker label
+        pattern = re.compile(r"^(Speaker [12])[:：]\s*(.*)$", re.MULTILINE)
+        for match in pattern.finditer(script):
+            speaker, text = match.groups()
+            dialogs.append((speaker, text))
+        return dialogs
+
+    def split_script(self, dialogs: List[Tuple[str, str]], chunk_size=20) -> List[str]:
+        chunks = []
+        for i in range(0, len(dialogs), chunk_size):
+            chunk = dialogs[i:i + chunk_size]
+            chunks.append("\n".join([f"{s}: {t}" for s, t in chunk]))
+        return chunks
+
+    def generate_audio_chunk(self, chunk_script: str, tts_model: str, spk1_voice: str,
+                             spk2_voice: str, temperature: float, index: int) -> Optional[str]:
+        try:
+            contents = [types.Content(role="user", parts=[types.Part.from_text(text=chunk_script)])]
+            config = types.GenerateContentConfig(
+                temperature=temperature,
+                response_modalities=["audio"],
+                speech_config=types.SpeechConfig(
+                    multi_speaker_voice_config=types.MultiSpeakerVoiceConfig(
+                        speaker_voice_configs=[
+                            types.SpeakerVoiceConfig(speaker="Speaker 1", voice_config=types.VoiceConfig(
+                                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name=spk1_voice))),
+                            types.SpeakerVoiceConfig(speaker="Speaker 2", voice_config=types.VoiceConfig(
+                                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name=spk2_voice)))
+                        ]
+                    )
+                )
+            )
+
+            audio_data = None
+            mime_type = "audio/wav"
+            for chunk in self.genai_client.models.generate_content_stream(model=tts_model, contents=contents, config=config):
+                if chunk.candidates and chunk.candidates[0].content.parts:
+                    part = chunk.candidates[0].content.parts[0]
+                    if part.inline_data:
+                        audio_data = part.inline_data.data
+                        mime_type = part.inline_data.mime_type
+                        break
+
+            if audio_data:
+                # Basic WAV conversion if needed (simplified from original)
+                if "wav" not in mime_type.lower():
+                    # We usually get raw PCM or similar, need header
+                    audio_data = self._convert_to_wav(audio_data, mime_type)
+
+                path = f"chunk_{index}_{int(time.time())}.wav"
+                with open(path, "wb") as f:
+                    f.write(audio_data)
+                return path
+        except Exception as e:
+            logger.error(f"Error generating chunk {index}: {e}")
+            return None
+
+    def _convert_to_wav(self, audio_data: bytes, mime_type: str) -> bytes:
+        # Simplified conversion
+        rate = 24000
+        if "rate=" in mime_type:
+            try: rate = int(mime_type.split("rate=")[1].split(";")[0])
+            except Exception: pass
+
+        bits = 16
+        num_channels = 1
+        data_size = len(audio_data)
+        header = struct.pack("<4sI4s4sIHHIIHH4sI", b"RIFF", 36 + data_size, b"WAVE", b"fmt ", 16, 1, num_channels, rate, rate * num_channels * (bits // 8), num_channels * (bits // 8), bits, b"data", data_size)
+        return header + audio_data
+
+    async def generate_full_audio(self, script: str, tts_model: str, spk1_voice: str,
+                                  spk2_voice: str, temperature: float, bgm_choice: str):
+        dialogs = self.parse_script(script)
+        chunks = self.split_script(dialogs)
+
+        chunk_paths = [None] * len(chunks)
+        with ThreadPoolExecutor(max_workers=4) as executor:
+            futures = {executor.submit(self.generate_audio_chunk, chunks[i], tts_model, spk1_voice, spk2_voice, temperature, i): i for i in range(len(chunks))}
+            for future in as_completed(futures):
+                idx = futures[future]
+                chunk_paths[idx] = future.result()
+
+        valid_paths = [p for p in chunk_paths if p]
+        if not valid_paths: return None
+
+        # Combine
+        combined = AudioSegment.empty()
+        for p in valid_paths:
+            combined += AudioSegment.from_file(p)
+            combined += AudioSegment.silent(duration=500)
+            os.remove(p)
+
+        final_path = f"final_podcast_{int(time.time())}.wav"
+
+        # Mix BGM
+        bgm_path = BGM_CHOICES.get(bgm_choice)
+        if bgm_path and os.path.exists(bgm_path):
+            bgm = AudioSegment.from_file(bgm_path)
+            # Simple mix: loop BGM, fade in/out
+            if len(bgm) < len(combined) + 10000:
+                bgm = bgm * ((len(combined) + 10000) // len(bgm) + 1)
+
+            bgm = bgm[:len(combined) + 10000]
+            bgm_main = bgm[5000:5000 + len(combined)] - 16
+            bgm_intro = bgm[:5000]
+            bgm_outro = bgm[5000 + len(combined):].fade_out(5000) - 16
+
+            bgm_processed = bgm_intro + bgm_main + bgm_outro
+            combined_with_intro = AudioSegment.silent(duration=5000) + combined + AudioSegment.silent(duration=5000)
+            final_audio = combined_with_intro.overlay(bgm_processed)
+            final_audio.export(final_path, format="wav")
+        else:
+            combined.export(final_path, format="wav")
+
+        return final_path
+
+podcast_service = PodcastService()
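The `_convert_to_wav` helper above wraps raw PCM in a 44-byte RIFF/WAVE header via `struct.pack`. That header logic can be verified standalone; this sketch reuses the same pack format and the service's 16-bit mono assumptions (the function name `pcm_to_wav` is ours):

```python
import struct

def pcm_to_wav(audio_data: bytes, rate: int = 24000) -> bytes:
    # Build the standard 44-byte RIFF/WAVE header for 16-bit mono PCM,
    # mirroring PodcastService._convert_to_wav.
    bits, num_channels = 16, 1
    data_size = len(audio_data)
    byte_rate = rate * num_channels * (bits // 8)
    block_align = num_channels * (bits // 8)
    header = struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + data_size, b"WAVE",
        b"fmt ", 16, 1, num_channels, rate, byte_rate, block_align, bits,
        b"data", data_size,
    )
    return header + audio_data

wav = pcm_to_wav(b"\x00\x00" * 480)  # 480 silent 16-bit samples
print(len(wav))  # → 1004 (44-byte header + 960 bytes of PCM)
```

The `36 + data_size` RIFF chunk size and the `16` in the `fmt ` chunk (PCM header length) come straight from the WAV container layout, so the helper is correct as long as the upstream data really is 16-bit mono PCM at the parsed rate.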
services/quiz_service.py ADDED
@@ -0,0 +1,121 @@
+import json
+import logging
+import os
+import tempfile
+from typing import List, Dict, Optional, Any
+import openai
+
+from core.config import settings
+from core.prompts import get_quiz_system_prompt
+from services.s3_service import s3_service
+
+logger = logging.getLogger(__name__)
+
+class QuizService:
+    def __init__(self):
+        self.openai_client = openai.OpenAI(api_key=settings.OPENAI_API_KEY)
+
+    async def generate_quiz(
+        self,
+        file_key: Optional[str] = None,
+        text_input: Optional[str] = None,
+        difficulty: str = "medium",
+        topic: Optional[str] = None,
+        language: str = "English",
+        count_mode: str = "STANDARD"
+    ) -> List[Dict[str, Any]]:
+        """
+        Generates a quiz from either an S3 PDF or direct text input.
+        """
+        try:
+            # Map count mode to actual numbers
+            counts = {
+                "FEWER": "5-10",
+                "STANDARD": "10-15",
+                "MORE": "20-25"
+            }
+            num_range = counts.get(count_mode, "10-15")
+
+            system_prompt = get_quiz_system_prompt(language).replace("{NUM_QUESTIONS}", num_range)
+
+            if file_key:
+                # Download PDF from S3
+                tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".pdf")
+                tmp_path = tmp.name
+                tmp.close()
+
+                try:
+                    s3_service.s3_client.download_file(
+                        settings.AWS_S3_BUCKET,
+                        file_key,
+                        tmp_path
+                    )
+
+                    with open(tmp_path, "rb") as f:
+                        uploaded_file = self.openai_client.files.create(
+                            file=f,
+                            purpose="assistants"
+                        )
+
+                    user_message = f"Analyze the PDF and create {num_range} questions. Difficulty: {difficulty}."
+                    if topic:
+                        user_message += f" Topic: {topic}."
+
+                    messages = [
+                        {"role": "system", "content": system_prompt},
+                        {
+                            "role": "user",
+                            "content": [
+                                {"type": "text", "text": user_message},
+                                {
+                                    "type": "file",
+                                    "file": {"file_id": uploaded_file.id}
+                                }
+                            ]
+                        }
+                    ]
+
+                    response = self.openai_client.chat.completions.create(
+                        model="gpt-4o-mini",
+                        messages=messages,
+                        response_format={"type": "json_object"},
+                        temperature=0.7
+                    )
+
+                    self.openai_client.files.delete(uploaded_file.id)
+                    raw_content = response.choices[0].message.content
+
+                finally:
+                    if os.path.exists(tmp_path):
+                        os.remove(tmp_path)
+
+            elif text_input:
+                user_message = f"Analyze the text and create {num_range} questions. Difficulty: {difficulty}."
+                if topic:
+                    user_message += f" Topic: {topic}."
+                user_message += f"\n\nText content:\n{text_input}"
+
+                messages = [
+                    {"role": "system", "content": system_prompt},
+                    {"role": "user", "content": user_message}
+                ]
+                response = self.openai_client.chat.completions.create(
+                    model="gpt-4o-mini",
+                    messages=messages,
+                    response_format={"type": "json_object"},
+                    temperature=0.7
+                )
+                raw_content = response.choices[0].message.content
+
+            else:
+                raise ValueError("Either file_key or text_input must be provided")
+
+            data = json.loads(raw_content)
+            # The prompt asks for {"quizzes": [...]}
+            return data.get("quizzes", [])
+
+        except Exception as e:
+            logger.error(f"Quiz generation failed: {e}")
+            raise
+
+quiz_service = QuizService()
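The `count_mode` mapping above is a plain string substitution into the system prompt's `{NUM_QUESTIONS}` placeholder. A standalone sketch (the sample template is invented; the real one comes from `get_quiz_system_prompt`):

```python
# Mirror of the count_mode → question-range mapping in QuizService.
counts = {"FEWER": "5-10", "STANDARD": "10-15", "MORE": "20-25"}

def fill_question_count(prompt_template: str, count_mode: str) -> str:
    # Unknown modes fall back to the STANDARD range, as in the service.
    num_range = counts.get(count_mode, "10-15")
    return prompt_template.replace("{NUM_QUESTIONS}", num_range)

template = "Create {NUM_QUESTIONS} quiz questions from the source."
print(fill_question_count(template, "MORE"))
# → Create 20-25 quiz questions from the source.
```

Using `str.replace` instead of `str.format` avoids breaking if the prompt contains other literal braces.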
services/rag_service.py ADDED
@@ -0,0 +1,243 @@
+import os
+import logging
+import uuid
+from typing import List, Dict, Any, Optional
+from datetime import datetime
+
+from azure.search.documents import SearchClient
+from azure.search.documents.indexes import SearchIndexClient
+from azure.search.documents.indexes.models import (
+    SearchIndex,
+    SimpleField,
+    SearchableField,
+    SearchField,
+    VectorSearch,
+    HnswAlgorithmConfiguration,
+    VectorSearchProfile,
+    SearchFieldDataType
+)
+from azure.core.credentials import AzureKeyCredential
+from openai import AzureOpenAI
+
+from core.config import settings
+
+logger = logging.getLogger(__name__)
+
+class RAGService:
+    def __init__(self):
+        # Azure Search
+        self.search_endpoint = settings.AZURE_SEARCH_ENDPOINT
+        self.search_key = settings.AZURE_SEARCH_KEY
+        self.index_name = settings.AZURE_SEARCH_INDEX_NAME
+
+        # Azure OpenAI for embeddings
+        self.azure_openai_client = AzureOpenAI(
+            api_key=settings.AZURE_OPENAI_API_KEY,
+            api_version=settings.AZURE_OPENAI_API_VERSION,
+            azure_endpoint=settings.AZURE_OPENAI_ENDPOINT.split("/openai/")[0]
+        )
+        self.embedding_deployment = settings.AZURE_OPENAI_DEPLOYMENT_NAME
+
+        # Initialize clients
+        self.search_client = SearchClient(
+            endpoint=self.search_endpoint,
+            index_name=self.index_name,
+            credential=AzureKeyCredential(self.search_key)
+        )
+
+        self.index_client = SearchIndexClient(
+            endpoint=self.search_endpoint,
+            credential=AzureKeyCredential(self.search_key)
+        )
+
+        # Ensure index exists
+        self._ensure_index_exists()
+
+    def _ensure_index_exists(self):
+        """Create or recreate Azure AI Search index if it doesn't exist or is incompatible."""
+        try:
+            existing_index = self.index_client.get_index(self.index_name)
+
+            # Check for required fields
+            required_fields = {"filename", "doc_id", "user_id", "content_vector"}
+            existing_fields = {field.name for field in existing_index.fields}
+
+            if not required_fields.issubset(existing_fields):
+                logger.warning(f"Index {self.index_name} is incompatible. Recreating...")
+                self.index_client.delete_index(self.index_name)
+                self._create_index()
+            else:
+                logger.info(f"Index {self.index_name} exists and is compatible")
+        except Exception:
+            logger.info(f"Creating index {self.index_name}...")
+            self._create_index()
+
+    def _create_index(self):
+        """Create the search index with vector configuration."""
+        fields = [
+            SimpleField(name="id", type=SearchFieldDataType.String, key=True),
+            SearchableField(name="content", type=SearchFieldDataType.String),
+            SearchableField(name="filename", type=SearchFieldDataType.String, filterable=True),
+            SimpleField(name="doc_id", type=SearchFieldDataType.String, filterable=True),
+            SimpleField(name="user_id", type=SearchFieldDataType.String, filterable=True),
+            SimpleField(name="chunk_index", type=SearchFieldDataType.Int32),
+            SimpleField(name="created_at", type=SearchFieldDataType.DateTimeOffset),
+            SearchField(
+                name="content_vector",
+                type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
+                searchable=True,
+                vector_search_dimensions=1536,
+                vector_search_profile_name="my-vector-profile"
+            )
+        ]
+
+        vector_search = VectorSearch(
+            algorithms=[HnswAlgorithmConfiguration(name="my-hnsw")],
+            profiles=[
+                VectorSearchProfile(
+                    name="my-vector-profile",
+                    algorithm_configuration_name="my-hnsw"
+                )
+            ]
+        )
+
+        index = SearchIndex(
+            name=self.index_name,
+            fields=fields,
+            vector_search=vector_search
+        )
+
+        self.index_client.create_index(index)
+        logger.info(f"Created index: {self.index_name}")
+
+    def generate_embeddings(self, texts: List[str]) -> List[List[float]]:
+        """Generate embeddings using Azure OpenAI."""
+        try:
+            embeddings = []
+            for text in texts:
+                response = self.azure_openai_client.embeddings.create(
+                    input=text,
+                    model=self.embedding_deployment
+                )
+                embeddings.append(response.data[0].embedding)
+            return embeddings
+        except Exception as e:
+            logger.error(f"Error generating embeddings: {e}")
+            raise
+
+    def index_document(
+        self,
+        chunks: List[str],
+        filename: str,
+        user_id: int,
+        doc_id: str
+    ) -> int:
+        """Index document chunks with embeddings in Azure Search."""
+        try:
+            # Generate embeddings
+            logger.info(f"Generating embeddings for {len(chunks)} chunks...")
+            embeddings = self.generate_embeddings(chunks)
+
+            # Prepare documents
+            documents = []
+            for idx, (chunk, embedding) in enumerate(zip(chunks, embeddings)):
+                doc = {
+                    "id": f"{doc_id}_{idx}",
+                    "content": chunk,
+                    "filename": filename,
+                    "doc_id": doc_id,
+                    "user_id": str(user_id),
+                    "chunk_index": idx,
+                    "created_at": datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ"),
+                    "content_vector": embedding
+                }
+                documents.append(doc)
+
+            # Upload to search index
+            result = self.search_client.upload_documents(documents=documents)
+            logger.info(f"Indexed {len(documents)} chunks for {filename}")
+
+            return len(documents)
+
+        except Exception as e:
+            logger.error(f"Error indexing document: {e}")
+            raise
+
+    def search_document(
+        self,
+        query: str,
+        doc_id: str,
+        user_id: int,
+        top_k: int = 3
+    ) -> List[Dict[str, Any]]:
+        """Search within a specific document using vector search."""
+        try:
+            # Generate query embedding
+            query_embedding = self.generate_embeddings([query])[0]
+
+            # Vector search with filters
+            from azure.search.documents.models import VectorizedQuery
+
+            vector_query = VectorizedQuery(
+                vector=query_embedding,
+                k_nearest_neighbors=top_k,
+                fields="content_vector"
+            )
+
+            results = self.search_client.search(
+                search_text=None,
+                vector_queries=[vector_query],
+                filter=f"doc_id eq '{doc_id}' and user_id eq '{user_id}'",
+                top=top_k,
+                select=["content", "filename", "chunk_index"]
+            )
+
+            # Format results
+            search_results = []
+            for result in results:
+                search_results.append({
+                    "content": result["content"],
+                    "chunk_index": result.get("chunk_index", 0)
+                })
+
+            return search_results
+
+        except Exception as e:
+            logger.error(f"Error searching document: {e}")
+            raise
+
+    def delete_document(self, doc_id: str):
+        """Delete all chunks of a document from the search index."""
+        try:
+            # Search for all chunks
+            results = self.search_client.search(
+                search_text="*",
+                filter=f"doc_id eq '{doc_id}'",
+                select=["id"],
+                top=1000
+            )
+
+            # Delete all chunks
+            doc_ids = [{"id": r["id"]} for r in results]
+            if doc_ids:
+                self.search_client.delete_documents(documents=doc_ids)
+                logger.info(f"Deleted {len(doc_ids)} chunks for document {doc_id}")
+
+        except Exception as e:
+            logger.error(f"Error deleting document: {e}")
+            raise
+
+    def document_exists(self, doc_id: str, user_id: int) -> bool:
+        """Check if a document is already indexed."""
+        try:
+            results = self.search_client.search(
+                search_text="*",
+                filter=f"doc_id eq '{doc_id}' and user_id eq '{user_id}'",
+                top=1,
+                select=["id"]
+            )
+            return len(list(results)) > 0
+        except Exception:
+            return False
+
+rag_service = RAGService()
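`index_document` builds one search document per chunk, keyed as `{doc_id}_{idx}` with `user_id` stringified to match the OData filters used later. The shape can be sketched without the Azure SDK (the helper name `build_chunk_docs` is ours, and the embedding vectors are shortened placeholders):

```python
from datetime import datetime, timezone

def build_chunk_docs(chunks, filename, user_id, doc_id, embeddings):
    # Build the per-chunk payloads that index_document uploads to Azure AI Search.
    docs = []
    for idx, (chunk, embedding) in enumerate(zip(chunks, embeddings)):
        docs.append({
            "id": f"{doc_id}_{idx}",        # key field: doc_id plus chunk index
            "content": chunk,
            "filename": filename,
            "doc_id": doc_id,
            "user_id": str(user_id),         # stored as string so the OData filter matches
            "chunk_index": idx,
            "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "content_vector": embedding,
        })
    return docs

docs = build_chunk_docs(["intro", "body"], "notes.pdf", 7, "abc123", [[0.1], [0.2]])
print([d["id"] for d in docs])  # → ['abc123_0', 'abc123_1']
```

The string `user_id` matters: `search_document` filters with `user_id eq '{user_id}'`, which only matches if the indexed field value is the same string form.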
services/report_service.py ADDED
@@ -0,0 +1,191 @@
+import json
+import logging
+import os
+import tempfile
+from typing import List, Dict, Optional, Any
+import openai
+
+from core.config import settings
+from core.prompts import get_report_prompt, get_report_suggestion_prompt
+from services.s3_service import s3_service
+
+logger = logging.getLogger(__name__)
+
+class ReportService:
+    def __init__(self):
+        self.openai_client = openai.OpenAI(api_key=settings.OPENAI_API_KEY)
+
+    async def generate_format_suggestions(
+        self,
+        file_key: Optional[str] = None,
+        text_input: Optional[str] = None,
+        language: str = "Japanese"
+    ) -> List[Dict[str, str]]:
+        """
+        Generates 4 AI-suggested report formats based on the content.
+        """
+        try:
+            system_prompt = get_report_suggestion_prompt(language)
+
+            if file_key:
+                # Download PDF from S3
+                tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".pdf")
+                tmp_path = tmp.name
+                tmp.close()
+
+                try:
+                    s3_service.s3_client.download_file(
+                        settings.AWS_S3_BUCKET,
+                        file_key,
+                        tmp_path
+                    )
+
+                    with open(tmp_path, "rb") as f:
+                        uploaded_file = self.openai_client.files.create(
+                            file=f,
+                            purpose="assistants"
+                        )
+
+                    messages = [
+                        {"role": "system", "content": system_prompt},
+                        {
+                            "role": "user",
+                            "content": [
+                                {
+                                    "type": "file",
+                                    "file": {"file_id": uploaded_file.id}
+                                }
+                            ]
+                        }
+                    ]
+
+                    response = self.openai_client.chat.completions.create(
+                        model="gpt-4o-mini",
+                        messages=messages,
+                        response_format={"type": "json_object"},
+                        temperature=0.7
+                    )
+
+                    self.openai_client.files.delete(uploaded_file.id)
+                    raw_content = response.choices[0].message.content
+
+                finally:
+                    if os.path.exists(tmp_path):
+                        os.remove(tmp_path)
+
+            elif text_input:
+                messages = [
+                    {"role": "system", "content": system_prompt},
+                    {"role": "user", "content": f"Analyze this content:\n\n{text_input}"}
+                ]
+                response = self.openai_client.chat.completions.create(
+                    model="gpt-4o-mini",
+                    messages=messages,
+                    response_format={"type": "json_object"},
+                    temperature=0.7
+                )
+                raw_content = response.choices[0].message.content
+
+            else:
+                raise ValueError("Either file_key or text_input must be provided")
+
+            data = json.loads(raw_content)
+            return data.get("suggestions", [])
+
+        except Exception as e:
+            logger.error(f"Format suggestion failed: {e}")
+            return []
+
+    async def generate_report(
+        self,
+        file_key: Optional[str] = None,
+        text_input: Optional[str] = None,
+        format_key: str = "briefing_doc",
+        custom_prompt: Optional[str] = None,
+        language: str = "Japanese"
+    ) -> str:
+        """
+        Generates a full report based on the selected format.
+        """
+        try:
+            base_prompt = get_report_prompt(format_key, custom_prompt or "", language)
+
+            # Language styling instruction
+            if language == "Japanese":
+                system_prompt = (
+                    "あなたは日本語でレポートを作成するAIアシスタントです。すべての回答は日本語で書いてください。\n\n"
+                    f"{base_prompt}\n\n"
+                    "重要: レポート全体を日本語で書いてください。回答はマークダウン形式で、適切な見出し、箇条書き、構造を使用して読みやすくフォーマットしてください。"
+                )
+            else:
+                system_prompt = (
+                    "You are an AI assistant that creates reports in English. Write all responses in English.\n\n"
+                    f"{base_prompt}\n\n"
+                    "IMPORTANT: Write the entire report in English. Please format your response in markdown with proper headings, bullet points, and structure for easy reading."
+                )
+
+            if file_key:
+                # Download PDF from S3
+                tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".pdf")
+                tmp_path = tmp.name
+                tmp.close()
+
+                try:
+                    s3_service.s3_client.download_file(
+                        settings.AWS_S3_BUCKET,
+                        file_key,
+                        tmp_path
+                    )
+
+                    with open(tmp_path, "rb") as f:
+                        uploaded_file = self.openai_client.files.create(
+                            file=f,
+                            purpose="assistants"
+                        )
+
+                    messages = [
+                        {"role": "system", "content": system_prompt},
+                        {
+                            "role": "user",
+                            "content": [
+                                {
+                                    "type": "file",
+                                    "file": {"file_id": uploaded_file.id}
+                                }
+                            ]
+                        }
+                    ]
+
+                    response = self.openai_client.chat.completions.create(
+                        model="gpt-4o-mini",
+                        messages=messages,
+                        temperature=0.7
+                    )
+
+                    self.openai_client.files.delete(uploaded_file.id)
+                    return response.choices[0].message.content
+
+                finally:
+                    if os.path.exists(tmp_path):
+                        os.remove(tmp_path)
+
+            elif text_input:
+                messages = [
+                    {"role": "system", "content": system_prompt},
+                    {"role": "user", "content": f"Please analyze the following content and generate a report based on it:\n\n{text_input}"}
+                ]
+                response = self.openai_client.chat.completions.create(
+                    model="gpt-4o-mini",
+                    messages=messages,
+                    temperature=0.7
+                )
+                return response.choices[0].message.content
+
+            else:
+                raise ValueError("Either file_key or text_input must be provided")
+
+        except Exception as e:
+            logger.error(f"Report generation failed: {e}")
+            raise
+
+report_service = ReportService()
services/s3_service.py ADDED
@@ -0,0 +1,103 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import boto3
+ from botocore.exceptions import ClientError
+ from core.config import settings
+ import logging
+
+ logger = logging.getLogger(__name__)
+
+ class S3Service:
+ def __init__(self):
+ self.s3_client = boto3.client(
+ 's3',
+ aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
+ aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
+ region_name=settings.AWS_REGION
+ )
+ self.bucket_name = settings.AWS_S3_BUCKET
+
+ def get_public_url(self, key: str):
+ """
+ Generates the standard S3 public URL for a given key.
+ """
+ return f"https://{self.bucket_name}.s3.{settings.AWS_REGION}.amazonaws.com/{key}"
+
+ def get_presigned_url(self, key: str, expires_in: int = 3600):
+ """
+ Generates a pre-signed URL for secure access. Default: 1 hour.
+ """
+ try:
+ url = self.s3_client.generate_presigned_url(
+ 'get_object',
+ Params={'Bucket': self.bucket_name, 'Key': key},
+ ExpiresIn=expires_in
+ )
+ return url
+ except ClientError as e:
+ logger.error(f"Failed to generate presigned URL: {e}")
+ return None
+
+ async def upload_file(self, file_content: bytes, filename: str, user_id: str):
+ """
+ Uploads a file to S3 under a user-specific folder.
+ """
+ key = f"users/{user_id}/sources/{filename}"
+ try:
+ self.s3_client.put_object(
+ Bucket=self.bucket_name,
+ Key=key,
+ Body=file_content
+ )
+ return {
+ "key": key,
+ "public_url": self.get_public_url(key),
+ "private_url": self.get_presigned_url(key)
+ }
+ except ClientError as e:
+ logger.error(f"Failed to upload to S3: {e}")
+ raise Exception("S3 Upload Failed")
+
+ async def list_user_files(self, user_id: str):
+ """
+ Lists files for a specific user.
+ """
+ prefix = f"users/{user_id}/sources/"
+ try:
+ response = self.s3_client.list_objects_v2(
+ Bucket=self.bucket_name,
+ Prefix=prefix
+ )
+ files = []
+ if 'Contents' in response:
+ for obj in response['Contents']:
+ # Remove the prefix from the filename for display
+ filename = obj['Key'].replace(prefix, "")
+ if filename: # Avoid empty strings if the prefix itself is returned
+ files.append({
+ "filename": filename,
+ "key": obj['Key'],
+ "public_url": self.get_public_url(obj['Key']),
+ "private_url": self.get_presigned_url(obj['Key']),
+ "size": obj['Size'],
+ "last_modified": obj['LastModified']
+ })
+ return files
+ except ClientError as e:
+ logger.error(f"Failed to list S3 files: {e}")
+ raise Exception("S3 List Failed")
+
+ async def delete_file(self, key: str):
+ """
+ Deletes a file from S3.
+ """
+ try:
+ self.s3_client.delete_object(
+ Bucket=self.bucket_name,
+ Key=key
+ )
+ logger.info(f"Deleted S3 object: {key}")
+ return True
+ except ClientError as e:
+ logger.error(f"Failed to delete S3 object: {e}")
+ return False
+
+ s3_service = S3Service()
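The service stores every upload under a per-user prefix and, when listing, strips that prefix back off for display. A minimal stdlib sketch of that key scheme — `build_key` and `display_name` are hypothetical helpers added here for illustration, not part of `S3Service`:

```python
# Sketch of the S3 key scheme used by S3Service; no AWS calls involved.
# build_key/display_name are hypothetical names for illustration only.

def build_key(user_id: str, filename: str) -> str:
    # Mirrors upload_file: objects live under a per-user "sources" prefix.
    return f"users/{user_id}/sources/{filename}"

def display_name(key: str, user_id: str) -> str:
    # Mirrors list_user_files: drop the prefix so only the filename remains.
    prefix = f"users/{user_id}/sources/"
    return key.replace(prefix, "")

key = build_key("u123", "report.pdf")
print(key)                        # users/u123/sources/report.pdf
print(display_name(key, "u123"))  # report.pdf
```

Note the `if filename:` guard in `list_user_files` matters because `list_objects_v2` can return the prefix itself as a zero-byte "folder" object, whose display name would be empty.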
services/slides_video_service.py ADDED
@@ -0,0 +1,463 @@
+ import json
+ from typing import Dict, List, Optional, Any, Tuple
+ import logging
+ import os
+ import tempfile
+ import time
+ import shutil
+ import io
+ import re
+ import wave
+ import yaml
+ import requests
+ import openai
+ from google.cloud import storage
+ from googleapiclient.discovery import build
+ from googleapiclient.http import MediaIoBaseUpload
+ from google.oauth2 import service_account
+ from google.auth.transport.requests import Request
+ from google.oauth2.credentials import Credentials
+ from google import genai
+ from google.genai import types
+ from PIL import Image
+ from pdf2image import convert_from_path
+ from moviepy import ImageClip, AudioFileClip, VideoFileClip, concatenate_videoclips
+
+ from core.config import settings
+ from core.prompts import (
+ get_video_script_prompt,
+ get_pdf_text_extraction_prompt,
+ get_outline_prompt
+ )
+ from services.s3_service import s3_service
+
+ logger = logging.getLogger(__name__)
+
+ # Constants from temp project
+ TEMPLATE_HINT: Dict[str, str] = {
+ "cover": "COVER.MAIN",
+ "hook": "HOOK.MAIN",
+ "section": "SECTION.MAIN",
+ "define": "DEFINE.MAIN",
+ "key": "KEY.MAIN",
+ "statement": "STATEMENT.MAIN",
+ "steps": "STEPS.TITLE",
+ "bullets": "BULLETS.TITLE",
+ "quote": "QUOTE.MAIN",
+ "logo": "LOGO.MAIN",
+ }
+
+ class SlidesVideoService:
+ def __init__(self):
+ self.openai_client = openai.OpenAI(api_key=settings.OPENAI_API_KEY)
+
+ # Match Temp project: Use API Key for Gemini TTS
+ logger.info("Initializing Gemini Client with API Key for Slides (as in Temp project)")
+ self.gemini_client = genai.Client(api_key=settings.GEMINI_API_KEY)
+
+ self.scopes = [
+ "https://www.googleapis.com/auth/drive",
+ "https://www.googleapis.com/auth/presentations"
+ ]
+
+ def _get_sa_info(self) -> Optional[Dict[str, Any]]:
+ """Parse GCP_SA_JSON - matches original Temp project logic exactly."""
+ sa_json = os.environ.get("GCP_SA_JSON") or os.environ.get("GCS_SA_JSON")
+ if not sa_json:
+ return None
+ # Just parse it directly like the original
+ return json.loads(sa_json)
+
+ def _get_google_creds(self):
+ """
+ Builds Google credentials from environment variables.
+ Matches Temp project logic.
+ """
+ token_json = settings.GOOGLE_OAUTH_TOKEN_JSON
+ if token_json:
+ creds = Credentials.from_authorized_user_info(json.loads(token_json), self.scopes)
+ if creds and creds.expired and creds.refresh_token:
+ creds.refresh(Request())
+ return creds
+
+ info = self._get_sa_info()
+ if info:
+ return service_account.Credentials.from_service_account_info(info, scopes=self.scopes)
+
+ raise RuntimeError("Google API credentials not configured (GOOGLE_OAUTH_TOKEN_JSON or GCP_SA_JSON required)")
+
+ def _get_clients(self):
+ creds = self._get_google_creds()
+ slides = build("slides", "v1", credentials=creds)
+ drive = build("drive", "v3", credentials=creds)
+ return slides, drive
+
+ async def extract_text_from_pdf(self, pdf_path: str) -> str:
+ """Extract text from PDF using OpenAI."""
+ with open(pdf_path, "rb") as f:
+ openai_file = self.openai_client.files.create(file=f, purpose="assistants")
+
+ prompt = get_pdf_text_extraction_prompt()
+ response = self.openai_client.chat.completions.create(
+ model="gpt-4o-mini",
+ messages=[
+ {
+ "role": "user",
+ "content": [{"type": "text", "text": prompt}, {"type": "file", "file": {"file_id": openai_file.id}}]
+ }
+ ],
+ temperature=0
+ )
+ text = response.choices[0].message.content
+ self.openai_client.files.delete(openai_file.id)
+ return text
+
+ async def generate_outline(self, source_text: str, language: str = "Japanese", custom_prompt: str = "") -> Dict[str, Any]:
+ """Step 1: Generate Slide Outline (JSON) from text."""
+ template_path = "core/templates/ja_slide_template.yaml" if language == "Japanese" else "core/templates/eng_slide_template.yaml"
+ if not os.path.exists(template_path):
+ # Fallback if I missed copying
+ template_path = f"Temp/AI-Video-Summary-Generator/{'ja' if language == 'Japanese' else 'eng'}_slide_template.yaml"
+
+ with open(template_path, "r", encoding="utf-8") as f:
+ template_yaml = f.read()
+
+ prompt = get_outline_prompt(template_yaml, source_text, custom_prompt, language)
+
+ response = self.openai_client.chat.completions.create(
+ model="gpt-4o-mini",
+ messages=[{"role": "user", "content": prompt}],
+ temperature=0.2,
+ response_format={"type": "json_object"}
+ )
+ return json.loads(response.choices[0].message.content)
+
+ async def create_slides_and_export_pdf(self, outline: Dict[str, Any], template_filename: str = "slide_template_v001.pptx") -> bytes:
+ """Step 2 & 3: Create Google Slides and export to PDF."""
+ slides_api, drive_api = self._get_clients()
+
+ # 1. Get Template: Try local first, then GCS
+ pptx_path = os.path.join("core", "templates", template_filename)
+ if os.path.exists(pptx_path):
+ with open(pptx_path, "rb") as f:
+ pptx_bytes = f.read()
+ else:
+ logger.info(f"Template {template_filename} not found locally, trying GCS...")
+ try:
+ pptx_bytes = self._download_template_from_gcs(template_filename)
+ except Exception as e:
+ raise FileNotFoundError(f"Template {template_filename} not found locally or on GCS: {e}")
+
+ # 2. Upload and convert
+ media = MediaIoBaseUpload(io.BytesIO(pptx_bytes), mimetype="application/vnd.openxmlformats-officedocument.presentationml.presentation")
+ body = {
+ "name": f"Generated Video Source {int(time.time())}",
+ "mimeType": "application/vnd.google-apps.presentation",
+ }
+
+ folder_id = os.environ.get("DRIVE_FOLDER_ID")
+ if folder_id:
+ body["parents"] = [folder_id]
+
+ created = drive_api.files().create(body=body, media_body=media, supportsAllDrives=True, fields="id").execute()
+ pres_id = created["id"]
+
+ try:
+ # 3. Build slides from outline
+ self._build_from_outline(slides_api, pres_id, outline)
+
+ # 4. Export to PDF
+ pdf_bytes = drive_api.files().export(
+ fileId=pres_id,
+ mimeType="application/pdf",
+ ).execute()
+
+ return pdf_bytes
+ finally:
+ # Cleanup temp presentation
+ try:
+ drive_api.files().delete(fileId=pres_id).execute()
+ except Exception:
+ pass
+
+ def _build_from_outline(self, slides, pres_id, outline):
+ """Port of build_from_outline from temp project."""
+ items = outline.get("slides", [])
+ initial = slides.presentations().get(presentationId=pres_id).execute()
+ original_page_ids = [p["objectId"] for p in initial.get("slides", [])]
+
+ for item in items:
+ tpl = item.get("template", "")
+ fields = item.get("fields", {})
+
+ # Find base page
+ rep_key = TEMPLATE_HINT.get(tpl) or next(iter(fields.keys()), "")
+ base_page = self._find_page(slides, pres_id, rep_key)
+ if not base_page: continue
+
+ # Duplicate
+ resp = slides.presentations().batchUpdate(
+ presentationId=pres_id,
+ body={"requests": [{"duplicateObject": {"objectId": base_page}}]}
+ ).execute()
+ new_page = resp["replies"][0]["duplicateObject"]["objectId"]
+
+ # Move to end
+ pres_detail = slides.presentations().get(presentationId=pres_id).execute()
+ insertion_index = max(0, len(pres_detail.get("slides", [])) - 1)
+ slides.presentations().batchUpdate(
+ presentationId=pres_id,
+ body={"requests": [{
+ "updateSlidesPosition": {
+ "slideObjectIds": [new_page],
+ "insertionIndex": insertion_index
+ }
+ }]}
+ ).execute()
+
+ # Replace text
+ reqs = []
+ for k, v in fields.items():
+ reqs.append({
+ "replaceAllText": {
+ "containsText": {"text": f"{{{{{k}}}}}", "matchCase": False},
+ "replaceText": str(v),
+ "pageObjectIds": [new_page]
+ }
+ })
+ if reqs:
+ slides.presentations().batchUpdate(presentationId=pres_id, body={"requests": reqs}).execute()
+
+ # Cleanup unused placeholders {{...}} on this slide (Matches original implementation)
+ try:
+ self._cleanup_placeholders(slides, pres_id, new_page, fields)
+ except Exception as e:
+ logger.warning(f"Placeholder cleanup failed for slide {new_page}: {e}")
+
+ # Delete originals
+ if original_page_ids:
+ reqs = [{"deleteObject": {"objectId": pid}} for pid in original_page_ids]
+ slides.presentations().batchUpdate(presentationId=pres_id, body={"requests": reqs}).execute()
+
+ def _cleanup_placeholders(self, slides, pres_id, page_id, fields):
+ """Finds all remaining {{TAGS}} and replaces them with empty strings."""
+ pres = slides.presentations().get(presentationId=pres_id).execute()
+ slide = next(s for s in pres.get("slides", []) if s.get("objectId") == page_id)
+
+ found_tags = set()
+ for el in slide.get("pageElements", []):
+ text = el.get("shape", {}).get("text", {})
+ for te in text.get("textElements", []):
+ content = te.get("textRun", {}).get("content", "")
+ for m in re.findall(r"\{\{([A-Z0-9_.-]+)\}\}", content):
+ found_tags.add(m)
+
+ unused = [t for t in found_tags if t not in fields]
+ if unused:
+ reqs = [{
+ "replaceAllText": {
+ "containsText": {"text": f"{{{{{t}}}}}", "matchCase": True},
+ "replaceText": "",
+ "pageObjectIds": [page_id]
+ }
+ } for t in unused]
+ slides.presentations().batchUpdate(presentationId=pres_id, body={"requests": reqs}).execute()
+
+ def _find_page(self, slides, pres_id, placeholder_key):
+ pres = slides.presentations().get(presentationId=pres_id).execute()
+ needle = f"{{{{{placeholder_key}}}}}"
+ for page in pres.get("slides", []):
+ for el in page.get("pageElements", []):
+ text = el.get("shape", {}).get("text", {})
+ for te in text.get("textElements", []):
+ if needle in te.get("textRun", {}).get("content", ""):
+ return page["objectId"]
+ return None
+
+ def _download_template_from_gcs(self, filename: str) -> bytes:
+ """Download template from GCS bucket (mimics Temp project logic)."""
+ bucket_name = settings.GCS_BUCKET
+ if not bucket_name:
+ raise RuntimeError("GCS_BUCKET environment variable is missing")
+
+ # Path in bucket from Temp project: templates/filename
+ object_name = f"templates/{filename}"
+
+ # Use SA if available, else default
+ info = self._get_sa_info()
+ if info:
+ creds = service_account.Credentials.from_service_account_info(info)
+ client = storage.Client(project=info.get("project_id"), credentials=creds)
+ else:
+ client = storage.Client()
+
+ bucket = client.bucket(bucket_name)
+ blob = bucket.blob(object_name)
+ return blob.download_as_bytes()
+
+ async def generate_video_from_pdf_bytes(
+ self,
+ pdf_bytes: bytes,
+ language: str = "Japanese",
+ voice_name: str = "Kore"
+ ) -> Dict[str, Any]:
+ """Step 4, 5, 6: PDF bytes -> Video Pipeline."""
+ temp_dir = tempfile.mkdtemp(prefix="video_final_")
+ try:
+ pdf_path = os.path.join(temp_dir, "source.pdf")
+ with open(pdf_path, "wb") as f:
+ f.write(pdf_bytes)
+
+ # 1. Images
+ images = convert_from_path(pdf_path, dpi=200)
+ total_pages = len(images)
+ image_paths = []
+ for i, img in enumerate(images, start=1):
+ p = os.path.join(temp_dir, f"p_{i:02d}.png")
+ img.save(p, "PNG")
+ image_paths.append(p)
+
+ # 2. Narration Script
+ with open(pdf_path, "rb") as f:
+ openai_file = self.openai_client.files.create(file=f, purpose="assistants")
+
+ prompt = get_video_script_prompt(language, total_pages)
+ resp = self.openai_client.chat.completions.create(
+ model="gpt-4o-mini",
+ messages=[{"role": "user", "content": [{"type": "text", "text": prompt}, {"type": "file", "file": {"file_id": openai_file.id}}]}],
+ response_format={"type": "json_object"},
+ temperature=0.3
+ )
+ script_data = json.loads(resp.choices[0].message.content)
+ scripts = script_data.get("scripts", [])
+ self.openai_client.files.delete(openai_file.id)
+
+ # 3. Audio & Video assembly (similar to existing logic but more refined)
+ page_clips = []
+ target_size = (1920, 1080)
+
+ for i, img_path in enumerate(image_paths):
+ # Skip last slide narration if it's the logo slide (standard logic in temp project)
+ if i < len(scripts) and i < len(image_paths) - 1:
+ text = scripts[i].get("script_text", "")
+ audio_path = os.path.join(temp_dir, f"a_{i}.wav")
+
+ # TTS with fallback
+ try:
+ model_name = "gemini-2.5-flash-preview-tts"
+ logger.info(f"Generating audio for slide {i} using {model_name}...")
+ tts_resp = self.gemini_client.models.generate_content(
+ model=model_name,
+ contents=text,
+ config=types.GenerateContentConfig(
+ response_modalities=["AUDIO"],
+ speech_config=types.SpeechConfig(
+ voice_config=types.VoiceConfig(
+ prebuilt_voice_config=types.PrebuiltVoiceConfig(
+ voice_name=voice_name
+ )
+ )
+ )
+ )
+ )
+ except Exception as tts_err:
+ logger.warning(f"Failed with {model_name}, trying fallback gemini-1.5-flash: {tts_err}")
+ model_name = "gemini-1.5-flash"
+ tts_resp = self.gemini_client.models.generate_content(
+ model=model_name,
+ contents=text,
+ config=types.GenerateContentConfig(
+ response_modalities=["AUDIO"],
+ speech_config=types.SpeechConfig(
+ voice_config=types.VoiceConfig(
+ prebuilt_voice_config=types.PrebuiltVoiceConfig(
+ voice_name=voice_name
+ )
+ )
+ )
+ )
+ )
+ audio_data = tts_resp.candidates[0].content.parts[0].inline_data.data
+ with wave.open(audio_path, "wb") as wf:
+ wf.setnchannels(1); wf.setsampwidth(2); wf.setframerate(24000); wf.writeframes(audio_data)
+
+ aud_clip = AudioFileClip(audio_path)
+ duration = aud_clip.duration
+ img_clip = ImageClip(self._prepare_img(img_path, target_size, temp_dir, i), duration=duration)
+ page_clips.append(img_clip.with_audio(aud_clip))
+ time.sleep(2)
+ else:
+ # Silent 3s for last slide or missing scripts
+ img_clip = ImageClip(self._prepare_img(img_path, target_size, temp_dir, i), duration=3.0)
+ page_clips.append(img_clip)
+
+ final_path = os.path.join(temp_dir, "output.mp4")
+ final_clip = concatenate_videoclips(page_clips, method="compose")
+ final_clip.write_videofile(final_path, fps=24, codec="libx264", audio_codec="aac", logger=None)
+
+ # Cleanup clips
+ for c in page_clips: c.close()
+ final_clip.close()
+
+ # Upload to S3
+ ts = int(time.time())
+ s3_key = f"users/video_summaries/{ts}_summary.mp4"
+ s3_service.s3_client.upload_file(final_path, settings.AWS_S3_BUCKET, s3_key)
+ s3_url = f"https://{settings.AWS_S3_BUCKET}.s3.{settings.AWS_REGION}.amazonaws.com/{s3_key}"
+
+ return {"s3_key": s3_key, "s3_url": s3_url}
+
+ finally:
+ shutil.rmtree(temp_dir, ignore_errors=True)
+
+ def _prepare_img(self, path, size, temp_dir, idx):
+ img = Image.open(path)
+ img.thumbnail(size, Image.Resampling.LANCZOS)
+ new_img = Image.new("RGB", size, (0, 0, 0))
+ new_img.paste(img, ((size[0] - img.size[0]) // 2, (size[1] - img.size[1]) // 2))
+ res_path = os.path.join(temp_dir, f"ready_{idx}.png")
+ new_img.save(res_path)
+ return res_path
+
+ async def generate_transformed_video_summary(
+ self,
+ file_key: str,
+ language: str = "Japanese",
+ voice_name: str = "Kore",
+ custom_prompt: str = ""
+ ) -> Dict[str, Any]:
+ """
+ The Full Transformation Workflow: PDF -> Text -> Outline -> Slides -> PDF -> Video.
+ """
+ temp_dir = tempfile.mkdtemp(prefix="trans_video_")
+ try:
+ # 1. Download original PDF
+ pdf_path = os.path.join(temp_dir, "input.pdf")
+ s3_service.s3_client.download_file(settings.AWS_S3_BUCKET, file_key, pdf_path)
+
+ # 2. Extract Text
+ logger.info("Extracting text from PDF...")
+ source_text = await self.extract_text_from_pdf(pdf_path)
+
+ # 3. Generate Outline
+ logger.info("Generating slide outline...")
+ outline = await self.generate_outline(source_text, language, custom_prompt)
+
+ # 4. Create Slides and Export back to PDF (The Transformation)
+ logger.info("Building Google Slides and exporting...")
+ transformed_pdf_bytes = await self.create_slides_and_export_pdf(outline)
+
+ # 5. Generate Video from the Transformed PDF
+ logger.info("Generating video from transformed slides...")
+ result = await self.generate_video_from_pdf_bytes(transformed_pdf_bytes, language, voice_name)
+
+ return {
+ "title": f"Transformed Summary - {os.path.basename(file_key)}",
+ "s3_key": result["s3_key"],
+ "s3_url": result["s3_url"]
+ }
+
+ finally:
+ shutil.rmtree(temp_dir, ignore_errors=True)
+
+ slides_video_service = SlidesVideoService()
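In `_cleanup_placeholders` above, leftover template tags are found with `re.findall(r"\{\{([A-Z0-9_.-]+)\}\}", ...)`, and each unused tag `t` is blanked by sending `f"{{{{{t}}}}}"` (which renders as `{{T}}`) to `replaceAllText`. A minimal sketch of just the matching logic, with no Slides API — `unused_tags` is a hypothetical helper name:

```python
import re

# Same placeholder pattern as _cleanup_placeholders: uppercase tags like
# {{COVER.MAIN}} with letters, digits, underscore, dot, or hyphen.
PLACEHOLDER = re.compile(r"\{\{([A-Z0-9_.-]+)\}\}")

def unused_tags(text: str, fields: dict) -> list:
    # Tags present in the slide text but not supplied in the outline fields
    # are the ones the service would replace with empty strings.
    found = set(PLACEHOLDER.findall(text))
    return sorted(t for t in found if t not in fields)

text = "{{COVER.MAIN}} and {{SUBTITLE}} remain"
print(unused_tags(text, {"COVER.MAIN": "Title"}))  # ['SUBTITLE']
```

The quintuple-brace f-string is just brace escaping: `{{` emits a literal `{`, so `f"{{{{{t}}}}}"` yields `{{` + the tag + `}}`.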
services/video_generator_service.py ADDED
@@ -0,0 +1,225 @@
+ import json
+ import logging
+ import os
+ import tempfile
+ import time
+ import shutil
+ from typing import List, Dict, Optional, Any
+ import wave
+
+ import openai
+ from google import genai
+ from google.genai import types
+ from PIL import Image
+ from pdf2image import convert_from_path
+ from moviepy import ImageClip, AudioFileClip, VideoFileClip, concatenate_videoclips
+
+ from core.config import settings
+ from core.prompts import get_video_script_prompt
+ from services.s3_service import s3_service
+
+ logger = logging.getLogger(__name__)
+
+ class VideoGeneratorService:
+ def __init__(self):
+ self.openai_client = openai.OpenAI(api_key=settings.OPENAI_API_KEY)
+
+ # Match Temp project: Use API Key for Gemini TTS
+ logger.info("Initializing Gemini Client with API Key (as in Temp project)")
+ self.gemini_client = genai.Client(api_key=settings.GEMINI_API_KEY)
+
+ async def generate_video_summary(
+ self,
+ file_key: str,
+ language: str = "Japanese",
+ voice_name: str = "Kore"
+ ) -> Dict[str, Any]:
+ """
+ Complete pipeline: PDF -> Script -> Audio -> Images -> Video -> S3
+ """
+ temp_dir = tempfile.mkdtemp(prefix="video_gen_")
+ try:
+ # 1. Download PDF from S3
+ pdf_path = os.path.join(temp_dir, "input.pdf")
+ s3_service.s3_client.download_file(settings.AWS_S3_BUCKET, file_key, pdf_path)
+
+ # 2. Convert PDF to Images to get page count and for later use
+ image_dir = os.path.join(temp_dir, "images")
+ os.makedirs(image_dir, exist_ok=True)
+
+ # Poppler check (Windows usually needs path)
+ poppler_path = os.environ.get("POPPLER_PATH")
+ if poppler_path:
+ images = convert_from_path(pdf_path, dpi=200, poppler_path=poppler_path)
+ else:
+ images = convert_from_path(pdf_path, dpi=200)
+
+ total_pages = len(images)
+ image_paths = []
+ for i, img in enumerate(images, start=1):
+ img_path = os.path.join(image_dir, f"page_{i:02d}.png")
+ img.save(img_path, "PNG")
+ image_paths.append(img_path)
+
+ # 3. Generate Narration Script (OpenAI)
+ with open(pdf_path, "rb") as f:
+ openai_file = self.openai_client.files.create(file=f, purpose="assistants")
+
+ # Using the new high-fidelity prompt
+ prompt = get_video_script_prompt(language, total_pages)
+
+ response = self.openai_client.chat.completions.create(
+ model="gpt-4o-mini",
+ messages=[
+ {
+ "role": "user",
+ "content": [
+ {"type": "text", "text": prompt},
+ {"type": "file", "file": {"file_id": openai_file.id}}
+ ]
+ }
+ ],
+ response_format={"type": "json_object"},
+ temperature=0.3
+ )
+
+ script_data = json.loads(response.choices[0].message.content)
+ scripts = script_data.get("scripts", [])
+
+ # Cleanup OpenAI file
+ self.openai_client.files.delete(openai_file.id)
+
+ # 4. Generate Audio for each page (Gemini TTS)
+ audio_dir = os.path.join(temp_dir, "audio")
+ os.makedirs(audio_dir, exist_ok=True)
+ audio_paths = []
+
+ # We iterate through scripts. Usually total_pages.
+ # Mirror original repo: last page (logo) is often skipped for audio.
+ for i, script in enumerate(scripts):
+ # If it's the last page, skip audio (standard behavior in the template project)
+ if i == len(scripts) - 1:
+ logger.info(f"Skipping audio for last page (logo slide)")
+ continue
+
+ page_num = script.get("page_number", i+1)
+ text = script.get("script_text", "")
+ if not text: continue
+
+ audio_path = os.path.join(audio_dir, f"audio_{page_num:02d}.wav")
+
+ # Gemini TTS with fallback
+ try:
+ # Default model from original repo
+ model_name = "gemini-2.5-flash-preview-tts"
+ logger.info(f"Generating audio for page {page_num} using {model_name}...")
+
+ tts_resp = self.gemini_client.models.generate_content(
+ model=model_name,
+ contents=text,
+ config=types.GenerateContentConfig(
+ response_modalities=["AUDIO"],
+ speech_config=types.SpeechConfig(
+ voice_config=types.VoiceConfig(
+ prebuilt_voice_config=types.PrebuiltVoiceConfig(
+ voice_name=voice_name
+ )
+ )
+ )
+ )
+ )
+ except Exception as tts_err:
+ logger.warning(f"Failed with {model_name}, trying fallback gemini-1.5-flash: {tts_err}")
+ # Fallback to a highly stable multimodal model
+ model_name = "gemini-1.5-flash"
+ tts_resp = self.gemini_client.models.generate_content(
+ model=model_name,
+ contents=text,
+ config=types.GenerateContentConfig(
+ response_modalities=["AUDIO"],
+ speech_config=types.SpeechConfig(
+ voice_config=types.VoiceConfig(
+ prebuilt_voice_config=types.PrebuiltVoiceConfig(
+ voice_name=voice_name
+ )
+ )
+ )
+ )
+ )
+
+ audio_bytes = tts_resp.candidates[0].content.parts[0].inline_data.data
+ with wave.open(audio_path, "wb") as wf:
+ wf.setnchannels(1)
+ wf.setsampwidth(2)
+ wf.setframerate(24000)
+ wf.writeframes(audio_bytes)
+
+ audio_paths.append(audio_path)
+ # Rate limiting guard: wait between audio gens
+ time.sleep(3)
+
+ # 5. Combine into individual videos and then final video (MoviePy)
+ page_clips = []
+ target_size = (1920, 1080)
+
+ for i, img_path in enumerate(image_paths):
+ # Match audio if available (some pages might not have script if script gen failed or skipped)
+ # Usually we want 1 image per audio.
+ if i < len(audio_paths):
+ aud_clip = AudioFileClip(audio_paths[i])
+ duration = aud_clip.duration
+
+ # Process image to fit 1080p
+ img = Image.open(img_path)
+ img = self._resize_and_pad(img, target_size)
+ temp_img_res = os.path.join(temp_dir, f"res_{i}.png")
+ img.save(temp_img_res)
+
+ img_clip = ImageClip(temp_img_res, duration=duration)
+ vid_clip = img_clip.with_audio(aud_clip)
+ page_clips.append(vid_clip)
+ else:
+ # Final page or extra pages - silent 3s
+ img = Image.open(img_path)
+ img = self._resize_and_pad(img, target_size)
+ temp_img_res = os.path.join(temp_dir, f"res_{i}.png")
+ img.save(temp_img_res)
+ img_clip = ImageClip(temp_img_res, duration=3.0)
+ page_clips.append(img_clip)
+
+ final_video_path = os.path.join(temp_dir, "final.mp4")
+ final_clip = concatenate_videoclips(page_clips, method="compose")
+ final_clip.write_videofile(final_video_path, fps=24, codec="libx264", audio_codec="aac", logger=None)
+
+ # Cleanup clips
+ for clip in page_clips: clip.close()
+ if final_clip: final_clip.close()
+
+ # 6. Upload to S3
+ timestamp = int(time.time())
+ s3_key = f"users/video_summaries/{timestamp}_summary.mp4"
+ s3_service.s3_client.upload_file(final_video_path, settings.AWS_S3_BUCKET, s3_key)
+ s3_url = f"https://{settings.AWS_S3_BUCKET}.s3.{settings.AWS_REGION}.amazonaws.com/{s3_key}"
+
+ return {
+ "title": f"Video Summary - {os.path.basename(file_key)}",
+ "s3_key": s3_key,
+ "s3_url": s3_url
+ }
+
+ except Exception as e:
+ logger.error(f"Video generation failed: {e}")
+ import traceback
+ traceback.print_exc()
+ raise
+ finally:
+ shutil.rmtree(temp_dir, ignore_errors=True)
+
+ def _resize_and_pad(self, img: Image.Image, size: tuple) -> Image.Image:
+ """Resizes image to fit in size while maintaining aspect ratio, adding black padding."""
+ img.thumbnail(size, Image.Resampling.LANCZOS)
+ new_img = Image.new("RGB", size, (0, 0, 0))
+ new_img.paste(img, ((size[0] - img.size[0]) // 2, (size[1] - img.size[1]) // 2))
+ return new_img
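`_resize_and_pad` (and `_prepare_img` in the slides service) letterboxes each page: `thumbnail` shrinks the image to fit the target while keeping its aspect ratio, and the paste offset centers it on a black 1920x1080 canvas. A rough stdlib sketch of the arithmetic, without PIL — `fit_and_center` is a hypothetical name, and PIL's own rounding may differ by a pixel:

```python
# Approximate geometry of _resize_and_pad: scale to fit the target box
# (never upscaling, like PIL's thumbnail), then center on the canvas.
def fit_and_center(w: int, h: int, tw: int, th: int):
    scale = min(tw / w, th / h, 1.0)          # preserve aspect ratio, no upscale
    nw, nh = int(w * scale), int(h * scale)   # scaled image size
    offset = ((tw - nw) // 2, (th - nh) // 2) # centered paste position
    return (nw, nh), offset

# An ultra-wide 3840x1080 page fits a 1920x1080 frame at half size,
# leaving equal black bands above and below.
size, offset = fit_and_center(3840, 1080, 1920, 1080)
print(size, offset)  # (1920, 540) (0, 270)
```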
224
+
225
+ video_generator_service = VideoGeneratorService()
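Both TTS paths above take the raw audio bytes from the Gemini response and frame them with the stdlib `wave` module as mono, 16-bit, 24 kHz WAV before handing the file to MoviePy. A self-contained sketch of just that wrapping step, using an in-memory buffer instead of a file path — `pcm_to_wav` is a hypothetical helper name:

```python
import io
import wave

# Sketch of the WAV wrapping used in the services: the raw payload is
# treated as mono 16-bit PCM at 24 kHz and framed with the wave module.
def pcm_to_wav(pcm: bytes, rate: int = 24000) -> bytes:
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)    # mono
        wf.setsampwidth(2)    # 16-bit samples
        wf.setframerate(rate)
        wf.writeframes(pcm)
    return buf.getvalue()

pcm = b"\x00\x00" * 24000  # one second of silence at 24 kHz
data = pcm_to_wav(pcm)
with wave.open(io.BytesIO(data)) as wf:
    print(wf.getnframes(), wf.getframerate())  # 24000 24000
```

With 2-byte mono samples, one second of audio is exactly `rate` frames, which is why the header parameters must match the payload the TTS model actually returned.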