
Feature Specification: Gemini API Migration

Feature Branch: 006-gemini-api-migration
Created: 2025-12-14
Status: Draft
Input: User description: "Replace OpenAI API with Google Gemini API for all AI operations"

User Scenarios & Testing (mandatory)

User Story 1 - Chat Response Generation (Priority: P1)

A user sends a chat message through the application. The system uses Google Gemini API (gemini-2.0-flash-exp model) instead of OpenAI to generate a contextual response.

Why this priority: Core functionality - chat is the primary user interaction requiring AI.

Independent Test: Can be fully tested by sending a chat message and verifying the response is generated successfully using the Gemini API.

Acceptance Scenarios:

  1. Given a valid user chat message, When the chat endpoint is called, Then the system returns an AI-generated response from Gemini.

  2. Given a chat message with conversation history, When the chat endpoint is called, Then Gemini processes the full context and returns a relevant response.

  3. Given a valid GEMINI_API_KEY is configured, When any AI operation is performed, Then the system authenticates successfully with Google's API.
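The conversation-history scenario above hinges on a format difference: OpenAI-style history uses `{"role": "user"|"assistant", "content": str}` messages, while Gemini's chat API expects `{"role": "user"|"model", "parts": [str]}`. A minimal conversion sketch, assuming the existing history follows the OpenAI convention:

```python
# Sketch: adapting OpenAI-style chat history to Gemini's expected format.
# Assumes incoming history is a list of {"role": "user"|"assistant", "content": str}
# dicts (the OpenAI convention); Gemini expects {"role": "user"|"model", "parts": [str]}.

def to_gemini_history(history):
    """Convert OpenAI-style messages to Gemini chat history entries."""
    role_map = {"user": "user", "assistant": "model"}
    return [
        {"role": role_map.get(m["role"], "user"), "parts": [m["content"]]}
        for m in history
        if m.get("role") != "system"  # Gemini takes system text via system_instruction
    ]

# The actual call (requires google-generativeai and a configured key):
# import google.generativeai as genai
# genai.configure(api_key=settings.GEMINI_API_KEY)
# model = genai.GenerativeModel("gemini-2.0-flash-exp")
# chat = model.start_chat(history=to_gemini_history(history))
# reply = chat.send_message(prompt).text
```

The system-message handling shown here is an assumption; if the existing service embeds system prompts in the message list, they would instead move to the model's `system_instruction` parameter.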


User Story 2 - Urdu Translation (Priority: P1)

A user submits English content for translation to Urdu. The system uses Google Gemini API to perform the translation instead of OpenAI GPT-4.

Why this priority: Core functionality - translation is an existing feature that must continue working.

Independent Test: Can be fully tested by submitting English text and verifying accurate Urdu translation is returned.

Acceptance Scenarios:

  1. Given English text content, When the translate endpoint is called, Then the system returns Urdu translation generated by Gemini.

  2. Given technical content in English, When translation is requested, Then Gemini provides accurate Urdu translation preserving technical meaning.

  3. Given the translate endpoint receives valid content, When Gemini API is called, Then only the translated Urdu text is returned without additional explanations.
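Scenario 3 (bare translation, no explanations) is typically enforced in the prompt rather than in post-processing. A sketch of one way to phrase it; the exact wording is illustrative, not prescribed by this spec:

```python
# Sketch: a translation prompt that asks Gemini for only the Urdu text,
# per the acceptance scenario that no explanations should accompany the output.

def build_urdu_prompt(content: str) -> str:
    """Build a prompt instructing Gemini to return only the Urdu translation."""
    return (
        "Translate the following English text to Urdu. "
        "Return ONLY the Urdu translation, with no explanations, "
        "no transliteration, and no surrounding quotes.\n\n"
        f"{content}"
    )

# Usage with the Gemini client (not executed here):
# response = genai.GenerativeModel("gemini-2.0-flash-exp").generate_content(
#     build_urdu_prompt(content))
# urdu_text = response.text.strip()
```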


User Story 3 - Content Personalization (Priority: P1)

A user requests content personalization based on their background profile. The system uses Google Gemini API to adapt content complexity instead of OpenAI GPT-4.

Why this priority: Core functionality - personalization is an existing feature that must continue working.

Independent Test: Can be fully tested by submitting content with a user profile and verifying personalized content is returned with adjustments description.

Acceptance Scenarios:

  1. Given content and a user with software_level="beginner", When the personalize endpoint is called, Then Gemini returns simplified content with explanations.

  2. Given content and a user with software_level="advanced", When the personalize endpoint is called, Then Gemini returns content with technical depth.

  3. Given the personalize endpoint receives valid content and user_id, When Gemini API is called, Then a JSON response with personalized_content and adjustments_made fields is returned.
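Scenario 3's structured JSON response needs defensive parsing: Gemini text replies often arrive wrapped in markdown code fences even when asked for raw JSON. A sketch of extracting the payload, using the field names from this spec:

```python
import json
import re

# Sketch: extracting the structured personalization payload from a Gemini reply.
# Gemini text responses often wrap JSON in ```json fences, so this strips them
# before parsing. Field names follow the spec (personalized_content, adjustments_made).

def parse_personalization(raw: str) -> dict:
    """Parse Gemini's reply into the expected JSON payload, tolerating code fences."""
    text = raw.strip()
    fenced = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    data = json.loads(text)
    missing = {"personalized_content", "adjustments_made"} - data.keys()
    if missing:
        raise ValueError(f"Gemini response missing fields: {missing}")
    return data
```

An alternative worth considering is Gemini's `response_mime_type="application/json"` generation setting, which asks the model for raw JSON directly.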


User Story 4 - Embedding Generation (Priority: P1)

The system generates embeddings for text content using Google Gemini's text-embedding-004 model instead of OpenAI embeddings for RAG (Retrieval Augmented Generation) operations.

Why this priority: Core functionality - embeddings are used for document retrieval in RAG service.

Independent Test: Can be fully tested by generating embeddings for sample text and verifying valid vector output.

Acceptance Scenarios:

  1. Given text content requiring embedding, When the embeddings service is called, Then Gemini's text-embedding-004 model returns a valid embedding vector.

  2. Given the RAG service needs to search documents, When embeddings are generated, Then the Gemini embeddings are compatible with existing Qdrant vector storage.

  3. Given multiple text chunks, When embeddings are generated, Then each chunk receives a consistent-dimension vector from Gemini.
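Scenario 3's consistency requirement can be guarded before any Qdrant upsert. The sketch below assumes text-embedding-004's 768-dimension output; the default is an assumption to confirm against the model's documentation:

```python
# Sketch: a guard for scenario 3 -- every chunk's embedding must share one
# dimension before upserting into Qdrant. text-embedding-004 is documented as
# returning 768-dim vectors; the expected_dim default here is an assumption.

def check_embedding_dims(vectors, expected_dim=768):
    """Raise if any embedding deviates from the expected dimension."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"chunk {i}: got {len(vec)}-dim vector, expected {expected_dim}")
    return True

# Generating the vectors themselves (requires google-generativeai):
# result = genai.embed_content(model="models/text-embedding-004", content=chunk)
# vectors.append(result["embedding"])
```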


Edge Cases

  • What happens when GEMINI_API_KEY is not configured? System should raise a clear configuration error at startup.
  • What happens when Gemini API rate limits are exceeded? System should return a 429 error with a retry-after indication.
  • What happens when Gemini API is temporarily unavailable? System should return a 503 error with appropriate error message.
  • What happens when Gemini returns an unexpected response format? System should handle gracefully and return a 500 error with logging.
  • What happens when embedding dimensions differ from OpenAI? Gemini's text-embedding-004 produces 768-dimension vectors, while OpenAI's text-embedding-ada-002 produces 1536, so existing Qdrant collections will likely need re-creation and re-indexing with the new embeddings.
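The 429/503/500 edge cases above can be centralized in one translation step. As an assumption to verify against the library, the google-generativeai client surfaces `google.api_core` exceptions (e.g. `ResourceExhausted` on quota, `ServiceUnavailable` on outages); matching on class name keeps this sketch free of the dependency:

```python
# Sketch of the edge-case mapping above. The exception names are an assumption
# about what the google-generativeai client raises (via google.api_core);
# matching on class name keeps the sketch dependency-free.

STATUS_BY_EXCEPTION = {
    "ResourceExhausted": (429, "Gemini rate limit exceeded; retry later"),
    "ServiceUnavailable": (503, "Gemini API temporarily unavailable"),
}

def map_gemini_error(exc: Exception) -> tuple:
    """Translate a Gemini client exception into an (HTTP status, message) pair."""
    status, message = STATUS_BY_EXCEPTION.get(
        type(exc).__name__, (500, "Unexpected Gemini response"))
    # A real handler would also log exc with a traceback before returning 500.
    return status, message
```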

Requirements (mandatory)

Functional Requirements

  • FR-001: System MUST replace the OpenAI client with Google Generative AI (google-generativeai) library.
  • FR-002: System MUST use gemini-2.0-flash-exp model for chat responses and text generation.
  • FR-003: System MUST use text-embedding-004 model for generating embeddings.
  • FR-004: System MUST rename openai_service.py to gemini_service.py.
  • FR-005: System MUST maintain the same function signatures for all existing methods (get_chat_response, translate_to_urdu, personalize_content).
  • FR-006: System MUST update all import statements in route files (chat.py, translate.py, personalize.py) to use gemini_service.
  • FR-007: System MUST update rag_service.py to import and use GeminiService instead of OpenAIService.
  • FR-008: System MUST remove openai dependency from requirements.txt and add google-generativeai.
  • FR-009: System MUST use GEMINI_API_KEY environment variable instead of OPENAI_API_KEY.
  • FR-010: System MUST update config/settings to read GEMINI_API_KEY instead of OPENAI_API_KEY.
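FR-009/FR-010, combined with the first edge case, amount to fail-fast configuration: the app should refuse to start without `GEMINI_API_KEY`. A minimal sketch; the `Settings` class shape is illustrative (the project may use pydantic or another config pattern):

```python
import os

# Sketch: fail-fast settings per FR-009/FR-010 and the first edge case.
# Field names mirror the spec; the class itself is illustrative.

class Settings:
    GEMINI_MODEL = "gemini-2.0-flash-exp"
    GEMINI_EMBEDDING_MODEL = "text-embedding-004"

    def __init__(self):
        self.GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
        if not self.GEMINI_API_KEY:
            # Raised at import/startup so misconfiguration surfaces immediately.
            raise RuntimeError(
                "GEMINI_API_KEY is not set; configure it in the environment or .env")
```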

Migration Requirements

  • MR-001: All existing chat functionality MUST work identically after migration.
  • MR-002: All existing translation functionality MUST work identically after migration.
  • MR-003: All existing personalization functionality MUST work identically after migration.
  • MR-004: The EmbeddingsService MUST be updated to use Gemini's text-embedding-004 in place of OpenAI embeddings.
  • MR-005: Error handling patterns MUST remain consistent with existing implementation.
  • MR-006: Async operation patterns MUST remain consistent with existing implementation.

Key Entities

  • GeminiService: New service class replacing OpenAIService.

    • get_chat_response(prompt, history): Generate chat response using gemini-2.0-flash-exp
    • translate_to_urdu(content): Translate English to Urdu using gemini-2.0-flash-exp
    • personalize_content(content, software_level, hardware_level, learning_goals): Personalize content based on user profile
  • Environment Configuration:

    • GEMINI_API_KEY: API key for Google Gemini services (replaces OPENAI_API_KEY)

Success Criteria (mandatory)

Measurable Outcomes

  • SC-001: All existing API endpoints (/api/chat, /api/translate, /api/personalize) function correctly with Gemini backend.
  • SC-002: Chat responses are generated within 10 seconds for typical prompts.
  • SC-003: Translation quality is comparable to previous OpenAI implementation (verified by manual testing).
  • SC-004: Personalization maintains the same user-level adaptation quality as before.
  • SC-005: No OpenAI dependencies remain in the codebase after migration.
  • SC-006: All import statements reference gemini_service instead of openai_service.
  • SC-007: The system starts successfully with only GEMINI_API_KEY configured (no OPENAI_API_KEY required).

Assumptions

  • A valid GEMINI_API_KEY will be provided and configured in the environment.
  • Google Gemini API provides comparable functionality to OpenAI for chat, translation, and personalization.
  • The gemini-2.0-flash-exp model is available and suitable for production use.
  • The text-embedding-004 model produces embeddings compatible with Qdrant vector storage operations.
  • Existing conversation history format is compatible with Gemini's expected input format.
  • JSON response format for personalization can be achieved with Gemini.

Out of Scope

  • Re-indexing existing Qdrant embeddings with new Gemini embeddings (may be needed as follow-up).
  • Performance benchmarking comparison between OpenAI and Gemini.
  • Cost analysis comparison between the two providers.
  • Fallback mechanism to OpenAI if Gemini is unavailable.
  • A/B testing between OpenAI and Gemini responses.
  • Migration of any OpenAI-specific features not currently used in the codebase.

Files to Modify

| File | Change Type | Description |
| --- | --- | --- |
| app/services/openai_service.py | Rename/Rewrite | Rename to gemini_service.py, replace OpenAI with Gemini |
| app/services/embeddings_service.py | Rewrite | Replace OpenAI embeddings with Gemini text-embedding-004 |
| app/services/rag_service.py | Modify | Update imports from openai_service to gemini_service |
| app/routes/chat.py | Modify | Update imports from openai_service to gemini_service |
| app/routes/translate.py | Modify | Update imports from openai_service to gemini_service |
| app/routes/personalize.py | Modify | Update imports from openai_service to gemini_service |
| app/config.py | Modify | Replace OPENAI_API_KEY with GEMINI_API_KEY, update model settings |
| requirements.txt | Modify | Remove openai, add google-generativeai |
| .env | Modify | Add GEMINI_API_KEY, can remove OPENAI_API_KEY |
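The requirements.txt and .env changes amount to a small diff; the version pin and key value below are placeholders, not prescribed by this spec:

```diff
 # requirements.txt
-openai
+google-generativeai

 # .env
-OPENAI_API_KEY=sk-...
+GEMINI_API_KEY=<your-gemini-api-key>
```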

Dependencies

  • External: Google Gemini API access with valid API key
  • Library: google-generativeai Python package
  • Internal: Existing service architecture and route patterns