Research: Gemini API Migration
Feature Branch: 006-gemini-api-migration
Created: 2025-12-14
Executive Summary
This research consolidates findings for migrating from OpenAI to Google Gemini API. The migration requires using the new google-genai SDK (not the deprecated google-generativeai), with specific patterns for async operations.
Research Findings
1. SDK Selection
Decision: Use google-genai package (new unified SDK)
Rationale:
- The old `google-generativeai` package is deprecated
- New SDK provides unified interface for all Google AI services
- Better async support via the `client.aio` namespace
- Cleaner architecture with centralized `Client` object
Alternatives Considered:
- `google-generativeai` (deprecated, not recommended)
- Direct REST API calls (more complexity, no benefit)
Sources:
2. Model Selection
Decision: Use gemini-2.0-flash-exp for chat/translation/personalization
Rationale:
- Explicitly requested by user
- Experimental model with latest capabilities
- Fast response times suitable for interactive chat
Decision: Use text-embedding-004 for embeddings
Rationale:
- Explicitly requested by user
- Available via Gemini API
- Note: `gemini-embedding-001` is newer (3072 dimensions) but user specified `text-embedding-004`
Sources:
3. Async Pattern
Decision: Use client.aio.models.generate_content() for async operations
Rationale:
- Current codebase uses `asyncio.to_thread()` for OpenAI calls
- New Gemini SDK has native async support via the `client.aio` namespace
- Cleaner than wrapping sync calls in thread pool
Implementation Pattern:
```python
import asyncio

from google import genai

client = genai.Client()

async def main():
    # Async generate content via the client.aio namespace
    response = await client.aio.models.generate_content(
        model='gemini-2.0-flash-exp',
        contents='...'
    )
    print(response.text)

asyncio.run(main())
```
Sources:
4. Conversation History Format
Decision: Map OpenAI message format to Gemini contents format
OpenAI Format (current):
```python
messages = [
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
]
```
Gemini Format (target):
```python
from google.genai import types

contents = [
    types.Content(role="user", parts=[types.Part(text="...")]),
    types.Content(role="model", parts=[types.Part(text="...")])
]
```
Key Differences:
- Gemini uses "model" instead of "assistant"
- System prompts should be prepended to first user message or use system_instruction config
- Parts structure for multi-modal support
Implementation Strategy:
- Use `system_instruction` parameter for system prompts
- Convert history format in `get_chat_response` method
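The conversion strategy above can be sketched as a small helper. The function name `openai_to_gemini` is illustrative, and plain dicts stand in for `types.Content`/`types.Part` objects so the shape of the mapping is visible; the real `get_chat_response` method would build the SDK types instead:

```python
def openai_to_gemini(messages):
    """Convert OpenAI-style chat messages to Gemini-style contents.

    Returns (system_instruction, contents). Plain dicts stand in for
    google.genai types.Content / types.Part objects.
    """
    system_parts = []
    contents = []
    for msg in messages:
        if msg["role"] == "system":
            # System prompts go to system_instruction, not the history
            system_parts.append(msg["content"])
        else:
            # Gemini uses "model" where OpenAI uses "assistant"
            role = "model" if msg["role"] == "assistant" else "user"
            contents.append({"role": role, "parts": [{"text": msg["content"]}]})
    system_instruction = "\n".join(system_parts) if system_parts else None
    return system_instruction, contents

system, contents = openai_to_gemini([
    {"role": "system", "content": "Be concise."},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
])
# system → "Be concise."
# contents → [{"role": "user", ...}, {"role": "model", ...}]
```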
5. JSON Response Format
Decision: Use response_mime_type for JSON output in personalization
Rationale:
- OpenAI uses `response_format={"type": "json_object"}`
- Gemini uses `config.response_mime_type="application/json"`
Implementation Pattern:
```python
from google.genai import types

# Inside an async function, with client = genai.Client() as above:
response = await client.aio.models.generate_content(
    model='gemini-2.0-flash-exp',
    contents='...',
    config=types.GenerateContentConfig(
        response_mime_type="application/json"
    )
)
```
6. Embeddings Implementation
Decision: Use client.models.embed_content() for embeddings
Implementation Pattern:
```python
from google import genai
from google.genai import types

client = genai.Client()

text = "..."  # document text to embed
result = client.models.embed_content(
    model="text-embedding-004",
    contents=text,
    config=types.EmbedContentConfig(task_type="RETRIEVAL_DOCUMENT")
)
embedding = result.embeddings[0].values
```
Embedding Dimensions:
- text-embedding-004: 768 dimensions
- gemini-embedding-001: 3072 dimensions (default), configurable
Qdrant Compatibility Note:
- OpenAI text-embedding-3-small produces 1536-dimensional vectors
- text-embedding-004 produces 768-dimensional vectors
- Existing Qdrant collections will need re-indexing (out of scope per spec)
7. API Key Configuration
Decision: Environment variable GEMINI_API_KEY
Rationale:
- SDK auto-reads from `GEMINI_API_KEY` environment variable
- Consistent with existing pattern (OPENAI_API_KEY → GEMINI_API_KEY)
Implementation:
```python
from google import genai

# SDK reads GEMINI_API_KEY automatically
client = genai.Client()

# Or explicitly (settings is the app's existing configuration object):
client = genai.Client(api_key=settings.GEMINI_API_KEY)
8. Error Handling
Decision: Map Gemini exceptions to existing HTTP error patterns
| Gemini Exception | HTTP Code | Current OpenAI Pattern |
|---|---|---|
| google.api_core.exceptions.InvalidArgument | 400 | Validation errors |
| google.api_core.exceptions.ResourceExhausted | 429 | Rate limiting |
| google.api_core.exceptions.ServiceUnavailable | 503 | Service unavailable |
| google.api_core.exceptions.GoogleAPIError | 500 | Generic error |
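A minimal sketch of this mapping, keyed by exception class name so it needs no Google imports at module load time; the helper name `http_status_for` and the walk up the exception MRO (so subclasses inherit their parent's code) are assumptions, not existing project code:

```python
# Maps Gemini exception class names to the HTTP codes in the table above
GEMINI_ERROR_TO_HTTP = {
    "InvalidArgument": 400,
    "ResourceExhausted": 429,
    "ServiceUnavailable": 503,
    "GoogleAPIError": 500,
}

def http_status_for(exc):
    """Return the HTTP status to surface for a Gemini API exception.

    Walks the exception's class hierarchy so that subclasses of a mapped
    exception resolve to the same code; anything unmapped falls back to 500.
    """
    for cls in type(exc).__mro__:
        if cls.__name__ in GEMINI_ERROR_TO_HTTP:
            return GEMINI_ERROR_TO_HTTP[cls.__name__]
    return 500  # generic fallback, matching the GoogleAPIError row
```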
Technical Decisions Summary
| Aspect | Decision | Confidence |
|---|---|---|
| SDK Package | google-genai | High |
| Chat Model | gemini-2.0-flash-exp | High (user specified) |
| Embedding Model | text-embedding-004 | High (user specified) |
| Async Pattern | client.aio.models.* | High |
| JSON Output | response_mime_type | High |
| System Prompts | system_instruction config | High |
| API Key | GEMINI_API_KEY env var | High |
Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Embedding dimension mismatch | High | High | Document re-indexing requirement |
| Model availability (experimental) | Medium | Medium | Monitor for stable release |
| Response format differences | Low | Low | Thorough testing |
| Rate limit differences | Low | Medium | Monitor and adjust if needed |