# Technical Specification: Multi-Country & Multi-Language RSS Generation
## Document Information
- **Project Name:** Social Media Post automation
- **Author:** Product Manager
- **Generated:** 2025-11-22
- **Version:** 1.0
## Context Summary
### Documents Available:
- PRD (Product Requirements Document) for LinkedIn community manager enhancements
- Architecture document detailing the system structure
- Keyword frequency analysis implementation documentation
- Existing user authentication and registration flow
- Database schema with profiles table for user metadata
### Project Type:
Brownfield project - LinkedIn community management tool with React frontend and Flask backend
### Existing Stack:
- Frontend: React 18.2.0, Vite, Redux Toolkit, Tailwind CSS
- Backend: Flask 3.1.1, Python 3.9+ (Flask 3.1 and pandas 2.2 no longer support Python 3.8)
- Database: Supabase (PostgreSQL)
- External APIs: LinkedIn API, Google News RSS, Gradio client for AI interactions
- Task Queue: Celery + Redis
- Infrastructure: Docker with docker-compose, Nginx reverse proxy
### Code Structure:
- Well-structured with clear separation between frontend and backend
- Established API patterns and Redux state management
- Existing features include RSS source management, AI content generation, and post scheduling
- User profiles stored in Supabase with JSONB metadata field for additional user information
## Problem Statement
The current LinkedIn post generation system hardcodes the US country and English language parameters when generating RSS feeds from keywords. Because it ignores the user's country and language preferences during content generation, the application's reach outside the US is limited. The system needs to be enhanced to:
1. Collect user's country and language preferences during registration
2. Generate RSS feeds based on the user's country (not just US)
3. Support both English and French languages for the same country
4. Merge dataframes from both languages into one for processing
5. Implement the merging logic in the backend keyword analysis function
## Solution Overview
The solution involves implementing user preferences for country and language that will be used in RSS generation. This includes:
1. Modifying user registration to collect country and language preferences
2. Updating the RSS generation function to use user-specific parameters
3. Creating logic to generate both English and French RSS feeds for the same country
4. Merging the resulting dataframes from both languages
5. Updating the keyword analysis function to handle merged dataframes
## Scope
### In Scope:
- Modifying user registration flow to collect country and language preferences
- Updating the `generate_google_news_rss_from_string` function to accept country/language parameters
- Modifying the RSS generation logic in `ai_agent.py` to generate feeds for both languages
- Implementing dataframe merging logic in the content service for keyword analysis
- Updating user profile management to store country and language preferences
- Modifying the frontend to collect country and language during registration
### Out of Scope:
- Changing the Supabase database schema (using existing JSONB field)
- Modifying the LinkedIn posting functionality itself
- Adding language translation for the UI
- Adding additional languages beyond English and French
## Source Tree Changes
### Backend Changes:
- `backend/api/auth.py` - MODIFY - Add country and language to registration endpoint
- `backend/services/auth_service.py` - MODIFY - Update register_user function to store preferences
- `backend/models/user.py` - MODIFY - Add methods to update user preferences
- `Linkedin_poster_dev/ai_agent.py` - MODIFY - Update `generate_google_news_rss_from_string` function and article_reader function
- `backend/services/content_service.py` - MODIFY - Update `_generate_google_news_rss_from_string` method and implement dataframe merging in keyword analysis
- `backend/api/sources.py` - MODIFY - Add endpoint to update user preferences if needed
### Frontend Changes:
- `frontend/src/pages/Register.jsx` - MODIFY - Add country and language selection during registration
- `frontend/src/pages/Settings.jsx` - CREATE/MODIFY - Add ability to update country/language preferences
- `frontend/src/services/authService.js` - MODIFY - Update registration payload
- `frontend/src/components/CountryLanguageSelector.jsx` - CREATE - New component for country/language selection
## Technical Approach
### Backend Implementation:
Use existing Supabase `profiles` table with `raw_user_meta` JSONB field to store user preferences. The `raw_user_meta` field will contain:
```json
{
"country": "FR",
"language": "fr",
"preferred_languages": ["en", "fr"]
}
```
Follow the existing patterns for user profile updates using the Supabase client. The `generate_google_news_rss_from_string` function will be updated to accept country and language parameters instead of hardcoded "US" and "en".
### Frontend Implementation:
Use React with existing Redux Toolkit state management. Create a reusable component for country and language selection that follows the existing Tailwind CSS design system with colors: primary: #910029, secondary: #39404B, accent: #ECF4F7.
### Dataframe Merging:
Implement pandas-based dataframe concatenation in the content service to merge articles from the English and French RSS feeds for the same country, then drop rows that share an article link so each article appears only once.
## Existing Patterns to Follow
Follow the service pattern established in existing services:
- Use class-based services with constructor dependency injection
- Use async/await for all asynchronous operations
- Throw custom error classes with error codes
- Include JSDoc-style Python docstrings for all public methods
- Follow existing error handling patterns in the application
- Use the existing authentication middleware for all new endpoints
- Follow the existing Redux store patterns in frontend
## Integration Points
### Internal Modules:
- `@/models/user` - Update with country/language preferences
- `@/services/auth_service` - Store user preferences during registration
- `@/services/content_service` - Handle merged dataframes in keyword analysis
- `@/api/auth` - Collect preferences during registration
### External APIs:
- Supabase database integration using existing client
- LinkedIn API for post publishing (unchanged)
- Google News RSS feeds with user-specific parameters
### Configuration:
- No additional environment variables needed
- Update documentation for new user preference handling
## Development Context
### Relevant Existing Code:
- See `backend/services/auth_service.py` for user registration patterns
- Reference `backend/models/user.py` for user data structure
- Follow error handling in `backend/services/content_service.py`
- Use existing country/language detection patterns if available
### Framework/Libraries:
- Flask 3.1.1 (web framework)
- React 18.2.0 (frontend)
- Redux Toolkit 1.8.5 (state management)
- Tailwind CSS (styling)
- Supabase JS client (database/auth)
- Pandas 2.2.2 (data processing)
- Feedparser (RSS parsing)
### Internal Modules:
- `@/services/AuthService` - User registration and authentication
- `@/models/User` - User data structure
- `@/services/ContentService` - Content processing and analysis
- `@/components/Header` and `@/components/Sidebar` - Existing UI components
### Configuration Changes:
- Update README.md to document new user preference functionality
- Add country/language options to frontend form validation
## Existing Conventions
Brownfield project with established conventions:
- JavaScript/TypeScript: camelCase naming, ESLint with React plugin linting
- Python: snake_case naming, PEP 8 compliant
- File organization: Group by feature in frontend, by type in backend
- API endpoints: RESTful patterns with JWT authentication
- Testing: pytest for backend, Jest/React Testing Library for frontend
- Import style: Grouped by external libraries, internal modules, relative imports
- Error handling: Consistent response format with success/error flags
## Implementation Stack
- Runtime: Node.js 20.x (frontend), Python 3.9+ (backend; required by Flask 3.1 and pandas 2.2)
- Framework: Flask 3.1.1 (backend), React 18.2.0 (frontend)
- Language: JavaScript/TypeScript, Python
- Testing: pytest 8.4.1 (backend), Jest (frontend)
- Linting: ESLint 8.57.0 (frontend), flake8 (backend)
- Styling: Tailwind CSS with custom configuration
- Database: Supabase (PostgreSQL)
- State Management: Redux Toolkit
## Technical Details
### User Preference Storage:
Use the existing `profiles` table's `raw_user_meta` JSONB field to store user preferences:
- Store country as ISO 3166-1 alpha-2 code (e.g., "US", "FR", "DE")
- Store language as ISO 639-1 code (e.g., "en", "fr")
- Store additional languages in an array for multi-language support
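The storage rules above could be captured in a small normalizing helper. This is a minimal sketch, not the project's implementation: `build_preference_metadata` and `SUPPORTED_LANGUAGES` are hypothetical names, and placing the primary language first in `preferred_languages` is an assumption of the sketch.

```python
from typing import Dict, List, Optional

# Assumption: only English and French are supported, per the scope of this spec.
SUPPORTED_LANGUAGES = ("en", "fr")


def build_preference_metadata(
    country: str,
    language: str,
    extra_languages: Optional[List[str]] = None,
) -> Dict[str, object]:
    """Return a dict shaped like the `raw_user_meta` example in this spec.

    Hypothetical helper: it only normalizes case and ordering; strict
    validation of the codes is a separate concern.
    """
    country_code = country.strip().upper()  # ISO 3166-1 alpha-2 codes are upper case
    primary = language.strip().lower()      # ISO 639-1 codes are lower case

    preferred = [primary]
    for code in extra_languages or []:
        code = code.strip().lower()
        # Keep only supported languages and avoid repeating the primary one.
        if code in SUPPORTED_LANGUAGES and code not in preferred:
            preferred.append(code)

    return {
        "country": country_code,
        "language": primary,
        "preferred_languages": preferred,
    }
```

The returned dict can be passed to whatever profile-update call the existing Supabase client code already uses.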
### RSS Generation Logic:
- Modify the `generate_google_news_rss_from_string` function to accept user-specific country and language parameters
- For each keyword, generate RSS feeds for both English and French if both are preferred
- When processing RSS feeds in `article_reader`, generate both language feeds for the user's country
- Merge the resulting dataframes, removing duplicates based on article URL
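The per-keyword, per-language loop described above might look roughly like the following. To keep the sketch free of dependencies, a plain list of dicts stands in for the dataframes, and the feed call is injected as a `fetch` callable. `collect_articles`, `Article`, and the `fetch` signature are all hypothetical and not taken from `ai_agent.py`.

```python
from typing import Callable, Dict, Iterable, List

Article = Dict[str, str]
# Hypothetical fetcher signature: (keyword, country, language) -> list of articles.
Fetcher = Callable[[str, str, str], List[Article]]


def collect_articles(
    keywords: Iterable[str],
    country: str,
    languages: Iterable[str],
    fetch: Fetcher,
) -> List[Article]:
    """Fetch every keyword in every preferred language, dropping repeated links."""
    seen_links = set()
    merged: List[Article] = []
    language_list = list(languages)  # allow the iterable to be reused per keyword
    for keyword in keywords:
        for language in language_list:
            for article in fetch(keyword, country, language):
                link = article.get("link")
                if not link or link in seen_links:
                    continue  # skip articles without a link or already collected
                seen_links.add(link)
                merged.append(article)
    return merged
```

Injecting the fetcher keeps the loop testable without network access; in the real code the same shape would wrap the RSS call and return dataframes instead.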
### Dataframe Merging Algorithm:
```python
import pandas as pd


def merge_language_dataframes(df_english, df_french):
    """Merge English and French article dataframes into one deduplicated frame."""
    # Combine both dataframes
    df_combined = pd.concat([df_english, df_french], ignore_index=True)
    # Remove duplicates based on the 'link' column
    df_deduplicated = df_combined.drop_duplicates(subset=['link'], keep='first')
    # Sort by date in descending order (most recent first)
    df_deduplicated = df_deduplicated.sort_values(by='date', ascending=False)
    return df_deduplicated
```
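One caveat about the sort step: RSS 2.0 publication dates are RFC 822 strings such as `Sat, 22 Nov 2025 10:00:00 GMT`, and these do not sort chronologically as text. If the `date` column holds such raw strings (an assumption, since the spec does not say how the column is populated), it may be worth parsing them before sorting. A hedged sketch using only the standard library, with `parse_rss_date` as a hypothetical helper name:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_rss_date(value: str) -> Optional[datetime]:
    """Parse an RFC 822 style date string into a UTC datetime, or None on failure."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # treat naive values as UTC
    return parsed.astimezone(timezone.utc)
```

Applied to the dataframe, this could be something like `df['date'] = df['date'].map(parse_rss_date)` before calling `sort_values`.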
### RSS URL Generation:
- Generate URLs with format: `https://news.google.com/rss/search?q={query}&hl={language}&gl={country}&ceid={country}:{language}`
- For the current user, generate both English and French feeds for their country
- Example: if user is from France, generate feeds for `hl=fr&gl=FR&ceid=FR:fr` and `hl=en&gl=FR&ceid=FR:en`
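The URL format above can be assembled with `urllib.parse.urlencode`, which also takes care of escaping the query string. The sketch below is illustrative only: `build_google_news_rss_url` and `build_feed_urls` are hypothetical names, and the default arguments simply mirror the currently hardcoded "US" and "en".

```python
from typing import Dict, Iterable
from urllib.parse import urlencode

BASE_URL = "https://news.google.com/rss/search"


def build_google_news_rss_url(query: str, country: str = "US", language: str = "en") -> str:
    """Build a feed URL in the format given in this spec."""
    params = {
        "q": query,
        "hl": language,
        "gl": country,
        "ceid": f"{country}:{language}",
    }
    # safe=":" keeps the colon in ceid readable instead of encoding it as %3A.
    return f"{BASE_URL}?{urlencode(params, safe=':')}"


def build_feed_urls(query: str, country: str,
                    languages: Iterable[str] = ("en", "fr")) -> Dict[str, str]:
    """Return one feed URL per language for the same country."""
    return {lang: build_google_news_rss_url(query, country, lang) for lang in languages}
```

For a user in France, `build_feed_urls("machine learning", "FR")` yields one URL with `hl=fr&gl=FR&ceid=FR:fr` and one with `hl=en&gl=FR&ceid=FR:en`, matching the example above.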
### Performance Considerations:
- Cache RSS feeds to prevent excessive API calls
- Implement rate limiting for RSS feed requests
- Use efficient pandas operations for dataframe merging
- Optimize database queries to retrieve user preferences efficiently
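As one possible shape for the caching point, here is a small in-memory cache with a time-to-live per entry. It is a sketch under stated assumptions: `TTLCache` is a hypothetical helper, and in a multi-process deployment the Redis instance already in the stack would likely be a better home for cached feeds.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (hypothetical helper)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], object]) -> object:
        """Return the cached value for key, calling fetch() only when stale or missing."""
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh: skip the network call
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

Using the feed URL as the cache key would let the English and French feeds for the same keyword be cached independently.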
### Security Considerations:
- Validate country and language codes against known standards
- Sanitize user inputs for country and language selection
- Maintain existing JWT authentication for all new endpoints
- Reject any country or language code that fails validation before it is interpolated into an RSS URL, to prevent parameter injection
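A validation step along these lines could run in the registration endpoint before anything is stored. This is a sketch, not the project's code: `validate_preferences`, `InvalidPreferenceError`, and the allowlist are hypothetical, and the country check only verifies the two-letter shape, not that the code is actually assigned in ISO 3166-1.

```python
import re

# Assumption: languages are limited to the two in scope for this spec.
ALLOWED_LANGUAGES = frozenset({"en", "fr"})
# Shape check only: two upper-case ASCII letters.
COUNTRY_PATTERN = re.compile(r"[A-Z]{2}")


class InvalidPreferenceError(ValueError):
    """Hypothetical error type raised for malformed country or language codes."""


def validate_preferences(country: str, language: str) -> None:
    """Raise InvalidPreferenceError unless both codes pass the checks."""
    if not isinstance(country, str) or not COUNTRY_PATTERN.fullmatch(country):
        raise InvalidPreferenceError(f"invalid country code: {country!r}")
    if not isinstance(language, str) or language not in ALLOWED_LANGUAGES:
        raise InvalidPreferenceError(f"unsupported language code: {language!r}")
```

Because only allowlisted or pattern-matched values pass, characters such as `&` or `=` can never reach the RSS URL through these fields.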
### Error Handling:
- Provide default country/language if user preferences are not set
- Handle RSS feed generation failures gracefully
- Log errors appropriately for debugging
- Display user-friendly error messages
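The first point, falling back to defaults when preferences are missing, might be handled by a small resolver like the one below. `resolve_preferences` is a hypothetical name, and defaulting to "US" and "en" is an assumption chosen to preserve today's hardcoded behavior.

```python
from typing import Mapping, Optional, Tuple

# Assumption: fall back to the values that are hardcoded today.
DEFAULT_COUNTRY = "US"
DEFAULT_LANGUAGE = "en"


def resolve_preferences(raw_user_meta: Optional[Mapping[str, object]]) -> Tuple[str, str]:
    """Return (country, language), using defaults for missing or empty values."""
    meta = raw_user_meta or {}
    country = meta.get("country") or DEFAULT_COUNTRY
    language = meta.get("language") or DEFAULT_LANGUAGE
    return str(country), str(language)
```

Existing users whose profiles predate this feature would then keep receiving the same US/English feeds until they set preferences.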
## Development Setup
1. Navigate to project root directory
2. Install backend dependencies: `cd backend && pip install -r requirements.txt`
3. Install frontend dependencies: `cd frontend && npm install`
4. Set up environment variables from .env.example
5. Run the development servers: `npm run dev` (runs both frontend and backend)
6. Run tests: `npm run test` for frontend, `npm run test:backend` for backend
## Implementation Guide
### Setup Steps:
1. Create feature branch from main
2. Verify dev environment is running correctly
3. Review existing user profile management patterns
4. Set up test data for country/language preferences
### Implementation Steps:
#### Story 1: User Preference Collection
1. Update user registration form in `frontend/src/pages/Register.jsx` to include country and language selection
2. Add country/language validation and selection component in `frontend/src/components/CountryLanguageSelector.jsx`
3. Update `authService.js` to include preferences in registration payload
4. Modify `backend/api/auth.py` registration endpoint to accept preferences
5. Update `backend/services/auth_service.py` to store preferences in user profile
6. Update `backend/models/user.py` with methods to manage preferences
#### Story 2: RSS Generation with User Preferences
1. Modify `generate_google_news_rss_from_string` function in `Linkedin_poster_dev/ai_agent.py` to accept country and language parameters
2. Update `article_reader` function to generate feeds for both English and French for user's country
3. Implement dataframe merging logic in `article_reader` to combine results from both languages
4. Update `content_service.py` with similar logic for keyword analysis
5. Test RSS feed generation with various country/language combinations
### Testing Strategy:
- Unit tests for user preference storage and retrieval
- Integration tests for modified RSS generation functions
- Frontend component tests for country/language selection
- End-to-end tests for the registration flow with preferences
- Test RSS feed generation with different country/language combinations
### Acceptance Criteria:
1. Given a new user registers, when they specify country and language preferences, then system stores these preferences in the user profile
2. Given user has specified preferences, when the system generates RSS feeds, then feeds are generated using the user's country and both English and French languages
3. Given RSS feeds from both languages are generated, when system processes articles, then dataframes are properly merged without duplicates
4. Given keyword analysis is performed, when user preferences exist, then system analyzes content from both language feeds for the user's country
## Developer Resources
### File Paths Reference:
- `/backend/api/auth.py` - Registration API endpoint
- `/backend/services/auth_service.py` - Registration service logic
- `/Linkedin_poster_dev/ai_agent.py` - Main RSS generation logic
- `/backend/services/content_service.py` - Content analysis service
- `/frontend/src/pages/Register.jsx` - User registration page
- `/frontend/src/components/CountryLanguageSelector.jsx` - New component
- `/frontend/src/services/authService.js` - Frontend auth service
### Key Code Locations:
- `generate_google_news_rss_from_string` function (`Linkedin_poster_dev/ai_agent.py:316`) - Main RSS generation function
- `article_reader` function (`Linkedin_poster_dev/ai_agent.py:468`) - RSS processing logic
- `register_user` function (`backend/services/auth_service.py:11`) - User registration logic
- `_generate_google_news_rss_from_string` method (`backend/services/content_service.py:731`) - Backend RSS generation
### Testing Locations:
- Unit: `backend/tests/` and `frontend/src/tests/`
- Integration: `backend/tests/` for API tests
- E2E: To be added if needed
### Documentation to Update:
- README.md - Add documentation for new country/language preference feature
- API.md - Document changes to registration endpoint
- CHANGELOG.md - Note the new multi-country/multi-language support feature