---
title: tt-creators
app_file: creators.py
sdk: gradio
sdk_version: 5.20.0
---

# TikTok Creator Analyzer

A Gradio-based tool for analyzing TikTok creator profiles from CSV files.

## Features

- Efficiently loads and processes millions of TikTok creator profiles
- Caches data in Parquet format for faster subsequent loads
- Tracks processed files to avoid reprocessing the same data
- Incrementally updates the database when new files are added
- Advanced search with multiple filters:
  - Follower count range (min/max)
  - Video count range (min/max)
  - Keywords in signature
  - Region filter
  - "Has Email" filter to find profiles with contact information
- Download search results as CSV
- Network-accessible interface (binds to 0.0.0.0)
- Shareable via temporary public URL
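
The search filters listed above can be illustrated with a small, dependency-free sketch. It is not the application's actual code: the function name `search_profiles`, the email pattern, and the choice to look for emails in the `signature` field are all assumptions made for illustration. Each filter is applied only when it is set, and a profile must pass every active filter.

```python
import re
from typing import Iterable, List, Optional

# Simple pattern for spotting an email address in free text. This is an
# assumption: the README does not say how "Has Email" is detected.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def search_profiles(
    profiles: Iterable[dict],
    min_followers: Optional[int] = None,
    max_followers: Optional[int] = None,
    min_videos: Optional[int] = None,
    max_videos: Optional[int] = None,
    keyword: Optional[str] = None,
    region: Optional[str] = None,
    has_email: bool = False,
) -> List[dict]:
    """Return the profiles that satisfy every filter that is set."""
    results = []
    for p in profiles:
        # CSV values arrive as strings, so convert counts before comparing.
        followers = int(p.get("follower_count") or 0)
        videos = int(p.get("video_count") or 0)
        signature = p.get("signature") or ""

        if min_followers is not None and followers < min_followers:
            continue
        if max_followers is not None and followers > max_followers:
            continue
        if min_videos is not None and videos < min_videos:
            continue
        if max_videos is not None and videos > max_videos:
            continue
        # Case-insensitive substring match on the signature text.
        if keyword and keyword.lower() not in signature.lower():
            continue
        if region and (p.get("region") or "").upper() != region.upper():
            continue
        if has_email and not EMAIL_RE.search(signature):
            continue
        results.append(p)
    return results
```

For example, `search_profiles(rows, min_followers=1000, region="US", has_email=True)` would keep only US profiles with at least 1,000 followers whose signature contains something that looks like an email address.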

## Installation

1. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Make sure your CSV files are in the correct location (`../data/tiktok_profiles/`).

## Usage

Run the script:

```bash
python creators.py
```

The first run will:

1. Load all CSV files from the data directory
2. Combine them into a single dataset
3. Save the combined data as a Parquet file for faster loading in the future
4. Track which files have been processed to avoid duplicates
5. Launch a Gradio web interface for searching and analyzing the data

Subsequent runs will:

1. Load the existing data from the Parquet file
2. Check for new CSV files that haven't been processed yet
3. If new files exist, process only those files and update the database
4. Launch the Gradio interface with the updated data
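
The "process only new files" step can be sketched as follows. This is a minimal illustration using only the standard library, not the application's actual implementation: the JSON manifest and the helper names (`load_processed`, `find_new_csvs`, `mark_processed`) are hypothetical, and the sketch assumes files are identified by name.

```python
import json
from pathlib import Path
from typing import List, Set


def load_processed(manifest_path: Path) -> Set[str]:
    """Read the set of already-processed file names (empty on first run)."""
    if manifest_path.exists():
        return set(json.loads(manifest_path.read_text()))
    return set()


def find_new_csvs(data_dir: Path, processed: Set[str]) -> List[Path]:
    """Return CSV files in data_dir whose names are not in the manifest."""
    return sorted(p for p in data_dir.glob("*.csv") if p.name not in processed)


def mark_processed(
    manifest_path: Path, processed: Set[str], new_files: List[Path]
) -> Set[str]:
    """Add the new file names to the manifest and write it back to disk."""
    updated = processed | {p.name for p in new_files}
    manifest_path.write_text(json.dumps(sorted(updated), indent=2))
    return updated
```

On each run, the files returned by `find_new_csvs` would be loaded and appended to the cached dataset before `mark_processed` is called, so a file is recorded only after it has been ingested.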

The interface will be accessible:

- From other machines on your network at `http://your-ip-address:7860`
- Through a temporary public URL that is displayed in the console (enabled by `share=True`)
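
As a rough sketch of how these two access paths are typically configured in Gradio: `server_name`, `server_port`, and `share` are standard arguments to `launch()`, while the helper `launch_kwargs`, the `main` function, and the placeholder UI below are illustrative assumptions, not the contents of `creators.py`.

```python
def launch_kwargs(port: int = 7860, share: bool = True) -> dict:
    """Keyword arguments for Gradio's launch() matching the behavior above."""
    return {
        "server_name": "0.0.0.0",  # listen on all network interfaces
        "server_port": port,       # port used in the LAN URL
        "share": share,            # request a temporary public URL
    }


def main() -> None:
    # Imported here so the helper above can be used without Gradio installed.
    import gradio as gr

    # Placeholder UI; the real application defines its search and
    # maintenance tabs here.
    with gr.Blocks(title="TikTok Creator Analyzer") as demo:
        gr.Markdown("# TikTok Creator Analyzer")

    demo.launch(**launch_kwargs())
```

Calling `main()` starts the server. Binding to `0.0.0.0` exposes the app to every machine that can reach the host, so it may be worth setting `share=False` or restricting the bind address when the data should stay private.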

## Maintenance

The application includes a Maintenance tab that shows:

- How many files have been processed
- When the database was last updated
- An option to force reload all files (useful if you suspect data corruption)
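
One way the information on this tab could be derived is sketched below. Everything here is an assumption for illustration: the JSON manifest of processed file names, the use of the cache file's modification time as "last updated", and the function names `maintenance_status` and `force_reload` do not come from the application itself.

```python
import json
from datetime import datetime
from pathlib import Path


def maintenance_status(manifest_path: Path, cache_path: Path) -> dict:
    """Summarize how many files were processed and when the cache last changed."""
    processed = json.loads(manifest_path.read_text()) if manifest_path.exists() else []
    last_updated = (
        datetime.fromtimestamp(cache_path.stat().st_mtime)
        if cache_path.exists()
        else None
    )
    return {"files_processed": len(processed), "last_updated": last_updated}


def force_reload(manifest_path: Path, cache_path: Path) -> None:
    """Delete the manifest and cache so the next run reprocesses every CSV."""
    for path in (manifest_path, cache_path):
        path.unlink(missing_ok=True)  # requires Python 3.8+
```

Under these assumptions, a forced reload simply discards the cached state, so the next start behaves like a first run.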

## Data Format

The CSV files should have the following columns:

- `id`
- `unique_id`
- `follower_count`
- `nickname`
- `video_count`
- `following_count`
- `signature`
- `bio_link`
- `updated_at`
- `tt_seller`
- `region`
- `language`
- `url`
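
Before adding a new file to the data directory, it can help to confirm that its header matches the list above. The following is a small standalone sketch using only the standard library; the function name `missing_columns` is hypothetical, and the check assumes the first row of each CSV is the header.

```python
import csv
from pathlib import Path
from typing import List

# Column names taken from the list above.
EXPECTED_COLUMNS = [
    "id", "unique_id", "follower_count", "nickname", "video_count",
    "following_count", "signature", "bio_link", "updated_at",
    "tt_seller", "region", "language", "url",
]


def missing_columns(csv_path: Path) -> List[str]:
    """Return the expected columns that are absent from the CSV header."""
    with csv_path.open(newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [col for col in EXPECTED_COLUMNS if col not in present]
```

An empty result means every expected column is present. Extra columns are ignored by this check.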