---
license: mit
sdk: gradio
emoji: 🏃
colorFrom: blue
colorTo: blue
pinned: true
sdk_version: 6.3.0
---
# SongBPM

A Python tool that extracts BPM (beats per minute), musical key, and song metadata from [songbpm.com](https://songbpm.com/). It is aimed at music producers, DJs, and data analysts who need tempo information for their workflows.

## Features

- **Single Song Search**: Look up the BPM and key for a single song
- **Batch Processing**: Process multiple songs from a file
- **Multiple Export Formats**: Save results to CSV or JSON
- **Web Interface**: Gradio web demo for searching without the command line
- **Respectful Scraping**: Built-in delays to avoid overloading servers
- **Error Handling**: Error handling with retry logic
- **Headless Mode**: Run without a browser window, or show the browser for debugging

## Requirements

- Python 3.9 or higher
- Google Chrome or Chromium browser
- Required Python packages (see `requirements.txt`)
- Gradio (for the web interface demo)
## Installation

1. Clone or download this repository:

   ```bash
   git clone https://huggingface.co/spaces/terastudio/SongBPM
   cd SongBPM
   ```

2. Create a virtual environment (recommended):

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install required packages:

   ```bash
   pip install -r requirements.txt
   ```

4. Install Playwright browsers:

   ```bash
   playwright install chromium
   ```
## Usage

### Basic Search

Search for a single song:

```bash
python songbpm_scraper.py "queen - under pressure"
```

### Batch Processing

Process multiple songs from a file:

```bash
# Create a file with one song per line
echo "daft punk - one more time" > songs.txt
echo "the prodigy - firestarter" >> songs.txt
echo "metallica - enter sandman" >> songs.txt

python songbpm_scraper.py --file songs.txt
```
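
The query file can also be checked from Python before a batch run. The sketch below assumes only the one-query-per-line format shown above. `load_queries` is a hypothetical helper, not part of `songbpm_scraper.py`, and skipping blank lines and case-insensitive duplicates is an assumption about what a batch run would want, not documented behavior.

```python
from pathlib import Path


def load_queries(path: str) -> list[str]:
    """Read one song query per line, skipping blanks and repeated queries."""
    seen: set[str] = set()
    queries: list[str] = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        query = line.strip()
        key = query.lower()  # treat "Queen - ..." and "queen - ..." as the same query
        if not query or key in seen:
            continue
        seen.add(key)
        queries.append(query)
    return queries
```

Calling `load_queries("songs.txt")` on the file created above would return the three queries in order, so fewer redundant requests reach the site.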
### Export Results

Save results to CSV:

```bash
python songbpm_scraper.py "adele - rolling in the deep" --csv results.csv
```

Save results to JSON:

```bash
python songbpm_scraper.py "taylor swift - shake it off" --json output.json
```

### Debug Mode

Show the browser window during scraping:

```bash
python songbpm_scraper.py "beyoncé - crazy in love" --visible
```

### Custom Delay

Adjust the delay between requests (in seconds):

```bash
python songbpm_scraper.py "song name" --delay 5.0
```
## Web Interface (Gradio Demo)

A web interface is available for searching songs without using the command line.

### Launch the Web Demo

```bash
python app.py
```

This starts a local web server. Open your browser to:

- **Local URL**: http://127.0.0.1:7860

### Web Interface Features

The web demo provides three tabs:

1. **Single Search** - Search for one song at a time
2. **Batch Search** - Upload a file with multiple song queries
3. **Help** - Usage instructions and tips

### Command Line Options

```bash
# Change host and port
python app.py --host 0.0.0.0 --port 8080

# Create a public share link (temporary URL)
python app.py --share

# Enable debug mode
python app.py --debug
```
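
As an illustration of how these flags could map onto Gradio, here is a minimal sketch. It is not the actual contents of `app.py`: `parse_args` and `launch_kwargs` are hypothetical names, and the defaults are assumed from the local URL shown above. The keyword names `server_name`, `server_port`, `share`, and `debug` are the ones accepted by Gradio's `launch()`.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the launch options listed above (defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="SongBPM web demo")
    parser.add_argument("--host", default="127.0.0.1", help="interface to bind")
    parser.add_argument("--port", type=int, default=7860, help="port to listen on")
    parser.add_argument("--share", action="store_true", help="create a temporary public link")
    parser.add_argument("--debug", action="store_true", help="enable debug mode")
    return parser.parse_args(argv)


def launch_kwargs(args: argparse.Namespace) -> dict:
    """Translate the parsed options into keyword arguments for `launch()`."""
    return {
        "server_name": args.host,
        "server_port": args.port,
        "share": args.share,
        "debug": args.debug,
    }
```

With a Gradio app object named `demo`, the call would then be `demo.launch(**launch_kwargs(parse_args()))`.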
### Displayed Fields

The interface displays:

- Song title and artist name
- BPM (tempo) information
- Musical key (e.g., C Major, A Minor)
- Song duration
- Direct links to source pages

You can also export results directly from the web interface in CSV or JSON format.
## Output Format

### CSV Output

| Title | Artist | BPM | Key | Duration | URL |
|-------|--------|-----|-----|----------|-----|
| Under Pressure | Queen | 114 | B | 4:08 | https://songbpm.com/... |
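
An exported file can be read back with the standard library. The sketch below assumes the CSV header row uses exactly the column names shown in the table above, which is not confirmed by the scraper's source. `read_results` is a hypothetical helper, and the sample row mirrors the table.

```python
import csv
import io


def read_results(csv_text: str) -> list[dict]:
    """Parse exported rows and convert the BPM column to an integer."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["BPM"] = int(row["BPM"])
        rows.append(row)
    return rows


# Sample data mirroring the table above; the URL is left elided as in the original
sample = (
    "Title,Artist,BPM,Key,Duration,URL\n"
    "Under Pressure,Queen,114,B,4:08,https://songbpm.com/...\n"
)
rows = read_results(sample)
print(rows[0]["Title"], rows[0]["BPM"])
```

For a file on disk, pass `open("results.csv", newline="", encoding="utf-8").read()` instead of the sample string.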
### JSON Output

```json
[
  {
    "title": "Under Pressure",
    "artist": "Queen",
    "bpm": 114,
    "key": "B",
    "duration": "4:08",
    "url": "https://songbpm.com/..."
  }
]
```
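
Downstream scripts can work with this structure directly. The following sketch assumes the field names shown above and an `M:SS` duration string. `duration_to_seconds` and `in_tempo_range` are hypothetical helpers for illustration, not functions provided by the scraper.

```python
import json


def duration_to_seconds(duration: str) -> int:
    """Convert an "M:SS" string such as "4:08" to a number of seconds."""
    minutes, seconds = duration.split(":")
    return int(minutes) * 60 + int(seconds)


def in_tempo_range(songs: list[dict], low: int, high: int) -> list[dict]:
    """Keep songs whose BPM lies in the inclusive range [low, high]."""
    return [song for song in songs if low <= song["bpm"] <= high]


# Sample data mirroring the JSON output above
sample = json.loads("""
[
  {"title": "Under Pressure", "artist": "Queen", "bpm": 114,
   "key": "B", "duration": "4:08", "url": "https://songbpm.com/..."}
]
""")
matches = in_tempo_range(sample, 110, 120)
print([(s["title"], duration_to_seconds(s["duration"])) for s in matches])
```

A DJ building a set around a target tempo could load `output.json` with `json.load` and apply the same filter.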
## API Usage

You can also use the scraper as a Python library in your own projects:

```python
import asyncio

from songbpm_scraper import SongBPMExtractor


async def main():
    extractor = SongBPMExtractor()

    # Single song search
    results = await extractor.extract("coldplay - fix you")
    print(results)

    # Batch search
    songs = [
        "tame impala - the less i know the better",
        "flume - never be like you",
    ]
    all_results = await extractor.extract_batch(songs)

    # Export
    extractor.export_to_csv(all_results, "my_songs.csv")


asyncio.run(main())
```
## Customization

### Modifying Selectors

If the website structure changes, you may need to update the CSS selectors in `songbpm_scraper.py`. Look for the `_parse_search_results` method:

```python
def _parse_search_results(self, soup: BeautifulSoup, source_url: str) -> list[SongData]:
    # Update these selectors if the website changes
    song_selectors = [
        "div.track",
        "div.song-item",
        "div[data-testid='track']",
        # ... add or modify selectors
    ]
```
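
A list of candidate selectors like this is typically consumed by trying each one in order and keeping the first that matches. The sketch below illustrates that fallback pattern. Whether `_parse_search_results` works this way is an assumption, and `first_matching` is a hypothetical helper that takes the selection function as a parameter so it does not depend on any particular parser.

```python
from typing import Callable, Optional, Sequence


def first_matching(
    selectors: Sequence[str],
    select: Callable[[str], list],
) -> tuple[Optional[str], list]:
    """Try each selector in order and return the first one that matches anything."""
    for selector in selectors:
        elements = select(selector)
        if elements:
            return selector, elements
    # No selector matched: the page layout has probably changed
    return None, []
```

With BeautifulSoup, the call would be `first_matching(song_selectors, soup.select)`. Logging the selector that matched makes it easier to notice when the site switches layouts.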
### Adjusting Delays

Modify the `ScraperConfig` to adjust request timing:

```python
from songbpm_scraper import ScraperConfig, SongBPMScraper

config = ScraperConfig(
    min_delay=1.0,   # Minimum delay in seconds
    max_delay=3.0,   # Maximum delay in seconds
    timeout=60000,   # Request timeout in milliseconds
    max_retries=5,   # Number of retry attempts
)
scraper = SongBPMScraper(config)
```
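
A `min_delay`/`max_delay` pair usually means the pause before each request is drawn at random between the two bounds. The sketch below shows that idea under the assumption of a uniform draw. The scraper's actual strategy is not documented here, and `pick_delay` and `polite_pause` are hypothetical names.

```python
import asyncio
import random


def pick_delay(min_delay: float, max_delay: float) -> float:
    """Choose a random pause, in seconds, between the two bounds."""
    if min_delay > max_delay:
        raise ValueError("min_delay must not exceed max_delay")
    return random.uniform(min_delay, max_delay)


async def polite_pause(min_delay: float = 1.0, max_delay: float = 3.0) -> float:
    """Sleep for a randomized interval before the next request."""
    delay = pick_delay(min_delay, max_delay)
    await asyncio.sleep(delay)
    return delay
```

Randomizing the pause spreads requests out instead of hitting the server at a fixed rhythm.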
## Ethical Usage Guidelines

1. **Respect Rate Limits**: The scraper includes delays by default. Do not reduce them excessively.
2. **Check robots.txt**: Always verify you're allowed to scrape the target website.
3. **Personal Use Only**: This tool is intended for personal, non-commercial use.
4. **API First**: If available, use official APIs instead of scraping.
5. **Cache Results**: Store results locally to avoid repeated requests for the same data.
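
The caching guideline can be followed with a small local store. This is a minimal sketch, assuming results are JSON-serializable lists. `ResultCache` is a hypothetical class, not part of the scraper, and the key normalization (lowercase, collapsed whitespace) is an illustrative choice.

```python
import json
from pathlib import Path


class ResultCache:
    """Small JSON-file cache keyed by the normalized search query."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        # Load earlier results if the cache file already exists
        if self.path.exists():
            self.data = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.data = {}

    @staticmethod
    def _key(query: str) -> str:
        # Lowercase and collapse whitespace so near-identical queries share an entry
        return " ".join(query.lower().split())

    def get(self, query: str):
        """Return cached results for the query, or None if it was never stored."""
        return self.data.get(self._key(query))

    def put(self, query: str, results: list) -> None:
        """Store results and write the whole cache back to disk."""
        self.data[self._key(query)] = results
        self.path.write_text(json.dumps(self.data, indent=2), encoding="utf-8")
```

A caller would check `cache.get(query)` first and only run the scraper, followed by `cache.put(query, results)`, when it returns `None`.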
## Troubleshooting

### Browser Not Found

If you see an error about a missing browser, install Chromium for Playwright:

```bash
playwright install chromium
```

### Timeout Errors

Increase the timeout in the configuration:

```python
from songbpm_scraper import ScraperConfig

config = ScraperConfig(timeout=60000)  # 60 seconds (value is in milliseconds)
```
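
Timeouts are also a common case for retrying with a growing pause between attempts. The sketch below shows a generic exponential backoff wrapper. It is not the scraper's own retry logic, and `with_retries`, `base_delay`, and `retry_on` are hypothetical names. Playwright raises its own `TimeoutError` class, so that class would need to be passed in `retry_on` when wrapping Playwright calls.

```python
import asyncio


async def with_retries(operation, max_retries: int = 5, base_delay: float = 1.0,
                       retry_on: tuple = (TimeoutError,)):
    """Await `operation()` until it succeeds or `max_retries` attempts have failed."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(1, max_retries + 1):
        try:
            return await operation()
        except retry_on:
            if attempt == max_retries:
                raise  # Out of attempts: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

Here `operation` is any zero-argument coroutine function, for example `lambda: extractor.extract("coldplay - fix you")`.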
### No Results

The website structure may have changed. Run with the `--visible` flag to watch the browser and see what is happening:

```bash
python songbpm_scraper.py "song name" --visible
```

## Legal Disclaimer

This tool is provided for educational purposes only. Users are responsible for ensuring their use complies with the website's terms of service and applicable laws. The developer is not responsible for any misuse of this tool.

## License

MIT License - feel free to use and modify for your own projects.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Support

If you encounter issues or have questions, please open an issue on GitHub.