rbbist committed on
Commit 39044e6 · verified · 1 Parent(s): c686250

Upload 4 files
Kanun_Patrika_Scrapper_For_HFSpaces.py ADDED
The diff for this file is too large to render. See raw diff
 
README.markdown ADDED
@@ -0,0 +1,38 @@
+ # Nepal Kanoon Patrika Scraper
+
+ This web application, deployed on Hugging Face Spaces (free tier), scrapes legal case data from the Nepal Kanoon Patrika website (https://nkp.gov.np/). Users select a case type (mudda type) and a Nepali year; the scraped case details are stored in a SQLite database, and the associated HTML files are saved to a folder.
+
+ ## Features
+ - Scrapes legal case details including decision number, court, judges, parties, and more.
+ - Stores data in a SQLite database (`legal_cases.db`).
+ - Saves raw HTML files in the `scraped_html` folder for future reference.
+ - Reuses existing HTML files when available to avoid redundant web requests.
+ - Provides a user-friendly Gradio interface for initiating scraping tasks.
+
+ ## Usage Instructions
+ 1. Open the Gradio interface in your browser.
+ 2. Select a **Mudda Type** from the dropdown menu. Options include:
+    - दुनियाबादी देवानी (civil, private party)
+    - सरकारबादी देवानी (civil, government party)
+    - दुनियावादी फौजदारी (criminal, private party)
+    - सरकारवादी फौजदारी (criminal, government party)
+    - रिट (writ)
+    - निवेदन (petition)
+    - विविध (miscellaneous)
+ 3. Enter a **Nepali Year** (e.g., २०७३) in the textbox.
+ 4. Click the **Run Scraper** button to start scraping.
+ 5. Monitor progress and results in the status output box.
+
+ ## Technical Details
+ - **Backend**: The scraping logic lives in `Kanun_Patrika_Scrapper_For_HFSpaces.py`, which handles web requests, HTML parsing, and data storage.
+ - **Storage**: Scraped data is stored in a SQLite database (`legal_cases.db`) to keep file sizes manageable within the storage limits of Hugging Face Spaces' free tier.
+ - **HTML Files**: Raw HTML content is saved in the `scraped_html` folder and reused on later runs, reducing repeated web requests.
+ - **Dependencies**: Listed in `requirements.txt`: `requests`, `beautifulsoup4`, `pandas`, `nepali-datetime`, and `gradio`.
+ - **Environment**: Designed to run on Hugging Face Spaces (free tier) with CPU-only requirements.
+
+ ## Notes
+ - The database and HTML files are written to the Space's filesystem; on the free tier this storage is ephemeral and may be lost when the Space restarts unless persistent storage is enabled.
+ - The application is modular: the backend script (`Kanun_Patrika_Scrapper_For_HFSpaces.py`) can be updated without modifying the Gradio interface (`app.py`).
+ - Make sure the Nepali year entered is valid (e.g., between 2015 BS and the current year in the Bikram Sambat calendar) to avoid errors.
+
+ For issues or contributions, please contact the repository maintainer.
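The cache-first behavior the README describes (reuse a saved HTML file when present, otherwise fetch and save) can be sketched as below. This is a minimal illustration, not the scraper's actual code: `get_html`, the URL-hash file naming, and the `fetch` callable are all assumptions standing in for the real logic in `Kanun_Patrika_Scrapper_For_HFSpaces.py`.

```python
import hashlib
import os
import tempfile

def get_html(url, html_folder, fetch):
    """Return page HTML, preferring a previously saved copy on disk.

    `fetch` is any callable mapping url -> HTML string (a stand-in for
    the real request logic); saved pages are keyed by a hash of the URL.
    """
    os.makedirs(html_folder, exist_ok=True)
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = os.path.join(html_folder, name)
    if os.path.exists(path):
        # Reuse the saved copy and skip the web request entirely.
        with open(path, encoding="utf-8") as f:
            return f.read()
    html = fetch(url)  # first visit: fetch, then save for next time
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return html

# Demo with a counting fake fetcher: the second call hits the disk cache.
calls = []
def fake_fetch(url):
    calls.append(url)
    return "<html>case</html>"

with tempfile.TemporaryDirectory() as d:
    a = get_html("https://nkp.gov.np/case/1", d, fake_fetch)
    b = get_html("https://nkp.gov.np/case/1", d, fake_fetch)

print(len(calls), a == b)  # → 1 True
```

Keying the cache file on a hash of the URL avoids filesystem-unsafe characters in case URLs; the trade-off is that the folder contents are not human-readable without a mapping.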
app.py ADDED
@@ -0,0 +1,64 @@
+ import gradio as gr
+ from Kanun_Patrika_Scrapper_For_HFSpaces import LegalCaseScraper
+
+ def run_scraper(mudda_type, nepali_year, progress=gr.Progress()):
+     """
+     Run the scraper with the given inputs and update progress.
+     Returns a message indicating success or failure.
+     """
+     # Validate inputs before doing any work
+     if not mudda_type or not nepali_year:
+         return "Error: Please select a mudda type and enter a Nepali year."
+
+     scraper = None
+     try:
+         # Initialize scraper
+         scraper = LegalCaseScraper(output_db="legal_cases.db", html_folder="scraped_html")
+
+         # Run scraper
+         progress(0.1, desc="Starting scraper...")
+         scraper.run_scraper(mudda_type=mudda_type, sal=nepali_year, use_saved=True)
+
+         progress(1.0, desc="Scraping completed!")
+         return f"Scraping completed for mudda_type: {mudda_type}, year: {nepali_year}. Data saved to SQLite database."
+
+     except Exception as e:
+         return f"Error: {str(e)}"
+
+     finally:
+         # Close only if the scraper was created successfully
+         if scraper is not None:
+             scraper.close()
+
+ # Define Gradio interface using Blocks
+ with gr.Blocks(title="Nepal Kanoon Patrika Scraper") as demo:
+     gr.Markdown("# Nepal Kanoon Patrika Scraper")
+     gr.Markdown("Scrape legal case data from the Nepal Kanoon Patrika website. Select a mudda type and enter a Nepali year to begin.")
+
+     with gr.Row():
+         mudda_type = gr.Dropdown(
+             choices=[
+                 "दुनियाबादी देवानी",
+                 "सरकारबादी देवानी",
+                 "दुनियावादी फौजदारी",
+                 "सरकारवादी फौजदारी",
+                 "रिट",
+                 "निवेदन",
+                 "विविध"
+             ],
+             label="Mudda Type",
+             info="Select the type of legal case"
+         )
+         nepali_year = gr.Textbox(label="Nepali Year", placeholder="e.g., २०७३", max_lines=1)
+
+     run_button = gr.Button("Run Scraper")
+     output = gr.Textbox(label="Status", interactive=False)
+
+     run_button.click(
+         fn=run_scraper,
+         inputs=[mudda_type, nepali_year],
+         outputs=output
+     )
+
+ # Launch the interface
+ demo.launch()
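The README warns that the entered Nepali year must be valid, while `run_scraper` only checks that the field is non-empty. A stricter check would normalize Devanagari digits and range-check the value before starting a scrape. The sketch below is a hypothetical helper, not part of the committed code, and the accepted range (2015–2081 BS) is an illustrative assumption:

```python
# Map Devanagari digits ०–९ onto ASCII 0–9 for parsing.
DEVANAGARI_DIGITS = str.maketrans("०१२३४५६७८९", "0123456789")

def parse_nepali_year(text, lo=2015, hi=2081):
    """Parse a year given in Devanagari or ASCII digits.

    Returns the year as an int, or raises ValueError if the input is
    not a number or falls outside the supported Bikram Sambat range.
    """
    ascii_text = text.strip().translate(DEVANAGARI_DIGITS)
    if not ascii_text.isdigit():
        raise ValueError(f"Not a year: {text!r}")
    year = int(ascii_text)
    if not lo <= year <= hi:
        raise ValueError(f"Year {year} outside supported range {lo}-{hi}")
    return year

print(parse_nepali_year("२०७३"))  # → 2073
```

Calling such a helper at the top of `run_scraper` would turn a malformed year into an immediate, readable error message instead of a failure deep inside the scraping loop.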
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ requests==2.32.3
+ beautifulsoup4==4.12.3
+ pandas==2.2.3
+ nepali-datetime==1.0.2
+ gradio==4.44.0
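Once a scrape finishes, the resulting `legal_cases.db` can be inspected with Python's built-in `sqlite3` module. The table and column names below are hypothetical stand-ins, since the actual schema is defined inside `Kanun_Patrika_Scrapper_For_HFSpaces.py`; the in-memory database here only mimics it for illustration.

```python
import sqlite3

# Stand-in schema: the real table/column names come from the scraper.
conn = sqlite3.connect(":memory:")  # use "legal_cases.db" against a real scrape
conn.execute("CREATE TABLE cases (decision_no TEXT, court TEXT, year TEXT)")
conn.execute("INSERT INTO cases VALUES ('9601', 'Supreme Court', '२०७३')")
conn.commit()

# Parameterized query: filter cases by the (Devanagari) year string.
rows = conn.execute(
    "SELECT decision_no, court FROM cases WHERE year = ?", ("२०७३",)
).fetchall()
print(rows)  # → [('9601', 'Supreme Court')]
conn.close()
```

Because SQLite stores the whole database in a single file, the same query pattern works on a `legal_cases.db` downloaded from the Space's file browser.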