Spaces:
Sleeping
Sleeping
Upload 4 files
Browse files- README.md +91 -8
- app.py +89 -0
- packages.txt +1 -0
- requirements.txt +11 -0
README.md
CHANGED
|
@@ -1,12 +1,95 @@
|
|
|
|
|
| 1 |
---
|
| 2 |
-
title:
|
| 3 |
-
emoji: π
|
| 4 |
-
colorFrom: green
|
| 5 |
-
colorTo: red
|
| 6 |
sdk: gradio
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 10 |
---
|
| 11 |
|
| 12 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
|
| 2 |
---
|
| 3 |
+
title: Scripttt
|
|
|
|
|
|
|
|
|
|
| 4 |
sdk: gradio
|
| 5 |
+
app_file: app.py
|
| 6 |
+
colorFrom: blue
|
| 7 |
+
colorTo: green
|
| 8 |
+
license: mit
|
| 9 |
+
tags:
|
| 10 |
+
- transcription
|
| 11 |
+
- diarization
|
| 12 |
+
- whisper
|
| 13 |
+
- pyannote
|
| 14 |
+
- video
|
| 15 |
+
- short-form
|
| 16 |
+
- gradio
|
| 17 |
+
- content-creation
|
| 18 |
+
python_version: "3.10"
|
| 19 |
---
|
| 20 |
|
| 21 |
+
# Scripttt
|
| 22 |
+
|
| 23 |
+
Scripttt is a Python web application that enables content creators to repurpose long-form video content into concise, engaging scripts for short-form platforms such as Instagram Reels and YouTube Shorts. Built with Gradio, Scripttt combines state-of-the-art transcription, speaker diarization, and script generation to deliver production-ready outputs that reflect the tone and style of the original conversation.
|
| 24 |
+
|
| 25 |
+
## Features
|
| 26 |
+
|
| 27 |
+
- **Video File Uploads Only**
|
| 28 |
+
Accepts direct uploads of video files (`.mp4`, `.mkv`, and other common formats). Audio-only files and external links are not supported.
|
| 29 |
+
|
| 30 |
+
- **Accurate Transcription**
|
| 31 |
+
Utilizes OpenAI Whisper for high-quality speech-to-text conversion.
|
| 32 |
+
|
| 33 |
+
- **Speaker Diarization**
|
| 34 |
+
Employs Picovoice Falcon (`pvfalcon`) to automatically identify and label speakers within the transcript.
|
| 35 |
+
|
| 36 |
+
- **Speaker-Tagged Transcript**
|
| 37 |
+
Generates a clean, speaker-attributed transcript of the input video.
|
| 38 |
+
|
| 39 |
+
- **Short-Form Script Generation**
|
| 40 |
+
Produces a concise, human-like script optimized for viral, short-form video content.
|
| 41 |
+
|
| 42 |
+
- **Privacy by Design**
|
| 43 |
+
All processing occurs locally; no external URLs or remote media are accepted.
|
| 44 |
+
|
| 45 |
+
## Installation
|
| 46 |
+
|
| 47 |
+
1. **Clone the Repository**
|
| 48 |
+
```
|
| 49 |
+
git clone https://github.com/your-username/scripttt.git
|
| 50 |
+
cd scripttt
|
| 51 |
+
```
|
| 52 |
+
|
| 53 |
+
2. **Set Up a Virtual Environment (Recommended)**
|
| 54 |
+
```
|
| 55 |
+
python -m venv venv
|
| 56 |
+
source venv/bin/activate # On Windows: venv\Scripts\activate
|
| 57 |
+
```
|
| 58 |
+
|
| 59 |
+
3. **Install Dependencies**
|
| 60 |
+
```
|
| 61 |
+
pip install -r requirements.txt
|
| 62 |
+
```
|
| 63 |
+
|
| 64 |
+
4. **Configure Environment Variables**
|
| 65 |
+
- Create a `.env` file in the project root.
|
| 66 |
+
- Add your Picovoice Falcon access key as an environment variable.
|
| 67 |
+
Example:
|
| 68 |
+
```
|
| 69 |
+
FALCON_ACCESS_KEY=your_falcon_access_key
|
| 70 |
+
# No other credentials are read by app.py.
|
| 71 |
+
```
|
| 72 |
+
|
| 73 |
+
## Usage
|
| 74 |
+
|
| 75 |
+
1. **Run the Application**
|
| 76 |
+
```
|
| 77 |
+
python app.py
|
| 78 |
+
```
|
| 79 |
+
|
| 80 |
+
2. **Access the Interface**
|
| 81 |
+
- Open the local URL provided by Gradio in your browser.
|
| 82 |
+
- Upload a supported video file and follow the on-screen instructions.
|
| 83 |
+
|
| 84 |
+
## Output
|
| 85 |
+
|
| 86 |
+
- **Speaker-Tagged Transcript:**
|
| 87 |
+
A clean, readable transcript with speaker labels.
|
| 88 |
+
|
| 89 |
+
- **Short-Form Script:**
|
| 90 |
+
A new, concise script based on the original video, ready for use in short-form content production.
|
| 91 |
+
|
| 92 |
+
## Limitations
|
| 93 |
+
|
| 94 |
+
- YouTube links, remote URLs, and audio-only files are **not supported**. Only direct video file uploads are accepted.
|
| 95 |
+
```
|
app.py
ADDED
|
@@ -0,0 +1,89 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
# app.py
#
# Module setup: load credentials from the environment and construct the
# two models (Whisper for speech-to-text, Picovoice Falcon for speaker
# diarization) once at import time so every request reuses them.
import os
import subprocess
import tempfile

import gradio as gr
import pvfalcon
import whisper
from dotenv import load_dotenv

# ───────────────────────────────────────────
# 1. ENVIRONMENT
# ───────────────────────────────────────────
load_dotenv()
FALCON_ACCESS_KEY = os.getenv("FALCON_ACCESS_KEY")
if not FALCON_ACCESS_KEY:
    raise RuntimeError(
        "Set FALCON_ACCESS_KEY in your environment or .env file "
        "(get one free at https://console.picovoice.ai)."
    )

# ───────────────────────────────────────────
# 2. MODELS
# ───────────────────────────────────────────
# "base" keeps memory and latency low enough for a CPU-only deployment.
whisper_model = whisper.load_model("base")
falcon = pvfalcon.create(access_key=FALCON_ACCESS_KEY)

# ───────────────────────────────────────────
# 3. CORE LOGIC
# ───────────────────────────────────────────
def process_video(file, language="Auto"):
    """Transcribe an uploaded video and attribute each line to a speaker.

    Parameters
    ----------
    file : str | object
        Path to the uploaded video. ``gr.File(type="filepath")`` passes a
        plain string; older Gradio versions pass a tempfile-like wrapper
        exposing ``.name`` — both are accepted.
    language : str
        "Auto" lets Whisper detect the language; any other dropdown value
        is lower-cased and passed to Whisper as the language code.

    Returns
    -------
    tuple[str, str]
        ``(speaker_transcript, paragraph_transcript)``. On audio
        extraction failure: ``("Audio extraction failed.", "")``.
    """
    # 3.1 Choose language for Whisper ("Auto" -> None lets Whisper detect).
    lang_code = None if language == "Auto" else language.lower()

    # BUG FIX: with gr.File(type="filepath") Gradio hands us a str, which
    # has no ``.name`` attribute — the original ``file.name`` raised
    # AttributeError. Accept both a plain path and a file-like wrapper.
    video_path = file if isinstance(file, str) else file.name

    # 3.2 Extract mono 16-kHz PCM WAV with ffmpeg. The file is created
    # closed (and outside the ``with``) so ffmpeg can write to it even on
    # platforms where an open NamedTemporaryFile cannot be reopened.
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as wav:
        wav_path = wav.name
    try:
        proc = subprocess.run(
            ["ffmpeg", "-y", "-i", video_path,
             "-ar", "16000", "-ac", "1", "-acodec", "pcm_s16le", wav_path],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )
        # ROBUSTNESS: the original only checked file size; a non-zero
        # ffmpeg exit code also means there is no usable audio.
        if proc.returncode != 0 or not os.path.getsize(wav_path):
            return "Audio extraction failed.", ""

        # 3.3 Speaker diarization: map each Falcon segment's opaque
        # speaker tag to a stable human-readable label in first-seen order.
        segments = falcon.process_file(wav_path)  # list[pvfalcon.Segment]
        diarized_map, label_map, counter = [], {}, 1
        for seg in segments:
            tag = seg.speaker_tag
            if tag not in label_map:
                label_map[tag] = f"Speaker {counter}"
                counter += 1
            diarized_map.append(
                dict(start=seg.start_sec, end=seg.end_sec,
                     speaker=label_map[tag])
            )

        # 3.4 Transcription (Whisper).
        res = whisper_model.transcribe(wav_path, language=lang_code)
        paragraph_transcript = res["text"]  # plain paragraph

        # 3.5 Merge speakers with transcription: attribute each Whisper
        # segment to the first diarized span containing its start time.
        speaker_lines = []
        for s in res.get("segments", []):
            speaker = next(
                (m["speaker"] for m in diarized_map
                 if m["start"] <= s["start"] <= m["end"]),
                "Unknown",
            )
            speaker_lines.append(f"{speaker}: {s['text']}")
        speaker_transcript = "\n".join(speaker_lines)

        # 3.6 Return in desired order.
        return speaker_transcript, paragraph_transcript
    finally:
        # LEAK FIX: the original never removed the delete=False temp WAV,
        # leaking one file per request.
        try:
            os.remove(wav_path)
        except OSError:
            pass
# ───────────────────────────────────────────
# 4. GRADIO UI
# ───────────────────────────────────────────
demo = gr.Interface(
    fn=process_video,
    inputs=[
        # type="filepath" hands process_video a plain string path.
        gr.File(label="Upload Video", type="filepath"),
        gr.Dropdown(["Auto", "English", "Hindi", "Urdu"], label="Language"),
    ],
    outputs=[
        gr.Textbox(label="Speaker-wise Transcript", show_copy_button=True),
        # FIX: removed the stray leading space in the visible label.
        gr.Textbox(label="Transcription", show_copy_button=True),
    ],
    title="Transcription + Speaker Segmentation",
    description="Whisper + Picovoice Falcon running fully on CPU.",
)

if __name__ == "__main__":
    demo.launch()
|
packages.txt
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
ffmpeg
|
requirements.txt
ADDED
|
@@ -0,0 +1,11 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
python-dotenv
|
| 2 |
+
requests
|
| 3 |
+
openai
|
| 4 |
+
pandas
|
| 5 |
+
git+https://github.com/openai/whisper.git
|
| 6 |
+
ffmpeg-python
|
| 7 |
+
yt-dlp
|
| 8 |
+
torch
|
| 9 |
+
torchaudio
|
| 10 |
+
gradio
|
| 11 |
+
pvfalcon # Picovoice Falcon
|