metadata

title: SEGAN
emoji: 🏢
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 6.1.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Remove background noise and generate an image from audio

SEGAN Speech Enhancement & API

A minimal speech-denoising project built around a SEGAN-style U-Net generator. It includes:

- A training script that learns from paired noisy/clean audio.
- An inference pipeline that denoises long clips in chunks and can pack the output audio losslessly into a PNG.
- A FastAPI service that exposes the denoise and PNG pack/restore endpoints.
- A Gradio demo for Hugging Face Spaces (app.py).
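To illustrate what "denoises long clips in chunks" means, here is a minimal sketch in plain Python. It is not the code in pipeline.py: the function name `denoise_in_chunks`, the `chunk_size` default of 16384 samples, and the zero-padding of the last chunk are all assumptions made for the example, and `halve` is only a placeholder standing in for a model forward pass.

```python
from typing import Callable, List, Sequence


def denoise_in_chunks(
    samples: Sequence[float],
    denoise_chunk: Callable[[List[float]], List[float]],
    chunk_size: int = 16384,  # assumed window length, not taken from the repo
) -> List[float]:
    """Run a fixed-window denoiser over a clip of any length."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    out: List[float] = []
    for start in range(0, len(samples), chunk_size):
        chunk = list(samples[start:start + chunk_size])
        valid = len(chunk)
        # Zero-pad the final chunk so the model always sees a full window.
        chunk.extend([0.0] * (chunk_size - valid))
        cleaned = denoise_chunk(chunk)
        # Drop the padded tail so the output length matches the input length.
        out.extend(cleaned[:valid])
    return out


def halve(chunk: List[float]) -> List[float]:
    """Placeholder 'denoiser' (hypothetical): scales every sample by 0.5."""
    return [0.5 * x for x in chunk]


clip = [float(i) for i in range(10)]
result = denoise_in_chunks(clip, halve, chunk_size=4)
```

A real pipeline would likely work on tensors and may overlap or cross-fade neighbouring chunks to avoid audible seams; the sketch leaves that out to keep the control flow visible.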

Repo Contents

- SEGAN.py – training components: config, dataset, U-Net generator, PatchGAN discriminator, training loop.
- pipeline.py – inference utilities: chunked denoiser, spectral gating cleanup, PNG pack/restore helpers.
- app.py – Gradio / FastAPI app that wires the pipeline up for UI and API use.
- checkpoint/seagan_final.pt – example checkpoint, tracked with git-lfs (replace it with your own if you trained a different one).
- requirements.txt – Python dependencies.
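The PNG pack/restore helpers are only named here, so the following is a sketch of one way such a lossless round trip could work, using only the standard library. The layout is an assumption, not the format used by pipeline.py: signed 16-bit PCM samples are written as pixels of a 16-bit grayscale PNG, the sample count is stored in the first four bytes, and the last row is zero-padded. The names `pack_pcm16_to_png` and `restore_pcm16_from_png` are hypothetical.

```python
import struct
import zlib
from typing import List

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def pack_pcm16_to_png(samples: List[int], width: int = 256) -> bytes:
    """Store int16 samples as pixels of a 16-bit grayscale PNG."""
    # Assumed layout: 32-bit sample count first, then big-endian int16 samples.
    payload = struct.pack(">I", len(samples))
    payload += struct.pack(f">{len(samples)}h", *samples)
    row_bytes = width * 2  # two bytes per 16-bit pixel
    payload += b"\x00" * (-len(payload) % row_bytes)  # pad the last row
    height = len(payload) // row_bytes
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(
        b"\x00" + payload[r * row_bytes:(r + 1) * row_bytes]
        for r in range(height)
    )
    # IHDR: width, height, bit depth 16, color type 0 (grayscale), no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 16, 0, 0, 0, 0)
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw, 9))
        + _chunk(b"IEND", b"")
    )


def restore_pcm16_from_png(png: bytes) -> List[int]:
    """Invert pack_pcm16_to_png; only handles PNGs written by that function."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos, width, idat = 8, 0, b""
    while pos < len(png):
        length, kind = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        if kind == b"IHDR":
            width = struct.unpack(">I", data[:4])[0]
        elif kind == b"IDAT":
            idat += data
        pos += 12 + length  # length field + type + data + CRC
    raw = zlib.decompress(idat)
    stride = 1 + width * 2  # filter byte plus pixel bytes per scanline
    payload = b"".join(raw[i + 1:i + stride] for i in range(0, len(raw), stride))
    (count,) = struct.unpack(">I", payload[:4])
    return list(struct.unpack(f">{count}h", payload[4:4 + 2 * count]))


pcm = [0, 1, -1, 32767, -32768, 1234, -4321]
png_bytes = pack_pcm16_to_png(pcm, width=4)
restored = restore_pcm16_from_png(png_bytes)
```

Because PNG compression is lossless, the restored samples are bit-identical to the input as long as the image is not re-encoded at a lower bit depth or resized along the way.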

Prerequisites

- Python 3.9+ (tested with PyTorch CPU/GPU builds).
- For GPU inference/training, install the matching CUDA-enabled torch/torchaudio.
- FFmpeg is not required; torchaudio handles WAV I/O.
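A quick way to check these prerequisites before installing anything else is a small script like the one below. It is a sketch, not part of the repo: the function name `check_environment` and the report keys are made up for the example. It relies only on `importlib.util.find_spec` and, if torch is present, on `torch.cuda.is_available()`.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 9)):
    """Report whether the interpreter and the torch packages look usable."""
    report = {
        "python_ok": sys.version_info >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torchaudio_installed": importlib.util.find_spec("torchaudio") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            import torch  # imported lazily so the script runs without torch

            report["cuda_available"] = bool(torch.cuda.is_available())
        except (ImportError, OSError):
            # A broken install (e.g. missing shared libraries) counts as no CUDA.
            report["cuda_available"] = False
    return report


report = check_environment()
print(report)
```

If `cuda_available` is false on a machine with a GPU, the installed torch build is probably CPU-only and needs to be replaced with a CUDA-enabled one.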

Install

```
python -m venv .venv
# Windows PowerShell:
.\.venv\Scripts\Activate.ps1
# or cmd:
.\.venv\Scripts\activate.bat
pip install -r requirements.txt
```