FranckyB/VibeVoice-Large-4bit

# VibeVoice 7B - 4-bit Quantized

Optimized for RTX 3060/4060 and similar 12 GB VRAM GPUs.

## Specifications
- Quantization: 4-bit (NF4)
- Model size: 6.2 GB
- VRAM usage: ~8 GB
- Quality: Very good (minimal degradation)
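
As a rough sanity check on the size figures above, 4-bit storage takes half a byte per quantized weight, while weights left in bfloat16 take two bytes each. The sketch below only illustrates that arithmetic. The helper `weight_footprint_gb` and the parameter split in the example are hypothetical and not taken from this card, and the estimate ignores quantization metadata such as per-block scales.

```python
def weight_footprint_gb(
    quantized_params: float,
    full_precision_params: float = 0.0,
    quant_bits: int = 4,
    full_bits: int = 16,
) -> float:
    """Estimate weight storage in GB (10**9 bytes), ignoring quantization metadata."""
    total_bits = quantized_params * quant_bits + full_precision_params * full_bits
    return total_bits / 8 / 1e9


# Hypothetical split (an assumption, not a figure from this card):
# 7e9 weights stored in 4-bit, 1e9 weights kept in bfloat16.
estimate = weight_footprint_gb(7e9, 1e9)
print(f"{estimate:.1f} GB")
```

Runtime VRAM use is typically higher than the weight footprint alone, since activations and generation caches also need memory.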

## Usage

```python
import torch

from vibevoice.modular.modeling_vibevoice_inference import VibeVoiceForConditionalGenerationInference
from vibevoice.processor.vibevoice_processor import VibeVoiceProcessor

# Model path as given in this card
MODEL_PATH = "Dannidee/VibeVoice7b-low-vram/4bit"

model = VibeVoiceForConditionalGenerationInference.from_pretrained(
    MODEL_PATH,
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)
processor = VibeVoiceProcessor.from_pretrained(MODEL_PATH)

# Generate speech
text = "Speaker 1: Hello! Speaker 2: Hi there!"
inputs = processor(
    text=[text],
    voice_samples=[["voice1.wav", "voice2.wav"]],
    padding=True,
    return_tensors="pt",
)

# Move tensor inputs to the same device as the model
inputs = {k: v.to(model.device) if torch.is_tensor(v) else v for k, v in inputs.items()}

outputs = model.generate(**inputs)
processor.save_audio(outputs.speech_outputs[0], "output.wav")
```
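
The usage example pairs a two-speaker script with two voice files. A small pre-flight check can catch a mismatch before the model is loaded. This is a minimal sketch that assumes the `Speaker N:` label format shown above and one voice sample per speaker. The helpers `speaker_ids` and `check_voice_samples` are hypothetical and not part of the `vibevoice` package.

```python
import re

# Matches labels such as "Speaker 1:" as used in the example script.
SPEAKER_LABEL = re.compile(r"Speaker\s+(\d+)\s*:")


def speaker_ids(script: str) -> list[int]:
    """Return the distinct speaker numbers in order of first appearance."""
    seen: list[int] = []
    for match in SPEAKER_LABEL.finditer(script):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


def check_voice_samples(script: str, voice_paths: list[str]) -> None:
    """Raise ValueError if the voice file count differs from the speaker count."""
    ids = speaker_ids(script)
    if not ids:
        raise ValueError("no 'Speaker N:' labels found in the script")
    if len(ids) != len(voice_paths):
        raise ValueError(
            f"script has {len(ids)} speaker(s) but "
            f"{len(voice_paths)} voice sample(s) were given"
        )


# Passes silently: two speakers, two voice files.
check_voice_samples(
    "Speaker 1: Hello! Speaker 2: Hi there!",
    ["voice1.wav", "voice2.wav"],
)
```

Calling `check_voice_samples(text, voice_paths)` before `processor(...)` gives a clear error message if the counts differ.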