# Reverse Engineering of S3Tokenizer
Supervised Semantic Speech Tokenizer (S3Tokenizer)
**Before (extract code offline)**

```py
class SpeechLLM(nn.Module):
    ...
    def __init__(self, ...):
        ...

    def forward(self, speech_codes: Tensor, text_ids: Tensor, ...):
        ...
```

**After (extract code online)**

```py
import s3tokenizer

class SpeechLLM(nn.Module):
    ...
    def __init__(self, ...):
        ...
        self.speech_tokenizer = s3tokenizer.load_model("speech_tokenizer_v1")  # or "speech_tokenizer_v1_25hz"
        self.speech_tokenizer.freeze()

    def forward(self, speech: Tensor, speech_lens: Tensor, text_ids: Tensor, ...):
        ...
        speech_codes, speech_codes_lens = self.speech_tokenizer.quantize(speech, speech_lens)
        speech_codes = speech_codes.clone()  # for backward compatibility
        speech_codes_lens = speech_codes_lens.clone()  # for backward compatibility
```
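
The online-extraction pattern above can be sketched end to end with a toy model. This is only an illustration under stated assumptions: `DummySpeechTokenizer` is a hypothetical stand-in written for this example and is not the S3Tokenizer implementation; only the `freeze()` and `quantize(speech, speech_lens)` method names are taken from the snippet above. The feature size, codebook size, and stride are arbitrary.

```python
import torch
from torch import Tensor, nn


class DummySpeechTokenizer(nn.Module):
    """Hypothetical stand-in for a speech tokenizer (NOT the real s3tokenizer model)."""

    def __init__(self, n_mels: int = 8, codebook_size: int = 16, stride: int = 2):
        super().__init__()
        self.stride = stride
        self.proj = nn.Linear(n_mels, codebook_size)

    def freeze(self) -> None:
        # Exclude the tokenizer from training: no gradients, eval mode.
        for p in self.parameters():
            p.requires_grad = False
        self.eval()

    @torch.no_grad()
    def quantize(self, speech: Tensor, speech_lens: Tensor):
        # speech: (batch, n_mels, frames); keep every `stride`-th frame.
        frames = speech[:, :, :: self.stride].transpose(1, 2)
        codes = self.proj(frames).argmax(dim=-1)  # (batch, ceil(frames / stride))
        # Ceil-divide the valid lengths by the stride.
        codes_lens = torch.div(
            speech_lens + self.stride - 1, self.stride, rounding_mode="floor"
        )
        return codes, codes_lens


class SpeechLLM(nn.Module):
    def __init__(self, codebook_size: int = 16, dim: int = 4):
        super().__init__()
        self.speech_tokenizer = DummySpeechTokenizer(codebook_size=codebook_size)
        self.speech_tokenizer.freeze()
        self.embed = nn.Embedding(codebook_size, dim)

    def forward(self, speech: Tensor, speech_lens: Tensor):
        # Codes are extracted on the fly instead of being loaded from disk.
        speech_codes, speech_codes_lens = self.speech_tokenizer.quantize(
            speech, speech_lens
        )
        return self.embed(speech_codes), speech_codes_lens


model = SpeechLLM()
speech = torch.randn(2, 8, 10)        # two utterances, 8 features, 10 frames
speech_lens = torch.tensor([10, 7])   # second utterance is padded
hidden, hidden_lens = model(speech, speech_lens)
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
```

In this sketch only `embed.weight` remains trainable, so an optimizer built from the parameters with `requires_grad=True` would leave the tokenizer untouched.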