Xin Zhang committed on Apr 17, 2025

Commit

fdeedee

1 Parent(s): ce0e589

[fix]: typo

Files changed (1)

transcribe/strategy.py +30 -30


@@ -34,17 +34,17 @@ class TranscriptResult:
 class TranscriptToken:
     """Represents a transcript segment, including its text and timing information"""
     text: str  # transcribed text content
-    t0: float  # start time (hundredths of a second)
-    t1: float  # end time (hundredths of a second)
     def is_punctuation(self):
         """Check whether the text contains punctuation"""
         return REGEX_MARKERS.search(self.text.strip()) is not  None
     def is_end(self):
         """Check whether the text is a sentence-end marker"""
         return SENTENCE_END_PATTERN.search(self.text.strip())  is not  None
     def is_pause(self):
         """Check whether the text is a pause marker"""
         return PAUSEE_END_PATTERN.search(self.text.strip()) is not  None
@@ -86,13 +86,13 @@ class TranscriptChunk:
             if not ck.only_punctuation()
         ]
     def get_split_first_rest(self,  mode: SplitMode):
         chunks = self.split_by(mode)
         fisrt_chunk = chunks[0] if chunks else self
         rest_chunks = chunks[1:] if chunks else None
         return fisrt_chunk, rest_chunks
     def puncation_numbers(self) -> int:
         """Count the punctuation marks in the chunk"""
         return sum(1 for seg in self.items if seg.is_punctuation())
@@ -104,35 +104,35 @@ class TranscriptChunk:
     def join(self) -> str:
         """Join the segments into a single string"""
         return self.separator.join(seg.text for seg in self.items)
     def compare(self, chunk: Optional['TranscriptChunk'] = None) -> float:
         """Compare the similarity between this chunk and another chunk"""
         if not chunk:
             return 0
         score =  self._calculate_similarity(self.join(), chunk.join())
         logger.debug(f"Compare: {self.join()} vs {chunk.join()} : {score}")
         return score
     def only_punctuation(self)->bool:
         return all(seg.is_punctuation() for seg in self.items)
     def has_punctuation(self) -> bool:
         return any(seg.is_punctuation() for seg in self.items)
     def get_buffer_index(self) -> int:
         return self.items[-1].buffer_index()
     def is_end_sentence(self) ->bool:
         return self.items[-1].is_end()
 class TranscriptHistory:
     """Manages the history of transcript chunks"""
     def __init__(self) -> None:
         self.history = collections.deque(maxlen=2)  # stores the two most recent chunks
     def add(self, chunk: TranscriptChunk):
         """Add a new chunk to the history"""
         self.history.appendleft(chunk)
@@ -144,7 +144,7 @@ class TranscriptHistory:
     def lastest_chunk(self):
         """Get the last chunk"""
         return self.history[-1]
     def clear(self):
         self.history.clear()
@@ -168,7 +168,7 @@ class TranscriptBuffer:
     def get_seg_id(self) -> int:
         return self._current_seg_id
     @property
     def current_sentences_length(self) -> int:
         count = 0
@@ -178,7 +178,7 @@ class TranscriptBuffer:
             else:
                 count += len(item)
         return count
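     def update_pending_text(self, text: str) -> None:
         """Update the temporary buffer string"""
         self._buffer = text
@@ -192,11 +192,11 @@ class TranscriptBuffer:
     def commit_paragraph(self) -> None:
         """
         Commit the current short sentences as a complete paragraph (e.g. at the end of a sentence)
         Args:
             end_of_sentence: whether this is the end of a sentence (e.g. a full stop was detected)
         """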
         count = 0
         current_sentences = []
         while len(self._sentences): # and count < 20:
@@ -219,13 +219,13 @@ class TranscriptBuffer:
         output = self.split_and_join(
                     text.replace(
                         self._separator, ""))
         logger.debug("==== rebuild string ====")
         logger.debug(text)
         logger.debug(output)
         return output
     @staticmethod
     def split_and_join(text):
         tokens = []
@@ -264,7 +264,7 @@ class TranscriptBuffer:
             for stable_str in stable_strings:
                 self.update_pending_text(stable_str)
                 self.commit_line()
             current_text_len = len(self.current_not_commit_text.split(self._separator)) if self._separator else len(self.current_not_commit_text)
             # current_text_len = len(self.current_not_commit_text.split(self._separator))
             self.update_pending_text(remaining_string)
@@ -279,7 +279,7 @@ class TranscriptBuffer:
             self.update_pending_text(remaining_string)
         return False
     @property
     def un_commit_paragraph(self) -> str:
         """Combination of the current short sentences"""
@@ -298,7 +298,7 @@ class TranscriptBuffer:
     @property
     def current_not_commit_text(self) -> str:
         return self.un_commit_paragraph + self.pending_text
 class TranscriptStabilityAnalyzer:
@@ -311,8 +311,8 @@ class TranscriptStabilityAnalyzer:
     def merge_chunks(self, chunks: List[TranscriptChunk])->str:
         output =  list(r.join() for r in chunks if r)
         return output
     def analysis(self, current: TranscriptChunk, buffer_duration: float) -> Iterator[TranscriptResult]:
         current = TranscriptChunk(items=current, separator=self._separator)
@@ -344,13 +344,13 @@ class TranscriptStabilityAnalyzer:
         # logger.debug("==========================")
         if curr_first and prev_first:
             core = curr_first.compare(prev_first)
             has_punctuation = curr_first.has_punctuation()
             if core >= 0.8 and has_punctuation:
                 yield from self._yield_commit_results(curr_first, curr_rest, curr_first.is_end_sentence())
                 return
         yield TranscriptResult(
             seg_id=self._transcript_buffer.get_seg_id(),
             context=self._transcript_buffer.current_not_commit_text
@@ -377,7 +377,7 @@ class TranscriptStabilityAnalyzer:
         stable_str_list = [stable_chunk.join()] if hasattr(stable_chunk, "join") else self.merge_chunks(stable_chunk)
         remaining_str_list = self.merge_chunks(remaining_chunks)
         frame_cut_index = stable_chunk[-1].get_buffer_index() if isinstance(stable_chunk, list) else stable_chunk.get_buffer_index()
         prev_seg_id = self._transcript_buffer.get_seg_id()
         commit_paragraph = self._transcript_buffer.update_and_commit(stable_str_list, remaining_str_list, is_end_sentence)
         logger.debug(f"current buffer: {self._transcript_buffer.__dict__}")
@@ -401,4 +401,4 @@ class TranscriptStabilityAnalyzer:
                 cut_index=frame_cut_index,
                 context=self._transcript_buffer.current_not_commit_text,
             )

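 class TranscriptToken:
     """Represents a transcript segment, including its text and timing information"""
     text: str  # transcribed text content
+    t0: int  # start time (hundredths of a second)
+    t1: int  # end time (hundredths of a second)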
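
The only non-whitespace change visible in this diff retypes `t0` and `t1` from `float` to `int`, and their comments describe them as hundredths of a second. As a minimal sketch of what that unit implies for callers, the following uses a hypothetical `Token` dataclass (a stand-in, not the project's `TranscriptToken`; whether the real class is a dataclass is an assumption) with two illustrative helper methods that do not appear in the source:

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical stand-in for TranscriptToken; models only the fields shown in the diff."""
    text: str
    t0: int  # start time, in hundredths of a second
    t1: int  # end time, in hundredths of a second

    def start_seconds(self) -> float:
        # Illustrative helper (not in the source): convert centiseconds to seconds.
        return self.t0 / 100

    def duration_seconds(self) -> float:
        # Illustrative helper (not in the source): subtract as integers first,
        # then convert once, so the difference itself carries no rounding error.
        return (self.t1 - self.t0) / 100


token = Token(text="hello", t0=150, t1=275)
print(token.start_seconds(), token.duration_seconds())
```

Keeping the stored values as integers and converting to seconds only at the point of use is one plausible reason for the type change, though the commit message ("[fix]: typo") does not state the motivation.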