Thanks for writing the article. I'm unclear about the following:
What should be entered in each dataset image's description? A short non-word token that identifies the subject? Is that token unique to each image in the dataset, or shared across all of them? I thought each image was supposed to have its own descriptive paragraph (caption).
If I set Quantization > Transformer & Text Encoder to None, what are the side effects?
My objective is to train a LoRA of a person using ~20 images of the same person. What kind of images should I prepare: full-body shots, just portraits, or a mix of both?