Ic Pro

icpro

xegulon

AI & ML interests

None yet

Recent Activity

liked 2 models about 1 month ago

Hcompany/Holo2-30B-A3B

Image-Text-to-Text • 31B • Updated Nov 21, 2025 • 965 • 41

nanochat-students/nanochat-d20

0.6B • Updated Oct 27, 2025 • 1.41k • 16

liked a model about 2 months ago

zai-org/GLM-4.7-Flash

Text Generation • Updated Jan 29 • 1.75M • 1.61k

New activity in karpathy/nanochat-d32 about 2 months ago

Add Optimizer State

#9 opened about 2 months ago by icpro

reacted to Parveshiiii's post with 👍 8 months ago

Post

2865

🧠 MathX-5M by XenArcAI — Scalable Math Reasoning for Smarter LLMs

Introducing MathX-5M, a high-quality, instruction-tuned dataset built to supercharge mathematical reasoning in large language models. With 5 million rigorously filtered examples, it spans everything from basic arithmetic to advanced calculus—curated from public sources and enhanced with synthetic data.

🔍 Key Highlights:
- Step-by-step reasoning with verified answers
- Covers algebra, geometry, calculus, logic, and more
- RL-validated correctness and multi-stage filtering
- Ideal for fine-tuning, benchmarking, and educational AI

📂 - https://huggingface.co/datasets/XenArcAI/MathX-5M

1 reply

liked 2 models 12 months ago

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 4.84M • 1.47k

docling-project/SmolDocling-256M-preview

Image-Text-to-Text • Updated Sep 17, 2025 • 72.9k • 1.61k

liked a model over 1 year ago

black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27, 2025 • 734k • 12.5k

updated 2 models almost 2 years ago

icpro/trained-model-classification-evaluation

Text Classification • 0.3B • Updated May 31, 2024

edumalin/edumalin-mixtral-indications-and-evaluation-merged

Text Generation • 47B • Updated May 30, 2024

upvoted a paper almost 2 years ago

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Paper • 2405.01535 • Published May 2, 2024 • 124

reacted to victor's post with 🔥 almost 2 years ago

Post

4363

The hype is real: a mysterious gpt2-chatbot model has appeared on the LLM Arena Leaderboard 👀.
It seems to be at least on par with the top-performing models (closed and open).

To try it out: https://chat.lmsys.org/ -> then click on the Direct Chat tab and select gpt2-chatbot.

Take your bet, what do you think it is?

4 replies

liked 5 models almost 2 years ago

liked a Space almost 2 years ago

Open LLM Leaderboard

🏆

13.9k

Track, rank and evaluate open LLMs and chatbots

upvoted an article almost 2 years ago

Article

Fine-tune Llama 3 with ORPO

Apr 22, 2024 • 240

reacted to tomaarsen's post with 🔥 almost 2 years ago

Post

3221

🚀 Sentence Transformers v2.7.0 is out! Featuring a new loss function, easier Matryoshka model inference & evaluation, CrossEncoder improvements & Intel Gaudi2 Accelerator support. Details:

1️⃣ A new loss function: CachedGISTEmbedLoss
This loss function is a combination of CachedMultipleNegativesRankingLoss and the GISTEmbedLoss, both of which are already excellent. The caching mechanism allows for much higher batch sizes with constant memory usage, which boosts training performance. The GIST part introduces a guide model to guide the in-batch negative sample selection. This prevents false negatives, resulting in a stronger training signal.

2️⃣ Automatic Matryoshka model truncation
Matryoshka models produce embeddings that are still useful after truncation. However, this truncation always had to be done manually, until now! We've added a truncate_dim option to the Sentence Transformer constructor. This also allows truncation when using HuggingFaceEmbeddings from LlamaIndex or LangChain.

3️⃣ Additionally, you can now specify truncate_dim in evaluators to get the performance after truncation. (Hint: it's surprisingly good, even for models not trained with MatryoshkaLoss, and it can speed up e.g. clustering, retrieval, etc.)

4️⃣ CrossEncoder improvements
The CrossEncoder now supports 'push_to_hub' to upload trained reranker models to Hugging Face. Additionally, CrossEncoders now support trust_remote_code to load models with custom modelling code.

5️⃣ Inference on Intel Gaudi2
If you have an Intel Gaudi2 Accelerator, Sentence Transformers now uses it automatically for even faster inference. No changes to your code are necessary: the device is detected automatically!
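The Matryoshka truncation described in items 2️⃣ and 3️⃣ can be illustrated with a small sketch. This is not the Sentence Transformers implementation: it uses plain Python with made-up 8-dimensional toy vectors, and the helper names `truncate_embedding` and `cosine` are hypothetical. Re-normalizing after truncation is an assumption made here so that the shortened vectors stay unit length; the post itself only says that a `truncate_dim` option was added to the constructor and the evaluators.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components, then L2-normalize (assumed step)."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    # Leave an all-zero vector untouched to avoid dividing by zero.
    return [x / norm for x in head] if norm > 0 else list(head)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings (invented numbers): most of the signal sits in the
# leading dimensions, which is the property Matryoshka training aims for.
query = [0.9, 0.1, 0.3, 0.0, 0.05, -0.02, 0.01, 0.03]
doc_a = [0.8, 0.2, 0.25, 0.1, -0.04, 0.02, 0.0, 0.01]
doc_b = [-0.1, 0.9, -0.3, 0.2, 0.03, 0.01, -0.02, 0.04]

for dim in (8, 4):
    q = truncate_embedding(query, dim)
    sim_a = cosine(q, truncate_embedding(doc_a, dim))
    sim_b = cosine(q, truncate_embedding(doc_b, dim))
    print(f"dim={dim}: sim(query, doc_a)={sim_a:.3f}, sim(query, doc_b)={sim_b:.3f}")
```

In this toy case the ranking of `doc_a` above `doc_b` is the same at 8 and at 4 dimensions, which is the behavior the post relies on when it says truncated embeddings remain useful. With the library itself, the post indicates the equivalent is passing `truncate_dim` to the `SentenceTransformer` constructor rather than slicing by hand.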

Check out the release notes for all of the details: https://github.com/UKPLab/sentence-transformers/releases/tag/v2.7.0

I'm very excited for the upcoming releases: I'm making great progress with a notable v3 refactor that should heavily improve the training process for embedding models!

2 replies
