Denis Matveev

sodeniZz

AI & ML interests

None yet

Organizations

None yet

sodeniZz's collections (2)

Parameter-Efficient Fine-Tuning (LoRA & DoRA & QLoRA)

A collection of parameter-efficient fine-tuning experiments for sentiment classification using chat-based instruction tuning.

sodeniZz/llm-course-hw3-lora

Text Generation • 0.3B • Updated Dec 8, 2025
sodeniZz/llm-course-hw3-dora

Text Generation • 0.3B • Updated Dec 8, 2025 • 1
sodeniZz/llm-course-hw3-tinyllama-qlora

Updated Dec 8, 2025
sodeniZz/llm-course-hw3-tinyllamma-qlora

Updated Dec 8, 2025

LLM Course Homework 2: RLHF (DPO & PPO)

The collection includes the DPO-trained model, PPO-trained model, and the Reward Model used for PPO.

sodeniZz/llm-course-hw2-dpo

Text Generation • 0.1B • Updated Nov 15, 2025
sodeniZz/llm-course-hw2-reward-model

Text Classification • 0.1B • Updated Nov 15, 2025 • 1
sodeniZz/llm-course-hw2-ppo

Text Generation • 0.1B • Updated Nov 15, 2025 • 2
