Put up a quick merge of the SV3 and Esper 3.1 Ministrals: https://huggingface.co/sequelbox/Ministral-3-14B-Reasoning-2512-PlumEsper1.1
t.d.a.g. PRO
sequelbox
AI & ML interests
open source, infinite games. (they/them)
Recent Activity
liked
a model
about 24 hours ago
mradermacher/Ministral-3-14B-Reasoning-2512-PlumEsper1.1-GGUF
liked
a model
1 day ago
mradermacher/Ministral-3-14B-Reasoning-2512-ShiningValiant3-i1-GGUF
replied to
their
post
2 days ago
replied to
their
post
2 days ago
posted
an
update
3 days ago
Two new releases today!
Firstly, our new Raiden-Mini dataset, powered by DeepSeek's newest deepseek-ai/DeepSeek-V3.2-Speciale model!
- A V3.2-Speciale reasoning showcase: the Raiden prompts test the model's creative, analytic, and general reasoning skills!
- HEAD TO HEAD: a comparison subset pits V3.2-Speciale against V3.2 with the same prompts, providing a direct look at each model's advantages!
Get the new Raiden-Mini dataset: sequelbox/Raiden-Mini-DeepSeek-V3.2-Speciale
On the model side, we've also brought Shining Valiant 3 to Ministral 3!
- Science-reasoning: sequelbox/Celestia3-DeepSeek-R1-0528 for physics, biology, chemistry, compsci, astronomy, Earth science, and information theory.
- AI to build AI: the sequelbox/Mitakihara-DeepSeek-R1-0528 dataset for high-quality reasoning performance on AI, MLOps, math and CUDA, complex adaptive and agentic systems, cognition, logic, linguistics, simulation, knowledge management, and more!
- Creative reasoning and general chat performance supplemented with sequelbox/Raiden-DeepSeek-R1
Get the newest SV3: ValiantLabs/Ministral-3-14B-Reasoning-2512-ShiningValiant3
Esper 3.1 is available for Ministral 3 as well: ValiantLabs/Ministral-3-14B-Reasoning-2512-Esper3.1
We're working hard on our next Big New Release, coming out in the next few weeks :)
Help support our releases; donations are used for models and datasets: sequelbox/SupportOpenSource
Open source matters. Fight for it with us.
with love and friendship,
allegra
replied to
their
post
8 days ago
we've added 14b and 3b as well - we'd like to specifically recommend the 14b for everyone to try: https://huggingface.co/ValiantLabs/Ministral-3-14B-Reasoning-2512-Esper3.1
reacted to
danielhanchen's
post with 🔥
8 days ago
Mistral's new Ministral 3 models can now be Run & Fine-tuned locally! (16GB RAM)
The Ministral 3 models have vision support and best-in-class performance for their sizes.
14B Instruct GGUF: unsloth/Ministral-3-14B-Instruct-2512-GGUF
14B Reasoning GGUF: unsloth/Ministral-3-14B-Reasoning-2512-GGUF
Step-by-step Guide: https://docs.unsloth.ai/new/ministral-3
All GGUF, BnB, FP8, etc. variant uploads: https://huggingface.co/collections/unsloth/ministral-3
reacted to
sergiopaniego's
post
9 days ago
ICYMI, transformers v5 is out!
Grab a coffee ☕ and go read the announcement blog https://huggingface.co/blog/transformers-v5
posted
an
update
9 days ago
NEW RELEASE: Esper 3.1 for Ministral 3 14b, 8b, and 3b!
- Esper is our full-stack, full-cycle coding, DevOps, and architecture specialist!
- Our newest, best DeepSeek technical datasets emphasize more challenging queries and tough real-world coding tasks across a variety of programming languages and development paradigms:
- Titanium 3 for coding and reasoning in DevOps and architecture: sequelbox/Titanium3-DeepSeek-V3.1-Terminus
- Tachibana 3 for high-difficulty code production in a variety of topics and programming languages:
- sequelbox/Tachibana3-Part1-DeepSeek-V3.1-Terminus
- sequelbox/Tachibana3-Part2-DeepSeek-V3.2
- Mitakihara for MLOps, AI building, use, expertise, and research: sequelbox/Mitakihara-DeepSeek-R1-0528
Get Esper 3.1 now in all 3 Ministral 3 sizes! (We recommend 14b for general use.)
14b: ValiantLabs/Ministral-3-14B-Reasoning-2512-Esper3.1
8b: ValiantLabs/Ministral-3-8B-Reasoning-2512-Esper3.1
3b: ValiantLabs/Ministral-3-3B-Reasoning-2512-Esper3.1
We'll be bringing more models to Ministral soon, including Shining Valiant 3 :)
We're currently working hard on a big release in a new specialty - hoping to have that up on Valiant Labs before the end of the year! We'll keep pushing the boundaries of what personal-sized AI can do for you.
See our Experimental Reasoning models and open-source datasets: @sequelbox
Help us keep working for open source AI with a donation: sequelbox/SupportOpenSource
with love,
allegra
reacted to
danielhanchen's
post with ❤️
13 days ago
Qwen3-Next can now be Run locally! (30GB RAM)
The models come in Thinking and Instruct versions and use a new architecture, allowing ~10x faster inference than Qwen3-32B.
Thinking GGUF: unsloth/Qwen3-Next-80B-A3B-Thinking-GGUF
Instruct GGUF: unsloth/Qwen3-Next-80B-A3B-Instruct-GGUF
Step-by-step Guide: https://docs.unsloth.ai/models/qwen3-next
posted
an
update
15 days ago
We strongly disagree with Hugging Face's decision to remove the Epstein files dataset. As an open source community, it is imperative that we support free access to important information.
Torrents remain available for those looking to use the information, but ease of access matters too. The datasets library provides legitimate value to users; it matters to be able to access content here.
We'd like to encourage everyone to retain local copies of anything on Hugging Face that's important to you.
posted
an
update
30 days ago
NEW RELEASE: UML Generator is here!
- Our newest Experimental Reasoning release: create Unified Modeling Language diagrams to provide analysis and insight into your queries and situations!
- Multi-step reasoning reliably identifies the diagram structure before producing a user-facing response of XMI 2.5.1 code containing the UML diagram. Load the diagram into the UML tool of your choice!
- Trained in a variety of subjects for flexible analysis: software architecture, software development, business processes, systems engineering, data modeling, microservices, reverse engineering and more!
UML Generator is available for multiple sizes of gpt-oss and Qwen 3, providing increased flexibility to the user:
gpt-oss-120b: sequelbox/gpt-oss-120b-UML-Generator
gpt-oss-20b: sequelbox/gpt-oss-20b-UML-Generator
Qwen3-14B: sequelbox/Qwen3-14B-UML-Generator
Qwen3-4B-Thinking-2507: sequelbox/Qwen3-4B-Thinking-2507-UML-Generator
You can also get the UML Generator dataset, to train your own models to use UML Generator Format: sequelbox/UML-Generator-Dataset-DeepSeek-V3.2
Support our experimental open-source research efforts, models and datasets: sequelbox/SupportOpenSource
See our other Experimental Reasoning models: https://huggingface.co/collections/sequelbox/experimental-reasoning-models
with love,
allegra
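For readers unfamiliar with XMI: it is an XML interchange format, so a generated diagram can be inspected programmatically before loading it into a UML tool. A minimal stdlib sketch (illustrative only, not from the UML Generator release; the sample document and namespace URIs follow common OMG conventions but are assumptions):

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal XMI 2.5-style document of the kind a UML tool
# (or a UML-generating model) might emit; exact URIs vary by tool/version.
XMI_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<xmi:XMI xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
         xmlns:uml="http://www.omg.org/spec/UML/20131001">
  <uml:Model xmi:id="m1" name="PaymentModel">
    <packagedElement xmi:type="uml:Class" xmi:id="c1" name="PaymentProcessor"/>
    <packagedElement xmi:type="uml:Class" xmi:id="c2" name="Invoice"/>
  </uml:Model>
</xmi:XMI>"""

XMI_NS = "{http://www.omg.org/spec/XMI/20131001}"

def list_classes(xmi_text: str) -> list[str]:
    """Return the names of all uml:Class elements in an XMI document."""
    root = ET.fromstring(xmi_text)
    return [
        el.get("name")
        for el in root.iter("packagedElement")
        if el.get(f"{XMI_NS}type") == "uml:Class"
    ]

print(list_classes(XMI_DOC))  # ['PaymentProcessor', 'Invoice']
```

A quick check like this is a cheap way to validate that model output is well-formed XML before handing it to a diagramming tool.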
reacted to
salma-remyx's
post with š„
about 2 months ago
We've built over 10K containerized reproductions of papers from arXiv!
Instead of spending all day trying to build an environment to test that new idea, just pull the Docker container from the Remyx registry.
And with Remyx, you can start experimenting faster by generating a test PR in your codebase based on the ideas found in your paper of choice.
Hub: https://hub.docker.com/u/remyxai
Remyx docs: https://docs.remyx.ai/resources/ideate
Coming soon, explore reproduced papers with AG2 + Remyx: https://github.com/ag2ai/ag2/pull/2141
replied to
their
post
about 2 months ago
reacted to
umarbutler's
post with š
about 2 months ago
I'm excited to announce the release of Kanon 2 Embedder, the world's best legal embedding model, ranked first on the Massive Legal Embedding Benchmark.
This model is the product of quite literally months of painstaking work alongside @abdurrahmanbutler collecting, cleaning, and processing terabytes of data as well as coming up with novel improvements to the standard embedder training recipe to push the limits of what's possible.
Kanon 2 Embedder is my most advanced model to date. On MLEB, it benchmarks as 9% more accurate than OpenAI's best embedding model and 30% faster.
Even when truncated from 1,792 to 768 dimensions, Kanon 2 Embedder continues to hold the number one spot on MLEB.
Importantly, Kanon 2 Embedder is also privacy and security friendly: unlike Voyage, Cohere and Jina, none of your data is used to train our models by default.
Kanon 2 Embedder can also be self-hosted for enterprises with heightened security or reliability requirements.
You can read the full announcement on our blog to learn how we did it and how you can get started using Kanon 2 Embedder to embed your own legal documents: https://isaacus.com/blog/introducing-kanon-2-embedder
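On the truncation point above: shortening an embedding from 1,792 to 768 dimensions typically means keeping the leading components and re-normalizing to unit length (the Matryoshka-style recipe). A minimal stdlib sketch of that operation, assuming this is how the truncation is done; it is not Isaacus code:

```python
import math

def truncate_and_renormalize(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components, then rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Toy 4-d "embedding" truncated to 2 dims, then renormalized.
v = [3.0, 4.0, 0.0, 0.0]
print(truncate_and_renormalize(v, 2))  # [0.6, 0.8]
```

Cosine similarity on the truncated vectors then works unchanged, which is why benchmark rank can survive the dimension cut.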
reacted to
tomaarsen's
post
about 2 months ago
🤗 Sentence Transformers is joining Hugging Face! 🤗 This formalizes the existing maintenance structure, as I've personally led the project for the past two years on behalf of Hugging Face! Details:
Today, the Ubiquitous Knowledge Processing (UKP) Lab is transferring the project to Hugging Face. Sentence Transformers will remain a community-driven, open-source project, with the same open-source license (Apache 2.0) as before. Contributions from researchers, developers, and enthusiasts are welcome and encouraged. The project will continue to prioritize transparency, collaboration, and broad accessibility.
Read our full announcement for more details and quotes from UKP and Hugging Face leadership: https://huggingface.co/blog/sentence-transformers-joins-hf
We see an increasing wish from companies to move from large LLM APIs to local models for better control and privacy, reflected in the library's growth: in just the last 30 days, Sentence Transformer models have been downloaded >270 million times, second only to transformers.
I would like to thank the UKP Lab, and especially Nils Reimers and Iryna Gurevych, both for their dedication to the project and for their trust in myself, both now and two years ago. Back then, neither of you knew me well, yet you trusted me to take the project to new heights. That choice ended up being very valuable for the embedding & Information Retrieval community, and I think this choice of granting Hugging Face stewardship will be similarly successful.
I'm very excited about the future of the project, and for the world of embeddings and retrieval at large!
reacted to
AdinaY's
post
about 2 months ago
HunyuanWorld Mirror 🔥 a versatile feed-forward model for universal 3D world reconstruction by Tencent
tencent/HunyuanWorld-Mirror
✨ Any prior in → 3D world out
✨ Mix camera, intrinsics, depth as priors
✨ Predict point clouds, normals, Gaussians & more in one pass
✨ Unified architecture for all 3D tasks
reacted to
paulml's
post with 🔥
about 2 months ago
Qwen3-VL-4B is incredibly easy to fine-tune!
We've trained the first DSE model based on this model, and it's already performing at the same level as Jina v4!
While Jina Embeddings v4 is built on Qwen2.5-VL-3B (which has a non-commercial license), our model is based on Qwen3-VL-4B and released under Apache 2.0, making it fully commercially permissive.
Check out our DSE model here:
racineai/QwenAmann-4B-dse
posted
an
update
about 2 months ago
NEW RELEASE: Esper 3.1 for gpt-oss-20b!
- Esper is our full-stack, full-cycle coding, DevOps, and architecture specialist!
- Our newest, best DeepSeek technical datasets emphasize more challenging queries and tough real-world coding tasks across a variety of programming languages and development paradigms:
- Titanium 3 for coding and reasoning in DevOps and architecture: sequelbox/Titanium3-DeepSeek-V3.1-Terminus
- Tachibana 3 for high-difficulty code production in a variety of topics and programming languages:
- sequelbox/Tachibana3-Part1-DeepSeek-V3.1-Terminus
- sequelbox/Tachibana3-Part2-DeepSeek-V3.2
- Mitakihara for MLOps, AI building, use, expertise, and research: sequelbox/Mitakihara-DeepSeek-R1-0528
GET IT NOW, FOR EVERYONE: ValiantLabs/gpt-oss-20b-Esper3.1
We'll have more releases of Esper coming up, plus more experimental open-source releases :) find open source datasets and experimental models at @sequelbox
Help us keep working for open source AI with a donation: sequelbox/SupportOpenSource
more to come soon!
allegra
posted
an
update
2 months ago
NEW RELEASE: Esper 3.1!
- Esper is our full-stack, full-cycle coding, DevOps, and architecture specialist!
- Our newest, best DeepSeek technical datasets emphasize more challenging queries and tough real-world coding tasks across a variety of programming languages and development paradigms:
- Titanium 3 for coding and reasoning in DevOps and architecture: sequelbox/Titanium3-DeepSeek-V3.1-Terminus
- Tachibana 3 for high-difficulty code production in a variety of topics and programming languages:
- sequelbox/Tachibana3-Part1-DeepSeek-V3.1-Terminus
- sequelbox/Tachibana3-Part2-DeepSeek-V3.2
- Mitakihara for MLOps, AI building, use, expertise, and research: sequelbox/Mitakihara-DeepSeek-R1-0528
Our first release in the Esper 3.1 series is built on Qwen3-4B-Thinking-2507. GET IT NOW, FOR EVERYONE: ValiantLabs/Qwen3-4B-Thinking-2507-Esper3.1
We'll be bringing Esper 3.1 to more, larger models as soon as we can; you can help this happen faster with a donation: sequelbox/SupportOpenSource
We're really happy about this one; let us know how Esper 3.1 works for you!
Support open source. It's our only hope for an AI future you'll actually want to live in.
More to come soon!
with our love and appreciation,
allegra
posted
an
update
3 months ago
NEW EXPERIMENTAL RELEASE: DES Reasoning is here!
- Our newest Experimental Reasoning Modality release: create Discrete Event Simulations using SimPy to provide analysis and insight into your queries and situations!
- Multi-step analysis identifies the structure of the situation and the goal of the simulation before proceeding to create SimPy simulation code and accompanying analysis.
- DES Reasoning Format provides clear Python code that is easy to read and modify; suited to running simulations, doing analysis, or continuing the conversation with your assistant.
- Trained in a variety of subjects for flexible analysis: programming, science, business, economics, energy, finance, law, logistics, management, manufacturing, operations, supply chain and more!
DES Reasoning available for gpt-oss-20b and Qwen3-4B-Thinking-2507:
gpt-oss-20b: sequelbox/gpt-oss-20b-DES-Reasoning
Qwen3-4B-Thinking-2507: sequelbox/Qwen3-4B-Thinking-2507-DES-Reasoning
You can also get the DES Reasoning dataset, to train your own models to use DES Reasoning Format: sequelbox/DES-Reasoning-DeepSeek-V3.1
Support our experimental open-source research efforts, models and datasets: sequelbox/SupportOpenSource
with love,
allegra
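For readers new to discrete-event simulation: the core idea is advancing a clock between events (arrivals, service completions) rather than ticking fixed time steps. A minimal stdlib sketch of a single-server queue, illustrating the shape of such simulations; this is not model output, and SimPy would express the same thing with `env.process()` and `env.timeout()`:

```python
import random

def simulate_queue(n_customers=1000, arrival_mean=1.0, service_mean=0.8, seed=7):
    """Single-server queue: customers arrive at random intervals, wait if the
    server is busy, then get served. Returns the mean waiting time."""
    random.seed(seed)  # fixed seed for reproducible runs
    t = 0.0            # simulation clock (current arrival time)
    server_free_at = 0.0
    waits = []
    for _ in range(n_customers):
        t += random.expovariate(1.0 / arrival_mean)  # next arrival
        start = max(t, server_free_at)               # wait if server busy
        waits.append(start - t)
        server_free_at = start + random.expovariate(1.0 / service_mean)
    return sum(waits) / len(waits)

print(round(simulate_queue(), 3))
```

With arrival and service means of 1.0 and 0.8 the server is busy ~80% of the time, so average waits grow well beyond a single service time, the kind of non-obvious result simulation surfaces.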
reacted to
codelion's
post with 🔥
3 months ago
I recently worked on a LoRA that improves tool use in LLMs. Thought the approach might interest folks here.
The issue I have had when trying to use some of the local LLMs with coding agents is this:
Me: "Find all API endpoints with authentication in this codebase"
LLM: "You should look for @app.route decorators and check if they have auth middleware..."
But I often want it to actually search the files and show me, and the LLM doesn't trigger a tool call.
To fine-tune it for tool use I combined two data sources:
1. Magpie scenarios - 5000+ diverse tasks (bug hunting, refactoring, security audits)
2. Real execution - Ran these on actual repos (FastAPI, Django, React) to get authentic tool responses
This ensures the model learns both breadth (many scenarios) and depth (real tool behavior).
Tools We Taught:
- read_file - Actually read file contents
- search_files - Regex/pattern search across codebases
- find_definition - Locate classes/functions
- analyze_imports - Dependency tracking
- list_directory - Explore structure
- run_tests - Execute test suites
Improvements:
- Tool calling accuracy: 12% → 80%
- Correct parameters: 8% → 87%
- Multi-step tasks: 3% → 78%
- End-to-end completion: 5% → 80%
- Tools per task: 0.2 → 3.8
The LoRA really improves intentional tool calls. As an example, consider the query: "Find ValueError in payment module"
The response proceeds as follows:
1. Calls search_files with pattern "ValueError"
2. Gets 4 matches across 3 files
3. Calls read_file on each match
4. Analyzes context
5. Reports: "Found 3 ValueError instances: payment/processor.py:47 for invalid amount, payment/validator.py:23 for unsupported currency..."
Resources:
- Colab notebook: https://colab.research.google.com/github/codelion/ellora/blob/main/Ellora_Recipe_3_Enhanced_Tool_Calling_and_Code_Understanding.ipynb
- Model: codelion/Llama-3.2-1B-Instruct-tool-calling-lora
- GitHub: https://github.com/codelion/ellora
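On the agent side, a model-emitted tool call is typically a name plus arguments that the harness executes. A minimal stdlib dispatcher for two of the tools named above; the function signatures and call format are illustrative assumptions, not the LoRA's actual schema:

```python
import re
from pathlib import Path

def search_files(pattern: str, root: str = ".") -> list[dict]:
    """Regex search across .py files; returns file/line matches."""
    hits = []
    for path in Path(root).rglob("*.py"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if re.search(pattern, line):
                hits.append({"file": str(path), "line": lineno, "text": line.strip()})
    return hits

def read_file(path: str) -> str:
    """Return the full contents of a file."""
    return Path(path).read_text(errors="ignore")

TOOLS = {"search_files": search_files, "read_file": read_file}

def dispatch(call: dict):
    """Execute a model-emitted tool call, e.g.
    {"name": "search_files", "arguments": {"pattern": "ValueError"}}."""
    return TOOLS[call["name"]](**call["arguments"])
```

The fine-tuning described above is what teaches the model to emit such calls at the right moments; the harness side stays this simple.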
reacted to
danielhanchen's
post with ❤️
4 months ago
Run DeepSeek-V3.1 locally on 170GB RAM with Dynamic 1-bit GGUFs!
GGUFs: unsloth/DeepSeek-V3.1-GGUF
The 715GB model gets reduced to 170GB (-80% size) by smartly quantizing layers.
The 1-bit GGUF passes all our code tests & we fixed the chat template for llama.cpp supported backends.
Guide: https://docs.unsloth.ai/basics/deepseek-v3.1