Hugging Face
🏗️
Building on HF
126.0
TFLOPS
Zixi "Oz" Li
PRO
OzTianlu
9
32
44
34 followers · 34 following
https://github.com/lizixi-0x2F
lizixi-0x2F
AI & ML interests
My research focuses on deep reasoning with small language models, Transformer architecture innovation, and knowledge distillation for efficient alignment and transfer.
Recent Activity
updated a model 11 days ago
OzTianlu/Qwen3.5-2B-OBLITERATED
published a model 11 days ago
OzTianlu/Qwen3.5-2B-OBLITERATED
reacted to their post with 🔥 19 days ago
ResNet is Explicit Euler. GPT is Implicit Euler. What Else is Hiding in Plain Sight?

Read online: https://datawhalechina.github.io/learning-terrain/

I wrote an open-source monograph on learning dynamics, The Terrain of Learning. Bilingual (Chinese/English), 4 volumes, 12 chapters, 30+ print-grade figures. Completely free (CC BY-NC-SA 4.0).

The core argument: gradient descent is not optimization. It's terrain motion. The loss function is a landscape. The gradient is the direction of slope. The optimizer is how you choose each step.

Once you see it this way, everything clicks:

ResNet = explicit Euler integration on a vector field. The residual branch is the vector field. Each layer takes one Euler step.

GPT autoregression = implicit-state Euler iteration. Stable where explicit Euler explodes. That's why transformers handle long-range dependencies.

DEQ = the Banach fixed-point theorem in production. The forward pass is root-finding. There are no layers to backprop through.

KL divergence = a Bregman divergence on the entropy landscape. Your belief space is curved, not flat.

Chain-of-thought reasoning = hidden states flowing along a reasoning field toward an attractor basin. Correct answers have wide basins. The number of reasoning steps is determined by the terrain, not by the problem.

Diffusion models = systems flowing downhill along a score vector field, from noise to structure, from high energy to low energy.

The book traces one idea across 337 years: from F=ma (Newton, 1687) to H=T+V (Hamilton, 1833) to loss landscape + gradient field (2020s). Hamilton replaced a catalog of forces with one geometric object. This book does the same for deep learning.

GitHub: https://github.com/datawhalechina/learning-terrain
Discussion: https://github.com/datawhalechina/learning-terrain/discussions/2

Convergence is not hope. Convergence is geometry. You see.
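Two of the post's identifications can be checked numerically. The sketch below is not taken from the monograph; it is a minimal illustration under stated assumptions. It treats a stack of residual updates x <- x + h * f(x) as explicit Euler steps for dx/dt = f(x), using a hypothetical vector field f(x) = -x whose exact solution is known, and it compares the Bregman divergence generated by negative entropy with the KL divergence on two example distributions. All function names and test values are made up for the illustration.

```python
import math


def residual_stack(x, branch, depth, step=1.0):
    """Apply `depth` residual updates x <- x + step * branch(x).

    Each update has the form of one explicit Euler step for dx/dt = branch(x).
    """
    for _ in range(depth):
        x = [xi + step * bi for xi, bi in zip(x, branch(x))]
    return x


def decay_field(x):
    # Hypothetical vector field dx/dt = -x, chosen because its exact
    # flow is known in closed form: x(t) = x0 * exp(-t).
    return [-xi for xi in x]


def neg_entropy(p):
    """phi(p) = sum_i p_i * log(p_i), assuming every p_i > 0."""
    return sum(pi * math.log(pi) for pi in p)


def bregman_neg_entropy(p, q):
    """Bregman divergence phi(p) - phi(q) - <grad phi(q), p - q>."""
    grad_q = [math.log(qi) + 1.0 for qi in q]
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_q, p, q))
    return neg_entropy(p) - neg_entropy(q) - inner


def kl(p, q):
    """KL(p || q) for strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    # 100 residual "layers" with step 0.01 integrate the field up to t = 1.
    x0 = [1.0, -2.0]
    approx = residual_stack(x0, decay_field, depth=100, step=0.01)
    exact = [xi * math.exp(-1.0) for xi in x0]
    print("residual stack:", approx)
    print("exact flow:    ", exact)

    # Both inputs sum to 1, so the two quantities agree up to rounding.
    p, q = [0.2, 0.3, 0.5], [0.4, 0.4, 0.2]
    print("Bregman:", bregman_neg_entropy(p, q))
    print("KL:     ", kl(p, q))
```

The Bregman and KL values coincide only because both inputs are normalized; for unnormalized vectors the same Bregman divergence gives the generalized KL divergence, which carries an extra sum(q) - sum(p) term.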
Organizations
OzTianlu's activity
liked a dataset about 1 month ago
OpenDataArena/ODA-Mixture-500k • Updated Apr 27 • 506k • 472 • 123
liked a model about 2 months ago
HuggingFaceTB/nanowhale-100m • Text Generation • 0.1B • Updated May 4 • 875 • 64
liked a dataset 3 months ago
TAAC2026/data_sample_1000 • Updated Apr 10 • 1k • 958 • 93
liked a model 3 months ago
google/gemma-4-26B-A4B-it • Image-Text-to-Text • 27B • Updated 29 days ago • 13.3M • 1.21k
liked a model 4 months ago
NoesisLab/Arcade-3B • Text Generation • 3B • Updated Mar 16 • 12 • 8
liked a dataset 4 months ago
OpenDataArena/MMFineReason-SFT-123K-Qwen3-VL-235B-Thinking • Updated Feb 3 • 123k • 424 • 84
liked 4 models 4 months ago
Tesslate/OmniCoder-9B • Text Generation • 9B • Updated Mar 13 • 3.49k • 653
mispeech/dashengtokenizer • Audio-to-Audio • 0.8B • Updated Apr 21 • 2.79k • 12
NoesisLab/Collins-Embedding-3M • Sentence Similarity • Updated Mar 12 • 8
SimplySara/Kai-3B-Instruct-i1-GGUF • Text Generation • 3B • Updated Feb 28 • 18 • 1
liked a Space 4 months ago
Kai 30B Instruct 🌖 • Sleeping • Agents • 3 • Chat with Kai-30B-Instruct
liked 3 models 4 months ago
NoesisLab/Kai-30B-Instruct • Text Generation • 33B • Updated Mar 26 • 20 • 21
NoesisLab/Kai-3B-Instruct • Text Generation • 3B • Updated Mar 4 • 9 • 5
NoesisLab/Kai-0.35B-Instruct • Text Generation • 0.4B • Updated Feb 26 • 6 • 4
liked 2 datasets 4 months ago
nebius/SWE-rebench-openhands-trajectories • Updated Dec 27, 2025 • 67.1k • 5.61k • 132
openai/openai_humaneval • Updated Jan 4, 2024 • 164 • 233k • 396
liked a model 4 months ago
LocoreMind/LocoOperator-4B • Text Generation • 4B • Updated Feb 24 • 444 • 279
liked a Space 4 months ago
ChatSpartacus ⚡ • Running on Zero • Agents • 2 • Chat with Spartacus-1B-Instruct
liked a dataset 4 months ago
OpenDataArena/ODA-Mixture-100k • Updated Jan 21 • 101k • 165 • 97
liked a model 4 months ago
Qwen/Qwen3-32B • Text Generation • 33B • Updated Jul 26, 2025 • 4.47M • 709