Ran Le
leran1995
AI & ML interests
LLM Pretraining
Update README.md
#38 opened 2 months ago
by
kerasakit
Overthinking Problem
➕ 1
4
#27 opened 3 months ago
by
JainilGosalia
Add syntax highlight in python markdown code snippets
#34 opened 2 months ago
by
RahulSharma0
increase the max_new_tokens in the demo
#33 opened 2 months ago
by
bitsnaps
dataset for sft
1
#30 opened 2 months ago
by
Roman1111111
Chain-of-Thought or Chain-of-Mimicry? The Over-SFT problem in Nanbeige 4.1-3B aka "I_Should_X"
1
#26 opened 3 months ago
by
srs6901
Typo in readme: Live-Code-Bench-Pro-Mediium -> Live-Code-Bench-Pro-Medium
1
#28 opened 3 months ago
by
HenkPoley
Reportedly this 'Nanbeige/Nanbeige4.1-3B' model doesn't even know 'who' it is, or who makes it!?
31
#19 opened 3 months ago
by
dakerholdings
Very Impressive!
❤️ 14
9
#7 opened 3 months ago
by
cob05
Does anybody know the size of the context window for Nanbeige4.1-3B?
1
#20 opened 3 months ago
by
test333333
You got us hooked now, can't wait for the release of the 4.2 version. Could you please provide an ETA or an approximation of when it MIGHT release?
➕ 3
5
#17 opened 3 months ago
by
Why-T
Training data, inference scripts (tool calling, web search, and so on), and training scripts
❤️ 1
3
#16 opened 3 months ago
by
snapo
Any Plans for an Instruct Model?
🤗🔥 6
6
#15 opened 3 months ago
by
Ashacorporation
Reasoning switch?
1
#3 opened 3 months ago
by
Sinaya
License?
1
#5 opened 3 months ago
by
ivanfioravanti
Diagram and numbers don't match
1
#3 opened 3 months ago
by
kalle07
please upload to modelscope
1
#1 opened 3 months ago
by
J22
Github codebase for DPD
❤️ 1
1
#4 opened 5 months ago
by
NamburiSrinath
A good model for benchmarks but useless in daily use because it thinks too much
2
#2 opened 5 months ago
by
wh1018
My thoughts on CoT of this model
👍 1
11
#1 opened 5 months ago
by
MrDevolver