RedML

government

http://localhost:8000

Recent Activity

Fizzarolli updated a model about 1 month ago

red-ml/TankieV2a-8B

Fizzarolli updated a model about 1 month ago

red-ml/TankieV2.1-8B

Fizzarolli published a model about 1 month ago

red-ml/TankieV2.1-8B

updated 2 models about 1 month ago

red-ml/TankieV2a-8B

8B • Updated May 26 • 2

red-ml/TankieV2.1-8B

Text Generation • 8B • Updated May 26 • 2

published a model about 1 month ago

red-ml/TankieV2.1-8B

Text Generation • 8B • Updated May 26 • 2

updated a model about 1 month ago

red-ml/TankieV2b-8B

8B • Updated May 26 • 1

published a model about 1 month ago

red-ml/TankieV2b-8B

8B • Updated May 26 • 1

updated a bucket about 1 month ago

red-ml/tankieV2-data-tokenized

published a model about 2 months ago

red-ml/TankieV2a-8B

8B • Updated May 26 • 2

published a bucket about 2 months ago

red-ml/tankieV2-data-tokenized

in red-ml/prolewiki-library-markdownified about 2 months ago

[bot] Conversion to Parquet

#1 opened about 2 months ago by parquet-converter

updated 2 datasets about 2 months ago

red-ml/tankieV2-data

Viewer • Updated May 15 • 2.23k • 18 • 1

red-ml/prolewiki-library-markdownified

Viewer • Updated May 15 • 1.05k • 26

published 2 datasets about 2 months ago

red-ml/prolewiki-library-markdownified

Viewer • Updated May 15 • 1.05k • 26

red-ml/tankieV2-data

Viewer • Updated May 15 • 2.23k • 18 • 1

updated a Space about 2 months ago

README

published a Space about 2 months ago

README

posted an update about 2 years ago

hi everyone!

i wanted to share an experiment i did with upcycling phi-3 mini into an moe recently.
the benchmark differences are within the margin of error, so the two performed about the same, but i think it's an interesting base for trying to improve phi's performance! (looking into HuggingFaceFW/fineweb-edu could be interesting; i also left some other notes if anyone with more compute access wants to try it themselves)

check it out! Fizzarolli/phi3-4x4b-v1
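
For readers unfamiliar with the term, here is a minimal sketch of what "upcycling" a dense model into an MoE generally means: each expert starts as a copy of the dense feed-forward block, and a newly initialized router picks which experts to mix per token. This is not the code used for Fizzarolli/phi3-4x4b-v1. The toy dimensions, the helper names (`dense_ffn`, `upcycle`, `moe_ffn`), the ReLU activation, top-2 routing, and the choice of four experts (guessed only from the "4x4b" in the model name) are all assumptions made for illustration.

```python
import copy
import math
import random

def dense_ffn(x, w_in, w_out):
    """Toy dense feed-forward block: ReLU(x @ w_in) @ w_out.
    Weights are stored as lists of columns."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, col))) for col in w_in]
    return [sum(h * w for h, w in zip(hidden, col)) for col in w_out]

def upcycle(w_in, w_out, num_experts, d_model, rng):
    """Hypothetical helper: copy the dense FFN into every expert and
    create a small randomly initialized router (one row per expert)."""
    experts = [(copy.deepcopy(w_in), copy.deepcopy(w_out)) for _ in range(num_experts)]
    router = [[rng.gauss(0.0, 0.02) for _ in range(d_model)] for _ in range(num_experts)]
    return experts, router

def moe_ffn(x, experts, router, top_k=2):
    """Route x to the top_k experts and mix their outputs with
    softmax weights renormalized over the selected experts."""
    logits = [sum(xi * w for xi, w in zip(x, row)) for row in router]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    peak = max(logits[i] for i in top)
    gates = [math.exp(logits[i] - peak) for i in top]
    total = sum(gates)
    out = [0.0] * len(experts[0][1])
    for gate, idx in zip(gates, top):
        y = dense_ffn(x, *experts[idx])
        out = [o + (gate / total) * yi for o, yi in zip(out, y)]
    return out

# Assumed toy sizes; a real model is far larger.
rng = random.Random(0)
d_model, d_hidden, num_experts = 4, 8, 4
w_in = [[rng.gauss(0.0, 1.0) for _ in range(d_model)] for _ in range(d_hidden)]
w_out = [[rng.gauss(0.0, 1.0) for _ in range(d_hidden)] for _ in range(d_model)]
x = [rng.gauss(0.0, 1.0) for _ in range(d_model)]

experts, router = upcycle(w_in, w_out, num_experts, d_model, rng)
dense_out = dense_ffn(x, w_in, w_out)
moe_out = moe_ffn(x, experts, router)

# Right after upcycling, every expert is identical, so the mixture
# reproduces the dense output up to floating-point error.
max_diff = max(abs(a - b) for a, b in zip(dense_out, moe_out))
```

Because the experts start out identical and the gate weights sum to one, the upcycled block behaves like the dense one until further training lets the experts diverge, which is consistent with the two models benchmarking about the same.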

posted an update about 2 years ago

Is anyone looking into some sort of decentralized/federated dataset generation or classification by humans instead of synthetically?

From my experience trying models, a *lot* of modern finetunes are trained on what is, in essence, GPT-4-generated slop that makes everything sound like a GPT-4 rip-off (see, e.g., the Dolphin finetunes). I have a feeling that this is a large part of why people's finetunes haven't been quite as successful as Meta's instruct tunes of Llama 3.