anwgpt

Recent Activity

FlameF0X updated a model about 1 month ago

anwgpt/anwllama-1-chat

FlameF0X published a model about 1 month ago

anwgpt/anwllama-1-chat

FlameF0X updated a model about 1 month ago

anwgpt/anwllama-1-base


posted an update 4 days ago

Post

133

I ran some scalability tests on FWKV. At 1B parameters it hits a speed bottleneck caused by the T4’s bandwidth limitations. In theory, it should match RWKV’s inference speed on a GPU with more bandwidth, so the speed measured at the 1B size is not representative.
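To make the bandwidth argument concrete, here is a rough back-of-the-envelope sketch, not the benchmark used in the post. It assumes the bottleneck is GPU memory bandwidth, that single-stream decoding reads every weight once per token, that weights are stored in fp16 (2 bytes each), and that the T4 reaches its published peak of 320 GB/s, which real workloads do not. The function name `max_decode_tokens_per_second` is hypothetical.

```python
def max_decode_tokens_per_second(
    n_params: float, bytes_per_param: float, bandwidth_gb_s: float
) -> float:
    """Upper bound on decode speed if each token requires reading all weights once."""
    model_bytes = n_params * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes


# Assumption: NVIDIA's published peak memory bandwidth for the T4.
T4_BANDWIDTH_GB_S = 320.0
# Assumption: fp16 weights, 2 bytes per parameter.
BYTES_PER_PARAM = 2.0

for n_params in (29e6, 1e9):
    bound = max_decode_tokens_per_second(n_params, BYTES_PER_PARAM, T4_BANDWIDTH_GB_S)
    print(f"{n_params / 1e6:.0f}M params: at most {bound:.0f} tokens/s")
```

Under these assumptions the ceiling scales inversely with parameter count, so a 1B model is far more likely than a 29M model to be limited by bandwidth rather than by the architecture itself.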

posted an update 5 days ago

Post

191

Greetings Hugging Face!

I started a new project called **FWKV** (Feed-forward Weighted Key Value, or Floored Weighted Key Value), an RWKV-style LM that replaces the RNN with FFNNs (Feed-Forward Neural Networks) and uses floor(W·K·V). I'm hoping to make it much more efficient and scalable than RWKV.

So far I have:

- FlameF0X/FWKV-29M — this one is undertrained and doesn't have a Space yet. In the attached image you can see its speed on a T4 compared to models with the same configuration.

The only model that's fully working right now is:
- FlameF0X/FWKV-TinyStories — trained on TinyStories for one epoch. The demo Space is FlameF0X/FWKV-demo.

2 replies


updated a model about 1 month ago

anwgpt/anwllama-1-chat

Text Generation • 34M • Updated Apr 8 • 33

published a model about 1 month ago

anwgpt/anwllama-1-chat

Text Generation • 34M • Updated Apr 8 • 33

updated a model about 1 month ago

anwgpt/anwllama-1-base

Text Generation • 34M • Updated Apr 8 • 18

updated 2 collections about 1 month ago

anwllama 1

1 item • Updated Apr 8

anwgpt4

4 items • Updated Apr 8

published a model about 1 month ago

anwgpt/anwllama-1-base

Text Generation • 34M • Updated Apr 8 • 18

updated a model about 2 months ago

anwgpt/anwgpt4.1-opus

published a model about 2 months ago

anwgpt/anwgpt4.1-opus

updated 2 models about 2 months ago

anwgpt/anwgpt4.1-chat

Text Generation • 30.7M • Updated Mar 25 • 1.35k • 2

anwgpt/anwgpt4-chat

Text Generation • 27.2M • Updated Mar 25 • 1.36k • 2

updated 2 models 4 months ago

anwgpt/anwgpt4-base

Text Generation • 27.2M • Updated Jan 31 • 27 • 2

anwgpt/anwgpt4.1-base

Text Generation • 30.7M • Updated Jan 31 • 3

updated a collection 4 months ago

anwgpt4

4 items • Updated Apr 8

published 2 models 4 months ago

anwgpt/anwgpt4.1-chat

Text Generation • 30.7M • Updated Mar 25 • 1.35k • 2

anwgpt/anwgpt4.1-base

Text Generation • 30.7M • Updated Jan 31 • 3

posted an update 9 months ago

Post

4371

I am very sad to say that the budget for creating the SnowflakeCore-G1 1B and 7B MoE models has run out, and I can no longer pre-train them.

7 replies
