I guess the reason it's slow is because llama.cpp is not optimized...
appvoid
Correct! It's causal modeling (for now) with a char level tokenizer with only 8 tokens.
The model learns by looking for relationships across sequences to predict a single token, so the only way it learns is literally by nudging weights towards a generalized solution using pure sequences.
In short, it learns to learn.
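A char-level tokenizer with only 8 tokens can be sketched in a few lines. The actual symbol set used by the dot model isn't stated here, so this alphabet is purely an assumption for illustration:

```python
# Hypothetical sketch of a char-level tokenizer with a tiny 8-symbol
# vocabulary. The real symbols used by the dot model are not public;
# this alphabet is an assumption.
VOCAB = list(".01 ()|#")  # 8 hypothetical characters
assert len(VOCAB) == 8
STOI = {ch: i for i, ch in enumerate(VOCAB)}  # char -> token id
ITOS = {i: ch for ch, i in STOI.items()}      # token id -> char

def encode(text: str) -> list[int]:
    """Map each character to its token id; characters outside the vocab are skipped."""
    return [STOI[ch] for ch in text if ch in STOI]

def decode(ids: list[int]) -> str:
    """Inverse mapping from token ids back to a string."""
    return "".join(ITOS[i] for i in ids)

print(encode("..01"))          # -> [0, 0, 1, 2]
print(decode(encode("..01")))  # -> ..01
```

With a vocabulary this small, all the signal has to come from sequence structure rather than token semantics, which matches the "pure sequences" framing above.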
Will there be any app to... convert the dots to something meaningful?
Not yet, I'm focusing on getting the core right first. But once the model is general enough, I don't see why not. Though you might need to finetune it for your use case.
It's already decent at some tasks, with the next version coming in a few weeks.
appvoid/dot
Just published: the first model proudly trained from scratch on "physical" reasoning instead of chunky language tokens.
if you need raw power and can tolerate slowness, rwkv 0.4b has you covered; if you need something in between, choose lfm2 350m
indeed
The first project, as far as I know, that focuses purely on few-shot prompting results rather than zero-shot, as is usually done with decoder-only transformer models. This model excels at few-shot tasks compared to most 0.6b and even bigger models. It also outperforms the base model on some popular language-modeling benchmarks.
appvoid/arco-3
Try it yourself!
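The difference between few-shot and zero-shot prompting is just whether solved examples are prepended before the query. A minimal sketch, with illustrative demonstrations that are assumptions rather than the actual arco-3 evaluation setup:

```python
# Sketch of few-shot prompting for a decoder-only model: prepend solved
# input/output pairs so the model can infer the task pattern in context.
# The example task (small additions) is a hypothetical illustration.
def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Format demonstrations and the query as Q/A pairs in one prompt string."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {query}\nA:"

prompt = build_few_shot_prompt(
    [("2+2", "4"), ("3+5", "8")],  # demonstrations (the "shots")
    "7+1",                          # the actual query to complete
)
print(prompt)
```

A zero-shot prompt would be the same call with an empty demonstration list; few-shot evaluation measures how well the model picks up the pattern from the shots alone.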
Do you have your raspberry pis and phones ready for this new model yet?
New model, new architecture, more power:
Since I'm unable to post for about 11 hours, I will post it here: https://huggingface.co/appvoid/arco-3
the issue is, we have yet to find how to apply that to language space
Current transformer-based, self-supervised systems have driven massive gains, but important gaps remain on the path to AGI. Key missing pieces are continual, curiosity-driven learning; grounded multimodal perception; reliable, contextual long-term memory with forgetting; motivated (hot) executive control and dynamic attention; metacognition and coherent causal world-models; and robust fluid reasoning, planning and decision-making. Progress will require hybrid architectures (neuromorphic/Hebbian + gradients + symbolic modules), active-inference and intrinsic-motivation objectives, and new lifelong, embodied benchmarks to evaluate safety and competence.
https://huggingface.co/blog/KnutJaegersberg/whats-missing-for-agi-in-todays-tech-trajectories
all i can say to this question is i don't know; maybe it could rapidly develop into an asi, and if the amount of compute for a superintelligence ends up being at human-brain level or even a little more than that, then it's easier to picture the implications
in your hypothetical scenario, there could be rich people who become criminals with such power, btw; corruption is universal, and the power of knowing non-obvious things about reality itself, with no higher intelligence to check it, can be pointed at the masses
obviously gpt-2 was the kickstarter for openai, but they didn't actually know the power of gpt-4 when they created gpt-1
same thing could happen with reasoning, it might or might not have bigger implications, who knows
good point, if someone creates an ai that extrapolates to any dataset, it might just advance science more quickly than the average damage bad guys cause
i know right?