YarikDev
yarikdevcom
AI & ML interests
Discovering
Recent Activity
liked a model 13 days ago
Tesslate/OmniCoder-9B-GGUF
liked a model 15 days ago
Tesslate/OmniCoder-9B
reacted to ajibawa-2023's post with 🔥 24 days ago
Cpp-Code-Large
Dataset: https://huggingface.co/datasets/ajibawa-2023/Cpp-Code-Large
Cpp-Code-Large is a large-scale corpus of C++ source code comprising more than 5 million lines of C++ code. The dataset is designed to support research in large language model (LLM) pretraining, code intelligence, software engineering automation, and static program analysis for the C++ ecosystem.
By providing a high-volume, language-specific corpus, Cpp-Code-Large enables systematic experimentation in C++-focused model training, domain adaptation, and downstream code understanding tasks.
Cpp-Code-Large addresses the need for a dedicated C++-only dataset at substantial scale, enabling focused research across systems programming, performance-critical applications, embedded systems, game engines, and large-scale native software projects.
Organizations
None yet