Artem Rebrikov
elyha7
AI & ML interests
None yet
Recent Activity
reacted to ajibawa-2023's post with 🔥 4 days ago
Cpp-Code-Large
Dataset: https://huggingface.co/datasets/ajibawa-2023/Cpp-Code-Large
Cpp-Code-Large is a large-scale corpus of more than 5 million lines of C++ source code. The dataset is designed to support research in large language model (LLM) pretraining, code intelligence, software engineering automation, and static program analysis for the C++ ecosystem.
By providing a high-volume, language-specific corpus, Cpp-Code-Large enables systematic experimentation in C++-focused model training, domain adaptation, and downstream code understanding tasks.
Cpp-Code-Large addresses the need for a dedicated C++-only dataset at substantial scale, enabling focused research across systems programming, performance-critical applications, embedded systems, game engines, and large-scale native software projects.
updated a collection 11 days ago
TTS
updated a collection 11 days ago
VLM
Organizations
None yet