Leo
leonelcde
AI & ML interests
None yet
Recent Activity
liked a dataset 3 days ago
ajibawa-2023/Shell-Code-Large
reacted to ajibawa-2023's post with 🔥 3 days ago
Shell-Code-Large
Dataset: https://huggingface.co/datasets/ajibawa-2023/Shell-Code-Large
Shell-Code-Large is a large-scale corpus of Shell scripting source code comprising approximately 640,000 code samples stored in JSON Lines (.jsonl) format. The dataset is designed to support research in large language model (LLM) pretraining, code intelligence, DevOps automation, cloud infrastructure engineering, system administration, and software engineering automation.
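Since the samples are stored as JSON Lines, a first look at a downloaded shard could be sketched roughly as below. This is a minimal illustration, not the dataset's official loader: the function name `summarize_jsonl` and the `code` field in the inline sample are assumptions made for the example, and the real field names should be read from the dataset card.

```python
import io
import json
from collections import Counter


def summarize_jsonl(stream):
    """Count records and tally top-level keys in a JSON Lines stream."""
    record_count = 0
    key_counts = Counter()
    for line in stream:
        line = line.strip()
        if not line:
            # Skip blank lines rather than failing on them.
            continue
        record = json.loads(line)  # one JSON document per line
        record_count += 1
        if isinstance(record, dict):
            key_counts.update(record.keys())
    return record_count, key_counts


# Hypothetical two-record sample; "code" is an assumed field name.
sample = io.StringIO(
    '{"code": "#!/bin/sh\\necho hello"}\n'
    '\n'
    '{"code": "ls -la | wc -l"}\n'
)
count, key_counts = summarize_jsonl(sample)
print(count, dict(key_counts))
```

For a real shard, the same function can be given an open file handle (for example `open(path, encoding="utf-8")`), which streams the file line by line instead of loading it all into memory.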
By providing a high-volume, language-specific corpus focused exclusively on Shell scripting, Shell-Code-Large enables systematic experimentation in automation workflows, deployment pipelines, infrastructure management, and command-line tooling. These domains remain foundational to Linux systems, cloud-native platforms, CI/CD environments, and modern DevOps practices.
Shell-Code-Large addresses the need for a dedicated Shell-focused dataset at substantial scale, enabling targeted research into scripting patterns, command composition, workflow orchestration, infrastructure automation, and operational engineering practices.
liked a dataset 8 days ago
Glint-Research/Fable-5-traces
Organizations
None yet