Feynman Innovations

ajibawa-2023

AjinkyaBawase

AI & ML interests

LLM, RL, DL, ML, AGI. Developing LLMs (preferably fully fine-tuned) for various use cases.

Shell-Code-Large
Dataset: https://huggingface.co/datasets/ajibawa-2023/Shell-Code-Large

Shell-Code-Large is a large-scale corpus of Shell scripting source code comprising approximately 640,000 code samples stored in JSON Lines (.jsonl) format. The dataset is designed to support research in large language model (LLM) pretraining, code intelligence, DevOps automation, cloud infrastructure engineering, system administration, and software engineering automation.
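
Since the samples are stored as JSON Lines, one way to take a first look at a downloaded shard is to stream it record by record and tally which fields appear. The following is a minimal sketch using only the Python standard library; the file name in the usage note is hypothetical, and no particular field names are assumed because the post does not list the schema.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterator, Tuple


def iter_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a .jsonl file."""
    with path.open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines instead of failing on them
                yield json.loads(line)


def summarize(path: Path) -> Tuple[int, Counter]:
    """Return the record count and how often each top-level field occurs."""
    total = 0
    fields: Counter = Counter()
    for record in iter_jsonl(path):
        total += 1
        fields.update(record.keys())
    return total, fields
```

For example, `summarize(Path("shell_code_large.jsonl"))` (a hypothetical local file name) would return the number of samples in that file together with a count per field name, which is a quick way to discover the actual schema before writing any training code. Streaming line by line keeps memory use flat even for a corpus of this size.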

By providing a high-volume, language-specific corpus focused exclusively on Shell scripting, Shell-Code-Large enables systematic experimentation in automation workflows, deployment pipelines, infrastructure management, and command-line tooling. These domains remain foundational to Linux systems, cloud-native platforms, CI/CD environments, and modern DevOps practices.

Shell-Code-Large addresses the need for a dedicated Shell-focused dataset at substantial scale, enabling targeted research into scripting patterns, command composition, workflow orchestration, infrastructure automation, and operational engineering practices.
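
As one illustration of what a study of command composition might look like, the sketch below counts which commands start each stage of a pipeline or command list in a script. It is a rough heuristic, not a Shell parser, and not a method described in the post: it assumes Python 3.8 or later, works line by line, and will also count keywords such as `if` or variable assignments as if they were commands. The function name `command_names` is made up for this example.

```python
import shlex
from collections import Counter

# Operators after which a new command is expected to begin.
SEPARATORS = {"|", "||", "&&", ";", "&"}


def command_names(script: str) -> Counter:
    """Count the first word of every pipeline stage in a shell script (heuristic)."""
    counts: Counter = Counter()
    for line in script.splitlines():
        # punctuation_chars=True makes operators like | and && separate tokens;
        # the lexer also drops '#' comments by default.
        lexer = shlex.shlex(line, posix=True, punctuation_chars=True)
        lexer.whitespace_split = True
        try:
            tokens = list(lexer)
        except ValueError:
            continue  # e.g. an unterminated quote spanning several lines
        expect_command = True
        for token in tokens:
            if token in SEPARATORS:
                expect_command = True
            elif expect_command:
                counts[token] += 1
                expect_command = False
    return counts
```

Applied to each sample in the corpus and summed, counts like these give a first picture of which tools are most often chained together; a proper analysis would need a real Shell grammar to handle heredocs, subshells, and multi-line constructs.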