Request for Access to UltraData-SFT-2605 Dataset for Research & Open Source Contribution

#1
by arpitsh018 - opened

We deeply appreciate your vision and contributions to the open-source community. Your description mentions that your team uploaded the SFT dataset, as part of the complete training data, at https://huggingface.co/datasets/openbmb/UltraData-SFT-2605, but the dataset does not currently appear to be accessible for public use or testing. Could you make it available? Open access would let the community conduct meaningful research and analysis, better understand large-scale, high-quality SFT data, improve transparency and reproducibility, and contribute responsibly to further advances in this space. Access to high-quality supervised fine-tuning and pretraining datasets like this would significantly benefit the broader ecosystem. We would greatly appreciate any update on its availability, and thank you again for your contributions to the community.
