# Dataset Card for Custom Text Dataset

## Dataset Name

Custom CNN/DailyMail Text Summarization Dataset

## Overview

This dataset is a custom subset and extension of the CNN/DailyMail dataset, consisting of news articles and their corresponding summaries.

## Composition

Train Dataset: A custom train dataset consisting of one long news article with its manually written summary.

Test Dataset: A test dataset sampled from the original CNN/DailyMail dataset, consisting of 100 articles and their corresponding highlights.

## Collection Process

The custom train dataset was crafted using news articles from the CNN/DailyMail dataset.

## Preprocessing

The input text was tokenized.

## How to Use

```python
from datasets import load_from_disk

# Load the custom train and test splits from their local directories
train_dataset = load_from_disk("./results/custom_dataset/train")
test_dataset = load_from_disk("./results/custom_dataset/test")
```

## Evaluation

Summaries generated for the articles in this dataset can be scored against the reference summaries using metrics such as ROUGE or BLEU.

## Limitations

The train dataset consists of only one example.

## Ethical Considerations

The data originates from news sources, which may contain sensitive or politically biased content.
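
As an illustration of the ROUGE metric named under Evaluation, the sketch below computes a simplified ROUGE-1 F1 score (unigram overlap) between a generated summary and a reference summary. It is a minimal approximation, assuming lowercasing and whitespace tokenization with no stemming; the function name `rouge1_f1` and the two example strings are hypothetical and not taken from the dataset. Scores meant for reporting would normally come from an established ROUGE implementation, which may tokenize differently.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between candidate and reference."""
    # Assumption: lowercase + whitespace tokenization, no stemming.
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example strings, not drawn from the dataset.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
score = rouge1_f1(candidate, reference)
print(f"ROUGE-1 F1: {score:.3f}")
```

In practice, `candidate` would be a model-generated summary and `reference` the corresponding highlight from the test dataset, with the score averaged over all 100 test examples.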