# Task-Specific Datasets

This folder contains example datasets for specific NLP tasks.

## Tasks

### ❓ Question Answering

#### Extractive QA (SQuAD-style)

```python
{
    'context': 'Paris is the capital of France...',
    'question': 'What is the capital of France?',
    'answers': {
        'text': ['Paris'],
        'answer_start': [0]  # Character offset of the answer in context
    }
}
```

#### Multiple Choice QA

```python
{
    'question': 'What is 2+2?',
    'choices': ['3', '4', '5', '6'],
    'answer': 1  # Index of correct answer
}
```

**Best Practices:**
- Validate answer spans
- Handle impossible questions
- Question type classification
- Context length management

### 📝 Summarization

#### News Summarization

```python
{
    'article': 'Long news article...',
    'summary': 'Brief summary...',
    'compression_ratio': 0.24
}
```

**Metrics:**
- ROUGE scores
- Compression ratio (summary length relative to article length; 20-30% is the target range)
- Abstractive vs Extractive

**Best Practices:**
- Multiple reference summaries
- Length constraints
- Quality validation

### 🏷️ Named Entity Recognition

#### BIO Tagging

```python
{
    'tokens': ['John', 'Smith', 'works', 'at', 'Google'],
    'ner_tags': ['B-PER', 'I-PER', 'O', 'O', 'B-ORG']
}
```

**Tag Schema:**
- B-PER, I-PER (Person)
- B-ORG, I-ORG (Organization)
- B-LOC, I-LOC (Location)
- O (Outside)

**Best Practices:**
- Consistent tagging scheme
- Entity type taxonomy
- Nested entities handling
- Entity linking (optional)

### 😊 Sentiment Analysis

#### Binary/Multi-class

```python
{
    'text': 'This product is amazing!',
    'label': 2,  # 0: neg, 1: neutral, 2: pos
    'confidence': 0.95
}
```

#### Aspect-Based

```python
{
    'text': 'Great product but slow delivery',
    'aspect_sentiments': {
        'product': 'positive',
        'delivery': 'negative'
    }
}
```

**Best Practices:**
- Multi-level granularity
- Confidence scores
- Domain-specific lexicons
- Emotion detection

### 📊 Text Classification

#### Topic Classification

```python
{
    'text': 'Article text...',
    'label': 'technology',
    'label_id': 0
}
```

**Best Practices:**
- Balanced classes
- Hierarchical categories
- Multi-label support
- Class imbalance handling

### 🎯 Multi-Task Learning

#### Unified Format

```python
{
    'text': 'Sample text...',
    'sentiment': 'positive',
    'topic': 'technology',
    'quality_score': 0.85
}
```

**Best Practices:**
- Consistent preprocessing
- Task-specific heads
- Shared representations
- Task weighting

## Dataset Statistics

| Task | Examples | Format |
|------|----------|--------|
| QA | 300 | Extractive + MC |
| Summarization | 100 | News articles |
| NER | 100 | BIO tagged |
| Sentiment | 350 | Multi-class + Aspect |
| Classification | 200 | Topic |
| Multi-Task | 100 | Unified |

## Quality Metrics

### QA
- Exact Match (EM)
- F1 Score
- Answer span accuracy

### Summarization
- ROUGE-1, ROUGE-2, ROUGE-L
- Compression ratio
- Factual consistency

### NER
- Precision, Recall, F1 per entity type
- Exact match
- Partial match

### Sentiment
- Accuracy
- Macro/Micro F1
- Confusion matrix

### Classification
- Accuracy
- Per-class F1
- Macro/Weighted F1

## Best Practices (General)

- ✅ Clear annotation guidelines
- ✅ Inter-annotator agreement
- ✅ Quality control checks
- ✅ Regular dataset updates
- ✅ Version control
- ✅ Documentation
- ✅ Ethical considerations
- ✅ Bias analysis
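
Some of the quality control checks mentioned in this README (validating answer spans, keeping the BIO tagging scheme consistent, tracking the summarization compression ratio) can be automated. The following is a minimal sketch, not part of this folder's tooling: the function names are hypothetical, the record layouts are the ones shown in the examples above, and measuring the compression ratio in whitespace-separated tokens is an assumption, since the unit is not specified here.

```python
def validate_answer_span(example):
    """Check that each answer text appears at its stated character offset.

    An example with empty answer lists (an "impossible" question) passes,
    because there is no span to contradict.
    """
    context = example["context"]
    texts = example["answers"]["text"]
    starts = example["answers"]["answer_start"]
    if len(texts) != len(starts):
        return False
    return all(
        start >= 0 and context[start:start + len(text)] == text
        for text, start in zip(texts, starts)
    )


def validate_bio_tags(tags):
    """Check that every I-X tag directly follows a B-X or I-X tag of the same type."""
    previous = "O"
    for tag in tags:
        if tag.startswith("I-"):
            entity_type = tag[2:]
            if previous not in (f"B-{entity_type}", f"I-{entity_type}"):
                return False
        previous = tag
    return True


def compression_ratio(article, summary):
    """Summary length divided by article length (assumption: whitespace tokens)."""
    article_tokens = article.split()
    if not article_tokens:
        raise ValueError("article must not be empty")
    return len(summary.split()) / len(article_tokens)


if __name__ == "__main__":
    qa_example = {
        "context": "Paris is the capital of France...",
        "question": "What is the capital of France?",
        "answers": {"text": ["Paris"], "answer_start": [0]},
    }
    print(validate_answer_span(qa_example))
    print(validate_bio_tags(["B-PER", "I-PER", "O", "O", "B-ORG"]))
```

Checks like these are cheap enough to run over every record before a dataset version is published; records that fail can be sent back for re-annotation rather than silently dropped.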