# Datasets for Specific Tasks
This folder contains dataset examples for specific NLP tasks.
## Tasks
### ❓ Question Answering
#### Extractive QA (SQuAD-style)
```python
{
    'context': 'Paris is the capital of France...',
    'question': 'What is the capital of France?',
    'answers': {
        'text': ['Paris'],
        'answer_start': [0]
    }
}
```
#### Multiple Choice QA
```python
{
    'question': 'What is 2+2?',
    'choices': ['3', '4', '5', '6'],
    'answer': 1  # index of the correct answer
}
```
**Best Practices:**
- Validate answer spans
- Handle impossible questions
- Question type classification
- Context length management
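The "validate answer spans" point can be sketched as a check that each `answer_start` offset actually indexes the answer text inside the context. Field names follow the extractive example above; `validate_spans` is an illustrative helper, not a library function:

```python
def validate_spans(example):
    """Check that every answer span matches the context at answer_start."""
    context = example['context']
    answers = example['answers']
    for text, start in zip(answers['text'], answers['answer_start']):
        if context[start:start + len(text)] != text:
            return False
    return True

example = {
    'context': 'Paris is the capital of France.',
    'question': 'What is the capital of France?',
    'answers': {'text': ['Paris'], 'answer_start': [0]},
}
print(validate_spans(example))  # True
```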
### 📝 Summarization
#### News Summarization
```python
{
    'article': 'Long news article...',
    'summary': 'Brief summary...',
    'compression_ratio': 0.24
}
```
**Metrics:**
- ROUGE scores
- Compression ratio (20-30% optimal)
- Abstractive vs. extractive
**Best Practices:**
- Multiple reference summaries
- Length constraints
- Quality validation
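The compression-ratio guideline can be checked mechanically. This sketch assumes a word-level ratio (the section does not specify word- vs. character-level), and `compression_ratio` is a hypothetical helper:

```python
def compression_ratio(article, summary):
    """Word-level compression ratio: summary length / article length."""
    return len(summary.split()) / len(article.split())

article = ' '.join(['word'] * 100)  # stand-in for a 100-word article
summary = ' '.join(['word'] * 24)   # stand-in for a 24-word summary
ratio = compression_ratio(article, summary)
print(round(ratio, 2))  # 0.24, inside the suggested 20-30% band
```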
### 🏷️ Named Entity Recognition
#### BIO Tagging
```python
{
    'tokens': ['John', 'Smith', 'works', 'at', 'Google'],
    'ner_tags': ['B-PER', 'I-PER', 'O', 'O', 'B-ORG']
}
```
**Tag Schema:**
- B-PER, I-PER (Person)
- B-ORG, I-ORG (Organization)
- B-LOC, I-LOC (Location)
- O (Outside)
**Best Practices:**
- Consistent tagging scheme
- Entity type taxonomy
- Nested entity handling
- Entity linking (optional)
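A consistent tagging scheme can be enforced with a well-formedness check: in BIO, an `I-X` tag is only valid directly after a `B-X` or `I-X` of the same type. `valid_bio` is an illustrative helper:

```python
def valid_bio(tags):
    """Return True if a BIO tag sequence is well-formed:
    every I-X must follow a B-X or I-X of the same entity type."""
    prev = 'O'
    for tag in tags:
        if tag.startswith('I-'):
            entity = tag[2:]
            if prev not in (f'B-{entity}', f'I-{entity}'):
                return False
        prev = tag
    return True

print(valid_bio(['B-PER', 'I-PER', 'O', 'O', 'B-ORG']))  # True
print(valid_bio(['O', 'I-PER']))                         # False
```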
### 😊 Sentiment Analysis
#### Binary/Multi-class
```python
{
    'text': 'This product is amazing!',
    'label': 2,  # 0: neg, 1: neutral, 2: pos
    'confidence': 0.95
}
```
#### Aspect-Based
```python
{
    'text': 'Great product but slow delivery',
    'aspect_sentiments': {
        'product': 'positive',
        'delivery': 'negative'
    }
}
```
**Best Practices:**
- Multi-level granularity
- Confidence scores
- Domain-specific lexicons
- Emotion detection
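Both formats above can be validated against a fixed label set. This sketch assumes the 0/1/2 mapping from the inline comment (negative/neutral/positive); the helper names are illustrative:

```python
LABELS = {'negative': 0, 'neutral': 1, 'positive': 2}

def validate_multiclass(example):
    """Check the integer label and the confidence range for the multi-class format."""
    return example['label'] in LABELS.values() and 0.0 <= example['confidence'] <= 1.0

def validate_aspects(example):
    """Check that every aspect uses a known sentiment string."""
    return all(s in LABELS for s in example['aspect_sentiments'].values())

print(validate_multiclass({'text': 'This product is amazing!',
                           'label': 2, 'confidence': 0.95}))  # True
print(validate_aspects({'text': 'Great product but slow delivery',
                        'aspect_sentiments': {'product': 'positive',
                                              'delivery': 'negative'}}))  # True
```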
### 📊 Text Classification
#### Topic Classification
```python
{
    'text': 'Article text...',
    'label': 'technology',
    'label_id': 0
}
```
**Best Practices:**
- Balanced classes
- Hierarchical categories
- Multi-label support
- Class imbalance handling
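The class-balance points can be monitored with a simple frequency check. The `3.0` imbalance threshold below is an arbitrary illustrative choice, not a standard value:

```python
from collections import Counter

def class_balance(labels, max_ratio=3.0):
    """Report class counts and whether the most frequent class stays
    within max_ratio x the rarest class."""
    counts = Counter(labels)
    most, least = max(counts.values()), min(counts.values())
    return counts, most / least <= max_ratio

labels = ['technology'] * 80 + ['sports'] * 70 + ['politics'] * 50
counts, balanced = class_balance(labels)
print(dict(counts), balanced)  # {'technology': 80, 'sports': 70, 'politics': 50} True
```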
### 🎯 Multi-Task Learning
#### Unified Format
```python
{
    'text': 'Sample text...',
    'sentiment': 'positive',
    'topic': 'technology',
    'quality_score': 0.85
}
```
**Best Practices:**
- Consistent preprocessing
- Task-specific heads
- Shared representations
- Task weighting
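Task weighting is typically implemented as a weighted sum of per-task losses feeding one training objective. This is a minimal sketch with made-up loss values and weights:

```python
def multi_task_loss(losses, weights):
    """Combine per-task losses into one objective; weights are a tuning choice."""
    return sum(weights[task] * loss for task, loss in losses.items())

losses = {'sentiment': 0.8, 'topic': 1.2, 'quality': 0.5}
weights = {'sentiment': 1.0, 'topic': 0.5, 'quality': 2.0}
print(round(multi_task_loss(losses, weights), 2))  # 2.4
```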
## Dataset Statistics
| Task | Examples | Format |
|------|----------|--------|
| QA | 300 | Extractive + MC |
| Summarization | 100 | News articles |
| NER | 100 | BIO tagged |
| Sentiment | 350 | Multi-class + Aspect |
| Classification | 200 | Topic |
| Multi-Task | 100 | Unified |
## Quality Metrics
### QA
- Exact Match (EM)
- F1 Score
- Answer span accuracy
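EM and token-level F1 can be computed without external dependencies. This sketch uses simple lowercase/whitespace normalization (the official SQuAD script additionally strips punctuation and articles):

```python
from collections import Counter

def exact_match(pred, gold):
    """Exact Match after lowercase/whitespace normalization."""
    return int(pred.strip().lower() == gold.strip().lower())

def token_f1(pred, gold):
    """Token-overlap F1 between a predicted and a gold answer."""
    pred_tokens = pred.lower().split()
    gold_tokens = gold.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match('Paris', 'paris'))           # 1
print(token_f1('the capital Paris', 'Paris'))  # 0.5
```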
### Summarization
- ROUGE-1, ROUGE-2, ROUGE-L
- Compression ratio
- Factual consistency
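Production setups usually rely on a ROUGE package, but ROUGE-1 reduces to unigram-overlap F1 and can be sketched directly; `rouge1_f1` is an illustrative helper:

```python
from collections import Counter

def rouge1_f1(summary, reference):
    """Minimal ROUGE-1: unigram-overlap F1 between summary and reference."""
    sum_counts = Counter(summary.lower().split())
    ref_counts = Counter(reference.lower().split())
    overlap = sum((sum_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(sum_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1('the cat sat', 'the cat sat on the mat'))  # ~0.667
```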
### NER
- Precision, recall, F1 per entity type
- Exact match
- Partial match
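Exact-match NER scoring works on entity spans rather than individual tags, so the BIO sequence is first converted to `(type, start, end)` spans. `extract_spans` is an illustrative helper (stray `I-` tags are treated as closing the current entity):

```python
def extract_spans(tags):
    """Convert a BIO tag sequence into (type, start, end) entity spans."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags + ['O']):  # sentinel 'O' closes a trailing entity
        if tag.startswith('B-') or tag == 'O' or (tag.startswith('I-') and tag[2:] != etype):
            if etype is not None:
                spans.append((etype, start, i))
            start, etype = (i, tag[2:]) if tag.startswith('B-') else (None, None)
    return spans

gold = extract_spans(['B-PER', 'I-PER', 'O', 'O', 'B-ORG'])
pred = extract_spans(['B-PER', 'O', 'O', 'O', 'B-ORG'])
tp = len(set(gold) & set(pred))  # only exact span matches count
precision, recall = tp / len(pred), tp / len(gold)
print(precision, recall)  # 0.5 0.5
```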
### Sentiment
- Accuracy
- Macro/micro F1
- Confusion matrix
### Classification
- Accuracy
- Per-class F1
- Macro/weighted F1
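Per-class, macro, and weighted F1 can be sketched from raw predictions (helper names are illustrative; real projects would typically use scikit-learn's `f1_score`):

```python
from collections import Counter

def per_class_f1(y_true, y_pred, label):
    """One-vs-rest F1 for a single class label."""
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    fp = sum(p == label != t for t, p in zip(y_true, y_pred))
    fn = sum(t == label != p for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = ['tech', 'tech', 'tech', 'sport']
y_pred = ['tech', 'tech', 'sport', 'sport']
labels = ['tech', 'sport']
f1s = [per_class_f1(y_true, y_pred, lbl) for lbl in labels]
macro = sum(f1s) / len(f1s)                      # unweighted mean over classes
counts = Counter(y_true)
weighted = sum(f * counts[lbl] for f, lbl in zip(f1s, labels)) / len(y_true)
print(round(macro, 3), round(weighted, 3))  # 0.733 0.767
```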
## Best Practices (General)
- ✅ Clear annotation guidelines
- ✅ Inter-annotator agreement
- ✅ Quality control checks
- ✅ Regular dataset updates
- ✅ Version control
- ✅ Documentation
- ✅ Ethical considerations
- ✅ Bias analysis