sbompolas/Lesbian-morphosyntactic-parsing

sbompolas committed on Jun 28, 2025

Commit 9c8bfef (verified) · 1 Parent(s): 2aba304

Update README.md

README.md +67 -5

@@ -1,8 +1,8 @@
 ---
-title: Lesbian Morphosyntactic Parsing
-emoji: 📊
-colorFrom: green
-colorTo: pink
 sdk: gradio
 sdk_version: 5.35.0
 app_file: app.py
@@ -10,4 +10,66 @@ pinned: false
 license: cc-by-4.0
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: Stanza Parser with CoNLL-U Viewer
+emoji: 🔍
+colorFrom: blue
+colorTo: green
 sdk: gradio
 sdk_version: 5.35.0
 app_file: app.py
 license: cc-by-4.0
 ---
+# Stanza Parser with CoNLL-U Viewer
+A linguistic analysis tool built on Stanford's Stanza library that parses sentences and presents the results as raw CoNLL-U text, an interactive data table, and a text-based dependency visualization.
+## Features
+- **Multi-language Support**: Parse text in English, Spanish, French, German, Chinese, Russian, and Arabic
+- **CoNLL-U Output**: Get standard linguistic annotation format output
+- **Interactive Data Table**: Browse parsed tokens with all linguistic features
+- **Dependency Visualization**: Text-based visualization of dependency relationships
+- **Copy-friendly Output**: Easy to copy results for use in other tools
+## What is CoNLL-U?
+CoNLL-U is a standard format for representing linguistic annotations that includes:
+- **Tokenization**: Word and sentence boundaries
+- **Part-of-Speech Tagging**: Universal and language-specific POS tags
+- **Lemmatization**: Base forms of words
+- **Morphological Features**: Grammatical attributes
+- **Dependency Parsing**: Syntactic relationships between words
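
The layers listed above map onto the ten tab-separated columns of a CoNLL-U token line (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The following is a minimal sketch of reading that format using only the Python standard library. The sample rows are hand-written for illustration and are an assumption, not captured output from this Space; `parse_conllu` and `CONLLU_FIELDS` are hypothetical names, and multiword-token ranges and empty nodes are not treated specially.

```python
# The ten CoNLL-U columns, in the order defined by the format.
CONLLU_FIELDS = (
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
)

# Hand-written sample rows (an assumption for illustration, not tool output).
# Fields are written space-separated here and joined with tabs below.
SAMPLE_ROWS = [
    "1 The the DET DT Definite=Def|PronType=Art 2 det _ _",
    "2 cat cat NOUN NN Number=Sing 3 nsubj _ _",
    "3 sits sit VERB VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 0 root _ _",
    "4 on on ADP IN _ 6 case _ _",
    "5 the the DET DT Definite=Def|PronType=Art 6 det _ _",
    "6 mat mat NOUN NN Number=Sing 3 obl _ _",
]
SAMPLE_CONLLU = (
    "# text = The cat sits on the mat\n"
    + "\n".join("\t".join(row.split()) for row in SAMPLE_ROWS)
    + "\n\n"
)


def parse_conllu(text):
    """Parse CoNLL-U text into a list of sentences (lists of token dicts)."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
        elif line.startswith("#"):
            # Comment lines (e.g. "# text = ...") carry sentence metadata.
            continue
        else:
            columns = line.split("\t")
            if len(columns) != len(CONLLU_FIELDS):
                raise ValueError(f"expected 10 columns, got {len(columns)}")
            current.append(dict(zip(CONLLU_FIELDS, columns)))
    if current:
        sentences.append(current)
    return sentences


sentences = parse_conllu(SAMPLE_CONLLU)
for token in sentences[0]:
    print(token["id"], token["form"], token["upos"], token["head"], token["deprel"])
```

All values are kept as strings, so `head` has to be converted with `int()` before it is used as an index.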
+## How to Use
+1. Enter your text in the input box
+2. Select the appropriate language
+3. Click "Parse Text" or press Enter
+4. View results in three formats:
+   - Raw CoNLL-U format (copy-paste ready)
+   - Interactive data table
+   - Dependency structure visualization
+## Example Output
+For the sentence "The cat sits on the mat", you'll get:
+- **CoNLL-U format**: Standard 10-column format with all linguistic features
+- **Data table**: Interactive view of each token's properties
+- **Dependencies**: "cat --nsubj--> sits", "mat --obl--> sits", etc. (each arrow points from a dependent to its head)
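
A text rendering like the dependency arrows above can be produced from the ID, FORM, HEAD, and DEPREL columns alone. This is a sketch under stated assumptions: the token tuples are hand-written following Universal Dependencies v2 conventions (where an oblique nominal such as "mat" attaches to the verb as `obl`), they are not output captured from this Space, and `dependency_arrows` is a hypothetical helper rather than the app's actual code.

```python
# Hand-written (id, form, head, deprel) tuples for "The cat sits on the mat".
# These are illustrative assumptions, not captured parser output.
TOKENS = [
    (1, "The", 2, "det"),
    (2, "cat", 3, "nsubj"),
    (3, "sits", 0, "root"),
    (4, "on", 6, "case"),
    (5, "the", 6, "det"),
    (6, "mat", 3, "obl"),
]


def dependency_arrows(tokens):
    """Render each token as 'dependent --deprel--> head'."""
    forms = {token_id: form for token_id, form, _, _ in tokens}
    arrows = []
    for _, form, head, deprel in tokens:
        # HEAD 0 marks the root of the sentence, which has no head word.
        head_form = "ROOT" if head == 0 else forms[head]
        arrows.append(f"{form} --{deprel}--> {head_form}")
    return arrows


arrows = dependency_arrows(TOKENS)
print("\n".join(arrows))
```

Looking heads up through a dictionary keyed by token ID, rather than by list position, keeps the helper correct even if the IDs do not start at 1 or have gaps.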
+## Use Cases
+- **Linguistic Research**: Analyze sentence structure and grammatical relationships
+- **NLP Development**: Generate training data or test parsing models
+- **Educational**: Learn about syntactic analysis and dependency grammar
+- **Text Processing**: Prepare annotated data for downstream tasks
+## Technical Details
+This space uses:
+- **Stanza**: Stanford's multilingual NLP toolkit
+- **Gradio**: For the interactive web interface
+- **Pandas**: For data table visualization
+The Stanza models are downloaded automatically and cached when the Space starts up.
+## Supported Languages
+Currently supports: English (en), Spanish (es), French (fr), German (de), Chinese (zh), Russian (ru), Arabic (ar)
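
One way the language selection could be wired to Stanza is sketched below. The names `SUPPORTED_LANGUAGES`, `resolve_language`, and `build_pipeline` are hypothetical and are not taken from the Space's `app.py`. The sketch assumes the third-party `stanza` package is installed; it imports the package lazily so the language lookup can be used without it, and it relies on the default processors for each language rather than naming them.

```python
# Display names and language codes as listed in the README.
SUPPORTED_LANGUAGES = {
    "English": "en",
    "Spanish": "es",
    "French": "fr",
    "German": "de",
    "Chinese": "zh",
    "Russian": "ru",
    "Arabic": "ar",
}


def resolve_language(name):
    """Map a display name such as 'French' to its language code."""
    try:
        return SUPPORTED_LANGUAGES[name]
    except KeyError:
        raise ValueError(f"unsupported language: {name!r}") from None


def build_pipeline(name):
    """Download models for the language if needed and return a Stanza pipeline.

    Assumption: `stanza` is installed. It is imported here, not at module
    level, so the rest of this sketch runs without it.
    """
    import stanza

    code = resolve_language(name)
    stanza.download(code)             # fetches and caches the models
    return stanza.Pipeline(lang=code)  # default processors for the language


print(resolve_language("German"))
```

Calling `build_pipeline("English")("The cat sits on the mat")` would then return a Stanza document, with the first call for a language triggering the model download.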
+---
+*Powered by Stanford Stanza - https://stanfordnlp.github.io/stanza/*