sbompolas committed on
Commit 9c8bfef · verified · 1 Parent(s): 2aba304

Update README.md

  1. README.md +67 -5
README.md CHANGED
@@ -1,8 +1,8 @@
  ---
- title: Lesbian Morphosyntactic Parsing
- emoji: 📊
- colorFrom: green
- colorTo: pink
  sdk: gradio
  sdk_version: 5.35.0
  app_file: app.py
@@ -10,4 +10,66 @@ pinned: false
  license: cc-by-4.0
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Stanza Parser with CoNLL-U Viewer
+ emoji: 🔍
+ colorFrom: blue
+ colorTo: green
  sdk: gradio
  sdk_version: 5.35.0
  app_file: app.py

  license: cc-by-4.0
  ---

+ # Stanza Parser with CoNLL-U Viewer
+
+ A linguistic analysis tool built on Stanford's Stanza library. It parses sentences and presents the results as CoNLL-U text, an interactive data table, and a text-based dependency view.
+
+ ## Features
+
+ - **Multi-language Support**: Parse text in English, Spanish, French, German, Chinese, Russian, and Arabic
+ - **CoNLL-U Output**: Results in the standard CoNLL-U annotation format
+ - **Interactive Data Table**: Browse parsed tokens together with their linguistic features
+ - **Dependency Visualization**: Text-based visualization of dependency relationships
+ - **Copy-friendly Output**: Results are easy to copy into other tools
+
+ ## What is CoNLL-U?
+
+ CoNLL-U is the plain-text annotation format used by Universal Dependencies. Each word occupies one line of 10 tab-separated columns, and a blank line ends each sentence. The format covers:
+
+ - **Tokenization**: Word and sentence boundaries
+ - **Part-of-Speech Tagging**: Universal and language-specific POS tags
+ - **Lemmatization**: Base forms of words
+ - **Morphological Features**: Grammatical attributes
+ - **Dependency Parsing**: Syntactic relationships between words
+
+ ## How to Use
+
+ 1. Enter your text in the input box
+ 2. Select the appropriate language
+ 3. Click "Parse Text" or press Enter
+ 4. View results in three formats:
+    - Raw CoNLL-U format (copy-paste ready)
+    - Interactive data table
+    - Dependency structure visualization
+
+ ## Example Output
+
+ For the sentence "The cat sits on the mat", you'll get:
+
+ - **CoNLL-U format**: Standard 10-column format with all linguistic features
+ - **Data table**: Interactive view of each token's properties
+ - **Dependencies**: One line per relation, dependent first and head last: "cat --nsubj--> sits", "mat --obl--> sits", etc.
+
+ ## Use Cases
+
+ - **Linguistic Research**: Analyze sentence structure and grammatical relationships
+ - **NLP Development**: Generate training data or test parsing models
+ - **Education**: Learn about syntactic analysis and dependency grammar
+ - **Text Processing**: Prepare annotated data for downstream tasks
+
+ ## Technical Details
+
+ This Space uses:
+ - **Stanza**: Stanford's multilingual NLP toolkit
+ - **Gradio**: For the interactive web interface
+ - **Pandas**: For data table visualization
+
+ The models are downloaded automatically and cached when the Space starts up.
+
+ ## Supported Languages
+
+ Currently supported: English (en), Spanish (es), French (fr), German (de), Chinese (zh), Russian (ru), Arabic (ar)
+
+ ---
+
+ *Powered by Stanford Stanza - https://stanfordnlp.github.io/stanza/*
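
The commit does not include `app.py`, so how the Space actually produces its output is not visible here. As a rough illustration of the 10-column CoNLL-U lines and the arrow-style dependency view the README describes, here is a minimal Python sketch. The helper names (`word_to_conllu`, `dependency_arrows`, `parse_with_stanza`) and the demo annotation are assumptions made for this example, not code from the Space. The demo words are hand-annotated in Universal Dependencies v2 style, where "mat" attaches to "sits" as `obl`, and are not real Stanza output.

```python
from types import SimpleNamespace

# Column order of a CoNLL-U word line, using the attribute names that
# Stanza's Word objects expose (the FORM column is called `text` there).
CONLLU_FIELDS = ("id", "text", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc")


def word_to_conllu(word):
    """Format one word as a 10-column, tab-separated CoNLL-U line."""
    columns = []
    for name in CONLLU_FIELDS:
        value = getattr(word, name, None)
        # CoNLL-U writes an underscore for any unspecified field.
        columns.append("_" if value is None or value == "" else str(value))
    return "\t".join(columns)


def dependency_arrows(words):
    """Render each relation as 'dependent --deprel--> head'."""
    text_by_id = {w.id: w.text for w in words}
    arrows = []
    for w in words:
        # Head index 0 marks the root of the sentence.
        head = "ROOT" if w.head == 0 else text_by_id[w.head]
        arrows.append(f"{w.text} --{w.deprel}--> {head}")
    return arrows


def parse_with_stanza(text, lang="en"):
    """Return one list of words per sentence (needs `pip install stanza`)."""
    import stanza  # imported lazily so the helpers above work without it
    stanza.download(lang)        # fetches models on first use, then caches
    nlp = stanza.Pipeline(lang)  # default processors for that language
    return [sentence.words for sentence in nlp(text).sentences]


def _word(id, text, lemma, upos, xpos, head, deprel):
    """Hypothetical stand-in for a parsed word, used only by the demo."""
    return SimpleNamespace(id=id, text=text, lemma=lemma, upos=upos,
                           xpos=xpos, feats=None, head=head,
                           deprel=deprel, deps=None, misc=None)


# Hand-written UD-style annotation for illustration; not Stanza output.
DEMO_WORDS = [
    _word(1, "The", "the", "DET", "DT", 2, "det"),
    _word(2, "cat", "cat", "NOUN", "NN", 3, "nsubj"),
    _word(3, "sits", "sit", "VERB", "VBZ", 0, "root"),
    _word(4, "on", "on", "ADP", "IN", 6, "case"),
    _word(5, "the", "the", "DET", "DT", 6, "det"),
    _word(6, "mat", "mat", "NOUN", "NN", 3, "obl"),
]

if __name__ == "__main__":
    for w in DEMO_WORDS:
        print(word_to_conllu(w))
    print()
    for line in dependency_arrows(DEMO_WORDS):
        print(line)
```

Because both helpers only read attributes, the first sentence returned by `parse_with_stanza("The cat sits on the mat")` should be usable in place of `DEMO_WORDS`, assuming Stanza and its English models are installed.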