dariofinardi committed on
Commit d01c16d · verified · 1 Parent(s): cee33c8

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -22,7 +22,7 @@ tags:
  By operating at the character level (UTF-8 bytes), ByT5 is intrinsically immune to typos, dirty OCR outputs, and Out-Of-Vocabulary (OOV) tokens, making it exceptionally reliable for real-world, messy documents.
 
  The model expects an **Anchor Date** (reference date), an optional **Language Code**, and the **Temporal String** as input:
- > Input format: `YYYY-MM-DD | lang | input_text`
+ > Input format: `YYYY-MM-DD | lang (optional) | input_text`
 
  ## Use Cases
 
@@ -64,7 +64,7 @@ model_id = "SemplificaAI/t5-temporal-normalizer"
  tokenizer = AutoTokenizer.from_pretrained("google/byt5-small")
  model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
 
- # Format: YYYY-MM-DD | lang | text
+ # Format: YYYY-MM-DD | lang (optional) | text
  input_text = "2024-01-01 | en | 3 days post admission"
  inputs = tokenizer(input_text, return_tensors="pt")
 
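The input format changed in this commit, `YYYY-MM-DD | lang (optional) | input_text`, could be assembled with a small helper like the sketch below. The name `build_input` is hypothetical and not part of the repository. The sketch also assumes that an omitted language code means dropping the field entirely rather than leaving it empty; the README does not state which form the model expects, so check this against the model's behavior before relying on it.

```python
from datetime import date
from typing import Optional


def build_input(anchor: date, text: str, lang: Optional[str] = None) -> str:
    """Join the fields with " | " in the order the README documents.

    Assumption: when no language code is given, the field is dropped
    instead of being left empty. The README does not specify this.
    """
    fields = [anchor.isoformat()]  # anchor date as YYYY-MM-DD
    if lang:
        fields.append(lang)  # optional language code, e.g. "en"
    fields.append(text.strip())  # the temporal string to normalize
    return " | ".join(fields)


# With a language code, this reproduces the README's example input.
with_lang = build_input(date(2024, 1, 1), "3 days post admission", lang="en")

# Without one, the language field is dropped (assumed behavior, see above).
without_lang = build_input(date(2024, 1, 1), "3 days post admission")
```

The returned string can be passed to `tokenizer(..., return_tensors="pt")` in place of the hard-coded `input_text` in the README snippet.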