Update README.md
#3
by wamreyaz - opened
README.md CHANGED

@@ -42,6 +42,9 @@ print(texts[0])
 
 > The first `generate()` call is slower due to `torch.compile` building optimized kernels. Subsequent calls are much faster.
 
+We already use PagedInference, which is quite fast for most interactive tasks, but for large-scale deployment, check the Deployment section,
+which provides a vLLM backend.
+
 ## Categories
 
 By default, category is `"plain"` (general text extraction). You can specify a category to use a task-specific prompt:
@@ -96,6 +99,9 @@ for det in results[0]:
 }
 ```
 
+## Deployment
+TODO: explain how to set up the vLLM server.
+
 ## Citation
 
 
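
A note on the `torch.compile` caveat kept as context in the first hunk: since only the first `generate()` call pays the compilation cost, benchmarks should warm the model up before timing. A minimal sketch, assuming the `model` and `inputs` objects built earlier in the README (the names are illustrative, not necessarily this repo's exact API):

```python
import time

# Warm-up: the first generate() call triggers torch.compile's
# one-time kernel build, so issue a throwaway call first.
_ = model.generate(**inputs)

# Steady-state timing: the compiled kernels are reused from here on.
outputs_start = time.perf_counter()
outputs = model.generate(**inputs)
print(f"steady-state latency: {time.perf_counter() - outputs_start:.2f}s")
```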
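
On the new Deployment section's TODO: a generic sketch of vLLM's standard OpenAI-compatible serving flow, in case it helps fill the section in. The model ID and prompt below are placeholders, and this repo's actual vLLM integration may differ:

```python
# Assumes a server launched separately with vLLM's stock CLI, e.g.
#   vllm serve <model-id>
# which exposes an OpenAI-compatible API at http://localhost:8000 by default.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.completions.create(
    model="<model-id>",            # placeholder: name of the served model
    prompt="Extract the text:",    # placeholder prompt
    max_tokens=256,
)
print(response.choices[0].text)
```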