ericaRC committed a3e7ffe (verified) · parent: 04290ae

Add TROUBLESHOOTING.md documenting real deploy failures

Files changed (1): TROUBLESHOOTING.md (added, +134)
# Troubleshooting

Real failures we've hit deploying this repo to Hugging Face Inference Endpoints, and how to fix them. Read this first when the endpoint won't start.

---
## 1. `Unrecognized model ... Should have a model_type key in its config.json`

Endpoint logs end with a giant list of model types (`albert, align, ... m2m_100, ... zoedepth`) and `Application startup failed`.

**Cause.** The Hub repo doesn't actually contain model weights or a `config.json`. This usually happens when `model_loader.py` was committed to git but never *executed* against the Hub (pushing the Python file ≠ running it).

**Check.**

```bash
python3 -c "from huggingface_hub import HfApi; print([s.rfilename for s in HfApi().model_info('ericaRC/example').siblings])"
```

You should see `config.json`, `model.safetensors`, `tokenizer_config.json`, `tokenizer.json`, `handler.py`, `requirements.txt`, and `README.md`. If it's only `.gitattributes` and scripts, the weights were never pushed.

**Fix.**

```bash
huggingface-cli login
python3 model_loader.py
```
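If you already have a snapshot on disk, you can mimic the check transformers performs without hitting the Hub. A sketch with a hypothetical snapshot directory; the only real assumption is that NLLB checkpoints carry `model_type: "m2m_100"`:

```python
import json
import os
import tempfile

def config_model_type(path):
    """Return the model_type from a config.json, or None if it's missing.
    Transformers raises the 'Unrecognized model' error exactly when this
    key is absent (or the file never made it to the repo at all)."""
    with open(path) as f:
        return json.load(f).get("model_type")

# Hypothetical local snapshot with a minimal NLLB-style config.
with tempfile.TemporaryDirectory() as snapshot:
    cfg_path = os.path.join(snapshot, "config.json")
    with open(cfg_path, "w") as f:
        json.dump({"model_type": "m2m_100", "d_model": 1024}, f)
    print(config_model_type(cfg_path))  # m2m_100
```

A `None` here (or a `FileNotFoundError`) means the deploy will die with the error above.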
---

## 2. `403 Forbidden` on `.../info/lfs/objects/batch`

`push_to_hub` dies with `HfHubHTTPError: 403 Forbidden: Authorization error.`

**Cause.** Your HF token lacks write access to the target repo. Most commonly: a fine-grained token scoped to your user only, trying to push to an org namespace. Reading works (which is why `whoami` succeeds), but LFS writes are rejected.

**Check.**

```bash
python3 -c "
from huggingface_hub import HfApi
perms = HfApi().whoami()['auth']['accessToken'].get('fineGrained', {})
for s in perms.get('scoped', []):
    print(s['entity']['type'], s['entity']['name'], '->', s['permissions'])
"
```

You need an entry matching the target repo's namespace (user or org) that includes `repo.write`.

**Fix.** At https://huggingface.co/settings/tokens either:
- Edit the existing token and add the org with `repo.write` + `repo.content.read` + `repo.access.read`, **or**
- Create a new classic "Write" token and `huggingface-cli login` with it.
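The same scan can be wrapped in a helper for CI scripts. The payload below is a hard-coded example shaped like the `whoami()` output above (not real token data), and `has_write` is a hypothetical helper, not part of `huggingface_hub`:

```python
def has_write(whoami_payload, namespace):
    """Scan a fine-grained token's scopes (same shape as the whoami()
    output above) for repo.write on the target user/org namespace."""
    token = whoami_payload["auth"]["accessToken"]
    scopes = token.get("fineGrained", {}).get("scoped", [])
    return any(
        s["entity"]["name"] == namespace and "repo.write" in s["permissions"]
        for s in scopes
    )

# Hypothetical payload: token scoped to the user, but not to the org.
payload = {"auth": {"accessToken": {"fineGrained": {"scoped": [
    {"entity": {"type": "user", "name": "ericaRC"},
     "permissions": ["repo.write", "repo.content.read"]},
]}}}}
print(has_write(payload, "ericaRC"))   # True
print(has_write(payload, "some-org"))  # False -- the 403 case
```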
---

## 3. `AttributeError: 'list' object has no attribute 'keys'` in `_set_model_specific_special_tokens`

Endpoint logs show a traceback through `tokenization_nllb_fast.py` → `tokenization_utils_base.py` and a crash on:

```
self.SPECIAL_TOKENS_ATTRIBUTES + list(special_tokens.keys())
```

**Cause.** Transformers-version skew between save time and load time. `transformers` 5.x introduced an `extra_special_tokens` field (serialized as a list for NLLB's Flores-200 codes). The Inference Endpoints base image ships a `transformers` 4.x that expects `extra_special_tokens` to be a dict and calls `.keys()` on it.

**Check.**

```bash
python3 -c "
import json
from huggingface_hub import hf_hub_download
cfg = json.load(open(hf_hub_download('ericaRC/example', 'tokenizer_config.json')))
print('extra_special_tokens type:', type(cfg.get('extra_special_tokens')).__name__)
print('additional_special_tokens count:', len(cfg.get('additional_special_tokens') or []))
"
```

If `extra_special_tokens` is a non-empty `list` and `additional_special_tokens` is empty, you're hitting this.

**Fix (already applied to this repo).** `tokenizer_config.json` has been normalized:
- lang codes live in `additional_special_tokens` (a list — old *and* new transformers accept this)
- `extra_special_tokens` is `{}` (an empty dict — passes `.keys()` in old transformers, ignored in new)

And `requirements.txt` pins `transformers>=4.40.0,<5.0` so the endpoint can't auto-pull a 5.x that reintroduces the mismatch.

**Prevention going forward.** When running `model_loader.py`, use the same `transformers` major version the endpoint runs:

```bash
pip install "transformers<5" "huggingface_hub" "torch"
python3 model_loader.py
```

Don't save tokenizers with `transformers` 5.x and load them in a 4.x container (or vice versa) unless you've confirmed the schema matches.
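The normalization itself is mechanical. A sketch, assuming the 5.x serialization is a plain list of token strings (check your actual file first):

```python
def normalize_tokenizer_config(cfg):
    """Move a list-valued extra_special_tokens into additional_special_tokens,
    then leave extra_special_tokens as {} so both 4.x and 5.x load cleanly."""
    extra = cfg.get("extra_special_tokens")
    if isinstance(extra, list):
        merged = list(cfg.get("additional_special_tokens") or [])
        merged += [t for t in extra if t not in merged]
        cfg["additional_special_tokens"] = merged
    cfg["extra_special_tokens"] = {}
    return cfg

# Hypothetical config as saved by transformers 5.x.
broken = {"extra_special_tokens": ["eng_Latn", "fra_Latn"],
          "additional_special_tokens": []}
fixed = normalize_tokenizer_config(dict(broken))
print(fixed["additional_special_tokens"])  # ['eng_Latn', 'fra_Latn']
print(fixed["extra_special_tokens"])       # {}
```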
---

## 4. Endpoint boots but requests return garbage / wrong language

**Cause.** `src_lang` wasn't set on the tokenizer, or `forced_bos_token_id` wasn't passed at generation time. NLLB needs both.

**Check.** Look at the request body:

```json
{
  "inputs": "Hello, world!",
  "parameters": { "src_lang": "eng_Latn", "tgt_lang": "fra_Latn" }
}
```

If you're hitting the endpoint without a `parameters` block, `handler.py` falls back to `eng_Latn → spa_Latn`.

**Fix.** Always pass `src_lang` and `tgt_lang` using [Flores-200 codes](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200).
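The fallback behaviour fits in a few lines. This is a sketch of the parameter handling described above, not the actual `handler.py`:

```python
def resolve_langs(body):
    """Pull src/tgt languages from a request body, falling back to
    eng_Latn -> spa_Latn when the parameters block is missing."""
    params = body.get("parameters") or {}
    return params.get("src_lang", "eng_Latn"), params.get("tgt_lang", "spa_Latn")

print(resolve_langs({"inputs": "Hello, world!"}))
# ('eng_Latn', 'spa_Latn')  <- silent fallback: looks like "wrong language"
print(resolve_langs({"inputs": "Hello, world!",
                     "parameters": {"src_lang": "eng_Latn",
                                    "tgt_lang": "fra_Latn"}}))
# ('eng_Latn', 'fra_Latn')
```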
---

## 5. Container Type is set to "Text Generation Inference (TGI)"

TGI only supports decoder-only causal LMs. NLLB is seq2seq, so TGI will refuse to load it and `handler.py` will be ignored.

**Fix.** In the endpoint's Advanced configuration, set **Container Type → Default** (the HF inference toolkit). That container picks up `handler.py` automatically.
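A rough way to pre-check this from `config.json`: NLLB's architecture is `M2M100ForConditionalGeneration`, which marks it as seq2seq. The helper below is a heuristic sketch, not an official TGI check:

```python
def tgi_compatible(config):
    """Heuristic: seq2seq architectures (...ForConditionalGeneration) won't
    load under TGI; decoder-only causal LMs generally will."""
    archs = config.get("architectures", [])
    return not any(a.endswith("ForConditionalGeneration") for a in archs)

print(tgi_compatible({"architectures": ["M2M100ForConditionalGeneration"]}))  # False
print(tgi_compatible({"architectures": ["LlamaForCausalLM"]}))                # True
```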
---

## Checklist before clicking Deploy

- [ ] `HfApi().model_info(REPO).siblings` lists `config.json`, `model.safetensors`, `tokenizer*.json`, `handler.py`, `requirements.txt`, `README.md`.
- [ ] `tokenizer_config.json` has `extra_special_tokens: {}` (or absent) and `additional_special_tokens` populated.
- [ ] `requirements.txt` pins `transformers<5`.
- [ ] Local smoke test passes:
  ```python
  from handler import EndpointHandler
  h = EndpointHandler("ericaRC/example")
  print(h({"inputs": "Hello, world!", "parameters": {"src_lang": "eng_Latn", "tgt_lang": "fra_Latn"}}))
  ```
- [ ] Endpoint Container Type = **Default**, not TGI.