# MODEL_NAME

This repository contains **layoutlm-camembertv2-qa** weights exported to `safetensors` format.

## Source

These weights are derived from two pretrained models:

- **Layout encoder (LayoutLM)**: [`microsoft/layoutlm-base-uncased`](https://huggingface.co/microsoft/layoutlm-base-uncased), pretrained on IIT-CDIP with the masked visual-language modeling objective (LayoutLM paper)
- **Text encoder**: [`almanach/camembertv2-base`](https://huggingface.co/almanach/camembertv2-base), a French language model (RoBERTa-like architecture)

## Methodology

This checkpoint was produced by **weight merging**, not end-to-end training.

1. Load the pretrained layout encoder weights (LayoutLM); these are kept intact
2. Replace the text encoder weights (embeddings, attention layers, FFN) with those from the French model
3. Update the tokenizer and vocabulary configuration accordingly

No training or fine-tuning was performed at this stage. This checkpoint is intended as a **starting point** for downstream fine-tuning on French document understanding tasks (NER, token classification, extractive QA…).

## Files

| File | Description |
|------|-------------|
| `model.safetensors` | Model weights (safetensors format) |
| `pytorch_model.bin` | Model weights (PyTorch format) |
| `config.json` | Model configuration |
| `tokenizer_config.json` | Tokenizer configuration |
| `README.md` | This model card |

## Usage

```python
from transformers import AutoTokenizer, AutoModel

# Replace "USERNAME/MODEL_NAME" with this repository's Hub ID.
tokenizer = AutoTokenizer.from_pretrained("USERNAME/MODEL_NAME")
model = AutoModel.from_pretrained("USERNAME/MODEL_NAME")
```

## Limitations

- This model has **not been fine-tuned** on any French document dataset
- Performance on downstream tasks is **not guaranteed** without task-specific fine-tuning
- Intended for research and experimentation purposes

## License

Weights are derived from models released under the MIT and Apache-2.0 licenses. Please refer to the original repositories for full license terms.
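## Merge sketch

The three steps under Methodology can be illustrated with a minimal, dependency-free sketch. It is not the script used to build this checkpoint. The function `merge_text_encoder`, the marker tuple `LAYOUT_ONLY_MARKERS`, and the parameter names in the toy dictionaries are assumptions chosen for illustration, and plain lists stand in for tensors.

```python
from typing import Any, Dict, List, Sequence, Tuple

# Assumed substrings identifying layout-specific parameters (2D position
# embeddings) that must be kept from the layout encoder. Illustrative only.
LAYOUT_ONLY_MARKERS: Tuple[str, ...] = (
    "x_position_embeddings",
    "y_position_embeddings",
    "h_position_embeddings",
    "w_position_embeddings",
)


def merge_text_encoder(
    layout_state: Dict[str, Any],
    text_state: Dict[str, Any],
    layout_only_markers: Sequence[str] = LAYOUT_ONLY_MARKERS,
) -> Tuple[Dict[str, Any], List[str]]:
    """Return a copy of layout_state whose shared parameters come from text_state.

    Parameters that exist only in the layout model are kept intact, and
    parameters that exist only in the text model (e.g. an LM head) are ignored.
    """
    merged = dict(layout_state)  # shallow copy; the inputs are not mutated
    replaced: List[str] = []
    for name, value in text_state.items():
        if any(marker in name for marker in layout_only_markers):
            continue  # never overwrite layout-specific weights
        if name in merged:
            merged[name] = value
            replaced.append(name)
    return merged, replaced


# Toy state dicts: lists stand in for tensors, key names are assumptions.
layout_state = {
    "embeddings.word_embeddings.weight": [0.1, 0.2],
    "embeddings.x_position_embeddings.weight": [0.3],
    "encoder.layer.0.attention.self.query.weight": [0.4],
}
text_state = {
    "embeddings.word_embeddings.weight": [1.0, 2.0, 3.0],
    "encoder.layer.0.attention.self.query.weight": [4.0],
    "lm_head.bias": [9.0],
}

merged, replaced = merge_text_encoder(layout_state, text_state)
print(sorted(replaced))
```

In the toy data the replaced word-embedding entry has a different length from the one it overwrites, which mirrors why step 3 is needed: the vocabulary size in the configuration has to follow the new tokenizer. With real checkpoints the dictionaries would presumably come from each model's `state_dict()` and be loaded back with `load_state_dict`, after checking that parameter names and shapes actually line up.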
## Acknowledgements

- [LayoutLM: Pre-training of Text and Layout for Document Image Understanding](https://arxiv.org/abs/1912.13318), Xu et al., 2020
- [`microsoft/layoutlm-base-uncased`](https://huggingface.co/microsoft/layoutlm-base-uncased)

> **Note**: This is not an official release from any of the above organizations.