Update README and add a training doc
- README.md +25 -7
- training.md +7 -0
README.md
CHANGED
---
language: es
license: CC-BY 4.0
tags:
- spanish
- roberta
- vit
---

# CLIP-Spanish

CLIP-Spanish is a CLIP-like model for Spanish, composed of a RoBERTa-base text encoder and a ViT-B/32 image encoder implemented in [Flax](https://github.com/google/flax). Training scripts are included (see training.md).
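Like the original CLIP, a model of this kind is typically trained with a symmetric contrastive (InfoNCE) objective that pulls matching text/image embedding pairs together in a shared space. As a rough, framework-agnostic NumPy sketch of that objective (an illustration, not this repository's actual Flax training code):

```python
import numpy as np

def clip_contrastive_loss(text_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired text/image embeddings."""
    # L2-normalize so dot products are cosine similarities.
    t = text_emb / np.linalg.norm(text_emb, axis=-1, keepdims=True)
    i = image_emb / np.linalg.norm(image_emb, axis=-1, keepdims=True)
    logits = t @ i.T / temperature  # (batch, batch) similarity matrix

    def log_softmax(x, axis):
        x = x - x.max(axis=axis, keepdims=True)
        return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

    n = logits.shape[0]
    diag = np.arange(n)
    # Row-wise: each caption should pick out its own image; column-wise: vice versa.
    loss_t2i = -log_softmax(logits, axis=1)[diag, diag].mean()
    loss_i2t = -log_softmax(logits, axis=0)[diag, diag].mean()
    return (loss_t2i + loss_i2t) / 2
```

The loss is near zero when each caption's embedding is closest to its own image, and grows as matched pairs drift apart.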
This project is part of the [Flax/JAX Community Week](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104), organised by [HuggingFace](https://huggingface.co/), with TPU usage sponsored by Google.

## Spanish WIT

We used a subset of 141,230 Spanish captions from the [WIT dataset](https://github.com/google-research-datasets/wit) for training.
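The WIT TSVs are multilingual, so extracting the Spanish subset amounts to filtering on the language column. A minimal pandas sketch (the column names `language` and `caption_reference_description` follow the public WIT schema, but this is an illustration, not the project's actual preprocessing code):

```python
import pandas as pd

def filter_spanish_captions(tsv_file):
    """Return the non-empty Spanish reference captions from one WIT TSV."""
    df = pd.read_csv(tsv_file, sep="\t")
    # Keep rows tagged as Spanish that actually carry a caption.
    mask = (df["language"] == "es") & df["caption_reference_description"].notna()
    return df.loc[mask, "caption_reference_description"].tolist()
```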
## Team members

- Eduardo González Ponferrada ([edugp](https://huggingface.co/edugp))
- Manu Romero ([mrm8488](https://huggingface.co/))
- María Grandury ([mariagrandury](https://huggingface.co/))

## Useful links

- [Community Week timeline](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104#summary-timeline-calendar-6)
- [Community Week README](https://github.com/huggingface/transformers/blob/master/examples/research_projects/jax-projects/README.md)
- [Community Week thread](https://discuss.huggingface.co/t/bertin-pretrain-roberta-large-from-scratch-in-spanish/7125)
- [Community Week channel](https://discord.com/channels/858019234139602994/859113060068229190)
- [Hybrid CLIP example scripts](https://github.com/huggingface/transformers/tree/master/examples/research_projects/jax-projects/hybrid_clip)
- [Model repository](https://huggingface.co/flax-community/bertin-roberta-large-spanish/)
training.md
ADDED
# Training

* Download the TSV files from https://github.com/google-research-datasets/wit/blob/main/DATA.md.
* Use `prepare_wit.py` to download the images from Wikipedia annotated in each TSV file.
* Use `scale_converter.py` to remove corrupt images and resize suitable images to 224x224.
* Use `join_datasets_custom_split.py` to merge the JSON files from the different subsets of the dataset.
* Use `discard_incorrect_files.py` to filter out images that could not be converted.
* Finally, use `run-clip.sh` to train.
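As an illustration of the conversion step, a helper along these lines (a sketch of the idea, not the repository's actual `scale_converter.py`) could discard unreadable files and normalise the rest to 224x224 with Pillow:

```python
from io import BytesIO
from PIL import Image

def convert_image(raw_bytes, size=(224, 224)):
    """Return a resized RGB image, or None if the bytes are not a valid image."""
    try:
        img = Image.open(BytesIO(raw_bytes))
        img = img.convert("RGB")  # normalise palette/alpha modes to plain RGB
    except Exception:
        return None  # corrupt or unsupported file: discard it
    return img.resize(size, Image.BICUBIC)
```

Files for which this returns `None` are the ones a later filtering pass would drop.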