micaelahs davanstrien HF Staff committed
Commit e832e59 · 1 Parent(s): a93db3a

Add HF Storage Bucket persistence support (#8)

- Add HF Storage Bucket persistence support (45864bb8300898c940880c7ce905906cee795c68)


Co-authored-by: Daniel van Strien <davanstrien@users.noreply.huggingface.co>

Files changed (2)
  1. Dockerfile +22 -13
  2. README.md +28 -17
Dockerfile CHANGED
@@ -33,8 +33,8 @@ FROM heartexlabs/label-studio:hf-latest
  # annotation data in the space will be lost. You can enable configuration
  # persistence through one of two methods:
  #
- # 1) Enabling Hugging Face Persistent Storage for saving project and annotation
- # settings, as well as local task storage.
+ # 1) Attaching an HF Storage Bucket for persisting the SQLite database and
+ # media uploads across restarts (recommended).
  # 2) Connecting an external Postgres database for saving project and annotation
  # settings, and cloud by connecting cloud storage for tasks.
  #
@@ -42,20 +42,24 @@ FROM heartexlabs/label-studio:hf-latest
 
  ################################################################################
  #
- # How to Enable Hugging Face Persistent Storage for Label Studio
- # --------------------------------------------------------------
+ # How to Enable Persistence with HF Storage Buckets
+ # --------------------------------------------------
  #
- # In the Hugging Face Label Studio Space settings, select the appropriate
- # Persistent Storage tier. Note that Persistent Storage is a paid add-on.
- # By default, persistent storage is mounted to /data. In your Space settings,
- # set the following variables:
+ # HF Storage Buckets provide persistent object storage that can be
+ # mounted directly into your Space. Label Studio will write its SQLite
+ # database and media uploads into the mounted bucket, so everything
+ # survives restarts.
  #
- # LABEL_STUDIO_BASE_DATA_DIR=/data
- # ENV STORAGE_PERSISTENCE=1
+ # 1. Create a bucket: hf buckets create <namespace>/label-studio-data
+ # 2. Attach it in Space Settings → Storage Buckets, mount path: /data
+ # 3. Set Space Variables:
+ # LABEL_STUDIO_BASE_DATA_DIR=/data
+ # STORAGE_PERSISTENCE=1
+ # 4. (Recommended) Set a SECRET_KEY Space Secret to keep user sessions
+ # alive across restarts.
+ # 5. Factory rebuild the Space.
  #
- # Your space will restart. NOTE: if you have existing settings and data,
- # they will be lost in this first restart. Data and setting will only be
- # preserved on subsequent restarts of the space.
+ # See https://huggingface.co/docs/hub/storage-buckets for more details.
  #
  ################################################################################
 
@@ -124,4 +128,9 @@ FROM heartexlabs/label-studio:hf-latest
  #
  ################################################################################
 
+ # The upstream base image runs as UID 1001. HF Storage Bucket mounts currently
+ # require root to write. This can be removed once bucket mounts support
+ # non-root UIDs natively.
+ USER root
+
  CMD exec label-studio --host=$SPACE_HOST
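
The `USER root` change exists because the Label Studio process must be able to write to the bucket mount. A minimal sketch for checking that from a shell inside the running Space might look like the following. `check_data_dir` is a hypothetical helper name, not part of Label Studio or the `hf` CLI, and the `/data` mount path is assumed from the instructions in this commit.

```shell
# Hypothetical helper (not part of Label Studio or the hf CLI): reports whether
# a directory exists and whether the current user can create files in it.
check_data_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir is not a directory"
        return 1
    fi
    # Create and remove a scratch file instead of trusting mode bits, since a
    # mounted bucket may report permissions differently from local disk.
    probe="$dir/.write-test-$$"
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        echo "ok: $dir is writable by uid $(id -u)"
        return 0
    fi
    echo "read-only: uid $(id -u) cannot write to $dir"
    return 1
}
```

It could be called as `check_data_dir "${LABEL_STUDIO_BASE_DATA_DIR:-/data}"`; a non-zero exit status suggests the mount is absent or not writable by the current user.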
README.md CHANGED
@@ -45,8 +45,10 @@ After logging in, Label Studio will present you with a project view. Here you
45
  can create a new project with prompts to upload data and set up a custom
46
  configuration interface.
47
 
48
- **Note that in the default configuration, storage is local and temporary. Any
49
- projects, annotations, and configurations will be lost if the space is restarted.**
 
 
50
 
51
  ## Next Steps and Additional Resources
52
 
@@ -83,29 +85,38 @@ You will need to provide new users with an invitation link to join the space,
83
  which can be found in the Organizations interface of Label Studio.
84
 
85
  By default this space stores all project configuration and data annotations
86
- in local storage with Sqlite. If the space is reset, all configuration and
87
- annotation data in the space will be lost. You can enable configuration
88
- persistence in one of two ways:
89
 
90
- 1. Enabling Persistent Storage in your Space settings and configuring Label
91
- Studio to write its database and task storage there.
92
 
93
  2. Connecting an external Postgres database and cloud storage to your space,
94
  guaranteeing that all project and annotation settings are preserved.
95
 
96
- ### Enabling Hugging Face Persistent Storage
97
 
98
- In the Hugging Face Label Studio Space settings, select the appropriate
99
- Persistent Storage tier. Note that Persistent Storage is a paid add-on.
100
- By default, persistent storage is mounted to /data. In your Space settings,
101
- set the following variables:
102
 
103
- LABEL_STUDIO_BASE_DATA_DIR=/data
104
- ENV STORAGE_PERSISTENCE=1
 
 
 
 
 
 
 
 
 
 
105
 
106
- Your space will restart. NOTE: if you have existing settings and data,
107
- they will be lost in this first restart. Data and setting will only be
108
- preserved on subsequent restarts of the space.
109
 
110
  ### Enabling Postgres Database and Cloud Storage
111
 
 
45
  can create a new project with prompts to upload data and set up a custom
46
  configuration interface.
47
 
48
+ **Note that in the default configuration, storage is local and temporary.** To
49
+ persist your projects and annotations across restarts, attach an
50
+ [HF Storage Bucket](https://huggingface.co/docs/hub/storage-buckets) — see the
51
+ persistence section below.
52
 
53
  ## Next Steps and Additional Resources
54
 
 
85
  which can be found in the Organizations interface of Label Studio.
86
 
87
  By default this space stores all project configuration and data annotations
88
+ in local storage with SQLite. If the space is reset, all configuration and
89
+ annotation data in the space will be lost. You can enable persistence in one
90
+ of two ways:
91
 
92
+ 1. Attaching an HF Storage Bucket (recommended).
 
93
 
94
  2. Connecting an external Postgres database and cloud storage to your space,
95
  guaranteeing that all project and annotation settings are preserved.
96
 
97
+ ### Persistence with HF Storage Buckets
98
 
99
+ [HF Storage Buckets](https://huggingface.co/docs/hub/storage-buckets) provide
100
+ persistent object storage that can be mounted directly into your Space. Label
101
+ Studio will write its SQLite database and media uploads into the mounted bucket,
102
+ so projects and annotations survive restarts.
103
 
104
+ 1. **Create a bucket:**
105
+
106
+ hf buckets create <your-namespace>/label-studio-data
107
+
108
+ 2. **Attach it** in Space Settings → Storage Buckets, mount path `/data`.
109
+
110
+ 3. **Set two Space Variables:**
111
+
112
+ LABEL_STUDIO_BASE_DATA_DIR=/data
113
+ STORAGE_PERSISTENCE=1
114
+
115
+ 4. **Factory rebuild** the Space.
116
 
117
+ It is also recommended to set a `SECRET_KEY` Space Secret to keep user sessions
118
+ alive across restarts. Without it, Label Studio generates a random key on each
119
+ boot and all users are logged out on restart.
120
 
121
  ### Enabling Postgres Database and Cloud Storage
122