Filesystem API
The HfFileSystem class provides a pythonic file interface to the Hugging Face Hub based on fsspec.
HfFileSystem[[huggingface_hub.HfFileSystem]]
HfFileSystem is based on fsspec, so it is compatible with most of the APIs that it offers. For more details, check out our guide and fsspec's API Reference.
huggingface_hub.HfFileSystem[[huggingface_hub.HfFileSystem]]
Access a remote Hugging Face Hub repository as if it were a local file system.
HfFileSystem provides fsspec compatibility, which is useful for libraries that require it (e.g., reading Hugging Face datasets directly with pandas). However, this compatibility layer introduces additional overhead. For better performance and reliability, use HfApi methods when possible.
The file system supports paths for the hf:// protocol, which follow these URL schemes:
- Models, Datasets and Spaces repositories:
hf://[@]/
hf://datasets/[@]/
hf://spaces/[@]/
- Buckets (generic storage):
hf://buckets//
Note: when using the HfFileSystem directly, passing the hf:// protocol prefix is optional in paths.
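As a rough illustration of the path layout above, the sketch below strips the optional hf:// prefix and reads the leading segment to tell which kind of location a path names. `classify_hf_path` is a hypothetical helper, not part of huggingface_hub; the library's own resolve_path does the real parsing. The sketch ignores revisions and assumes that no model repository lives under a namespace literally called datasets, spaces, or buckets.

```python
PROTOCOL = "hf://"
# Leading path segment -> kind of location, per the URL schemes listed above.
PREFIXES = {"datasets": "dataset", "spaces": "space", "buckets": "bucket"}


def classify_hf_path(path: str) -> tuple[str, str]:
    """Return (kind, remainder) for an hf:// path. Hypothetical helper."""
    # The protocol prefix is optional when talking to HfFileSystem directly.
    if path.startswith(PROTOCOL):
        path = path[len(PROTOCOL):]
    head, sep, rest = path.partition("/")
    if sep and head in PREFIXES:
        return PREFIXES[head], rest
    # No recognized prefix: treat the path as pointing into a model repository.
    return "model", path
```

With this helper, "hf://datasets/my-username/my-dataset/data.json" and "datasets/my-username/my-dataset/data.json" classify identically, which mirrors the note that the prefix is optional.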
Usage:
>>> from huggingface_hub import hffs
>>> # List files
>>> hffs.glob("my-username/my-model/*.bin")
['my-username/my-model/pytorch_model.bin']
>>> hffs.ls("datasets/my-username/my-dataset", detail=False)
['datasets/my-username/my-dataset/.gitattributes', 'datasets/my-username/my-dataset/README.md', 'datasets/my-username/my-dataset/data.json']
>>> # Read/write files
>>> with hffs.open("my-username/my-model/pytorch_model.bin") as f:
...     data = f.read()
>>> with hffs.open("my-username/my-model/pytorch_model.bin", "wb") as f:
...     f.write(data)
Specify a token for authentication:
>>> from huggingface_hub import HfFileSystem
>>> hffs = HfFileSystem(token=token)
Parameters:
endpoint (str, optional) : Endpoint of the Hub. Defaults to .
token (bool or str, optional) : A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
block_size (int, optional) : Block size for reading and writing files.
expand_info (bool, optional) : Whether to expand the information of the files.
**storage_options (dict, optional) : Additional options for the filesystem. See fsspec documentation.
cp_file[[huggingface_hub.HfFileSystem.cp_file]]
Copy a file within or between repositories.
Source: https://github.com/huggingface/huggingface_hub/blob/vr_4113/src/huggingface_hub/hf_file_system.py#L810
Note: When possible, use HfApi.upload_file() for better performance.
Parameters:
path1 (str) : Source path to copy from.
path2 (str) : Destination path to copy to.
revision (str, optional) : The git revision to copy from.
exists[[huggingface_hub.HfFileSystem.exists]]
Check if a file exists.
For more details, refer to fsspec documentation.
Note: When possible, use HfApi.file_exists() for better performance.
Parameters:
path (str) : Path to check.
Returns:
bool
True if file exists, False otherwise.
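A common use of exists is picking the first available file from a list of candidates. The sketch below assumes `fs` is an HfFileSystem instance (any object with a compatible exists method works); `first_existing` is a hypothetical helper name, not a library function.

```python
from typing import Optional, Sequence


def first_existing(fs, candidates: Sequence[str]) -> Optional[str]:
    """Return the first candidate path that exists on `fs`, or None."""
    for path in candidates:
        if fs.exists(path):
            return path
    return None


# Usage (needs network access and the huggingface_hub package):
# from huggingface_hub import HfFileSystem
# first_existing(HfFileSystem(), [
#     "my-username/my-model/model.safetensors",
#     "my-username/my-model/pytorch_model.bin",
# ])
```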
find[[huggingface_hub.HfFileSystem.find]]
List all files below path.
For more details, refer to fsspec documentation.
Parameters:
path (str) : Root path to list files from.
maxdepth (int, optional) : Maximum depth to descend into subdirectories.
withdirs (bool, optional) : Include directory paths in the output. Defaults to False.
detail (bool, optional) : If True, returns a dict mapping paths to file information. Defaults to False.
refresh (bool, optional) : If True, bypass the cache and fetch the latest data. Defaults to False.
revision (str, optional) : The git revision to list from.
Returns:
Union[list[str], dict[str, dict[str, Any]]]
List of paths or dict of file information.
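Because find can return a dict of file information when detail=True, it lends itself to simple aggregations. The sketch below sums file sizes under a root path; `total_size` is a hypothetical helper, and it assumes each info dict carries a "size" entry (missing or None sizes are counted as 0).

```python
def total_size(fs, root: str) -> int:
    """Sum the sizes, in bytes, of all files below `root`."""
    # detail=True makes find() return {path: info_dict} instead of a list.
    infos = fs.find(root, detail=True)
    # Treat a missing or None size as 0 rather than failing.
    return sum(info.get("size") or 0 for info in infos.values())


# Usage (needs network access and the huggingface_hub package):
# from huggingface_hub import HfFileSystem
# total_size(HfFileSystem(), "datasets/my-username/my-dataset")
```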
get_file[[huggingface_hub.HfFileSystem.get_file]]
Copy single remote file to local.
Note: When possible, use HfApi.hf_hub_download() or HfApi.download_bucket_files for better performance.
Parameters:
rpath (str) : Remote path to download from.
lpath (str) : Local path to download to.
callback (Callback, optional) : Optional callback to track download progress. Defaults to no callback.
outfile (IO, optional) : Optional file-like object to write to. If provided, lpath is ignored.
glob[[huggingface_hub.HfFileSystem.glob]]
Find files by glob-matching.
For more details, refer to fsspec documentation.
Parameters:
path (str) : Path pattern to match.
Returns:
list[str]
List of paths matching the pattern.
info[[huggingface_hub.HfFileSystem.info]]
Get information about a file or directory.
For more details, refer to fsspec documentation.
Note: When possible, use HfApi.get_paths_info() or HfApi.repo_info() for better performance (or HfApi.get_bucket_paths_info() or HfApi.bucket_info() for buckets).
Parameters:
path (str) : Path to get info for.
refresh (bool, optional) : If True, bypass the cache and fetch the latest data. Defaults to False.
revision (str, optional) : The git revision to get info from.
Returns:
dict[str, Any]
Dictionary containing file information (type, size, commit info, etc.).
invalidate_cache[[huggingface_hub.HfFileSystem.invalidate_cache]]
Clear the cache for a given path.
For more details, refer to fsspec documentation.
Parameters:
path (str, optional) : Path to clear from cache. If not provided, clear the entire cache.
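Listings are cached, so a path changed by another process may look stale until its cache entry is dropped. A minimal sketch, assuming `fs` is an HfFileSystem instance; `fresh_listing` is a hypothetical helper. Passing refresh=True to ls, as documented for that method, is another way to bypass the cache.

```python
def fresh_listing(fs, path: str) -> list:
    """List `path` after dropping any cached listing for it."""
    # Clear only the cache for this path, not the whole filesystem cache.
    fs.invalidate_cache(path)
    return fs.ls(path, detail=False)


# Usage (needs network access and the huggingface_hub package):
# from huggingface_hub import HfFileSystem
# fresh_listing(HfFileSystem(), "datasets/my-username/my-dataset")
```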
isdir[[huggingface_hub.HfFileSystem.isdir]]
Check if a path is a directory.
For more details, refer to fsspec documentation.
Parameters:
path (str) : Path to check.
Returns:
bool
True if path is a directory, False otherwise.
isfile[[huggingface_hub.HfFileSystem.isfile]]
Check if a path is a file.
For more details, refer to fsspec documentation.
Parameters:
path (str) : Path to check.
Returns:
bool
True if path is a file, False otherwise.
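isdir and isfile can be combined to tell apart the three cases a caller usually cares about. This is a sketch under the assumption that `fs` is an HfFileSystem instance; `path_kind` is a hypothetical helper name.

```python
def path_kind(fs, path: str) -> str:
    """Return "directory", "file", or "missing" for `path`."""
    if fs.isdir(path):
        return "directory"
    if fs.isfile(path):
        return "file"
    # Neither check succeeded, so the path does not exist (or is not visible).
    return "missing"


# Usage (needs network access and the huggingface_hub package):
# from huggingface_hub import HfFileSystem
# path_kind(HfFileSystem(), "my-username/my-model/pytorch_model.bin")
```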
ls[[huggingface_hub.HfFileSystem.ls]]
List the contents of a directory.
For more details, refer to fsspec documentation.
Note: When possible, use HfApi.list_repo_tree() for better performance.
Parameters:
path (str) : Path to the directory.
detail (bool, optional) : If True, returns a list of dictionaries containing file information. If False, returns a list of file paths. Defaults to True.
refresh (bool, optional) : If True, bypass the cache and fetch the latest data. Defaults to False.
revision (str, optional) : The git revision to list from.
Returns:
list[Union[str, dict[str, Any]]]
List of file paths (if detail=False) or list of file information dictionaries (if detail=True).
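With detail=True, each entry returned by ls is a dictionary, which makes it easy to separate files from subdirectories. The sketch below assumes the entries follow the usual fsspec convention of carrying "name" and "type" keys, with "directory" as the type for folders; `split_listing` is a hypothetical helper.

```python
def split_listing(fs, path: str):
    """Return (files, directories) found directly under `path`."""
    files, dirs = [], []
    for entry in fs.ls(path, detail=True):
        # Assumption: fsspec-style info dicts with "name" and "type" keys.
        target = dirs if entry["type"] == "directory" else files
        target.append(entry["name"])
    return sorted(files), sorted(dirs)


# Usage (needs network access and the huggingface_hub package):
# from huggingface_hub import HfFileSystem
# files, dirs = split_listing(HfFileSystem(), "datasets/my-username/my-dataset")
```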
modified[[huggingface_hub.HfFileSystem.modified]]
Get the last modified time of a file.
For more details, refer to fsspec documentation.
Parameters:
path (str) : Path to the file.
Returns:
datetime
Last modified time of the file.
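Since modified returns a datetime, paths can be compared directly by recency. A minimal sketch, assuming `fs` is an HfFileSystem instance and `paths` is non-empty; `most_recent` is a hypothetical helper. Note that it makes one modified call per path.

```python
from typing import Iterable


def most_recent(fs, paths: Iterable[str]) -> str:
    """Return the path with the latest last-modified time."""
    # max() raises ValueError if `paths` is empty.
    return max(paths, key=fs.modified)


# Usage (needs network access and the huggingface_hub package):
# from huggingface_hub import HfFileSystem
# fs = HfFileSystem()
# most_recent(fs, fs.glob("my-username/my-model/*.bin"))
```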
resolve_path[[huggingface_hub.HfFileSystem.resolve_path]]
Resolve a Hugging Face file system path into its components.
Parameters:
path (str) : Path to resolve.
revision (str, optional) : The revision of the repo to resolve. Defaults to the revision specified in the path.
Returns:
HfFileSystemResolvedPath
Resolved path information containing repo_type, repo_id, revision and path_in_repo.
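The resolved object exposes the components named above as attributes. The sketch below copies them into a plain dict for logging or debugging; `describe_path` is a hypothetical helper, and it assumes a repository path (the object returned for bucket paths may expose different fields).

```python
def describe_path(fs, path: str) -> dict:
    """Return the resolved components of a repository path as a dict."""
    resolved = fs.resolve_path(path)
    return {
        "repo_type": resolved.repo_type,
        "repo_id": resolved.repo_id,
        "revision": resolved.revision,
        "path_in_repo": resolved.path_in_repo,
    }


# Usage (needs network access and the huggingface_hub package):
# from huggingface_hub import HfFileSystem
# describe_path(HfFileSystem(), "datasets/my-username/my-dataset/data.json")
```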
rm[[huggingface_hub.HfFileSystem.rm]]
Delete files from a repository.
For more details, refer to fsspec documentation.
Note: When possible, use HfApi.delete_file() for better performance.
Parameters:
path (str) : Path to delete.
recursive (bool, optional) : If True, delete directory and all its contents. Defaults to False.
maxdepth (int, optional) : Maximum number of subdirectories to visit when deleting recursively.
revision (str, optional) : The git revision to delete from.
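Deletion is destructive, so it can help to preview what a pattern matches before removing anything. The sketch below pairs glob with rm behind a dry-run flag; `remove_matching` is a hypothetical helper, and `fs` is assumed to be an authenticated HfFileSystem instance with write access.

```python
def remove_matching(fs, pattern: str, dry_run: bool = True) -> list:
    """Delete paths matching `pattern`; with dry_run=True, only report them."""
    matches = sorted(fs.glob(pattern))
    if not dry_run:
        for path in matches:
            fs.rm(path)
    # Always return the matched paths so the caller can log or review them.
    return matches


# Usage (needs network access, write access, and the huggingface_hub package):
# from huggingface_hub import HfFileSystem
# fs = HfFileSystem()
# print(remove_matching(fs, "my-username/my-model/*.bin"))        # preview only
# remove_matching(fs, "my-username/my-model/*.bin", dry_run=False)
```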
url[[huggingface_hub.HfFileSystem.url]]
Get the HTTP URL of the given path.
Parameters:
path (str) : Path to get URL for.
Returns:
str
HTTP URL to access the file or directory on the Hub.
walk[[huggingface_hub.HfFileSystem.walk]]
Return all files below the given path.
For more details, refer to fsspec documentation.
Parameters:
path (str) : Root path to list files from.
Returns:
Iterator[tuple[str, list[str], list[str]]]
An iterator of (path, list of directory names, list of file names) tuples.
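The tuples yielded by walk have the same shape as those from os.walk, so the familiar traversal patterns carry over. A minimal sketch that counts files per directory, assuming `fs` is an HfFileSystem instance; `files_per_directory` is a hypothetical helper.

```python
def files_per_directory(fs, root: str) -> dict:
    """Map each directory at or below `root` to its number of direct files."""
    return {
        dirpath: len(filenames)
        for dirpath, _dirnames, filenames in fs.walk(root)
    }


# Usage (needs network access and the huggingface_hub package):
# from huggingface_hub import HfFileSystem
# files_per_directory(HfFileSystem(), "datasets/my-username/my-dataset")
```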