htohfa/pathfinder_arxiv_data-bucket / README.md

metadata

license: openrail
dataset_info:
  features:
    - name: ads_id
      dtype: string
    - name: arxiv_id
      dtype: string
    - name: title
      dtype: string
    - name: abstract
      dtype: string
    - name: embed
      sequence: float64
    - name: umap_x
      dtype: float64
    - name: umap_y
      dtype: float64
    - name: date
      dtype: date32
    - name: cites
      dtype: int64
    - name: bibcode
      dtype: string
    - name: keywords
      sequence: string
    - name: ads_keywords
      sequence: string
    - name: read_count
      dtype: int64
    - name: doi
      sequence: string
    - name: authors
      sequence: string
    - name: aff
      sequence: string
    - name: cite_bibcodes
      sequence: string
    - name: ref_bibcodes
      sequence: string
  splits:
    - name: train
      num_bytes: 20400431642
      num_examples: 1149771
  download_size: 12212217773
  dataset_size: 20400431642
configs:
  - config_name: default
    data_files:
      - split: train
        path: data/train-*
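
As a reading aid for the feature list above, here is a minimal sketch of one row as a typed Python record, with a small type check. The class name `PaperRecord`, the `validate` helper, and the placeholder values are hypothetical and not part of the dataset. The mapping of `float64` to `float`, `int64` to `int`, `date32` to `datetime.date`, and `sequence` to `list` is an assumption about how rows are decoded.

```python
import datetime as dt
from dataclasses import dataclass, fields
from typing import get_args, get_origin


@dataclass
class PaperRecord:
    """Hypothetical row type; field names follow the metadata features."""
    ads_id: str
    arxiv_id: str
    title: str
    abstract: str
    embed: list[float]        # sequence: float64
    umap_x: float
    umap_y: float
    date: dt.date             # date32 (assumed to decode to datetime.date)
    cites: int
    bibcode: str
    keywords: list[str]
    ads_keywords: list[str]
    read_count: int
    doi: list[str]            # declared as a sequence, not a single string
    authors: list[str]
    aff: list[str]
    cite_bibcodes: list[str]
    ref_bibcodes: list[str]


def validate(record: PaperRecord) -> None:
    """Raise TypeError if a field's value does not match its declared type."""
    for f in fields(record):
        value = getattr(record, f.name)
        if get_origin(f.type) is list:
            (item_type,) = get_args(f.type)
            ok = isinstance(value, list) and all(
                isinstance(item, item_type) for item in value
            )
        else:
            ok = isinstance(value, f.type)
        if not ok:
            raise TypeError(
                f"{f.name}: expected {f.type}, got {type(value).__name__}"
            )


# Synthetic placeholder row, not real data from the dataset.
example = PaperRecord(
    ads_id="placeholder", arxiv_id="placeholder", title="placeholder",
    abstract="placeholder", embed=[0.0, 0.0], umap_x=0.0, umap_y=0.0,
    date=dt.date(2000, 1, 1), cites=0, bibcode="placeholder",
    keywords=[], ads_keywords=[], read_count=0, doi=["placeholder"],
    authors=[], aff=[], cite_bibcodes=[], ref_bibcodes=[],
)
validate(example)  # passes silently for a well-typed row
```

The check is deliberately strict: an integer in a `float` field such as `umap_x` is rejected, so values may need converting before validation.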

This dataset is associated with the Pathfinder app (https://pfdr.app, paper at https://arxiv.org/abs/2408.01556) and is updated roughly monthly to keep pace with new literature in astrophysics and cosmology.
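
Because the metadata reports about 20.4 GB of data (12.2 GB to download) and the contents change roughly monthly, it can be convenient to stream the `train` split and keep only recent records. The following is a sketch under stated assumptions, not a documented workflow: it assumes the files are reachable as a regular Hub dataset through the third-party `datasets` library, that the caller supplies the dataset ID (this README does not state one), and that `date` decodes to `datetime.date`. The helper names `stream_train_split` and `published_since` are hypothetical.

```python
import datetime as dt
from typing import Any, Iterable, Iterator, Mapping


def stream_train_split(repo_id: str) -> Iterable[Mapping[str, Any]]:
    """Iterate over the train split lazily instead of downloading every shard.

    `repo_id` must be supplied by the caller; it is not given in this README.
    """
    # Imported here so the rest of the module works without the library.
    from datasets import load_dataset  # third-party: pip install datasets

    return load_dataset(repo_id, split="train", streaming=True)


def published_since(
    records: Iterable[Mapping[str, Any]], cutoff: dt.date
) -> Iterator[Mapping[str, Any]]:
    """Yield records whose `date` is on or after `cutoff`.

    Rows with a missing date are skipped. Assumes `date` values are
    `datetime.date` objects.
    """
    for record in records:
        published = record["date"]
        if published is not None and published >= cutoff:
            yield record
```

A possible use is `published_since(stream_train_split(repo_id), dt.date(2024, 1, 1))`, which reads rows one at a time; it still scans the whole split, since nothing here says the shards are ordered by date.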

Xet Storage Details

Size: 1.16 kB
Xet hash: 193f99cad7c0ee5199eea4e264d722daf8a22b14ba5c5b3e347a1b3cc003eea3
