add reference to data paper

#1
by AndiLindner - opened
Files changed (1)
  1. README.md +20 -2
README.md CHANGED
@@ -42,7 +42,25 @@ Species with less than 50 images were excluded from training. The final dataset
 
  The model was trained in the context of a data paper in which the butterfly and moth images dataset it was trained on was published:
 
- Citation will be added after publication.
-
  ```
 
 
 
  The model was trained in the context of a data paper in which the butterfly and moth images dataset it was trained on was published:
 
+ ```bibtex
+ @Article{Barkmannetal2025a,
+ author={Barkmann, Friederike
+ and Lindner, Andreas
+ and W{\"u}rflinger, Ronald
+ and H{\"o}ttinger, Helmut
+ and R{\"u}disser, Johannes},
+ title={Machine learning training data: over 500,000 images of butterflies and moths (Lepidoptera) with species labels},
+ journal={Scientific Data},
+ year={2025},
+ month={Aug},
+ day={06},
+ volume={12},
+ number={1},
+ pages={1369},
+ abstract={Deep learning models can accelerate the processing of image-based biodiversity data and provide educational value by giving direct feedback to citizen scientists. However, the training of such models requires large amounts of labelled data and not all species are equally suited for identification from images alone. Most butterfly and many moth species (Lepidoptera) which play an important role as biodiversity indicators are well-suited for such approaches. This dataset contains over 540.000 images of 185 butterfly and moth species that occur in Austria. Images were collected by citizen scientists with the application ``Schmetterlinge {\"O}sterreichs'' and correct species identification was ensured by an experienced entomologist. The number of images per species ranges from one to nearly 30.000. Such a strong class imbalance is common in datasets of species records. The dataset is larger than other published dataset of butterfly and moth images and offers opportunities for the training and evaluation of machine learning models on the fine-grained classification task of species identification.},
+ issn={2052-4463},
+ doi={10.1038/s41597-025-05708-z},
+ url={https://doi.org/10.1038/s41597-025-05708-z}
+ }
  ```