Merge branch 'main' of https://huggingface.co/NCHC-bio/cell_x_gene_visualization into main a61de1d whats2000 committed on Feb 13
Upload d7476ae2-e320-4703-8304-da5c42627e71__HTAPP-330-SMP-1082_scRNA-seq.h5ad e0bac17 verified freshnemo committed on Feb 13
feat(eda): normalize dataset paths and deduplicate results in summary 95969f7 whats2000 committed on Feb 13
feat(eda): retrieve chunk size for each dataset in batch processing 5e1e99a whats2000 committed on Feb 12
feat(config): update dataset size thresholds for improved processing efficiency b1d3f22 whats2000 committed on Feb 12
feat(eda): update large file processing to support parallel workers and enhance metadata caching 75aa70e whats2000 committed on Feb 12
feat(eda): categorize datasets into small, dask-ready, and xlarge for improved processing f06cfcb whats2000 committed on Feb 12
feat(eda): adjust worker settings and add emergency mode for handling failed slices for extremely large db122fd whats2000 committed on Feb 12
fix(eda): correct max_workers and min_workers values for optimal resource allocation 311496c whats2000 committed on Feb 12
feat(eda): add adaptive scaling parameters and initial worker configuration for improved resource management d94a334 whats2000 committed on Feb 12
feat(eda): enhance resource utilization by optimizing worker allocation and processing parameters 32516b1 whats2000 committed on Feb 12
feat(eda): optimize resource allocation and processing parameters for enhanced performance 5910420 whats2000 committed on Feb 12
feat(metadata): add handling for missing datasets in CELLxGENE metadata and update status reporting 596560a whats2000 committed on Feb 12
feat(eda): add cache validation and retry mechanism for metadata build 6eb2e4a whats2000 committed on Feb 12
fix(eda): optimize gene statistics calculation in distributed EDA ac07329 whats2000 committed on Feb 12
feat(eda): update resource specifications for optimized performance 4e03c42 whats2000 committed on Feb 12
fix(eda): remove undefined 'info' variable reference causing crash 19a6596 whats2000 committed on Feb 12
refactor(slurm): update resource allocation and remove deprecated script 08c5297 whats2000 committed on Feb 12
feat(slurm): add SKIP_CACHE_BUILD option to skip metadata cache building 5d80e52 whats2000 committed on Feb 12
feat(eda): implement hybrid processing strategy for small and large datasets b8d98f3 whats2000 committed on Feb 11
feat(eda): refactor distributed EDA script for improved performance and memory management 74e20c3 whats2000 committed on Feb 11
fix(eda): optimize memory usage and ensure complete data computation 14cc169 whats2000 committed on Feb 11
feat(eda): migrate to Dask distributed with adaptive scaling and memory limits 450c8b2 whats2000 committed on Feb 11
fix(eda): use recent throughput instead of cumulative average for adaptive scaling d25b7a0 whats2000 committed on Feb 11
feat(eda): add adaptive worker reduction based on throughput monitoring e4396ec whats2000 committed on Feb 11
fix(config): remove mem_per_worker_gib from config files and calculate dynamically in resource_probe script 0c8f912 whats2000 committed on Feb 11
fix(config): clarify max_memory_gib allocation for staged processing 2cae30c whats2000 committed on Feb 11
fix(config): increase max_entries to 1T to include 520B entry dataset 2138486 whats2000 committed on Feb 11
fix(retry): add size categorization after merge to prevent null categories 05143cc whats2000 committed on Feb 11
fix(eda): include all successfully scanned datasets (ok_retry, ok_h5py) d2cd091 whats2000 committed on Feb 11
docs(cache): clarify incremental cache behavior and metadata skip option 436909b whats2000 committed on Feb 11
feat(recovery): add corrupted file redownload script and documentation cee344c whats2000 committed on Feb 11
feat(retry_failed_cache): implement dataset retry mechanism and merging of results 874e4c6 whats2000 committed on Feb 11
fix(cache): implement two-phase scanning to handle large files serially and prevent OOM 3ec846f whats2000 committed on Feb 11
feat(pipeline): add YAML config, metadata-aware scheduling, and dataset slicing 856e1ba whats2000 committed on Feb 11
Initial commit: distributed EDA pipeline, max non-zero reporting, and notebook dddcc0f whats2000 committed on Feb 11