# SLURM Job Scripts

Quick reference for submitting jobs to the cluster.

## Available Jobs

| Script | Purpose | Resources | Usage |
|--------|---------|-----------|-------|
| `slurm_verify.sh` | Verify paper results | 32G RAM, 1hr | `sbatch scripts/slurm_verify.sh [syn30\|fdr\|dali\|all]` |
| `slurm_embed.sh` | Embed FASTA sequences | 64G RAM, GPU, 4hr | `sbatch scripts/slurm_embed.sh input.fasta output.npy` |
| `slurm_calibrate_fdr.sh` | Compute FDR thresholds | 32G RAM, 2hr | `sbatch scripts/slurm_calibrate_fdr.sh` |

## Verification Options

- `syn30` - JCVI Syn3.0 annotation (Paper Figure 2A: 59/149 = 39.6%)
- `fdr` - FDR algorithm verification
- `dali` - DALI prefiltering (Tables 4-6: 82.8% TPR, 31.5% DB reduction)
- `clean` - CLEAN enzyme classification (Tables 1-2: hierarchical loss control)
- `all` - Run all verifications

Note: Full CLEAN verification with precision/recall metrics requires the CLEAN package from https://github.com/tttianhao/CLEAN. The basic verification uses pre-computed data.

## Quick Commands

```bash
# Check job status
squeue -u $USER

# View job output (use Read tool or cat, avoid tail -f on login node)
cat logs/cpr-verify-JOBID.out

# Cancel a job
scancel JOBID

# Submit verification jobs
sbatch scripts/slurm_verify.sh syn30
sbatch scripts/slurm_verify.sh dali
sbatch scripts/slurm_verify.sh all

# Submit other jobs
sbatch scripts/slurm_embed.sh my_sequences.fasta my_embeddings.npy
sbatch scripts/slurm_calibrate_fdr.sh
```

## Output

All jobs write to `logs/` directory:

- `logs/cpr-JOB-JOBID.out` - stdout
- `logs/cpr-JOB-JOBID.err` - stderr
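
The `logs/cpr-JOB-JOBID.out` naming pattern makes it easy to find the logs for a job as soon as it is submitted. The sketch below is a hypothetical pair of helpers, not part of the repository's scripts: `cpr_log_paths` and `cpr_submit` are illustrative names, and the `JOB` part of the file name is passed in by hand. Only `verify` is confirmed by the `cat logs/cpr-verify-JOBID.out` example; the job names used by the other scripts are an assumption to check against their `#SBATCH` headers.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the function names are illustrative only.

# Print the stdout and stderr log paths for a job, following the
# logs/cpr-JOB-JOBID.{out,err} pattern.
cpr_log_paths() {
    local job_name="$1" job_id="$2"
    printf 'logs/cpr-%s-%s.out\n' "$job_name" "$job_id"
    printf 'logs/cpr-%s-%s.err\n' "$job_name" "$job_id"
}

# Submit a script with sbatch and report where its logs should appear.
# The first argument is the assumed JOB part of the log name
# (e.g. "verify"); the remaining arguments are passed to sbatch as-is.
cpr_submit() {
    local job_name="$1"
    shift
    local job_id
    # --parsable makes sbatch print only the job ID, followed by
    # ";cluster" on multi-cluster setups.
    job_id="$(sbatch --parsable "$@")" || return 1
    job_id="${job_id%%;*}"   # drop the optional ";cluster" suffix
    echo "Submitted job ${job_id}"
    cpr_log_paths "$job_name" "$job_id"
}
```

After sourcing these functions, `cpr_submit verify scripts/slurm_verify.sh syn30` would submit the job and print the two log paths to inspect with `cat` once the job has produced output.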