I would like to thank you for contributing these datasets. They provide a valuable baseline for researchers applying LLMs in the telecommunications domain.
I am currently looking to run benchmarks on the TeleYAML, TeleQnA, and TeleLogs datasets, but I have a few questions about the experimental setup:
Model Configuration: I reviewed the related papers but could not find specific details on the inference parameters used (e.g., temperature, top_p, max_tokens). Could you please share these settings?
Consistency: Were the hyperparameters, prompts, and pre-processing steps kept consistent across all the models you evaluated?
Code Availability: Is the inference or evaluation code open-sourced? Access to the scripts would be very helpful for replicating the environment and ensuring a fair comparison.
Thank you for your time and assistance.
Best regards,
Johnson