This paper investigates how large language models can serve as effective judges of both explicit and implicit relevance in entity retrieval tasks, and proposes a novel evaluation framework for information retrieval systems.