
Publication

STRICTA: Structured Reasoning in Critical Text Assessment for Peer Review and Beyond

Nils Dycke; Matej Zecevic; Ilia Kuznetsov; Beatrix Suess; Kristian Kersting; Iryna Gurevych
In: Wanxiang Che; Joyce Nabende; Ekaterina Shutova; Mohammad Taher Pilehvar (Eds.). Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2025, Vienna, Austria, July 27 - August 1, 2025. Pages 22687-22727, Association for Computational Linguistics, 2025.

Abstract

Critical text assessment is at the core of many expert activities, such as fact-checking, peer review, and essay grading. Yet, existing work treats critical text assessment as a black box problem, limiting interpretability and human-AI collaboration. To close this gap, we introduce Structured Reasoning In Critical Text Assessment (STRICTA), a novel specification framework to model text assessment as an explicit, step-wise reasoning process. STRICTA breaks down the assessment into a graph of interconnected reasoning steps drawing on causality theory (Pearl, 1995). This graph is populated based on expert interaction data and used to study the assessment process and facilitate human-AI collaboration. We formally define STRICTA and apply it in a study on biomedical paper assessment, resulting in a dataset of over 4000 reasoning steps from roughly 40 biomedical experts on more than 20 papers. We use this dataset to empirically study expert reasoning in critical text assessment, and investigate if LLMs are able to imitate and support experts within these workflows. The resulting tools and datasets pave the way for studying collaborative expert-AI reasoning in text assessment, in peer review and beyond.
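To make the graph formulation concrete, the following is a minimal Python sketch of what an assessment modeled as a graph of interconnected reasoning steps could look like. All names here (ReasoningStep, AssessmentGraph, the example questions) are illustrative assumptions, not the schema actually defined in the paper.

```python
from dataclasses import dataclass, field

@dataclass
class ReasoningStep:
    # Hypothetical step schema; the paper defines its own specification.
    step_id: str
    question: str                  # the assessment question this step answers
    answer: str = ""               # filled in by an expert or an LLM
    parents: list[str] = field(default_factory=list)  # steps this one depends on

class AssessmentGraph:
    """Directed graph of interconnected reasoning steps."""

    def __init__(self) -> None:
        self.steps: dict[str, ReasoningStep] = {}

    def add_step(self, step: ReasoningStep) -> None:
        self.steps[step.step_id] = step

    def ready_steps(self) -> list[ReasoningStep]:
        """Unanswered steps whose parent steps have all been answered."""
        return [
            s for s in self.steps.values()
            if not s.answer
            and all(self.steps[p].answer for p in s.parents)
        ]

# Example: a second step that depends on the first being answered.
g = AssessmentGraph()
g.add_step(ReasoningStep("s1", "Is the hypothesis clearly stated?"))
g.add_step(ReasoningStep("s2", "Do the experiments test the hypothesis?",
                         parents=["s1"]))
print([s.step_id for s in g.ready_steps()])  # -> ['s1']
```

In such a representation, the dependency structure determines which steps can be worked on next, which is one way a step-wise workflow could route individual steps to either human experts or an LLM.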

Further Links