Inter-Rater Disagreements in Applying the Montreal Classification for Crohn's Disease: The Five-Nations Survey Study
Author
Date
2025-06
Permanent link
http://hdl.handle.net/11351/13886
DOI
10.1002/ueg2.12757
ISSN
2050-6414
WOS
001398976800001
PMID
39825768
Abstract
Background: The Montreal classification has been widely used in Crohn's disease since 2005 to categorize patients by the age of onset (A), disease location (L), behavior (B), and upper gastrointestinal tract and perianal involvement. With evolving management paradigms in Crohn's disease, we aimed to assess the performance of gastroenterologists in applying the Montreal classification.
Methods: An online survey was conducted among participants at an international educational conference on inflammatory bowel diseases. Participants classified 20 theoretical Crohn's disease cases using the Montreal classification. Agreement rates with the inflammatory bowel diseases board (three expert gastroenterologists whose consensus rating was considered the gold standard) were calculated for gastroenterologist specialists and fellows/specialists with ≤ 2 years of clinical experience. A majority vote < 75% among participants was considered a notable disagreement. The same cases were classified using three large language models (LLMs), ChatGPT-4, Claude-3, and Gemini-1.5, and assessed for agreement with the board and gastroenterologists. Fleiss Kappa was used to assess within-group agreement.
Results: Thirty-eight participants from five countries completed the survey. In defining the Montreal classification as a whole, specialists (21/38 [55%]) had a higher agreement rate with the board compared to fellows/young specialists (17/38 [45%]) (58% vs. 49%, p = 0.012) and to LLMs (58% vs. 18%, p < 0.001). Disease behavior classification was the most challenging, with 76% agreement among specialists and fellows/young specialists and 48% among LLMs compared to the inflammatory bowel diseases board. Regarding disease behavior, within-group agreement was moderate (specialists: k = 0.522, fellows/young specialists: k = 0.532, LLMs: k = 0.577; p < 0.001 for all). Notable points of disagreement included: defining disease behavior concerning obstructive symptoms, assessing disease extent via video capsule endoscopy, and evaluating treatment-related reversibility of the disease phenotype.
Conclusions: There is significant inter-rater disagreement in applying the Montreal classification, particularly for disease behavior in Crohn's disease. Improved education or revisions to phenotype criteria may be needed to enhance consensus on the Montreal classification.
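The two agreement measures named in the Methods, Fleiss' kappa for within-group agreement and the < 75% majority-vote threshold for a notable disagreement, can be illustrated with a minimal sketch. The abstract does not give the authors' exact computation, so this follows the standard textbook definition of Fleiss' kappa and assumes every case is rated by the same number of raters. The function names (`fleiss_kappa`, `majority_share`, `notable_disagreements`) and the toy ratings are hypothetical and are not study data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of category labels
    (one label per rater). Assumes the same number of raters per case."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")

    counts = [Counter(case) for case in ratings]
    categories = set().union(*counts)

    # Observed agreement: share of agreeing rater pairs, averaged over cases.
    per_case = [
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement from the overall proportion of each category.
    proportions = [
        sum(cnt[cat] for cnt in counts) / (n_cases * n_raters) for cat in categories
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_observed - p_expected) / (1.0 - p_expected)


def majority_share(case):
    """Fraction of raters who chose the most common label for one case."""
    return Counter(case).most_common(1)[0][1] / len(case)


def notable_disagreements(ratings, threshold=0.75):
    """Indices of cases whose majority vote falls below the threshold."""
    return [i for i, case in enumerate(ratings) if majority_share(case) < threshold]


# Toy example (made-up labels): three cases, four raters each.
toy = [
    ["B1", "B1", "B1", "B1"],
    ["B1", "B1", "B2", "B3"],
    ["B2", "B2", "B2", "B1"],
]
print(round(fleiss_kappa(toy), 3), notable_disagreements(toy))
```

In this sketch a case with exactly 75% majority is not flagged, matching the strict "< 75%" wording of the Methods.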
Keywords
Crohn's disease; Inflammatory bowel diseases; Montreal classification
Bibliographic citation
Ukashi O, Amiot A, Laharie D, Menchén L, Gutiérrez A, Fernandes S, et al. Inter-Rater Disagreements in Applying the Montreal Classification for Crohn’s Disease: The Five-Nations Survey Study. United Eur Gastroenterol J. 2025 Jun;13(5):685–96.
Audience
Professionals
This item appears in the following collections
- HVH - Articles científics [4470]