Supercalifragilisticexpialidocious: Why Using the “Right” Readability Formula in Children’s Web Search Matters

Garrett Allen, Ashlee Milton, Katherine Landau Wright, Jerry Alan Fails, Casey Kennington, and Maria Soledad Pera. 2022. “Supercalifragilisticexpialidocious: Why Using the ‘Right’ Readability Formula in Children’s Web Search Matters.” Full paper to appear in Proceedings of the 44th European Conference on Information Retrieval (ECIR ’22).


Readability is a core component of information retrieval (IR) tools, as the complexity of a resource directly affects its relevance: a resource is useful only if the user can comprehend it. Even so, the connection between readability and IR is often overlooked. As a starting point towards advancing knowledge on the influence of readability on IR, we focus on Web search for children. We first explore how traditional formulas, which are simple, efficient, and portable, fare when estimating the readability of English-language Web resources targeting children. We then empirically show that readability can indeed sway children’s information access. Outcomes from our work reveal that: (i) for Web resources targeting children, a simple formula suffices as long as it considers contemporary terminology and audience requirements, and (ii) instead of defaulting to Flesch-Kincaid, a popular readability formula, using the “right” formula can shape Web search tools to best serve children. The work presented herein builds on three pillars: Target Audience, Application, and Expertise. It serves as a blueprint for identifying the readability estimation methods that best apply to and inform IR-related applications serving varied audiences (e.g., users experiencing dyslexia and English language learners).
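
For readers unfamiliar with traditional readability formulas, the following is a minimal sketch of the Flesch-Kincaid Grade Level, the popular formula the abstract names. The coefficients are those of the standard published formula. The sentence splitter, the word tokenizer, and the vowel-group syllable counter are simplifying assumptions made for illustration only; they are not taken from the paper, and the function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels (an assumed heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Estimate the U.S. grade level of `text` with the Flesch-Kincaid formula."""
    # Naive segmentation: sentences end at ., ! or ?; words are runs of letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    # Standard coefficients: average sentence length and average syllables per word.
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    easy = "The cat sat on the mat."
    hard = "Supercalifragilisticexpialidocious terminology complicates comprehension."
    print(flesch_kincaid_grade(easy), flesch_kincaid_grade(hard))
```

The formula relies only on surface counts (words per sentence and syllables per word), which is what makes it simple and portable, and also why it cannot account for vocabulary familiarity or audience on its own.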