Mario Giulianelli, Arabella Sinclair and Raquel Fernández
Speakers are thought to use efficient information transmission strategies for effective communication. For example, they transmit information at a constant rate in written text. We analyse these strategies in monologue and dialogue datasets, combining information-theoretic measures with probability estimates obtained from Transformer-based language models. We find (i) that information density decreases overall in spoken open-domain and written task-oriented dialogues, while it remains uniform in written texts; (ii) that speakers’ choices are oriented towards global, rather than local, uniformity of information; and (iii) that uniform information density strategies are at play in dialogue when we zoom in on topically and referentially coherent contextual units. Besides providing new empirical evidence on written and spoken language production, we believe that our studies can directly inform the development of more human-like natural language generation models.
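
To make the contrast between global and local uniformity concrete, the following is a minimal sketch and not the authors' implementation. It assumes that information content is measured as surprisal, the negative log probability of a token given its context, and it uses one possible operationalisation of each notion: deviation from the utterance mean (global) and change between adjacent tokens (local). The probabilities are made-up placeholders for language model estimates, and all function names are hypothetical.

```python
import math
from statistics import mean


def surprisal(probs):
    """Per-token information content in bits: -log2 p(token | context)."""
    return [-math.log2(p) for p in probs]


def global_variance(s):
    """Mean squared deviation of each token's surprisal from the utterance mean.

    Assumed operationalisation of global uniformity: lower is more uniform.
    """
    mu = mean(s)
    return mean((x - mu) ** 2 for x in s)


def local_variance(s):
    """Mean squared change in surprisal between adjacent tokens.

    Assumed operationalisation of local uniformity: lower is more uniform.
    """
    return mean((b - a) ** 2 for a, b in zip(s, s[1:]))


# Hypothetical per-token probabilities standing in for language model estimates.
probs = [0.5, 0.25, 0.5, 0.125]
s = surprisal(probs)

print(s)                   # [1.0, 2.0, 1.0, 3.0]
print(mean(s))             # 1.75 bits per token on average
print(global_variance(s))  # 0.6875
print(local_variance(s))   # 2.0
```

In this toy utterance the surprisal values stay fairly close to their mean but swing sharply from one token to the next, so the sequence looks more uniform under the global measure than under the local one.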