David Pinto, Paolo Rosso, Héctor Jiménez-Salazar: On the Assessment of Text Corpora. NLDB 2009: 281-290