Accessibility and Trajectory-Based Text Characterization

Abstract

Several complex systems exhibit intricate properties at multiple scales, and multi-scale characterizations of such systems are used in various applications. In particular, texts have a hierarchical structure that can be studied with multi-scale concepts and methods. Here, we adopt an extension of the multi-scale, mesoscopic approach, hereafter referred to as a recurrence network, to represent text narratives: only the recurrent relationships among tagged parts of speech (subject, verb and direct object) are used to establish connections among sequential pieces of text. The texts were then characterized with two complementary, scale-dependent measurements: accessibility and symmetry. To evaluate the potential of these concepts, we addressed two problems: distinguishing meaningful from meaningless texts, and distinguishing literary genres (namely, fiction and non-fiction). A set of 300 books was compared using these approaches. The recurrence network characterization discriminated, to some extent, between real and meaningless texts and between the two genres assessed. Our results thus indicate that recurrence networks can capture subtleties in book plots, suggesting that a similar methodology can be used in related networked applications.
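
As a rough illustration of the two ingredients named in the abstract, the sketch below builds a small recurrence-style network and computes an accessibility value for one node. It is not the construction used in the paper, which the abstract does not specify in detail. It rests on these assumptions: each piece of text is reduced to a set of tagged words, two pieces are linked whenever they share a word, and accessibility is taken as the exponential of the Shannon entropy of the probabilities of reaching each node after h steps of an unbiased random walk. The function names and the toy segments are hypothetical.

```python
import math


def build_recurrence_network(segments):
    """Link two text segments when they share at least one tagged word.

    `segments` is a list of sets of words (assumed to be the subjects,
    verbs and direct objects already extracted from each piece of text).
    Returns an undirected adjacency map {index: set(neighbour indices)}.
    """
    adj = {i: set() for i in range(len(segments))}
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if segments[i] & segments[j]:  # a word recurs in both segments
                adj[i].add(j)
                adj[j].add(i)
    return adj


def walk_distribution(adj, start, h):
    """Probability of an unbiased random walk being at each node after h steps."""
    probs = {start: 1.0}
    for _ in range(h):
        nxt = {}
        for node, p in probs.items():
            neighbours = adj[node]
            if not neighbours:
                # Assumption: a walker on an isolated node stays where it is.
                nxt[node] = nxt.get(node, 0.0) + p
                continue
            share = p / len(neighbours)
            for nb in neighbours:
                nxt[nb] = nxt.get(nb, 0.0) + share
        probs = nxt
    return probs


def accessibility(adj, start, h):
    """Exponential of the entropy of the h-step visiting probabilities."""
    probs = walk_distribution(adj, start, h)
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return math.exp(entropy)


# Toy data (invented for illustration only).
segments = [
    {"alice", "meet", "bob"},
    {"bob", "read", "book"},
    {"alice", "write", "letter"},
    {"storm", "hit", "town"},
]
network = build_recurrence_network(segments)
print(network)
print(accessibility(network, 0, 1))
```

Under this definition the value can be read as the effective number of nodes reachable in h steps: it equals the number of reachable nodes when they are all equally likely, and it drops towards 1 as the walk concentrates on a few of them. Varying h is what makes the measurement scale-dependent.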

Publication
Information Sciences
Date