5.2 Syuzhet similarity
Kulkarni et al. address the problem of computing exposition similarity for a given pair of input documents. The approach uses a general knowledge base in the form of a term co-occurrence graph (TCG), computed from all articles in Wikipedia, to help build a story model for comparison.
In this model, the narrative style, or syuzhet, is modeled as the sequence of topics used to narrate the story. It is computed by a random walk over a TCG comprising nouns, named entities, and verbs extracted from the documents, augmented with the general-knowledge TCG obtained from Wikipedia.
The stationary distribution obtained from the random walk represents an “aboutness” distribution over topics, which characterizes the exposition style. This model of computing aboutness was first proposed by Rachakonda et al.
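The idea of reading aboutness off a random walk can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_tcg` and `aboutness`, the sentence-level co-occurrence window, and the damping (teleport) term are all assumptions made here so that the example is self-contained and the walk converges. Augmentation with the Wikipedia TCG is omitted.

```python
from collections import defaultdict
from itertools import combinations


def build_tcg(sentences):
    """Build an undirected, weighted term co-occurrence graph.

    `sentences` is a list of term lists (e.g. nouns, named entities, verbs).
    Assumption: two terms co-occur when they share a sentence.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for terms in sentences:
        for a, b in combinations(sorted(set(terms)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def aboutness(graph, damping=0.85, iterations=100):
    """Approximate the stationary distribution of a random walk on the graph.

    Assumption: a PageRank-style teleport term (1 - damping) is added;
    the source only states that a random walk is used.
    """
    nodes = sorted(graph)
    n = len(nodes)
    score = {t: 1.0 / n for t in nodes}
    for _ in range(iterations):
        nxt = {t: (1.0 - damping) / n for t in nodes}
        for t in nodes:
            total = sum(graph[t].values())
            # Spread this term's probability mass over its neighbours,
            # proportionally to co-occurrence weight.
            for nbr, w in graph[t].items():
                nxt[nbr] += damping * score[t] * w / total
        score = nxt
    return score


# Hypothetical toy document, already reduced to content terms per sentence.
toy = [["hero", "villain", "city"], ["hero", "city"], ["villain", "plan"]]
scores = aboutness(build_tcg(toy))
```

In this toy graph, terms that co-occur with many others (such as "villain") receive more probability mass than peripheral terms (such as "plan"), which is the intuition behind treating the stationary distribution as a measure of what the document is about.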
Exposition similarity between two documents is then measured in two steps:
(1) Measuring the size of “common topic space”
(2) Measuring ranked similarity of aboutness of topical terms
The final Syncretic Similarity score is computed by weighting this similarity score with the joint probability of the common topics between the two documents.
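The two steps and the final weighting can be sketched as follows. The exact formulas are not given in this section, so every concrete choice below is an assumption made for illustration: the common topic space is taken as the set intersection of terms, the joint probability as the product of the aboutness mass each document assigns to the common terms, and the ranked similarity as a Spearman rank correlation rescaled to [0, 1]. The function names are hypothetical.

```python
def rank_positions(dist, terms):
    """Rank `terms` by descending aboutness (ties broken alphabetically)."""
    ordered = sorted(terms, key=lambda t: (-dist[t], t))
    return {t: i for i, t in enumerate(ordered)}


def ranked_similarity(p, q, common):
    """Assumed rank measure: Spearman correlation mapped from [-1, 1] to [0, 1]."""
    n = len(common)
    if n == 0:
        return 0.0
    if n == 1:
        return 1.0
    rp, rq = rank_positions(p, common), rank_positions(q, common)
    d2 = sum((rp[t] - rq[t]) ** 2 for t in common)
    rho = 1.0 - 6.0 * d2 / (n * (n * n - 1))
    return (rho + 1.0) / 2.0


def syncretic_similarity(p, q):
    """Weight the ranked similarity by the joint probability of common topics.

    `p` and `q` are aboutness distributions (term -> probability).
    """
    common = set(p) & set(q)  # step 1: common topic space
    # Assumption: joint probability = product of the mass on common terms.
    joint = sum(p[t] for t in common) * sum(q[t] for t in common)
    return joint * ranked_similarity(p, q, common)  # step 2, then weighting


# Hypothetical aboutness distributions for two documents.
doc_a = {"hero": 0.5, "city": 0.3, "plan": 0.2}
doc_b = {"hero": 0.4, "city": 0.4, "train": 0.2}
score = syncretic_similarity(doc_a, doc_b)
```

Under these assumptions, two documents score highly only when they share a large portion of their topical mass and also order the shared topics similarly; documents with no common terms score zero.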
The proposed method was tested on a hand-crafted dataset of 22 test cases collected from the Internet. The test cases covered different themes, such as movies, news articles, and encyclopedic articles. The paper also proposes future work that could further improve the methodology.
Narratology is an important area of study that can help explain several cognitive, social, and cultural phenomena. While the study of narratives has long attracted interest from philosophers, recent advances in computational technology and in natural language processing have opened opportunities for the computational modeling of narratives. Breakthroughs in this area are likely to have several implications, both for our theoretical understanding of cognition and intelligence and for myriad practical applications.