How to represent speech as a network

Netts pipeline


Network of Transcript Semantics

During my time at the University of Cambridge, I wrote an NLP package for analysing the content of natural speech. The algorithm (netts) constructs a network from a speech transcript that represents the content of what the speaker said. The nodes of the network are the entities the speaker mentioned (usually nouns, such as a cat or a house), and the edges are the relationships between those entities. We called these networks semantic speech networks.
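To make the nodes-and-edges idea concrete, here is a minimal sketch in plain Python. It is not the netts API and does not reproduce how netts extracts relations from text. The triples are hypothetical and written by hand, and the function name build_network is made up for this illustration. It only shows how entities and their relationships could be stored as a directed network.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples, written by hand for
# illustration. netts derives relations like these from a transcript itself;
# this sketch skips that step entirely.
triples = [
    ("cat", "sits on", "roof"),
    ("roof", "belongs to", "house"),
    ("cat", "watches", "bird"),
]


def build_network(triples):
    """Return (nodes, edges): a set of entities and a map from each
    source entity to its list of (relation, target) pairs."""
    nodes = set()
    edges = defaultdict(list)
    for subject, relation, obj in triples:
        # Both ends of a relation become nodes in the network.
        nodes.update((subject, obj))
        # The relation itself becomes a labelled, directed edge.
        edges[subject].append((relation, obj))
    return nodes, dict(edges)


nodes, edges = build_network(triples)

print(sorted(nodes))
for source, outgoing in edges.items():
    for relation, target in outgoing:
        print(f"{source} --[{relation}]--> {target}")
```

Storing edges as a list per source node allows several relations between the same pair of entities, which seems a reasonable choice for speech, where a speaker may relate two things in more than one way.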

Why did we want to represent speech content as a network? There is evidence that psychiatric conditions, in particular early psychosis (schizophrenia), can be detected early through abnormalities in language. In psychosis, there is a phenomenon called ‘loosening of associations’, where connections between ideas become tenuous and extraneous concepts seem to intrude into the line of thought. We hypothesised that mapping speech content as a network could capture these abnormal language features early in psychosis, potentially aiding diagnosis and disease monitoring. In a clinical sample of patients with early psychosis and controls, that was indeed the case (you can find the preprint here).

We also believe that this tool could be useful for analysing speech content in other conditions, as well as in healthy and developing populations, because the semantic speech networks are quite rich in information. If you’re interested in using semantic speech networks for your own data, we have written an installation guide and tutorials for different use cases.



Source Code:

Media Coverage: Medscape Article


Netts was written by Caroline Nettekoven in collaboration with Sarah Morgan.

Netts was packaged in collaboration with Oscar Giles, Iain Stenson and Helen Duncan.

Caroline Nettekoven
Postdoctoral Researcher

I am interested in the neural basis of complex behaviour. To study this, I use neuroimaging techniques, computational modelling of behaviour and brain stimulation.