Machine Learning Approach on Tinnitus Talk

Frédéric

Assessing the Heterogeneity of Complaints Related to Tinnitus and Hyperacusis from an Unsupervised Machine Learning Approach: An Exploratory Study

Abstract

Introduction: Subjective tinnitus (ST) and hyperacusis (HA) are common auditory symptoms that may become incapacitating in a subgroup of patients, who therefore seek medical advice. Both conditions can result from many different mechanisms; as a consequence, patients may report a vast repertoire of associated symptoms and comorbidities that can dramatically reduce quality of life and even lead to suicide attempts in the most severe cases. The present exploratory study aims to investigate patients' symptoms and complaints through an in-depth statistical analysis of patients' natural narratives in a real-life environment. Thanks to the anonymization of contributions and the peer-to-peer interaction, the wording used there is assumed to be free of self-limitation and self-censorship.

Methods: We applied a purely statistical, unsupervised machine learning approach to the analysis of patients' verbatim posts exchanged on an Internet forum. After automated data extraction, the dataset was preprocessed to make it suitable for statistical analysis. We used a variant of the Latent Dirichlet Allocation (LDA) algorithm to reveal clusters of symptoms and complaints of HA patients (topics). The probability distribution of words within a topic uniquely characterizes it. The log-likelihood of the LDA model converged after 2,000 iterations. Several statistical parameters were tested for topic modeling and for the word relevance factor within each topic.
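For readers who have not met LDA before, here is a minimal sketch of plain LDA fitted by collapsed Gibbs sampling, in pure Python. It is not the authors' code: the abstract only says "a variant" of LDA was used, so the sampler, the hyperparameters `alpha` and `beta`, the function name `fit_lda`, and the toy tokens below are all assumptions made up for illustration.

```python
import random
from collections import Counter

def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Plain LDA via collapsed Gibbs sampling (illustrative, not the paper's variant).

    docs is a list of token lists. Returns phi, where phi[k][w] = p(word w | topic k).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # tokens assigned to topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed word distribution of each topic.
    return [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in vocab}
        for k in range(num_topics)
    ]

# Invented toy "posts", already tokenized; not data from the study.
toy_docs = [
    ["ringing", "ear", "night", "sleep"],
    ["loud", "noise", "pain", "ear"],
    ["sleep", "night", "ringing", "anxiety"],
    ["noise", "pain", "loud", "earplugs"],
]
phi = fit_lda(toy_docs, num_topics=2)
```

Each entry of `phi` is the word distribution of one topic, which is what the abstract says characterizes a topic. The study reports K = 15 topics and 2,000 iterations; the tiny values here only keep the example quick to run.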

Results: Despite a rather small dataset, this exploratory study demonstrates that patients' free-form narratives available on the Internet constitute valuable material for machine learning and statistical analysis aimed at categorizing ST/HA complaints. The LDA model with K = 15 topics seems to be the most relevant in terms of relative weights and correlations, with the capability to distinguish subgroups of patients displaying specific characteristics. The study of the relevance factor may be useful to unveil weak but important signals that are present in patients' narratives.
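The abstract does not define its "relevance factor". If it is the term relevance metric of Sievert and Shirley (the one used by LDAvis), which is my assumption and not something the abstract confirms, then a word w is scored within topic k as lambda * log p(w|k) + (1 - lambda) * log(p(w|k) / p(w)). A rough sketch with invented probabilities (the function names and numbers are hypothetical):

```python
import math

def relevance(p_word_given_topic, p_word, lam):
    """Relevance of a word to a topic, assuming the Sievert and Shirley definition."""
    lift = p_word_given_topic / p_word
    return lam * math.log(p_word_given_topic) + (1 - lam) * math.log(lift)

def rank_terms(topic_probs, corpus_probs, lam=0.6):
    """Words of one topic, sorted from most to least relevant."""
    scores = {w: relevance(p, corpus_probs[w], lam) for w, p in topic_probs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Invented numbers: p(w | topic) for one topic and p(w) over the whole corpus.
topic_probs = {"ear": 0.5, "tinnitus": 0.3, "jaw": 0.2}
corpus_probs = {"ear": 0.6, "tinnitus": 0.39, "jaw": 0.01}

by_frequency = rank_terms(topic_probs, corpus_probs, lam=1.0)
by_lift = rank_terms(topic_probs, corpus_probs, lam=0.0)
```

With lambda = 1 the ranking follows raw frequency inside the topic, so common words like "ear" come first. With lambda near 0 a word that is rare overall but concentrated in this topic ("jaw" in the toy numbers) moves to the top, which is one way to read "weak but important signals".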

Discussion/Conclusion: We claim that the unsupervised LDA approach would make it possible to gain knowledge on the patterns of ST- and HA-related complaints and on patient-centered domains of interest. The merits and limitations of the LDA algorithms are compared with other natural language processing methods and with more conventional methods of qualitative analysis of patients' output. Future directions and research topics emerging from this innovative algorithmic analysis are proposed.

Further in the article:
The goal of this proof-of-concept study is to test the feasibility and interest of the LDA method. This method is aimed at extracting topics related to HA from freely posted comments on an Internet forum (TinnitusTalk.com). Future directions and research topics of this algorithmic analysis in the field of ST/HA are also discussed.

@Hazel, @Markku: were you aware of this?
@JohnAdams: it seems that some researchers listened (by telepathy) to our suggestions.

Full article: https://www.karger.com/Article/FullText/504741
 
