UNISONO – Sensor system with AI-driven vocal biomarkers

Last update: 25th February 2025

September 13, 2022 – The start of the innovation project “UNISONO: Sensor system for AI-driven clinical phenotyping with speech biomarkers for heart failure” was announced today by Zana Technologies GmbH, a Germany-based provider of Conversation and Voice AI technology for healthcare, together with Cosinuss GmbH, a certified medical technology company specializing in real-time mobile measurement of vital parameters, and the Comprehensive Heart Failure Center (CHFC) at the University Hospital Würzburg.

UNISONO aims to develop a novel system combining an ear-worn sensor with speech recognition. In addition to the continuous measurement of vital parameters, the sensor will be equipped with a speech assistant that enables voice-guided communication. The collected data will be used to derive novel vocal biomarkers for AI-driven clinical phenotyping of patients with chronic heart failure, a condition that affects more than 3 million people in Germany alone.

In a highly competitive selection process, UNISONO made it into the top 8% of projects funded by the German Federal Ministry of Education and Research as part of its program “KMU-innovativ” (program for innovative SMEs) in the field of “Interactive technologies for health and quality of life”. The 3-year project started on August 1, 2022, and a first joint meeting was held involving all partners and the project executing agency VDI/VDE Innovation + Technology GmbH.

Novelty of UNISONO

The innovation of UNISONO is to combine Zana’s existing AI platform with novel vocal biomarker technology, guided by the clinical expertise of the CHFC, while extending the cosinuss° in-ear sensor for speech interaction. Decompensated heart failure is a complex clinical picture characterized by symptoms such as shortness of breath, edema and reduced exercise tolerance. Few studies have described voice changes in the context of decompensation. However, acoustic measurements of altered vocal characteristics could serve as early indicators of incipient decompensation or changes in the patient’s state of health.

“With UNISONO, we are investigating how speech and vital data can be combined to such an extent and how the data quality can be improved by an intelligent assistant in order to use it as a health predictor in heart failure,” states Dr. Julia Hoxha, CEO of Zana and coordinating partner. She adds: “In doing so, UNISONO brings AI-powered collection of real-world data to core healthcare and clinical research.”

To simultaneously enable voice interaction and measure vital signs (such as body temperature, heart rate, oxygen saturation and respiratory rate), the hardware of cosinuss°’s patented ear sensor is enhanced with a microphone and speaker. “This technology enables us to continuously monitor the patient’s vital parameters and voice in real time over several weeks without the need for complex cabling, thus building up a large database for the development of a voice biomarker for clinical phenotyping,” says Dr. Johannes Kreuzer, CEO of cosinuss°.

In collaboration with the clinical partner CHFC, digital clinical phenotypes will be identified from the collected data and linked to established factors associated with a poorer prognosis in heart failure. “Vocal biomarkers have huge potential for improving patient care in heart failure since they are non-invasive, low-cost, easy to collect repetitively, and can be assessed remotely,” explains Dr. Fabian Kerwagen, MPH, leader of project UNISONO at the CHFC. “Combining patient’s voice with ear-worn technology will facilitate comprehensive digital phenotyping of heart failure patients and offer new opportunities for telemonitoring and prevention in heart failure.”

Update (April 2024): First interim results

On March 19, 2024, the fourth consortium meeting of all project partners took place at the cosinuss° offices in Munich (see Fig. 1). The prototype of the in-ear sensor c-med° alpha, which has been extended with an audio function, was presented there. The first interim results of the AHF Voice Study, which is part of the UNISONO project, were also presented and discussed. The monocentric, prospective cohort study is being conducted at the University Hospital of Würzburg. By the time of the consortium meeting, 100 of the targeted 123 patients had already been recruited. Results are now available for the first 50 AHF Voice patients, who were admitted to the hospital between April and August 2023. Under the supervision of the study staff, the patients¹ recorded their voice daily using a dedicated smartphone app developed by Zana Technologies GmbH. Three different voice tasks were completed: spontaneous speech, holding the vowel /a:/ for as long as possible, and reading a standardized text. In addition, each patient’s medical history, routine blood values, results from technical examinations such as cardiac ultrasound, and health questionnaires were collected.

Fig. 1: Consortium meeting of the UNISONO project partners: Dr. Julia Hoxha and Ongun Tuna from Zana Technologies GmbH, Dr. Fabian Kerwagen and Maximilian Bauser from the University Hospital of Würzburg and Dr. Michael Weber and David Geiger from Cosinuss GmbH.

Analysis of voice changes

The first analyses used the patients’ recorded sustained vowel sounds (/a:/). The aim was to find out whether changes in the recorded vocal characteristics could provide a non-invasive method for detecting an impending heart failure episode. The first interim analysis was based on 45 pairs of voice recordings (one at hospital admission and one at discharge, n=90). Speech features were extracted from the recordings using Praat² software. The results show that the following aspects of the patient’s voice changed significantly between admission and discharge: maximum phonation time, total energy, number of pulses, shimmer (cycle-to-cycle variation in the amplitude of a speech signal), cepstral peak prominences, number of voice breaks, and jitter (cycle-to-cycle irregularity in the fundamental frequency or period of a speech signal). No significant changes were detected in the mean values of pitch and harmonics-to-noise ratio (see Fig. 2).
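To make the two perturbation measures concrete, here is a minimal sketch of how "local" jitter and shimmer are commonly defined: the mean absolute difference between consecutive cycles divided by the mean value, applied to cycle durations (jitter) or cycle peak amplitudes (shimmer). The function name and the sample values are hypothetical and not taken from the study; the detection of individual glottal cycles, which Praat performs, is not reproduced here.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, relative to the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


# Hypothetical per-cycle measurements of a sustained /a:/ (illustrative only, not study data)
periods_s = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]  # duration of each glottal cycle in seconds
peak_amplitudes = [0.80, 0.78, 0.82, 0.79, 0.81]      # peak amplitude of each cycle

jitter_local = local_perturbation(periods_s)          # period irregularity
shimmer_local = local_perturbation(peak_amplitudes)   # amplitude irregularity

print(f"jitter (local):  {jitter_local:.2%}")
print(f"shimmer (local): {shimmer_local:.2%}")
```

A perfectly regular voice would yield 0% for both values; larger values indicate a less stable vibration of the vocal folds.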

Multiple boxplots showing the differences between admission and discharge values for the parameters jitter, shimmer, MPT, and CPP.

Fig. 2: Changes in voice parameters (boxplots) from admission to discharge. Statistical significance was tested using Student’s t-test. MPT = maximum phonation time. CPP = cepstral peak prominences. Reference: herzmedizin.de

This interim analysis shows that a number of easily deducible vocal characteristics change depending on the state of decompensated heart failure. The clinical utility of such vocal biomarkers as a non-invasive method for detecting the onset of heart failure is promising, but requires further research.
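The admission-versus-discharge comparison can be sketched as follows. The article only states that Student’s t-test was used; this sketch assumes a paired variant because each patient contributes one admission and one discharge recording. The function name and all values are made up for illustration, including the direction of change.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return the paired t statistic and its degrees of freedom."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # mean difference over its standard error
    return t, n - 1


# Hypothetical maximum phonation times in seconds (illustrative only, not study data)
admission = [8.0, 10.5, 7.2, 9.1, 6.8, 11.0]
discharge = [10.1, 12.0, 9.0, 10.2, 8.9, 12.4]

t_value, df = paired_t_statistic(admission, discharge)

# Two-sided critical value of the t distribution for alpha = 0.05 and 5 degrees of freedom
T_CRIT_DF5 = 2.571
significant = abs(t_value) > T_CRIT_DF5
print(f"t = {t_value:.2f}, df = {df}, significant at 5%: {significant}")
```

In practice, a statistics library such as SciPy (`scipy.stats.ttest_rel`) would return the exact p-value instead of a comparison against a tabulated critical value.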

Voice-based phenotyping of patients

A comprehensive phenotyping of the patients took place during the hospital stay. A total of 2,753 voice recordings from 42 patients were included in the analysis (average age 74±11 years, 64% men)³. Machine learning methods⁴ were used to process and cluster the audio data and the voice characteristics contained therein. In this way, three clusters with different phenotypes could be identified in the unsupervised clustering based on voice features alone (see Fig. 3): Cluster 1 had the longest duration of heart failure, the highest levels of natriuretic peptides (hormones involved in the regulation of water-electrolyte balance) and the lowest left ventricular ejection fraction (LVEF, indicates how much blood leaves the left ventricle during a heartbeat). Cluster 2 had an intermediate LVEF and the highest potassium level. Cluster 3 had the highest LVEF and the highest proportion of women.

A scatterplot with blue points for cluster 1, green points for cluster 2 and red points for cluster 3.

Fig. 3: Clusters resulting from K-means clustering of the principal components (PCA) of voice-based features extracted from patients’ audio recordings. Reference: herzmedizin.de

The interim results of the phenotyping indicate that cluster analyses of voice features based on machine learning are able to identify different groups of heart failure patients.
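The clustering approach described above (PCA, unsupervised K-means, number of clusters chosen via the silhouette score) can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the project’s pipeline: the feature matrix, the standardization step, the two principal components, and the candidate range of cluster counts are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a (recordings x voice features) matrix; NOT study data.
rng = np.random.default_rng(0)
centers = np.array([
    [4, 4, 4, 4, 4, 4],
    [-4, -4, -4, -4, -4, -4],
    [4, 4, 4, -4, -4, -4],
], dtype=float)
features = np.vstack([c + rng.normal(size=(60, 6)) for c in centers])

# Standardize the features, then project onto two principal components (assumed choices).
scaled = StandardScaler().fit_transform(features)
components = PCA(n_components=2).fit_transform(scaled)

# Try several cluster counts and keep the one with the highest silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(components)
    scores[k] = silhouette_score(components, labels)

best_k = max(scores, key=scores.get)
final_labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(components)

print("silhouette by k:", {k: round(float(s), 3) for k, s in scores.items()})
print("chosen number of clusters:", best_k)
```

Because the clustering sees only voice features, clinical variables such as LVEF or natriuretic peptide levels would be compared across the resulting clusters afterwards, as in the cluster descriptions above.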

Outlook 2024/2025

As the UNISONO project progresses, the “in-ear sensor sub-study” (feasibility study) is scheduled to start at the end of 2024. For this purpose, the in-ear sensor from cosinuss°, the c-med° alpha (see Fig. 4), was modified to meet the project requirements. The most important change is the addition of audio functionality. In addition to the installation of a microphone and loudspeaker, various other adjustments and optimizations were made. A working prototype of the in-ear sensor has already been completed and tested. The modified c-med° alpha including audio functionality fulfills all requirements with regard to the measurement accuracy of the vital parameters⁵ and is currently being tested for its audio quality.

Fig. 4: c-med° alpha, a class IIa medical measuring device that continuously generates data streams of three important vital parameters: core body temperature, pulse rate, oxygen saturation (SpO2).

In addition, the analyses described here were presented at the 90th DGK Annual Meeting. Further interim results of the AHF Voice study will also be presented at various congresses in the coming months, including the Heart Failure – World Congress on Acute Heart Failure in May 2024 (Lisbon, Portugal).

Update (February 2025): Validation of the prototype

The UNISONO innovation project for AI-driven clinical phenotyping of heart failure patients has reached another milestone: cosinuss° has completed the planned prototype, which, in addition to the continuous measurement of vital parameters, now also has audio functionality, with an integrated loudspeaker and microphone similar to headphones.

The associated feasibility study will soon be carried out at the University Hospital of Würzburg, during which around 15 patients will use the c-med° alpha prototype in their home environment for three months. Both vital data and voice recordings will be collected in order to obtain novel vocal biomarkers for AI-supported analysis.

We look forward to reporting further updates and results soon!

UNISONO website

www.unisono-projekt.com

Authors

Gerrit Schweiger

B.A. Communication Designer and UX/UI Designer with a focus on digitalization in healthcare.
Melanie Schade

M.A. Communication Studies and online marketing expert with a focus on health and science communication.

References

¹ Inclusion criteria: Hospitalization for acute heart failure, age ≥18 years, life expectancy ≥6 months. Exclusion criteria: High output heart failure, cardiogenic shock, a listing for high urgency heart transplantation or a history of vocal cord disease or surgery.
² Praat version 6.3.13, July 31, 2023, The Netherlands.
³ Of 50 patients, four withdrew their consent and a further four patients were excluded due to poor recording quality.
⁴ A vocal biomarker pipeline was established for audio processing and extraction of specific vocal features. The data was analyzed using principal component analysis (PCA) with an unsupervised K-means clustering approach. The number of clusters was determined using the silhouette score.
⁵ The SpO2 algorithm has been validated on the data of the validation study for medical devices and complies with the standard.