Linguistic Summarization of Sensor Data for Early Illness Recognition in Eldercare

Principal Investigator:

Co-Investigators: , , ,


Project Summary

Specific Aims

Many older adults in the US prefer to live independently for as long as they are able, despite the onset of severe chronic conditions [Esbaugh08]. Older adults are particularly at risk for late assessment of physical or cognitive changes for several reasons: their impression that such changes are simply a normal part of aging, their reluctance to admit to a problem, and their fear of losing independence. Early detection of health changes is the key to maintaining health, independence, and function [Yarnall03], [Boock02], [Ridley05]. Non-wearable sensors such as depth cameras, motion detectors (PIR) or bed mats (BCG) are able to detect changes in gait [Stone15a,b], activity [Popescu12] or sleep [Heise10,11], respectively, and have emerged as a possible solution for early detection of health changes [Rantz15, Hayes14, Dawadi13]. For example, these sensors can detect behavioral patterns caused by dementia, such as aberrant motion behavior (pacing), agitation (increased activity) and sleep changes [Cerejeira12; Galambos12], as well as other nonspecific behaviors such as lethargy, weakness and falls that have high predictive value for acute illness in nursing homes [Boock03; Chaaraoui12].

Since 2005, our interdisciplinary research team has investigated, developed and rigorously tested a state-of-the-art sensor monitoring system for older adults in TigerPlace, a unique eldercare facility near the University of Missouri (MU) campus [Rantz05a]. In a retrospective study in TigerPlace [NIH R21, Rantz PI, 2009-2012], we detected patterns of changes in sensor data in 42% of the cases 10 days to 2 weeks preceding a significant health event [Rantz08a, 08b, 09, 10, 12]. In a current, more extensive NIH R01 study in 14 Missouri nursing homes, we are prospectively measuring the effectiveness of sensors coupled with a clinical alert system to detect early signs of illness [Rantz PI, 2012-2016]. While in-home sensors hold enormous potential for identifying early changes in health, analyzing the resulting data is challenging for clinicians due to its variety (many sensor types) and velocity (continuous monitoring). Currently, to review this data, clinicians must navigate to a secure interface and review a graphic or video data display, which takes at least 10 minutes per incident [Alexander2011]. Moreover, issues such as the nationwide nursing shortage [HRSA2014], staff turnover and poor staff communication lead to inadequate knowledge of the resident [Majerovitz2009], which may in turn cause early signs of illness to be overlooked.

To address these challenges, we propose to investigate a new knowledge generation methodology based on linguistic health summaries as a tool to integrate and summarize vast amounts of in-home sensor data in order to detect early health changes, save clinicians’ time and improve communication in nursing homes. Several ICU studies [Gatt09; Deemter99; Luo12; Law05] have demonstrated that clinical decisions based on human-authored textual summaries were superior in accuracy to those based on graphical output from sensors. By leveraging our ongoing R01 study, we will implement and evaluate the new knowledge generation methodology in TigerPlace and at 14 Americare nursing homes in Missouri.

Our summarization approach is based on two novel methodologies: linguistic protoform summaries (LPS) and sensor sequence annotations (SSA). LPS is a natural language generation technique based on soft templates able to describe the temporal nature of large sensor data streams while capturing their inherent uncertainty. A typical LPS [Jain15, Wilbik12] would look like: “The resident had frequent restlessness episodes last night”. On the other hand, SSA summarizes sensor data using the National Library of Medicine (NLM) Unified Medical Language System (UMLS) concepts extracted from the related TigerPlace/Americare nursing Electronic Health Record (EHR). A typical SSA statement would be: “Sensor patterns consistent with depression were detected last week”. Our SSA summarization strategy is based on our previous work in early illness detection [Haji13,14,15]. While LPS describe a resident’s behavior, SSA provide a possible clinical interpretation for it. This type of data summarization will simplify the delivery of health change information to clinicians.
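As an illustration of how a soft template can carry uncertainty, the sketch below scores a protoform of the form "Q of the nights had P" using the classic Zadeh-style calculus of quantified statements, in which the truth value is the quantifier membership of the mean summarizer membership. This is not necessarily the truth-value computation used in our system; the membership breakpoints, the function names and the example data are hypothetical and chosen only for demonstration.

```python
from typing import Callable, Sequence


def ramp(x: float, lo: float, hi: float) -> float:
    """Piecewise-linear membership: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def high_restlessness(events: float) -> float:
    """Assumed summarizer P: degree to which a nightly count is 'high'."""
    return ramp(events, 10.0, 30.0)


def most(proportion: float) -> float:
    """Assumed quantifier Q: degree to which a proportion counts as 'most'."""
    return ramp(proportion, 0.3, 0.8)


def truth_value(
    values: Sequence[float],
    summarizer: Callable[[float], float],
    quantifier: Callable[[float], float],
) -> float:
    """Truth of 'Q of the values are P' = Q(mean of P over the values)."""
    if not values:
        raise ValueError("at least one observation is required")
    mean_membership = sum(summarizer(v) for v in values) / len(values)
    return quantifier(mean_membership)


def summarize(values: Sequence[float]) -> str:
    """Fill the soft template and attach its truth value."""
    truth = truth_value(values, high_restlessness, most)
    return f"On most nights the resident had high bed restlessness (truth {truth:.2f})."


if __name__ == "__main__":
    # Hypothetical nightly restlessness counts for a two-week window.
    nightly = [5, 12, 30, 35, 28, 40, 8, 33, 31, 29, 36, 27, 38, 34]
    print(summarize(nightly))
```

A summary whose truth value falls below a chosen cutoff would simply not be reported, which is how the template stays "soft" rather than asserting a crisp rule.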

The specific aims of this study are:

Aim 1: Question: What is an effective format of the summaries for clinician interpretation? We will build the alert terminology and templates (protoforms) for natural language generation of summaries. The alert terminology will be based on UMLS concepts. We will validate the alert terminology and templates using clinician input from focus groups.

Aim 2: Question: Do linguistic summaries improve clinical outcomes in nursing homes? We will evaluate the summaries’ accuracy using a randomized two-group design study (alerts with/without summarization). We will track health outcomes such as hospitalizations, SF-12, gait speed, etc., enabled by the R01 study, to compare the changes in health status between the two groups.

Aim 3: Question: Do summaries save clinicians time? We will evaluate this by tracking the time clinicians need to interpret the summaries, compared with the regular graphical alerts.


The figure below presents a system that summarizes the data generated by three sets of sensors over a two-week period. The three plots show the bathroom motion, motion density and bed restlessness of an older adult living in an apartment equipped with a network of sensors. Bathroom motion and motion density represent the daily activity captured in the bathroom and the overall motion in the apartment, respectively, as measured by motion sensors. Bed restlessness is the amount of activity on the bed, measured by a bed sensor placed under the mattress. The text boxes show the summaries produced automatically by our system.

The bathroom motion tonight is a lot higher than many of the nights in past 2 weeks with an increasing trend in the past 3 nights. The motion density tonight is a lot higher than many of the nights in past 2 weeks.
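A minimal sketch of how a comparison summary with similar wording could be produced from a two-week series is given below. For brevity it uses crisp thresholds in place of fuzzy memberships, so it is a simplification and not the system's actual method; the margin factor, the quantifier cutoffs and the function name are assumptions made for illustration.

```python
from typing import Sequence


def comparison_summary(
    name: str,
    history: Sequence[float],
    tonight: float,
    period: str = "2 weeks",
    margin: float = 1.5,
    trend_nights: int = 3,
) -> str:
    """Compare tonight's value with earlier nights (history is oldest first)."""
    if not history:
        raise ValueError("history must contain at least one night")
    if trend_nights < 2:
        raise ValueError("trend_nights must be at least 2")

    # Fraction of past nights that tonight exceeds by the assumed margin factor.
    exceeded = sum(1 for v in history if tonight > margin * v) / len(history)

    # Assumed crisp cutoffs mapping the fraction to a quantifier word.
    if exceeded >= 0.6:
        quantifier = "many"
    elif exceeded >= 0.3:
        quantifier = "some"
    else:
        quantifier = "few"

    sentence = (
        f"The {name} tonight is a lot higher than {quantifier} "
        f"of the nights in the past {period}"
    )

    # Trend check: the most recent nights, ending with tonight, strictly increase.
    recent = list(history[-(trend_nights - 1):]) + [tonight]
    if len(recent) == trend_nights and all(a < b for a, b in zip(recent, recent[1:])):
        sentence += f" with an increasing trend in the past {trend_nights} nights"

    return sentence + "."


if __name__ == "__main__":
    # Hypothetical bathroom-motion counts for the 13 nights before tonight.
    past = [10] * 11 + [12, 20]
    print(comparison_summary("bathroom motion", past, tonight=30))
```

Replacing the crisp cutoffs with membership functions would let each sentence carry a truth value, so that only sufficiently true summaries are shown to the clinician.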


Jain A & Keller JM, “On the Computation of Semantically Ordered Truth Values of Linguistic Protoform Summaries,” Proc., IEEE Intl. Conf. on Fuzzy Systems, Istanbul, Turkey, August 2015, pp 1-8.

Jain A & Keller JM, “Textual Summarization of Events Leading to Health Alerts,” Proc., IEEE Intl. Conf. of the Engineering in Medicine and Biology Society, Milan, Italy, August 2015, pp 7634-7637.

Jain A, Keller JM & Bezdek JC, “Quantitative and Qualitative Comparison of Periodic Sensor Data,” Proc., IEEE-EMBS Intl. Conf. on Biomedical and Health Informatics (BHI), Las Vegas, NV, February 2016, pp 37-40.

Wilbik A & Keller J, “A Fuzzy Measure Similarity between Sets of Linguistic Summaries,” IEEE Transactions on Fuzzy Systems, 2013, 21(1):183-189.

Wilbik A & Keller J, “A Distance Metric for a Space of Linguistic Summaries,” Fuzzy Sets and Systems, 2012, 208:79-94.

Wilbik A, Keller J & Bezdek J, “Generation of Prototypes from Sets of Linguistic Summaries,” Proc., 2012 IEEE World Congress on Computational Intelligence, Brisbane, Australia, June 10-15, 2012, pp 472-479.

Wilbik A, Keller J & Alexander G, “Linguistic Summarization of Sensor Data for Eldercare,” Proc., IEEE International Conference on Systems, Man, and Cybernetics, Anchorage, AK, October 9-12, 2011, pp 2595-2599.

Alexander GL, Wilbik A, Keller JM & Musterman K, “Generating Sensor Data Summaries to Communicate Change in Elder’s Health Status,” Applied Clinical Informatics, 2014, 5(1):73-84.