A team of UC Riverside computer scientists has received a $1.2 million grant from the National Science Foundation to mine data from pediatric intensive care units. Leading the effort, UCR computer science Professor Eamonn Keogh is working with computer engineering professors Walid Najjar and Vasilis Tsotras to identify binary relationships in the data, which could save lives and reduce healthcare costs. Keogh is also working closely with Dr. Randall Wetzel of Children’s Hospital in Los Angeles, as well as graduate student David Kale.
By creating fast, specialized software, Keogh’s team hopes to use algorithms to find patterns that will give doctors another source of information for treating patients. Keogh plans to use a process called machine learning, in which a computer improves at a task as it gains experience. He compared this to the way an email inbox learns to filter spam. The work has two basic phases: mining and monitoring.
Normally, vital sign data such as heartbeat, respiration rate and temperature are measured by sensors hooked up to a patient. Most of the data collected is discarded, since physicians typically record only a few numbers, such as a patient’s daily temperature.
This data, however, is on record with Dr. Wetzel at Children’s Hospital, which has kept an archive of it for 10 years. The collection includes sets of data called time series, which track a patient’s health over a period of time. Keogh said the easy part of the project is encoding the algorithms in the computer software at the hospital and having it constantly monitor for a rule. The challenging part of the process is actually finding the algorithms.
“You want to find rules in data,” said Keogh. “Most rules in the data, though, are either wrong or trivial.” For example, one might find that people who have babies tend to be female. Although perfectly valid, this is a rule that is already known.
Keogh and his team will display the results of all mined data as estimated percentages on charts. Some degree of uncertainty is common in health examinations, because a number of external factors may affect the results. “The initial monitoring will occur by replicating patients through computer simulations, in which data will be gathered and monitored in real time,” said Keogh. Afterwards, Keogh and his team can decide whether they have found something that should be brought to the attention of the medical community.
“Telling the algorithm what rule to find and containing that rule is really the great challenge actually,” said Keogh. Once the team proposes a rule in a published paper, other researchers may do follow-up studies to validate it. Keogh promotes a philosophy of reproducibility and said that, after validation, the world can do what it wants with the information. How did the team win approval for its grant proposal? Early stages of experimentation involved the use of insect prototypes.
Most notably, Keogh and his team received a best paper award this year for a paper that examined a set of a trillion data points, the largest data mining paper ever published. The team is well on its way to starting research that it hopes will improve the quality of pediatric care. “This kind of thing is very close to our hearts, that we can make a medical difference. [And] my key to research is, if you can find something you’re passionate in, it’s not work, it’s basically fun at that point,” said Keogh.