A new study by University of Leicester academics has shown that patients with lower-severity trauma could be more likely to die two to three weeks after admission.
Using data from the largest trauma database in Europe, the Trauma Audit and Research Network (TARN) database, Dr Evgeny Mirkes, Professor Tim Coats, Professor Jeremy Levesley and Professor Alexander Gorban analysed 165,559 trauma cases, 19,289 of which had an unknown outcome.
The problem of multipeak mortality is well known in trauma research. For the whole TARN dataset, the team found that the coefficient of mortality decreases monotonically over time. However, for lower severities, the coefficient of mortality is a non-monotonic function which may have maxima at the second and third weeks. This means that while the probability of dying for all trauma cases taken together consistently decreases over time, there are peaks in the probability of death at around 14 days and 21 days after admission for those with less severe trauma who remain in hospital.
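The difference between a monotonically decreasing and a multipeak mortality curve can be illustrated with a minimal Python sketch. It assumes the coefficient of mortality for a given day is the fraction of patients still in hospital at the start of that day who die on that day; the paper's exact definition may differ. The function names and the toy records are hypothetical and invented purely for illustration; they are not TARN data.

```python
from collections import Counter


def mortality_coefficients(records, horizon):
    """Return the per-day coefficient of mortality for days 1..horizon.

    records: list of (day_left, died) pairs, where day_left is the day the
    patient left hospital (by death or discharge) and died is a bool.
    Assumption: coefficient = deaths on the day / patients in hospital
    at the start of that day.
    """
    deaths = Counter(day for day, died in records if died)
    exits = Counter(day for day, _ in records)
    at_risk = len(records)
    coefficients = []
    for day in range(1, horizon + 1):
        coefficients.append(deaths[day] / at_risk if at_risk else None)
        at_risk -= exits[day]  # patients who left are no longer at risk
    return coefficients


def is_non_increasing(values):
    """True if the defined values never rise from one day to the next."""
    defined = [v for v in values if v is not None]
    return all(later <= earlier for earlier, later in zip(defined, defined[1:]))


# Made-up toy cohort of 40 patients (not real data).
toy_records = (
    [(1, True)] * 4 + [(1, False)] * 16    # day 1: 4 deaths among 40 in hospital
    + [(2, True)] * 1 + [(2, False)] * 9   # day 2: 1 death among 20 in hospital
    + [(3, True)] * 2 + [(3, False)] * 8   # day 3: 2 deaths among 10 in hospital
)

toy_coefficients = mortality_coefficients(toy_records, horizon=3)
has_later_peak = not is_non_increasing(toy_coefficients)
print(toy_coefficients, has_later_peak)
```

In this toy cohort the coefficient falls from day 1 to day 2 and then rises on day 3, so the curve is non-monotonic, which is the kind of shape the study reports for lower severities.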
In their paper "Handling missing data in large healthcare dataset: a case study of unknown trauma outcomes", published in the journal Computers in Biology and Medicine, they also demonstrate that unknown trauma outcomes are not missing "completely at random" and show that these cases cannot be excluded from the analysis, despite the large amount of data available.
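One common way to check whether values are missing "completely at random" is to test whether missingness is statistically independent of an observed characteristic of the case. The Python sketch below uses a Pearson chi-square test on a 2x2 table for this purpose. It is a generic illustration and not necessarily the method used in the paper; the function name, the choice of inter-hospital transfer as the characteristic, and all counts are hypothetical.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table.

    table: [[a, b], [c, d]], rows are groups, columns are
    (outcome unknown, outcome known).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    return statistic


# Critical value of the chi-square distribution, 1 degree of freedom, 5% level.
CRITICAL_VALUE_DF1 = 3.841

# Made-up counts (not real data): rows are transferred / not transferred
# patients, columns are unknown / known outcome.
toy_table = [[30, 70],
             [10, 190]]

toy_statistic = chi_square_2x2(toy_table)
# If the statistic exceeds the critical value, missingness depends on the
# characteristic, so the data are not missing completely at random.
missing_completely_at_random_rejected = toy_statistic > CRITICAL_VALUE_DF1
print(missing_completely_at_random_rejected)
```

When such a test rejects independence, dropping the cases with unknown outcomes would leave a sample that no longer represents the full patient population, which is why they cannot simply be excluded.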
Handling missing data is one of the main tasks in data pre-processing, especially in large public service datasets. In healthcare, the problems with data stem not from their size but from their quality. As human inspection at this scale is impossible, there is a pressing need for intelligent tools to control data accuracy and believability.
One of the reasons for data incompleteness is fragmentation, which is unavoidable due to the diverse structure of the health service. Although the problem of handling missing values in large healthcare datasets is not yet completely solved, the approach developed by the team of Leicester academics can be applied to various healthcare datasets that suffer from lost patients, inter-hospital transfers and missing outcomes.