Datafish Multiphase Data Mining Technique to Match Multiple Mutually Inclusive Independent Variables in Large PACS Databases
Recommended Citation
Kelley BP, Klochko C, Halabi S, and Siegal D. Datafish multiphase data mining technique to match multiple mutually inclusive independent variables in large PACS databases. J Digit Imaging 2016 Jun;29(3):331-6.
Document Type
Article
Publication Date
6-1-2016
Publication Title
Journal of digital imaging : the official journal of the Society for Computer Applications in Radiology
Abstract
Retrospective data mining has tremendous potential in research but is time- and labor-intensive. Current data mining software offers many advanced search features but is limited in its ability to identify patients who meet multiple complex, independent search criteria. Simple keyword and Boolean search techniques are ineffective when more complex searches are required, or when a patient must satisfy multiple mutually inclusive variables at once. This is particularly true when trying to identify patients who have a specific set of radiologic findings, or studies close together in time, across multiple imaging modalities. Another challenge in retrospective data mining is that considerable variation still exists in how imaging findings are described in radiology reports. We present an algorithmic approach to this problem and describe a specific use case in which we applied our technique to a real-world data set in order to identify patients who matched several independent variables in our institution's picture archiving and communication system (PACS) database.
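The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the kind of problem it describes, not the authors' Datafish implementation. It assumes a hypothetical in-memory list of report records (the field names `patient`, `modality`, `date`, and `text` are invented for this example and are not a PACS schema), one search criterion per imaging modality with a few wording variants, and a simple three-phase match: search each criterion independently, intersect the patient sets, then keep patients whose matching studies fall within a time window.

```python
import re
from datetime import date
from itertools import product

# Hypothetical report records; all values are made up for illustration.
REPORTS = [
    {"patient": "A", "modality": "XR", "date": date(2015, 1, 10),
     "text": "Nondisplaced fracture of the distal radius."},
    {"patient": "A", "modality": "MR", "date": date(2015, 1, 20),
     "text": "Marrow edema adjacent to the fracture line."},
    {"patient": "B", "modality": "XR", "date": date(2015, 2, 1),
     "text": "Acute fracture of the fifth metatarsal."},
    {"patient": "C", "modality": "XR", "date": date(2014, 1, 1),
     "text": "Healing fracture of the tibia."},
    {"patient": "C", "modality": "MR", "date": date(2015, 6, 1),
     "text": "Bone marrow oedema."},
]

# One criterion per modality; several patterns absorb wording variation.
CRITERIA = [
    {"modality": "XR", "patterns": [r"fracture"]},
    {"modality": "MR", "patterns": [r"edema", r"oedema"]},
]


def matches(report, criterion):
    """True if the report has the right modality and any pattern matches."""
    if report["modality"] != criterion["modality"]:
        return False
    return any(re.search(p, report["text"], re.IGNORECASE)
               for p in criterion["patterns"])


def find_patients(reports, criteria, window_days):
    """Return sorted patient IDs that satisfy every criterion within the window."""
    # Phase 1: for each criterion, collect matching study dates per patient.
    hits = []
    for criterion in criteria:
        by_patient = {}
        for report in reports:
            if matches(report, criterion):
                by_patient.setdefault(report["patient"], []).append(report["date"])
        hits.append(by_patient)
    if not hits:
        return []

    # Phase 2: keep only patients who appear under every criterion.
    common = set(hits[0]).intersection(*hits[1:])

    # Phase 3: require one study per criterion, all within window_days.
    result = []
    for patient in sorted(common):
        for combo in product(*(h[patient] for h in hits)):
            if (max(combo) - min(combo)).days <= window_days:
                result.append(patient)
                break
    return result


if __name__ == "__main__":
    print(find_patients(REPORTS, CRITERIA, window_days=30))
```

In this toy data set, patient B is dropped in the second phase (no MR match) and patient C in the third (the two studies are more than 30 days apart). Widening `window_days` would bring C back in, which shows why the time constraint cannot be expressed as a plain keyword or Boolean query over individual reports.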
Medical Subject Headings
Data Mining; Database Management Systems; Humans; Radiology Information Systems; Software
PubMed ID
26572132
Volume
29
Issue
3
First Page
331
Last Page
336