Feasibility of Using Zero-Shot Learning in Transformer-Based Natural Language Processing Algorithm for Key Information Extraction from Head and Neck Tumor Board Notes

Document Type

Conference Proceeding

Publication Date


Publication Title

Int J Radiat Oncol Biol Phys


Purpose/Objective(s): Natural language processing (NLP) technology has the potential to automate information aggregation and summarization in oncology; one example is the automated creation of patient registries. In this work, we aim to show (1) the feasibility of using modern NLP algorithms to extract key information from tumor board notes and (2) the impact of prompt engineering on the quality of the results.

Materials/Methods: In this IRB-approved study, we obtained the text of head and neck tumor board notes for 306 unique patients. Five key pieces of information used to create a patient registry were predefined: age, gender, tumor histology, tumor stage, and primary tumor location. The NLP algorithm was a modified Text-To-Text Transfer Transformer (T5) model that was pretrained on the Colossal Clean Crawled Corpus (C4) dataset and subsequently fine-tuned on the Stanford Question Answering Dataset (SQuAD) to perform the downstream task of extractive question answering. The model and its trained weights were obtained from the Hugging Face platform. During inference, the full text of a tumor board note and a related question were given as inputs, and the model generated a text sequence in response to the question. Two sets of questions with similar semantic meaning were used. The questions in prompt set #1 were “What is the gender?”, “What is the age?”, “What is the type of carcinoma in pathological diagnosis?”, “What is the stage?”, and “Where is the carcinoma located at?”. The questions in prompt set #2 were “Is the patient male or female?”, “How old is the patient?”, “What kind of cancer?”, “What is the cancer stage?”, and “What is the tumor location?”. Each model-predicted response was compared with the ground truth extracted from the tumor board notes. A response was classified as true if it was consistent with the ground truth; otherwise, it was deemed false. The response accuracy for each question was then calculated.

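
The evaluation procedure described in Materials/Methods can be illustrated with a minimal Python sketch. It is not the study's code. The question wording is copied from the abstract, but everything else is an assumption made for illustration: the field keys, the function names, the `question: ... context: ...` input layout (the convention used for SQuAD in the original T5 setup, which may differ for the modified checkpoint used here), and the normalized exact-match rule, which is only a stand-in for the abstract's "consistent with the ground truth" judgment. The `answer` argument is a hypothetical callable that wraps whichever question-answering model is used.

```python
import string
from typing import Callable, Dict, List

# Question wording copied from the abstract; the field keys are illustrative labels.
PROMPT_SETS: Dict[str, Dict[str, str]] = {
    "set_1": {
        "gender": "What is the gender?",
        "age": "What is the age?",
        "histology": "What is the type of carcinoma in pathological diagnosis?",
        "stage": "What is the stage?",
        "location": "Where is the carcinoma located at?",
    },
    "set_2": {
        "gender": "Is the patient male or female?",
        "age": "How old is the patient?",
        "histology": "What kind of cancer?",
        "stage": "What is the cancer stage?",
        "location": "What is the tumor location?",
    },
}


def build_input(question: str, note: str) -> str:
    """Combine a question and a full note into one model input (assumed layout)."""
    return f"question: {question} context: {note}"


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def accuracy_by_question(
    notes: List[str],
    truths: List[Dict[str, str]],
    prompts: Dict[str, str],
    answer: Callable[[str], str],
) -> Dict[str, float]:
    """Return the fraction of notes answered correctly for each field.

    A response counts as true when it equals the ground truth after
    normalization; this is an assumed stand-in for the manual judgment.
    """
    if not notes or len(notes) != len(truths):
        raise ValueError("notes and truths must be non-empty and equally long")
    correct = {field: 0 for field in prompts}
    for note, truth in zip(notes, truths):
        for field, question in prompts.items():
            predicted = answer(build_input(question, note))
            if normalize(predicted) == normalize(truth[field]):
                correct[field] += 1
    return {field: correct[field] / len(notes) for field in prompts}
```

Calling `accuracy_by_question` once with `PROMPT_SETS["set_1"]` and once with `PROMPT_SETS["set_2"]` on the same notes gives two per-question accuracy dictionaries that can be compared side by side, which is the kind of prompt-wording comparison the abstract describes.
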
Results: The median number of words per tumor board note was 448 (range, 219–1505). The accuracy of the NLP algorithm for each question in each prompt set is reported in Table 1. Algorithm performance was higher for extracting objective information such as age, gender, and histology. In addition, questions with similar semantic meaning but different wording can lead to significantly different results.

Conclusion: We demonstrated that a transformer-based extractive question-answering NLP algorithm can be used successfully to extract information from head and neck tumor board notes with zero-shot learning. Furthermore, our results highlight the importance of prompt engineering when applying NLP to this task. Future work on fine-tuning these algorithms on oncology-specific text could improve performance on more difficult tasks.

First Page