The role of large language models in advancing head and neck cancer research and care: a narrative review
Introduction
Head and neck cancer (HNC) comprises a broad class of primary malignant neoplasms located in the oral cavity and throat, hypopharynx, nasopharynx, oropharynx, lips, nasal cavity and paranasal sinuses, along with salivary gland malignancies among others, and is a notable global health concern (1,2). Despite advances in treatment strategies, 5-year survival rates for HNC have not improved significantly over the past few decades; thus, new strategies are needed for the diagnosis, treatment, and management of these diseases (3,4).
Artificial intelligence (AI) has seen a recent breakthrough in large language models (LLMs), which are gradually transforming several fields, including medical research (5). AI-powered LLMs such as OpenAI’s Generative Pre-trained Transformer (GPT) series can understand and generate human-like text, which opens up a wide range of applications in medicine, including processing data, communicating with patients, and synthesizing the literature (5,6). In addition, the incorporation of LLMs into health care is likely to improve prediction models, helping physicians and other caregivers attend to their patients more effectively (6,7).
The aim of this literature review is to explore the emerging role of LLMs within HNC. In this paper, we review the current usage, the possible advantages, and the issues tied to applying these advanced tools to HNC. Moreover, we outline how LLMs may be used to support more targeted patient care and identify possible avenues for future research. We present this article in accordance with the Narrative Review reporting checklist (available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-152/rc).
Methods
Literature search strategy
A literature review was performed to assess the role of LLMs in HNC using major electronic databases, including PubMed, ScienceDirect, and Web of Science, among others. The search was carried out using a combination of keywords: (large language model OR ChatGPT) AND (head and neck OR oropharynx OR oral OR thyroid) AND (cancer OR neoplasm). Articles were drawn from scholarly, open journal platforms and were limited to those written in English, with no restriction on year of publication.
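The Boolean structure of this search string can be made explicit in a short sketch. The code below is only an illustration of how the three keyword groups combine; the helper names `or_group` and `build_query` are hypothetical and are not part of the review's methods, and the sketch does not query any database.

```python
def or_group(terms):
    """Join alternative terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*groups):
    """Combine OR-groups with AND: a record must match one term from every group."""
    return " AND ".join(or_group(group) for group in groups)


# Keyword groups taken from the search strategy described above.
MODEL_TERMS = ["large language model", "ChatGPT"]
SITE_TERMS = ["head and neck", "oropharynx", "oral", "thyroid"]
DISEASE_TERMS = ["cancer", "neoplasm"]

query = build_query(MODEL_TERMS, SITE_TERMS, DISEASE_TERMS)
print(query)
```

Keeping each concept in its own list makes it easy to see that a record must mention a model term, an anatomical site term, and a disease term at the same time.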
Inclusion criteria
We selected the most recent and relevant research publications retrieved from scholarly databases that relate to the use of LLMs in the diagnosis, treatment, or management of HNC. Articles had to report the findings of the LLM application and include detailed information on the issue addressed and the methods applied.
Exclusion criteria
Our review included only peer-reviewed sources, abstracts, presentations, editorials, and commentaries. Studies of patients with cancers outside the head and neck region were excluded. We also excluded articles that did not use LLMs at all or that used only quantitative methods without explicit machine-learning techniques (Table 1).
Table 1
Items | Specification |
---|---|
Date of search | April 2nd, 2024 |
Databases and other sources searched | PubMed, ScienceDirect, Web of Science, among others |
Search terms used | (large language model OR ChatGPT) AND (head and neck OR oropharynx OR oral OR thyroid) AND (cancer OR neoplasm) |
Timeframe | Up to April 2024 |
Inclusion and exclusion criteria | Inclusion criteria: recent and relevant research on LLM in diagnosing, treating, or managing head and neck cancer; studies must outline LLM application findings and detailed methodologies. Exclusion criteria: only peer-reviewed sources, abstracts, presentations, editorials, and commentaries included; excluded studies on cancers outside the head and neck region; excluded articles without LLM use or using only quantitative methods without machine-learning techniques |
Selection process | L.L.S. and I.J.C.N. conducted the selection independently and M.A.L. was the third evaluator in case of disagreement |
LLM, large language model.
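The selection process in Table 1 (two independent reviewers, with a third evaluator consulted only on disagreement) follows a simple decision rule, sketched below. The function name `final_decision`, the record identifiers, and the votes are all hypothetical and serve only to illustrate the rule; they do not reproduce the review's actual screening data.

```python
def final_decision(reviewer_a, reviewer_b, arbiter=None):
    """Return the include (True) or exclude (False) decision for one record.

    reviewer_a, reviewer_b: independent decisions of the two reviewers.
    arbiter: decision of the third evaluator, used only when they disagree.
    """
    if reviewer_a == reviewer_b:
        return reviewer_a
    if arbiter is None:
        raise ValueError("reviewers disagree; a third evaluation is required")
    return arbiter


# Hypothetical votes: (reviewer A, reviewer B, third evaluator or None).
votes = {
    "record-1": (True, True, None),
    "record-2": (True, False, False),
    "record-3": (False, False, None),
    "record-4": (False, True, True),
}

included = [rid for rid, v in votes.items() if final_decision(*v)]
print(included)
```

Raising an error when the reviewers disagree and no third decision exists keeps unresolved records from slipping silently into either group.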
Data analysis
We conducted a qualitative descriptive review to compare findings across existing studies and to identify common trends, patterns, and variations in the use and effectiveness of LLMs for HNC. Findings were classified by type of LLM application, covering the evolution and capabilities of LLMs, their applications in healthcare, the intersection of LLMs and HNC, and data management and analysis, among others.
Results
LLM
LLMs use deep learning, a specific form of AI, to analyze and produce contextually relevant and grammatically fluent text. This can significantly enhance industries such as customer support, writing and editing services, and data analysis by providing complex and human-like natural language processing (NLP) capabilities (Figure 1).
Evolution and capabilities of LLMs
LLMs play a crucial role in advancing NLP. These models use machine learning and deep learning techniques to learn from vast amounts of textual data, leading to the creation of neural networks that excel in both understanding and generating human language (5,8). For example, GPT-3, with its 175 billion parameters, demonstrates a wide range of capabilities. It can proficiently respond to complex queries, summarize documents effectively, generate coherent text that aligns with the provided context, and even assist in writing and debugging computer code. These skills showcase the adaptability of LLMs across various fields such as research, content creation, software development, and more (5,7,8).
Current applications of LLMs in healthcare
In healthcare, LLMs are increasingly used for a variety of applications. They assist in clinical decision support, patient management, and medical research. For instance, LLMs can analyze electronic health records (EHRs) to identify patterns that predict disease progression, extract meaningful health information, and provide personalized treatment recommendations (8,9). Furthermore, they are employed in automating administrative tasks such as documentation, coding, and billing, thereby increasing operational efficiency (9,10).
Intersection of LLMs and HNC
The use of LLMs in HNC management is a promising advance for improving both depth of insight and quality of patient care. LLMs may assist in earlier detection of cancers through in-depth analysis of patient symptoms (11). They can review large amounts of research data, which may speed up drug development and clinical decision support and thereby accelerate scientific progress and the development of new therapeutic approaches (11,12). Furthermore, LLMs are increasingly attractive as resources for patient education, since they can draw on patient-level data to provide individual-specific information on disease self-management and treatment choices (12,13).
Data management and analysis
Given the complexity and high volume of patient data in HNC, LLMs are necessary for the management and analysis of these large datasets (11). LLMs are capable of automatically reading and interpreting unstructured text data from scientific articles, reports from clinical trials, and patient records, which can be helpful in gleaning insights or discovering patterns that are not easily discernible by human analysts (13,14). This enables researchers to connect disparate literature and clinical studies more quickly, ultimately contributing to the elucidation of cancer biology and therapeutic targets (14).
Predictive analytics
LLMs are being used for predictive analytics in HNC, drawing on historical data to provide risk estimates (5,14). By evaluating patterns in treatment approaches and patient responses from historical studies, such models can predict likely responses to a range of treatment approaches, probable side effects, and chances of survival (15). This predictive capability is most beneficial in diseases such as HNC, where early detection and accurate prediction can substantially affect outcomes (14-16).
Personalized medicine
In the context of personalized medicine, LLMs are crucial for improving patient-specific treatment options. LLMs are able to generate personalized therapeutic recommendations (5,8) by integrating genetic profiles as well as other data such as lifestyle information and historical health data from clinical and laboratory records. These individualized treatment strategies may include more recent applications, including targeted cancer therapies and immunotherapy, due to their enhanced efficacy and limited adverse effects compared to traditional methods (15). This selective deployment of LLMs not only enhances therapeutic efficacy but also greatly mitigates the guesswork in medication prescribing, thus promoting a more personalized and patient-centric healthcare paradigm (15,17).
Clinical decision support
LLMs help clinicians by providing real-time medical information and patient-specific advice (18). These tools incorporate data from disparate sources, including real-time patient monitoring systems and recent research findings, to deliver more actionable information to the clinician at the point of care (19,20). In high-acuity, time-limited environments requiring rapid clinical decisions, this support can be vital (20).
Challenges and limitations
Despite their promise, the application of LLMs in HNC is not without limitations and ethical concerns. The handling of sensitive patient data raises significant privacy concerns, and robust safeguards for personal health information are required (5). AI models can inadvertently perpetuate systemic bias when diagnosing or treating particular patient demographics, so the performance of these systems should be continuously checked and refined. Integrating LLMs seamlessly into existing medical workflows also poses logistical challenges. Frequently, this entails substantial adaptation of healthcare IT systems and extensive training of medical personnel so that the tools can be applied in daily clinical practice (21,22). In addition, the fidelity and interpretability of AI-based clinical decisions are prerequisites for trust in AI medicine (12,16). These challenges underline the importance of thoughtful deployment and robust evaluation of LLMs within healthcare infrastructures to harness the benefits while mitigating potential harms (17).
Addressing the challenges
Given these issues, strong data security and strict regulatory standards for the use of AI in healthcare are important (23). There is a pressing need to reskill healthcare professionals to leverage AI through broad educational and training programs that teach them how to use these tools to the best of their abilities (5). Continuous monitoring and auditing of AI systems are also necessary to help ensure their accuracy and fairness (23,24). By thinking ahead, the AI Integrity Steering Committee is putting in place safeguards to reduce the possibility of future biases or inaccuracies in AI-enabled technologies used in patient care. These steps are vital for instilling trust and confidence in AI-driven healthcare solutions (24).
Discussion
LLMs have shown great potential to enhance HNC care. LLMs may be able to decrease the time to treatment initiation by removing steps from the diagnostic process, for example by automatically reading imaging findings to support a more accurate and timelier diagnosis through advanced data analysis and interpretation (5,7,8). However, their performance strongly depends on the quality of input data and the complexity of the models, which can vary greatly across different care settings. Additionally, while LLMs can process large volumes of medical literature and patient datasets for the inference of personalized treatment options, their deployment must be executed with caution. Issues related to data access, quality, publication bias, and relevance can call the validity of such capabilities into question (5,10,15).
Personalized regimens that address the varied genetic and molecular profiles of individual tumors are vital in the management of multifaceted oncological diseases. This focus on particular populations could improve therapeutic efficacy and reduce side effects, though data from actual practice supporting this anecdotally beneficial effect remain sparse (8,15). Current LLM implementations are constrained by the biases that may be present in any training data, which in turn can limit the generalizability of treatment recommendations across diverse patient populations (15).
In addition, LLMs play an important role in speeding up head and neck oncological research by automating the extraction and integration of insights from multiple studies and clinical trials (21). This may accelerate research and strengthen the rigor of findings, since LLMs may identify more refined patterns and correlations that human investigators might overlook (21-23). Nevertheless, reliance on automated systems may compromise the reproducibility of results and may miss critical context that a human researcher might catch (22).
LLMs can analyze treatment protocols against real-world data and clinical outcomes to continuously improve the treatment process. They can be used to model longitudinal data to predict patient response and chemotherapy side effects, helping to individualize doses while minimizing negative effects and maximizing patient outcomes (17). The dynamic nature of this calibration assists in adapting treatment strategies in response to changing evidence, an attribute that is especially important in a field such as HNC, in which new treatment modalities are introduced regularly (25,26). However, translating such findings into routine clinical practice remains difficult due to systems-level nuances in healthcare institutions and varying degrees of practitioners’ trust in AI-derived advice (23).
Moreover, LLMs provide health professionals with up-to-date, evidence-based recommendations that are individualized to the patient’s context and can help in the decision-making process. Nonetheless, their uptake should also be facilitated in a manner that supports, not subverts, clinical judgment (26).
The future horizons of LLMs in HNC are wide and uncertain. Several expected developments include combining LLMs with other AI approaches such as imaging AI, potentially creating synergistic AI aiding diagnostics and therapeutics (26). For example, while the text analysis capabilities of LLMs could be leveraged in combination with imaging AI to transform the diagnosis and surveillance of cancer, such combinations face significant technical and regulatory hurdles (16,17).
Another potential application is the incorporation of LLMs into predictive modeling for HNC, where their input may enable the prediction of outcomes of different treatment plans based on highly detailed patient models (19,27). This could help improve targeting performance and reduce the trial-and-error nature connected with cancer care (27,28). By supplying patient information in the process, LLMs could assist in decision-making and potentially improve outcomes (27). LLMs could similarly assist in tuning and personalizing treatment protocols in the complex setting of surgery (28). Nonetheless, the level of complexity needed for these models also means they would likely require rigorous validation and regulatory approval before they could be implemented broadly (16).
Future evolution of LLMs to understand and predict patient-specific responses to specific treatment combinations has the potential to allow the development of significantly more powerful real-time adaptive cancer treatment regimens (17-19). Such regimens would respond to the evolving pattern of harms and benefits in the patient’s drug experience, maximizing therapeutic efficacy and minimizing toxicity (27). For the practical use of these systems in clinical settings, system integration, ongoing healthcare provider education, and meticulous observation will be necessary to help ensure patient safety and treatment effectiveness (28).
Guidance on future research directions
The work concludes with recommendations for future research directions, emphasizing the need for interdisciplinary collaboration and technological advancements to overcome existing barriers and exploit emerging opportunities in areas such as those described below (Figure 2).
Interdisciplinary collaboration
Future research should focus on interdisciplinary collaboration between oncologists, data scientists, and AI experts. This interdisciplinary approach will enhance the development and application of LLMs in HNC, ensuring that the models are both clinically relevant and technologically advanced.
Enhanced data collection
Developing standardized protocols for collecting high-quality, comprehensive data sets, including genetic, environmental, lifestyle, and treatment response data, is essential. This will improve the training of LLMs and increase the accuracy and applicability of their predictive analytics.
Ethical and bias studies
Research focused on identifying and mitigating biases in LLMs is critical. Ensuring that these models perform equitably across diverse populations is essential for their ethical application in clinical settings.
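One simple way to check whether a model performs equitably is to compare its accuracy across patient subgroups. The sketch below is a minimal illustration under stated assumptions: the group labels, predictions, and reference labels are made-up toy values rather than results from any study, and the helper names `accuracy_by_group` and `largest_gap` are hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(rows):
    """rows: iterable of (group, prediction, reference). Returns {group: accuracy}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, reference in rows:
        total[group] += 1
        correct[group] += int(predicted == reference)
    return {group: correct[group] / total[group] for group in total}


def largest_gap(accuracies):
    """Difference between the best and worst subgroup accuracy."""
    values = list(accuracies.values())
    return max(values) - min(values)


# Toy data for illustration only (not real patient data or study results).
rows = [
    ("group_a", "refer", "refer"),
    ("group_a", "routine", "routine"),
    ("group_a", "refer", "refer"),
    ("group_a", "routine", "refer"),
    ("group_b", "refer", "refer"),
    ("group_b", "routine", "refer"),
    ("group_b", "routine", "refer"),
    ("group_b", "routine", "routine"),
]

per_group = accuracy_by_group(rows)
gap = largest_gap(per_group)
print(per_group, gap)
```

A large gap between subgroups would flag the model for closer review; accuracy is only one of several possible fairness measures, and the choice of metric and threshold is left to the study design.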
Longitudinal studies
Conducting longitudinal studies to track the outcomes and effectiveness of LLM-driven interventions in HNC can provide insights into their real-world impact and inform continuous improvements in the models.
Integration with other AI systems
Investigating the integration of LLMs with other AI technologies, such as diagnostic imaging tools and robotic surgery systems, could lead to more comprehensive AI-driven HNC care systems.
Training and education
Healthcare providers should receive training on the capabilities and limitations of LLMs. This education will help clinicians understand how to best utilize these tools in practice and how to interpret their outputs correctly.
Pilot programs
Before full-scale implementation, pilot programs should be conducted to assess the efficacy and safety of LLM applications in clinical HNC settings. Feedback from these programs can be used to make necessary adjustments and improvements.
Regulatory compliance
All applications of LLMs in clinical settings should comply with existing healthcare regulations and privacy laws. Developing new regulatory frameworks may also be necessary as these technologies advance.
Patient consent and transparency
It is crucial to maintain transparency with patients regarding the use of AI in their care, including the benefits and risks. Obtaining informed consent should be a standard practice when LLMs are involved in the treatment process.
Infrastructure development
Healthcare facilities should invest in the necessary IT infrastructure to support the integration of LLMs, ensuring that these systems are secure, reliable, and capable of handling large volumes of data.
Feedback mechanisms
Establishing mechanisms for continuous feedback from both healthcare providers and patients can help in monitoring the effectiveness of LLMs and addressing any issues promptly.
Conclusions
This area of research emphasizes the increasing role of LLMs in HNC management, with the potential to enhance diagnostic accuracy, personalize treatment regimens, and accelerate oncologic research. By processing and synthesizing vast medical literature and patient data, LLMs could provide powerful tools to improve the precision of HNC care. However, these considerable benefits come with caveats concerning data quality, model bias, and integration into broader healthcare frameworks. Secure validation that complies with regulatory requirements is crucial to ensure patient safety and treatment efficacy. Integrating LLMs with other AI technologies, such as imaging AI, holds great promise for improved diagnosis and treatment of HNC. Predictive models and real-time adaptive treatment regimens could significantly improve treatment precision and reduce the trial-and-error aspects of cancer therapeutics. Active research and collaboration among multidisciplinary teams are needed to surmount the challenges and harness the full potential of LLMs, ultimately improving patient outcomes and advancing HNC treatment.
Acknowledgments
Funding: This study was financed in part by
Footnote
Reporting Checklist: The authors have completed the Narrative Review reporting checklist. Available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-152/rc
Peer Review File: Available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-152/prf
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-152/coif). L.L.S. serves as the unpaid Associate Editor-in-Chief of Journal of Medical Artificial Intelligence from February 2024 to January 2026. The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Barsouk A, Aluru JS, Rawla P, et al. Epidemiology, Risk Factors, and Prevention of Head and Neck Squamous Cell Carcinoma. Med Sci (Basel) 2023;11:42.
- Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 2021;71:209-49.
- Matos LL, Kowalski LP, Chaves ALF, et al. Latin American Consensus on the Treatment of Head and Neck Cancer. JCO Glob Oncol 2024;10:e2300343.
- Wei K, Fritz C, Rajasekaran K. Answering head and neck cancer questions: An assessment of ChatGPT responses. Am J Otolaryngol 2024;45:104085.
- de Souza LL, Fonseca FP, Martins MD, et al. ChatGPT and medicine: a potential threat to science or a step towards the future? J Med Artif Intell 2023;6:19.
- de Souza LL, Fonseca FP, Araújo ALD, et al. Machine learning for detection and classification of oral potentially malignant disorders: A conceptual review. J Oral Pathol Med 2023;52:197-205.
- de Souza LL, Lopes MA, Santos-Silva AR, et al. The potential of ChatGPT in oral medicine: a new era of patient care? Oral Surg Oral Med Oral Pathol Oral Radiol 2024;137:1-2.
- Shah NH, Entwistle D, Pfeffer MA. Creation and Adoption of Large Language Models in Medicine. JAMA 2023;330:866-9.
- Li H, Moon JT, Purkayastha S, et al. Ethics of large language models in medicine and medical research. Lancet Digit Health 2023;5:e333-5.
- The Lancet Digital Health. Large language models: a new chapter in digital health. Lancet Digit Health 2024;6:e1.
- Wu RT, Dang RR. ChatGPT in head and neck scientific writing: A precautionary anecdote. Am J Otolaryngol 2023;44:103980.
- Kuşcu O, Pamuk AE, Sütay Süslü N, et al. Is ChatGPT accurate and reliable in answering questions regarding head and neck cancer? Front Oncol 2023;13:1256459.
- Vaira LA, Lechien JR, Abbate V, et al. Accuracy of ChatGPT-Generated Information on Head and Neck and Oromaxillofacial Surgery: A Multicenter Collaborative Analysis. Otolaryngol Head Neck Surg 2024;170:1492-503.
- Guo E, Gupta M, Sinha S, et al. neuroGPT-X: toward a clinic-ready large language model. J Neurosurg 2024;140:1041-53.
- Liu J, Wang C, Liu S. Utility of ChatGPT in Clinical Practice. J Med Internet Res 2023;25:e48568.
- Sievert M, Conrad O, Mueller SK, et al. Risk stratification of thyroid nodules: Assessing the suitability of ChatGPT for text-based analysis. Am J Otolaryngol 2024;45:104144.
- Marchi F, Bellini E, Iandelli A, et al. Exploring the landscape of AI-assisted decision-making in head and neck cancer treatment: a comparative analysis of NCCN guidelines and ChatGPT responses. Eur Arch Otorhinolaryngol 2024;281:2123-36.
- Liu S, Wright AP, Patterson BL. Assessing the Value of ChatGPT for Clinical Decision Support Optimization.
- Liu S, Wright AP, Patterson BL, et al. Using AI-generated suggestions from ChatGPT to optimize clinical decision support. J Am Med Inform Assoc 2023;30:1237-45.
- Ferdush J, Begum M, Hossain ST. ChatGPT and Clinical Decision Support: Scope, Application, and Limitations. Ann Biomed Eng 2024;52:1119-24.
- Malik S, Zaheer S. ChatGPT as an aid for pathological diagnosis of cancer. Pathol Res Pract 2024;253:154989.
- Oon ML, Syn NL, Tan CL, et al. Bridging bytes and biopsies: A comparative analysis of ChatGPT and histopathologists in pathology diagnosis and collaborative potential. Histopathology 2024;84:601-13.
- Tan S, Xin X, Wu D. ChatGPT in medicine: prospects and challenges: a review article. Int J Surg 2024;110:3701-6.
- Preiksaitis C, Rose C. Opportunities, Challenges, and Future Directions of Generative Artificial Intelligence in Medical Education: Scoping Review. JMIR Med Educ 2023;9:e48785.
- Kim JK, Chua M, Rickard M, et al. ChatGPT and large language model (LLM) chatbots: The current state of acceptability and a proposal for guidelines on utilization in academic medicine. J Pediatr Urol 2023;19:598-604.
- Benary M, Wang XD, Schmidt M, et al. Leveraging Large Language Models for Decision Support in Personalized Oncology. JAMA Netw Open 2023;6:e2343689.
- Chiesa-Estomba CM, Lechien JR, Vaira LA, et al. Exploring the potential of Chat-GPT as a supportive tool for sialendoscopy clinical decision making and patient information support. Eur Arch Otorhinolaryngol 2024;281:2081-6.
- Durairaj KK, Baker O, Bertossi D, et al. Artificial Intelligence Versus Expert Plastic Surgeon: Comparative Study Shows ChatGPT "Wins" Rhinoplasty Consultations: Should We Be Worried? Facial Plast Surg Aesthet Med 2024;26:270-5.
Cite this article as: Souza LL, Correia-Neto IJ, Fonseca FP, Martins MD, Paes de Almeida O, Pontes HAR, Mariano FV, Santos-Silva AR, Araújo ALD, Khurram SA, Kowalski LP, Alzahem A, Hagag A, Vargas PA, Lopes MA. The role of large language models in advancing head and neck cancer research and care: a narrative review. J Med Artif Intell 2024;7:28.