Improving documentation quality and patient interaction with AI: a tool for transforming medical records—an experience report
Highlight box
Key findings
• Voa demonstrated significant growth in document generation and user adoption. The data show that the platform has established itself as an effective tool, seamlessly integrating into clinical workflows.
• User satisfaction consistently improved, with a decline in detractors on the Net Promoter Score and an 84% score on the Customer Satisfaction Score.
• Voa has proven to be a scalable solution, and challenges remain for future refinements to further optimize its clinical utility.
What is known and what is new?
• Excessive documentation time contributes to physician burnout, medical errors, and reduced quality of patient care. AI tools, including digital scribes, have been shown to alleviate some of these challenges by automating the documentation process.
• Voa is the first AI tool tailored to Brazilian healthcare, successfully integrating into clinical workflows and improving both documentation efficiency and user satisfaction in a real-world setting.
What is the implication, and what should change now?
• The tool has been validated for scalability and adaptability in the Brazilian healthcare environment.
• Further refinements should focus on improving transcription accuracy, especially for Brazilian Portuguese medical terminology, and expanding customizable templates. Randomized controlled trials are important to solidify long-term clinical benefits and generalize findings across diverse healthcare settings.
Introduction
Background
For every hour physicians spend with patients, nearly two additional hours are spent on electronic health record (EHR) and desk work within the clinic day, with another one to two hours of personal time each night dedicated to additional computer and clerical tasks (1). Physicians spend 34% to 55% of their workday creating notes and reviewing medical records in the EHR (2). Primary care physicians spent an average of 16.5 minutes in face-to-face interactions with patients and 16.4 minutes on EHR tasks, including after-hours work, according to a study of 982 clinic visits (3). The 2020 Medscape National Physician Burnout and Suicide Report indicated a burnout rate of about 43%, with too many bureaucratic tasks topping the list of burnout causes (4). Healthcare workers, and especially perioperative clinicians, seem to be at particular risk for burnout (5).
Studies have shown that manual documentation during medical consults increases the risk of medical errors and decreases the overall quality of care. The repetitive use of “copy and paste” can cause severe adverse patient events by introducing inaccuracies and spreading outdated information; it also leads to discordant notes and produces long notes that mask essential clinical information (6). A mistake in copy-pasting is reported to account for over 36% of errors in data entry, which can have serious implications for patient safety (7).
Voa AI tool
To address these issues, an AI-driven tool called Voa was developed in Brazil to convert audio from medical consultations into optimized clinical documents. Voa combines automatic transcription technologies, like Whisper (8), with generative artificial intelligence to convert speech into text. A generative AI layer that corrects common transcription errors, adapts specific medical terms, and optimizes grammar then processes this text. It ensures the clinical and linguistic accuracy required in medical records. The tool further organizes and structures the text into a coherent medical document, regardless of the conversation’s sequence, facilitating seamless integration into electronic medical record systems. Also, Voa can be used on both mobile and desktop platforms, providing healthcare professionals with the flexibility to document consultations from any device.
Doctors can begin recording at the start of a patient visit and proceed with the consultation as usual. Once the recording is finished, the system automatically generates the document, which can be copied with a single click, edited, and printed. Voa’s capabilities extend beyond medical history documentation to include optimized prescriptions, medical certificates, referrals, examination requests, and clinical summaries. The clinical summary is particularly valuable, as it translates the consultation into language that is easily understandable for the patient, summarizing key points discussed, diagnoses made, orientations given, and treatment plans.
Step-by-step usage
Starting a consultation
Doctors initiate the process by clicking on the “Start Recording” button at the beginning of a consultation. This allows Voa to capture the entire conversation between the doctor and the patient. It is also possible to upload an audio file if necessary. A schematic of steps 1, 2 and 3 can be found in Figure 1.
Pausing and continuing
If needed, the recording can be paused and resumed. During the consultation, doctors can speak naturally without worrying about the order of information. The system is designed to handle parallel conversations and unrelated remarks seamlessly. Non-essential information and side conversations are filtered out, allowing the focus to remain on critical medical details.
Finishing the consultation
At the end of the consultation, doctors click “Generate Medical History”, and Voa generates the document.
Reviewing transcriptions
Additionally, there is a window with the transcription of the consultation that allows doctors to review what has been literally converted from the conversation between the doctor and the patient (see Figure 2). While the consultation is being recorded, the transcription is updated approximately every minute. Upon concluding the consultation and generating the medical history document, the complete literal transcription is also presented. This allows the doctor to compare what was spoken with what has been organized and structured into the document.
Adding annotation
Doctors can type notes in this annotation field (see Figure 2) while recording, and when generating the document, Voa will consider both the annotations and the audio. These notes can include specific instructions or additional information that was not covered during the audio recording, such as clinical information about the patient, including medication lists, previous medical records, and examination lists. Voa incorporates these notes into the final structured document, to capture all relevant information.
Medical history
Voa organizes the recorded audio into a structured document, which includes sections such as Chief Complaint, History of Present Illness, Lifestyle Habits, Past Medical History, Vaccination History, Current Medications, Family History, Physical Examination, Vital Signs, Test Results, Diagnostic Hypotheses, Prescribed Medications, Recommendations, Additional Tests, Medical Certificate, and Referral. Voa automatically completes these sections if the relevant information is mentioned during the consultation. If a topic is not discussed, it is simply left out of the document. An example of the generated document can be found in Figure 3.
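The omission logic described above, where a section appears only when the corresponding topic was discussed, can be sketched as follows. This is an illustrative Python sketch, not Voa’s actual implementation; the function name, the abbreviated section list, and the Markdown heading format are assumptions.

```python
# Hypothetical sketch: assemble only the sections that were actually
# discussed into a Markdown document, omitting empty ones.
# Abbreviated section list for illustration; the article names more.

SECTION_ORDER = [
    "Chief Complaint", "History of Present Illness", "Past Medical History",
    "Current Medications", "Physical Examination", "Diagnostic Hypotheses",
    "Recommendations",
]

def render_medical_history(extracted: dict) -> str:
    """Render extracted consultation data as Markdown, skipping
    sections that were not mentioned during the consultation."""
    parts = []
    for section in SECTION_ORDER:
        content = extracted.get(section)
        if content:  # topics not discussed are simply left out
            parts.append(f"## {section}\n{content}")
    return "\n\n".join(parts)

doc = render_medical_history({
    "Chief Complaint": "Headache for 3 days.",
    "Current Medications": "Dipyrone 500 mg as needed.",
})
```

Rendering to Markdown also matches the output format described later in the architecture section, where the document is shown to the user in a WYSIWYG editor.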
Regenerating the document
Doctors can add additional notes even after the medical history has been generated (see Figure 3). Once the annotations are added, the doctor can regenerate the document. Voa incorporates these notes into a new structured document, ensuring all relevant information is reorganized.
Finalizing and integrating
The final document is organized and structured in medical language appropriate for clinical records. It can be edited if necessary and then securely copied and pasted into the medical record system with a single click, ensuring all patient data is accurately recorded and easily accessible. It is also possible to print the document with a single click (see Figure 3).
Document sharing
It is possible to share documents with other Voa users by generating a link (see Figure 3). Doctors can select to allow editing or restrict the document to view-only, facilitating better case discussions and collaboration among medical professionals.
Additional document generation
Besides medical history, Voa can generate other essential documents such as Prescription, Medical Certificate, Referral, Examination Request, and Clinical Summary. These are generated after the medical history: if the doctor discussed the corresponding topics during the consultation and the medical history contains relevant information on them, Voa can subsequently generate these additional documents, organizing everything accurately.
System architecture and technologies used
The system architecture incorporates several components and is continuously improved to meet the evolving needs of healthcare professionals. First, data capture is performed securely and in real-time, ensuring that audio from consultations is immediately available for processing. The initial transcription is handled by Whisper, which provides a literal conversion of speech to text. Subsequently, templates developed by Voa, associated with a large language model (LLM) based on GPT-4o and adapted to the needs of various specialties, process this raw transcription through an AI layer that refines the text to correct errors and adapt it to medical language, considering the nuances of Brazilian Portuguese. The refined transcription is then combined with any additional annotations provided by the physician, and the AI processes both together to generate a comprehensive, organized, and structured medical document.
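The pipeline described above can be summarized in a minimal sketch. This illustrates the data flow only, not Voa’s implementation: the transcription and refinement steps are stubbed out, since in production they would call Whisper and the GPT-4o-based LLM layer.

```python
from dataclasses import dataclass

# Illustrative sketch of the pipeline described above. The function
# names and the stubbed transcribe/refine steps are assumptions.

@dataclass
class Consultation:
    audio: bytes          # raw consultation audio
    annotations: str = "" # free-text notes typed by the physician

def transcribe(audio: bytes) -> str:
    """Stand-in for Whisper: literal speech-to-text conversion."""
    return "paciente relata cefaleia ha tres dias"  # placeholder output

def refine(transcript: str, annotations: str, template: str) -> str:
    """Stand-in for the generative AI layer: correct transcription
    errors, adapt medical terminology (Brazilian Portuguese), and
    structure the text according to a specialty template."""
    merged = transcript if not annotations else f"{transcript}\n{annotations}"
    return f"[{template}]\n{merged}"

def generate_document(c: Consultation, template: str = "general") -> str:
    raw = transcribe(c.audio)                    # step 1: literal transcript
    return refine(raw, c.annotations, template)  # step 2: LLM refinement

doc = generate_document(Consultation(audio=b"...", annotations="BP 120/80"))
```

Keeping transcription and refinement as separate stages mirrors the two-layer design described in the text, where a literal transcript is produced first and only then adapted to clinical language.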
To improve accuracy and avoid potential issues such as hallucinations, where AI-generated content can include incorrect or misleading information, Voa’s algorithms are continuously refined based on user feedback. These improvements focus on enhancing the precision of medical terminology, context understanding, and document structure, ensuring the tool produces reliable and accurate clinical records. Regular updates based on clinical usage help minimize the occurrence of such errors, making the system more reliable for clinical applications.
The platform’s development involved extensive training and testing with diverse audio samples to improve its performance in real-world scenarios. The output is formatted in Markdown and rendered for the user in a WYSIWYG editor. Audio recordings are anonymized and encrypted in accordance with the Brazilian General Data Protection Law (LGPD) to ensure the security and privacy of patient information.
Rationale and knowledge gap
A comprehensive study by Stanford Medicine and Google Health involving 254 primary care providers (PCPs) examined their experiences with EHR documentation and attitudes towards AI-assisted documentation, aiming to identify the most time-consuming and burdensome aspects of EHR documentation and gauge PCPs’ preferences for AI assistance. Key findings indicated that AI-assisted documentation could alleviate clerical burdens, allowing providers to focus more on cognitive tasks and patient care, emphasizing that AI tools should be inconspicuous, efficient, and provide high-quality, accurate notes (9).
AI can free up physicians’ cognitive and emotional space for patients and shift the focus away from transactional tasks to personalized care (10). AI increases learning capacity and provides decision support systems at scales that are transforming the future of healthcare (11). Tools like ChatGPT use LLMs, multi-layer neural networks trained on large amounts of data to simulate human conversation (12). LLMs have already been used to interpret electronic medical record data (13) and demonstrate a high level of understanding of clinical dialogue structure, providing evidence of the potential of AI-assisted tools to reduce clinical documentation burden (14).
One study addressed the clinician burden of summarizing EHR data by adapting eight LLMs to four tasks: radiology reports, patient questions, progress notes, and doctor-patient dialogue. Summaries from well-adapted LLMs were rated equivalent (45%) or superior (36%) to those written by medical experts, suggesting LLMs can reduce documentation burdens and improve patient care (15). AI-generative technologies such as the Med-Gemini platform have demonstrated how highly capable, multimodal models specialized in medicine can significantly improve clinical reasoning and multimodal understanding in medical contexts (16).
Digital scribes or intelligent documentation support systems, leveraging advances in speech recognition, natural language processing, and artificial intelligence, automate the clinical documentation task currently conducted by humans. These tools offer a gateway into the clinical workflow for more advanced support for diagnostic, prognostic, and therapeutic tasks (17). The more seamless the digital scribe solution, the greater the support for clinician engagement with patients. Any digital scribe solution that requires ongoing input and supervision throughout the consultation will distract clinicians from patients and replace the distractions and disruptions of using an EHR with those of a digital scribe (18).
Several AI-driven tools are transforming medical documentation by enhancing efficiency and accuracy in clinical settings. For example, Freed captures patient interactions and generates SOAP notes, reducing the time clinicians spend typing by adapting to their unique documentation styles and integrating smoothly with EHR systems, allowing for easy copy-pasting of notes (19). Suki AI assists clinicians with real-time documentation by ambiently capturing patient interactions and generating detailed medical notes. It integrates deeply with major EHR systems, and prioritizes AI safety, minimizing the risk of errors before finalizing notes in the EHR (20). DeepScribe focuses on accurate transcription of complex medical dialogues, improving patient interaction quality by allowing clinicians to maintain eye contact during consultations (21). Nuance Dragon Medical One is a specialized speech recognition software for healthcare, efficiently transcribing spoken words into text. It streamlines clinical workflows by enabling fast and accurate documentation of patient records, including notes, prescriptions, and other medical documents (22).
Technologies like Voa emerge as solutions to optimize the documentation process, enhance clinician satisfaction, and reduce documentation time. This possibility is supported by studies demonstrating that AI can improve these tasks for doctors and alleviate issues related to performing numerous tedious, repetitive, and often difficult tasks, such as adding documentation to electronic medical records (23). It underscores the view that artificial intelligence will not replace doctors but will enhance their capabilities and their efforts to care for patients. Perhaps the only healthcare providers who will lose their jobs over time may be those who refuse to work alongside artificial intelligence (24).
Objective
This study aims to validate Voa as a generative AI tool, offered as a software as a service (SaaS) solution, designed to enhance medical record creation, reduce the bureaucratic burden on doctors, and minimize errors commonly associated with copy-paste practices. The study will evaluate key implementation metrics and provide an overview of the supporting technologies, with a focus on user acceptance and adoption among physicians. Additionally, we present an experience report on the tool’s implementation in Brazil, highlighting ongoing improvements driven by user feedback and clinical usage, within a methodological framework that can be adapted and refined in future research.
Methods
Implementation and evaluation
Metrics were collected and analyzed over the period from March 6 to August 5, 2024, focusing on doctors who created an account on Voa after being informed through advertisements and webinars. Quantitative metrics included document generation and user count, while qualitative metrics focused on user satisfaction and feedback, gathered through surveys like Net Promoter Score (NPS) and Customer Satisfaction Score (CSAT).
To analyze the quantitative metrics, we focused on key indicators, including the number of users per day, the cumulative user count, the number of documents generated daily, and the cumulative document count. Daily monitoring of the number of medical histories generated allowed us to track the tool’s usage volume continuously. Additionally, we assessed adherence levels by tracking both the number of new users per day and the cumulative number of users. The activation rate, defined as the percentage of new users who registered and generated at least one document within a week, was also analyzed to gauge how intuitive and effective the platform is in meeting user needs from the outset.
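The activation-rate definition above can be expressed as a short calculation. This is a hypothetical sketch: the input structures and function name are assumptions, not the SQL used in the study.

```python
from datetime import date, timedelta

# Minimal sketch of the activation-rate calculation described above.
# `registrations` maps user id -> registration date; `first_documents`
# maps user id -> date of that user's first generated document
# (absent if none). Both structures are hypothetical.

def activation_rate(registrations: dict, first_documents: dict,
                    window_days: int = 7) -> float:
    """Share of new users who generated at least one document
    within `window_days` of registering."""
    if not registrations:
        return 0.0
    activated = sum(
        1 for user, reg in registrations.items()
        if user in first_documents
        and (first_documents[user] - reg) <= timedelta(days=window_days)
    )
    return activated / len(registrations)

rate = activation_rate(
    {"a": date(2024, 3, 18), "b": date(2024, 3, 19), "c": date(2024, 3, 20)},
    {"a": date(2024, 3, 20), "c": date(2024, 4, 1)},
)  # "a" activated within 7 days; "b" never did; "c" activated too late
```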
For qualitative metrics, we implemented a comprehensive survey strategy utilizing UserPilot. We deployed the NPS survey to assess user satisfaction and the likelihood of recommending Voa to others. The NPS score was calculated based on responses to the question: “On a scale of 0 to 10, how likely are you to recommend Voa to a friend or colleague?” Respondents were categorized into three groups: Detractors [0–6], Passives [7–8], and Promoters [9–10]. Depending on the response, users were asked targeted follow-up questions to gather detailed feedback. Users who submitted the survey would see it again after 45 days, while those who chose to be asked later would see the survey after 7 days.
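The NPS banding and scoring just described can be expressed compactly. A minimal sketch, assuming the standard NPS formula (%Promoters − %Detractors); the example scores are illustrative, not study data.

```python
def nps(scores: list[int]) -> tuple[float, float, float, int]:
    """Classify 0-10 responses and return (%promoters, %passives,
    %detractors, NPS) using the bands from the survey: Detractors
    0-6, Passives 7-8, Promoters 9-10."""
    n = len(scores)
    promoters = 100 * sum(s >= 9 for s in scores) / n
    passives = 100 * sum(7 <= s <= 8 for s in scores) / n
    detractors = 100 * sum(s <= 6 for s in scores) / n
    return promoters, passives, detractors, round(promoters - detractors)

# Ten illustrative responses to "how likely are you to recommend Voa?"
p, pa, d, score = nps([10, 10, 9, 9, 9, 8, 7, 6, 5, 3])
```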
In addition to the NPS survey, we employed the CSAT Score to gauge overall user satisfaction. Users were asked, “How would you rate your experience with Voa so far?” on a scale of 1 to 5, with 1 being “Very Unsatisfied” and 5 being “Very Satisfied”. For CSAT, we collected 100% of responses daily and triggered the survey after a 3-second delay on a page. The survey was repeated every 2 days until dismissed twice by the user.
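The CSAT summary can be sketched in the same way. The threshold of 4 reflects how satisfaction is reported later in this article (ratings of 4 or 5 counted as satisfied); the exact aggregation performed by UserPilot is an assumption.

```python
def csat(ratings: list[int], satisfied_threshold: int = 4) -> float:
    """Percentage of 1-5 ratings at or above the threshold
    (assumed convention: 4s and 5s count as 'satisfied')."""
    return 100 * sum(r >= satisfied_threshold for r in ratings) / len(ratings)

# Five illustrative ratings; three of them are 4 or 5.
satisfaction = csat([5, 5, 4, 3, 1])
```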
Voa was officially launched on March 5, 2024, but prior to this, a small group of beta testers was given early access to the platform. While data collection technically began on February 26, 2024, for these early testers, the official analysis commenced from the launch date to align with the public release. Thus, the quantitative metrics and the CSAT score were evaluated from March 6 to August 5, 2024. The weekly activation rate, a key metric measuring the percentage of new users who generated at least one document within a week of registration, was calculated starting from March 18, 2024. This date was chosen because it provides a full one-week window from the official launch to accurately assess user engagement and behavior. Given that the activation rate is inherently tied to user registration dates, starting the analysis from this point ensures a standardized approach that reflects the behavior of users who joined the platform following its official release, thus offering a more consistent and reliable assessment. The NPS was analyzed from the beginning of April through the end of July to evaluate the platform’s performance on a monthly basis. This approach allowed us to track user satisfaction and engagement trends over consistent periods.
Statistical analysis
A descriptive analysis was conducted on both quantitative and qualitative data. For quantitative metrics, the average number of documents generated, user counts, and activation rates were extracted using SQL queries directly from the Postgres database and organized in an Excel sheet to generate the graphs. These data were used to calculate and visualize the key trends. The activation rate was calculated by tracking the percentage of new users who generated at least one document within 7 days of registration, with the calculation being updated every 7 days to reflect new user behavior. The qualitative data from the NPS and CSAT surveys were summarized using percentages and frequencies to provide insights into user satisfaction and sentiment over time. This analysis provided a comprehensive overview of user behavior and platform performance throughout the study period. The NPS score was calculated as (%promoters − %detractors) to determine the loyalty index.
Important considerations
There is still considerable debate regarding the validation of AI tools in healthcare, and a comprehensive study highlights several key elements that must be addressed. First, the tool must demonstrate its ability to address a critical clinical need, such as enhancing decision-making in areas where current judgment may be insufficient, such as early diagnosis of critical conditions. It should also provide clinically meaningful improvements in patient outcomes or care processes. Furthermore, external validation across diverse populations is crucial to ensure the tool’s safety and accuracy in various clinical settings. Finally, the AI tool should integrate seamlessly into clinical workflows, delivering outputs that are user-friendly, easily interpretable, and actionable for clinicians (25).
Moreover, the tool must be developed under a governance and regulatory framework that ensures continued safety and efficacy throughout its lifecycle. This includes compliance with software as a medical device regulations and continuous monitoring for performance shifts due to data or environment changes. Clinician trust is essential, which requires transparency regarding the tool’s data, assumptions, and limitations, as well as collaboration with interdisciplinary teams for its development and deployment. Ensuring the tool aligns with clinician workflows and enhances rather than replaces their judgment is also critical for its widespread adoption (25).
In choosing the NPS and CSAT for our study, we recognize these tools as valuable initial indicators of user sentiment and satisfaction. The Journal of the Academy of Marketing Science article acknowledges NPS as a popular measure, yet it emphasizes the importance of viewing it as one component within a broader feedback strategy (26). Also, a meta-analysis conducted by Mittal et al. (2023) highlights the significant impact of customer satisfaction on both customer-level outcomes, such as retention and word of mouth, and firm-level outcomes, emphasizing the critical role of CSAT in understanding and improving these key performance indicators (27). As customer feedback methods continue to evolve, incorporating NPS and CSAT allows us to capture immediate insights while remaining open to integrating more refined tools as they are developed.
It is important to highlight that the NPS and CSAT surveys conducted allowed users to provide feedback through comments. A comment section allowed users to provide further feedback, including criticisms, praises, and suggestions. These testimonials were carefully considered for the continuous updating and improvement of the platform. Additionally, meetings and calls were held with doctors to discuss their experiences and gather insights directly, ensuring a personal touch in the feedback process and fostering a collaborative environment for the platform’s development. Feedback collected through these surveys can also provide insights into whether the generated documentation was organized as expected and whether there were any instances of hallucinations, where LLMs might generate incorrect or misleading information in the documents.
This study did not involve traditional human subjects but focused on analyzing metrics related to the tool’s usage by voluntary doctors who integrated the system into their practice. All collected information was anonymized, ensuring no access to patient data or identifiable information, thus obviating the need for approval from a research ethics committee. The study prioritized evaluating an already implemented technology without compromising data privacy or security. The paper’s evaluations are based on tool usage metrics and data, not patient-specific information.
Results
Quantitative metrics
Figure 4 tracks the cumulative number of documents generated using the Voa platform from its inception on March 6 to August 5, 2024. The platform started with 15 documents on March 6, growing to 734 by the end of the month. The upward trend continued, with the total reaching 3,735 by the end of April. In May, document generation accelerated significantly, with the cumulative total reaching 8,707 by May 31. This growth persisted through June and July, surpassing 16,000 documents by July 8 and hitting 23,716 by August 1. As of August 5, 2024, the platform had generated a total of 24,654 documents.
Figure 5 tracks the cumulative number of users who registered on the Voa platform from March 6 to August 5, 2024. In the early days of March, the number of users started with slower growth, increasing from 7 users on March 6 to around 239 users by March 31. This growth rate appears relatively stable, with small daily variations. The graph during this period shows a slight incline, indicating that the service began to gain traction but at a moderate pace. From April onwards, the growth became more noticeable. For instance, on April 1, the cumulative number of users was 253, and by the end of the month, it had jumped to 633.
Between early May and mid-June, the graph shows a more constant pattern with a slightly less steep incline compared to the previous period. User numbers continued to rise steadily, but without significant peaks. For example, users grew from 649 at the beginning of May to 1,027 by the end of the month. However, from late June into July, the curve steepened again, reflecting a faster growth rate, with users increasing from 1,062 in early June to 1,305 by June 29. This accelerated pace continued through July and August, with the number of users reaching 2,006 by August 5.
Figure 6 illustrates the number of documents generated daily by the Voa platform from March 6 to August 5, 2024, along with a rolling average to highlight trends. The solid yellow line shows significant day-to-day variation in document generation, while the dashed red line represents the rolling average, smoothing fluctuations.
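The rolling average used to smooth the daily counts can be computed as a trailing-window mean. Whether the authors used a trailing or centered window is not stated, so this sketch assumes a trailing 7-day window that shortens at the start of the series.

```python
def rolling_average(daily_counts: list[float], window: int = 7) -> list[float]:
    """Trailing moving average over `window` days; early days use
    however many observations are available so far."""
    out = []
    for i in range(len(daily_counts)):
        lo = max(0, i - window + 1)
        chunk = daily_counts[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Three illustrative daily counts, not study data.
smoothed = rolling_average([0, 7, 14])
```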
In early March 2024, document generation was low, ranging between 0 and 15 documents per day during the first week. By mid-March, activity surged, with a peak of 75 documents being generated on March 19, stabilizing the 7-day rolling average at around 40 documents per day. April marked significant growth, with the rolling average reaching 96 documents per day by April 11 and peaking at 129 by the end of the month. Daily generation reached a high of 208 documents on April 30.
In May, growth continued, highlighted by 308 documents generated on May 15, with a rolling average of 186 by May 20. June and July saw further increases, with a peak of 504 documents on July 31 and a rolling average of 363 on that date. Although there was a slight decrease in early August, with the daily count dropping to 74 on August 3 and the rolling average declining to 343, overall activity remained high, with 410 documents generated on August 5.
A notable weekly pattern is evident: document generation dipped on weekends. For instance, on April 13 and 14, numbers dropped to 6 and 58 documents, followed by a sharp increase to 175 on the following Monday. This pattern recurred across the data, such as on July 20 and 21 (92 and 41 documents) and August 3 and 4 (74 and 32 documents), with corresponding jumps on the subsequent weekdays (407 documents on July 22 and 410 on August 5).
Figure 7 shows the number of new user registrations per day on the Voa platform from March 6 to August 5, 2024. In the initial months from March to April, the daily user numbers showed significant fluctuation, with some days seeing as many as 23 new users (March 23) and others as few as 1 or 2 (March 10). From May to June, daily user growth began to stabilize, with fewer extreme fluctuations. The number of new users per day consistently ranged between 10 and 20, indicating a more predictable and steady growth phase, though occasional dips still occurred, such as on June 22 with only 2 new users.
In July and August, a noticeable surge in daily user growth occurred, with days reaching over 30 new users, particularly on July 28 and July 31 with 39 new users. This suggests the platform hit a tipping point, where it began attracting more attention and achieving stronger growth. The pattern of daily growth became more robust, with fewer low-activity days and a general trend of sustained user acquisition.
Figure 8 illustrates the weekly activation rate of new users on the Voa platform from March 18 to August 5, 2024. Initially, the activation rate fluctuated, starting at 0.31 on March 18 and slightly decreasing to 0.29 by March 25. By April 1, the rate improved to 0.41 and continued to vary, reaching 0.41 again on April 8. A significant increase followed, peaking at 0.58 on April 15 before dipping to 0.43 on April 22 and ending April at 0.47 on April 29.
In May 2024, the activation rate displayed an upward trend, starting at 0.48 on May 6, rising to 0.57 on May 13, and reaching its highest point of 0.63 on May 20. Although the rate slightly decreased to 0.53 by May 27, it remained consistently above 0.40. June and July 2024 showed strong, albeit variable, rates. Beginning at 0.55 on June 3, the rate peaked at 0.60 on June 24, before gradually decreasing to 0.53 by July 1. It climbed to 0.59 on July 8, before declining to 0.40 by July 29. The final data point on August 5, 2024, indicated a recovery to 0.48. Notably, after the initial increase in April, the activation rate consistently stayed above 0.40, demonstrating sustained user engagement throughout the period.
Qualitative metrics
NPS results
Figure 9 displays the NPS trends throughout the evaluation period, showing a progressive improvement in user sentiment. In April 2024, the NPS was 18, based on 11 responses, with 54.55% Promoters, 9.09% Passives, and 36.36% Detractors. By May 2024, the NPS had improved to 33, with 21 responses. Promoters accounted for 52.38%, Passives for 28.57%, and Detractors had decreased to 19.05%.
In June 2024, there was a substantial positive shift, with 26 responses resulting in an NPS of 62. Promoters increased to 73.08%, Passives decreased to 15.38%, and Detractors dropped further to 11.54%. By July 2024, the NPS slightly decreased to 58, with 45 responses. Promoters made up 64.44%, Passives 28.89%, and Detractors just 6.67%.
Across all periods, the overall NPS was 50, based on 103 total responses, with 63.11% Promoters, 23.3% Passives, and 13.59% Detractors. Of the responses, 17 included additional feedback. Among these, one comment suggested improving the tool’s audio capture during physical exams, while another recommended adding more document templates, such as for procedure indications, budgets, and standardized exams, or providing users with customization options for these forms. On the positive side, users highlighted that the tool was fast in generating documents, greatly assisted in streamlining their workflow, and praised its practicality and ease of use.
CSAT results
Figure 10 displays the CSAT results from 100 responses collected between March 6 and August 5, giving an overview of user satisfaction with the Voa platform. Most users, 54% (54 responses), rated their experience with the highest score of 5. Additionally, 30% (30 responses) gave a rating of 4, while 13% (13 responses) rated their experience with a 3. Lower satisfaction levels were less common, with only 1% (1 response) giving a rating of 2, and 2% (2 responses) giving the lowest rating of 1. Among the 100 respondents, 22 provided detailed feedback. Suggested improvements included the need for more guidance on creating prescriptions, examination requests, reports, and referrals independently. There were also mentions of the system occasionally confusing medication names, leading to the need for manual checks. Additionally, some users noted challenges in fully adapting to the system and occasionally forgetting to turn it off. On the positive side, several users mentioned that Voa accurately captured the entire conversation, praised its precision and completeness, and expressed high satisfaction with the tool overall.
Discussion
Key findings
Growth in document generation and users
The Voa platform experienced substantial growth in both document generation and user registration between March 6 and August 5, 2024. As outlined in Figure 4, the cumulative number of documents grew from 15 on the platform’s launch date to 24,654 by August 5, a marked increase over just five months. The steepest growth occurred in May, when the number of documents surged from 3,735 at the end of April to 8,707 by May 31, almost doubling within that month. This trend continued through June and July, reaching 23,716 documents by August 1. A similar pattern is seen in user registration, as shown in Figure 5, where the user base grew from 7 users in early March to 2,006 by August 5. The acceleration in user growth was especially noticeable in late June and July, suggesting increased uptake and engagement with the platform during this period.
User engagement and activation challenges
The platform exhibited strong user engagement, consistently maintaining an activation rate above 40% following the initial surge in April. As shown in Figure 5, the cumulative number of users grew steadily, reflecting ongoing interest in the platform. However, the activation rate experienced notable fluctuations, particularly towards the end of July, when a dip was observed. During this period, Figure 7 shows a steady increase in registrations throughout July, despite lower activation levels. The activation rate began to recover in early August, coinciding with a continued rise in user registrations and increased engagement with the platform.
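The report does not spell out how the activation rate is computed; a common definition, the share of registered users who have performed a key action (here assumed to be generating at least one document), can be sketched as follows. Function and variable names are hypothetical, not from the Voa codebase:

```python
def activation_rate(registered_ids, activated_ids):
    """Percent of registered users who performed the activating action
    (assumed here: generating at least one document)."""
    activated = set(activated_ids) & set(registered_ids)
    return 100 * len(activated) / len(registered_ids)

# Hypothetical example: 5 registered users, 3 of whom created a document.
print(activation_rate([1, 2, 3, 4, 5], [2, 3, 5]))  # → 60.0
```

Under this definition, the late-July dip would correspond to registrations temporarily outpacing first documents.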
User satisfaction and feedback
The NPS, illustrated in Figure 9, and CSAT, illustrated in Figure 10, provided valuable insights into user satisfaction and areas for improvement. The NPS increased from 18 in April 2024 to 62 in June 2024, indicating a growing positive sentiment among users. By July 2024, 64.44% of respondents were classified as Promoters, reflecting strong user advocacy for Voa. Notably, there was a significant decrease in the percentage of Detractors, dropping from 38.36% in April to just 6.67% in July, which highlights the platform’s improvement in meeting user expectations. Additionally, the number of responses increased over time, with 45 responses recorded in July compared to just 11 in April, suggesting greater user engagement with the platform. Overall, the cumulative NPS score was 50, based on 103 responses, with 63.11% of respondents classified as Promoters, 23.3% as Passives, and 13.59% as Detractors. The CSAT results further supported this positive trend, with 84% of users rating their experience with Voa as either 4 or 5 out of 5. Qualitative feedback also highlighted areas for improvement, such as the need for better physical examination documentation and more customizable document templates. Despite the strong quantitative ratings, only a small number of respondents provided qualitative feedback, with 17 out of 103 NPS respondents and 22 out of 100 CSAT respondents offering additional comments. Some users expressed concerns about the occasional confusion with medication names, underscoring the need for manual verification, and others mentioned the system’s tendency to remain on after the consultation, creating minor issues in adapting fully to the workflow.
Seasonality patterns
The data also reveal notable seasonal patterns in document generation, particularly related to weekly cycles. As shown in Figure 6, document creation tends to dip during weekends, with substantial rebounds occurring on Mondays. For example, on April 13 and 14, only 6 and 58 documents were generated, respectively, while on the following Monday, the platform saw a sharp rise to 175 documents. This weekly fluctuation suggests that the platform’s users are considerably more active during the workweek. A similar trend is observed in late July, where weekend document counts dropped to 92 and 41 on July 20 and 21, followed by a jump to 407 on July 22.
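The weekday/weekend cycle described above can be checked by averaging daily document counts per weekday; a minimal sketch using the late-July values quoted in the text (everything else is illustrative):

```python
from collections import defaultdict
from datetime import date

def mean_by_weekday(daily_counts):
    """Average document count per weekday (0 = Monday ... 6 = Sunday)."""
    totals, days = defaultdict(int), defaultdict(int)
    for d, n in daily_counts.items():
        totals[d.weekday()] += n
        days[d.weekday()] += 1
    return {wd: totals[wd] / days[wd] for wd in sorted(totals)}

# The late-July example from the text: weekend dip, Monday rebound.
counts = {
    date(2024, 7, 20): 92,   # Saturday
    date(2024, 7, 21): 41,   # Sunday
    date(2024, 7, 22): 407,  # Monday
}
print(mean_by_weekday(counts))  # → {0: 407.0, 5: 92.0, 6: 41.0}
```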
Voa adheres to fundamental standards for validating AI tools
The Voa platform aligns with essential principles for AI tool validation in healthcare. It addresses a critical clinical need by reducing the documentation burden on physicians, allowing them to focus on patient care. By utilizing a generative AI layer to enhance accuracy and structure medical records, Voa reduces the risk of manual errors and improves patient safety. Furthermore, its seamless integration into clinical workflows, demonstrated by positive user adoption and satisfaction, confirms Voa’s ability to meet the practical needs of healthcare professionals while ensuring compliance with data protection regulations.
Explanations of findings
Growth in document generation and users
The significant growth in document generation and user adoption on the Voa platform from March to August 2024, as depicted in Figures 4,5, highlights its successful integration into clinical practice. Figure 4 shows that initially, document generation was slow, reflecting a typical learning curve as users took time to familiarize themselves with the platform’s features and adapt it to their workflows. As users gained confidence and became comfortable with the platform, the consistent rise in both document generation and user registrations suggests that Voa became a routine part of daily clinical activities. The sharp increase in document generation after the initial phase points to a growing trust in the platform, as healthcare professionals began using it more frequently for their documentation needs. Figure 5 further illustrates this trend, showing the parallel rise in user registrations, indicating that the platform was not only attracting new users but also maintaining regular engagement among existing users. This pattern of growth across both figures suggests that Voa successfully addressed the needs of its target audience, becoming a trusted and valuable tool in clinical practice, with users increasingly relying on it as part of their daily operations.
User engagement and activation challenges
The maintenance of an activation rate above 40% indicates that an increasing number of users are utilizing the Voa platform after registering, which can be attributed to effective engagement strategies, direct calls, and personalized contact with users. However, the dip in the activation rate observed towards the end of July may be explained by the fact that many doctors in Brazil typically take vacations during this period. This suggests that while some users registered, they may not have had the opportunity to actively test the platform during that time. They might have encountered Voa through advertisements or promotions but postponed trying it out due to their time off.
This hypothesis is further supported by the data from Figure 7, which shows a noticeable increase in daily user registrations from late July through early August. Many healthcare professionals could have registered in July while on vacation but waited until August to actively explore the platform upon returning to work. The sharp rise in user activity during August aligns with the idea that professionals were back from vacation, which allowed them to engage more fully with Voa.
Despite the temporary dip in the activation rate, the steady growth in user registrations (as illustrated in Figure 5) indicates sustained interest in the platform. The fact that Voa is the first platform of its kind in Brazil could also contribute to initial hesitations, as some users may require additional time to adapt to new workflows. The onboarding process might seem complex to some, especially given the busy schedules of healthcare professionals, further explaining the lag in adoption even after registration. Nevertheless, the steady increase in registrations and the recovery in the activation rate in August signal a positive trajectory for Voa as more users integrate it into their practice.
User satisfaction and feedback
The significant drop in the percentage of Detractors from 38.36% in April to just 6.67% in July, coupled with the increase in the number of responses from 11 in April to 45 in July, reflects the growing confidence and positive sentiment among Voa users. This shift, alongside a high CSAT where 84% of users rated their experience as 4 or 5 out of 5, suggests that the platform’s integration into clinical workflows is effectively addressing user needs and easing documentation burdens. As users became more familiar with Voa’s features, their satisfaction increased, leading to a larger portion of the user base becoming Promoters. However, the challenges identified in the qualitative feedback, such as the need for improvements in physical examination documentation and more customizable document templates, likely stem from the evolving needs of users as they delve deeper into the platform’s capabilities.
The feedback regarding occasional confusion with medication names highlights a key challenge in the platform’s current transcription capabilities. This issue stems from the fact that transcription systems like Whisper (8) are not yet fully adapted to the complexities of Brazilian Portuguese, particularly when it comes to medical terminology. Differences in pronunciation, regional accents, and the similarity of certain medication names make it difficult for automated systems to consistently achieve high accuracy. These limitations underscore the need for more advanced linguistic models that can better handle the specificities of the Brazilian healthcare environment, ensuring higher fidelity in transcription and reducing the need for manual verification by users.
Additionally, other users mentioned that the system’s tendency to remain on after the consultation created minor workflow interruptions, as it required manual intervention to turn it off. While these issues are not major impediments, they suggest that there is room for improvement in making the platform more intuitive and seamless in everyday clinical use. These feedback points indicate that while Voa is largely effective, adjustments are still needed to optimize its integration into diverse clinical settings, particularly in terms of local language adaptation and workflow automation.
As engagement grows, so do expectations for performance and flexibility, prompting users to identify areas where the platform could be further refined. The reason not all users who provided ratings also offered detailed feedback can be attributed to several factors. Some users may feel sufficiently satisfied with the platform and see no need to provide additional comments, especially if their experience meets their expectations. Others might find the feedback process time-consuming or unnecessary, particularly if their suggestions are minor. This disparity between quantitative ratings and qualitative feedback is often seen in surveys, where users who are either highly satisfied or dissatisfied are more likely to provide detailed comments, while those with moderate experiences tend to give ratings without further elaboration.
Seasonality patterns
The seasonal patterns in document generation, as shown in Figure 6, highlight a clear link between user activity and the workweek, with significant dips in document creation on weekends followed by sharp rebounds on Mondays. This suggests that the platform is primarily used in professional settings, where users engage more heavily during business days and less so on weekends. The consistent fluctuations, with document generation dropping on Saturdays and Sundays and then surging at the start of the week, indicate that most of the platform’s users are likely healthcare professionals who predominantly interact with the platform as part of their weekday routines.
Voa adheres to fundamental standards for validating AI tools
Moreover, the Voa platform aligns with many of the principles required for validating AI tools in healthcare (25). Firstly, Voa addresses a critical clinical need by significantly reducing the documentation burden faced by physicians, enabling them to focus more on patient care rather than clerical tasks. This is in line with the need for AI tools to solve practical problems in healthcare settings. The platform improves documentation accuracy by using a generative AI layer that structures medical records, reducing common errors associated with manual entry and copy-paste practices, which is essential for improving clinical outcomes and ensuring patient safety.
Voa’s successful integration into clinical workflows through features like real-time transcription, structured document generation, and cross-platform compatibility ensures that the tool remains user-friendly and aligned with clinicians’ needs. The positive user adoption rates and high satisfaction metrics demonstrate its acceptance among healthcare professionals, further validating its effectiveness. As the tool continues to evolve based on user feedback, it reflects an ongoing commitment to addressing clinician requirements, enhancing decision-making processes, and maintaining regulatory compliance with data protection laws like the LGPD in Brazil.
Comparison with similar research
Studies have shown that manual documentation during medical consults increases the risk of medical errors and decreases the overall quality of care due to inaccuracies introduced by “copy and paste” practices (6,7). AI technologies, such as Voa, can automate documentation and enhance the quality of medical records (9-11). AI tools, like digital scribes, have demonstrated similar benefits in reducing the documentation burden and improving patient care (12,14-18).
Similar to tools like Freed (19) and Suki AI (20), Voa streamlines the documentation process by reducing the need for manual typing and copy-pasting. It allows the entire generated document to be copied with a single click (see Figure 3), facilitating easy integration with EHR systems and minimizing the risks of manual documentation errors. Additionally, like DeepScribe (21), Voa enhances patient interactions by enabling physicians to maintain eye contact during consultations, seamlessly capturing and organizing dialogue. Moreover, Voa, like Nuance Dragon Medical One (22), efficiently documents patient records, including notes, prescriptions, and other medical documents.
A novelty of the Voa tool is its adaptation to the specific linguistic and cultural needs of Brazilian healthcare. Unlike many global solutions, Voa has been tailored to handle the nuances of Brazilian Portuguese, including regional accents and medical terminology that may not be accurately captured by other systems. Additionally, it aligns with the common structure and style of medical histories used in Brazil, ensuring that the generated documentation meets local clinical standards. This cultural and linguistic customization enhances the accuracy of transcriptions and minimizes errors that might arise from using tools not adapted to the Brazilian context. This focus on localization makes Voa uniquely positioned to serve the healthcare needs of Brazilian professionals, improving both the quality of documentation and the overall clinical workflow.
The study reinforces the significant impact of documentation burden on physician burnout, emphasizing that excessive time spent on EHR and clerical tasks detracts from patient care and contributes to high levels of burnout among healthcare professionals (1,2,4,5). This underscores the need for tools like Voa that can streamline documentation processes and reduce the pressures associated with these tasks.
Implications and actions needed
The positive results observed so far are promising and point to the potential for continuous growth. To improve user onboarding and engagement, it may be beneficial to enhance educational materials, provide clearer guidance on how to use the platform effectively, and continue building trust with users through consistent communication and support. Additionally, expanding the platform’s features to include more customizable options and integrating user feedback into future updates can help ensure that Voa remains responsive to the evolving needs of healthcare professionals. Continuous monitoring of user engagement metrics, combined with proactive outreach efforts, will be key to maintaining high activation rates and fostering long-term user loyalty. Moreover, the seasonality pattern underscores the importance of recognizing user behavior influenced by workweek schedules and potentially adjusting platform support or features around these cycles to optimize usage.
Strengths and limitations
One of the major strengths of this study is its presentation of both qualitative and quantitative metrics from March to August 2024, offering a comprehensive view of the Voa tool’s impact on clinical practice. The evaluation included key indicators such as document generation, user adoption, and activation rates, along with user satisfaction measures through NPS and CSAT surveys.
Additionally, the study evaluates the ability to transcribe and organize complex clinical conversations in Brazilian Portuguese, which is important given the linguistic and cultural nuances specific to Brazil. However, some limitations must be addressed. One key challenge is the difficulty in systematically analyzing informal feedback, which is often resolved in real time during direct interactions with patients, rather than being documented in a way that is suitable for scientific analysis. Moreover, tracking and analyzing hallucinations in transcribed documents is challenging. Many users tend to make manual corrections without reporting these issues, complicating the collection of consistent data on errors. This lack of documented feedback hinders efforts to further refine the algorithms, as real-time corrections are not always logged or analyzed.
As Voa is the first tool of its kind in Brazil, there are limited precedents or standardized processes for evaluating the efficacy and reliability of similar AI-driven tools. This gap in the literature makes it difficult to compare the results of this study with broader AI validation studies, highlighting the need for more comprehensive frameworks in future research. Studies, including randomized controlled trials, are essential to complement evidence of the tool’s effectiveness, as well as to gain a deeper understanding of its impact on clinical outcomes.
Conclusions
This study presents a comprehensive descriptive statistical analysis that validates the Voa AI tool as an effective SaaS solution for reducing the documentation burden in healthcare. By analyzing both quantitative and qualitative data, we demonstrated that Voa successfully enhances clinical workflows, with significant growth in document generation and user adoption. The methodology employed aligns with best practices for AI validation in healthcare, focusing on real-world performance and user feedback. Voa’s successful implementation and user engagement provide a foundation for future studies to build upon, offering a comparative standard for the development and evaluation of new technologies in this domain.
Key findings revealed a consistent user activation rate and highlighted the need for enhanced onboarding and continuous support to ensure that new users can fully integrate Voa into their practice. As the first platform of its kind in Brazil, Voa faces the challenge of overcoming initial hesitations and trust issues among users who may be unfamiliar with such AI-driven solutions. Building user confidence is crucial, and this can be achieved through ongoing education, personalized support, and demonstrating the platform’s reliability and effectiveness over time. Additionally, further studies, including randomized controlled trials, and the refinement of transcription systems better adapted to linguistic variations and specialized medical terms, are essential to ensuring that this kind of tool can effectively serve diverse healthcare settings.
This report reinforces the idea that artificial intelligence is redefining the boundaries of medical practice, making tools like Voa not just aids but essential components in clinical processes. As Voa continues to evolve, it is expected that its adoption will expand, bringing continuous improvements in operational efficiency and patient care quality. As highlighted, it is increasingly evident that doctors who choose not to integrate these advanced technologies into their practices may find themselves at a significant disadvantage. Reluctance to adopt innovative AI-based solutions may limit doctors’ ability to efficiently meet the demands of a complex and data-driven healthcare system.
Acknowledgments
Funding: None.
Footnote
Data Sharing Statement: Available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-213/dss
Peer Review File: Available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-213/prf
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-213/coif). S.A.T. and F.S.L. are co-founders of Voa Health. E.A.R. is a partner of Voa Health. The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Sinsky C, Colligan L, Li L, et al. Allocation of Physician Time in Ambulatory Practice: A Time and Motion Study in 4 Specialties. Ann Intern Med 2016;165:753-60. [Crossref] [PubMed]
- Lin SY, Shanafelt TD, Asch SM. Reimagining Clinical Documentation With Artificial Intelligence. Mayo Clin Proc 2018;93:563-5. [Crossref] [PubMed]
- Young RA, Burge SK, Kumar KA, et al. A Time-Motion Study of Primary Care Physicians' Work in the Electronic Health Record Era. Fam Med 2018;50:91-9. [Crossref] [PubMed]
- Medscape. National Physician Burnout & Suicide Report 2020. Available online: https://www.medscape.com/slideshow/2020-lifestyle-burnout-6012460
- Shanafelt TD, Sloan JA, Habermann TM. The well-being of physicians. Am J Med 2003;114:513-9. [Crossref] [PubMed]
- Al Bahrani B, Medhi I. Copy-Pasting in Patients' Electronic Medical Records (EMRs): Use Judiciously and With Caution. Cureus 2023;15:e40486. [Crossref] [PubMed]
- Cheng CG, Wu DC, Lu JC, et al. Restricted use of copy and paste in electronic health records potentially improves healthcare quality. Medicine (Baltimore) 2022;101:e28644. [Crossref] [PubMed]
- Radford A, Kim JW, Xu T, et al. Robust speech recognition via large-scale weak supervision. 2022. Available online: https://cdn.openai.com/papers/whisper.pdf
- Hong G, Wilcox L, Sattler A, et al. Clinicians’ Experiences with EHR Documentation and Attitudes Toward AI-Assisted Documentation [White paper]. Stanford University School of Medicine and Google Health 2020. Available online: https://med.stanford.edu/content/dam/sm/healthcare-ai/images/Stanford-Google_AI-Scribe_WhitePaper.pdf
- Lin SY, Mahoney MR, Sinsky CA. Ten Ways Artificial Intelligence Will Transform Primary Care. J Gen Intern Med 2019;34:1626-30. [Crossref] [PubMed]
- Noorbakhsh-Sabet N, Zand R, Zhang Y, et al. Artificial Intelligence Transforms the Future of Health Care. Am J Med 2019;132:795-801. [Crossref] [PubMed]
- Uszkoreit J. Transformer: A novel neural network architecture for language understanding. Google AI Blog. (2017, August 31). Available online: https://research.google/blog/transformer-a-novel-neural-network-architecture-for-language-understanding/
- Alberts IL, Mercolli L, Pyka T, et al. Large language models (LLM) and ChatGPT: what will the impact on nuclear medicine be? Eur J Nucl Med Mol Imaging 2023;50:1549-52. [Crossref] [PubMed]
- Sezgin E, Sirrianni J, Kranz K. Development and Evaluation of a Digital Scribe: Conversation Summarization Pipeline for Emergency Department Counseling Sessions towards Reducing Documentation Burden.
- Van Veen D, Van Uden C, Blankemeier L, et al. Adapted large language models can outperform medical experts in clinical text summarization. Nat Med 2024;30:1134-42. [Crossref] [PubMed]
- Saab K, Tu T, Weng WH, et al. Capabilities of Gemini models in medicine. 2024. arXiv:2404.18416.
- Coiera E, Kocaballi B, Halamka J, et al. The digital scribe. NPJ Digit Med 2018;1:58. [Crossref] [PubMed]
- Quiroz JC, Laranjo L, Kocaballi AB, et al. Challenges of developing a digital scribe to reduce clinical documentation burden. NPJ Digit Med 2019;2:114. [Crossref] [PubMed]
- Freed AI. The AI Medical Scribe for Clinicians. 2023. Available online: https://www.getfreed.ai
- Suki AI. Suki Assistant: Enterprise-grade AI for Clinicians. (2023). Available online: https://www.suki.ai
- Ulrich A. DeepScribe Review: Accelerate Medical Documentation With AI. 2023. Available online: https://austinulrich.com/deepscribe-review
- Nuance. Dragon Medical One: AI Powered Clinical Documentation (2023). Available online: https://www.nuance.com/healthcare/dragon-ai-clinical-solutions/dragon-medical-one.html
- Aminololama-Shakeri S, López JE. The Doctor-Patient Relationship With Artificial Intelligence. AJR Am J Roentgenol 2019;212:308-10. [Crossref] [PubMed]
- Davenport T, Kalakota R. The potential for artificial intelligence in healthcare. Future Healthc J 2019;6:94-8. [Crossref] [PubMed]
- Scott IA, van der Vegt A, Lane P, et al. Achieving large-scale clinician adoption of AI-enabled decision support. BMJ Health Care Inform 2024;31:e100971. [Crossref] [PubMed]
- Baehre S, O’Dwyer M, O’Malley L, et al. The use of Net Promoter Score (NPS) to predict sales growth: insights from an empirical investigation. J Acad Mark Sci 2022;50:67-84. [Crossref]
- Mittal V, Han K, Frennea C, et al. Customer satisfaction, loyalty behaviors, and firm financial performance: what 40 years of research tells us. Mark Lett 2023;34:171-87. [Crossref]
Cite this article as: Basei de Paula PA, Bruneti Severino JV, Berger MN, Veiga MH, Parente Ribeiro KD, Loures FS, Todeschini SA, Roeder EA, Marques GL. Improving documentation quality and patient interaction with AI: a tool for transforming medical records—an experience report. J Med Artif Intell 2025;8:19.