FN Clarivate Analytics Web of Science VR 1.0 PT J AU Pastor, GC Noriega-Santiáñez, L AF Pastor, Gloria Corpas Noriega-Santianez, Laura TI Human versus Neural Machine Translation Creativity: A Study on Manipulated MWEs in Literature SO INFORMATION DT Article DE literary translation; neural machine translation; creativity; manipulated multiword expressions; human evaluation AB In the digital era, the (r)evolution of neural machine translation (NMT) has reshaped both the market and translators' workflow. However, the adoption of this technology has not fully reached the creative field of literary translation. Against this background, this study aims to explore to what extent NMT systems can be used to translate the creative challenges posed by idioms, specifically manipulated multiword expressions (MWEs) found in literary texts. To carry out this pilot study, five manipulated MWEs were selected from a fantasy novel and machine-translated (English > Spanish) by four NMT systems (DeepL, Google Translate, Bing Translator, and Reverso). Then, each NMT output as well as a human translation are assessed by six professional literary translators by using a human evaluation sheet. Based on these results, the creativity obtained in each translation method was calculated. Despite the satisfactory performance of both DeepL and Google Translate, HT creativity was highly superior in almost all manipulated MWEs. To the best of our knowledge, this paper not only contributes to the ongoing study of NMT applied to literature, but it is also one of the few studies that delve into the almost unexplored field of assessing creativity in neural machine-translated MWEs. C1 [Pastor, Gloria Corpas; Noriega-Santianez, Laura] Univ Malaga, Res Inst Multilingual Language Technol IUITLM, Malaga 29016, Spain. C3 Universidad de Malaga RP Pastor, GC (corresponding author), Univ Malaga, Res Inst Multilingual Language Technol IUITLM, Malaga 29016, Spain. 
EM gcorpas@uma.es; laura.noriega@uma.es TC 3 Z9 6 PD SEP PY 2024 VL 15 IS 9 AR 530 DI 10.3390/info15090530 WC Computer Science, Information Systems WE Emerging Sources Citation Index (ESCI) SC Computer Science UT WOS:001326458900001 DA 2026-04-14 ER PT J AU Noriega-Santiáñez, L Pastor, GC AF Noriega-Santianez, Laura Pastor, Gloria Corpas TI Measuring Creative Phraseology in Literature: Machine Translation Systems Versus Large Language Models SO YEARBOOK OF PHRASEOLOGY DT Article DE creativity; comparative idioms; neural machine translation systems; large language models; professional translators; translation students; evaluation AB In a growing digital scenario where phraseology has become aware of technological realities, literary translation is timidly testing sophisticated AI-based tools. This study aims at assessing the quality of the output rendered by neural machine translation (NMT) systems, i.e., DeepL and Google Translate, and large language models (LLMs), i.e., ChatGPT and Gemini, in the English>Spanish translation of five comparative idioms extracted from literary texts. To this end, professional literary translators and translation undergraduates evaluate their output against human translation (HT), following the parameters proposed by Corpas Pastor and Noriega-Santiáñez (2024) to measure creativity in the translation of multiword-expressions: adequacy (morphosyntactic, semantic, and pragmatic) and novelty. The findings show that HT stands out, although NMT systems outperformed morphosyntactically. LLMs, especially ChatGPT, show promising creative results. Therefore, this study serves to reflect on the use of technologies for the translation of creative phraseology in the context of literature. C1 [Noriega-Santianez, Laura; Pastor, Gloria Corpas] Univ Malaga, Res Inst Multilingual Language Technol, Translat & Interpreting, 27 Blvd Louis Pasteur, Malaga 29010, Spain. 
C3 Universidad de Malaga RP Noriega-Santiáñez, L (corresponding author), Univ Malaga, Res Inst Multilingual Language Technol, Translat & Interpreting, 27 Blvd Louis Pasteur, Malaga 29010, Spain. EM laura.noriega@uma.es; gcorpas@uma.es TC 0 Z9 0 PD NOV 25 PY 2025 VL 16 IS 1 BP 125 EP 152 DI 10.1515/phras-2025-0006 WC Language & Linguistics WE Emerging Sources Citation Index (ESCI) SC Linguistics UT WOS:001611959200005 DA 2026-04-14 ER PT J AU Yao, GY Fan, LX AF Yao, Guangyuan Fan, Lingxi TI An entropy-based study of Simplification in ChatGPT translations compared to neural machine translation and human translation across genres SO PLOS ONE DT Article ID CORPUS; ENGLISH AB This study investigates the phenomenon of simplification in Chinese-to-English translation across Human Translation (HT), neural machine translation (NMT), and large language model (LLM)-based translation, ChatGPT as an example. Employing entropy-based metrics (unigram entropy and Part-of-Speech (POS) entropy) to assess lexical and syntactic complexity, the research analyzes translations across three genres: political texts, fiction, and academic. Findings reveal that political and academic texts exhibit lexical simplification, and texts of all genres show a syntactic simplification trend, with the simplified degree varying across translation modes. While genre exerts minimal influence on lexical complexity, it significantly impacts syntactic complexity, with academic texts showing the lowest and fiction the highest complexity levels. Notably, ChatGPT's translations consistently exhibit greater lexical complexity, as evidenced by higher unigram entropy scores compared to those of Neural Machine Translation. These results challenge the notion of simplification as a universal feature of translation, instead highlighting its probabilistic nature influenced by translation mode and genre. 
The study underscores the efficacy of entropy-based measures in capturing nuanced differences in translation complexity and advocates for a modal approach to translation studies that accounts for the unique characteristics of various translation methods. C1 [Yao, Guangyuan] Cent South Univ, Sch Foreign Languages, Changsha, Hunan, Peoples R China. [Fan, Lingxi] Beihang Univ, Hangzhou Int Innovat Inst, Hangzhou, Zhejiang, Peoples R China. C3 Central South University; Beihang University RP Fan, LX (corresponding author), Beihang Univ, Hangzhou Int Innovat Inst, Hangzhou, Zhejiang, Peoples R China. EM 23112747g@connect.polyu.hk TC 0 Z9 0 PD DEC 31 PY 2025 VL 20 IS 12 AR e0339762 DI 10.1371/journal.pone.0339762 WC Multidisciplinary Sciences WE Science Citation Index Expanded (SCI-EXPANDED) SC Science & Technology - Other Topics UT WOS:001660629900001 DA 2026-04-14 ER PT J AU Ferragud, MF AF Ferragud, Maria Ferragud TI La traducción automática literaria: análisis de errores de la traducción automática y las traducciones de estudiantes del mismo texto original SO TRADUMATICA-TRADUCCIO I TECNOLOGIES DE LA INFORMACIO I LA COMUNICACIO DT Article DE machine translation (MT); human translation (HT); literary translation; neural machine translation (NMT); machine translation training ID NEURAL MACHINE TRANSLATION; COMPETENCES AB This article compares the MT of a literary text with human translations by translation students. The main purpose of this study is to determine whether MT and translations by students have aspects in common and differ from published human translations. The results show that, although the MT makes more errors, and of a more serious character than the students, some specific errors of semiotic, meaning, pragmatics, text, lexis, and style coincide. C1 [Ferragud, Maria Ferragud] Univ Jaume 1, Castellon de La Plana, Spain. C3 Universitat Jaume I RP Ferragud, MF (corresponding author), Univ Jaume 1, Castellon de La Plana, Spain. 
EM ferragud@uji.es TC 0 Z9 1 PD DEC PY 2023 IS 21 BP 184 EP 232 DI 10.5565/rev/tradumatica.334 WC Linguistics WE Emerging Sources Citation Index (ESCI) SC Linguistics UT WOS:001134206200003 DA 2026-04-14 ER PT J AU Noriega-Santiáñez, L Pastor, GC AF Noriega-Santianez, Laura Pastor, Gloria Corpas TI Machine vs Human Translation of Formal Neologisms in Literature: Exploring E-tools and Creativity in Students SO TRADUMATICA-TRADUCCIO I TECNOLOGIES DE LA INFORMACIO I LA COMUNICACIO DT Article DE machine translation; formal neologisms; literary translation; human translation; technological tools; technological resources; creativity ID IMPACT AB This article compares the output of three neural machine translation systems (Google Translate, DeepL, and Phrase TMS) and human translation (undergraduate level students, English into Spanish). It focuses on five formal neologisms extracted from literary texts, thus considering creativity, and technology adoption and training. C1 [Noriega-Santianez, Laura; Pastor, Gloria Corpas] Univ Malaga, IUITLM, Malaga, Spain. C3 Universidad de Malaga RP Noriega-Santiáñez, L (corresponding author), Univ Malaga, IUITLM, Malaga, Spain. EM laura.noriega@uma.es; gcorpas@uma.es TC 5 Z9 6 PD DEC PY 2023 IS 21 BP 233 EP 264 DI 10.5565/rev/tradumatica.338 WC Linguistics WE Emerging Sources Citation Index (ESCI) SC Linguistics UT WOS:001134206200007 DA 2026-04-14 ER PT J AU Gao, RY Lin, YM Zhao, N Cai, ZG AF Gao, Ruiyao Lin, Yumeng Zhao, Nan Cai, Zhenguang G. TI Machine translation of Chinese classical poetry: a comparison among ChatGPT, Google Translate, and DeepL Translator SO HUMANITIES & SOCIAL SCIENCES COMMUNICATIONS DT Article AB Recent studies have highlighted ChatGPT's remarkable capabilities in machine translation. However, little attention has been paid to its application in literary translation, particularly within the realm of Chinese classical poetry. 
To explore the potential of ChatGPT's abilities in poetry translation, we conducted a comparative analysis of poetry translation quality, contrasting ChatGPT (with two different prompts) with Google Translate and DeepL Translator regarding fidelity, fluency, language style, and machine translation style. The results revealed that ChatGPT outperformed Google Translate and DeepL Translator in all evaluation criteria, suggesting its exceptional ability in poetry translation. Furthermore, when employing a prompt that instructs ChatGPT to preserve the rhythm and rhyme of poems, ChatGPT demonstrated a remarkable ability to retain the beauty of the original poetic language, setting itself apart from conventional machine translation systems. Our analysis further elucidated ChatGPT's proficiency in comprehending and translating some common symbols, imagery, and underlined semantic components which contributes to coherent and fluent translations. Our research opens up ChatGPT's new possibilities in translating ancient literary texts into foreign languages. C1 [Gao, Ruiyao; Lin, Yumeng; Cai, Zhenguang G.] Chinese Univ Hong Kong, Dept Linguist & Modern Languages, Shatin, Hong Kong, Peoples R China. [Zhao, Nan] Hong Kong Baptist Univ, Dept Translat Interpreting & Intercultural Studies, Kowloon, Hong Kong, Peoples R China. [Cai, Zhenguang G.] Chinese Univ Hong Kong, Brain & Mind Inst, Shatin, Hong Kong, Peoples R China. C3 Chinese University of Hong Kong; Hong Kong Baptist University; Chinese University of Hong Kong RP Cai, ZG (corresponding author), Chinese Univ Hong Kong, Dept Linguist & Modern Languages, Shatin, Hong Kong, Peoples R China.; Cai, ZG (corresponding author), Chinese Univ Hong Kong, Brain & Mind Inst, Shatin, Hong Kong, Peoples R China. 
EM zhenguangcai@cuhk.edu.hk TC 25 Z9 31 PD JUN 26 PY 2024 VL 11 IS 1 AR 835 DI 10.1057/s41599-024-03363-0 WC Humanities, Multidisciplinary; Social Sciences, Interdisciplinary WE Social Science Citation Index (SSCI); Arts & Humanities Citation Index (A&HCI) SC Arts & Humanities - Other Topics; Social Sciences - Other Topics UT WOS:001255373300002 DA 2026-04-14 ER PT J AU Liu, ZX Yao, GY AF Liu, Zhaoxia Yao, Guangyuan TI Referential explicitation in translations by large language models, neural machine translation system, and human translators across genres SO ACROSS LANGUAGES AND CULTURES DT Article; Early Access DE explicitation; machine translation; explicitness of reference; multidimensional analysis; large language models ID ENGLISH; CORPUS; CONNECTIVES; CONTACT AB This study examines referential explicitation in Large Language Models (LLMs)-based machine translation (MT) compared to neural machine translation (NMT) and human translations. Referential explicitation-the process of making implicit referential expressions explicit in the target text, has been recognized as a crucial aspect of translation studies. Through Multidimensional Analysis (MDA), findings indicate that all translation modes perform referential explicitation across various genres including news, academic, and fiction. The degree of explicitation in LLM-based machine translation falls between human translations and neural machine translations. Moreover, machine translations prioritize clarity and transparency, whereas human translations tend to preserve implicit references especially in fiction. This difference underscores the challenge for neural machine translations in balancing clarity with cultural nuances. Notably, LLM-based machine translations demonstrate improvements, achieving performance closer to human translators due to their distinct underlying logic compared to neural machine translation. 
Additionally, genre-specific patterns reveal consistently high levels of explicitation in news and academic texts, while fiction translations vary in their preservation of implicit references. This study enhances the theoretical understanding of translation universals and contributes to the development of advanced evaluation metrics for machine translation. C1 [Liu, Zhaoxia] Shanghai Univ, Sch Foreign Languages, Shanghai, Peoples R China. [Yao, Guangyuan] Cent South Univ, Sch Foreign Languages, Changsha, Hunan, Peoples R China. C3 Shanghai University; Central South University RP Yao, GY (corresponding author), Cent South Univ, Sch Foreign Languages, Changsha, Hunan, Peoples R China. EM yc37719@um.edu.mo TC 0 Z9 0 PD 2026 FEB 24 PY 2026 DI 10.1556/084.2026.01199 EA FEB 2026 WC Linguistics; Language & Linguistics WE Social Science Citation Index (SSCI); Arts & Humanities Citation Index (A&HCI) SC Linguistics UT WOS:001702782200001 DA 2026-04-14 ER PT J AU Vieira, LN Zelenka, N Youdale, R Zhang, XC Carl, M AF Vieira, Lucas Nunes Zelenka, Natalie Youdale, Roy Zhang, Xiaochun Carl, Michael TI Translating science fiction in a CAT tool: machine translation and segmentation settings SO TRANSLATION & INTERPRETING-THE INTERNATIONAL JOURNAL OF TRANSLATION AND INTERPRETING DT Article DE Literary translation; post-editing; machine translation; neural machine translation; computer-assisted translation; CAT tools; science fiction; Chinese translation AB There is increasing interest in machine assistance for literary translation, but research on how computer-assisted translation (CAT) tools and machine translation (MT) combine in the translation of literature is still incipient, especially for non-European languages. This article presents two exploratory studies where English-to-Chinese translators used neural MT to translate science fiction short stories in Trados Studio. One of the studies compares post-editing with a 'no MT' condition. 
The other examines two ways of presenting the texts on screen for post-editing, namely by segmenting them into paragraphs or into sentences. We collected the data with the Qualitivity plugin for Trados Studio and describe a method for analysing data collected with this plugin through the translation process research database of the Center for Research in Translation and Translation Technology (CRITT). While post-editing required less technical effort, we did not find MT to be appreciably timesaving. Paragraph segmentation was associated with less post-editing effort on average, though with high translator variability. We discuss the results in the light of broader concepts, such as status-quo bias, and call for more research on the different ways in which MT may assist literary translation, including its use for comparison purposes or, as mentioned by a participant, for 'inspiration'. C1 [Vieira, Lucas Nunes; Zelenka, Natalie; Youdale, Roy] Univ Bristol, England, England. [Zhang, Xiaochun] UCL, London, England. [Carl, Michael] Kent State Univ, Kent, OH 44240 USA. C3 University of Bristol; University of London; University College London; University System of Ohio; Kent State University; Kent State University Kent; Kent State University Salem RP Vieira, LN (corresponding author), Univ Bristol, England, England. 
EM l.nunesvieira@bristol.ac.uk; natalie.zelenka@bristol.ac.uk; roy.youdale@bristol.ac.uk; xiaochun.zhang@ucl.ac.uk; mcarl6@kent.edu TC 7 Z9 10 PY 2023 VL 15 IS 1 BP 216 EP 235 DI 10.12807/ti.115201.2023.a11 WC Linguistics WE Emerging Sources Citation Index (ESCI) SC Linguistics UT WOS:000941189700011 DA 2026-04-14 ER PT J AU Ouldelhaj, D Benmakhlouf, H AF Ouldelhaj, Driss Benmakhlouf, Hajar TI ARTIFICIAL INTELLIGENCE AND LITERARY TRANSLATION: REALITY AND PROSPECTS SO PERSPECTIVAS DE LA COMUNICACION DT Article DE artificial intelligence; literary translation; human skills; tools; epistemological metaphor; emotional intelligence AB In the modern era, the integration of technology, especially artificial intelligence (AI), into various domains presents both opportunities and challenges. Notably, while AI has made strides in numerous fields, its influence on literary translation is particularly profound. Our research assesses AI's strengths and limitations in comparison to vital human translation skills essential for top-tier literary translations. Through a focused comparative analysis of fragmented literary texts, our results highlight that, despite machine translation's advancements and speed, it cannot fully replace human translators. Specifically in literary contexts, maintaining quality and accuracy remains a daunting task for AI-driven systems. C1 [Ouldelhaj, Driss] Univ Hassan I, Fac Lenguas Arte & Ciencias Humanas, Settat, Morocco. [Benmakhlouf, Hajar] Univ Int Casablanca, Casablanca, Morocco. C3 Hassan First University of Settat RP Ouldelhaj, D (corresponding author), Univ Hassan I, Fac Lenguas Arte & Ciencias Humanas, Settat, Morocco. 
EM oldlhjdriss@gmail.com; hajar.benmakhlouf@uic.ac.ma TC 0 Z9 0 PY 2025 VL 18 AR 3603 DI 10.56754/0718-4867.2025.3603 WC Communication WE Emerging Sources Citation Index (ESCI) SC Communication UT WOS:001578840000005 DA 2026-04-14 ER PT J AU Moreno, AI Mora, MED AF Moreno, Ana Ibanez Mora, Maria Esther Dominguez TI Google Translate versus DeepL in Spanish to English translation of Don Quixote SO TRANSLATION AND TRANSLANGUAGING IN MULTILINGUAL CONTEXTS DT Article DE literary translation; neural machine translation; collocations ID COLLOCATIONS; DIFFICULTIES AB This paper analyses the effectiveness of neural machine translation when applied to literary translation and, more specifically, to the translation of collocations, one of the most difficult aspects in machine translation (Corpas-Pastor 2015; Shraiden and Mahadin 2015). Literary translation continues to constitute one of the biggest challenges for machine translation (Toral and Way 2018), where cohesion errors are amongst the most frequent (Voigt and Jurafsky 2012). A comparative analysis of the translation of the first chapter of the world literature masterpiece El ingenioso hidalgo don Quijote de la Mancha - known as Don Quixote in English - was carried out, paying close attention to collocations. The human translation done by Tom Lathrop (Don Quixote) was compared to the target texts obtained with the two biggest neural machine translation systems today, Google Translate and DeepL, to see which provided more accurate results. The results confirm that neural machine translation offers highly reliable results. On a quantitative level the margins are very narrow when determining which system, DeepL or Google Translate, is better. DeepL scored better in terms of accuracy and recall, but in the BLEU metrics Google Translate scored 28.10 and DeepL 26.63. On a qualitative level and from a subjective point of view, we found DeepL's translation to be somewhat more fluid and natural than Google Translate's. 
C1 [Moreno, Ana Ibanez; Mora, Maria Esther Dominguez] Univ Nacl Educ Distancia UNED, Madrid, Spain. [Moreno, Ana Ibanez; Mora, Maria Esther Dominguez] Univ Nacl Educ Distancia UNED, Fac Philol, Paseo Senda Rey 7, Madrid 28040, Spain. C3 Universidad Nacional de Educacion a Distancia (UNED); Universidad Nacional de Educacion a Distancia (UNED) RP Moreno, AI (corresponding author), Univ Nacl Educ Distancia UNED, Fac Philol, Paseo Senda Rey 7, Madrid 28040, Spain. EM aibanez@flog.uned.es; mestherdominguez@yahoo.es TC 1 Z9 1 PD JAN 7 PY 2025 VL 11 IS 1 BP 65 EP 87 DI 10.1075/ttmc.00154.iba WC Education & Educational Research; Linguistics; Language & Linguistics WE Emerging Sources Citation Index (ESCI) SC Education & Educational Research; Linguistics UT WOS:001398320900003 DA 2026-04-14 ER PT J AU Mohsen, MA AF Mohsen, Mohammed Ali TI Artificial Intelligence in Academic Translation: A Comparative Study of Large Language Models and Google Translate SO PSYCHOLINGUISTICS DT Article DE ChatGPT; Machine Translation; Google Translate; articles' abstract AB Purpose. The advent of Large Language Model (LLM), a generative artificial intelligence (AI) model, in November 2022 has had a profound impact on various domains, including the field of translation studies. This motivated this study to conduct a rigorous evaluation of the effectiveness and precision of machine translation, represented by Google Translate (GT), in comparison to Large Language Models (LLMs), specifically ChatGPT 3.5 and 4, when translating academic abstracts bidirectionally between English and Arabic. Methods. Employing a mixed-design approach, this study utilizes a corpus comprising 20 abstracts sourced from peer-reviewed journals indexed in the Clarivate Web of Science, specifically the Journal of Arabic Literature and Al-Istihlal Journal. The abstracts are equally divided to represent both English-Arabic and Arabic-English translation directionality. 
The study's design is rooted in a comprehensive evaluation rubric adapted from Hurtado Albir and Taylor (2015), focusing on semantic integrity, syntactic coherence, and technical adequacy. Three independent raters carried out assessments of the translation outputs generated by both GT and LLM models. Results. Results from quantitative and qualitative analyses indicated that LLM tools significantly outperformed MT outputs in both Arabic and English translation directions. Additionally, ChatGPT 4 demonstrated a significant advantage over ChatGPT 3.5 in Arabic-English translation, while no statistically significant difference was observed in the English-Arabic translation directionality. Qualitative analysis findings indicated that AI tools exhibited the capacity to comprehend contextual nuances, recognize city names, and adapt to the target language's style. Conversely, GT displayed limitations in handling specific contextual aspects and often provided literal translations for certain terms. C1 [Mohsen, Mohammed Ali] Najran Univ, King Abdulaziz St, Najran 66251, Saudi Arabia. C3 Najran University RP Mohsen, MA (corresponding author), Najran Univ, King Abdulaziz St, Najran 66251, Saudi Arabia. EM mmohsen1976@gmail.com TC 9 Z9 13 PY 2024 VL 35 IS 2 DI 10.31470/2309-1797-2024-35-2-134-156 WC Linguistics WE Emerging Sources Citation Index (ESCI) SC Linguistics UT WOS:001301887100006 DA 2026-04-14 ER PT J AU Karakus, IS Strechen, I Gupta, A Nalaie, K Chen, CL Hassett, LC Barwise, AK AF Karakus, Ibrahim Serhat Strechen, Inna Gupta, Ankita Nalaie, Keivan Chen, Christine L. Hassett, Leslie C. Barwise, Amelia K. 
TI Bridging language gaps in healthcare: a systematic review of the practical implementation of neural machine translation technologies in clinical settings SO JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION DT Review DE translation; interpretation; neural machine translation; large language model; non-English language preference ID ACCESS; IMPACT AB Objectives Effective communication is crucial in healthcare, and for patients with a non-English language preference (NELP), professional interpreters are recognized as the gold standard in supporting bidirectional communication. However, interpreters are not always readily available, prompting the exploration of other options for translation and interpretation. The recent developments in artificial intelligence-based neural network translation tools, namely neural machine translation (NMT), may enable robust interpretation and translation. Materials and Methods We conducted a systematic review (SR) to evaluate the literature on NMT for this purpose. We did a comprehensive search of several databases with guidance from a professional librarian. The search was limited to the year 2000 onwards and English language. Title and abstract screening and full-text review were independently conducted by two reviewers with conflicts resolved by a third reviewer. Results 2867 studies were identified with 10 studies included in the final analysis. Among these, six evaluated interpretation in real or simulated clinical settings and four examined translation of discharge materials. Google Translate and ChatGPT were assessed in several studies. Accuracy differed by language, with low-resource languages performing worse. Discussion NMT technologies in healthcare have several advantages including broad language accessibility and potential cost savings for institutions. 
Despite improved accuracy of these novel tools, due to possible critical errors NMT tools are not yet ready for widespread clinical use. Conclusion Future studies should focus on optimizing evaluation methods as well as how best to integrate these technologies into real-time clinical settings. C1 [Karakus, Ibrahim Serhat; Strechen, Inna; Gupta, Ankita; Nalaie, Keivan] Mayo Clin, Dept Anesthesiol & Perioperat Med, 200 First St SW, Rochester, MN 55905 USA. [Nalaie, Keivan] Mayo Clin, Dept Nursing, Div Nursing Res, Rochester, MN 55905 USA. [Chen, Christine L.] Mayo Clin, Dept Internal Med, Rochester, MN 55905 USA. [Hassett, Leslie C.] Mayo Clin, Mayo Clin Lib, Rochester, MN 55905 USA. [Barwise, Amelia K.] Mayo Clin, Div Pulm & Crit Care Med, Rochester, MN 55905 USA. [Barwise, Amelia K.] Mayo Clin, Biomed Eth Res Program, Rochester, MN 55905 USA. C3 Mayo Clinic; Mayo Clinic; Mayo Clinic; Mayo Clinic; Mayo Clinic; Mayo Clinic RP Karakus, IS (corresponding author), Mayo Clin, Dept Anesthesiol & Perioperat Med, 200 First St SW, Rochester, MN 55905 USA. 
EM karakus.ibrahim@mayo.edu TC 2 Z9 2 PD NOV PY 2025 VL 32 IS 11 BP 1756 EP 1766 DI 10.1093/jamia/ocaf150 EA SEP 2025 WC Computer Science, Information Systems; Computer Science, Interdisciplinary Applications; Health Care Sciences & Services; Information Science & Library Science; Medical Informatics WE Science Citation Index Expanded (SCI-EXPANDED); Social Science Citation Index (SSCI) SC Computer Science; Health Care Sciences & Services; Information Science & Library Science; Medical Informatics UT WOS:001573720400001 DA 2026-04-14 ER PT J AU Wang, HT AF Wang Hongtao TI Defending the last bastion: A sociological approach to the challenged literary translation SO BABEL-REVUE INTERNATIONALE DE LA TRADUCTION-INTERNATIONAL JOURNAL OF TRANSLATION DT Article DE translator's habitus; actor-network; human signification; machine simulation; positive ethics ID MACHINE TRANSLATION; HABITUS AB Growing interest has been noted in applying AI-powered machine translation (MT) to literary translation, hailed as the last bastion of human translation. Despite achieving considerable progress in this field, research has either ignored or underestimated the particularity, complexity, and cultural significance of literary translation, which can be examined from a sociological approach. Drawing on the sociological theories of Bourdieu, Latour, Callon, and Baudrillard, the present paper analyses the innate nature of literary translation and highlights three fundamental issues that need to be addressed in applying MT to literary texts. First, the poetics of literary translation is built on human translators' long-acquired habitus, thus, in the case of MT, an algorithm comparable to the creative human habitus must be derived if MT aspires to take on the role of the human translator. 
Second, literary translation constitutes a dynamic network connected by various human and non-human actors, thus the aspects not included in the interlingual transference of MT should be compensated through more effective interactions between the machine and other actors. Third, the cultural-ethical issues related to MT should be thoroughly examined because the present MT of literary texts is a machine simulation of the psychological human translation, which undermines both the meaning generation of literary translation and the knowledge accumulation of cultural production. Therefore, literary translation must be handled by qualified human translators until we can undoubtedly ensure that MT can be effectively and safely applied to literary texts. C1 [Wang Hongtao] Beijing Foreign Studies Univ, Beijing, Peoples R China. C3 Beijing Foreign Studies University RP Wang, HT (corresponding author), Beijing Foreign Studies Univ, Sch English & Int Studies, 2 North Xisanhuan Rd, Beijing 100089, Peoples R China. EM wanghongtao@bfsu.edu.cn TC 5 Z9 5 PD SEP 26 PY 2023 VL 69 IS 4 BP 465 EP 482 DI 10.1075/babel.00330.wan WC Linguistics; Language & Linguistics WE Social Science Citation Index (SSCI); Arts & Humanities Citation Index (A&HCI) SC Linguistics UT WOS:001069891700002 DA 2026-04-14 ER PT J AU Al Rousan, R Jaradat, R Malkawi, M AF Al Rousan, Rafat Jaradat, Raghad Malkawi, Muna TI ChatGPT translation vs. human translation: an examination of a literary text SO COGENT SOCIAL SCIENCES DT Article DE ChatGPT; human translation; multidimensional quality metrics; literary translation ID CHALLENGES; TRENDS AB This study evaluates the proficiency of ChatGPT-based translation compared to Human Translation (HT) using an Arabic literary work. It also examines potential translation gaps in ChatGPT and explores its potential to replace human translators. 
The research analyzes 12 excerpts from Mawsim Al-Hijra Ela Al-Shamal (1966) by Tayeb Salih, comparing the English translation by Denys Johnson-Davies (Season of Migration to the North, 1969) with ChatGPT's output. A mixed-method approach (qualitative and quantitative) was used, assessing translations through three dimensions of the Multidimensional Quality Metrics (MQM) framework: accuracy, fluency, and design. The MQM scoring model was also employed to ensure reliability. The findings show that HT is more accurate, with an average accuracy score of 94.5%, compared to 77.9% for ChatGPT. However, ChatGPT produces fluent translations, scoring 97.2% in fluency versus 96.6% for HT. Despite its fluency, ChatGPT struggles with design-related elements and often introduces superfluous content. The study concludes that ChatGPT is not a fully reliable tool for translating Arabic literature, which requires professional human translators like Denys Johnson-Davies to ensure accuracy and cultural sensitivity. C1 [Al Rousan, Rafat; Jaradat, Raghad; Malkawi, Muna] Yarmouk Univ, Dept Translat, Irbid, Jordan. C3 Yarmouk University RP Al Rousan, R (corresponding author), Yarmouk Univ, Dept Translat, Irbid, Jordan. EM rafat.r@yu.edu.jo TC 5 Z9 8 PD DEC 31 PY 2025 VL 11 IS 1 AR 2472916 DI 10.1080/23311886.2025.2472916 WC Social Sciences, Interdisciplinary WE Emerging Sources Citation Index (ESCI) SC Social Sciences - Other Topics UT WOS:001439952700001 DA 2026-04-14 ER PT J AU Yao, XF Kang, YB McCosker, A AF Yao, Xiaofang Kang, Yong-Bin McCosker, Anthony TI Missing the human touch? SO TRANSLATION SPACES DT Article DE translation; large language models; GPT; stylometry; posthumanism ID MACHINE TRANSLATION; STYLOMETRY AB Existing research suggests that machine translations of literary texts remain unsatisfactory. Such quality assessment often relies on automated metrics and subjective human ratings, with little attention to the stylistic features of machine translation. 
Current understanding is limited regarding the extent to which AI may transform the literary translation landscape, with implications for other critical domains for translation such as the creative industries more broadly. This pioneering study investigates the stylistic features of AI translations, specifically examining GPT-4's performance against human translations of Chinese online literature. Our computational stylometry analysis reveals that GPT-4 translations closely mirror human translations in lexical, syntactic and content features. In addition to showing the relevance of stylometry for analysing the features of AI translation, the study provides critical insights into the implications of AI for literary translation in the posthuman tradition, where the line between machine and human translation becomes increasingly blurry. C1 [Yao, Xiaofang] Univ Hong Kong, Pokfulam, Centennial Campus, Hong Kong, Peoples R China. [Kang, Yong-Bin; McCosker, Anthony] Swinburne Univ Technol, Hawthorn, Australia. C3 University of Hong Kong RP Yao, XF (corresponding author), Univ Hong Kong, Pokfulam, Centennial Campus, Hong Kong, Peoples R China. EM xiaofang.yao@hku.hk; ykang@swin.edu.au; amccosker@swin.edu.au TC 0 Z9 0 PD DEC 18 PY 2025 VL 14 IS 2 BP 303 EP 330 DI 10.1075/ts.24043.yao WC Language & Linguistics WE Emerging Sources Citation Index (ESCI) SC Linguistics UT WOS:001641748600007 DA 2026-04-14 ER PT J AU Hongtao, W AF Hongtao, Wang TI Defending the last bastion SO BABEL-REVUE INTERNATIONALE DE LA TRADUCTION-INTERNATIONAL JOURNAL OF TRANSLATION DT Article; Early Access DE translator's habitus; actor-network; human signification; machine simulation; positive ethics; Habitus du traducteur; reseau d'acteurs; signification humaine; simulation par la machine; ethique positive ID MACHINE TRANSLATION; HABITUS AB Growing interest has been noted in applying AI-powered machine translation (MT) to literary translation, hailed as the last bastion of human translation. 
Despite achieving considerable progress in this field, research has either ignored or underestimated the particularity, complexity, and cultural significance of literary translation, which can be examined from a sociological approach. Drawing on the sociological theories of Bourdieu, Latour, Callon, and Baudrillard, the present paper analyses the innate nature of literary translation and highlights three fundamental issues that need to be addressed in applying MT to literary texts. First, the poetics of literary translation is built on human translators' long-acquired habitus, thus, in the case of MT, an algorithm comparable to the creative human habitus must be derived if MT aspires to take on the role of the human translator. Second, literary translation constitutes a dynamic network connected by various human and non-human actors, thus the aspects not included in the interlingual transference of MT should be compensated through more effective interactions between the machine and other actors. Third, the cultural-ethical issues related to MT should be thoroughly examined because the present MT of literary texts is a machine simulation of the psychological human translation, which undermines both the meaning generation of literary translation and the knowledge accumulation of cultural production. Therefore, literary translation must be handled by qualified human translators until we can undoubtedly ensure that MT can be effectively and safely applied to literary texts. C1 [Hongtao, Wang] Beijing Foreign Studies Univ, Sch English & Int Studies, 2 North Xisanhuan Rd, Beijing 100089, Peoples R China. C3 Beijing Foreign Studies University RP Hongtao, W (corresponding author), Beijing Foreign Studies Univ, Sch English & Int Studies, 2 North Xisanhuan Rd, Beijing 100089, Peoples R China. 
EM wanghongtao@bfsu.edu.cn TC 2 Z9 2 PD 2023 JUL 6 PY 2023 DI 10.1075/babel.00330 EA JUL 2023 WC Linguistics; Language & Linguistics WE Social Science Citation Index (SSCI); Arts & Humanities Citation Index (A&HCI) SC Linguistics UT WOS:001060471100001 DA 2026-04-14 ER PT J AU Lin, XY Liu, J Zhang, JM Lim, SJ AF Lin, Xinyue Liu, Jin Zhang, Jianming Lim, Se-Jung TI A Novel Beam Search to Improve Neural Machine Translation for English-Chinese SO CMC-COMPUTERS MATERIALS & CONTINUA DT Article DE Neural machine translation; beam search; reinforcement learning AB Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, overcoming the weaknesses of conventional phrase-based translation systems. Although NMT based systems have gained their popularity in commercial translation applications, there is still plenty of room for improvement. Being the most popular search algorithm in NMT, beam search is vital to the translation result. However, traditional beam search can produce duplicate or missing translation due to its target sequence selection strategy. Aiming to alleviate this problem, this paper proposed neural machine translation improvements based on a novel beam search evaluation function. And we use reinforcement learning to train a translation evaluation system to select better candidate words for generating translations. In the experiments, we conducted extensive experiments to evaluate our methods. CASIA corpus and the 1,000,000 pairs of bilingual corpora of NiuTrans are used in our experiments. The experiment results prove that the proposed methods can effectively improve the English to Chinese translation quality. C1 [Lin, Xinyue; Liu, Jin] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China. [Zhang, Jianming] Changsha Univ Sci & Technol, Sch Comp & Commun Engn, Changsha 410114, Peoples R China. [Lim, Se-Jung] Honam Univ, Liberal Arts & Convergence Studies, Gwangju 62399, South Korea. 
C3 Shanghai Maritime University; Changsha University of Science & Technology; Honam University RP Liu, J (corresponding author), Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China. EM jinliu@shmtu.edu.cn TC 6 Z9 9 PY 2020 VL 65 IS 1 BP 387 EP 404 DI 10.32604/cmc.2020.010984 WC Computer Science, Information Systems; Materials Science, Multidisciplinary WE Science Citation Index Expanded (SCI-EXPANDED) SC Computer Science; Materials Science UT WOS:000552255300023 DA 2026-04-14 ER PT J AU Ferragud, MF AF Ferragud, Maria Ferragud TI Human and Machine Translation of the somatic idioms in the English Catalan COVALT corpus SO CAPLLETRA DT Article DE phraseological units; somatic idioms; literary translation; machine translation (MT); human translation (HT); corpora AB The objective of this study is to conduct a comparative analysis of human translation (HT) and machine translation (MT) in relation to a specific category of phraseological units: somatic idioms. For that purpose, we used the triple-aligned English-Catalan subcorpus of COVALT composed of 36 texts of narrative literature: the original texts in English, their translations into Catalan, and the MT into Catalan made with a MT system trained specifically for literary texts. A search was made for 5 lemmas in English and their equivalents in Catalan, and a corpus of 607 phraseological units was obtained, which were subsequently classified according to a list of translation techniques for phraseological units. The results show that, although phraseological loss is observed in both HT and MT, the loss is much greater in the case of MT and the percentages are more balanced in the case of HT. C1 [Ferragud, Maria Ferragud] Jaume I Univ, Castellon De La Plana, Spain. C3 Universitat Jaume I RP Ferragud, MF (corresponding author), Jaume I Univ, Castellon De La Plana, Spain. 
EM ferragud@uji.es TC 0 Z9 0 PY 2026 IS 80 BP 103 EP 128 DI 10.7203/caplletra.80.29639 WC Language & Linguistics WE Emerging Sources Citation Index (ESCI) SC Linguistics UT WOS:001730114200007 DA 2026-04-14 ER PT J AU Wang, QR Xu, J AF Wang, Qingran Xu, Jun TI Neural machine translation in AVT teaching in China: An in-depth analysis from the readability perspective SO LINGUISTICA ANTVERPIENSIA NEW SERIES-THEMES IN TRANSLATION STUDIES DT Article DE neural machine translation; NMT; AVT teaching; MT; machine translation evaluation; readability test; Chinese-English translation AB As audiovisual translation (AVT) becomes more complex and diverse, the need for advanced machine learning techniques has been increasing sharply, driving the widespread adoption of neural machine translation (NMT) technology in the field. This study contributes to the literature by evaluating the performance of NMT technology in AVT teaching. Based on readability theory, we constructed an evaluation framework with 12 indicators, built comparable corpora consisting of human and post-edited subtitle translations of corporate videos, and used them to examine the performance of four online NMT systems (Google Translate, Baidu Translate, Bing Translator, and Youdao Translate) in AVT teaching. Our statistical analyses and case studies show that Google Translate outperforms the other three platforms in all the readability tests, and it can enhance the readability of post-edited subtitles at five levels (word, syntax, textbase, situation model, genre and rhetorical). The performance of the other three platforms varies across different tests. Concrete examples are provided to substantiate the statistical analyses. Our study adds value to existing research both by examining the application and performance of NMT in AVT teaching and by suggesting potential directions for the refinement of current NMT systems. C1 [Wang, Qingran; Xu, Jun] China Univ Polit Sci & Law, Beijing, Peoples R China. 
C3 China University of Political Science & Law RP Wang, QR (corresponding author), China Univ Polit Sci & Law, Beijing, Peoples R China. EM cu192019@cupl.edu.cn; xujun289@163.com TC 1 Z9 1 PY 2023 VL 22 BP 161 EP 180 WC Linguistics; Language & Linguistics WE Social Science Citation Index (SSCI); Arts & Humanities Citation Index (A&HCI) SC Linguistics UT WOS:001164823800006 DA 2026-04-14 ER PT J AU Bernal, PC AF Castillo Bernal, Pilar TI Computer-assisted literary translation applied to the historical novel (Ge Spanish): Training and comparison of machine translation systems SO QUADERNS DE FILOLOGIA-ESTUDIS LINGUISTICS DT Article DE CALT; machine translation; historical novel; German; post-editing AB Computer-assisted literary translation (CALT) has become a research point of interest in the last decade. Youdale's work (2020) proves the usefulness of CAT tools for the literary translator; Kenny and Winters (2020) look into CAT impact on the translator's style; Moorkens et al. (2018) compare different machine translation (MT) systems and Toral & Way (2015) train a MT engine for translating novels. Based on these and on my prior study (Castillo, 2022a, 20221), 2017), in the present work, a MT system is trained with the existing translation of a German historical novel of legal content, to be applied to the translation of a second novel on the same topic. Subsequently, the machine translation is post-edited and its errors compared to the results of an untrained MT system, with special focus on post-editing style and legal language. The purpose is to ascertain to what extent a customized MT system can assist in the literary translator's complex work. C1 [Castillo Bernal, Pilar] Univ Cordoba, Cordoba, Spain. C3 Universidad de Cordoba RP Bernal, PC (corresponding author), Univ Cordoba, Cordoba, Spain. 
EM pilar.castillo.bernal@uco.es TC 0 Z9 0 PY 2022 VL 27 BP 71 EP 85 DI 10.7203/QF.27.24624 WC Language & Linguistics WE Emerging Sources Citation Index (ESCI) SC Linguistics UT WOS:000899234200004 DA 2026-04-14 ER PT J AU Way, A Rothwell, A Youdale, R AF Way, Andy Rothwell, Andrew Youdale, Roy TI Why more Literary Translators should embrace Translation Technology SO TRADUMATICA-TRADUCCIO I TECNOLOGIES DE LA INFORMACIO I LA COMUNICACIO DT Article DE literary translation; human translation; translation technology; machine translation AB Machine translation (MT) quality has improved significantly with the advent of neural techniques. Some communications about these improvements have been the product of overeager marketing hype, but MT is playing a real role in the lives of many human translators today. MT has even started to be used in pilot studies for the translation of literature, with results that outperformed anticipated outcomes. Nonetheless, its use and uptake as well as the acknowledgement of its potential merit are meeting with a degree of resistance, especially among some more experienced literary translators. In other areas, translators have complained about tools being foisted upon them, and have sought consultation on the design of translation technology. There are examples where translator input into tool design has happened to good effect, but in literary translation per se, translators have been recorded as avoiding such conversations. In this article, we investigate why some literary translators behave differently to their peers in other fields of translation. Finally, we offer pointers as to how translation technology, MT in particular, could benefit literary translators who have an open mind concerning technology. C1 [Way, Andy] Dublin City Univ, ADAPT Ctr, Sch Comp, Dublin, Ireland. [Rothwell, Andrew] Swansea Univ, Sch Culture & Commun, Swansea, Wales. [Youdale, Roy] Univ Bristol, Sch Modern Languages, Bristol, England. 
C3 Dublin City University; Swansea University; University of Bristol RP Way, A (corresponding author), Dublin City Univ, ADAPT Ctr, Sch Comp, Dublin, Ireland. EM andy.way@adaptcentre.ie; a.j.rothwell@swansea.ac.uk; roy.youdale@bristol.ac.uk TC 8 Z9 8 PD DEC PY 2023 IS 21 BP 87 EP 102 DI 10.5565/rev/tradumatica.344 WC Linguistics WE Emerging Sources Citation Index (ESCI) SC Linguistics UT WOS:001134206200010 DA 2026-04-14 ER PT J AU Mrinalini, K Vijayalakshmi, P Nagarajan, T AF Mrinalini, K. Vijayalakshmi, P. Nagarajan, T. TI SBSim: A Sentence-BERT Similarity-Based Evaluation Metric for Indian Language Neural Machine Translation Systems SO IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING DT Article DE Measurement; Speech processing; Correlation; Computational modeling; Surface morphology; Machine translation; Linguistics; Automatic evaluation; human correlation; individual metric; paraphrase-BERT; sentence-BERT AB Machine translation (MT) outputs are widely scored using automatic evaluation metrics and human evaluation scores. The automatic evaluation metrics are expected to be easily computable and a reflection of human evaluation. Traditional string-based metrics such as BLEU, ChrF++ scores, are widely used to evaluate MT systems, but fail to account for synonyms that appear in the state-of-the-art neural machine translation (NMT) systems, owing to their inability to evaluate paraphrases. While similarity-based metrics such as Yisi, BERTScore address this issue, these metrics need to be modified to better evaluate morphologically rich Indian languages such as, Tamil and Hindi. The current work proposes a novel and individual sentence-BERT based similarity (SBSim) metric, that makes use of a paraphrase-BERT model and sentence-level embedding to evaluate NMT outputs. The effectiveness of the BLEU, ChrF++, Yisi, BERTScore, and the proposed SBSim are evaluated on English-to-Tamil and English-to-Hindi NMT outputs. 
The sentence-level metric correlation of the proposed SBSim metric with respect to human scores is observed to outperform the existing metrics with a correlation of 0.9123 and 0.9052 for English-to-Tamil and English-to-Hindi NMT systems, respectively. Further, the average metric correlation of the SBSim metric is also observed to be the highest with a value of 0.9801 and 0.9836 for these NMT systems, respectively. The proposed metric is also evaluated on WMT2020 dataset and reports the highest correlation of 0.7129 with the human scores. C1 [Mrinalini, K.; Vijayalakshmi, P.] Sri Sivasubramaniya Nadar Coll Engn, Chennai 603110, Tamil Nadu, India. [Nagarajan, T.] Shiv Nadar Univ, Chennai 603110, Tamil Nadu, India. C3 SSN College of Engineering; Shiv Nadar University RP Mrinalini, K (corresponding author), Sri Sivasubramaniya Nadar Coll Engn, Chennai 603110, Tamil Nadu, India. EM mrinalinik@ssn.edu.in; vijayalakshmip@ssn.edu.in; nagarajant@snuchennai.edu.in TC 12 Z9 14 PY 2022 VL 30 BP 1396 EP 1406 DI 10.1109/TASLP.2022.3161160 WC Acoustics; Engineering, Electrical & Electronic WE Science Citation Index Expanded (SCI-EXPANDED) SC Acoustics; Engineering UT WOS:000783540100004 DA 2026-04-14 ER PT J AU Kim, JK Chua, M Rickard, M Lorenzo, A AF Kim, Jin K. Chua, Michael Rickard, Mandy Lorenzo, Armando TI ChatGPT and large language model (LLM) chatbots: The current state of acceptability and a proposal for guidelines on utilization in academic medicine SO JOURNAL OF PEDIATRIC UROLOGY DT Review DE Large language model; ChatGPT; Generative pre-trained transformer; Artificial intelligence AB Introduction: There is currently no clear consensus on the standards for using large language models such as ChatGPT in academic medicine. 
Hence, we performed a scoping review of available literature to understand the current state of LLM use in medicine and to provide a guideline for future utilization in academia. Materials and methods: A scoping review of the literature was performed through a Medline search on February 16, 2023 using a combination of keywords including artificial intelligence, machine learning, natural language processing, generative pretrained transformer, ChatGPT, and large language model. There were no restrictions to language or date of publication. Records not pertaining to LLMs were excluded. Records pertaining to LLM ChatBots and ChatGPT were identified and evaluated separately. Among the records pertaining to LLM ChatBots and ChatGPT, those that suggest recommendations for ChatGPT use in academia were utilized to create guideline statements for ChatGPT and LLM use in academic medicine. Results: A total of 87 records were identified. 30 records were not pertaining to large language models and were excluded. 54 records underwent a full-text review for evaluation. 
There were 33 records related to LLM ChatBots or ChatGPT. Discussion: From assessing these texts, five guideline statements for LLM use were developed: (1) ChatGPT/LLM cannot be cited as an author in scientific manuscripts; (2) If ChatGPT/LLM is considered for use in academic work, author(s) should have at least a basic understanding of what ChatGPT/LLM is; (3) Do not use ChatGPT/LLM to produce the entirety of the text in manuscripts; humans must be held accountable for use of ChatGPT/LLM and contents created by ChatGPT/LLM should be meticulously verified by humans; (4) ChatGPT/LLMs may be used for editing and refining of text; (5) Any use of ChatGPT/LLM should be transparent and should be clearly outlined in scientific manuscripts and acknowledged. Conclusion: Future authors should remain mindful of the potential impact their academic work may have on healthcare and continue to uphold the highest ethical standards and integrity when utilizing ChatGPT/LLM. C1 [Kim, Jin K.; Chua, Michael; Rickard, Mandy; Lorenzo, Armando] Hosp Sick Children, Dept Surg, Div Urol, Toronto, ON, Canada. [Kim, Jin K.; Chua, Michael; Lorenzo, Armando] Univ Toronto, Dept Surg, Div Urol, Toronto, ON, Canada. [Chua, Michael] St Lukes Med Ctr, Inst Urol, Quezon City, Philippines. [Kim, Jin K.] Hosp Sick Children, Div Urol, 555 Univ Ave, Toronto, ON M5G 1X8, Canada. C3 University of Toronto; Hospital for Sick Children (SickKids); University of Toronto; Saint Lukes Medical Center - Philippines; University of Toronto; Hospital for Sick Children (SickKids) RP Kim, JK (corresponding author), Hosp Sick Children, Div Urol, 555 Univ Ave, Toronto, ON M5G 1X8, Canada. 
EM jjk.kim@mail.utoronto.ca TC 157 Z9 161 PD OCT PY 2023 VL 19 IS 5 BP 598 EP 604 DI 10.1016/j.jpurol.2023.05.018 WC Pediatrics; Urology & Nephrology WE Science Citation Index Expanded (SCI-EXPANDED) SC Pediatrics; Urology & Nephrology UT WOS:001086204600001 DA 2026-04-14 ER PT J AU Araghi, S Palangkaraya, A AF Araghi, Sahar Palangkaraya, Alfons TI The link between translation difficulty and the quality of machine translation: a literature review and empirical investigation SO LANGUAGE RESOURCES AND EVALUATION DT Article DE Machine translation; Human translation; Translation difficulty; Automatic machine translation evaluation AB We survey the relevant literature on translation difficulty and automatic evaluation of machine translation (MT) quality and investigate whether source text's translation difficulty features contain any information about MT quality. We analyse the 2017-2019 Conferences on Machine Translation (WMT) data of machine translation quality of English news text translated to eleven different languages (Chinese, Czech, Estonian, Finnish, Latvian, Lithuanian, German, Gujarati, Kazakh, Russian, and Turkish). We find (weak) negative correlation between the source text's length, polysemy and structural complexity and the corresponding human evaluated quality of machine translation. This suggests a potentially important but measurable influence of source text's translation difficulty on MT quality. C1 [Araghi, Sahar; Palangkaraya, Alfons] Swinburne Univ Technol, Ctr Transformat Innovat, Melbourne, Australia. C3 Swinburne University of Technology RP Araghi, S (corresponding author), Swinburne Univ Technol, Ctr Transformat Innovat, Melbourne, Australia. 
EM sahararaghi@gmail.com; apalangkaraya@swin.edu.au TC 0 Z9 0 PD DEC 1 PY 2024 VL 58 IS 4 BP 1093 EP 1114 DI 10.1007/s10579-024-09735-x EA JUN 2024 WC Computer Science, Interdisciplinary Applications WE Science Citation Index Expanded (SCI-EXPANDED) SC Computer Science UT WOS:001242772400001 DA 2026-04-14 ER PT J AU Bhuvaneswari, K Varalakshmi, M AF Bhuvaneswari, Kumar Varalakshmi, Murugesan TI Efficient incremental training using a novel NMT-SMT hybrid framework for translation of low-resource languages SO FRONTIERS IN ARTIFICIAL INTELLIGENCE DT Article DE hybrid NMT-SMT; incremental training; beam search; SMT phrase table; low-resource languages ID MACHINE TRANSLATION AB The data-hungry statistical machine translation (SMT) and neural machine translation (NMT) models offer state-of-the-art results for languages with abundant data resources. However, extensive research is imperative to make these models perform equally well for low-resource languages. This paper proposes a novel approach to integrate the best features of the NMT and SMT systems for improved translation performance of low-resource English-Tamil language pair. The suboptimal NMT model trained with the small parallel corpus translates the monolingual corpus and selects only the best translations, to retrain itself in the next iteration. The proposed method employs the SMT phrase-pair table to determine the best translations, based on the maximum match between the words of the phrase-pair dictionary and each of the individual translations. This repeating cycle of translation and retraining generates a large quasi-parallel corpus, thus making the NMT model more powerful. SMT-integrated incremental training demonstrates a substantial difference in translation performance as compared to the existing approaches for incremental training. The model is strengthened further by adopting a beam search decoding strategy to produce k best possible translations for each input sentence. 
Empirical findings prove that the proposed model with BLEU scores of 19.56 and 23.49 outperforms the baseline NMT with scores 11.06 and 17.06 for Eng-to-Tam and Tam-to-Eng translations, respectively. METEOR score evaluation further corroborates these results, proving the supremacy of the proposed model. C1 [Bhuvaneswari, Kumar] Vellore Inst Technol, Sch Comp Sci Engn & Informat Syst, Vellore, Tamil Nadu, India. [Varalakshmi, Murugesan] Vellore Inst Technol, Sch Comp Sci & Engn, Vellore, Tamil Nadu, India. C3 Vellore Institute of Technology (VIT); VIT Vellore; Vellore Institute of Technology (VIT); VIT Vellore RP Varalakshmi, M (corresponding author), Vellore Inst Technol, Sch Comp Sci & Engn, Vellore, Tamil Nadu, India. EM mvaralakshmi@vit.ac.in TC 7 Z9 9 PD SEP 25 PY 2024 VL 7 AR 1381290 DI 10.3389/frai.2024.1381290 WC Computer Science, Artificial Intelligence; Computer Science, Information Systems WE Emerging Sources Citation Index (ESCI) SC Computer Science UT WOS:001331447200001 DA 2026-04-14 ER PT J AU Tafa, TO Hashim, SZM Othman, MS Alhussian, H Nasser, M Abdulkadir, SJ Huspi, SH Adeyemo, SO Bena, YA AF Tafa, Taofik O. Hashim, Siti Zaiton Mohd Othman, Mohd Shahizan Alhussian, Hitham Nasser, Maged Abdulkadir, Said Jadid Huspi, Sharin Hazlin Adeyemo, Sarafa O. Bena, Yunusa Adamu TI Machine Translation Performance for Low-Resource Languages: A Systematic Literature Review SO IEEE ACCESS DT Review DE Translation; Transformers; Multilingual; Systematic literature review; Data models; Adaptation models; Transfer learning; Measurement; Training; Accuracy; Low-resource languages; machine translation; machine translation performance; machine translation techniques; systematic literature review ID MODELS AB Machine translation (MT) for low-resource languages continues to face significant challenges because of limited digital resources and parallel corpora, despite remarkable developments in neural machine translation (NMT). 
Addressing these challenges requires a thorough review of existing research to identify effective strategies and methods. To achieve this, a systematic literature review (SLR) is conducted following PRISMA guidelines and systematically analysing studies published in various academic databases in the last five years (between 2020 and 2024). A total of 69 relevant articles were examined to evaluate the performance of MT, explore persistent challenges and assess the effectiveness of proposed or used solutions. The analysis shows that while NMT has emerged as the predominant approach, its effectiveness is often reduced by the scarcity of training data and the structural complexity of low-resource languages. Strategies such as active learning, data augmentation, multilingual models and transfer learning are identified as critical for improving translation performance. Additionally, emerging research trends, including data pre-processing, decoder optimization and rule-based approaches, demonstrate promising directions for addressing existing limitations. In terms of evaluation, most of the studies used Character n-gram F-score (ChrF), Translation Edit Rate (TER), Metric for Evaluation of Translation with Explicit Ordering (METEOR), Word Error Rate (WER) and Bilingual Evaluation Understudy (BLEU) as techniques' validation metrics. This review provides a detailed evaluation of the current state of MT for low-resource languages and emphasizes the need for further research into underrepresented languages and the development of comprehensive datasets. C1 [Tafa, Taofik O.; Hashim, Siti Zaiton Mohd; Othman, Mohd Shahizan; Huspi, Sharin Hazlin; Adeyemo, Sarafa O.; Bena, Yunusa Adamu] Univ Teknol Malaysia, Fac Comp, Johor Baharu 81310, Malaysia. [Tafa, Taofik O.; Adeyemo, Sarafa O.] Fed Coll Educ Tech Gusau, Dept Comp Sci, Gusau 632101, Nigeria. 
[Alhussian, Hitham; Nasser, Maged; Abdulkadir, Said Jadid] Univ Teknol PETRONAS, Dept Comp & Informat Sci, Seri Iskandar 32610, Perak, Malaysia. [Bena, Yunusa Adamu] Kebbi State Univ Sci & Technol, Fac Engn, Aliero, Nigeria. C3 Universiti Teknologi Malaysia; Universiti Teknologi Petronas RP Tafa, TO (corresponding author), Univ Teknol Malaysia, Fac Comp, Johor Baharu 81310, Malaysia.; Tafa, TO (corresponding author), Fed Coll Educ Tech Gusau, Dept Comp Sci, Gusau 632101, Nigeria.; Nasser, M (corresponding author), Univ Teknol PETRONAS, Dept Comp & Informat Sci, Seri Iskandar 32610, Perak, Malaysia. EM tafa@graduate.utm.my; maged.nasser@utp.edu.my TC 1 Z9 2 PY 2025 VL 13 BP 72486 EP 72505 DI 10.1109/ACCESS.2025.3562918 WC Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications WE Science Citation Index Expanded (SCI-EXPANDED) SC Computer Science; Engineering; Telecommunications UT WOS:001479471700015 DA 2026-04-14 ER PT J AU Shafayat, S Yoon, D Choi, J Jang, W Jung, S AF Shafayat, Sheikh Yoon, Dongkeun Choi, Jiwoo Jang, Woori Jung, Seohyon TI Evaluating English-Korean Literary Machine Translations: A Dataset Featuring the RULER and VERSE Annotation Methods SO JOURNAL OF OPEN HUMANITIES DATA DT Article DE evaluating literary translation; machine translation; English to Korean translation; annotation dataset; computational literary studies AB This dataset offers a novel, high-resolution resource for evaluating English-to-Korean machine translations of literary texts. It is uniquely positioned to support a wide range of research and development efforts across machine translation, evaluation methodology, literary studies, and human-AI interaction. By combining two original annotation schemes developed by our team (RULER and VERSE) with paragraph-level alignment, the dataset provides multiple points of entry for both computational and humanities-based inquiry. 
C1 [Shafayat, Sheikh; Jung, Seohyon] Korea Adv Inst Sci & Technol, Sch Comp, Daejeon, South Korea. [Yoon, Dongkeun] Korea Adv Inst Sci & Technol, Grad Sch AI, Seoul, South Korea. [Choi, Jiwoo; Jang, Woori; Jung, Seohyon] Korea Adv Inst Sci & Technol, Sch Digital Humanities & Computat Social Sci, Daejeon, South Korea. C3 Korea Advanced Institute of Science & Technology (KAIST); Korea Advanced Institute of Science & Technology (KAIST); Korea Advanced Institute of Science & Technology (KAIST) RP Jung, S (corresponding author), Korea Adv Inst Sci & Technol, Sch Comp, Daejeon, South Korea.; Jung, S (corresponding author), Korea Adv Inst Sci & Technol, Sch Digital Humanities & Computat Social Sci, Daejeon, South Korea. EM seohyon.jung@kaist.ac.kr TC 0 Z9 0 PY 2025 VL 11 AR 62 DI 10.5334/johd.393 WC Humanities, Multidisciplinary; Social Sciences, Interdisciplinary WE Emerging Sources Citation Index (ESCI) SC Arts & Humanities - Other Topics; Social Sciences - Other Topics UT WOS:001652162700001 DA 2026-04-14 ER PT J AU You, M Zhang, J Wong, DF Lan, KX AF You, Mu Zhang, Jing Wong, Derek F. Lan, Kaixin TI How well can state-of-the-art machine translation systems render a 16th-century Chinese novel? SO CADERNOS DE TRADUCAO DT Article DE machine translation; Chinese-Portuguese literary translation; Journey to the West; human evaluation; culture-specific item ID QUALITY ASSESSMENT AB This study evaluates the performance of state-of-the-art machine translation systems in rendering Journey to the West, a culturally rich 16th-century Chinese novel, into Portuguese. Employing a mixed-methods approach, we compare translations produced by DeepSeek-V3, GPT-4o, DeepL Pro, and NovelTrans-J against a published human translation. Quantitative assessments conducted by an expert evaluator examine accuracy, fluency, stylistic elegance, cultural appropriateness, and overall translation quality at both the sentence and chunk levels. 
The results reveal that three MT systems (DeepSeek-V3, GPT-4o, and NovelTrans-J) produce translations of comparable or superior quality to the human translation. Among them, NovelTrans-J consistently outperforms all other participants, particularly in terms of cultural appropriateness. In contrast, DeepL Pro demonstrates significantly weaker performance across all evaluated dimensions. To complement the quantitative analysis, a qualitative investigation focuses on the rendering of culture-specific items (CSIs). NovelTrans-J exhibits outstanding performance, producing the fewest mistranslations and uniquely providing explanatory notes that facilitate reader comprehension. DeepSeek-V3 and GPT-4o also handle CSIs competently, though with less consistency, while DeepL Pro struggles considerably, showing a high rate of CSI mistranslations and generally low quality. Interestingly, the human translation also contains notable CSI-related errors, particularly in cases involving semantically opaque expressions, an area in which all participants encounter significant difficulty. These findings underscore the growing potential of MT systems to handle complex, culturally rich literary texts, although certain challenges, such as the translation of semantically opaque expressions, remain significant obstacles. We hope this study provides an updated perspective on the current capabilities of MT and offers practical insights to guide the development of future systems that can more accurately capture and transmit the distinctive cultural nuances embedded in literary works. C1 [You, Mu; Zhang, Jing; Wong, Derek F.; Lan, Kaixin] Univ Macau, Macau, Peoples R China. C3 University of Macau RP You, M (corresponding author), Univ Macau, Macau, Peoples R China. 
EM youmuafonso@gmail.com; jingz@um.edu.mo; derekfw@um.edu.mo; nlp2ct.kaixin@gmail.com TC 1 Z9 1 PY 2025 VL 45 SI SI AR e108394 DI 10.5007/2175-7968.2025.e108394 WC Language & Linguistics WE Emerging Sources Citation Index (ESCI) SC Linguistics UT WOS:001594966300002 DA 2026-04-14 ER PT J AU Mah, SH AF Mah, Seung-Hye TI Defining language dependent post-editing guidelines for specific content The case of the English-Korean pair to improve literature machine translation styles SO BABEL-REVUE INTERNATIONALE DE LA TRADUCTION-INTERNATIONAL JOURNAL OF TRANSLATION DT Article DE machine translation (MT); machine translation post-editing (MTPE); literature MT; style guideline in literature MTPE; differences between English and Korean AB The rapid development of neural machine translation systems and the emergence of the e-book have broadened the scope of text types that can be translated by machines. At the early stage of the machine's infiltration into the translation field, target texts were mainly technical texts such as patents, instruction manuals, etc. Literary texts have been considered as the last bastion of human translation because the machine translation (MT) has produced word-for-word translation, unsuitable for literary texts with distinct stylistic elements. However, it turns out that the field of literary translation was not immune to the rise of MT. Style is one of the critical elements in literary texts, but it has been dismissed in the existing MT post-editing guidelines. Therefore, this research attempts to provide methodological ideas about how to come up with a machine translation post-editing guideline (MTPE) for style improvement especially for language pairs with divergent syntax and semantics like English and Korean. First, the linguistic and cultural differences in writing styles are sorted out based on previous research. Second, the different ways in which human translators address writing style are investigated. 
Third, the strategies that human translators employ in their translations are applied to machine translation post-editing to demonstrate how the strategies can be incorporated into English-Korean MTPE to improve style. This preliminary research would lay the groundwork for refining post-editing style guidelines and for accumulating manually post-edited data for style improvement, which would be conducive to building and customizing automatic post-editing systems. C1 [Mah, Seung-Hye] Hankuk Univ Foreign Studies, Yongin, Gyeonggi Do, South Korea. C3 Hankuk University Foreign Studies RP Mah, SH (corresponding author), Hankuk Univ Foreign Studies, Sch English Interpretat & Translat, 81 Oedae Ro,17035 Cheoin Gu, Yongin, Gyeonggi Do, South Korea. EM shm213@gmail.com TC 6 Z9 6 PY 2020 VL 66 IS 4-5 BP 811 EP 828 DI 10.1075/babel.00174.mah WC Linguistics; Language & Linguistics WE Social Science Citation Index (SSCI); Arts & Humanities Citation Index (A&HCI) SC Linguistics UT WOS:000593026800018 DA 2026-04-14 ER PT J AU Martin, A Roussi, K Rice, H Bertuzzi, A AF Martin, Alison Roussi, Kalliopi Rice, Hannah Bertuzzi, Andrea TI EVALUATION OF THE USE OF CHATGPT IN DATA EXTRACTION FOR SYSTEMATIC LITERATURE REVIEWS: FRIEND, FOE, OR FICTION? SO VALUE IN HEALTH DT Meeting Abstract C1 [Martin, Alison; Roussi, Kalliopi; Rice, Hannah; Bertuzzi, Andrea] Crystallise Ltd, Colchester, England. 
TC 0 Z9 0 PD DEC PY 2025 VL 28 IS 12 SU 1 MA MSR100 WC Economics; Health Care Sciences & Services; Health Policy & Services WE Science Citation Index Expanded (SCI-EXPANDED); Social Science Citation Index (SSCI) SC Business & Economics; Health Care Sciences & Services UT WOS:001664359800131 DA 2026-04-14 ER PT J AU Alkaoud, M AF Alkaoud, Mohamed TI A bilingual benchmark for evaluating large language models SO PEERJ COMPUTER SCIENCE DT Article DE Natural language processing; Large language models; Multilingual NLP; LLM evaluation; Arabic NLP; ChatGPT AB This work introduces a new benchmark for the bilingual evaluation of large language models (LLMs) in English and Arabic. While LLMs have transformed various fields, their evaluation in Arabic remains limited. This work addresses this gap by proposing a novel evaluation method for LLMs in both Arabic and English, allowing for a direct comparison between the performance of the two languages. We build a new evaluation dataset based on the General Aptitude Test (GAT), a standardized test widely used for university admissions in the Arab world, that we utilize to measure the linguistic capabilities of LLMs. We conduct several experiments to examine the linguistic capabilities of ChatGPT and quantify how much better it is at English than Arabic. We also examine the effect of changing task descriptions from Arabic to English and vice-versa. In addition to that, we find that fastText can surpass ChatGPT in finding Arabic word analogies. We conclude by showing that GPT-4 Arabic linguistic capabilities are much better than ChatGPT's Arabic capabilities and are close to ChatGPT's English capabilities. C1 [Alkaoud, Mohamed] King Saud Univ, Coll Comp & Informat Sci, Dept Comp Sci, Riyadh, Saudi Arabia. C3 King Saud University RP Alkaoud, M (corresponding author), King Saud Univ, Coll Comp & Informat Sci, Dept Comp Sci, Riyadh, Saudi Arabia. 
EM malkaoud@ksu.edu.sa TC 7 Z9 8 PD FEB 29 PY 2024 VL 10 AR e1893 DI 10.7717/peerj-cs.1893 WC Computer Science, Artificial Intelligence; Computer Science, Information Systems; Computer Science, Theory & Methods WE Science Citation Index Expanded (SCI-EXPANDED) SC Computer Science UT WOS:001174942400004 DA 2026-04-14 ER PT J AU Asebel, MH Assefa, SG Haile, MA AF Asebel, Muluken Hussen Assefa, Shimelis Getu Haile, Mesfin Abebe TI Exploring the evolution and future prospects of Amharic to English machine translation: a systematic review SO FRONTIERS IN ARTIFICIAL INTELLIGENCE DT Review DE machine translation; Amharic; English; systematic review; low-resource languages AB Introduction: In the last couple of decades, Amharic-English translation has greatly improved from a rule-based approach to contemporary systems that apply neural networks. Even after these advancements, problems remain because of the Amharic language's resource-scarce nature, such as inadequate datasets, tools for working with the language, and the intricate semantics and grammar of Amharic as compared to English. This systematic review seeks to analyze the evolution of the Amharic-English machine translation, the prominent ongoing difficulties, the noteworthy research undertakings, and the prospects of the research focus. Methods: This review uses a systematic approach to study the literature on Amharic-English machine translation. Important documents were retrieved from academic websites, and those with relevance to the methodologies of machine translation, language resources development, and evaluation practices were chosen. Primarily, the focus was on both statistical and neural machine translation models, especially those with transformer structures. Results: The initial attempts to translate English to Amharic and vice-versa relied on statistical machine translation (SMT), which set the stage for the evolution to neural machine translation (NMT). 
The use of transformer models has impacted the accuracy and fluidity of translations tremendously. Still, there is a lack of sufficient parallel corpora, effective methods for tokenization of Amharic, and other resources. Recently, the focus has been on creating new datasets, improving token-level engineering, and modifying NMT models for Amharic's complex morphological structure. Discussion: The complete solutions for enhancing Amharic-English translation remain elusive and include the lack of sufficient data, semantic correspondence, and grammatical consistency within and across translations. Pursuable avenues include augmentation of data, tokenization on the language level, and incorporation of linguistic elements into the parallel corpora. In addition, creating effective evaluation frameworks along with comprehensive linguistic data is important for assessing and improving translation tools. With these changes, cross-cultural interaction and increasing accessibility to modern technologies will be achieved. C1 [Asebel, Muluken Hussen; Haile, Mesfin Abebe] Adama Sci & Technol Univ, Sch Elect Engn & Comp, Dept Comp Sci & Engn, Adama, Ethiopia. [Assefa, Shimelis Getu] Univ Denver, Morgridge Coll Educ, Dept Res Methods & Informat Sci, Denver, CO USA. C3 Adama Science & Technology University; University of Denver RP Asebel, MH (corresponding author), Adama Sci & Technol Univ, Sch Elect Engn & Comp, Dept Comp Sci & Engn, Adama, Ethiopia. 
EM muluken2@gmail.com TC 0 Z9 0 PD MAY 23 PY 2025 VL 8 AR 1456245 DI 10.3389/frai.2025.1456245 WC Computer Science, Artificial Intelligence; Computer Science, Information Systems WE Emerging Sources Citation Index (ESCI) SC Computer Science UT WOS:001504441500001 DA 2026-04-14 ER PT J AU Leiter, C Lertvittayakumjorn, P Fomicheva, M Zhao, W Gao, Y Eger, S AF Leiter, Christoph Lertvittayakumjorn, Piyawat Fomicheva, Marina Zhao, Wei Gao, Yang Eger, Steffen TI Towards Explainable Evaluation Metrics for Machine Translation SO JOURNAL OF MACHINE LEARNING RESEARCH DT Article DE evaluation metrics; explainability; interpretability; machine translation; machine translation evaluation AB Unlike classical lexical overlap metrics such as BLEU, most current evaluation metrics for machine translation (for example, COMET or BERTScore) are based on black-box large language models. They often achieve strong correlations with human judgments, but recent research indicates that the lower-quality classical metrics remain dominant, one of the potential reasons being that their decision processes are more transparent. To foster more widespread acceptance of novel high-quality metrics, explainability thus becomes crucial. In this concept paper, we identify key properties as well as key goals of explainable machine translation metrics and provide a comprehensive synthesis of recent techniques, relating them to our established goals and properties. In this context, we also discuss the latest state-of-the-art approaches to explainable metrics based on generative models such as ChatGPT and GPT4. Finally, we contribute a vision of next-generation approaches, including natural language explanations. We hope that our work can help catalyze and guide future research on explainable evaluation metrics and, mediately, also contribute to better and more transparent machine translation systems. C1 [Leiter, Christoph] Univ Mannheim, Nat Language Learning Grp, B6 26, D-68159 Mannheim, Germany. 
[Lertvittayakumjorn, Piyawat] Imperial Coll London, London, England. [Fomicheva, Marina] Univ Sheffield, Sheffield, England. [Zhao, Wei] Heidelberg Inst Theoret Studies, Heidelberg, Germany. [Gao, Yang] Royal Holloway Univ London, Egham, England. [Eger, Steffen] Univ Mannheim, Mannheim, Germany. C3 University of Mannheim; Imperial College London; University of Sheffield; Heidelberg Institute for Theoretical Studies; University of London; Royal Holloway University London; University of Mannheim RP Leiter, C (corresponding author), Univ Mannheim, Nat Language Learning Grp, B6 26, D-68159 Mannheim, Germany. EM christoph.leiter@uni-mannheim.de; pl1515@imperial.ac.uk; m.fomicheva@sheffield.ac.uk; wei.zhao@h-its.org; gaostayyang@google.com; steffen.eger@uni-mannheim.de TC 8 Z9 12 PY 2024 VL 25 AR 75 WC Automation & Control Systems; Computer Science, Artificial Intelligence WE Science Citation Index Expanded (SCI-EXPANDED) SC Automation & Control Systems; Computer Science UT WOS:001204542600001 DA 2026-04-14 ER PT J AU Sonntagbauer, M Haar, M Kluge, S AF Sonntagbauer, Michael Haar, Markus Kluge, Stefan TI Artificial intelligence: How will ChatGPT and other AI applications change our everyday medical practice? SO MEDIZINISCHE KLINIK-INTENSIVMEDIZIN UND NOTFALLMEDIZIN DT Article DE Big data; Diagnostic support systems; Machine learning; Digitization in medicine; Large language models AB Background: With the free provision of the chat robot "ChatGPT" by the company OpenAI in November 2022, an application of artificial intelligence (AI) became tangible for everyone. Objectives: An explanation of the basic functionality of large language models (LLM) is given, followed by a presentation of application options of ChatGPT in medicine, and an outlook and discussion of possible dangers of AI applications. Methods: Problem solving with ChatGPT using concrete examples. 
Analysis and discussion of the available scientific literature. Results: There has been a significant increase in the use of AI applications in scientific work, especially in scientific writing. Wide application of LLM in writing medical documentation is conceivable. Technical functionality allows the use of AI applications as a diagnostic support system. There is a risk of spreading and entrenching inaccuracies and bias through application of LLM. Regulation of this new technology is pending. Conclusion: AI applications such as ChatGPT have the potential to permanently change everyday medical practice. An examination of this technology and evaluation of opportunities and risks is warranted. C1 [Sonntagbauer, Michael; Haar, Markus; Kluge, Stefan] Univ Klinikum Hamburg Eppendorf, Klin Intens Med, Hamburg, Germany. [Sonntagbauer, Michael] Univ Klinikum Hamburg Eppendorf, Klin Intens Med, 10 Raum 02-5-062-1,Martinistr 52, D-20251 Hamburg, Germany. C3 University of Hamburg; University Medical Center Hamburg-Eppendorf; University of Hamburg; University Medical Center Hamburg-Eppendorf RP Sonntagbauer, M (corresponding author), Univ Klinikum Hamburg Eppendorf, Klin Intens Med, 10 Raum 02-5-062-1,Martinistr 52, D-20251 Hamburg, Germany. 
EM m.sonntagbauer@uke.de TC 16 Z9 18 PD JUN PY 2023 VL 118 IS 5 SI SI BP 366 EP 371 DI 10.1007/s00063-023-01019-6 EA APR 2023 WC Medicine, General & Internal WE Science Citation Index Expanded (SCI-EXPANDED) SC General & Internal Medicine UT WOS:000981342600003 DA 2026-04-14 ER PT J AU Zhang, XM AF Zhang, Xiaomei TI Application of Minimax Optimization Mechanism in Chinese-English Machine Translation Quality Estimation SO IEEE ACCESS DT Article DE Feature extraction; Translation; Predictive models; Estimation; Optimization; Generators; Reliability; Context modeling; Computational modeling; Vectors; Machine translation quality estimation; neural machine translation; generator-discriminator model; convolutional neural networks AB Machine Translation Quality Estimation (MTQE) is pivotal in bridging the gap between machine-generated translations and human translation quality, especially in real-time applications where post-editing is not feasible. Despite advancements with Neural Machine Translation (NMT), challenges such as mistranslation, omissions, and over-translation persist. Traditional MTQE models often suffer from incoherent optimization goals due to their dual-phase architecture, limiting their effectiveness. In this study, we introduce a novel approach that integrates minimax optimization to unify the prediction and estimation phases under a common optimization goal. Utilizing the Ranger optimizer, our model comprises a generator based on the T5 (Text-to-Text Transfer Transformer) and a discriminator leveraging Convolutional Neural Networks (ConvNet). Additionally, we incorporate a data reliability screening module to ensure the discriminator is trained on high-quality data. Experiments conducted on the Chinese-English corpus demonstrate the superiority of the proposed approach. Furthermore, the generator maintains comparable BLEU scores to baseline models, confirming that the minimax optimization mechanism does not compromise its performance. 
Ablation studies highlight the optimal settings for the optimizer, pooling strategies, and learning rates, underscoring the importance of data reliability screening in achieving stable training outcomes. Our findings indicate that minimax optimization is a viable strategy for enhancing MTQE models, offering a path toward more accurate and reliable translation quality assessments. C1 [Zhang, Xiaomei] Lyuliang Univ, Dept Foreign Languages, Lvliang 033001, Shanxi, Peoples R China. C3 Lvliang University RP Zhang, XM (corresponding author), Lyuliang Univ, Dept Foreign Languages, Lvliang 033001, Shanxi, Peoples R China. EM uprfoghinm62@gmx.com TC 0 Z9 0 PY 2025 VL 13 BP 19026 EP 19039 DI 10.1109/ACCESS.2025.3533656 WC Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications WE Science Citation Index Expanded (SCI-EXPANDED) SC Computer Science; Engineering; Telecommunications UT WOS:001410187600009 DA 2026-04-14 ER PT J AU Wang, HJ Du, XJ Hao, XK AF Wang, Huijun Du, Xiaojing Hao, Xinkun TI RMixDA: A Random Compound Data Augmentation Operation Framework for Neural Machine Translation SO IEEE ACCESS DT Article DE Data augmentation; Robustness; Noise; Data models; Translation; Training; Heuristic algorithms; Compounds; Neural machine translation; Vectors; Contrastive learning; data augmentation; neural machine translation; random walk AB In Neural Machine Translation (NMT), data augmentation is an effective method for improving model robustness by generating diverse augmented data from existing datasets. Typically, the quality of augmented data is evaluated based on its similarity and diversity to training data. Current approaches often rely on a single type of operation, limiting their ability to fully capture the range of possible variations in real-world data. This limitation will affect the robustness and generalization of the model. 
To address this, we propose RMixDA: a Random Mixed Algorithm Framework For Data Augmentation, which dynamically integrates multiple NMT-oriented data augmentation methods. RMixDA defines an extensible collection of multi-augmentation operations using the Backus-Naur Form (BNF) framework, which categorizes data augmentation methods into pre-embedding (pre_embed_op), embedding (embed_op), and post-embedding (post_embed_op) levels. These operations are combined using a random walk algorithm to generate diverse augmented data, for maintaining both similarity and diversity. We further incorporate contrastive learning, identifying augmented data as positive or negative samples to evaluate similarity more effectively. Additionally, a novel evaluation method for augmented data quality is introduced, which is regularized by the NMT model's loss function to improve the positive correlation between augmented data quality and model training. Extensive experimental studies demonstrate the effectiveness of RMixDA and its scalability for integrating diverse data augmentation operations. On the IWSLT14 German-English and WMT14 English-German benchmarks, RMixDA achieves BLEU scores of 37.87 and 29.13, respectively, outperforming state-of-the-art methods by up to 3.44 and 1.83. This framework showcases practical utility in real-world NMT tasks, particularly for enhancing translation quality in low-resource language scenarios. C1 [Wang, Huijun; Hao, Xinkun] Guangxi Univ, Sch Comp Elect Informat, Nanning 530000, Peoples R China. [Du, Xiaojing] Univ South Australia, STEM, Adelaide, SA 5000, Australia. C3 Guangxi University; Adelaide University; University of South Australia RP Du, XJ (corresponding author), Univ South Australia, STEM, Adelaide, SA 5000, Australia. 
EM xiaojing.du@mymail.unisa.edu.au TC 0 Z9 0 PY 2025 VL 13 BP 105101 EP 105112 DI 10.1109/ACCESS.2025.3580635 WC Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications WE Science Citation Index Expanded (SCI-EXPANDED) SC Computer Science; Engineering; Telecommunications UT WOS:001515605100046 DA 2026-04-14 ER PT J AU Su, AY Knebel, A Xu, AY Kaper, M Schmitt, P Nassar, JE Singh, M Farias, MJ Kim, J Diebo, BG Daniels, AH AF Su, Audrey Y. Knebel, Ashley Xu, Andrew Y. Kaper, Marco Schmitt, Phillip Nassar, Joseph E. Singh, Manjot Farias, Michael J. Kim, Jinho Diebo, Bassel G. Daniels, Alan H. TI Evaluation of retrieval-augmented generation and large language models in clinical guidelines for degenerative spine conditions SO EUROPEAN SPINE JOURNAL DT Article DE Artificial intelligence; Degenerative spinal diseases; Retrieval-augmented generation; Large Language models; ChatGPT-4o; NotebookLM ID CHATGPT AB Purpose: Degenerative spinal diseases often require complex, patient-specific treatment, presenting a compelling challenge for artificial intelligence (AI) integration into clinical practice. While existing literature has focused on ChatGPT-4o performance in individual spine conditions, this study compares ChatGPT-4o, a traditional large language model (LLM), against NotebookLM, a novel retrieval-augmented model (RAG-LLM) supplemented with North American Spine Society (NASS) guidelines, for concordance with all five published NASS guidelines for degenerative spinal diseases. Methods: A total of 118 questions from NASS guidelines regarding five degenerative spinal conditions were presented to ChatGPT-4o and NotebookLM. All responses were scored based on accuracy, evidence-based conclusions, supplementary and complete information. Results: Overall, NotebookLM provided significantly more accurate responses (98.3% vs. 40.7%, p < 0.05), more evidence-based conclusions (99.1% vs. 40.7%, p < 0.05), and more complete information (94.1% vs. 
79.7%, p < 0.05), while ChatGPT-4o provided more supplementary information (98.3% vs. 67.8%, p < 0.05). These discrepancies became most prominent in nonsurgical and surgical interventions, wherein ChatGPT often produced recommendations with unsubstantiated certainty. Conclusion: While RAG-LLMs are a promising tool for clinical decision-making assistance and show significant improvement from prior models, physicians should remain cautious when integrating AI into patient care, especially in the context of nuanced medical scenarios. C1 [Su, Audrey Y.; Knebel, Ashley; Xu, Andrew Y.; Kaper, Marco; Schmitt, Phillip; Nassar, Joseph E.; Singh, Manjot; Farias, Michael J.; Kim, Jinho] Brown Univ, Warren Alpert Med Sch, Providence, RI USA. [Diebo, Bassel G.; Daniels, Alan H.] Brown Univ, Dept Orthopaed Surg, Providence, RI 02912 USA. C3 Brown University; Brown University RP Daniels, AH (corresponding author), Brown Univ, Dept Orthopaed Surg, Providence, RI 02912 USA. EM audrey_su@brown.edu; ashley_knebel@brown.edu; andrew_xu@brown.edu; marco_kaper@brown.edu; phillip_schmitt@brown.edu; jen06@mail.aub.edu; manjot_singh@brown.edu; michael_farias@brown.edu; jinho_kim@brown.edu; dr.basseldiebo@gmail.com; alandanielsmd@gmail.com TC 4 Z9 4 PD MAR PY 2026 VL 35 IS 3 BP 1301 EP 1310 DI 10.1007/s00586-025-08994-8 EA JUL 2025 WC Clinical Neurology; Orthopedics WE Science Citation Index Expanded (SCI-EXPANDED) SC Neurosciences & Neurology; Orthopedics UT WOS:001524804800001 DA 2026-04-14 ER PT J AU Ed-Dali, R AF Ed-Dali, Rachid TI Assessing DeepSeek R1 and ChatGPT 4.5 in Arabic-English literary translation: performance, challenges, and implications SO COGENT ARTS & HUMANITIES DT Article DE AI-assisted literary translation; Arabic-English translation; translation quality evaluation; pragmatic coherence; ChatGPT4.5; DeepSeek R1; Translation; Interpreting; Language and Linguistics; Computational Linguistics ID ARTIFICIAL-INTELLIGENCE; ISSUES AB Artificial Intelligence (AI) tools such as 
DeepSeek R1 and ChatGPT 4.5 have emerged as promising aids in Arabic-English literary translation. This study aims to compare the translation performance of these two systems using a mixed-methods approach. Quantitative analysis was conducted through five established evaluation metrics, BLEU, COMET, METEOR, BERTScore, and sacreBLEU, to assess accuracy, fluency, and semantic coherence. Complementing these measures, a qualitative evaluation was carried out with 80 undergraduate students from Cadi Ayyad University, who critically assessed anonymized AI-generated translations against their versions and published human translations using a structured rubric. Results indicate that DeepSeek R1 achieved consistently higher automated metric scores across literary genres (novels, plays, poems). However, qualitative analysis highlighted persistent challenges in pragmatic coherence, cohesion, and emotional depth, especially in poetry and dramatic texts. While DeepSeek R1 demonstrated potential in lexical accuracy and fluency, significant human intervention remains essential for achieving high-quality literary translations. Future research should integrate larger-scale human evaluations to comprehensively assess the capabilities and limitations of AI translation tools in diverse literary contexts. C1 [Ed-Dali, Rachid] Cadi Ayyad Univ, Fac Letters & Human Sci, Dept English Studies, Marrakech, Morocco. C3 Cadi Ayyad University of Marrakech RP Ed-Dali, R (corresponding author), Cadi Ayyad Univ, Fac Letters & Human Sci, Dept English Studies, Marrakech, Morocco. EM r.eddali@uca.ma TC 2 Z9 2 PD DEC 31 PY 2025 VL 12 IS 1 AR 2531183 DI 10.1080/23311983.2025.2531183 WC Humanities, Multidisciplinary WE Emerging Sources Citation Index (ESCI) SC Arts & Humanities - Other Topics UT WOS:001530330400001 DA 2026-04-14 ER PT J AU Mohamed, SA Abdou, MA Elsayed, AA AF Mohamed, Shereen A. Abdou, Mohamed A. Elsayed, Ashraf A. 
TI Residual Information Flow for Neural Machine Translation SO IEEE ACCESS DT Article DE Information flow; neural machine translation; neural sequence-to-sequence networks; residual connections; WMT14 English-German translation task AB Automatic machine translation plays an important role in reducing language barriers between people speaking different languages. Deep neural networks (DNN) have attained major success in diverse research fields such as computer vision, information retrieval, language modelling, and recently machine translation. Neural sequence-to-sequence networks have accomplished noteworthy progress for machine translation. Inspired by the success achieved by residual connections in different applications, in this work, we introduce a novel NMT model that adopts residual connections to achieve better performing automatic translation. Evaluation of the proposed model has shown an improvement in translation accuracy by 0.3 BLEU compared to the original model, using an ensemble of 5 LSTMs. Regarding training time complexity, the proposed model saves about 33% of the time needed by the original model to train datasets of short sentences. Deeper neural networks of the proposed model have shown a good performance in dealing with the vanishing/exploding problems. All experiments have been performed over NVIDIA Tesla V100 32G Passive GPU and using the WMT14 English-German translation task. C1 [Mohamed, Shereen A.; Elsayed, Ashraf A.] Alexandria Univ, Fac Sci, Dept Math & Comp Sci, Alexandria 21527, Egypt. [Abdou, Mohamed A.] City Sci Res & Technol Applicat, Informat Res Inst, Alexandria 21527, Egypt. [Elsayed, Ashraf A.] Al Alamein Int Univ, Fac Comp Sci & Engn, Al Alamein 21527, Egypt. 
C3 Egyptian Knowledge Bank (EKB); Alexandria University; Egyptian Knowledge Bank (EKB); City of Scientific Research & Technological Applications (SRTA-City); Alamein International University (AIU) RP Mohamed, SA (corresponding author), Alexandria Univ, Fac Sci, Dept Math & Comp Sci, Alexandria 21527, Egypt. EM shereen.nafie@alexu.edu.eg TC 1 Z9 1 PY 2022 VL 10 BP 118313 EP 118320 DI 10.1109/ACCESS.2022.3220691 WC Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications WE Science Citation Index Expanded (SCI-EXPANDED) SC Computer Science; Engineering; Telecommunications UT WOS:000886316000001 DA 2026-04-14 ER PT J AU Munk, M Munkova, D AF Munk, Michal Munkova, Dasa TI Detecting errors in machine translation using residuals and metrics of automatic evaluation SO JOURNAL OF INTELLIGENT & FUZZY SYSTEMS DT Article DE Machine translation; evaluation; residuals; analytical language; inflectional language; MT errors ID MODELS AB Errors and residuals are closely related measures of the deviation. An error is a deviation of the observed value (PEMT output) from the expected value (MT output), while the residual of the observed value is the difference between the observed and predicted value of quality. We propose an exploratory data technique representing an ideal instrument to evaluate and improve machine translation (MT) systems. The main contribution consists of a rigorous technique (a statistical method), novel to the research of MT evaluation given by residual analysis to identify differences between MT output and post-edited machine translation output regarding human translation (reference). The residual analysis of the automatic metrics can help us to discover significant differences between MT and PEMT and to identify questionable issues regarding the one reference. In this study, we show the usage of residuals in MT evaluation. 
Using residual analysis, we identified sentences, in which significant differences were found in the scores of automatic metrics between MT output and post-edited (PE) MT output from Slovak into English. C1 [Munk, Michal] Constantine Philosopher Univ Nitra, Dept Informat, Nitra, Slovakia. [Munkova, Dasa] Constantine Philosopher Univ Nitra, Dept Translat Studies, Stefanikova 67, Nitra 94974, Slovakia. C3 Constantine the Philosopher University in Nitra; Constantine the Philosopher University in Nitra RP Munkova, D (corresponding author), Constantine Philosopher Univ Nitra, Dept Translat Studies, Stefanikova 67, Nitra 94974, Slovakia. EM dmunkova@ukf.sk TC 11 Z9 12 PY 2018 VL 34 IS 5 BP 3211 EP 3223 DI 10.3233/JIFS-169504 WC Computer Science, Artificial Intelligence WE Science Citation Index Expanded (SCI-EXPANDED) SC Computer Science UT WOS:000433204800035 DA 2026-04-14 ER PT J AU Guerberof-Arenas, A Toral, A AF Guerberof-Arenas, Ana Toral, Antonio TI To be or not to be SO TARGET-INTERNATIONAL JOURNAL OF TRANSLATION STUDIES DT Article DE literary translation; translation reception; machine translation; post-editing; narrative engagement; enjoyment; comprehension; creativity ID AUDIOVISUAL TRANSLATION AB This article presents the results of a study focusing on the reception of a fictional story by Kurt Vonnegut translated from English into Catalan and Dutch in three conditions: machine translated, post-edited, and human translated. Participants (n = 223) rated the three conditions using three scales: narrative engagement, enjoyment, and translation reception. The results show that human translation had higher engagement, enjoyment, and translation reception in Catalan, compared to the post-edited and machine-translated translations. However, Dutch readers scored the post-edited translation higher than the human and machine translation, and the highest engagement and enjoyment scores were reported for the original English version. 
We hypothesize that when reading a fictional story in translation, not only are the condition and the quality of the translation key to understanding its reception, but also the participants' reading patterns, reading language, and, potentially, the status of the source language in their own societies. C1 [Guerberof-Arenas, Ana; Toral, Antonio] Univ Groningen, Groningen, Netherlands. [Guerberof-Arenas, Ana] Univ Groningen, Ctr Language & Cognit, Oude Kijk Jatstr 26, NL-9712 EK Groningen, Netherlands. C3 University of Groningen; University of Groningen RP Guerberof-Arenas, A (corresponding author), Univ Groningen, Ctr Language & Cognit, Oude Kijk Jatstr 26, NL-9712 EK Groningen, Netherlands. EM a.guerberof.arenas@rug.nl; a.toral.ruiz@rug.nl TC 5 Z9 8 PD MAY 16 PY 2024 VL 36 IS 2 BP 215 EP 244 DI 10.1075/target.22134.gue EA APR 2024 WC Linguistics; Language & Linguistics WE Social Science Citation Index (SSCI); Arts & Humanities Citation Index (A&HCI) SC Linguistics UT WOS:001225017700004 DA 2026-04-14 ER PT J AU Ho, CN Tian, TFY Ayers, AT Aaron, RE Phillips, V Wolf, RM Mathioudakis, N Dai, TL Klonoff, DC AF Ho, Cindy N. Tian, Tiffany Ayers, Alessandra T. Aaron, Rachel E. Phillips, Vidith Wolf, Risa M. Mathioudakis, Nestoras Dai, Tinglong Klonoff, David C. TI Qualitative metrics from the biomedical literature for evaluating large language models in clinical decision-making: a narrative review SO BMC MEDICAL INFORMATICS AND DECISION MAKING DT Article DE Artificial Intelligence; ChatGPT; Clinical decision-making; Large Language Model; Machine learning ID PERFORMANCE AB Background: The large language models (LLMs), most notably ChatGPT, released since November 30, 2022, have prompted shifting attention to their use in medicine, particularly for supporting clinical decision-making. 
However, there is little consensus in the medical community on how LLM performance in clinical contexts should be evaluated.MethodsWe performed a literature review of PubMed to identify publications between December 1, 2022, and April 1, 2024, that discussed assessments of LLM-generated diagnoses or treatment plans.ResultsWe selected 108 relevant articles from PubMed for analysis. The most frequently used LLMs were GPT-3.5, GPT-4, Bard, LLaMa/Alpaca-based models, and Bing Chat. The five most frequently used criteria for scoring LLM outputs were "accuracy", "completeness", "appropriateness", "insight", and "consistency".ConclusionsThe most frequently used criteria for defining high-quality LLMs have been consistently selected by researchers over the past 1.5 years. We identified a high degree of variation in how studies reported their findings and assessed LLM performance. Standardized reporting of qualitative evaluation metrics that assess the quality of LLM outputs can be developed to facilitate research studies on LLMs in healthcare. C1 [Ho, Cindy N.; Tian, Tiffany; Ayers, Alessandra T.; Aaron, Rachel E.] Diabet Technol Soc, Burlingame, CA USA. [Phillips, Vidith; Mathioudakis, Nestoras] Johns Hopkins Univ, Sch Med, Baltimore, MD USA. [Wolf, Risa M.] Johns Hopkins Univ Hosp, Div Pediat Endocrinol, BALTIMORE, MD USA. [Wolf, Risa M.; Dai, Tinglong] Johns Hopkins Univ, Hopkins Business Hlth Initiat, Washington, DC USA. [Dai, Tinglong] Johns Hopkins Univ, Carey Business Sch, Baltimore, MD USA. [Dai, Tinglong] Johns Hopkins Univ, Sch Nursing, Baltimore, MD USA. [Klonoff, David C.] Mills Peninsula Med Ctr, Diabet Res Inst, 100 South San Mateo Dr,Room 1165, San Mateo, CA 94401 USA. 
C3 Johns Hopkins University; Johns Hopkins University; Johns Hopkins Medicine; Johns Hopkins University; Johns Hopkins University; Johns Hopkins University RP Klonoff, DC (corresponding author), Mills Peninsula Med Ctr, Diabet Res Inst, 100 South San Mateo Dr,Room 1165, San Mateo, CA 94401 USA. EM dklonoff@diabetestechnology.org TC 16 Z9 17 PD NOV 26 PY 2024 VL 24 IS 1 AR 357 DI 10.1186/s12911-024-02757-z WC Medical Informatics WE Science Citation Index Expanded (SCI-EXPANDED) SC Medical Informatics UT WOS:001364059400003 DA 2026-04-14 ER PT J AU Lin, YQ Wang, ZH AF Lin, YiQing Wang, ZhongHua TI A novel method for linguistic steganography by English translation using attention mechanism and probability distribution theory SO PLOS ONE DT Article ID STEGANALYSIS AB To enhance our ability to model long-range semantical dependencies, we introduce a novel approach for linguistic steganography through English translation. This method leverages attention mechanisms and probability distribution theory, known as NMT-stega (Neural Machine Translation-steganography). Specifically, to optimize translation accuracy and make full use of valuable source text information, we employ an attention-based NMT model as our translation technique. To address potential issues related to the degradation of text quality due to secret information embedding, we have devised a dynamic word pick policy based on probability variance. This policy adaptively constructs an alternative set and dynamically adjusts embedding capacity at each time step, guided by variance thresholds. Additionally, we have incorporated prior knowledge into the model by introducing a hyper-parameter that balances the contributions of the source and target text when predicting the embedded words. 
Extensive ablation experiments and comparative analyses, conducted on a large-scale Chinese-English corpus, validate the effectiveness of the proposed method across several critical aspects, including embedding rate, text quality, anti-steganography, and semantical distance. Notably, our numerical results demonstrate that the NMT-stega method outperforms alternative approaches in anti-steganography tasks, achieving the highest scores in two steganalysis models, NFZ-WDA (with score of 53) and LS-CNN (with score of 56.4). This underscores the superiority of NMT-stega in the anti-steganography attack task. Furthermore, even when generating longer sentences, with average lengths reaching 47 words, our method maintains strong semantical relationships, as evidenced by a semantic distance of 87.916. Moreover, we evaluate the proposed method using two metrics, Bilingual Evaluation Understudy and Perplexity, and achieve impressive scores of 42.103 and 23.592, respectively, highlighting its exceptional performance in the machine translation task. C1 [Lin, YiQing] Xian Shiyou Univ, Sch Foreign Languages, Xian, Peoples R China. [Wang, ZhongHua] Xian Aeronaut Comp Tech Res Inst, AVIC, Xian, Peoples R China. C3 Xi'an Shiyou University; Aviation Industry Corporation of China (AVIC); Xihang University RP Lin, YQ (corresponding author), Xian Shiyou Univ, Sch Foreign Languages, Xian, Peoples R China. 
EM tnhlwbvzehpx129@yahoo.com TC 6 Z9 8 PD JAN 2 PY 2024 VL 19 IS 1 AR e0295207 DI 10.1371/journal.pone.0295207 WC Multidisciplinary Sciences WE Science Citation Index Expanded (SCI-EXPANDED) SC Science & Technology - Other Topics UT WOS:001136266700048 DA 2026-04-14 ER PT J AU Liu, Y Wang, F AF Liu, Yang Wang, Fan TI Investigating the interpretability of ChatGPT in mental health counseling: An analysis of artificial intelligence generated content differentiation SO COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE DT Article DE Interpretability analysis; Mental health; Machine learning; AIGC; Psychological counseling; Large language models; Topic modeling ID DISORDERS; SEEKING AB The global impact of COVID-19 has caused a significant rise in the demand for psychological counseling services, creating pressure on existing mental health professionals. Large language models (LLM), like ChatGPT, are considered a novel solution for delivering online psychological counseling. However, performance evaluation, emotional expression, high levels of anthropomorphism, ethical issues, transparency, and privacy breaches need to be addressed before LLM can be widely adopted. This study aimed to evaluate ChatGPT's effectiveness and emotional support capabilities in providing mental health counseling services from both macro and micro perspectives to examine whether it possesses psychological support abilities comparable to those of human experts. Building on the macro-level evaluation, we conducted a deeper comparison of the linguistic differences between ChatGPT and human experts at the microlevel. In addition, to respond to current policy requirements regarding the labeling, we further explored how to identify artificial intelligence generated content (AIGC) in counseling texts and which micro-level linguistic features can effectively distinguish AIGC from user-generated content (UGC). Finally, the study addressed transparency, privacy breaches, and ethical concerns. 
We utilized ChatGPT for psychological interventions, applying LLM to address various mental health issues. The BERTopic algorithm evaluated the content across multiple mental health problems. Deep learning techniques were employed to differentiate between AIGC and UGC in psychological counseling responses. Furthermore, Local Interpretable Model-agnostic Explanation (LIME) and SHapley Additive exPlanations (SHAP) evaluate interpretability, providing deeper insights into the decision-making process and enhancing transparency. At the macro level, ChatGPT demonstrated performance comparable to human experts, exhibiting professionalism, diversity, empathy, and a high degree of human likeness, making it highly effective in counseling services. At the micro level, deep learning models achieved accuracy rates of 99.12 % and 96.13 % in distinguishing content generated by ChatGPT 3.5 and ChatGPT 4.0 from UGC, respectively. Interpretability analysis revealed that context, sentence structure, and emotional expression were key factors differentiating AIGC from UGC. The findings highlight ChatGPT's potential to deliver effective online psychological counseling and demonstrate a reliable framework for distinguishing between artificial intelligence-generated and human-generated content. This study underscores the importance of leveraging large-scale language models to support mental health services while addressing high-level anthropomorphic issues and ethical and practical challenges. C1 [Liu, Yang; Wang, Fan] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China. C3 Wuhan University RP Wang, F (corresponding author), Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China. 
EM 2021101040028@whu.edu.cn TC 2 Z9 4 PD AUG PY 2025 VL 268 AR 108864 DI 10.1016/j.cmpb.2025.108864 EA AUG 2025 WC Computer Science, Interdisciplinary Applications; Computer Science, Theory & Methods; Engineering, Biomedical; Medical Informatics WE Science Citation Index Expanded (SCI-EXPANDED) SC Computer Science; Engineering; Medical Informatics UT WOS:001504646800001 DA 2026-04-14 ER EF