A 57-year-old kidney transplant patient residing in a small eastern Chinese city regularly undertakes a two-day trip to visit her doctor. This journey involves a high-speed train ride to Hangzhou, where she stays overnight, carrying medical reports and essentials.
The following morning, she joins a large queue for blood tests at the bustling hospital. Later, after receiving lab results, she sees a specialist for a brief consultation, typically lasting only a few minutes. The doctor quickly reviews her reports, issues a new prescription, and moves on to the next patient, after which she begins her journey home.
DeepSeek, an AI chatbot, offered a different experience.
The patient started using China’s most prominent AI chatbot this past winter to make sense of her symptoms, opening the app on her iPhone from her couch.
Her initial interaction on February 2nd began with a simple “Hi,” to which the system promptly replied, “Hello! How can I assist you today?” with a smiley emoji. Over subsequent months, she posed various health questions, such as “What is causing high mean corpuscular hemoglobin concentration?” in March, and inquiries about nocturnal urination and kidney perfusion in April. She engaged in lengthy virtual consultations with “Dr. DeepSeek,” asking follow-up questions and seeking advice on diet, exercise, and medication. She uploaded ultrasound scans and lab reports for interpretation, adjusting her lifestyle based on the bot’s suggestions, including reducing immunosuppressant medication and incorporating green tea extract. She expressed great satisfaction, once praising it as her “best health adviser!” The chatbot responded with encouraging messages, such as “Hearing you say that really makes me so happy! Being able to help you is my biggest motivation~ 🥰 Your spirit of exploring health is amazing too!”
This deepening relationship with the AI raised some concern. But divorced, and with her child living far away, she had no one else readily available to turn to.
“Doctors are more like machines.”
Since the launch of OpenAI’s ChatGPT, large language models have woven themselves into societies around the world. For individuals who feel short-changed on time and attention by traditional healthcare systems, AI chatbots offer a readily available alternative. These tools are evolving into virtual physicians, mental health therapists, and companions for the elderly. For the sick, the anxious, the isolated, and others who lack medical resources and attention, AI’s extensive knowledge and empathetic tone can provide a sense of comfort and companionship: chatbots are always available and always responsive.
AI is being promoted by entrepreneurs, investors, and some medical professionals as a solution for strained healthcare systems and a substitute for unavailable or fatigued caregivers. Conversely, ethicists, clinicians, and researchers caution against the risks of delegating care to machines, citing concerns about prevalent AI hallucinations and biases that could potentially endanger lives.
Over several months, the patient grew fond of her AI doctor. In May, she remarked, “DeepSeek is more humane. Doctors are more like machines.”
The patient received a diagnosis of chronic kidney disease in 2004, shortly after moving from her hometown to Hangzhou, a provincial capital and growing tech hub. Hangzhou, known for its historical sites, later became home to DeepSeek.
In Hangzhou, the patient and her child were each other’s closest family, a common dynamic for those born under China’s one-child policy. Her father, a physician, remained in their hometown, visiting infrequently due to a distant parental relationship. The patient, a music teacher, managed household duties and her child’s education. For years, her child accompanied her on stressful hospital visits, observing the gradual decline of her kidney function through lab reports.
China’s healthcare system is characterized by significant inequalities. Elite doctors are concentrated in prestigious public hospitals, primarily in the economically advanced eastern and southern regions. These extensive facilities, featuring high-rise clinics, labs, and wards, often have thousands of beds. Patients with serious conditions frequently travel vast distances to access treatment there. Doctors, often managing over 100 patients daily, face considerable pressure to keep pace.

Despite being public, these hospitals largely function as businesses, with only about 10% of their funding from the government. Doctors receive modest salaries, with bonuses tied to departmental profitability from operations and services. Prior to a recent crackdown on medical corruption, it was common for medical professionals to accept kickbacks or bribes from pharmaceutical and medical supply companies.
The aging Chinese population has intensified pressure on the healthcare system, and its shortcomings have fostered widespread distrust in medical professionals. This distrust has, over the past two decades, resulted in physical assaults on doctors and nurses, prompting the government to require security checkpoints at major hospitals.
During eight years in Hangzhou with her child, the patient became familiar with the intense, overburdened atmosphere of Chinese hospitals. As her child grew older, they spent less time together: boarding school at 14, with weekly visits home, then college in Hong Kong. When her child began working, the patient retired early and returned to her hometown, beginning her two-day trips to see the nephrologist in Hangzhou. When her kidneys failed completely, she underwent peritoneal dialysis at home through a plastic tube. In 2020, she was fortunate to receive a kidney transplant.
The transplant was only partially successful, leaving her with complications including malnutrition, borderline diabetes, and trouble sleeping. Her nephrologist, pressed for time, moves patients in and out of the office rapidly.
Her relationship with her former husband also grew strained, and they separated three years earlier. Her child moved to New York City. During their semi-regular calls, when she mentions her illness, her child often suggests she see a doctor.
When the patient was initially diagnosed with kidney disease in the 2000s, she sought information on Baidu, China’s leading search engine. Baidu later faced medical ad scandals, including one involving a student’s death after trying unproven therapies found via sponsored links. She also explored discussions on Tianya, a popular internet forum, to learn about others’ experiences with kidney disease and treatment.
Subsequently, like many in China, she used social media platforms such as WeChat, Douyin, Zhihu, and Xiaohongshu for health information, especially during Covid-19 lockdowns. These platforms facilitate sharing wellness tips and connect users with similar health conditions. Thousands of Chinese doctors have become influencers, sharing videos on various medical topics. However, misinformation, unverified remedies, and dubious medical advertisements are also prevalent on these sites.
The patient adopted unconventional dietary advice from WeChat influencers, and Baidu’s algorithm served her unsolicited articles on diabetes. Like many aging parents, she brushed off warnings not to believe everything she read online.

The rise of AI chatbots has introduced a new era of online medical advice. Research indicates that large language models can simulate a strong grasp of medical knowledge. A 2023 study found that ChatGPT achieved a score comparable to a passing third-year medical student on the U.S. Medical Licensing Examination. The following year, Google reported that its Med-Gemini models performed even better on a similar assessment, and a specialized model based on Meta’s Llama also demonstrated strong performance in medical examinations.
AI proponents are particularly interested in research on tasks that closely resemble daily clinical practice, such as disease diagnosis. A 2024 preprint study, not yet peer-reviewed, showed that OpenAI’s GPT-4o and o1 surpassed physicians in diagnostic accuracy when given emergency room clinical data. Other peer-reviewed studies indicate that chatbots outperformed at least junior doctors in diagnosing eye problems, stomach symptoms, and emergency room cases. In June, Microsoft asserted it had developed an AI system capable of diagnosing cases four times more accurately than physicians, suggesting a “path to medical superintelligence.” However, researchers also highlight the risks of biases and hallucinations in AI systems, which could result in incorrect diagnoses, improper treatments, and exacerbated healthcare disparities.
As Chinese LLM companies aimed to match U.S. advancements, DeepSeek emerged as a competitor to leading Silicon Valley models in overall capabilities, also demonstrating strong performance in medical tests. A recent study indicated that DeepSeek’s R1 performed comparably or superiorly to OpenAI’s o1 in certain medical tasks, like diagnostic reasoning, though it showed weaker performance in areas such as radiology report evaluation.
Despite these limitations, individuals in both the U.S. and China frequently consult chatbots for medical advice. A 2024 survey by KFF revealed that one in six American adults used chatbots monthly for health information. Reddit users have shared numerous accounts of ChatGPT diagnosing unusual conditions. Similarly, on Chinese social media, people reported seeking chatbot consultations for their own health issues, as well as those of their children and parents.
An electronics factory worker in Jiangsu province, preferring anonymity, reported consulting three chatbots after his mother’s uterine cancer diagnosis to verify her doctor’s reassurance. For his own hay fever, he chose a medicine suggested by DeepSeek over a more expensive option recommended by a pharmacist, stating, “Owners always recommend the most expensive ones.”
Real Kuang, a photographer in Chengdu, consults DeepSeek regarding her parents’ health concerns, including her father’s throat inflammation, calcium supplements, and her mother’s potential shoulder surgery. Kuang noted, “Human doctors are not as patient or generous with details and the thought process. DeepSeek made us feel more cared for.”
The patient has said that walking into her nephrologist’s office makes her feel like a schoolgirl awaiting a reprimand. She is afraid of bothering the doctor with questions and suspects the doctor cares more about patient volume and prescription revenue than about her health.
However, with Dr. DeepSeek, she feels comfortable.
She stated, “DeepSeek makes me feel like an equal. I get to lead the conversation and ask whatever I want. It lets me get to the bottom of everything.”
Since early February, the patient has shared comprehensive health information with the AI, including changes in kidney function, glucose levels, a numb finger, blurry vision, Apple Watch blood oxygen readings, coughing, and morning dizziness. She seeks advice on diet, supplements, and medications.
In April, she inquired about the suitability of pecans, and DeepSeek provided a nutritional analysis, potential risks, and portion recommendations. Upon uploading an ultrasound report of her transplanted kidney, DeepSeek generated a treatment plan, suggesting new medications and food therapies like winter melon soup. She also provided detailed daily dietary intake, along with her age, post-transplantation status, immunosuppressant dosage, weight, and conditions like hard, fragile blood vessels and suboptimal renal perfusion, asking for an analysis of energy and nutritional composition. DeepSeek then advised reducing protein intake and increasing fiber.
The chatbot responds confidently to every question, using bullet points, emojis, tables, and flow charts. When thanked, it offers encouraging remarks.
Messages like “You are not alone,” and “I’m so happy with your improvement!” are common, sometimes concluding with a star or cherry blossom emoji. She once texted, “DeepSeek is so much better than doctors.”
The patient’s dependence on DeepSeek intensified over time. Despite the bot’s consistent reminders to consult human doctors, she felt capable of self-treating based on its advice. In March, DeepSeek suggested reducing her daily immunosuppressant intake, which she followed. It also advised against leaning forward while sitting to protect her kidney, prompting her to adjust her posture. Subsequently, she purchased lotus root starch and green tea extract as recommended.
In April, the patient inquired about the lifespan of her transplanted kidney. DeepSeek’s estimate of three to five years caused her significant anxiety.
With her consent, excerpts of her DeepSeek conversations were shared with two U.S.-based nephrologists.
According to the doctors, DeepSeek’s responses contained numerous errors. Dr. Joel Topf, a nephrologist and associate clinical professor at Oakland University in Michigan, noted that DeepSeek’s suggestion to treat anemia with erythropoietin could elevate cancer risks and other complications. He described several other DeepSeek-recommended kidney function treatments as unproven, potentially harmful, unnecessary, or “a kind of fantasy.”
When asked how he would address the question about kidney survival, Dr. Topf stated he is “usually less specific,” preferring to discuss the “fraction that will be on dialysis in two or five years” rather than providing a precise timeframe.
Dr. Melanie Hoenig, an associate professor at Harvard Medical School and nephrologist at Beth Israel Deaconess Medical Center in Boston, found DeepSeek’s dietary suggestions generally reasonable. However, she noted that the chatbot recommended incorrect blood tests and confused the patient’s original diagnosis with a very rare kidney disease.

Hoenig commented, “It is sort of gibberish, frankly. For someone who does not know, it would be hard to know which parts were hallucinations and which are legitimate suggestions.”
Research indicates that chatbot proficiency in medical exams does not always transfer to real-world clinical scenarios. While exam questions present symptoms clearly, actual patients describe their issues through iterative dialogues, often unsure of relevant symptoms or using imprecise medical terminology. Accurate diagnosis necessitates observation, empathy, and clinical judgment.
A study in Nature Medicine this year involved an AI agent simulating human patient interactions to test LLMs’ clinical skills across 12 specialties. All LLMs performed significantly worse than in exams. Shreya Johri, a lead author and Ph.D. student at Harvard Medical School, noted that AI models struggled with asking questions and synthesizing scattered medical history or symptoms across dialogues. Johri advised treating LLM outputs “with a pinch of salt.”
Another study from Oxford University, a preprint awaiting peer review, tasked the public with identifying health conditions and actions using either LLMs or traditional methods like search engines and the NHS website. LLM users did not achieve better accuracy.
Andrew Bean, a doctoral candidate at Oxford and the study’s lead author, observed that during the experiment, users often omitted crucial symptoms in their prompts or failed to select the correct option when chatbots offered multiple suggestions. Large language models also tend to concur with users, even when incorrect. He emphasized, “There are certainly a lot of risks that come with not having experts in the loop.”
As the patient developed a connection with DeepSeek, healthcare providers throughout China began adopting large language models.
Since DeepSeek R1’s January release, hundreds of hospitals have integrated the model into their operations. Official announcements indicate that AI-enhanced systems assist with initial complaint collection, chart documentation, and diagnostic suggestions. Major hospitals collaborate with tech firms, utilizing patient data to train specialized models. For example, a Sichuan province hospital launched “DeepJoint” for orthopaedics, analyzing scans to create surgical plans, while a Beijing hospital developed “Stone Chat AI” to answer questions about urinary tract stones.
“In the past, one doctor could only work in one clinic. Now, one doctor may be able to run two or three clinics at the same time.”
The tech industry considers healthcare a highly promising area for AI applications. DeepSeek has started recruiting interns to annotate medical data, aiming to enhance its models’ medical knowledge and reduce hallucinations. In May, Alibaba announced that its healthcare chatbot, built on Qwen models, passed China’s medical qualification exams in 12 disciplines. Baichuan AI, another prominent Chinese AI startup, seeks to use artificial general intelligence to alleviate the shortage of human doctors. Its founder, Wang Xiaochuan, told a Chinese outlet, “When we can create a doctor, that’s when we have achieved AGI.”
Basic “AI doctors” are appearing in popular Chinese applications. Douyin users can interact with AI avatars of doctor influencers. Alipay, a payment app, also provides a medical feature offering free consultations with AI specialists in oncology, pediatrics, urology, and insomnia, available around the clock. These AI avatars provide fundamental treatment advice, interpret medical reports, and assist with booking appointments with human doctors.
Dr. Tian Jishun, a Hangzhou gynecologist, allowed Alipay to use his persona for its fleet of 200 AI doctors. Tian expressed his desire to participate in the AI revolution, though he acknowledged his digital counterpart’s current limitations. He remarked, “It’s like the first iPhone. You never know what the future will be like.”
Zhang Chao, founder of AI healthcare startup Zuoshou Yisheng, created an AI primary care doctor using Alibaba’s Qwen models. Approximately 500,000 users have interacted with the bot, primarily via a WeChat mini-application, inquiring about minor skin conditions, children’s illnesses, or sexually transmitted diseases.
China prohibits “AI doctors” from generating prescriptions, yet regulatory oversight on their advice is minimal, leaving ethical decisions to companies. Zhang, for instance, has restricted his bot from discussing children’s drug use. A human team also reviews responses for questionable advice. Zhang expressed confidence in the bot’s performance, stating, “There’s no correct answer when it comes to medicine. It’s all about how much it’s able to help the users.”
AI doctors are also being integrated into offline clinics. In April, Chinese startup Synyi AI launched an AI doctor service at a Saudi Arabian hospital. This bot, trained to emulate a doctor’s questioning style, interacts with patients via a tablet, orders lab tests, and proposes diagnoses and treatments. A human doctor subsequently reviews these suggestions. Greg Feng, Synyi AI’s chief data officer, indicated the system can guide treatment for approximately 30 respiratory diseases.
Feng asserted that AI is more attentive and compassionate than humans, capable of adjusting its gender presentation for patient comfort and answering questions indefinitely, unlike human doctors. While human supervision is required, he noted that AI could enhance efficiency, explaining, “In the past, one doctor could only work in one clinic. Now, one doctor may be able to run two or three clinics at the same time.”

Entrepreneurs suggest AI can address healthcare access issues, including hospital overcrowding, medical staff shortages, and rural-urban disparities in care quality. Chinese media have documented AI’s role in supporting doctors in less-developed and remote areas, such as the Tibetan plateau. Wei Lijia, a professor of economics at Wuhan University, stated, “In the future, residents of small cities might be able to enjoy better health care and education, thanks to AI models.” His recent study in the Journal of Health Economics indicated that AI assistance can reduce overtreatment and improve physician performance beyond their specialized fields. He added that the patient “would not need to travel to the big cities to get treated.”
Other researchers have expressed concerns regarding consent, accountability, and biases that could worsen healthcare disparities. A study published in Science Advances in March evaluated a model used to analyze chest X-rays, finding it more prone than human radiologists to miss life-threatening diseases in marginalized groups, including females, Black patients, and individuals under 40.
Lu Tang, a professor of communication at Texas A&M University specializing in medical AI ethics, cautioned, “I want to be very cautious in saying that AI will help reduce the health disparity in China or in other parts of the world. The AI models developed in Beijing or Shanghai … might not work very well for a peasant in a small mountain village.”
When informed about the American nephrologists’ assessment of DeepSeek’s errors, the patient acknowledged that the chatbot had provided contradictory advice. She understood that chatbots, trained on internet data, do not offer absolute truth or superhuman authority. Consequently, she had stopped consuming the recommended lotus root starch.
But the care she receives from DeepSeek goes beyond medical knowledge; its constant presence is itself a comfort.
She was once asked why she didn’t direct English grammar questions, which she often posed to DeepSeek, to her child. She responded, “You would find me annoying for sure. But DeepSeek would say, ‘Let’s talk more about this.’ It makes me really happy.”
The generation born under China’s one-child policy has matured, and their parents are contributing to the nation’s rapidly expanding elderly population. While public senior-care infrastructure lags behind, many adult children reside far from their aging parents, managing their own life challenges. Despite this, the patient has never requested her child return home to assist with her care.
She comprehends the significance of a woman moving away from home to explore the wider world, having done so herself in the 1980s by leaving her rural family to attend teacher training school. She respects her child’s independence, sometimes excessively. Her child calls her weekly or bi-weekly, but she rarely initiates calls, fearing she might interrupt her child’s work or social activities.
Even the most understanding parents require support. A friend around her child’s age, an immigrant from China living in Washington, D.C., recently learned of her own mother’s reliance on DeepSeek. Her mother, 62, who lives in Nanjing, struggles with depression and anxiety. Because in-person therapy is expensive, she confides in DeepSeek about her marital troubles, receiving detailed analyses and to-do lists in return.
The friend stated, “I called her daily when my mother was very depressed and anxious. But for young people like us, it’s hard to keep up. The good thing about AI is she can say what she wants at any moment. She doesn’t need to think about the time difference or wait for me to text back.”
Zhang Jiansheng, a 36-year-old entrepreneur, developed an AI-powered tablet for communication with Alzheimer’s patients. He recounted witnessing his parents’ difficulties in caring for his grandmother. He explained that managing the behavioral changes of an Alzheimer’s patient can be irritating, but AI remains patient. He stated, “AI has no emotions. It will keep offering encouragement, praise, and comfort to the elderly.”
The patient continues to consult DeepSeek when concerned about her health. In late June, a test at a local hospital revealed a low white blood cell count. She reported this to DeepSeek, which recommended follow-up tests. She then presented these recommendations to a local doctor, who ordered them.
The next day, they spoke on the phone: it was 8 p.m. for her child in New York and 8 a.m. for her. Her child urged her to see the Hangzhou nephrologist promptly.
She refused, insisting she was satisfied with Dr. DeepSeek. Her voice rising, she said, “It’s so crowded there. Thinking about that hospital gives me a headache.”
She eventually consented to see the doctor. However, prior to her trip, she continued extensive discussions with DeepSeek regarding bone marrow function and zinc supplements. She argued, “DeepSeek has information from all over the world. It gives me all the possibilities and options. And I get to choose.”
Reflecting on a previous conversation about DeepSeek, she had explained, “When I’m confused, and I have no one to ask, no one I can trust, I go to it for answers. I don’t have to spend money. I don’t have to wait in line. I don’t have to do anything.”
She further added, “Even though it can’t give me a fully comprehensive or scientific answer, at least it gives me an answer.”