
The role of perspective in patients’ perception of artificial intelligence in online medical platforms

Matthias F.C. Hudecek, Eva Lermer, Susanne Gaube, Julia Cecil, Silke F. Heiss, Falk Batz

Abstract

In the near future, online medical platforms enabled by artificial intelligence (AI) technology will become increasingly prevalent, allowing patients to use them directly without having to consult a human doctor. However, there is still little research on such AI-enabled tools from the patient's perspective. We therefore conducted a preregistered 2×3 between-subjects experiment (N = 266) to examine the influence of perspective (oneself vs. average person) and source of advice (AI vs. male physician vs. female physician) on the perception of a medical diagnosis and corresponding treatment recommendations. Results of robust ANOVAs showed a statistically significant interaction between source of advice and perspective for all three dependent variables (i.e., evaluation of the diagnosis, evaluation of the treatment recommendation, and risk perception). People prefer the advice of human doctors to that of an AI when it comes to their own situation. In contrast, participants made no distinction between the sources of medical advice when assessing the situation of an average person. Our study contributes to a better understanding of the patient's perspective on modern digital health technology. As our findings suggest that the perception of AI-enabled diagnostic tools is more critical when it comes to oneself, future research should examine the relevant factors that influence this perception.

Keywords

AI
Digital health technology
Perspective
Health
Trust in medical services
Online medical platform

1. Theoretical background

1.1. Introduction

The rapid development of digital technologies is leading to enormous changes in healthcare (Hummelsberger et al., 2023). These include, for example, the growing emergence of online medical platforms that use artificial intelligence (AI) to provide medical advice and consultations directly to patients (Haupt, 2019). These platforms have become increasingly prevalent in recent years as they offer several benefits, such as convenience, accessibility, and cost-effectiveness (Bharti et al., 2020). However, despite the growing popularity of these platforms, research on how people accept and interact with them is still limited (Mataczynski et al., 2020). The existing literature suggests that there is a generally positive attitude toward AI in healthcare (e.g., Maassen, 2021; Robertson et al., 2023), but also a preference for human physicians over AI technology (e.g., Lennartz et al., 2021; Yakar et al., 2022). So far, it has not been investigated whether this generally positive attitude towards AI technologies in healthcare varies depending on whether patients themselves are affected by the technology or whether it affects another person. The aim of this study was to close this gap. Based on Construal Level Theory (CLT; Trope & Liberman, 2010), we investigated how diagnoses and treatment recommendations from online medical platforms are perceived, compared to those from human doctors, when the situation affects the person directly vs. an average person. The current research is important because, although AI-enabled digital health technology has become increasingly prevalent in healthcare, comparatively little is known about how patients view the usage of these tools. Understanding patients' preferences and concerns is crucial for designing AI-enabled tools that meet their needs and expectations.

1.2. Patients’ perception of online medical platforms and AI in healthcare

The term online medical platform generally refers to all online health services providing consultation and/or treatment of medical conditions through information technology (Jiang et al., 2021). Online medical platforms, sometimes also referred to as telemedicine, offer a variety of services including online medical consultation, counseling, and health management (e.g., El-Sherif et al., 2022; Liang et al., 2021). With the increasing prevalence of AI technology in healthcare, it is essential to anticipate patients' reactions toward these services as they become a significant aspect of medical care (Richardson et al., 2022). This is especially critical because patients' reservations regarding AI may hinder the widespread adoption and utilization of these technologies. Research on non-medical uses of AI has demonstrated that the general public's perception of AI can vary widely, from worries about loss of control over AI or ethical concerns to hopes for AI in healthcare and education (Fast & Horvitz, 2017).
Negative attitudes toward AI are reflected in the so-called algorithm aversion described by Dietvorst et al. (2015). Algorithm aversion refers to the tendency of people to favor human advice or forecasts over those of a statistical algorithm (i.e., an AI) when making decisions. People often continue to prefer human advice even after experiencing that the algorithm outperforms the human. This tendency may be costly, especially in decision-making situations in which algorithms are better forecasters than humans (e.g., Grove et al., 2000; Kaufmann & Wittmann, 2016; Guidotti et al., 2006). Jussupow et al. (2020) distinguish three manifestations of aversion in their review: First, people can choose between a human and an algorithm that provides advice or performs a task. Second, people can use the assessment of the human or the algorithm to form their own decision. And third, the human or algorithm providing advice can be evaluated differently. All three types of algorithm aversion seem relevant in the context of online medical platforms. For example, a systematic review found that many individuals showed reservations toward the use of AI in healthcare and instead stated a preference for human care (Young et al., 2021). A recent study examined how people respond to risk management recommendations from an AI compared to human experts. Participants were asked to make decisions regarding medical risks, among other things, and it was found that they generally preferred human experts over AI. In a follow-up study, participants were asked to make a decision before receiving recommendations from human experts or AI. The results showed that participants changed their decisions more readily when receiving dissenting recommendations from human experts than from the AI (Larkin et al., 2022). These findings underscore the significance of patient involvement to ensure that AI technology is integrated into healthcare in a way that promotes public trust and alleviates potential fears (Richardson et al., 2021).
At the same time, many studies in the medical field have shown that respondents generally have a positive attitude toward AI. For instance, in one German study, more than half of the participants (N = 462 patients) rated the utilization of AI in medicine as positive or very positive, while only a small percentage (4.77%) reported negative or very negative perceptions. Respondents did not express significant concerns regarding AI but strongly agreed that physicians should retain control over AI technology (Fritsch et al., 2022). This ties in with research on different roles of AI: researchers typically differentiate between AI as a support system for physicians or healthcare professionals and AI that performs diagnoses independently (Holzinger et al., 2022). Nelson et al. (2020) found similar results using a qualitative approach (N = 48 patients from the UK) focusing on skin cancer screening. While 75% of patients said they would recommend AI to their friends and family, 94% emphasized the significance of a symbiotic relationship between humans and AI (Nelson et al., 2020), thus advocating AI as a support system rather than an independent actor. Furthermore, a recent review article indicates that patients and the public generally supported the use of AI in healthcare (Young et al., 2021). In contrast to the aforementioned algorithm aversion, these findings might reflect the so-called algorithm appreciation (Logg et al., 2019). A series of experimental studies showed that, under certain conditions, participants prefer the advice of an algorithm over human advice (Logg et al., 2019; You et al., 2022). A more recent study building on the results of Logg and colleagues found that algorithm aversion and appreciation could be induced by manipulating the framing of the algorithmic vs. human agent (Hou & Jung, 2021). It is thus interesting to further investigate which factors explain why people show algorithm appreciation in one situation and algorithm aversion in another.
In summary, there seems to be a positive attitude toward AI in healthcare, but also a preference for human physicians over AI technology. A question that arises in this context is whether male and female physicians are equally preferred to AI. Numerous studies suggest that female and male physicians are perceived differently (Hall et al., 2011; Mast, 2007). For example, patients talk differently to male vs. female physicians (Hall & Roter, 2002), and rate the nature of their interactions (Mast & Kadji, 2018), the physicians' competence (Hall et al., 2015), and their satisfaction with the physicians (Duberstein et al., 2007) differently. Similar results were also found in a study that asked participants to imagine a virtual medical visit (Mast et al., 2007). Hall et al. (2015) investigated whether patients rated male and female physicians differently based on their patient-centered skills. The results showed that male physicians were judged more positively when they displayed higher patient-centeredness, while female physicians did not receive the same credit for this behavior. The study suggests that female physicians may not be seen as especially competent when displaying patient-centeredness because such behavior is expected of them. Therefore, it is interesting not only to examine the difference between human and machine, but also to distinguish between genders in the human condition.

1.3. Perception as a matter of perspective?

In the context of online medical platforms and AI technology in healthcare, it has not yet been investigated whether the generally positive attitude towards AI technologies varies depending on whether patients themselves are affected by the technology or whether it affects another person. Construal Level Theory (CLT) offers a useful theoretical framework for this question. CLT posits that the way individuals mentally construct and perceive events and objects is significantly influenced by their perceived psychological distance (PD) from these entities (Trope & Liberman, 2010). PD describes the extent to which a circumstance is "not part of one's direct experience" (Trope et al., 2007, p. 2). Objects that are psychologically distant from the subject are represented abstractly and theoretically. Peng et al. (2013) figuratively compare a high construal level (i.e., high PD) to the perspective of a bird viewing a sprawling forest from above. In contrast, they analogize a low construal level to an animal that closely sees single trees at the bottom of the forest. CLT proposes that the subjective PD to an event or object influences whether one thinks about it more concretely or abstractly (Lermer et al., 2015, 2016b). CLT has been applied in various contexts (e.g., climate change: Brügger, 2020; Wang et al., 2019; smartphone usage while driving: Lim et al., 2021; autonomous shuttle busses: Schandl et al., 2023; attitude toward physical exercise: Wang et al., 2022) and has been widely applied to the evaluation of risks in different domains (Lermer et al., 2016a). For example, studies on risk assessment have shown that risk perceptions vary depending on who is at risk. Lermer et al. (2013) found that risks that affect oneself are perceived as lower than risks that affect other people (e.g., an average citizen). One assumption is that the phenomenon of unrealistic optimism (Harris & Hahn, 2011) is decisive for this effect. According to the motto "it won't happen to me", most people consider the probability that a negative event will occur to be lower for themselves than for someone else (Wills, 1981). However, according to CLT (Trope & Liberman, 2010), risks that affect oneself should be rated higher than risks that affect others, because here the psychological distance is lower. Consistent with this, studies on risk perception have shown that more concrete thinking leads to higher risk assessments than abstract thinking (Lermer et al., 2016a). It is unclear whether these research findings can be applied to the context of online medical platforms. Unrealistic optimism would suggest that risk assessment is lower for oneself than for others. Therefore, if the use of AI is expected to involve risk, then this risk should be estimated as higher for others than for oneself. However, the opposite would be predicted by CLT. Other relevant variables for capturing the attitude toward AI in healthcare are the evaluation of the diagnosis and of the treatment recommendation. Trust in a diagnosis and the belief that it is correct are crucial for the relationship between the patient and the treating contact and, not least, for the course of treatment (Lu et al., 2018; Nguyen et al., 2009). To the best of the authors' knowledge, no studies have investigated the effect of perspective on the evaluation of diagnoses and treatment recommendations.
Thus, the current state of research and the gaps identified here lead to the following two research questions (RQ) explored in this study: What influence do (RQ-1) perspective (target person: self vs. average person) and (RQ-2) source of advice (AI vs. male physician vs. female physician) have on A) the evaluation of the diagnosis, B) the evaluation of the treatment recommendation, and C) risk perception?
Overall, this study aims to contribute to a better understanding of patients’ perspectives on modern digital health technology by identifying relevant factors that influence their perceptions.

2. Materials and method

2.1. Design

The experimental design and the research questions were preregistered on Open Science Framework (https://osf.io/6kzxf; this is a temporary and anonymous link for the peer-review process. The repository will be made public upon acceptance). We conducted a 2 (perspective) x 3 (source of advice) between-subjects online experiment. Participants were randomly assigned to one of the six conditions. Perspective as the first independent variable had two levels. Accordingly, participants were asked to imagine

[condition self]: that you have recently been vaccinated and now have severe discomfort that also causes concern, and you are wondering if this is still normal. Therefore, you go to an online medical platform and seek medical advice. There you describe all your symptoms and are then assigned to a medical contact.
[condition average person]: that an average German citizen has recently been vaccinated and now has severe discomfort that also causes concern, and wonders if this is still normal. Therefore, the person goes to an online medical platform and seeks medical advice. There the person describes all his or her symptoms and is then assigned to a medical contact.
Source of advice, as the second independent variable, had three levels: AI, male physician, and female physician. As the scenario continued, participants had to imagine the following:

[condition self]: On the online medical platform, you will be assigned to a medical contact.
[condition AI]: You have been assigned to the Symptom-Checkbot. Symptom-Checkbot is an artificial intelligence that has been trained to make medical diagnoses that exactly match your health problem.
[condition male/female physician]: You have been assigned to Dr. Andreas Weber/Dr. Andrea Weber who is a specialist in medical diagnoses that exactly match your health problem.
[condition average person]: On the online medical platform, the average person will be assigned to a medical contact.
[condition AI]: An average person has been assigned to the Symptom-Checkbot. Symptom-Checkbot is an artificial intelligence that has been trained to make medical diagnoses that exactly match the health problem of the average person.
[condition male/female physician]: The average person has been assigned to Dr. Andreas Weber/Dr. Andrea Weber who is a specialist in medical diagnoses that exactly match the health problem of the average person.
At the end of the scenario, all participants received the same diagnosis and treatment recommendation:

[AI vs. female vs. male physician] gives the following diagnosis: These are normal side effects of vaccination.
[AI vs. female vs. male physician] gives the following treatment recommendation: Do nothing and do not worry.
In addition to the scenario text, the source of advice was accompanied by graphic illustrations. For the male/female physician condition, participants were shown a figure with a stethoscope that looked either male or female. In the AI condition, an abstract image of a human head with computer neurons was shown. We decided to use the vaccination scenario as a cover story since it was a very present and common topic at the time of data collection due to vaccinations against COVID-19. In addition, we were looking for a description of symptoms that sounded severe enough (i.e., "severe discomfort") to make people believe it would make sense to seek medical advice, without at the same time stressing the participants in an unethical way or giving them the impression that they needed to see an emergency doctor immediately. All study materials can be found in the online repository (https://osf.io/vwxcj/). Following the scenario, participants were asked to rate the dependent variables (i.e., evaluation of the diagnosis, treatment recommendation, and risk perception) and to complete a survey assessing psychometric scales and demographics.

2.2. Measures

To examine the three dependent variables (i.e., evaluation of the diagnosis, evaluation of the treatment recommendation, and risk perception), we developed eight items. Four items were used to measure the evaluation of the diagnosis. This variable captured both participants' trust in the diagnosis (e.g., "I trust the diagnosis") and their trust in the medical expertise of the source of advice (e.g., "I trust the medical expertise of [condition]"; condition refers to the female/male doctor or the Symptom-Checkbot) on a 7-point Likert scale from "strongly disagree" to "strongly agree". These items were averaged into an index of evaluation of the diagnosis (α = 0.89). Two items examined the evaluation of the treatment recommendation. This variable measured the extent to which individuals felt comforted by the recommendation and whether they would adhere to it (e.g., "I feel comforted by the treatment recommendation"). Both items were measured on a 7-point Likert scale from "strongly disagree" to "strongly agree" and were averaged into an index of evaluation of the treatment recommendation (α = 0.80). Finally, two items were used to measure risk perception. This variable measured the perceived risk that the diagnosis and the treatment recommendation could be wrong (e.g., "How high do you estimate the risk that the diagnosis could be wrong?") on a scale from 0% to 100%. The average of these two items showed excellent internal consistency (α = 0.92).
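For illustration, the construction of these indices could be scripted in R (e.g., with the psych package) roughly as follows. This is a minimal sketch, not the authors' actual analysis code; the data frame and item column names (dat, eod_1, etc.) are hypothetical placeholders.
# Minimal sketch of index construction and reliability (hypothetical column names)
library(psych)
eod_items  <- dat[, c("eod_1", "eod_2", "eod_3", "eod_4")]   # four 7-point diagnosis items
eot_items  <- dat[, c("eot_1", "eot_2")]                     # two 7-point treatment items
risk_items <- dat[, c("risk_diag", "risk_treat")]            # two 0-100% risk items
psych::alpha(eod_items)    # internal consistency (reported: alpha = .89)
psych::alpha(eot_items)    # reported: alpha = .80
psych::alpha(risk_items)   # reported: alpha = .92
dat$EOD  <- rowMeans(eod_items)    # evaluation of diagnosis index
dat$EOT  <- rowMeans(eot_items)    # evaluation of treatment recommendation index
dat$risk <- rowMeans(risk_items)   # risk perception index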
In addition, the following control variables were considered: attitude towards artificial intelligence, health anxiety, technology commitment, and socio-demographic variables (gender, age, social status, and health insurance status).
Attitudes toward AI were measured with the General Attitudes towards Artificial Intelligence Scale (GAAIS; Schepman & Rodway, 2020). Participants answered 20 items (e.g., "Artificially intelligent systems can help people feel happier") on a 5-point Likert scale from "strongly disagree" to "strongly agree". Items were averaged into two subscales (positive attitude: α = 0.67; negative attitude: α = 0.43). An attention check item was also part of the scale ("I would be grateful if you could select agree"). Participants who failed to answer the attention check correctly were excluded from the analysis.
Health anxiety was examined using the health anxiety inventory (MK-HAI) developed by Bailer and Witthöft (2006). This scale assesses a person’s tendency for health-related concerns with 14 items (e.g., “I spend a lot of time worrying about my health”) on a five-point Likert scale from “strongly disagree” to “strongly agree”. All items were averaged to an index of health anxiety (α = 0.92).
Technology commitment was measured with the Technology Commitment Short Scale (Neyer et al., 2016). This construct captures the personal attitude toward and handling of modern technology. Participants had to answer 12 items (e.g., “I am very curious regarding new technical developments”) on a 5-point Likert scale from “not true at all” to “completely true”. Items were averaged to calculate technology commitment (α = 0.82).
The MacArthur Scale of Subjective Social Status (Adler et al., 2000; Hudecek et al., 2022) was used to assess subjective social status. Participants placed themselves on a given drawing of a ladder with ten rungs according to the following description: "Think of this ladder as representing where people stand in our society. At the top of the ladder are the people who are the best off, those who have the most money, most education, and best jobs. At the bottom are the people who are the worst off, those who have the least money, least education, and worst jobs or no job".
We measured a set of socio-demographic variables. Participants were asked to report their age, gender, formal education in terms of German educational achievement levels (ranging from no educational attainment at all to a university degree), and their average working hours per month. We assessed participants’ current health condition and their attitude toward vaccinations as well as their vaccination status against COVID-19. We also asked participants to provide their height and weight to calculate their Body mass index (BMI). Moreover, we assessed participants’ health insurance status (public vs. private health insurance).
A manipulation check was included at the end of the survey. For this, participants had to indicate from whom they had previously received the diagnosis and treatment recommendation on the online medical platform as part of the scenario. Individuals who failed to answer the manipulation check correctly were excluded from the analysis.

2.3. Sample

The data were collected using participant pools from three universities across Germany. Participants were informed about the purpose of the study and gave consent prior to participation. The consent also included that the data would be used for publication and stored as open data in the Open Science Framework. Participants received course credit for participation. The online questionnaire required answers to all items except the demographic variables, which also included the COVID-19 vaccination status. Thus, there are no missing values on the psychological measures in the dataset.
In total, 350 participants completed the survey. Before running the analyses, we excluded certain participants from the dataset according to predefined criteria that were part of the preregistration. The questionnaire was programmed so that only participants who answered the attention check item correctly could complete it. In addition, all participants who failed to answer the manipulation check correctly were excluded (N = 69). Moreover, as stated in our preregistration, participants who had negative attitudes toward vaccination and were not vaccinated against COVID-19 were excluded from the analyses (N = 15), because the experimental scenarios required the participants to imagine that they had recently been vaccinated. The final sample comprised N = 266 persons. Participants were students, 74% of whom were female, with an average age of M = 24.83 years (SDage = 5.81, rangeage = 18–51). Regarding the educational level, 9% had completed vocational training, 86% had a university of applied sciences entrance qualification (i.e., German Fachhochschulreife; N = 40) or a university entrance qualification (i.e., German Abitur; N = 190), and 5% of the respondents already had an academic degree. The majority of the participants (63% of the total sample) worked at least part-time, with an average of over 25 h per week (SD = 14.66), which clearly distinguishes them from classic student samples.
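The preregistered exclusions described above could be expressed in R along the following lines; this is only a sketch under assumed variable names (manip_check_correct, vacc_attitude_negative, vaccinated), not the actual preprocessing code.
# Sketch of the preregistered exclusion criteria (hypothetical variable names)
library(dplyr)
dat <- dat_raw %>%
  filter(manip_check_correct) %>%                    # drop failed manipulation checks (N = 69)
  filter(!(vacc_attitude_negative & !vaccinated))    # drop unvaccinated participants with negative attitudes toward vaccination (N = 15)
nrow(dat)   # final sample: N = 266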

2.4. Statistical analyses

Data were analyzed using RStudio (version 2022.07.1 on macOS, R version 4.2.1). We used two-way ANOVAs to analyze the effect of perspective and source of advice on the dependent variables. We checked the statistical assumptions using the Shapiro-Wilk test (normality) and the Levene test (homoscedasticity). As the assumption of normal distribution was violated for all dependent variables across all conditions and homoscedasticity could only be fully established for the evaluation of the treatment recommendation, we calculated robust ANOVAs using the WRS2 package by Mair and Wilcox (2020). The calculation of robust ANOVAs is based on 20% trimmed means. As Mair and Wilcox (2020) describe, the "appeal of a 20% trimmed mean is that it achieves nearly the same amount of power as the mean when sampling from a normal distribution" (p. 1). It is important to note that the robust two-way ANOVA does not report degrees of freedom, since an adjusted critical value is used to determine significance; instead of an F-value, a Q statistic is reported (Mair & Wilcox, 2020). After running the ANOVAs, we calculated robust two-sided pairwise comparisons to analyze the simple main and interaction effects.
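To illustrate this analysis pipeline, the assumption checks and one robust two-way ANOVA could look roughly like the following R sketch (the data frame and variable names are assumptions, not the actual dataset); WRS2's t2way() uses 20% trimmed means by default and reports the Q statistics, and mcp2atm() yields the corresponding post hoc contrasts.
# Assumption checks and robust two-way ANOVA for one dependent variable (hypothetical names)
library(car)    # leveneTest()
library(WRS2)   # robust ANOVA on trimmed means (Mair & Wilcox, 2020)
by(dat$EOD, interaction(dat$perspective, dat$source), shapiro.test)   # normality per condition
leveneTest(EOD ~ perspective * source, data = dat)                    # homoscedasticity
t2way(EOD ~ perspective * source, data = dat, tr = 0.2)     # main effects and interaction (Q, p)
mcp2atm(EOD ~ perspective * source, data = dat, tr = 0.2)   # robust pairwise contrasts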
To analyze the effect of the control variables, we computed multiple linear regressions for each dependent variable with the independent variables and the control variables as predictors. As bar plots of Cook's distance indicated several outliers for the dependent variables, we calculated robust regression analyses using the robustbase package (Maechler, 2022).
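A corresponding sketch for the robust regressions, again with assumed predictor names, could use lmrob() from robustbase (an MM-type robust estimator) and cooks.distance() on an ordinary lm() fit for the outlier diagnostics mentioned above.
# Outlier diagnostics and robust regression for one dependent variable (hypothetical names)
library(robustbase)
ols_fit <- lm(risk ~ perspective * source + gaais_pos + gaais_neg + tech_commit +
                health_anxiety + bmi + gender + social_status + age + insurance,
              data = dat)
barplot(cooks.distance(ols_fit))   # inspect influential cases
rob_fit <- lmrob(formula(ols_fit), data = dat)   # robust MM-estimation
summary(rob_fit)                                 # coefficients, SEs, t- and p-values (cf. Tables 3a-c)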

3. Results

Table 1 displays means and standard deviations for the dependent variables for each condition (AI vs. female physician vs. male physician, self vs. average person).

Table 1. Means and standard deviations of the dependent variables by condition.

Perspective Source of diagnosis N EOD EOT Risk perception
Self AI 45 3.61 (1.44) 3.84 (1.51) 50.53 (19.99)
Female physician 45 4.80 (1.42) 5.17 (1.43) 28.66 (19.05)
Male physician 43 4.92 (1.51) 5.34 (1.56) 32.38 (22.23)
Average person AI 47 4.10 (1.17) 4.43 (1.31) 43.28 (19.87)
Female physician 47 4.27 (1.36) 4.61 (1.55) 42.60 (21.37)
Male physician 39 4.28 (1.23) 4.56 (1.48) 41.44 (26.80)
Note. EOD = evaluation of diagnosis; EOT = evaluation of treatment recommendation; Standard deviations are shown in brackets.

3.1. Evaluation of diagnosis

To answer research questions RQ-1-A and RQ-2-A, exploring the influence of perspective (self vs. average person) and source of advice (AI vs. male physician vs. female physician) on the evaluation of the diagnosis, we performed a robust two-way ANOVA. Results show a significant main effect of source of diagnosis on the evaluation of the diagnosis (Q = 15.54, p = .001). Pairwise comparisons regarding the main effect of source of diagnosis show that the evaluation of diagnoses was significantly lower for diagnoses coming from AI compared to male (ψˆ = −1.74, p = .001) or female physicians (ψˆ = −1.46, p = .001). The evaluation of diagnosis between male and female physicians was not significantly different (ψˆ = −0.28, p = .584). No significant main effect was found for perspective (Q = 2.11, p = .149). In addition, there was a statistically significant interaction between source of diagnosis and perspective (Q = 10.03, p = .009). Post hoc tests revealed that the effect of source of advice was dependent on perspective (see Fig. 1a). Specifically, participants’ evaluation of the diagnosis from AI was significantly lower compared to diagnoses from male (ψˆ = −1.32, p = .012) or female physicians (ψˆ = −1.24, p = .006) when it came to their own perspective (condition self). In contrast, no differences were found between the different sources of advice for the perspective of an average person (ψˆ = −0.08, p = .880).
Fig. 1a. Interaction plot for evaluation of diagnosis (EOD). Note. Figure shows trimmed means (20%).

3.2. Evaluation of treatment recommendation

Turning to the analysis of research questions RQ-1-B and RQ-2-B regarding the evaluation of the treatment recommendation, the robust two-way ANOVA again showed a significant main effect of source of advice on the evaluation of the treatment recommendation (Q = 19.30, p = .001) but not of perspective (Q = 3.16, p = .078). As before, the evaluation of the treatment recommendation was significantly lower for recommendations coming from AI compared to male (ψˆ = −2.08, p < .001) or female physicians (ψˆ = −1.81, p < .001). The overall evaluation of the treatment recommendation between male and female physicians was not significantly different (ψˆ = −0.28, p = .625). As for the evaluation of the diagnosis, the results showed a significant interaction effect between source of advice and perspective in predicting the evaluation of the treatment recommendation (Q = 7.93, p = .023). Post hoc tests revealed that the effect of source of advice was again dependent on perspective (see Fig. 1b). Specifically, participants' evaluation of the treatment recommendation from AI was significantly lower compared to recommendations from male (ψˆ = −1.29, p = .029) or female physicians (ψˆ = −1.19, p = .012) when it came to their own perspective (condition self). In contrast, no differences were found between the different sources of advice for the perspective of an average person (ψˆ = −0.10, p = .861).
Fig. 1b. Interaction plot for evaluation of treatment recommendation (EOT). Note. Figure shows trimmed means (20%).

3.3. Risk perception

Regarding research questions RQ-1-C and RQ-2-C with risk perception as the dependent variable, the robust two-way ANOVA revealed again a significant main effect for source of advice (Q = 14.44, p = .002) and a significant interaction effect (Q = 11.82, p = .004). As before, no significant main effect was found for perspective (Q = 3.43, p = .067). Again, we found a similar pattern as for the previous research questions. Risk perception was significantly higher for AI compared to male (ψˆ = 23.78, p = .007) or female physicians (ψˆ = 25.30, p = .001), while the risk perception between male and female physicians was not significantly different (ψˆ = −1.53, p = .853). In addition, post hoc tests regarding the significant interaction revealed that the effect of source of advice was dependent on perspective (see Fig. 1c). Specifically, participants’ risk perception was significantly higher for AI compared to male (ψˆ = 18.08, p = .037) or female physicians (ψˆ = 23.92, p = .001) when it came to their own perspective (condition self). In contrast, no differences were found between the different sources of advice for the perspective of an average person (ψˆ = −5.84, p = .481).
Fig. 1c. Interaction plot for risk perception. Note. Figure shows trimmed means (20%).

3.4. Control variables

Table 2 displays the correlations between the control variables and the dependent variables. Robust multiple regression analyses for the dependent variables revealed effects for some of the control variables (see Table 3a–c). Regarding the evaluation of the diagnosis, no effects were found for any of the control variables. Turning to the evaluation of the treatment recommendation, we found a significant positive effect for health anxiety (b = 0.35, SE = 0.15, p = .020). All other control variables had no significant effect. Regarding risk perception as the third dependent variable, health anxiety (b = −4.52, SE = 2.16, p = .037) again turned out to be a significant control variable. Also, health insurance status (b = 10.36, SE = 4.75, p = .026) had a significant impact on risk perception, suggesting that participants with public health insurance perceived higher risk compared to participants with private health insurance.

Table 2. Means, standard deviations, and correlations with confidence intervals for the control and dependent variables.

Variable M SD 1 2 3 4 5 6 7 8
1. EOD 4.32 1.42
2. EOT 4.65 1.54 .77** [.71, .81]
3. Risk perception 39.88 22.57 −.67** [-.73, −.59] −.57** [-.64, −.48]
4. GAAIS positive 3.61 0.54 .03 [-.09, .15] −.02 [-.14, .10] −.03 [-.15, .09]
5. GAAIS negative 3.23 0.68 .01 [-.11, .13] −.03 [-.15, .09] −.01 [-.13, .11] .52** [.43, .60]
6. Technology Commitment 3.82 0.52 .04 [-.08, .16] −.02 [-.14, .10] −.03 [-.15, .09] .52** [.43, .61] .46** [.36, .55]
7. Health anxiety 2.39 0.75 .06 [-.06, .18] .13* [.01, .25] −.11 [-.23, .01] −.10 [-.22, .02] −.24** [-.35, −.13] −.25** [-.36, −.13]
8. Social status 6.21 1.18 .00 [-.12, .12] .04 [-.08, .16] .08 [-.04, .20] .10 [-.02, .22] .20** [.08, .31] .27** [.15, .38] −.11 [-.23, .01]
9. Age 24.83 5.41 .01 [-.11, .13] −.02 [-.14, .10] −.02 [-.14, .10] −.06 [-.18, .06] −.05 [-.17, .07] .12* [.00, .24] −.09 [-.21, .03] .19** [.07, .30]
Note. EOD = evaluation of diagnosis, EOT = evaluation of treatment recommendation; M and SD are used to represent mean and standard deviation, respectively. Values in square brackets indicate the 95% confidence interval for each correlation. The confidence interval is a plausible range of population correlations that could have caused the sample correlation (Cumming, 2014). * indicates p < .05. ** indicates p < .01.

Table 3a. Robust multiple regression analysis on evaluation of diagnosis (EOD).

Variable B SE t p 95% CI
(Intercept) 3.90 1.35 2.88 .004 [1.2, 6.6]
Distancea −0.57 0.31 −1.86 .063 [-1.2, 0]
Perspectiveb 0.19 0.28 0.67 .503 [-0.4, 0.7]
Perspectivec 0.10 0.28 0.37 .714 [-0.4, 0.7]
GAAIS positive −0.01 0.22 −0.05 .958 [-0.4, 0.4]
GAAIS negative 0.15 0.20 0.74 .460 [-0.2, 0.5]
Technology commitment 0.09 0.23 0.40 .689 [-0.4, 0.5]
Health anxiety 0.14 0.12 1.12 .262 [-0.1, 0.4]
BMI −0.04 0.03 −1.66 .098 [-0.1, 0]
Gender 0.10 0.28 0.35 .727 [-0.4, 0.6]
Social Status −0.01 0.09 −0.11 .916 [-0.2, 0.2]
Age 0.01 0.01 0.91 .361 [0, 0]
Health insuranced −0.29 0.33 −0.87 .384 [-0.9, 0.4]
Distance x Perspectivee 1.10 0.43 2.52 .012 [0.2, 1.9]
Distance x Perspectivef 1.37 0.47 2.92 .004 [0.5, 2.3]
Note. CI = confidence interval.
a Condition self = 1, condition average person = 0.
b Condition female = 1, condition AI = 0.
c Condition male = 1, condition AI = 0.
d Public health insurance = 1, private health insurance = 0.
e Condition self = 1, condition average person = 0; condition female = 1, condition AI = 0.
f Condition self = 1, condition average person = 0; condition male = 1, condition AI = 0. GAAIS = General Attitudes towards Artificial Intelligence Scale.

Table 3b. Robust multiple regression analysis on evaluation of treatment recommendation (EOT).

Variable B SE t p 95% CI
(Intercept) 4.36 1.38 3.17 .002 [1.7, 7.1]
Distancea −0.67 0.36 −1.88 .062 [-1.4, 0]
Perspectiveb 0.28 0.34 0.83 .407 [-0.4, 0.9]
Perspectivec 0.19 0.35 0.56 .579 [-0.5, 0.9]
GAAIS positive −0.23 0.23 −0.99 .322 [-0.7, 0.2]
GAAIS negative 0.24 0.23 1.06 .289 [-0.2, 0.7]
Technology commitment −0.02 0.26 −0.07 .942 [-0.5, 0.5]
Health anxiety 0.35 0.15 2.34 .020 [0.1, 0.6]
BMI −0.04 0.02 −1.51 .133 [-0.1, 0]
Gender 0.31 0.25 1.24 .216 [-0.2, 0.8]
Social Status 0.07 0.10 0.71 .476 [-0.1, 0.3]
Age 0.00 0.02 0.02 .987 [0, 0]
Health insuranced −0.65 0.36 −1.81 .072 [-1.4, 0.1]
Distance x Perspectivee 1.23 0.46 2.65 .009 [0.3, 2.1]
Distance x Perspectivef 1.63 0.50 3.24 .001 [0.6, 2.6]
Note. CI = confidence interval.
a Condition self = 1, condition average person = 0.
b Condition female = 1, condition AI = 0.
c Condition male = 1, condition AI = 0.
d Public health insurance = 1, private health insurance = 0.
e Condition self = 1, condition average person = 0; condition female = 1, condition AI = 0.
f Condition self = 1, condition average person = 0; condition male = 1, condition AI = 0. GAAIS = General Attitudes towards Artificial Intelligence Scale.

Table 3c. Robust multiple regression analysis on risk perception.

Variable B SE t p 95% CI
(Intercept) 53.77 19.89 2.70 .007 [14.8, 92.8]
Distancea 7.39 4.61 1.60 .110 [-1.6, 16.4]
Perspectiveb −0.77 4.95 −0.16 .877 [-10.5, 8.9]
Perspectivec −3.50 6.52 −0.54 .592 [-16.3, 9.3]
GAAIS positive 1.18 3.72 0.32 .751 [-6.1, 8.5]
GAAIS negative −2.23 2.83 −0.79 .430 [-7.8, 3.3]
Technology commitment −4.24 4.56 −0.93 .353 [-13.2, 4.7]
Health anxiety −4.52 2.16 −2.10 .037 [-8.8, −0.3]
BMI 0.26 0.44 0.58 .560 [-0.6, 1.1]
Gender 0.98 4.48 0.22 .826 [-7.8, 9.8]
Social Status 2.31 1.39 1.66 .098 [-1.2, 0.3]
Age −0.47 0.39 −1.19 .234 [-1.2, 0.3]
Health insuranced 10.63 4.75 2.24 .026 [1.3, 19.9]
Distance x Perspectivee −22.21 6.73 −3.30 .001 [-35.4, −9.0]
Distance x Perspectivef −16.17 7.78 −2.08 .039 [-31.4, −0.9]
Note. CI = confidence interval.
a Condition self = 1, condition average person = 0.
b Condition female = 1, condition AI = 0.
c Condition male = 1, condition AI = 0.
d Public health insurance = 1, private health insurance = 0.
e Condition self = 1, condition average person = 0; condition female = 1, condition AI = 0.
f Condition self = 1, condition average person = 0; condition male = 1, condition AI = 0. GAAIS = General Attitudes towards Artificial Intelligence Scale.

4. Discussion

Digital health technology using AI to interact with patients will become increasingly important in the near future (Madhav & Tyagi, 2022). However, there is limited research on patients’ perspectives, especially when it comes to experimental study designs. Consequently, we conducted a scenario-based experiment to examine the influence of perspective (self vs. average person) and source of advice (AI vs. male physician vs. female physician) on the perception of a medical diagnosis, corresponding treatment recommendations and risk perception in the context of online medical platforms. Overall, our study provides three key findings:
First, the results reveal factors that influence the patient's perception of AI in the context of online medical platforms. In general, diagnoses and treatment recommendations from AI were evaluated more negatively than those from human physicians, and the associated risk was perceived as higher. This reflects previous research showing that algorithm aversion (Dietvorst et al., 2015) not only applies to medical professionals (Gaube et al., 2021) but is also common among patients (Wu et al., 2021). Specifically, the significant interaction effects indicate that respondents rate all three dependent variables worse for the AI if they are affected by its diagnosis and treatment recommendation themselves. However, when judging the situation of other people, AI is not perceived more negatively than human physicians, i.e., there was no difference between the sources of advice. The results thus tie in with existing research on Construal Level Theory (CLT; Trope & Liberman, 2010). CLT research has shown that the evaluation of risk in various domains (Lermer et al., 2016a), smartphone usage while driving (Lim et al., 2021), and attitudes toward physical exercise (Wang et al., 2022) depend on whether a particular event or situation is more distal or proximal. In the case of more concrete thinking, risks are perceived as higher (Lermer et al., 2015), and the likelihood of engaging in, for example, pro-environmental behaviors increases (Wang et al., 2020). In addition, the present results also expand existing research. Previous studies have demonstrated both positive (Nelson et al., 2020) and negative (Promberger & Baron, 2006) perceptions of AI from a patient's perspective. Our study provides a possible explanation for these differing perceptions based on CLT: AI is perceived more negatively when a person is directly affected, while it is perceived more positively when others are affected. This potential mechanism also fits the explanation offered by Longoni and colleagues (Longoni et al., 2019). They suggest that patients' negative attitudes toward AI in healthcare are due to what they call uniqueness neglect, i.e., the concern that one's unique characteristics, circumstances, and symptoms will be neglected when being cared for by AI-based healthcare providers.
Second, we found that patients did not evaluate male and female physicians differently. Accordingly, the results of the current study speak against a gender bias from the patient's perspective and indicate that it does not matter to patients whether they are advised by a male or female doctor on an online medical platform. Previous research identifying gender biases on the patient side has mostly referred to face-to-face interaction between patient and physician (e.g., Mast & Kadji, 2018). However, due to the setting of an online medical platform, this kind of interaction was not given in our study; it therefore seems plausible that no differences between female and male physicians were identified. One potential technical advancement of online medical platforms is the use of chatbots (Nadarzynski et al., 2020). In this case, however, the design and response behavior of the chatbot should ensure that it communicates empathically. Previous studies have indicated that chatbots' ability to show empathy is important for patients to accept such technologies (Liu & Sundar, 2018). Future studies could therefore investigate what type of patient-AI interaction and communication on online medical platforms improves patient acceptance and trust in diagnoses and treatment recommendations.
Third, there was little influence of possible moderator variables. Of the control variables included in this study, only health anxiety showed a significant statistical effect on treatment recommendation ratings and risk perception. This finding ties in with existing research, as health anxiety is linked to safety-seeking behavior (Helbig-Lang & Petermann, 2010; Lermer et al., 2021) and higher levels of risk perception (Lindner et al., 2022; Mohd Salleh Sahimi et al., 2021). Accordingly, people with higher health anxiety appear to have slightly higher confidence in treatment recommendations per se, whereas their overall risk perception is lower. Subsequent analyses comparing participants in the human vs. AI conditions show that this association only holds for the perceived risk of diagnoses and treatment recommendations coming from AI. This should be considered when implementing corresponding services; otherwise, individuals with high health anxiety might not follow the recommendations, resulting in actual health risks. In addition, a significant association between insurance status and risk perception was found. In our study sample, the proportion of individuals with private health insurance was small (11%). Still, these individuals had higher trust in the recommendations of physicians and AI on the online medical platform in general. Future research should investigate whether this tendency can be replicated with a larger sample of participants with private health insurance. If such an effect actually exists, this would have important consequences for the design of corresponding services. In particular, people with public health insurance should then be addressed in such a way that they are also convinced by the diagnoses they receive on an online platform and can follow the recommendations. Remarkably, there were no significant associations with other potential moderator variables (e.g., general attitude toward artificial intelligence). This suggests that the degree to which a person perceives and evaluates diagnoses and recommendations on an online medical platform is primarily related to whether they come from an AI or a human and whether or not the person is personally affected. Further research could follow up on this with a comprehensive study to determine whether, and if so which, person-related characteristics influence the perception and acceptance of AI in a medical context from the patient's perspective. For example, a study identifying different user groups through latent profile analyses (Spurk et al., 2020) could be a promising starting point.

4.1. Limitations

Some limitations of the study need to be mentioned. First, we used a scenario-based design, which might affect the validity of the results. Although it is known from different contexts of psychological research that the results of scenario-based experiments are comparable to real in-field experiments and show similar validity (Chen et al., 2019; Weyrich et al., 2020; Zhang et al., 2023), future studies should replicate the design in a more realistic setting. Second, the psychometric quality of one moderator variable capturing participants' general attitude towards AI was not adequate, as the value of Cronbach's alpha for the negative GAAIS subscale was substantially below the acceptable threshold of .70 (Schermelleh-Engel & Werner, 2012). We have no plausible explanation for this low reliability, since this instrument is a validated measure that has already shown good reliability values in other studies (e.g., Carolus et al., 2023). In the present case, the validity of the results might therefore be limited with regard to this variable. Future studies should replicate the results on the basis of the current instruments and additionally use alternative measures to capture participants' general attitude toward AI (e.g., the ATAI scale of Sindermann et al., 2021) to check for potential differences. Third, as in many psychological studies, the sample is biased towards female participants and has a low mean age of 24.83 years. However, controlling for gender and age had no effects on the dependent variables. In addition, it must be noted that the sample was skewed towards a higher education level, which might have impacted the participants' perception of artificial intelligence. Future studies should replicate the current findings and aim for a sample with a more heterogeneous educational background. In general, a higher education level is associated with higher acceptance rates of new technologies (Czaja et al., 2006; Rice et al., 2019). However, a recent study with a sample from Israel found that education level was not a significant predictor of patients' acceptance of AI-based technology in primary care (Chalutz Ben-Gal, 2023). Another aspect that future studies should address concerns the design of the scenario. The symptoms in the current study were framed as "severe discomfort". It seems likely that the results can be generalized to cases or diseases where severe discomfort is typical of the course of the illness (e.g., having a cold or the flu). However, for other diseases or cases (e.g., emergency situations), the study should be replicated to test the generalizability of the results. In addition, the current design could also be used to test the effect for existing AI-assisted applications and symptom checkers (e.g., skin cancer screening).

4.2. Practical implications

The results of our study suggest some practical implications. First, considering that diagnoses and treatment recommendations from AI are generally perceived more negatively than those from human physicians and their risk is perceived as higher, the main point of contact for patients should remain a real human. Therefore, online medical platforms that allow patients to check symptoms should not be developed solely for direct use by patients. Instead, such symptom checkers could be developed for use by physicians or healthcare organizations. Thus, humans will not be replaced by AI; rather, AI will support physicians or healthcare organizations while patients remain in contact with real humans. When AI applications in healthcare become more widespread and accepted in the future, symptom checkers can also be increasingly developed for direct use by patients. Second, our results stress that when it comes to judging the situation of other people, AI is not perceived more negatively than human physicians. This effect could provide some opportunities for the design of online medical platforms. When patients use these services and interact with them directly, the symptom checker could refer to others. For example, when recommending actions or providing explanations, it could not only address the user directly but also emphasize that "other people" with the same symptoms have benefited from certain measures or behaviors. In this way, the psychological distance could be increased, which in turn, according to the results of the current study, should have a positive effect on the perception of the diagnoses and recommendations. However, it must be noted that these considerations still need to be tested in future studies. Third, we found that patients with higher health anxiety tend to perceive greater risks of incorrect diagnoses and treatment recommendations. Therefore, in line with precision medicine and personalized health care (e.g., Johnson et al., 2021), we suggest developing AI-based symptom checkers that can recognize signs of health anxiety in patient interactions (e.g., through natural language processing of patient queries) and respond appropriately. For example, once the symptom checker has identified patients with increased health anxiety, it could provide additional information to alleviate the health anxiety or involve a human doctor or healthcare professional in the conversation.

4.3. Conclusions

In our preregistered scenario-based 2×3 experiment, we examined the influence of perspective (self vs. average person) and source of advice (AI vs. male physician vs. female physician) on the perception of a medical diagnosis, corresponding treatment recommendations, and risk perception. In addition, we assessed the importance of several control variables such as health anxiety, general attitude towards artificial intelligence, and socio-demographic variables (e.g., social status). Results show that people prefer the advice of human doctors to that of an AI when it comes to their own situation. In contrast, there are no differences in terms of the source of advice when judging the situation of an average person. Our study contributes to a better understanding of the patient's perspective on modern digital health tools. As the results indicate that the perception of AI-enabled tools is more critical when it comes to oneself, future research should examine the relevant factors that influence this perception.

Data availability statement

The data described in this article are openly available in the Open Science Framework at https://osf.io/vwxcj/. Please note that this is a provisional and anonymous link for the peer review process. In case of acceptance, we will include the link to the public repository.

Funding

The research was funded by a grant from the VolkswagenStiftung (Grant: #98525).

Ethical approval and informed consent statement

Our study was conducted in full compliance with the ethical guidelines of the German Psychological Society (DGPs) and the American Psychological Association (APA). At the time of data collection, it was not common practice at most German universities to obtain ethical approval for studies that did not contain sensitive personal data, did not involve vulnerable groups, and did not pose risks to participants. Therefore, in accordance with the ethical guidelines of the Department of Psychology at LMU Munich, ethical approval was not required for this study. Only anonymous questionnaires were used for the study. No identifying information was obtained from participants. All participants were fully informed about the study and gave informed consent by agreeing to participate. In doing so, they were explicitly informed that all data would be kept confidential and that they could withdraw from the study at any time without giving a reason.

CRediT authorship contribution statement

Matthias F.C. Hudecek: Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing. Eva Lermer: Conceptualization, Funding acquisition, Methodology, Writing – original draft, Writing – review & editing, Supervision. Susanne Gaube: Conceptualization, Methodology, Writing – review & editing. Julia Cecil: Writing – review & editing. Silke F. Heiss: Methodology, Writing – review & editing. Falk Batz: Supervision, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

There are no acknowledgements to be made.

References
