DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2020
An Analysis of the Reliability of Internet-Based Symptom Checkers
Eva Despinoy and Sarah Narrowe Danielsson
Degree Project in Computer Science, DD142X
Swedish Title
En analys av pålitligheten av internetbaserade självtester
Authors
Eva Despinoy <eva.despinoy@gmail.com>
Sarah Narrowe Danielsson <sarahnarrowedanielsson@gmail.com>
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science
Examiner
Pawel Herman
Supervisor
Jeanette Hällgren Kotaleski
Abstract
Symptom checkers are online tools that suggest diagnoses and/or give triage advice based on symptoms entered by a person. The purpose of this study is to investigate whether a selection of symptom checkers is reliable, by analyzing the diagnoses and triage advice given for specific symptoms and by comparing the questions asked by the different checkers. Four general symptom checkers were tested with symptoms for five different illnesses, and four symptom checkers designed specifically for covid-19 were tested with symptoms corresponding to different severity levels of the illness. The questions were compared using the Jaccard Index and Cosine Similarity, after an implementation of the RAKE algorithm had transformed the lists of questions into arrays of keywords. The triage advice and the diagnosis results were analyzed by manual categorizing. The results showed that the symptom checkers’ questions were not very similar to each other. The manual categorizing of the general symptom checkers’ triage advice showed that most of them recommended that the patient use hospital services even though it might not have been necessary. In contrast, the covid-19 symptom checkers avoided recommending the use of hospital services. The general symptom checkers never placed the tested illness lower than fifth in the list of possible diagnoses, and in most cases the correct illness was placed first. In conclusion, according to this study symptom checkers can be seen as quite reliable. However, further studies are needed to address the weaknesses of this study, such as the small amount of data and the imperfect question comparison results.
Sammanfattning
Självtester online är tester som utifrån ett antal inmatade symptom ger förslag på diagnoser och/eller rekommendationer för vad nästa steg i att hantera sina symptom borde vara. Det kan handla om att träffa en läkare eller behandla symptomen med egenvård. Syftet med den här studien är att undersöka huruvida vissa utvalda självtester är tillförlitliga, genom att undersöka diagnosförslagen, rekommendationerna och frågorna. Studien genomfördes på fyra generella självtester där symptom för fem olika sjukdomar inmatades, och på fyra självtester online som specifikt skapats för covid-19, där symptom för fyra olika fall som hade symptom för sjukdomen i olika grader inmatades.
Strängjämförelseverktygen som användes för att jämföra testernas frågor var
Jaccard Index och Cosine Similarity. Frågorna hade först transformerats till listor
av nyckelord med RAKE-algoritmen. Diagnosförslagen och rekommendationerna
jämfördes och kategoriserades manuellt. Resultatet visade att frågorna från de
olika självtesterna var väldigt olika varandra. Den manuella kategoriseringen
visade att de generella testerna oftast rekommenderade att uppsöka sjukvården
även om det kanske inte alltid behövdes. Däremot undvek självtesterna för covid-
19 att rekommendera kontakt med sjukvården. Ingen av de generella hemsidorna
satte rätt diagnos lägre än femteplats i listan på förslag på olika diagnoser. I de
flesta fall var rätt diagnos på första plats. Slutligen visade resultatet från den här
undersökningen att självtester online verkar vara tillförlitliga. Dock behöver fler
studier göras för att hantera svagheterna i den här studien, såsom lite data och
potentiellt bristfälliga resultat från frågejämförelserna.
Contents
1 Introduction
1.1 Research Question
1.2 Purpose
1.3 Scope
1.4 Disposition
2 Background
2.1 Online Self-Diagnosis
2.2 Symptom Checkers
2.3 Illnesses
2.4 RAKE-NLTK
2.5 Jaccard Index
2.6 Cosine Similarity
2.7 Difference Between Jaccard Index and Cosine Similarity
2.8 Past Studies
3 Method
3.1 Data Collection
3.2 Data Analysis
4 Results
4.1 General Symptom Checkers
4.2 Covid-19 Symptom Checkers
5 Discussion
5.1 Analysis of results
5.2 Analysis of the Method
5.3 Comparison with Past Research
5.4 Future Research
6 Conclusion
7 References
1 Introduction
Our society is in constant technological evolution. Today, most people who live in developed countries have access to the internet [1]. This opens up many new possibilities in different fields, including the medical field. One behavior that has arisen as a result is that people who feel sick research their symptoms online to find out what their condition might be. They might even diagnose themselves using the information they find. This often happens before meeting a doctor, or even replaces such visits. The tools available to them are online forums where users answer each other’s questions, websites where doctors answer users’ questions, and online symptom checkers. These resources are not guaranteed to be reliable, which could result in people receiving the wrong diagnosis or being advised incorrectly about whether they should meet a doctor. If a tool is overly optimistic and assumes that a person is not very sick, the person might not take the symptoms seriously and might not get help in time.
Conversely, if the tool always directs the patient to a doctor, resources and money can be wasted. If the symptom checker outputs an incorrect diagnosis, the patient might believe that it is correct. This could in turn lead to the patient self-medicating wrongly or even persuading a doctor that the diagnosis is correct.
1.1 Research Question
This report examines a sample of online symptom checkers to answer the following questions:
How reliable are online symptom checkers?
Is it possible to answer this question by comparing checkers using tools such
as the Jaccard Index, Cosine Similarity and manual categorizing, and by
examining if the checkers can find the correct diagnosis when inputted with
specific symptoms for an illness?
1.2 Purpose
The purpose of the report is to assess the reliability of different symptom checker websites. This is done by comparing the questions asked and the results outputted by these checkers. The findings of the study are also compared to previous studies in the field. The method and the results could also serve as a basis for analyzing more symptom checkers.
1.3 Scope
This paper investigates self-diagnoses that are specifically based on using symptom checkers that are conducted like questionnaires, in the form of websites or apps. This means that other ways of conducting self-diagnosis online by for example googling one’s symptoms or asking a community are excluded from the study.
In addition to this, the report examines only a chosen sample of symptom checker websites and diseases and symptoms, to make an assessment around those.
1.4 Disposition
In chapter two, symptom checkers are further described and categorized.
Relevant past studies are cited, and a description of the diseases and symptoms
that were used to test the symptom checkers is made. In addition to this, the
algorithm used to extract the keywords of the symptom checkers’ questions is
described, as well as the Jaccard Index of Similarity and the Cosine Similarity,
which are the tools used to compare the different symptom checker questions. In
chapter three, the exact steps of the data collection and analysis are described, as
well as the symptom checkers and input data used. In chapter four, the results of
the study are presented. In chapter five, the results are discussed and compared
with previous studies. The method is also discussed, and possible future research
is suggested. Chapter six presents the conclusion of the study.
2 Background
This section describes the central themes of the report, the tools used in the analysis and the inputs used in the method. It also summarizes past research made about the subject.
2.1 Online Self-Diagnosis
The term self-diagnosis was used in the introduction. Self-diagnosis is the act of identifying a medical condition in oneself. In her literature review of 51 papers, Jutel (2010) states that 31% of them found self-diagnosis to be reliable and desirable, 23% found it unreliable but desirable if it became more reliable, and 28% found it neither reliable nor desirable. The other papers had mixed views [2].
A UK study from 2016 states that 1 in 4 UK citizens self-diagnose through the internet instead of contacting a doctor [3]. Several studies have examined how this kind of self-diagnosis affects people psychologically. A commonly used term is “cyberchondria”, which can be seen as a form of hypochondria that is triggered by information online. The first study connected to this term was conducted by Microsoft in 2014 [4].
This thesis does not examine the psychological aspect of self-diagnosis but the technical one: how reliable are online symptom checkers?
2.2 Symptom Checkers
There are two types of symptom checkers, as described in the study conducted by Semigran et al. (2015). The first type attempts to produce a diagnosis, and the second type gives triage advice. Checkers of the first type usually consist of questions about symptoms and health background, and give a list of diseases as a result. The diseases are usually ranked by how likely it is that the user has them. Triage symptom checkers give advice on how one should move forward. This advice could for example be ‘self-care’ or ‘seek help from a doctor’. There are also symptom checkers that both offer triage advice and suggest diagnoses [5].
An example of a triage symptom checker is the one that Region Stockholm (SLL) has published for covid-19, which is used in the study. Figure 2.1 below shows what a question asked by the symptom checker looks like, and Figure 2.2 shows one possible output of the test. In this case the test recommends self-care (egenvård).
Figure 2.1: A question from the Region Stockholm covid-19 test.
Figure 2.2: One possible result of the Region Stockholm covid-19 test.
An example of a hybrid between a triage symptom checker and an assessing
diagnosis symptom checker used in the study is Symptomate, which is shown
in the figures below. Figure 2.3 below shows a question asked by the symptom
checker and Figure 2.4 shows a possible output of the test. In this case the test
results in the recommendation of emergency care and suggests that it could be
either migraine or a brain tumor.
Figure 2.3: A question from the Symptomate test.
Figure 2.4: A possible result of the Symptomate test.
2.3 Illnesses
In the following section the diseases used in the study are briefly described.
2.3.1 Covid-19
Covid-19 is caused by SARS-CoV-2, a virus in the SARS family. The symptoms of the disease are similar to those of the common cold. Fever occurs in 88% of cases, and dry cough and tiredness are also common symptoms. The disease is not lethal to most people, but it can be very dangerous for individuals who are over 70 or already sick. The disease is a droplet infection, which means that it spreads through sneezes and coughs. The only test that has 100% accuracy is a blood test. [6]
2.3.2 Tonsillitis
Tonsillitis is an inflammation of the tonsils. Some symptoms and signs of the disease are swollen tonsils, sore throat, difficulty swallowing, fever, headache and tender lymph nodes on the sides of the neck. It is more common to get tonsillitis when young due to the fact that the tonsils’ ability to stop infection is stronger then. The tonsils act as the immune system’s first line of defense but after puberty their ability declines. The most common way to test for tonsillitis is by swabbing the throat. [7][8]
2.3.3 Pneumonia
Pneumonia is an inflammation of the tissue in one or both lungs. The symptoms can appear suddenly within 24 to 48 hours or build up over a couple of days. Some symptoms of pneumonia are a dry or wet cough, difficulty breathing, rapid heartbeat, high body temperature, sweating and shivering, chest pain and loss of appetite. Pneumonia can be difficult to diagnose since the symptoms are similar to those of the common cold, bronchitis and asthma. A doctor is usually able to diagnose patients by asking questions about their symptoms, but some cases might require blood tests and x-rays. [9]
2.3.4 Migraine
Migraine is a form of headache that is usually combined with vomiting, nausea and light sensitivity. Migraines can last from 4 hours to several days and in some cases even longer. Migraines can be triggered by various conditions such as stress, lack of sleep and skipping meals. There is no test to diagnose migraines. The diagnosis is instead made by a doctor who investigates what could cause the headaches and rules out other potential diagnoses. Migraine is more common in younger people and usually decreases with age. [10]
2.3.5 Irritable Bowel Syndrome (IBS)
Irritable bowel syndrome is a condition associated with the digestive system. The most common symptoms of IBS are stomach pain or cramps, bloating, diarrhoea and constipation. The severity of the symptoms can vary from day to day. Certain foods or drinks may trigger the symptoms. Other, less common symptoms of IBS are farting, tiredness and lack of energy, backache and incontinence. There is no test for IBS and there is no diet that works for everyone. The patient has to discuss how to move forward with a doctor and/or dietitian in order to find a diet that works. [11][12]
2.3.6 Coeliac Disease
Coeliac disease is a condition in which the immune system attacks the person’s own tissue when they eat gluten. This prevents the body from taking up nutrients from food. Common symptoms of coeliac disease are diarrhoea, stomach aches, bloating and farting, indigestion and constipation. Other symptoms include fatigue due to malnutrition, unintentional weight loss, itchy rash, infertility, nerve damage and problems with coordination, balance and speech. There is no cure for coeliac disease, but the symptoms decrease when a gluten-free diet is followed. Coeliac disease is diagnosed by blood tests and biopsy. After the diagnosis, additional tests may be performed to check how the condition has affected the patient. [13][14]
2.4 RAKE-NLTK
The RAKE (Rapid Automatic Keyword Extraction) algorithm determines key phrases in a text by analyzing the frequency of the appearance of certain words and how they appear in conjunction with other words [15]. The algorithm parameters are stopwords, which are words with limited lexical meaning like “and” or “the”, a set of phrase delimiters and a set of word delimiters. The input is a text document which is first partitioned into candidate keywords. These are words or sequences of words delimited by stopwords or punctuation. A score is then given to each word in the candidate keywords list. This score is given by the formula:
$$ \text{Score} = \frac{\text{degree}(\text{word})}{\text{frequency}(\text{word})} $$
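The frequency is the number of times the word appears in the candidate list, and the degree is the number of co-occurrences of the word with the words of the candidate keywords it appears in, counting the word itself. Words that tend to appear in longer candidate keywords therefore score higher. The score of a candidate keyword is the sum of the scores of its words, and the top-scoring candidates are selected as keywords for the document.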
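The scoring described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the RAKE-NLTK implementation used in the study: the stopword list is a tiny hand-picked assumption (RAKE-NLTK relies on the NLTK stopword lists), and the function names `candidate_phrases`, `word_scores` and `rank_phrases` are introduced here for the example only.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; the real library uses NLTK's list).
STOPWORDS = {"a", "an", "and", "are", "do", "does", "have", "how",
             "is", "it", "of", "or", "the", "to", "you", "your"}

def candidate_phrases(text):
    """Split text into candidate keywords at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases

def word_scores(phrases):
    """Score each word as degree(word) / frequency(word)."""
    frequency = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            frequency[word] += 1
            # Co-occurrences within the phrase, counting the word itself.
            degree[word] += len(phrase)
    return {word: degree[word] / frequency[word] for word in frequency}

def rank_phrases(text):
    """Rank candidate keywords by the sum of their word scores."""
    phrases = candidate_phrases(text)
    scores = word_scores(phrases)
    ranked = {" ".join(p): sum(scores[w] for w in p) for p in phrases}
    return sorted(ranked.items(), key=lambda item: item[1], reverse=True)

print(rank_phrases("Do you have a headache? Is the headache severe and pulsating?"))
```

For this made-up question text, "headache" appears twice with degree 3 and so scores 1.5, while "severe" scores 2.0, which places the candidate "headache severe" (3.5) first.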
NLTK (Natural Language Toolkit) is a Python platform which provides libraries for working with human language data [16]. RAKE-NLTK is a library which uses the RAKE algorithm to determine the key phrases of a document, relying on the NLTK platform for the stopwords [17].
2.5 Jaccard Index
The Jaccard Index is used to describe the similarity between two sample sets. It is given by the formula:
$$ J(A, B) = \frac{|A \cap B|}{|A \cup B|} $$
The resulting index is a number between 0 and 1 which represents the similarity of the two sets, where 1 is total similarity and 0 total dissimilarity [18].
In the implementation of this index in the report, multisets were used instead of sets. This means that one element can appear several times in the same list. This is because the appearance of a word several times in an array means that it was used several times in the questions which bears an interesting meaning, as it is of value to study if it also occurs several times in other arrays.
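A multiset version of the index can be sketched with Python's `collections.Counter`, whose `&` and `|` operators take the minimum and maximum of the word counts respectively. The function name `multiset_jaccard`, the example words and the choice to return 0.0 for two empty inputs are assumptions made for this illustration; the code actually used in the study is the one in Appendix D.

```python
from collections import Counter

def multiset_jaccard(words_a, words_b):
    """Jaccard Index on multisets: duplicates are kept and counted."""
    a, b = Counter(words_a), Counter(words_b)
    intersection = sum((a & b).values())  # minimum count per word
    union = sum((a | b).values())         # maximum count per word
    # Convention chosen for this sketch: two empty inputs give 0.0.
    return intersection / union if union else 0.0

# "headache" and "pain" are shared once each; the union holds four elements.
print(multiset_jaccard(["headache", "headache", "pain"],
                       ["headache", "pain", "nausea"]))  # 0.5
```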
2.6 Cosine Similarity
To calculate the Cosine Similarity between two arrays of words, each array is converted to a vector that contains the number of times each word appears in the text. The Cosine Similarity is then given by the formula:
$$ \text{similarity}(a, b) = \cos(\theta) = \frac{a \cdot b}{\|a\| \, \|b\|} $$
a and b are the vectors consisting of the term frequency of each word appearing in one of the texts, and θ is the angle between them. The result ranges from 0 to 1, with 0 meaning that the texts have no words in common and 1 meaning that the vectors point in the same direction, that is, the texts use the same words in the same proportions. [19]
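The calculation can be sketched with term-frequency counts held in `collections.Counter` objects. The function name `cosine_similarity` and the example words are assumptions for this illustration; the study's own implementation may tokenize the keyword arrays differently, so small deviations from its reported values are possible.

```python
import math
from collections import Counter

def cosine_similarity(words_a, words_b):
    """Cosine of the angle between two term-frequency vectors."""
    a, b = Counter(words_a), Counter(words_b)
    # Only words present in both texts contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for this sketch: an empty text matches nothing
    return dot / (norm_a * norm_b)

print(cosine_similarity(["headache", "headache", "pain"],
                        ["headache", "pain", "nausea"]))
```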
2.7 Difference Between Jaccard Index and Cosine Similarity
When duplicates are kept, the Jaccard Index and the Cosine Similarity differ in both the numerator and the denominator. In the Jaccard Index the numerator is the size of the intersection of the two multisets, where a word is counted as many times as it appears in both of them (the smaller of its two counts). The denominator is the size of the union of the two multisets, where a word is counted as many times as it appears in either of them (the larger of its two counts). In Cosine Similarity the numerator is the dot product of the two term-frequency vectors, so a shared word contributes the product of its two counts, and the denominator is the product of the lengths of the two vectors. A word that is repeated many times in only one of the texts therefore lowers the Jaccard Index much more than it lowers the Cosine Similarity.
Example:
Multiset 1: “Friday Friday Friday Friday Friday Friday”
Multiset 2: “Friday Thursday”
The Jaccard Index implemented with multisets is then:
$$ \frac{|\{\text{Friday}\}|}{|\{\text{Friday}, \text{Friday}, \text{Friday}, \text{Friday}, \text{Friday}, \text{Friday}, \text{Thursday}\}|} = \frac{1}{7} \approx 0.143 $$
For the Cosine Similarity, the numerical vectorization of multiset 1 would be (6, 0) and the numerical vectorization of multiset 2 would be (1, 1). The Cosine Similarity would be:
$$ \frac{(6 \cdot 1) + (0 \cdot 1)}{\sqrt{6^2 + 0^2} \cdot \sqrt{1^2 + 1^2}} \approx 0.707 $$
2.8 Past Studies
Previous studies have investigated the accuracy of symptom checkers. One of the
most cited is a study by Semigran et al. (2015) which tested the accuracy of 23
symptom checkers of different kinds. All of the checkers were in English. The
conclusion of the study was that these symptom checkers had very low diagnosis
accuracy. Also, the triage symptom checkers often recommended seeking help
from a doctor when self-care was a more reasonable recommendation. However the study mentioned that these tools were quite new at the time, and could very well be improved in the next few years. [5]
There are also many studies that have investigated the results of symptom checkers for one specific disease. One such study, conducted by Powley et al. (2016), investigated the use of symptom checkers for inflammatory arthritis. Real patients were asked to enter their symptoms into two symptom checkers, one triage and one diagnosis. The study concluded that the diagnoses were frequently inaccurate and that the triage advice was most often inappropriate. The authors also suggested that the advice to use emergency services could result in inappropriate use of the healthcare system. [20]
One study conducted by Cornell University (2016) claimed to have created a symptom checker that gave better diagnosis results than an actual doctor [21]. However, this study was later criticized by three scientists in the medical journal The Lancet, who declared that its methods were so heavily flawed that the results could not be interpreted as true [22].
Another study, conducted by Chambers et al. (2019), reviewed 29 publications covering a total of 27 studies in order to find evidence of positive effects of symptom checkers. The study found that the accuracy of diagnosis symptom checkers was low. No specific number was given, probably because the studies investigated in the paper presented their results differently. Another finding was that in 85% of the cases, algorithm-based triage symptom checkers gave the advice to see a doctor. [23]
These studies show that it is hard to create a symptom checker with good results.
It is also hard to prove the accuracy of these symptom checkers, which makes the
evaluation of them tricky.
3 Method
The method that was used to examine the different symptom checker websites is described in this section. Part 3.1 explains how the data collection was carried out. Part 3.2 explains how the analysis of the data was conducted.
3.1 Data Collection
The data that was collected were the questions used and answers outputted by different symptom checkers when inputted with chosen values. This information was then analyzed to answer the research question.
3.1.1 Chosen Websites
Firstly, the websites used in the analysis were chosen. They were selected based on the reliability of the organisations which published them.
The study is divided into two parts, one focusing on checkers diagnosing different illnesses, and one on websites which focus on how the user should react based on which covid-19 symptoms they have.
Symptom Checkers Diagnosing Different Illnesses (General Symptom Checkers)
• Symptomate symptom checker (Triage & assessing diagnosis) https://symptomate.com/
• Mayo Clinic symptom checker (Triage & assessing diagnosis) https://www.mayoclinic.org/symptom-checker/select-symptom/itt-20009075
• Isabel symptom checker (Triage & assessing diagnosis) https://symptomchecker.isabelhealthcare.com/
• Australian governmental organisation healthdirect (Triage) https://healthdirect.gov.au/symptom-checker/tool
Covid-19 Self-Test Tools
All the covid-19 tools used are triage symptom checkers.
• Symptom checker in the app “Kry”
• The self-test published by Stockholm region https://corona.sll.se/
• The self-test published by the Welsh National Health Service https://www.nhsdirect.wales.nhs.uk/SelfAssessments/symptomcheckers/COVID19.aspx
• The self-test recommended by the French government https://maladiecoronavirus.fr/
3.1.2 Input Data
To decide which data should be inputted, the sought output was first decided. The output could be a specific disease (for example “Migraine”) or an assessment of how serious the condition is based on the triage advice. This section describes exactly which symptoms were inputted to obtain each sought output. The complete list of questions for each symptom checker and the values inputted can be found in Appendix A and Appendix B respectively.
General Input Data
When the tests asked general questions about the individual using the tool, the answers followed the template below, a standard person based on the average woman in Sweden in 2016 [24][25].
Gender: Female
Age: 41
Height: 166 cm
Weight: 68 kg
Other: No pregnancy, smoking or chronic disease. No painful menstrual periods.
Input Data for General Symptom checkers
Symptoms to input were decided based on the diseases that had been chosen. The
exact input for each symptom checker and disease are enclosed in the appendix.
Disease 1: Tonsillitis
Input symptoms (6): swollen tonsils, sore throat, headache, bad breath, fever (between 38 °C and 40 °C), difficulty swallowing
Disease 2: Pneumonia
Input symptoms (5): wet cough, difficulty breathing, rapid heartbeat, high body temperature (between 38 °C and 40 °C), chest pain (stabbing)
Disease 3: Migraine
Input symptoms (5): headache (pulsating and on one side), nausea, dizziness, loss of appetite and light sensitivity
Disease 4: IBS
Input symptoms (7): stomach cramps (below belly button), diarrhoea, bloating, constipation, tiredness, stomach pain decreases after bowel movement or passing gas, stomach pain after eating
Disease 5: Coeliac Disease
Input symptoms (5): diarrhoea (foamy), fatigue, unexplained iron-deficiency anemia, bone or joint pain, missed menstrual periods
Input Data for Covid-19 Symptom Checkers
To test this type of tool, the following four cases were created, with the input listed for each.
Case 1: Person with no symptoms.
Case 2: Person who shows some symptoms. The standard answer for this person was a 38.5 °C fever, and some coughing and tiredness.
Case 3: Person who shows a lot of symptoms. The standard answer for this person was a 39.5 °C fever, coughing, tiredness, difficulty breathing, loss of taste and muscle pain.
Case 4: Person who shows some symptoms (the same as Case 2), has risk factors and has met a person who has covid-19 and/or travelled abroad recently.
3.2 Data Analysis
After inputting the chosen symptoms into the websites, the questions used by the websites and the given answers were saved in a Google Sheets document.
The analysis of the data for both types of symptom checkers followed several steps.
3.2.1 General Symptom Checkers
Firstly, the questions asked by the different symptom checkers were compared. The motivation behind this is that if the questions are found to be similar, the reliability of the different checkers should also be similar. In addition, if several checkers published by reliable sources ask similar questions, this suggests that their results are reliable. The questions were converted to Python arrays, with one array listing the questions asked by one symptom checker. Each array was then converted to one long string consisting of all the questions, and the “key expressions” of the string were extracted with the RAKE-NLTK library. Then, the punctuation was removed from the resulting array, and the expressions were further divided into single words. Finally, the Jaccard Index and the Cosine Similarity were calculated for each pair of keyword arrays. An illustration of the process is presented in Appendix E, and the code used for the conversion is shown in Appendix D.
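The post-processing and pairwise comparison steps can be sketched as follows. This is an illustration under stated assumptions, not the code of Appendix D: the key expressions are taken as already extracted, the checker names and phrases are made up, punctuation is simply deleted, and `phrases_to_words`, `multiset_jaccard` and `pairwise_scores` are helper names introduced here.

```python
import string
from collections import Counter
from itertools import combinations

def phrases_to_words(phrases):
    """Remove punctuation from key expressions and split them into single words."""
    table = str.maketrans("", "", string.punctuation)
    words = []
    for phrase in phrases:
        words.extend(phrase.translate(table).split())
    return sorted(words)

def multiset_jaccard(words_a, words_b):
    """Jaccard Index on multisets (duplicates kept)."""
    a, b = Counter(words_a), Counter(words_b)
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 0.0

def pairwise_scores(keywords_by_checker, metric):
    """Apply a similarity metric to every pair of checkers."""
    return {
        (name_a, name_b): metric(keywords_by_checker[name_a],
                                 keywords_by_checker[name_b])
        for name_a, name_b in combinations(keywords_by_checker, 2)
    }

# Hypothetical key expressions, standing in for the keyword extraction output.
key_phrases = {
    "Checker A": ["severe headache", "headache located?"],
    "Checker B": ["pain located", "onset"],
}
keywords = {name: phrases_to_words(p) for name, p in key_phrases.items()}
print(pairwise_scores(keywords, multiset_jaccard))
```

Passing the metric as a parameter lets the same loop produce both the Jaccard and the Cosine Similarity tables.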
Secondly, the triage advice given by the symptom checkers was analyzed. The advice was first classified into categories (different types of answers), which made the different answers easier to understand and to compare with each other. The advice was then evaluated and compared across checkers. The goal was to see whether the triage advice seemed well-tailored to the symptoms inputted.
Thirdly, the accuracy of the diagnoses was determined by checking the rank of the correct diagnosis in the list of suggested diagnoses. The ranks were presented in a table to be analyzed.
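The ranking check can be expressed as a small helper. The study performed this step manually, so the function `diagnosis_rank` and the suggestion list below are purely hypothetical and serve only to make the criterion explicit.

```python
def diagnosis_rank(suggestions, correct):
    """Return the 1-based rank of the correct diagnosis, or None if it is absent."""
    normalized = [s.strip().lower() for s in suggestions]
    try:
        return normalized.index(correct.strip().lower()) + 1
    except ValueError:
        return None

# Hypothetical output of a diagnosis symptom checker, not data from the study.
suggested = ["Tension headache", "Migraine", "Sinusitis"]
print(diagnosis_rank(suggested, "migraine"))  # 2
```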
3.2.2 Covid-19
The questions were first converted to Python arrays here as well. The arrays that consisted of French and Swedish questions were translated into English by an implementation of Google Translate. Then, the questions and the triage advice were analyzed in the same way as the data produced by the general symptom checkers. This type of symptom checker did not output diagnoses.
4 Results
In this section, the results from the data analysis are presented. The exact questions, input and output for each case are enclosed respectively in Appendix A, B and C.
4.1 General Symptom Checkers
4.1.1 Similarity of questions
The questions that were processed for the general symptom checkers are the ones asked by the checkers when inputted with migraine symptoms. The keywords extracted from the different arrays of questions are the following:
Keywords of the Symptomate questions (length = 55)
['1', '10', '2500', '8200', 'add', 'age', 'arms', 'cholesterol', 'cigarettes', 'describe', 'dizziness', 'episodes', 'even',
'experiencing', 'following', 'ft', 'headache', 'headache',
'headaches', 'headaches', 'high', 'hypertension', 'injured', 'last', 'legs', 'level', 'lightheadedness', 'located', 'location', 'long', 'move', 'obese', 'overweight', 'past', 'please', 'please',
'pregnant', 'recently', 'recently', 'regions', 'scale', 'sea', 'select', 'select', 'select', 'sex', 'similar', 'smoke', 'strong', 'symptoms', 'symptoms', 'try', 'usually', 'weakness', 'would']
Keywords of the Mayo Clinic questions (length = 12)
['accompanied', 'choose', 'duration', 'headache', 'located', 'onset', 'pain', 'recurrence', 'relieved', 'symptom', 'triggered', 'worsened']
Keywords of the Isabel questions (length = 43)
['activities', 'affecting', 'age', 'better', 'birth', 'cancer',
'changed', 'condition', 'conditions', 'country', 'daily', 'days',
'describe', 'develop', 'diabetes', 'discomfort', 'etc', 'feel',
'gender', 'heart', 'hours', 'last', 'list', 'long', 'long',
'medication', 'much', 'pain', 'pregnant', 'quickly', 'recently', 'residence', 'select', 'serious', 'symptoms', 'symptoms', 'symptoms', 'symptoms', 'symptoms', 'taking', 'term', 'visited', 'words']
Keywords of the Health Direct questions (length = 108)
['activities', 'age', 'anything', 'anything', 'anywhere', 'area', 'area', 'arm', 'arms', 'bad', 'bleeding', 'blow', 'blue', 'body', 'body', 'bothering', 'bright', 'bruise', 'bumps', 'came', 'chest', 'chin', 'clearly', 'clusters', 'confused', 'could', 'difficulty', 'discoloured', 'drooped', 'drowsy', 'extremely', 'facial', 'feeling', 'flat', 'following', 'gender', 'head', 'headache', 'headache',
'illness', 'include', 'injured', 'isolated', 'itchy', 'joints', 'knock', 'last', 'light', 'like', 'like', 'likely', 'limbs', 'looking', 'looks', 'mouth', 'move', 'onset', 'others', 'pain', 'painful', 'patches', 'patient', 'pinprick', 'possible', 'purple', 'raise', 'raised', 'rash', 'red', 'red', 'required', 'say',
'serious', 'serious', 'severe', 'severe', 'severe', 'skin', 'skin', 'skin', 'slight', 'small', 'small', 'small', 'smile', 'speak', 'speech', 'speed', 'spots', 'spots', 'stop', 'stops', 'stroke', 'suddenly', 'symptom', 'symptoms', 'symptoms', 'symptoms',
'symptoms', 'symptoms', 'tiny', 'understand', 'unusually', 'usual', 'weakness', 'weakness', 'week', 'without']
Table 4.1: The Jaccard Index of the different types of combinations of symptom checker questions (which have been previously transformed into an array of keywords)
                 Symptomate   Mayo Clinic   Isabel   Australia H.D.
Symptomate       1            0.0308        0.1011   0.0724
Mayo Clinic      0.0308       1             0.0185   0.0345
Isabel           0.1011       0.0185        1        0.0786
Australia H.D.   0.0724       0.0345        0.0786   1
The diagonal, shaded grey in the original table, holds the index of each symptom checker with itself, which is always one.
Table 4.2: The Cosine Similarity of the different types of combinations of symptom checker questions (which have been previously transformed into an array of keywords)
                 Symptomate   Mayo Clinic   Isabel   Australia H.D.
Symptomate       1            0.1035        0.3113   0.2053
Mayo Clinic      0.1035       1             0.0358   0.1127
Isabel           0.3113       0.0358        1        0.31
Australia H.D.   0.2053       0.1127        0.31     1
The Jaccard Index and the Cosine Similarity show the same trend in tables 4.1 and 4.2. The highest score is yielded by the comparison between Symptomate and Isabel and the lowest between Isabel and Mayo Clinic.
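As a sanity check, the Symptomate and Mayo Clinic entry of Table 4.1 can be recomputed from the keyword arrays listed above. The sketch assumes that the multiset intersection and union are taken as the minimum and maximum word counts; `multiset_jaccard` is a helper name introduced here and not necessarily the function used in Appendix D.

```python
from collections import Counter

symptomate = [
    '1', '10', '2500', '8200', 'add', 'age', 'arms', 'cholesterol',
    'cigarettes', 'describe', 'dizziness', 'episodes', 'even',
    'experiencing', 'following', 'ft', 'headache', 'headache',
    'headaches', 'headaches', 'high', 'hypertension', 'injured', 'last',
    'legs', 'level', 'lightheadedness', 'located', 'location', 'long',
    'move', 'obese', 'overweight', 'past', 'please', 'please',
    'pregnant', 'recently', 'recently', 'regions', 'scale', 'sea',
    'select', 'select', 'select', 'sex', 'similar', 'smoke', 'strong',
    'symptoms', 'symptoms', 'try', 'usually', 'weakness', 'would',
]
mayo_clinic = [
    'accompanied', 'choose', 'duration', 'headache', 'located', 'onset',
    'pain', 'recurrence', 'relieved', 'symptom', 'triggered', 'worsened',
]

def multiset_jaccard(words_a, words_b):
    """Jaccard Index on multisets (duplicates kept)."""
    a, b = Counter(words_a), Counter(words_b)
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 0.0

# Words counted in the multiset intersection.
shared = Counter(symptomate) & Counter(mayo_clinic)
print(sorted(shared.elements()))
print(round(multiset_jaccard(symptomate, mayo_clinic), 4))
```

The two arrays share only "headache" and "located" once each, and their union holds 65 elements, so the index is 2/65 ≈ 0.0308, which agrees with the value in Table 4.1.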
4.1.2 Triage Advice
The triage advice given by the different symptom checkers for the same symptoms was classified into categories to make it easier to analyze and compare. The manual categorizing of the triage advice consists of five categories: ‘emergency’, ‘urgent’, ‘slightly urgent’, ‘call a nurse immediately’ and ‘self-care’. Emergency means that the triage result stated that the patient should immediately call an ambulance or go to an emergency room. Urgent means that the patient should see a doctor within 24 hours. Slightly urgent means that the patient should see a doctor within a couple of days or if the symptoms get worse. Call a nurse immediately means that the advice given was to call a nurse immediately. Self-care means that the triage advice stated that the patient can take care of their symptoms at home.
Table 4.3: The triage advice given by the different symptom checkers for the different illnesses
Columns: Emergency, Urgent (within 24 hrs), Slightly Urgent (couple of days), Call a Nurse Immediately, Self-Care
Symptomate: Migraine, IBS, Celiac Disease, Tonsillitis, Pneumonia
Mayo Clinic: IBS, Migraine Celiac Disease, Tonsillitis, Pneumonia
Isabel: Pneumonia Migraine, Celiac Disease, Tonsillitis, IBS
Australia H.D.: IBS, Pneumonia Celiac Disease Migraine Tonsillitis