
F18021

Degree project, 30 credits (Examensarbete 30 hp), June 2018

The effects of implementing domain knowledge in a recommender system

Kerstin Ersson


Teknisk-naturvetenskaplig fakultet, UTH-enheten (Faculty of Science and Technology)

Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0
Postal address: Box 536, 751 21 Uppsala
Telephone: 018 – 471 30 03
Fax: 018 – 471 30 00
Website: http://www.teknat.uu.se/student

Abstract

The effects of implementing domain knowledge in a recommender system

Kerstin Ersson

This thesis presents a domain knowledge based similarity measure for recommender systems, using Systembolaget's open API with product information as input data. The project includes developing the similarity measure, implementing it in a content-based recommender engine, and evaluating the model and comparing it to an existing model that uses a bag-of-words based approach. The developed similarity measure uses domain knowledge to calculate the similarity of three features, grapes, wine regions and production year, in an attempt to improve the quality of recommendations. The results show that the bag-of-words based model performs slightly better than the domain knowledge based model in terms of coverage, diversity and correctness. However, the results are not conclusive enough to discourage further investigation of using domain knowledge in recommender systems.

Supervisor: Siri Persson


Popular science summary (Populärvetenskaplig sammanfattning)

Today's Internet is full of data. Every second, roughly 8,000 tweets are posted, 3,000 Skype calls are made and 67,000 Google searches are performed, according to the site Internet Live Stats.[1]

To navigate the amounts of data we encounter when searching for information on Google, watching series on Netflix or shopping on Amazon, we need help. To assist their users, many of these sites employ so-called recommender systems, which help the user find relevant results.[2]

Recommender systems are often based on two datasets, one with information about users and one with information about items. The systems base their recommendations on facts about the user's preferences and/or information about the items.

Systems that compare users to find items to recommend are called collaborative filtering systems, and systems that compare data about the items are called content based.[3][4] The recommender system used in this project is content based.

To compare the items to be recommended, a method for calculating their similarity (a similarity measure) is needed. This measure has to be adapted to the type of data that the item descriptions contain.[5] Much of the data found on the Internet today is free-text data, for which a common way to measure similarity is the so-called bag-of-words method. It represents texts as vectors, where each element denotes a word that the text contains. The distance between the vectors is then calculated with, for example, cosine similarity.[6]

This project investigates whether the performance of a recommender system improves if the similarity calculation takes domain knowledge into account. Instead of only comparing whether two items have the same attributes, the system takes into consideration that an attribute can be more or less similar to various other attributes. The data used to test this comes from Systembolaget's product database for wines. Two models have been compared: one that uses the bag-of-words model and one that uses domain knowledge.

The results show that the bag-of-words model performs slightly better with respect to the three evaluation factors used: coverage, diversity and correctness.

However, the results are not unambiguous, and further evaluation is required before domain knowledge in recommender models is dismissed.


Contents

1 Introduction
2 Background
  2.1 Problem formulation
  2.2 Project scope
  2.3 Related work
3 Theory
  3.1 Recommender systems
    3.1.1 Collaborative filtering
    3.1.2 Content-based recommendations
  3.2 Evaluating recommender systems
    3.2.1 Evaluation metrics used for this project
  3.3 Statistical data types
  3.4 Vector representation
    3.4.1 Example
  3.5 Similarity measures
    3.5.1 Cosine similarity
    3.5.2 Item-item similarity matrix
  3.6 Wine
    3.6.1 Grapes
    3.6.2 Wine regions
    3.6.3 Year of production
  3.7 Software
    3.7.1 Recommender Engine
    3.7.2 The website
4 Method
  4.1 The bag-of-words based recommender system
    4.1.1 Similarity score distribution
  4.2 The domain knowledge based recommender system
    4.2.1 Developing the customized similarity measure
    4.2.2 Grape similarity score
    4.2.3 Wine region similarity score
    4.2.4 Production year similarity score
    4.2.5 Weighting and normalizing similarity scores
    4.2.6 Similarity score distribution
5 Data processing
  5.1 Preprocessing the data
    5.1.1 The filtering process
  5.2 The processed dataset
  5.3 Cleaning data features
6 Training and evaluating the recommender systems
  6.1 Training
  6.2 Making recommendations
  6.3 Evaluating the algorithm
    6.3.1 Coverage
    6.3.2 Diversity
    6.3.3 Correctness
7 Result
  7.1 Performance of weighted setups for the knowledge based model
  7.2 Average similarity, coverage and diversity
  7.3 Correctness
8 Discussion
  8.1 The bag-of-words based recommender system
  8.2 The domain knowledge based recommender system
9 Conclusions
  9.1 Future work
A Appendix A
B Appendix B
C Appendix C
D Appendix D


1 Introduction

Today, machine learning and data mining are integrated in many of our daily routines. Data is gathered when we are googling, shopping online, listening to music, watching streaming TV services or scrolling through social media.

To sort through these large amounts of data in order to find patterns or correlations, one can use data mining.

As users of these online services, we constantly have to filter through huge amounts of data to find items that are relevant to us. To improve user experience, and to improve sales, most of these services use so-called recommender systems (RS) to help the user navigate the information overload.[7]

Recommender systems today are, in general, built on either similarity between users or similarity between items.[8] To determine whether two users or items are similar, one needs to use some kind of similarity measure.[9] In Figure 1, we see two well-known examples of recommender systems based on similarity between users, from Spotify and Netflix.

(a) Spotify recommender for radio stations

(b) Netflix recommender for series

Figure 1: Two examples of recommender systems for streaming services


This project will focus on the type of RS where item-item similarity determines the recommendations. In most such systems, items are represented by a number of features, and the engine simply checks whether a feature or tag is present or not; it does not consider inter-feature similarities. E.g. a movie recommender only checks whether or not a movie is a comedy, and does not consider a drama more similar to a comedy than a horror film is. A similar problem is present in free text analysis, where the RS does not recognize synonyms.[2]

For recommender systems that base their recommendations on items that the user previously liked, one big problem is overspecialization. An overspecialized RS only recommends items that are very similar to the previously liked ones, never anything new and unexpected. In order for an RS to be of value to a user, it has to be able to recommend items that the user has not yet discovered, and could not easily discover by themselves.[3]

To combat these challenges, one solution could be to implement domain knowledge in the similarity measure of the RS, making it possible to recommend items that are similar but not necessarily in the same genre. This hypothesis is tested in this project; it requires a more complicated system and more work in the implementation phase.


2 Background

2.1 Problem formulation

The purpose of this project is to investigate whether implementing domain knowledge in the similarity measure can improve the performance of a recommender engine. The goal is to develop a similarity measure that is specified to a certain dataset, and compare its performance to that of an existing algorithm for content based recommender systems.

2.2 Project scope

Due to time constraints, the scope of the project is very specific. The RS will take one single item as input and, based on that, provide the user with the top ten recommendations.

Three performance measures will be studied:

• Coverage: How many of the items in the dataset are actually recommended to the user for a set number of randomized inputs?

• Diversity: How different are the recommended items for a certain input?

• Correctness: Which of the two models can present the most relevant recommendations to the user?

The project’s main focus is the similarity measure, and the impact it has on the RS. Other important performance measures, such as usability and system-centric factors, will not be tested. However, some computational aspects will be taken into consideration in the discussion section of the report.

2.3 Related work

Recommender systems are very common in online services today, where they play a big part in helping the user navigate huge amounts of data. Since these systems are so essential, a lot of research has been done in the field. Earlier work, up to 2007, mostly focused on recommender systems applied to shopping and movie recommendations, but since 2007, the scope has expanded to include other applications such as documents and music.[10]

Domain knowledge is introduced in systems to improve personalization of recommendations, and is also used to improve cross-domain recommendations.

Two main approaches introduced by several authors are case-based RSs and an RS model similar to content-based systems. In the case-based approach, the user enters a problem description, which is then compared to other cases stored in a database. The most similar case is then presented to the user as the solution. This model is presented by several authors, such as Khan et al.[11] and Chattopadhyay et al.[12]

The work of Khan et al. resulted in a knowledge based recommender system (KBRS) for diet menus called MIKAS, which can be used for instance in a hospital, where many people have dietary restrictions due to different health conditions.[11] Chattopadhyay et al. built a KBRS for medical diagnosis, specifically PMS. For each new case, medical experts are presented with k similar cases, and if the result is satisfactory, the case is added to the case base.[12]

Several authors have also developed systems similar to content-based RSs, where a similarity measure relates items in a knowledge base to a user in the user base. Towle et al. developed an explicit user and product description model where, instead of using implicit user models based on user ratings, items and users are explicitly labeled.[13]

A model presented by Ghani et al. uses a knowledge base with items and attributes associated with them. The feature extraction was done using text learning methods, and is supposed to give a semantic context to the item and thus also to the user profile.[14] A similar approach was taken by Martínez et al., where the aim was to find a user preference model that uses a qualitative rather than a quantitative measure. The user profile is inferred from an example of an item that the user likes, and the RS then finds matching items in the knowledge base.[15]


3 Theory

This section describes the techniques and models used in the project, how they work and why they were used. The theory behind recommender systems and similarity measures is explained, and some background information about wine characteristics used in this project is described.

3.1 Recommender systems

Data mining has a wide range of applications, everything from research to directed ads. Another example is recommendations, to find new content that might be of interest to a specific user.

The need for recommender systems can be derived from the so-called long tail issue. Compare a physical record store to a streaming web service: the range of available items in a physical store is subject to constraints, such as shelf space, that its online counterpart does not have. Thus, in the store you might find a couple thousand records, while for example Spotify has millions of titles.[16] With our online presence, even extremely niched items have a place, as long as they can be presented to the relevant users.[17]

For recommender systems (RS), the available data usually contains a set of users and a set of items.[4] There are three RS techniques in particular:

1. Collaborative filtering (CF): bases recommendations on the ratings of an item i by a set of users that are similar to the user u.

People who liked this also liked...

People with preferences similar to you also liked...

2. Content-based (CB) methods use the features of an item to make recommendations.

If you like this you might also like...

3. Hybrid approaches use both methods (1) and (2) combined.

3.1.1 Collaborative filtering

Recommender systems based on collaborative filtering usually only need access to user rating data in order to provide recommendations; no other information about users or items is needed. This is the most common type of RS, and a well researched field.[18] In 2006, Netflix held a $1 million competition for the public to improve their RS, thus giving scientists and researchers access to a real, large dataset with over 100 million ratings.[19]

Collaborative filtering algorithms can in turn be divided into subcategories. Memory-based CF algorithms memorize the full matrix of user-item ratings and use it to make predictions. Model-based algorithms, on the other hand, use the user-item matrix to create a parametrized prediction model, which is then used to make predictions.[20]

User-based models are the most popular kind of memory-based algorithms. They identify a set of users that are similar to the active user, the so-called neighborhood, and use their ratings to predict whether the active user will like new items or not.[18] Item-based CF models similarly look at the set of items that the active user has previously rated and identify the items that are most similar to a specific new item. The model takes the average of the user's ratings on these selected items to predict what score the item would get.[21]
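The item-based prediction step described above can be sketched in a few lines. This is a minimal illustration, not the thesis code; the item names, tag-based similarity and data are all invented for the example.

```python
# Item-based CF sketch: predict a user's score for a new item by averaging
# the user's ratings on the k items most similar to it.
# All names and data here are illustrative, not from the thesis.

def predict_rating(user_ratings, similarity, new_item, k=2):
    """user_ratings: {item: rating} for items the user has rated.
    similarity: function (item_a, item_b) -> similarity score."""
    # Rank the user's rated items by their similarity to the new item
    ranked = sorted(user_ratings, key=lambda i: similarity(i, new_item), reverse=True)
    neighbours = ranked[:k]
    # Average the user's ratings on the k most similar items
    return sum(user_ratings[i] for i in neighbours) / len(neighbours)

# Toy similarity: items carry genre tags; similarity = number of shared tags
tags = {"film_a": {"drama"}, "film_b": {"comedy"}, "film_c": {"comedy", "drama"}}
sim = lambda a, b: len(tags[a] & tags[b])

ratings = {"film_a": 2.0, "film_b": 5.0}
print(predict_rating(ratings, sim, "film_c", k=2))  # averages both ratings
```

With k=2 both rated films are neighbours of film_c, so the prediction is simply their mean rating.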

One of the most prominent advantages of a CF model is its transferability. As long as the model has access to user-item ratings it can make predictions, regardless of the diversity of the present items. The same models can be used in many contexts, for movie recommendations as well as in e-commerce.

Common problems for CF-based models include the cold start issue: the difficulty of recommending items with no reviews, or of recommending items to users who have not yet rated any items. Another big issue is the sparsity of the user-item matrix for systems with many users and/or many items, which also causes problems with regards to scalability.[21]

3.1.2 Content-based recommendations

For content-based filtering, recommendations are commonly based on the user profile, a register of the user's previous preferences. The algorithm finds the set of items that are most similar to items that the current user has rated highly.[2] Since this project is not tested online, the user profile will consist of a single, positively rated item, provided by the user.

The similarity measurement is based on categories or keywords assigned to the items. Which keywords or categories are relevant depends on the current user.[22] Content-based RS models can be summarized in three basic steps:[2]


• Structuring data: In particular, data where item descriptions are in the form of free text, has to go through some preprocessing before being fed to the RS.

• Assembling user profile: In this project, the user profile will simply consist of the description of the one item given as input to the RS. For more advanced content-based RS models, the input in this step is usually a number of items and their ratings, which then have to be generalized.

• Filtering: The RS uses the user profile to generate similar items for recommendations, either as single items or as ranked top lists. To do this, some similarity measure is used. The similarity measure developed during this project is presented in Section 4.2.

Content-based recommender systems solve some of the problems of collaborative filtering systems: they can recommend items that have not yet received any ratings, and they do not suffer from the rating sparsity problem. Content-based RSs can also quickly adapt to new user preferences, and since they do not require extensive information about the user, they are easy to make secure. Challenges with these models include relying on descriptive item data to make good recommendations, and the fact that the system can only recommend items that are similar to items already rated by the user.[23]

3.2 Evaluating recommender systems

In Table 1, some variables to take into consideration when evaluating an RS are listed. These are grouped into four different categories, recommendation-centric, user-centric, system-centric and delivery-centric.[22]


Table 1: Categories of variables for RS evaluation

Recommendation-centric
• Correctness: Compare made recommendations to some set of recommendations that are considered correct.
• Coverage: How well does the RS cover the item or user set?
• Diversity: How dissimilar are the recommended items?
• Recommender confidence: How confident is the RS in its recommendations?

User-centric
• Trustworthiness: How trustworthy are the recommendations?
• Novelty: How well does the RS find recommendations that are new or unknown to the user?
• Serendipity: Does the system find surprising but good recommendations?
• Utility: What is the value gained from the RS for users?
• Risk: User risk associated with each recommendation.

System-centric
• Robustness: How well does the RS tolerate bias or false information?
• Learning rate: How fast can the RS assimilate new information?
• Scalability: How scalable is the RS?
• Stability: Are the recommendations consistent over time?
• Privacy: Are there risks to user privacy?

Delivery-centric
• Usability: Is the system user friendly?
• User preference: How do users perceive the RS?


The recommendation-centric variables focus mostly on objective evaluation of the recommendations, while user-centric variables focus on how the user experiences them. System-centric variables do not focus on the recommendations but on the system itself, its robustness, etc. The delivery-centric variables focus on how the user interacts with the recommendations, and whether the system is easy to use.[22]

3.2.1 Evaluation metrics used for this project

For this project, online measurement of engine performance or user satisfaction with many users is not an option. The evaluation will focus on the three recommendation-centric measures correctness, coverage and diversity, as we want to investigate the effect of a customized similarity measure. Diversity and coverage are objective measures, which can be tested offline, while correctness will require online testing, as we do not have access to labeled training data.

For more information on how these evaluation metrics were tested, see Section 6.3.

3.3 Statistical data types

Most of the unprocessed data available for use in recommender systems is free text data in the form of descriptions or reviews. However, statistical data can contain two general kinds of variables. Nominal data is qualitative and can be symbols or names of things. Nominal data that takes a fixed number of values, e.g. colors, is referred to as categorical. Numerical data is quantitative and usually represented by numbers. For numerical data, statistical information such as mean and median values can be of interest, but for nominal data it usually is not meaningful.[24]

3.4 Vector representation

When determining similarity between items containing non-numerical features, converting the data to some numerical representation is a common choice of method. One way to do this is to use vector representation.[25]

For free text data, the most commonly used method is bag-of-words (BOW), which uses word stems as representation and assumes that the order in which the words appear has little importance. Thus, a text is represented by each distinct word, w_i, appearing in it. The value for each feature is the number of times that word appears in the text. Common limitations to this method include excluding stop-words, such as "and" or "but", and words which appear fewer times than some threshold value.[26]

3.4.1 Example

We have the sentences (1) "Adam likes apples" and (2) "Karen does not like apples". With BOW (but without stop-word removal and stemming), these are represented according to Equation 1.

(1) Adam likes apples ⇒ s1 = {Adam : 1, likes : 1, apples : 1}
(2) Karen does not like apples ⇒ s2 = {Karen : 1, does : 1, not : 1, like : 1, apples : 1}    (1)

To compare these sentences, we rewrite the vectors to contain all possible words, in this case s = {Adam, Karen, does, not, likes, like, apples}, see Equation 2.

(1) s1 = {Adam : 1, Karen : 0, does : 0, not : 0, likes : 1, like : 0, apples : 1} ⇒ s1 = [1, 0, 0, 0, 1, 0, 1]
(2) s2 = {Adam : 0, Karen : 1, does : 1, not : 1, likes : 0, like : 1, apples : 1} ⇒ s2 = [0, 1, 1, 1, 0, 1, 1]    (2)

From here, these vectors can be compared using some similarity measure, e.g. cosine similarity.
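The example above can be reproduced in a few lines of Python. This is a standard-library sketch, not the thesis code:

```python
from collections import Counter

# Build bag-of-words vectors for the two example sentences (no stop-word
# removal or stemming, as in the example above).
s1_text = "Adam likes apples"
s2_text = "Karen does not like apples"

counts1 = Counter(s1_text.split())
counts2 = Counter(s2_text.split())

# Shared vocabulary: all distinct words from both sentences, in a fixed order
vocab = ["Adam", "Karen", "does", "not", "likes", "like", "apples"]

# Counter returns 0 for words that do not occur in a sentence
s1 = [counts1[w] for w in vocab]  # -> [1, 0, 0, 0, 1, 0, 1]
s2 = [counts2[w] for w in vocab]  # -> [0, 1, 1, 1, 0, 1, 1]
print(s1, s2)
```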

3.5 Similarity measures

Both in CF and CB recommender engines, similarity measures play an important part and can impact how well the RS performs. The most common choices, for CF models in particular, are cosine similarity or the Pearson correlation coefficient.[27]

Each data type, i.e. nominal, numerical etc, has a different set of appropriate similarity measures. However, real data usually contains data of mixed types.

Despite this, similarity measures for mixed data are relatively unexplored.[28]


Previous work mostly focuses on similarity measures for clustering applications, such as a method where a preprocessing step converts all data into either numerical or nominal form.[29]

Some research has been done on similarity measures that apply domain knowledge or semantics to make the similarity measure "smarter". Semantic similarity measures have a wide range of applications in natural language processing and related areas.[30] One example application is cross-domain recommendation, usable for example in e-commerce.[31]

3.5.1 Cosine similarity

When items can be represented as vectors, a common choice for measuring the similarity of two vectors is to take the cosine of the angle between them. This measure is called cosine similarity.[2] For two vectors u and v, the cosine similarity is given by Equation 3.

cos(u, v) = (u · v) / (||u|| ||v||)    (3)
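Equation 3 translates directly into code. Applied to the example vectors from Section 3.4.1, only "apples" overlaps, so the similarity is low. A standard-library sketch:

```python
import math

def cosine_similarity(u, v):
    """cos(u, v) = (u . v) / (||u|| ||v||), as in Equation 3."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

s1 = [1, 0, 0, 0, 1, 0, 1]
s2 = [0, 1, 1, 1, 0, 1, 1]
print(round(cosine_similarity(s1, s2), 3))  # only "apples" is shared
```

The dot product is 1 and the norms are √3 and √5, so the score is 1/√15 ≈ 0.258.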

3.5.2 Item-item similarity matrix

The item-item similarity matrix is central to the recommender systems in this project. For a recommender engine with n items, the similarity matrix will be an n×n matrix. The matrix is built by calculating the similarity scores between all possible pairs of items, as seen in Algorithm 1. Since the similarity score between two items is symmetric, similarity(i, j) = similarity(j, i), the similarity matrix will be symmetrical. Thus, similarity scores only need to be calculated for half the matrix.

Algorithm 1: Assemble similarity matrix with n items
Data: Product dataset
Result: Item-item similarity matrix, sim_mat

for all n items i do
    sim_mat(i,i) = 1;  /* the similarity score for an item with itself is always 1 */
    for all items j < i do
        calculate the similarity between item i and item j;
        set sim_mat(i,j) and sim_mat(j,i) to similarity(i,j);
    end
end
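Algorithm 1 translates to Python roughly as follows. This is a sketch, not the thesis implementation; the `similarity` argument is a placeholder for either of the two measures compared in this project, and the toy items are invented.

```python
def build_similarity_matrix(items, similarity):
    """Assemble a symmetric item-item similarity matrix (Algorithm 1).
    items: list of item descriptions; similarity: function (a, b) -> score."""
    n = len(items)
    sim_mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sim_mat[i][i] = 1.0  # an item is always fully similar to itself
        for j in range(i):   # only the lower triangle needs computing
            score = similarity(items[i], items[j])
            sim_mat[i][j] = score
            sim_mat[j][i] = score  # mirror into the upper triangle
    return sim_mat

# Toy example with feature sets and overlap-based similarity
items = [{"red", "fruity"}, {"red", "bold"}, {"white"}]
overlap = lambda a, b: len(a & b) / len(a | b)
mat = build_similarity_matrix(items, overlap)
```

Only n(n−1)/2 similarity calls are made, exactly as the symmetry argument above allows.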

3.6 Wine

Two of the factors that most affect the taste and style of a wine are the grapes it is made of and the location where the grapes are grown.[32] Thus, these were two features of great importance for the similarity measure developed in this project.

3.6.1 Grapes

No one knows exactly how many grape varieties exist today, but there are at least several thousand. However, most modern wines use only a few dozen varieties in different combinations. Fine wines are said to originate from France, and thus most grapes are of French descent.[32]

Different aromas and flavors are associated with different grapes. These aromas can be grouped into flavor profiles such as spicy, round and high tannin, which in turn form style profiles like fruity, savory and sweet.[33]


Table 2: Grape varieties found in Systembolaget's product dataset, sorted into blue and green grapes

Blue grapes: Aglianico, Barbera, Cabernet Franc, Cabernet Sauvignon, Carmenère, Corvina, Gamay, Grenache, Malbec, Merlot, Mourvèdre, Nebbiolo, Negroamaro, Nero d'Avola, Pinot Noir, Pinotage, Primitivo, Sangiovese, Syrah/Shiraz, Tannat, Tempranillo, Touriga Nacional, Zinfandel

Green grapes: Albariño, Chardonnay, Chenin Blanc, Furmint, Garganega, Gewurztraminer, Godello, Grüner Veltliner, Marsanne, Melon de Bourgogne/Muscadet, Muskat, Pinot Blanc, Pinot Gris, Riesling, Sauvignon Blanc, Savagnin, Semillon, Solaris, Torrontés, Verdicchio, Vermentino, Viognier

The grapes used by Systembolaget are listed in Table 2, sorted by blue and green varieties.

3.6.2 Wine regions

Another important factor for the taste of a wine is the location at which the grapes were grown. In Systembolaget’s dataset, both country and region of origin are available features.

The most important factor for the growing region's effect on the wine is climate, which affects the general characteristics of a wine. Wines from colder regions are often more acidic, crisp and light-bodied, with flavors of green fruits and herbs. Grapes from warmer regions tend to give full-bodied, bolder wines with higher alcohol content and flavors of dark fruits. Other factors, such as soil quality, also have an effect on the wine, but in a much more subtle manner.[32] Table 3 shows the typical characteristics of wines from different climates. The data in the table was gathered by Jones, G.[34], and was published on GuildSomm, a website for sommeliers and wine professionals.[35]

Table 3: Characteristics of wines from different climate types

Wine characteristic | Cool              | Intermediate     | Warm
Fruit               | Lean, tart        | Ripe, juicy      | Overripe, lush
White flavor notes  | Apple, pear       | Peach, melon     | Mango, pineapple
Red flavor notes    | Cranberry, cherry | Berry, plum      | Fig, prune
Body                | Light             | Medium           | Full
Acidity             | Crisp, tangy      | Integrated       | Soft, smooth
Alcohol             | Low to moderate   | Moderate to high | High to very high
Overall style       | Subtle, elegant   | Medium intensity | Bold

3.6.3 Year of production

As weather conditions vary from year to year, so does the quality of the wine, and regions with more variations in weather are more affected. Thus, the year in which a wine was made, or its vintage, is of importance for a wine’s quality.

For example, if it rains late in the growing season, the grapes can become watery and less flavorful, and that vintage will thus be of lower quality.[36]


3.7 Software

3.7.1 Recommender Engine

The code for the recommender engine developed in this project was written in Python 2.7. The following Python libraries were used:

• pandas

• scikit-learn

• numpy

• scipy

pandas is an open source library for Python that provides tools for data structures and data analysis. In particular, pandas provides structures to manipulate and store data: so-called Series, for 1D data, and DataFrames, for 2D data.[37]
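For instance, a product table like the one used in this project could be held in a DataFrame. The column names below are invented for illustration and are not Systembolaget's actual schema:

```python
import pandas as pd

# A Series holds 1D data, a DataFrame 2D data (illustrative wine columns,
# not the actual schema of Systembolaget's dataset).
products = pd.DataFrame({
    "name": ["Wine A", "Wine B", "Wine C"],
    "grape": ["Merlot", "Riesling", "Merlot"],
    "vintage": [2014, 2015, 2013],
})

vintages = products["vintage"]                      # a single column is a Series
merlots = products[products["grape"] == "Merlot"]   # boolean filtering
print(len(merlots))
```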

3.7.2 The website

For the evaluation of the models, a simple website was built using the Python web framework Flask. Flask makes it easy to develop simple web applications: it supports extensions for database implementations and the like, without containing such functionality itself.[38]


4 Method

This section will describe the two models evaluated in this project and how their respective similarity matrices are assembled.

4.1 The bag-of-words based recommender system

This section will present the vector representation based model that was compared to the knowledge based model, which is presented in Section 4.2.

In order to use a similarity measure for items with some non-numerical features, the data needs to be represented as vectors. There are several text vectorization algorithms available for this. For the first setup, using categorical and text data represented by vectors, the SciKit-Learn class CountVectorizer is used, which converts a collection of text documents into a matrix of term counts.

4.1.1 Similarity score distribution

To test how the similarity score distribution differs between the two tested models, we look specifically at the similarity scores for the item with article number 2800. For the vector representation based RS, there are about 1500 items with a similarity score above 0, as can be seen in Figure 2. We can also see that about 100 items have a similarity score higher than 0.4, which corresponds to roughly three features having exact matches. Around 1000 items have a similarity score close to 0.14, which corresponds to one feature having an exact match.

(23)

Figure 2: Similarity score distribution for item with article number 2800 using the BOW approach with cosine similarity

A heat map of the similarity matrix can be seen in Figure 3. The bright diagonal shows all items have a similarity score of 1 with themselves, which is to be expected.

Figure 3: Similarity score heat map for all items in Systembolaget’s product data base using the BOW model with cosine similarity


4.2 The domain knowledge based recommender system

4.2.1 Developing the customized similarity measure

For the customized similarity measure developed during this project, domain knowledge is used to try to improve the precision of similarity in three features: grapes, regions of origin and vintages. This section explains how these similarity measures are calculated and the data used to do so.

These three features were chosen because there is a lot of information on how they affect the taste of wine, which makes them rather simple to implement.

The features without implemented domain knowledge are still used; for those, BOW with cosine similarity is applied. The results are combined by calculating a weighted average and stored in the similarity matrix. All similarity scores are normalized.

4.2.2 Grape similarity score

The first feature that is subject to a customized similarity score is Description of content, where the grapes are specified. Here, domain knowledge is introduced in the form of a tree map showing the flavor profiles of the different grapes. The tree map is derived from a wine map made by the wine enthusiast website Wine Folly.[39] The complete tree maps for red, white and sparkling wines can be found in Appendix A.

The similarity is calculated using the formula given by Equation 4.

similarity = (nodes in common) / (maximum possible nodes in common)    (4)

Example: Carmenere is a fruity red grape with 4/4 nodes in common with Zinfandel, giving a similarity score of 1, but only 2/4 with Sangiovese, corresponding to a similarity score of 0.5.
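As an illustration, the path-based score of Equation 4 can be sketched in Python. The grape-to-path dictionary below is a small hand-copied excerpt of the tree map, so treat the entries as assumptions rather than the full data.

```python
# Each grape maps to its path of ancestor nodes in the flavor tree
# (illustrative excerpt of the Wine Folly-derived tree map).
grape_nodes = {
    "carmenere":  ["red", "fruity", "strawberry & cherry", "spicy"],
    "zinfandel":  ["red", "fruity", "strawberry & cherry", "spicy"],
    "sangiovese": ["red", "fruity", "black cherry & raspberry", "spicy"],
}

def grape_similarity(a: str, b: str) -> float:
    """Equation 4: shared nodes divided by the maximum possible overlap."""
    path_a, path_b = grape_nodes[a], grape_nodes[b]
    # Count matching nodes level by level, stopping at the first mismatch.
    common = 0
    for node_a, node_b in zip(path_a, path_b):
        if node_a != node_b:
            break
        common += 1
    return common / max(len(path_a), len(path_b))

print(grape_similarity("carmenere", "zinfandel"))   # 1.0
print(grape_similarity("carmenere", "sangiovese"))  # 0.5
```

This reproduces the Carmenere example above: 4/4 nodes shared with Zinfandel, but only 2/4 with Sangiovese, since the paths diverge at the flavor level.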

4.2.3 Wine region similarity score

To calculate the similarity of wine regions, they are divided into three climate groups, which result in different flavor characteristics in the wine. The climate types and their characteristics are shown in Table 3, in the Theory section of the report.

Table 4: Wine regions divided into generalized climate types

Cool                  Intermediate               Warm
Burgundy              Bordeaux                   Rhône Valley
Champagne             Tuscany                    Southern Italy
Loire Valley          Piedmont                   Southern Spain
Alsace                Portugal                   Argentina
Triveneto             Pacific Northwest (USA)    Greece
New Zealand           Chile                      California
Austria                                          Australia
Germany                                          South Africa
North central Spain                              Mediterranean Spain
Canada                Northern Spain

A generalized result of the division of wine regions into climate zones is presented in Table 4, and the full table can be seen in Appendix B. The table was constructed using data from Old [32] and the Systembolaget website[40].

The similarity between two regions is then calculated using the matrix shown in Table 5.

Table 5: Similarity measures between regions with different climate types

              Cool   Intermediate   Warm
Cool          1      0.5            0
Intermediate  0.5    1              0.5
Warm          0      0.5            1

For example, the similarity between Champagne, cool climate, and Rhône Valley, warm climate, is 0, and the similarity between Champagne and Mosel in Germany, cool climate, is 1.
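A minimal sketch of this two-step lookup (region to climate type, then Table 5), using an illustrative subset of Table 4:

```python
# Region-to-climate mapping: an illustrative excerpt of Table 4.
climate_of = {
    "Champagne": "cool",
    "Mosel": "cool",
    "Bordeaux": "intermediate",
    "Rhone Valley": "warm",
}

# Table 5 encoded as a dictionary keyed by climate-type pairs.
climate_similarity = {
    ("cool", "cool"): 1.0, ("cool", "intermediate"): 0.5, ("cool", "warm"): 0.0,
    ("intermediate", "intermediate"): 1.0, ("intermediate", "warm"): 0.5,
    ("warm", "warm"): 1.0,
}

def region_similarity(region_a: str, region_b: str) -> float:
    ca, cb = climate_of[region_a], climate_of[region_b]
    # The matrix is symmetric, so fall back to the reversed key.
    return climate_similarity.get((ca, cb), climate_similarity.get((cb, ca)))

print(region_similarity("Champagne", "Rhone Valley"))  # 0.0
print(region_similarity("Champagne", "Mosel"))         # 1.0
```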


4.2.4 Production year similarity score

On Systembolaget’s webpage, there are tables with ratings of different vintages from well-known wine regions.[41] An example can be seen in Figure 4, where ratings for different vintages from French wine regions are displayed.

Figure 4: Excerpt from Systembolaget’s vintage rating table for wine regions in France

For the production year similarity measure, these tables were used to create average ratings for each region and listed year, which were transferred into a csv file (Appendix C). The similarity between two wines with given regions of origin and vintages is then calculated as the difference in rating, as shown in Equation 5.

similarity = |rating_{wine 1} − rating_{wine 2}|    (5)
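The rating lookup and Equation 5 can be sketched as follows. The inline csv contents are made-up ratings standing in for the averaged tables of Appendix C:

```python
import csv
import io

# Made-up vintage ratings in the (assumed) Appendix C format: region, year, rating.
VINTAGE_CSV = """region,year,rating
Bordeaux,2009,4.5
Bordeaux,2011,3.0
Bourgogne,2009,4.0
"""

ratings = {
    (row["region"], int(row["year"])): float(row["rating"])
    for row in csv.DictReader(io.StringIO(VINTAGE_CSV))
}

def vintage_score(wine1, wine2):
    """Equation 5: absolute rating difference for two (region, year) pairs."""
    return abs(ratings[wine1] - ratings[wine2])

print(vintage_score(("Bordeaux", 2009), ("Bordeaux", 2011)))  # 1.5
```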

4.2.5 Weighting and normalizing similarity scores

The three similarity measures where domain knowledge is used are weighted equally to form a matrix, mod_mat. For each matrix value mod_mat(i, j), the RS checks how many of the three domain knowledge features k are present for both item i and item j, sums their similarity scores and normalizes the result. The formula is given by Equation 6, where n is the number of features present. If n = 0, the matrix value is 0.

mod_mat(i, j) = (1/n) * Σ_{k=1}^{n} similarity_k(i, j),   if n ≠ 0
mod_mat(i, j) = 0,                                        if n = 0    (6)

We now have two matrices with normalized similarity scores: mod_mat, and the matrix with BOW cosine similarity scores, bow_mat. These matrices will be added with different weights throughout the evaluation process, as shown in Equation 7.

sim_mat = w_a * mod_mat + w_b * bow_mat,   w_a = 0.05, 0.1, ..., 0.95,   w_b = 1 − w_a    (7)
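Equations 6 and 7 can be sketched on a toy two-item catalog; all matrix values below are made up for illustration:

```python
import numpy as np

def combine_scores(scores):
    """Equation 6: mean of the feature scores that are present (not None)."""
    present = [s for s in scores if s is not None]
    return sum(present) / len(present) if present else 0.0

# Per-pair feature scores (grape, region, vintage) for a toy 2-item catalog;
# None marks a feature missing for one of the items.
mod_mat = np.array([
    [combine_scores([1.0, 1.0, 1.0]), combine_scores([0.5, None, 0.8])],
    [combine_scores([0.5, None, 0.8]), combine_scores([1.0, 1.0, 1.0])],
])
bow_mat = np.array([[1.0, 0.3], [0.3, 1.0]])  # toy BOW cosine similarities

# Equation 7: weighted sum of the two matrices.
w_a = 0.3
sim_mat = w_a * mod_mat + (1 - w_a) * bow_mat
print(sim_mat[0, 1])  # 0.3 * 0.65 + 0.7 * 0.3, i.e. ~0.405
```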

4.2.6 Similarity score distribution

For the domain knowledge setup, using w_a = 0.3 and w_b = 0.7, about 1600 items have a similarity score above 0, as can be seen in Figure 5. Most of these items have similarity scores between 0.1 and 0.3. Compared to the strict BOW model, the similarity scores are more evenly distributed.


Figure 5: Similarity score distribution for item with article number 2800 using the weighted domain knowledge model. Weight of domain knowledge similarity score is 0.3.

In Figure 6, we can see the heat map for all similarity scores for the domain knowledge based recommender system, also using w_a = 0.3, w_b = 0.7.

Similarly to the bag-of-words based model, seen in Figure 3, we see the bright diagonal indicating all items have similarity score 1 with themselves.

Here we can also see that the domain knowledge based RS has lower similarity scores in general, and that the patterns look different compared to Figure 3.

Thus, we can conclude that the domain knowledge based model will probably give slightly different recommendations from the bag-of-words based model.


Figure 6: Heat map for similarity scores using the weighted domain knowledge based model. Weight of domain knowledge similarity score is 0.3.


5 Data processing

For this project, data from Systembolaget’s open API has been used. The dataset was chosen since it contains several interesting features, and products that are known to the target group that will be evaluating the RS.

In this section, the preprocessing steps of the data will be discussed and statistical information about the resulting dataset will be shown.

5.1 Preprocessing the data

The data contains parts of the product information, including type of product, a short free-text description, and price. The dataset used in this project is an example of so-called semi-structured data: it contains both features with restricted values (structured data) and free-text fields (unstructured data).[6]


Table 6: Features in the Systembolaget dataset sorted into features selected to be used by the recommender system and discarded features

Selected features         Discarded features
Name                      Product number
Name2                     Product ID
Description of content    Pant
Product type              Price per liter
Production year           Sales start date
Producer                  Sales stopped
Region of origin          Class of goods
                          Style
                          Packaging
                          Sealing
                          Supplier
                          Year of tasting
                          Assortment
                          Text, assortment
                          Organic
                          Ethical
                          Ethical branding
                          Kosher
                          Price, incl. taxes
                          Alcohol content
                          Volume in ml
                          Land of origin

In Table 6, the features in the dataset are listed. Of the roughly 30 features in the original dataset, seven were chosen to be used in the recommender system.

For the modified RS model, customized similarity measures will be developed for three features: Year of origin, Description of content, and Region of origin.

These features were chosen as they have a big impact on the flavor of the wine, which is likely the most interesting aspect for the user. The other selected features, Name, Name2, Product type, and Producer, will be measured with the BOW approach for the existing model as well as the modified model.

The original dataset from Systembolaget contains 18,670 entries, including items that are not wine; these were sorted out and discarded. Duplicate entries were also removed from the dataset, as well as items that are no longer for sale.

5.1.1 The filtering process

The following steps were taken for the data preprocessing:

• Remove special characters and Swedish letters åäö from dataset

• Remove items from other product categories than wine (from 18K to 10K entries)

• Remove items that are no longer for sale

• Remove features that are not selected to be used by the RS

• Remove items that have no value for the Product Type feature (from 10K to 3.8K entries)
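A hedged sketch of these filtering steps using pandas. The column names and category strings below are illustrative, since Systembolaget's API uses its own (Swedish) field names:

```python
import pandas as pd

def preprocess(df: pd.DataFrame, selected_features: list) -> pd.DataFrame:
    """Mirror the filtering steps: keep wines that are for sale, keep only
    selected features, drop duplicates and items missing a product type."""
    df = df[df["product_group"].str.contains("wine", case=False, na=False)]
    df = df[~df["sales_stopped"]]            # still for sale
    df = df[selected_features]               # keep selected features only
    df = df.drop_duplicates()                # remove duplicate entries
    df = df.dropna(subset=["product_type"])  # must have a product type
    return df.reset_index(drop=True)

# Tiny illustrative input: one wine for sale, one beer, one discontinued wine.
example = pd.DataFrame({
    "product_group": ["Red wine", "Beer", "White wine"],
    "sales_stopped": [False, False, True],
    "product_type":  ["red", None, "white"],
    "name":          ["A", "B", "C"],
})
print(preprocess(example, ["name", "product_type"]))
```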

5.2 The processed dataset

The dataset contains 3752 items with seven features after the preprocessing.

In Figure 7, the density of the dataset can be seen. The features with the most missing data are Description of content and Year of origin.

Figure 7: Data density for 12 features in the dataset


There are three product types present in the processed dataset, red, white, and sparkling wine. Figure 8 shows the distribution of these product types.

Figure 8: Distribution of product types in the processed Systembolaget dataset

In Figure 9, we see the distribution of the styles of wine present in the dataset.

Apart from an over-representation of white, dry wines, the dataset is relatively balanced in that regard.


Figure 9: Distribution of styles present in the Systembolaget dataset

5.3 Cleaning data features

In order to use the BOW approach, some further processing is needed. All text needs to be in lower case, and names need to be written without spaces.

The processing steps, and which features they are applied to, are listed below:

• All features: All text in lower case.

• Features with names of producers, regions, grapes etc: Remove spaces, e.g. côtes du rhône becomes côtesdurhône

• Description of content: Remove percentage signs
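These cleaning steps can be sketched as a small helper; which features get the space removal is decided by the caller:

```python
def clean_text(value: str, remove_spaces: bool = False) -> str:
    """Apply the feature-cleaning steps listed above."""
    value = value.lower()           # all features: lower case
    value = value.replace("%", "")  # description of content: drop percent signs
    if remove_spaces:               # producer, region and grape names
        value = value.replace(" ", "")
    return value

print(clean_text("Côtes du Rhône", remove_spaces=True))  # côtesdurhône
```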


6 Training and evaluating the recommender systems

In this section, the process of training and evaluating the RS is presented, as well as information about the evaluation metrics and how they were tested.

6.1 Training

The training process consists of building the similarity matrix for the RS. The similarity matrix gathers all similarity scores between all items in the dataset.

In the case of the free text implementation, the similarity matrix is built using the scikit-learn functions CountVectorizer and cosine_similarity.

The CountVectorizer function performs the BOW transformation from text to vectors described in Section 3.4.1, and the cosine_similarity function calculates the similarity between all items in the matrix.
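A minimal sketch of this training step with toy item texts (the real input is the concatenated feature text per product):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-ins for the cleaned, concatenated feature strings.
items = [
    "red wine rioja tempranillo",
    "red wine rioja garnacha",
    "white wine mosel riesling",
]

vectors = CountVectorizer().fit_transform(items)  # BOW term-count matrix
similarity_matrix = cosine_similarity(vectors)    # item-item similarities

print(similarity_matrix.shape)  # (3, 3)
# The two Rioja reds share more words than red vs white wine.
print(similarity_matrix[0, 1] > similarity_matrix[0, 2])  # True
```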

For the customized RS, the similarity matrix is a weighted result of the similarity matrix containing the similarity scores calculated with the customized method and a matrix built using BOW with cosine similarity.

For both models, new items in the dataset would require the similarity matrices to be updated to make sure the similarities for the new item are added, thus increasing the size of the matrices.

6.2 Making recommendations

The RS implemented in this project is a so-called Top 10 RS, meaning that the ten most similar items will be recommended to the user. The recommendation process thus consists of finding the similarity scores with all other items, adding them to a list, and then sorting the list with the highest scores first. This is explained in Algorithm 2.

Algorithm 2: Make n recommendations for item i
    input:  item i
    output: n recommended items
    1  get row i from similarity_matrix;
    2  sort list from highest to lowest score;
    3  select items 2 : n + 1;    /* item 1 = item i */
    4  present result to user;
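Algorithm 2 can be sketched as follows, using NumPy on a toy similarity matrix:

```python
import numpy as np

def recommend(similarity_matrix: np.ndarray, item: int, n: int = 10):
    """Return the indices of the n most similar items, excluding the item itself."""
    scores = similarity_matrix[item]
    ranked = np.argsort(scores)[::-1]  # highest score first
    ranked = ranked[ranked != item]    # item 1 in the sorted list is item i itself
    return ranked[:n].tolist()

# Toy 3-item similarity matrix.
sim = np.array([
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.4],
    [0.2, 0.4, 1.0],
])
print(recommend(sim, item=0, n=2))  # [1, 2]
```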


6.3 Evaluating the algorithm

As previously mentioned, the evaluation metrics used in this project are correctness, coverage and diversity. The similarity score distribution and the average similarity for recommended items are also investigated. Coverage is calculated by storing all items that have been recommended over a large number of runs, and dividing that by the total number of items in the dataset.

Average similarity, coverage and diversity are evaluated by running the RS n = 1000 times and calculating average values, see Algorithm 3.

Algorithm 3: Evaluating the recommender system
     1  create the RS instance;
     2  randomly select n items;
     3  for all selected items do
     4      make recommendations;
     5      store mean similarity score of recommended items;
     6      calculate and store diversity;
     7      for all recommended items do
     8          if item not in recommended_list then
     9              add item to recommended_list
    10          end
    11      end
    12  end
    13  calculate average similarity, coverage and diversity;
        /* cov = length(recommended_items)/length(all_items) */
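A simplified sketch of this evaluation loop; the diversity step is omitted for brevity, and the recommend helper mirrors Algorithm 2:

```python
import random
import numpy as np

def recommend(sim, item, n=10):
    """Top-n most similar items, excluding the item itself."""
    ranked = np.argsort(sim[item])[::-1]
    return [int(j) for j in ranked if j != item][:n]

def evaluate(sim, runs=1000, n=10, seed=0):
    """Run the RS `runs` times on random items; return average similarity
    of recommended items and coverage (Equation 8)."""
    rng = random.Random(seed)
    mean_sims = []
    recommended = set()  # all distinct items ever recommended
    for _ in range(runs):
        i = rng.randrange(len(sim))
        recs = recommend(sim, i, n)
        mean_sims.append(float(np.mean([sim[i, j] for j in recs])))
        recommended.update(recs)
    coverage = len(recommended) / len(sim)
    return float(np.mean(mean_sims)), coverage

sim = np.array([[1.0, 0.8, 0.2],
                [0.8, 1.0, 0.4],
                [0.2, 0.4, 1.0]])
avg_sim, cov = evaluate(sim, runs=100, n=2)
print(cov)  # 1.0 — with n=2 on 3 items, every item is eventually recommended
```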

6.3.1 Coverage

When measuring the coverage of the system used in this project, we look at how many items of the total dataset are recommended when the RS is run 1000 times. In Equation 8, I_r is the set of items that are recommended during the 1000 runs and I is the full set of items in the dataset.

coverage = |I_r| / |I|    (8)


6.3.2 Diversity

To measure the diversity of the recommended items, I am using the modified similarity measure developed for the customized recommender engine. The measure used is the average diversity between all recommended items.

For each item k in the set of recommended items I = {I_i}_{i=1}^{n}, the diversity is calculated as given in Equation 9, where n = 9 in our case with 10 recommended items.

diversity_k = (1/n) * Σ_{I_i, i ≠ k} (1 − similarity(I_i, I_k))    (9)
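Equation 9 can be sketched as follows, on a toy similarity matrix:

```python
import numpy as np

def diversity(similarity_matrix: np.ndarray, recommended: list) -> float:
    """Equation 9, averaged over all recommended items: for each item k,
    take the mean dissimilarity (1 - similarity) to the other items."""
    per_item = []
    for k in recommended:
        others = [i for i in recommended if i != k]
        per_item.append(np.mean([1 - similarity_matrix[i, k] for i in others]))
    return float(np.mean(per_item))

# Toy 3-item similarity matrix.
sim = np.array([
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
])
print(diversity(sim, [0, 1, 2]))
```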

6.3.3 Correctness

For this project, there was no rating data to use in the evaluation process.

To test the correctness, or quality, of the recommendations, two-choice testing will be used. The user is presented with recommendations from both the modified model and the standard BOW model, as seen in Figure 10, and will then be asked to fill in a questionnaire stating which model made the best recommendations.

Figure 10: Screenshot from the website implementation, showing how the user is presented with recommendations from both RS models

The users will be asked two questions in the questionnaire:


1. Which model (A or B) makes the best recommendations?

2. Why?

What "best recommendations" means in this context is up to the user to decide, but the user is presented with some example factors to consider, such as relevance or serendipity.


7 Result

7.1 Performance of weighted setups for the knowledge based model

To find the best setup in terms of weighting the similarity scores for the domain knowledge based recommender system, the model was run with several values for the weights.

In Figure 11, we can see that the coverage and diversity are only slightly affected by the changes in weights, while the average similarity decreases strongly as the weight of the knowledge based similarity score increases. This is expected, as this part of the similarity score is in general significantly lower than the bag-of-words part.

Figure 11: Average similarity, coverage and diversity vs weight of features where domain knowledge based similarity scores were applied

We also note a marginal increase in diversity as the domain knowledge's weight is increased. However, this is probably not an indication that the domain knowledge based model has better diversity in its recommendations, but rather an effect of the fact that the similarity score is on average lower for this model.

Similarly, the changes in coverage are probably due to the randomization of the tested items, rather than to the changes in the weights.

7.2 Average similarity, coverage and diversity

In Figure 12, the average similarity scores for both the domain knowledge based model and the bag-of-words based model are shown. There is a rather large difference between the similarity scores of these two models, about 0.2, which can partly be explained by the fact that several of the features used for the domain knowledge based similarity score have a rather high missing rate, as seen in Figure 7, resulting in a lower average for that part of the similarity score. However, the results follow a similar pattern for both models, decreasing as the number of recommendations increases. This is expected: as more recommendations are introduced, some of them will have lower similarity scores, thus lowering the average value.

Figure 12: Average similarity for recommended items vs number of recommended items per run for the vector representation standard model and the knowledge based model

We see a similar pattern in Figures 13 and 14, where average coverage and diversity are shown. For coverage, it is intuitive that as the number of items recommended in each run increases, the proportion of items recommended during 1000 randomized runs will increase. This is also what we see in Figure 13. We can also note that here too, the bag-of-words based model performs slightly better, with a score approximately 0.5 higher than the domain knowledge based system.

Figure 13: Coverage for recommended items vs number of recommended items per run for the bag-of-words based model and the domain knowledge based model

Similarly, the bag-of-words based model also performs slightly better in terms of diversity. In Figure 14, we can note that the average diversity is about 0.015 lower for the domain knowledge based model.


Figure 14: Diversity for recommended items vs number of recommended items per run for the bag-of-words based model and the domain knowledge based model

7.3 Correctness

The questionnaire was sent out in a Facebook group for wine tasting, to reach people who are quite well-versed in the subject in order to get qualitative answers. The results of the 24 responses are shown in Figure 15. Thirteen people, or 57.1 %, preferred the recommendations of model A, which is the domain knowledge based model. Eleven people, 42.9 %, preferred model B, which is the bag-of-words based model.


Figure 15: Answers to questionnaire about recommendation quality, where A is the domain knowledge based model and B is the bag-of-words based model

The users were also asked why they preferred the model they chose.

Six users named price as a factor in choosing the best recommendations, even though they were asked not to consider price. Several users commented on the fact that the bag-of-words based model gave recommendations that were more similar to the original choice than the recommendations of the domain knowledge based RS. Two users also commented that the domain knowledge based model gave more varied recommendations, which these users preferred. All questionnaire answers are shown in Appendix D.

To summarize, the results from the questionnaire show no distinct advantage to either system. The comments also show that people prefer different kinds of recommendations.


8 Discussion

The results shown in Section 7.2 show that the bag-of-words based model outperforms the domain knowledge based system with regard to average similarity, coverage and diversity for this particular dataset. However, a factor that could have significantly impacted these results is that the density of data for these three features is rather low. The current setup only takes into consideration whether one or two of the three features in question are missing, not all three. If all three features are missing for an item, this part of the similarity score will be 0, and thus affect the total similarity score.

A better diversity is expected from the bag-of-words based model, since the diversity measure uses the domain knowledge based similarity measure for both models. This choice was made in order to make the measure comparable across the two models. However, since the purpose of the domain knowledge similarity measure was to improve the similarity between wines with different but similar characteristics, a lower diversity is expected.

The response rate of the questionnaire was rather low, but this was expected. The choice of sending it to a small target group with domain knowledge was made to ensure that the answers would be qualitative, and thus could give a good indication of the results of a larger test.

The result of the questionnaire shows that neither model performs significantly better than the other, and that users prefer different kinds of recommendations.

Some users liked the model that presented them with a list of very similar items, and some liked the model that gave diverse items. Also, price seems to be a very important factor in choosing the best model, and thus should have been a feature in the recommender system similarity measure.

8.1 The bag-of-words based recommender system

As free-text similarity is a well-researched field in machine learning, there are several models available, one of them being the BOW approach. With Python's scikit-learn library, this model is easily implemented, making this approach very effortless and easy to use.

A big advantage of using a vector representation based model is that it is applicable to any category of recommendations: the same model can be trained to recommend movies, wine or clothes. However, since this model lacks semantic understanding and domain knowledge, it will sometimes miss important nuances in its recommendations. For this specific dataset, this problem is relatively limited, as the features are mostly categorical and synonyms are therefore not used.

Another advantage of the bag-of-words based model was that the training time was much shorter; it only took a few seconds before the system was ready for use, compared to the domain knowledge based model, which took several hours to train.

8.2 The domain knowledge based recommender system

Developing the domain knowledge based recommender system using this particular similarity measure and setup was relatively simple. The similarity measure is based on three features, grapes, region of origin, and year of origin, which have a limited number of varieties. That is, when new items are added to the dataset, these similarity measures will still be applicable in most cases. In some cases new features might have to be added, but upkeep will probably not be very time consuming.

One of the biggest drawbacks of this approach is the training time. As a rather big matrix has to be put together, constructed from several calculated similarity measures which are weighted together, training is a rather time consuming process, and as more items are added to the dataset, training time will increase.

However, as the focus of this project was not to build a computationally efficient RS, there are several measures that could be taken to optimize the training process. One example is to construct similarity matrices for each feature, in order to only make the calculations once and afterwards simply look up a number in the matrix.


9 Conclusions

The aim of the project was to investigate whether implementing domain knowledge in the similarity measure of a recommender system would improve the performance of the model. The dataset used in the project was Systembolaget’s product information dataset, filtered to only contain wines. A hybrid domain knowledge based recommender system was developed using domain knowledge for three of seven features. This model was then compared to a recommender system using the bag-of-words approach with cosine similarity.

The results show that the approach used in this project did not have a significant enough effect on the recommendations to motivate using such a model for this specific dataset. However, missing values for the features used by the domain knowledge based recommender system could have affected the results, and further testing would be necessary in order to draw any conclusions in general about using domain knowledge for wine recommendations.

9.1 Future work

Due to time constraints, only specific approaches and their effects could be investigated. Other interesting factors, such as weighting each feature individually to find the best composition, and different constellations of features and how they affect the result, could be part of a future extension of the project.

After the user tests, it was clear that having the price as a feature in the similarity measure would have been a good idea, as many people made comments about this even though they were asked not to consider price as a factor.

A problem in testing the performance of the models is that the knowledge based recommender system does not perform as well when given input items that do not contain the features used by the domain knowledge similarity measures. Testing the system only on items containing these features could thus possibly affect the results, and this could be interesting to look at in the future.

Another important factor in building a good recommender system is access to an accurate user profile. In this project, the user profile was built from a single wine, and thus does not provide a very extensive base for recommendations. It would be interesting to see how the two models used in this project would perform if given access to more well-documented user preferences.


References

[1] Internet Live Stats – Internet Usage & Social Media Statistics. Available from: http://www.internetlivestats.com/.

[2] Ricci F, Rokach L, Shapira B, Kantor PB. Recommender Systems Handbook. Springer; 2011.

[3] Khusro S, Ali Z, Ullah I. Recommender Systems: Issues, Challenges, and Research Opportunities; 2016. p. 1179–1189.

[4] Pham MC, Cao Y, Klamma R, Jarke M. A Clustering Approach for Collaborative Filtering Recommendation Using Social Network Analysis.

[5] Rifqi M, Benhadda H. Similarity measures for binary and numerical data: a survey. International Journal of Knowledge Engineering and Soft Data Paradigms. 2009 Jan;2009(10).

[6] Madani A, Boussaid O, Eddine Zegour D. Semi-structured documents mining: a review and comparison. Procedia Computer Science. 2013;2013(22).

[7] Lampropoulos AS, Tsihrintzis GA, editors. Machine Learning Paradigms: Applications in Recommender Systems. No. 92 in Intelligent Systems Reference Library. Cham: Springer; 2015. OCLC: 936126409.

[8] Leskovec J, Rajaraman A, Ullman J. Chapter 9: Recommendation Systems. In: Mining of Massive Datasets.

[9] Konstan JA, Riedl J. Recommender systems: from algorithms to user experience. User Modeling and User-Adapted Interaction. 2012 Apr;22(1-2):101–123. Available from: http://link.springer.com/10.1007/s11257-011-9112-x.

[10] Hee Park D, Kyeong Kim H, Young Choi I, Kyeong Kim J. A literature review and classification of recommender systems research. Expert Systems with Applications. 2012;2012(39).

[11] Khan AS, Hoffmann A. Building a case-based diet recommendation system without a knowledge engineer. Artificial Intelligence in Medicine. 2003 Feb;27(2):155–179.

[12] Chattopadhyay S, Banerjee S, Rabhi FA, Acharya UR. A Case-Based Reasoning system for complex medical diagnosis. Expert Systems. 2013 Feb;30(1):12–20. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1468-0394.2012.00618.x.

[13] Towle B, Quinn C. Knowledge Based Recommender Systems Using Explicit User Models. KnowledgePlanet.com; 2000.

[14] Ghani R, Fano A. Building Recommender Systems using a Knowledge Base of Product Semantics. In: 2nd International Conference on Adaptive Hypermedia and Adaptive Web Based Systems, Malaga; 2002.

[15] Martínez L, Barranco MJ, Pérez LG, Espinilla M. A Knowledge Based Recommender System with Multigranular Linguistic Information. International Journal of Computational Intelligence Systems. 2008 Aug;1(3):225–236. Available from: https://doi.org/10.1080/18756891.2008.9727620.

[16] Spotify. Available from: https://www.spotify.com/se/.

[17] Celma O. Chapter 4: The Long Tail in Recommender Systems. In: Music Recommendations and Discovery.

[18] Cheng W, Yin G, Dong Y, Dong H, Zhang W. Collaborative Filtering Recommendation on Users' Interest Sequences. PLOS ONE. 2016;11(5):e0155739. Available from: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0155739.

[19] Netflix Prize: Home. Available from: https://www.netflixprize.com/index.html.

[20] Lee J, Sun M, Lebanon G. A Comparative Study of Collaborative Filtering Algorithms. arXiv:1205.3193 [cs, stat]. 2012 May. Available from: http://arxiv.org/abs/1205.3193.

[21] Sarwar B, Karypis G, Konstan J, Riedl J. Item-based collaborative filtering recommendation algorithms. ACM Press; 2001. p. 285–295. Available from: http://portal.acm.org/citation.cfm?doid=371920.372071.

[22] Robillard M, Maalej W, Walker R, Zimmermann T. Recommendation Systems in Software Engineering. Springer-Verlag Berlin Heidelberg; 2014.

[23] Isinkaye FO, Folajimi YO, Ojokoh BA. Recommendation systems: Principles, methods and evaluation. Egyptian Informatics Journal. 2015 Nov;16(3):261–273. Available from: http://www.sciencedirect.com/science/article/pii/S1110866515000341.

[24] Han J, Kamber M, Pei J. Data Mining: Concepts and Techniques. 3rd ed. Waltham: Morgan Kaufmann; 2012.

[25] Le Q, Mikolov T. Distributed Representations of Sentences and Documents. p. 9.

[26] Jin R, Zhou ZH, Zhang Y. Understanding bag-of-words model: A statistical framework. International Journal of Machine Learning and Cybernetics. 2010 Dec;2010.

[27] Georgiou O, Tsapatsoulis N. The Importance of Similarity Metrics for Representative Users Identification in Recommender Systems. In: Papadopoulos H, Andreou AS, Bramer M, editors. Artificial Intelligence Applications and Innovations. vol. 339. Berlin, Heidelberg: Springer; 2010. p. 12–21. Available from: http://link.springer.com/10.1007/978-3-642-16239-8_5.

[28] S Ali D, Ghoneim A, Saleh M. Data Clustering Method based on Mixed Similarity Measures. SCITEPRESS – Science and Technology Publications; 2017. p. 192–199. Available from: http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0006245601920199.

[29] Parameswari P, Samath JA, Saranya S. Scalable Clustering Using Rank Based Preprocessing Technique for Mixed Data Sets Using Enhanced Rock Algorithm. 2015. p. 8.

[30] Mihalcea R, Corley C, Strapparava C. Corpus-based and Knowledge-based Measures of Text Semantic Similarity.

[31] Kumar V, Shrivastva KMP, Singh S. Cross Domain Recommendation Using Semantic Similarity and Tensor Decomposition. Procedia Computer Science. 2016;85:317–324. Available from: http://linkinghub.elsevier.com/retrieve/pii/S1877050916305877.

[32] Old M. Wine: A Tasting Course. London: Dorling Kindersley; 2014.

[33] Vindruvor – Lär dig om druvsorter | Systembolaget. Available from: https://www.systembolaget.se/fakta-och-nyheter/vin/druvsorter/.

[34] Climate, Grapes, and Wine – Gregory Jones – Articles – GuildSomm. Available from: https://www.guildsomm.com/public_content/features/articles/b/gregory_jones/posts/climate-grapes-and-wine.

[35] GuildSomm. Available from: https://www.guildsomm.com/about_us/who-we-are/.

[36] Why Vintage Variation Matters; 2012. Available from: http://winefolly.com/tutorial/why-you-need-to-know-about-vintage-variation/.

[37] pandas: Python Data Analysis Library. Available from: https://pandas.pydata.org/.

[38] Welcome | Flask (A Python Microframework). Available from: http://flask.pocoo.org/.

[39] The Different Types of Wine (Infographic); 2012. Available from: http://winefolly.com/review/different-types-of-wine/.

[40] Här kan du orientera dig i vinvärlden | Systembolaget. Available from: https://www.systembolaget.se/fakta-och-nyheter/vin/vinkartboken/#/varlden.

[41] Årgångstabell vin | Systembolaget. Available from: https://www.systembolaget.se/fakta-och-nyheter/vin/argangstabell/.


Appendix A: Grape similarity tree

Red wine grapes

red
    fruity
        tart cherry & cranberry
            round: pinot noir
            spicy/juicy: gamay
        strawberry & cherry
            spicy: zinfandel, barbera, carmenere, grenache, negroamaro, primitivo
        black cherry & raspberry
            high tannin: tempranillo, cabernet sauvignon
            spicy: cabernet franc, sangiovese
            round: corvina
        blueberry & blackberry
            round: syrah, malbec, merlot, nero d'avola
            high tannins: mourvedre, touriga nacional, petit sirah
            spicy: shiraz
    savory
        clay & cured meats
            high tannins: nebbiolo
            round: brunello
        truffle & forest
            spicy/juicy: pinotage
        smoke, tobacco & leather
            high tannins: aglianico, tannat
        black pepper & gravel
            spicy/juicy: montepulciano
    sweet


White wine grapes

white
    dry
        light, grapefruit & floral: pinot blanc, verdicchio, vermentino
        light, citrus & lemon: pinot gris/pinot grigio, melon de bourgogne, albarino, grüner veltliner
        light, herbal & grassy: sauvignon blanc
        rich, creamy & nutty: savagnin, chardonnay, godello, garganega (soave)
        medium, perfume & floral: viognier, torrontes, semillon, furmint (tokaji), marsanne
    sweet
        rich, tropical & honey: muskat
        off-dry, apricots & peaches: chenin blanc, gewürztraminer, riesling


Sparkling wine grapes

sparkling
    red
    rose
    white
        dry, creamy & rich: vintage champagne
        dry, light & citrus: champagne (pinot meunier, pinot noir), cava (macabeo)
        semi-sweet & floral: prosecco (glera)
        sweet, apricots & rich


Appendix B: Wine region climate table

Cool | Intermediate | Warm
Rioja | Venetien | Apulien
Champagne | Tokaj-Hegyalja | Cava
Pfalz | Piemonte | Lisboa
Bourgogne | Bordeaux | Kalifornien
Loiredalen | Castilla y Leon | Istra
Marlborough | Valle Central | Western Cape
Wellington | Toscana | Rhonedalen
Mosel | Region del Sur | South Australia
Morava | Washington State | Southeastern Australia
Rheingau | VDLT Castilla | Cuyo
England | Marche | Western Australia
Alsace | Emilia-Romagna | Aconcagua
Niederösterreich | | Alicante
Lombardiet | Frankrike sydväst | Douro
Rheinhessen | Dobrogea | Languedoc-Roussillon
Gotlands län | Terra Alta | Priorat
Skåne län | Manchuela | Trakien
Ribeira Sacra | Abruzzerna | Sardinien
Canterbury | Beira | Valencia
Baden | La Mancha | Sicilien
Kalmar län | Costers del Segre | Savoie
Victoria | Oregon | Kampanien
Kakheti region | Vino Spumante Di Qualità Del Tipo Aromatico | Serra Gaucha
Valdeorras | Tasmanien | Tejo
Burgenland | Navarra | Danube Plain
Jura | Del-Balaton | Ribera del Duero
Podravina | Vinos de Madrid | Catalunya
Blekinge län | Rueda | Montsant
New York State | Minho | Penedès
Rías Baixas | Arlanza | Bierzo
La Rioja | Valais | Cariñena
Mosel-Saar-Rüwer | Central Otago | Peloponnesos
Rhein | Toro | Hawke's bay
Württemberg | Nagy-Somlói | New South Wales
Södermanlands län | Jumilla | Salta

(57)

Sekt Friuli-Venezia-Giulia Rapel

Nelson Ligurien Trentino-Altoadige

Franken Nahe Eger

J¨amtlands l¨an VDLT de Murcia Terras Dosado

Gisborne Valdepe˜nas

Utiel-Requena Korsika

Somontano Maipo

Campo de Borja Maule

Malaga Mediterranee

Umbrien Bekaa

British Columbia Attica Duna-Tisza K¨ozi Toledo

Ribeiro

Getariako Txakolina Sopron

Pen´ınsula de Set´ubal Santorini

Znojmo Primorski Latium Alentejo Golanh¨ojderna Patagonien Kalabrien Yecla Provence Makedonien Coquimbo Molise Salamanca

Valle dela Orotava Vale dos Vinhedos Povardarje

Samos Kreta

Primorska Hrvatska
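The climate classes in Appendix B can back a simple region similarity. The sketch below is a hypothetical illustration, not the thesis's actual implementation: identical regions score 1.0, regions sharing a climate class score 0.5 (an assumed weight), and everything else 0.0. The `CLIMATE` mapping shows only a handful of entries from the table.

```python
# Hypothetical sketch: region similarity from the climate classes in Appendix B.
# Only a few table entries are included; the 0.5 weight is an assumption.
CLIMATE = {
    "Champagne": "cool",
    "Mosel": "cool",
    "Piemonte": "intermediate",
    "Toscana": "intermediate",
    "Apulien": "warm",
    "Kalifornien": "warm",
}

def region_similarity(a, b):
    """1.0 for the same region, 0.5 for the same climate class, else 0.0."""
    if a == b:
        return 1.0
    if CLIMATE.get(a) is not None and CLIMATE.get(a) == CLIMATE.get(b):
        return 0.5
    return 0.0
```

A coarser grouping like this trades precision for coverage: two cool-climate regions are treated as somewhat similar even if the model has never seen them co-rated.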
