Collaborative filtering (CF) is a technique used by recommender systems (Francesco Ricci, Lior Rokach, and Bracha Shapira, "Introduction to Recommender Systems Handbook", Recommender Systems Handbook, Springer, 2011, pp. 1-35). Collaborative filtering has two senses, a narrow one and a more general one.

In the newer, narrower sense, collaborative filtering is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating). The underlying assumption of the collaborative filtering approach is that if a person ''A'' has the same opinion as a person ''B'' on an issue, A is more likely to have B's opinion on a different issue than that of a randomly chosen person. For example, a collaborative filtering recommendation system for preferences in television programming could make predictions about which television show a user should like given a partial list of that user's tastes (likes or dislikes). Note that these predictions are specific to the user, but use information gleaned from many users. This differs from the simpler approach of giving an average (non-specific) score for each item of interest, for example based on its number of votes.

In the more general sense, collaborative filtering is the process of filtering for information or patterns using techniques involving collaboration among multiple agents, viewpoints, data sources, etc. Applications of collaborative filtering typically involve very large data sets. Collaborative filtering methods have been applied to many different kinds of data including: sensing and monitoring data, such as in mineral exploration, environmental sensing over large areas or multiple sensors; financial data, such as financial service institutions that integrate many financial sources; or in electronic commerce and web applications where the focus is on user data, etc. The remainder of this discussion focuses on collaborative filtering for user data, although some of the methods and approaches may apply to the other major applications as well.

Overview

The growth of the Internet has made it much more difficult to effectively extract useful information from all the available online information. The overwhelming amount of data necessitates mechanisms for efficient information filtering. Collaborative filtering is one of the techniques used for dealing with this problem. The motivation for collaborative filtering comes from the idea that people often get the best recommendations from someone with tastes similar to themselves. Collaborative filtering encompasses techniques for matching people with similar interests and making recommendations on this basis. Collaborative filtering algorithms often require (1) users' active participation, (2) an easy way to represent users' interests, and (3) algorithms that are able to match people with similar interests. Typically, the workflow of a collaborative filtering system is:
# A user expresses his or her preferences by rating items (e.g. books, movies, or music recordings) of the system. These ratings can be viewed as an approximate representation of the user's interest in the corresponding domain.
# The system matches this user's ratings against other users' and finds the people with the most "similar" tastes.
# With similar users, the system recommends items that the similar users have rated highly but that this user has not yet rated (the absence of a rating is usually taken to mean that the user is unfamiliar with the item).

A key problem of collaborative filtering is how to combine and weight the preferences of user neighbors. Sometimes, users can immediately rate the recommended items. As a result, the system gains an increasingly accurate representation of user preferences over time.

Methodology

Collaborative Filtering in Recommender Systems

Collaborative filtering systems have many forms, but many common systems can be reduced to two steps:
# Look for users who share the same rating patterns with the active user (the user whom the prediction is for).
# Use the ratings from those like-minded users found in step 1 to calculate a prediction for the active user.

This falls under the category of user-based collaborative filtering. A specific application of this is the user-based Nearest Neighbor algorithm. Alternatively, item-based collaborative filtering (users who bought x also bought y) proceeds in an item-centric manner:
# Build an item-item matrix determining relationships between pairs of items.
# Infer the tastes of the current user by examining the matrix and matching that user's data.

See, for example, the Slope One item-based collaborative filtering family.

Another form of collaborative filtering can be based on implicit observations of normal user behavior (as opposed to the artificial behavior imposed by a rating task). These systems observe what a user has done together with what all users have done (what music they have listened to, what items they have bought) and use that data to predict the user's behavior in the future, or to predict how a user might like to behave given the chance. These predictions then have to be filtered through business logic to determine how they might affect the actions of a business system. For example, it is not useful to offer to sell somebody a particular album of music if they already have demonstrated that they own that music.

Relying on a scoring or rating system which is averaged across all users ignores specific demands of a user, and is particularly poor in tasks where there is large variation in interest (as in the recommendation of music). However, there are other methods to combat information explosion, such as web search and data clustering.
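The two item-centric steps described above can be sketched in Python. This is only a minimal illustration, assuming implicit purchase data held in a plain dictionary and a simple co-occurrence count as the item-item relationship; the names `purchases`, `build_cooccurrence`, and `recommend` are hypothetical and not taken from any library.

```python
from collections import defaultdict
from itertools import permutations

def build_cooccurrence(purchases):
    """Step 1: count, for each ordered pair of items, how many users bought both."""
    matrix = defaultdict(lambda: defaultdict(int))
    for items in purchases.values():
        for a, b in permutations(set(items), 2):
            matrix[a][b] += 1
    return matrix

def recommend(user, purchases, matrix, top_n=3):
    """Step 2: score unseen items by co-occurrence with the user's own items."""
    owned = set(purchases[user])
    scores = defaultdict(int)
    for item in owned:
        for other, count in matrix[item].items():
            if other not in owned:  # business rule: never offer what is already owned
                scores[other] += count
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical purchase histories (assumed data, for illustration only).
purchases = {
    "ann": ["x", "y"],
    "bob": ["x", "y", "z"],
    "cat": ["x", "z"],
    "dan": ["x"],
}
matrix = build_cooccurrence(purchases)
print(recommend("dan", purchases, matrix))  # ['y', 'z']
```

The filter on already-owned items plays the role of the business logic mentioned above; a production system would likely normalize the raw counts so that very popular items do not dominate every list.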

Types

Memory-based

The memory-based approach uses user rating data to compute the similarity between users or items. Typical examples of this approach are neighbourhood-based CF and item-based/user-based top-N recommendations. For example, in user-based approaches, the rating r_{u,i} that user ''u'' gives to item ''i'' is calculated as an aggregation of some similar users' ratings of the item:

r_{u,i} = \operatorname{aggr}_{u^\prime \in U} r_{u^\prime, i}

where ''U'' denotes the set of top ''N'' users that are most similar to user ''u'' who rated item ''i''. Some examples of the aggregation function include:

r_{u,i} = \frac{1}{N}\sum\limits_{u^\prime \in U} r_{u^\prime, i}

r_{u,i} = k\sum\limits_{u^\prime \in U}\operatorname{simil}(u,u^\prime)\, r_{u^\prime, i}

where ''k'' is a normalizing factor defined as

k = 1/\sum_{u^\prime \in U} |\operatorname{simil}(u,u^\prime)|

and:

r_{u,i} = \bar{r_u} + k\sum\limits_{u^\prime \in U}\operatorname{simil}(u,u^\prime)(r_{u^\prime, i}-\bar{r_{u^\prime}})

where \bar{r_u} is the average rating of user ''u'' for all the items rated by ''u''.

The neighborhood-based algorithm calculates the similarity between two users or items, and produces a prediction for the user by taking the weighted average of all the ratings. Similarity computation between items or users is an important part of this approach. Multiple measures, such as Pearson correlation and vector cosine based similarity, are used for this. The Pearson correlation similarity of two users ''x'', ''y'' is defined as:

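\operatorname{simil}(x,y) = \frac{\sum\limits_{i \in I_{xy}}(r_{x,i}-\bar{r_x})(r_{y,i}-\bar{r_y})}{\sqrt{\sum\limits_{i \in I_{xy}}(r_{x,i}-\bar{r_x})^2}\sqrt{\sum\limits_{i \in I_{xy}}(r_{y,i}-\bar{r_y})^2}}

where I_{xy} is the set of items rated by both user ''x'' and user ''y''. The cosine-based approach defines the cosine similarity between two users ''x'' and ''y'' as (John S. Breese, David Heckerman, and Carl Kadie, "Empirical Analysis of Predictive Algorithms for Collaborative Filtering", 1998):

\operatorname{simil}(x,y) = \cos(\vec x,\vec y) = \frac{\vec x \cdot \vec y}{\|\vec x\| \times \|\vec y\|} = \frac{\sum\limits_{i \in I_{xy}} r_{x,i}\, r_{y,i}}{\sqrt{\sum\limits_{i \in I_x} r_{x,i}^2}\sqrt{\sum\limits_{i \in I_y} r_{y,i}^2}}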
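As a rough illustration of how the similarity measures and the mean-centred aggregation fit together, the following Python sketch computes a prediction for one user and one item. It assumes ratings are stored as nested dictionaries and that each user's mean is taken over all items that user has rated; the function names, the toy `ratings` data, and the default of two neighbours are assumptions made for the example.

```python
from math import sqrt

def mean_rating(r):
    """Average rating of one user over all items that user has rated."""
    return sum(r.values()) / len(r)

def pearson(rx, ry):
    """Pearson similarity summed over the co-rated items I_xy."""
    common = set(rx) & set(ry)
    mx, my = mean_rating(rx), mean_rating(ry)
    num = sum((rx[i] - mx) * (ry[i] - my) for i in common)
    den = (sqrt(sum((rx[i] - mx) ** 2 for i in common))
           * sqrt(sum((ry[i] - my) ** 2 for i in common)))
    return num / den if den else 0.0

def cosine(rx, ry):
    """Cosine similarity; unrated items contribute zero to the dot product."""
    num = sum(rx[i] * ry[i] for i in set(rx) & set(ry))
    den = sqrt(sum(v * v for v in rx.values())) * sqrt(sum(v * v for v in ry.values()))
    return num / den if den else 0.0

def predict(ratings, u, item, simil=pearson, n=2):
    """Mean-centred, similarity-weighted prediction of r_{u,item}."""
    neighbours = [(simil(ratings[u], ratings[v]), v)
                  for v in ratings if v != u and item in ratings[v]]
    top = sorted(neighbours, reverse=True)[:n]  # the set U of most similar raters
    norm = sum(abs(s) for s, _ in top)
    base = mean_rating(ratings[u])
    if norm == 0:
        return base  # no usable neighbours: fall back to the user's own mean
    k = 1 / norm
    return base + k * sum(s * (ratings[v][item] - mean_rating(ratings[v]))
                          for s, v in top)

# Toy ratings on a 1-5 scale (assumed data, for illustration only).
ratings = {
    "u": {"a": 5, "b": 3, "c": 4},
    "v": {"a": 4, "b": 2, "c": 3, "d": 5},
    "w": {"a": 1, "b": 5, "d": 2},
}
print(round(predict(ratings, "u", "d"), 2))
```

Note that the prediction is not clipped to the rating scale, so it can fall slightly outside the 1-5 range; whether to clip is a design choice left open here.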

The user-based top-N recommendation algorithm uses a similarity-based vector model to identify the ''k'' most similar users to an active user. After the ''k'' most similar users are found, their corresponding user-item matrices are aggregated to identify the set of items to be recommended. A popular method for finding similar users is locality-sensitive hashing, which implements the nearest-neighbor mechanism in linear time.

The advantages of this approach include: the explainability of the results, which is an important aspect of recommendation systems; easy creation and use; easy facilitation of new data; content-independence of the items being recommended; and good scaling with co-rated items.

There are also several disadvantages to this approach. Its performance decreases when data gets sparse, which occurs frequently with web-related items. This hinders the scalability of this approach and creates problems with large datasets. Although it can efficiently handle new users because it relies on a data structure, adding new items becomes more complicated since that representation usually relies on a specific vector space. Adding new items requires inclusion of the new item and the re-insertion of all the elements in the structure.
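The locality-sensitive hashing step mentioned above can be illustrated with a small Python sketch. The variant shown is random-hyperplane hashing, chosen here only as one common form of LSH for cosine similarity; the text does not say which variant is meant. The rating vectors are assumed to be mean-centred, and the names `make_planes`, `signature`, and `build_buckets` are hypothetical.

```python
import random

def make_planes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (as normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def signature(vector, planes):
    """One bit per hyperplane: the side of the plane on which the vector falls."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vector)) >= 0 else 0
        for plane in planes
    )

def build_buckets(users, planes):
    """Group users whose rating vectors hash to the same signature."""
    buckets = {}
    for name, vector in users.items():
        buckets.setdefault(signature(vector, planes), []).append(name)
    return buckets

# Mean-centred rating vectors over four items (assumed data).
users = {
    "ann": [2.0, -1.0, 0.5, -1.5],
    "bob": [4.0, -2.0, 1.0, -3.0],   # same direction as ann
    "cat": [-2.0, 1.0, -0.5, 1.5],   # opposite direction to ann
}
planes = make_planes(dim=4, n_bits=8)
buckets = build_buckets(users, planes)
candidates = buckets[signature(users["ann"], planes)]
print(candidates)
```

Only the users in the same bucket need to be compared exactly, which is where the saving over an all-pairs similarity computation comes from. Vectors pointing in similar directions are likely, though not guaranteed, to share a bucket, so practical systems usually combine several hash tables.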

Model-based

In this approach, models are developed using different data mining and machine learning algorithms to predict users' ratings of unrated items. There are many model-based CF algorithms: Bayesian networks, clustering models, latent semantic models such as singular value decomposition, probabilistic latent semantic analysis, multiple multiplicative factor, latent Dirichlet allocation, and Markov decision process based models (Xiaoyuan Su and Taghi M. Khoshgoftaar, "A survey of collaborative filtering techniques", Advances in Artificial Intelligence, 2009).

In this approach, dimensionality reduction methods are mostly used as a complementary technique to improve the robustness and accuracy of the memory-based approach. In this sense, methods like principal component analysis, known as latent factor models, compress the user-item matrix into a low-dimensional representation in terms of latent factors. One advantage of this approach is that, instead of a high-dimensional matrix containing a large number of missing values, we deal with a much smaller matrix in a lower-dimensional space. The reduced representation can be used with either the user-based or the item-based neighborhood algorithms presented in the previous section.

There are several advantages to this paradigm. It handles the sparsity of the original matrix better than memory-based approaches. Comparing similarity on the resulting matrix is also much more scalable, especially when dealing with large sparse datasets.
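To make the idea of latent factors concrete, the sketch below fits a small latent factor model in Python. It does not use SVD or PCA directly; instead it assumes a regularized matrix factorization trained by stochastic gradient descent on the observed ratings only, which is one common way of obtaining such factors when most entries are missing. The function names, hyperparameters, and toy data are all assumptions made for the example.

```python
import random
from math import sqrt

def factorize(observed, n_users, n_items, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Learn k-dimensional user factors P and item factors Q from observed ratings."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in observed:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def rmse(observed, P, Q):
    """Root mean squared error over the observed ratings."""
    return sqrt(sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in observed) / len(observed))

# (user, item, rating) triples; every other cell of the 5x4 matrix is missing.
observed = [
    (0, 0, 5), (0, 1, 3), (0, 3, 1),
    (1, 0, 4), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 3, 5),
    (3, 0, 1), (3, 3, 4),
    (4, 1, 1), (4, 2, 5), (4, 3, 4),
]
P, Q = factorize(observed, n_users=5, n_items=4)
print("training RMSE:", round(rmse(observed, P, Q), 3))
print("predicted rating for user 1, item 1:", round(predict(P, Q, 1, 1), 2))
```

The rows of `P` and `Q` are the low-dimensional representations the text refers to: they can be compared with a similarity measure in place of the sparse original rating vectors.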

Hybrid

A number of applications combine the memory-based and the model-based CF algorithms. These overcome the limitations of native CF approaches and improve prediction performance. Importantly, they overcome CF problems such as sparsity and loss of information. However, they have increased complexity and are expensive to implement. Most commercial recommender systems are hybrid, for example, the Google News recommender system.

Deep-Learning

In recent years, a number of neural and deep-learning techniques have been proposed. Some generalize traditional matrix factorization algorithms via a non-linear neural architecture, or leverage new model types like variational autoencoders. While deep learning has been applied to many different scenarios (context-aware, sequence-aware, social tagging, etc.), its real effectiveness when used in a simple collaborative recommendation scenario has been called into question. A systematic analysis of publications applying deep learning or neural methods to the top-k recommendation problem, published in top conferences (SIGIR, KDD, WWW, RecSys), has shown that on average less than 40% of articles are reproducible, with as little as 14% in some conferences. Overall, the study identifies 18 articles; only 7 of them could be reproduced, and 6 of those 7 could be outperformed by much older and simpler, properly tuned baselines. The article also highlights a number of potential problems in today's research scholarship and calls for improved scientific practices in that area. Similar issues have also been spotted in sequence-aware recommender systems.

Context-aware collaborative filtering

Many recommender systems simply ignore the contextual information that exists alongside a user's rating when providing item recommendations. However, with the pervasive availability of contextual information such as time, location, social information, and the type of device the user is using, it is becoming more important than ever for a successful recommender system to provide context-sensitive recommendations. According to Charu Aggarwal, "Context-sensitive recommender systems tailor their recommendations to additional information that defines the specific situation under which recommendations are made. This additional information is referred to as the context."

Taking contextual information into consideration adds dimensions to the existing user-item rating matrix. For instance, consider a music recommender system that provides different recommendations depending on the time of day. In this case, a user may have different preferences for a piece of music at different times of day. Thus, instead of a user-item matrix, we may use a tensor of order 3 (or higher, to consider other contexts) to represent context-sensitive user preferences.

To take advantage of collaborative filtering, and particularly of neighborhood-based methods, approaches can be extended from a two-dimensional rating matrix to a tensor of higher order. To find the users most similar to a target user, one can extract the slice (e.g. the item-time matrix) corresponding to each user and compute the similarity of these slices. Unlike the context-insensitive case, in which the similarity of two rating vectors is calculated, in context-aware approaches the similarity of the rating matrices corresponding to each user is calculated using Pearson coefficients. After the most like-minded users are found, their corresponding ratings are aggregated to identify the set of items to be recommended to the target user.

The most important disadvantage of incorporating context into the recommendation model is having to deal with a larger dataset that contains many more missing values than the user-item rating matrix. Therefore, similar to matrix factorization methods, tensor factorization techniques can be used to reduce the dimensionality of the original data before any neighborhood-based method is applied.
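The slice comparison described above might look as follows in Python. This is a sketch under stated assumptions: the order-3 tensor is stored sparsely as one dictionary per user keyed by (item, time-of-day) pairs, the Pearson coefficient is computed over the cells both users have rated (with means taken over those cells), and the names `slice_pearson` and `tensor` are hypothetical.

```python
from math import sqrt

def slice_pearson(sx, sy):
    """Pearson coefficient between two users' item-context slices."""
    common = sorted(set(sx) & set(sy))  # (item, context) cells rated by both users
    if len(common) < 2:
        return 0.0
    xs = [sx[c] for c in common]
    ys = [sy[c] for c in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs)) * sqrt(sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0

# tensor[user][(item, time_of_day)] = rating (assumed toy data).
tensor = {
    "ann": {("jazz", "morning"): 5, ("jazz", "night"): 2,
            ("rock", "morning"): 1, ("rock", "night"): 4},
    "bob": {("jazz", "morning"): 4, ("jazz", "night"): 1,
            ("rock", "morning"): 2, ("rock", "night"): 5},
    "cat": {("jazz", "morning"): 1, ("jazz", "night"): 5,
            ("rock", "morning"): 4, ("rock", "night"): 2},
}
for other in ("bob", "cat"):
    print(other, round(slice_pearson(tensor["ann"], tensor[other]), 2))
# bob 0.8
# cat -0.9
```

In this toy data each user's average rating per genre is similar, so a context-insensitive comparison would see little difference between them; only the time-of-day dimension reveals that ann and bob share a pattern while cat follows the opposite one.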

Application on social web

Unlike the traditional model of mainstream media, in which there are few editors who set guidelines, collaboratively filtered social media can have a very large number of editors, and content improves as the number of participants increases. Services like Reddit, YouTube, and Last.fm are typical examples of collaborative filtering based media.

One scenario of collaborative filtering application is to recommend interesting or popular information as judged by the community. As a typical example, stories appear on the front page of Reddit as they are "voted up" (rated positively) by the community. As the community becomes larger and more diverse, the promoted stories can better reflect the average interest of the community members.

Wikipedia is another application of collaborative filtering. Volunteers contribute to the encyclopedia by filtering out facts from falsehoods.

Another aspect of collaborative filtering systems is the ability to generate more personalized recommendations by analyzing information from the past activity of a specific user, or the history of other users deemed to be of similar taste to a given user. These resources are used for user profiling and help the site recommend content on a user-by-user basis. The more a given user makes use of the system, the better the recommendations become, as the system gains data to improve its model of that user.

Problems

A collaborative filtering system does not necessarily succeed in automatically matching content to one's preferences. Unless the platform achieves unusually good diversity and independence of opinions, one point of view will always dominate another in a particular community. As in the personalized recommendation scenario, the introduction of new users or new items can cause the cold start problem, as there will be insufficient data on these new entries for the collaborative filtering to work accurately. In order to make appropriate recommendations for a new user, the system must first learn the user's preferences by analysing past voting or rating activities. The collaborative filtering system requires a substantial number of users to rate a new item before that item can be recommended.

Challenges

Data sparsity

In practice, many commercial recommender systems are based on large datasets. As a result, the user-item matrix used for collaborative filtering can be extremely large and sparse, which degrades the performance of the recommendation. One typical problem caused by data sparsity is the cold start problem. As collaborative filtering methods recommend items based on users' past preferences, new users need to rate a sufficient number of items to enable the system to capture their preferences accurately and thus provide reliable recommendations. New items have the same problem. When new items are added to the system, they need to be rated by a substantial number of users before they can be recommended to users who have similar tastes to the ones who rated them. The new item problem does not affect content-based recommendations, because the recommendation of an item is based on its discrete set of descriptive qualities rather than its ratings.

Scalability

As the numbers of users and items grow, traditional CF algorithms suffer serious scalability problems. For example, with tens of millions of customers, O(M), and millions of items, O(N), a CF algorithm with a complexity of n is already too large. As well, many systems need to react immediately to online requirements and make recommendations for all of their millions of users, with most computations happening in very large memory machines (Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, and Reza Bosagh Zadeh, "WTF: The who-to-follow system at Twitter", Proceedings of the 22nd international conference on World Wide Web).

Synonyms

Synonyms refers to the tendency of a number of the same or very similar items to have different names or entries. Most recommender systems are unable to discover this latent association and thus treat these products differently. For example, the seemingly different items "children's movie" and "children's film" are actually referring to the same item. Indeed, the degree of variability in descriptive term usage is greater than commonly suspected. The prevalence of synonyms decreases the recommendation performance of CF systems. Topic modeling (like the latent Dirichlet allocation technique) could solve this by grouping different words belonging to the same topic.

Gray sheep

Gray sheep refers to the users whose opinions do not consistently agree or disagree with any group of people and thus do not benefit from collaborative filtering. Black sheep are a group whose idiosyncratic tastes make recommendations nearly impossible. Although this is a failure of the recommender system, non-electronic recommenders also have great problems in these cases, so having black sheep is an acceptable failure.

Shilling attacks

In a recommendation system where anyone can give ratings, people may give many positive ratings to their own items and negative ratings to their competitors'. Collaborative filtering systems often need to introduce precautions to discourage such manipulation.

Diversity and the long tail

Collaborative filters are expected to increase diversity because they help us discover new products. Some algorithms, however, may unintentionally do the opposite. Because collaborative filters recommend products based on past sales or ratings, they cannot usually recommend products with limited historical data. This can create a rich-get-richer effect for popular products, akin to positive feedback. This bias toward popularity can prevent what are otherwise better consumer-product matches. A Wharton study details this phenomenon along with several ideas that may promote diversity and the "long tail." Several collaborative filtering algorithms have been developed to promote diversity and the "long tail" by recommending novel, unexpected, and serendipitous items.

Innovations

* New algorithms have been developed for CF as a result of the Netflix prize.
* Cross-System Collaborative Filtering, where user profiles across multiple recommender systems are combined in a multitask manner; this way, preference pattern sharing is achieved across models.
* Robust collaborative filtering, where recommendation is stable towards efforts of manipulation. This research area is still active and not completely solved.

Auxiliary information

The user-item matrix is the basic foundation of traditional collaborative filtering techniques, and it suffers from the data sparsity problem (i.e. cold start). As a consequence, researchers are trying to gather auxiliary information beyond the user-item matrix to help boost recommendation performance and develop personalized recommender systems. Generally, there are two popular kinds of auxiliary information: attribute information and interaction information. Attribute information describes a user's or an item's properties. For example, user attributes might include a general profile (e.g. gender and age) and social contacts (e.g. followers or friends in social networks); item attributes are properties like category, brand, or content. Interaction information refers to the implicit data showing how users interact with an item. Widely used interaction information includes tags, comments or reviews, and browsing history.

Auxiliary information plays a significant role in a variety of aspects. Explicit social links, as a reliable representation of trust or friendship, are often employed in similarity calculation to find similar persons who share interests with the target user. Interaction-associated information such as tags is taken as a third dimension (in addition to user and item) in advanced collaborative filtering to construct a 3-dimensional tensor structure for exploring recommendations.

External links

* ''Beyond Recommender Systems: Helping People Help Each Other'', page 12, 2001
* Recommender Systems. Prem Melville and Vikas Sindhwani. In Encyclopedia of Machine Learning, Claude Sammut and Geoffrey Webb (Eds), Springer, 2010.
* Recommender Systems in industrial contexts. PhD thesis (2012) including a comprehensive overview of many collaborative recommender systems.
* Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. Adomavicius, G. and Tuzhilin, A. IEEE Transactions on Knowledge and Data Engineering 06.2005
* Evaluating collaborative filtering recommender systems. DOI 10.1145/963770.963772
* Content-Boosted Collaborative Filtering for Improved Recommendations. Prem Melville, Raymond J. Mooney, and Ramadass Nagarajan. Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI-2002), pp. 187–192, Edmonton, Canada, July 2002.
* A collection of past and present "information filtering" projects (including collaborative filtering) at MIT Media Lab
* Eigentaste: A Constant Time Collaborative Filtering Algorithm. Ken Goldberg, Theresa Roeder, Dhruv Gupta, and Chris Perkins. Information Retrieval, 4(2), 133-151. July 2001.
* A Survey of Collaborative Filtering Techniques. Su, Xiaoyuan and Khoshgoftaar, Taghi M.
* Google News Personalization: Scalable Online Collaborative Filtering. Abhinandan Das, Mayur Datar, Ashutosh Garg, and Shyam Rajaram. International World Wide Web Conference, Proceedings of the 16th international conference on World Wide Web
* Factor in the Neighbors: Scalable and Accurate Collaborative Filtering. Yehuda Koren, Transactions on Knowledge Discovery from Data (TKDD) (2009)
* Rating Prediction Using Collaborative Filtering
* Recommender Systems
* Berkeley Collaborative Filtering