Thursday, March 21, 2013

Hybrid Recommendation Systems


A hybrid recommendation system combines several individual recommendation systems as sub-components.  The hybrid approach was introduced to cope with the problems of conventional recommendation systems.  Researchers in this field have addressed two main problems: the cold-start problem and the stability versus plasticity problem.  The cold-start problem occurs when learning-based techniques such as collaborative, content-based, and demographic recommendation algorithms are used.  Their learning stages depend on information about users, and in most cases a user has to enter ratings or preferences manually, so this kind of information is hard to collect.  The stability problem means that it is sometimes hard to change a user's profile once it has been established over a period of using the system.  The former problem can be eased with the hybrid approach, because a different type of recommendation technique, such as a knowledge-based algorithm, is less affected by it.  One solution to the latter problem is temporal discounting, which gives older ratings less influence.
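As a rough illustration of temporal discounting, the sketch below averages a user's ratings with weights that shrink as the ratings age.  The exponential decay, the half_life_days parameter, and the function name are assumptions made for this example only; the idea described above just requires that older ratings count for less.

```python
def discounted_mean(ratings, now, half_life_days=90.0):
    """Average ratings, giving older ones less influence.

    ratings: list of (value, day) pairs, where day is when the rating was given.
    now: the current day, in the same units.
    half_life_days: hypothetical tuning knob; a rating this old counts half
    as much as a brand-new one.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for value, day in ratings:
        age = now - day
        weight = 0.5 ** (age / half_life_days)  # exponential decay (assumed)
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        return None  # no ratings to average
    return weighted_sum / weight_total


# An old 5-star rating and a fresh 1-star rating: the fresh one dominates.
history = [(5.0, 0), (1.0, 90)]
recent_view = discounted_mean(history, now=90)
```

With a plain average the two ratings above would give 3.0; with the discount the result is pulled toward the newer rating, which is how a long-established profile can still follow a change in taste.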
Therefore, various hybrid recommendation techniques have been introduced and tested.  The four major recommendation techniques used to build hybrids are collaborative filtering (CF), content-based (CN), demographic, and knowledge-based (KB).  Unlike the first three, which make use of learning algorithms, KB exploits domain knowledge and makes inferences about users’ needs and preferences.  By combining these techniques, a hybrid recommendation system can produce output that outperforms single-component systems.  The most common way to build a hybrid is to combine techniques of different types, for example mixing CN and CF.  However, it is also possible to mix different techniques of the same type, such as a naive Bayes based CN plus a kNN based CN.  Mixing techniques of the same type that are trained on different datasets is possible as well.
Burke (2002) introduced a taxonomy of hybrid recommendation systems.  He classified them into seven categories: weighted, switching, mixed, feature combination, feature augmentation, cascade, and meta-level.
Weighted hybrid – This hybrid combines the scores from each component using a linear formula.  Therefore, each component must be able to produce a recommendation score that can be combined linearly.  The components also need consistent relative accuracy across the product space, that is, they should perform uniformly.
Switching hybrid – The issue for this hybrid is selecting one recommender among the candidates.  The selection is made according to the current situation.  A selection criterion, such as a confidence value or an external criterion, must exist, and the components may perform differently in different situations.
Mixed hybrid – This hybrid is based on merging multiple ranked lists and presenting them as one.  Each component should be able to produce a ranked recommendation list, and the core algorithm of the mixed hybrid merges these into a single ranked list.  The issue here is how the new rank scores should be produced.  One simple approach is to add the rank scores, for example CF_rank (3) + CN_rank (2) → Mixed_rank (5).
Feature combination hybrid – This hybrid has two very different recommendation components: a contributing recommender and an actual recommender.  The actual recommender works with data modified by the contributing one, which injects features from one source into the source of the other component.
Feature augmentation hybrid – This is similar to the feature combination hybrid, but differs in that the contributor generates new features.  It is more flexible and adds fewer dimensions than the feature combination method.
Cascade hybrid – In this hybrid the secondary recommender is just a tie breaker.  It refines the result of the primary recommender by ordering the items that the primary recommender cannot tell apart.
Meta-level hybrid – This hybrid also has a contributing and an actual recommender, but the former completely replaces the data for the latter, not just part of it.
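Four of the categories above (weighted, switching, mixed, and cascade) work directly on the scores or ranked lists that the components output, so they can be sketched in a few lines.  The code below is only an illustrative sketch, not Burke's implementation: the function names, the dictionary and list data shapes, the confidence threshold, and the rule for items missing from a ranked list are all assumptions made for this example.

```python
def weighted(scores_by_component, weights):
    """Weighted hybrid: linear combination of per-item scores.
    Items a component did not score count as 0.0 (assumed)."""
    items = set().union(*scores_by_component.values())
    return {
        item: sum(weights[name] * scores.get(item, 0.0)
                  for name, scores in scores_by_component.items())
        for item in items
    }


def switching(candidates, threshold=0.5):
    """Switching hybrid: use the first component whose confidence
    reaches the (hypothetical) threshold, else fall back to the last."""
    for confidence, scores in candidates:
        if confidence >= threshold:
            return scores
    return candidates[-1][1]


def mixed(rank_lists):
    """Mixed hybrid: add each item's rank position across lists;
    a lower total is better. Missing items get len(list) + 1 (assumed)."""
    items = set().union(*rank_lists)
    totals = {
        item: sum(lst.index(item) + 1 if item in lst else len(lst) + 1
                  for lst in rank_lists)
        for item in items
    }
    return sorted(items, key=lambda item: (totals[item], item))


def cascade(primary, secondary):
    """Cascade hybrid: order by primary score, and let the secondary
    score decide only between items the primary scores equally."""
    return sorted(primary,
                  key=lambda item: (-primary[item],
                                    -secondary.get(item, 0.0),
                                    item))


# Made-up scores from a CF and a CN component for three items.
cf = {"a": 0.9, "b": 0.4, "c": 0.4}
cn = {"a": 0.2, "b": 0.8, "c": 0.6}

combined = weighted({"cf": cf, "cn": cn}, {"cf": 0.5, "cn": 0.5})
chosen = switching([(0.3, cf), (0.7, cn)])
merged = mixed([["a", "b", "c"], ["b", "c", "a"]])
refined = cascade(cf, cn)
```

In this toy data CF cannot separate items b and c, so the cascade lets CN decide between them while leaving a, CF's clear favourite, on top.  The feature combination, feature augmentation, and meta-level hybrids are left out because they change the input data of the actual recommender rather than its output, so they depend on the concrete learning algorithm.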



