Netflix uses a wide array of Big Data techniques to generate
its above-average recommendations. It relies heavily on machine-learning
algorithms, applying them before or after almost every other step in generating recommendations.
This focus is important because it raises significant issues with processing.
Online processing responds rapidly to user interactions, but the
amount of data that can be processed and the computational complexity of the
processing are limited. Offline processing alleviates both of these issues, but
lowers responsiveness, increasing the likelihood of data becoming outdated
during processing. Nearline processing is a middle-ground option: it is
performed like online processing but is not required to occur in real time.
Each of these options comes with complex consequences and side effects. To
manage these trade-offs, Netflix uses a combination of all three methods of
processing across Amazon Web Services, in an architecture illustrated below.
As you can see, this is an extremely complex setup. Netflix
uses offline processing for calculating overarching trends or other things that
require no user input, as well as for training machine-learning models that
are later used to calculate results. Nearline processing is used largely to
prepare fallback results in case online processing fails to produce results as
quickly as required. Nearline is also used in situations where timing matters
less than accuracy, for instance updating recommendations to reflect that a
movie has been watched while the user is still watching it. Online computing
is used largely in response to user activity, such as searching for a category.
Netflix’s hybrid approach is particularly useful in situations where
intermediate results can be batch processed and then used to calculate more
specific results in real time in response to user activity. Most of Netflix’s
model training and machine learning is done offline and then used online.
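To make the offline/online split concrete, here is a minimal Python sketch. It is not Netflix's actual code: the viewing log, the function names, and the popularity-by-genre scoring are all assumptions invented for illustration. The batch step precomputes an intermediate result (view counts per genre), and the request-time step only filters and ranks that small table for one user.

```python
from collections import defaultdict

# Hypothetical viewing log of (user, title, genre) tuples; invented data.
VIEW_LOG = [
    ("ann", "Alpha", "drama"),
    ("ann", "Bravo", "comedy"),
    ("bob", "Alpha", "drama"),
    ("bob", "Charlie", "drama"),
    ("cat", "Bravo", "comedy"),
    ("cat", "Delta", "comedy"),
]


def offline_popularity(view_log):
    """Batch step: count views per title, grouped by genre.

    This needs no user input, so it can run offline over the full log.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for _user, title, genre in view_log:
        counts[genre][title] += 1
    return {genre: dict(titles) for genre, titles in counts.items()}


def online_recommend(popularity, genre, already_watched, limit=2):
    """Request-time step: rank the precomputed counts for one user.

    Only a lookup, a filter, and a small sort happen here, so the
    response stays fast regardless of how large the original log is.
    """
    candidates = popularity.get(genre, {})
    unseen = [title for title in candidates if title not in already_watched]
    # Most-viewed first; ties broken alphabetically for a stable order.
    unseen.sort(key=lambda title: (-candidates[title], title))
    return unseen[:limit]


# The batch result is computed once and reused across many requests.
popularity = offline_popularity(VIEW_LOG)
print(online_recommend(popularity, "drama", already_watched={"Charlie"}))
```

In a real system the batch result would be stored somewhere the online service can read quickly, and it could also serve as the fallback answer when the online step misses its deadline.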
Netflix's hybrid approach is particularly important to big data because it produces very strong recommendations, which would be hard to achieve using only online or nearline methods, while still maintaining a fast response time that would not be possible using only offline approaches.
Source: http://techblog.netflix.com/2013/03/system-architectures-for.html
After reading this post I was interested in how YouTube runs its recommendation algorithm. What I found is that they have a different outlook on how their algorithm should work. Because YouTube's servers are growing at a ridiculously quick rate, their algorithm takes into account the newness of a video as well. To keep YouTubers interested, the analysts at YouTube have created an algorithm that recommends a mixture of old and new videos so that new content does not get buried immediately. This is also good news for YouTubers who create content on the site. It gives them a better chance for their content to be seen, and I think this is what YouTube was going for.