The New York Times has overhauled its article recommendation algorithm, taking Collaborative Topic Modeling (CTM) as its inspiration. CTM models content, adjusts that model using signals from readers, models reader preferences, and bases its recommendations on the similarity between preference and content. The algorithm first models each article as a combination of the topics it covers, and then models each reader according to the topics they prefer. It recommends articles according to how closely their topics align with a reader's preferred topics.
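The matching step can be illustrated with a minimal sketch, assuming articles and readers are both represented as weight vectors over the same topics and that similarity is a plain dot product. The article names, topic order, weights, and the `recommend` helper are hypothetical and are not taken from the Times' system.

```python
def dot(a, b):
    """Dot product of two equal-length weight vectors."""
    return sum(x * y for x, y in zip(a, b))


def recommend(reader_pref, article_topics, top_n=2):
    """Rank articles by how well their topic weights match the reader's."""
    ranked = sorted(
        article_topics.items(),
        key=lambda item: dot(reader_pref, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:top_n]]


# Hypothetical topic order: politics, science, sports.
articles = {
    "election-recap": [0.8, 0.1, 0.1],
    "mars-rover": [0.1, 0.8, 0.1],
    "playoff-preview": [0.1, 0.1, 0.8],
}
reader = [0.2, 0.7, 0.1]  # assumed preference weights for one reader

print(recommend(reader, articles))  # ['mars-rover', 'election-recap']
```

A dot product is only one possible similarity measure; it is used here because it keeps the example short.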
To model an article based on its text, the algorithm examines the body of each article and applies Latent Dirichlet Allocation (LDA), which learns the mixture of "topics" in each article. A document that gives a topic a high weight tends to contain words that are highly probable under that topic.
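The link between a document's topic mixture and its words can be sketched as follows. This shows only the mixture idea behind Latent Dirichlet Allocation, not the inference procedure that learns topics from text. The two topics, the four-word vocabulary, and every probability are made up for illustration.

```python
# Hypothetical per-topic word distributions (each sums to 1).
topic_word = {
    "politics": {"vote": 0.5, "senate": 0.4, "orbit": 0.05, "rocket": 0.05},
    "space": {"vote": 0.05, "senate": 0.05, "orbit": 0.5, "rocket": 0.4},
}


def word_probs(mixture):
    """Word probabilities for a document with the given topic mixture.

    p(word | document) = sum over topics of
        p(topic | document) * p(word | topic)
    """
    vocab = sorted({w for dist in topic_word.values() for w in dist})
    return {
        w: sum(mixture[t] * topic_word[t][w] for t in mixture)
        for w in vocab
    }


# A document that is mostly about space (assumed mixture weights).
space_heavy = word_probs({"politics": 0.1, "space": 0.9})
for word, p in sorted(space_heavy.items(), key=lambda item: -item[1]):
    print(f"{word:7s} {p:.3f}")
```

Because the document puts most of its weight on the space topic, "orbit" and "rocket" come out far more probable than "vote" or "senate".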
To update the model according to audience reading patterns, the algorithm layers those patterns on top of the content model, iteratively adjusting offsets and recalculating reader scores until neither changes much. CTM then describes readers based on their reading history, which allows it to calculate reader preference in less than 1 millisecond per reader.
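The alternating update can be sketched as below. This is a simplified stand-in, not the Times' implementation: it assumes each article vector is its content-based topic vector plus a learned offset, that a reader's score for an article is a dot product, and that both are fitted by small gradient steps on a regularized squared error over click data. The function names, the data, and the parameters `lam`, `lr`, `tol`, and `max_iter` are all hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit(theta, clicks, lam=0.1, lr=0.05, tol=1e-6, max_iter=5000):
    """Alternate between reader vectors and article offsets until stable.

    theta[j]     : content-based topic vector of article j (kept fixed)
    clicks[i][j] : 1 if reader i read article j, else 0
    """
    n_readers, n_articles, k = len(clicks), len(theta), len(theta[0])
    offsets = [[0.0] * k for _ in range(n_articles)]
    readers = [[0.0] * k for _ in range(n_readers)]
    for _ in range(max_iter):
        change = 0.0
        arts = [[t + o for t, o in zip(theta[j], offsets[j])]
                for j in range(n_articles)]
        # Step 1: recalculate reader vectors with the articles held fixed.
        for i in range(n_readers):
            grad = [lam * readers[i][d] for d in range(k)]
            for j in range(n_articles):
                err = dot(readers[i], arts[j]) - clicks[i][j]
                for d in range(k):
                    grad[d] += err * arts[j][d]
            for d in range(k):
                readers[i][d] -= lr * grad[d]
                change = max(change, abs(lr * grad[d]))
        # Step 2: adjust article offsets with the readers held fixed.
        for j in range(n_articles):
            grad = [lam * offsets[j][d] for d in range(k)]
            for i in range(n_readers):
                err = dot(readers[i], arts[j]) - clicks[i][j]
                for d in range(k):
                    grad[d] += err * readers[i][d]
            for d in range(k):
                offsets[j][d] -= lr * grad[d]
                change = max(change, abs(lr * grad[d]))
        if change < tol:  # neither readers nor offsets moved much
            break
    return readers, offsets


def score(reader, topic_vec, offset):
    """Reader's predicted interest in an article (content plus offset)."""
    return dot(reader, [t + o for t, o in zip(topic_vec, offset)])


# Hypothetical data: two topics (politics, science), three articles.
theta = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
clicks = [[1, 0, 0],   # reader 0 read only the politics article
          [0, 1, 1]]   # reader 1 read the science and mixed articles
readers, offsets = fit(theta, clicks)
for j in range(len(theta)):
    print(j, round(score(readers[1], theta[j], offsets[j]), 3))
```

Once the reader vectors are fitted, scoring an article for a reader is a single dot product, which is consistent with preference calculation being fast per reader.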
From The New York Times
Abstracts Copyright © 2015 Information Inc., Bethesda, Maryland, USA