Researchers at the Massachusetts Institute of Technology (MIT) and IBM have developed a text-analysis method for rapidly narrowing down reading material from billions of online options. MIT's Justin Solomon and colleagues designed an algorithm that summarizes a collection of texts as a set of topics based on words commonly used in the collection, then represents each text by its five to 15 most important topics, along with an estimate of how much each topic contributes to the text.
The algorithm combines three text-analysis tools (topic modeling, word embeddings, and optimal transport) to compare topics within a collection of books and between any two books.
The method can compare and sort texts faster and more accurately than competing techniques, and it can shed light on the reasoning behind the model's recommendations.
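To make the combination of topics, word embeddings, and optimal transport more concrete, here is a minimal Python sketch of a two-level comparison. It is an illustration under stated assumptions, not the researchers' implementation: the four-word vocabulary, the 2-D "embeddings", the two topics, and the names `transport_cost` and `document_distance` are all invented for the example, and the transport problem is solved with a generic linear-programming call from SciPy rather than a specialized solver.

```python
import numpy as np
from scipy.optimize import linprog


def transport_cost(a, b, cost):
    """Minimum cost of moving distribution `a` onto distribution `b`.

    Solves the discrete optimal transport linear program:
    minimize sum(P * cost) subject to row sums of P equal to `a`,
    column sums equal to `b`, and P >= 0.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cost = np.asarray(cost, dtype=float)
    n, m = cost.shape
    # P is flattened row-major, so entry (i, j) sits at index i * m + j.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0   # row i of P sums to a[i]
    for j in range(m):
        A_eq[n + j, j::m] = 1.0            # column j of P sums to b[j]
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=np.concatenate([a, b]),
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return float(res.fun)


# Toy 2-D "word embeddings" for a four-word vocabulary (invented values).
word_vectors = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]])
# Pairwise Euclidean distances between word vectors.
word_cost = np.linalg.norm(
    word_vectors[:, None, :] - word_vectors[None, :, :], axis=-1
)

# Each topic is a probability distribution over the vocabulary (invented).
topics = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
])

# Lower level: distance between two topics is the cost of transporting
# one topic's word distribution onto the other's.
k = len(topics)
topic_cost = np.array([
    [transport_cost(topics[s], topics[t], word_cost) for t in range(k)]
    for s in range(k)
])


def document_distance(doc_a, doc_b):
    """Upper level: transport one document's topic mix onto another's."""
    return transport_cost(doc_a, doc_b, topic_cost)
```

With these toy values, `document_distance([1.0, 0.0], [0.0, 1.0])` comes out to 3.0, because all of the first document's mass must move between two topics that sit 3.0 apart, while `document_distance([0.75, 0.25], [0.25, 0.75])` is 1.5, because only half of the mass has to move. A breakdown of this kind, showing which topics account for the distance, is one plausible way a transport-based comparison can help explain a recommendation.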
From MIT CSAIL
Abstracts Copyright © 2019 SmithBucklin, Washington, DC, USA