Massachusetts Institute of Technology (MIT) professor Philippe Rigollet and colleagues recently completed a study testing how well variants of online ads drive traffic and reach viewers.
Rigollet says their approach is mathematically guaranteed to yield optimal results, and also addresses the "exploration-versus-exploitation dilemma." He believes the work could have implications for the distribution of tasks in parallel computers.
In the technique used by the researchers, three or four batches of tests were generally found to yield results as good as those generated by testing subjects individually. The measure of effectiveness they used was "cumulative regret," or the aggregate difference between the rewards the subjects in the trial received and the rewards they would have received had they all been administered what proved to be the best-performing option.
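As a rough illustration of the cumulative-regret measure described above, the following Python sketch sums, over all subjects, the gap between the best option's expected reward and the expected reward of the option each subject actually received. The function name, the `arm_means` values, and the `choices` list are hypothetical and are not taken from the study.

```python
def cumulative_regret(arm_means, choices):
    """Sum of (best expected reward - expected reward of the chosen option).

    arm_means: expected reward of each option (hypothetical values).
    choices:   index of the option given to each subject, in order.
    """
    best = max(arm_means)
    return sum(best - arm_means[arm] for arm in choices)


# Two options with assumed success rates of 25% and 50%.
arm_means = [0.25, 0.5]
# Six subjects; two of them received the weaker option.
choices = [0, 1, 0, 1, 1, 1]
regret = cumulative_regret(arm_means, choices)
print(regret)
```

A trial that gave every subject the best option would have a cumulative regret of zero under this definition.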
The type of problem Rigollet and his colleagues explored is called a "bandit problem," in which someone is trying to determine which of several slot machines ("one-armed bandits") offers the best rate of return without going broke. The problem requires resolving the exploration-versus-exploitation dilemma: the tension between continuing to try every machine to learn its payout and settling on the machine that looks best so far.
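To make the batched setting concrete, here is a minimal Python sketch of a simple explore-then-commit strategy for two options, in which the allocation can change only between batches. It is not the researchers' algorithm; the function name, the even split during exploration, and the rule of committing in the final batch are all assumptions chosen for illustration.

```python
import random


def batched_two_arm_trial(arm_means, batch_sizes, seed=0):
    """Simulate a two-option trial where decisions change only between batches.

    Every batch except the last is split evenly between the two options
    (exploration); the last batch goes entirely to the option with the
    higher observed success rate (exploitation). Illustrative only.
    """
    rng = random.Random(seed)
    pulls = [0, 0]   # number of subjects given each option
    wins = [0, 0]    # number of successes observed per option
    choices = []     # option given to each subject, in order

    def pull(arm):
        # Bernoulli reward with the option's assumed success rate.
        reward = 1 if rng.random() < arm_means[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        choices.append(arm)

    for i, size in enumerate(batch_sizes):
        is_last = i == len(batch_sizes) - 1
        if is_last and all(pulls):
            # Commit to the empirically better option for the whole batch.
            best = max((0, 1), key=lambda a: wins[a] / pulls[a])
            for _ in range(size):
                pull(best)
        else:
            # Alternate options so the batch is split evenly.
            for j in range(size):
                pull(j % 2)
    return choices


# Three batches: two exploratory, one committed.
choices = batched_two_arm_trial([0.3, 0.6], [20, 20, 60])
print(len(choices), set(choices[-60:]))
```

The design choice worth noting is that nothing inside a batch depends on outcomes from that same batch, which is what distinguishes a batched trial from one that adapts after every individual subject.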
"Surprisingly...these algorithms still find a treatment that is nearly as good as the best single treatment in hindsight," says Google's Moritz Hardt. "This new development holds the promise of making bandit optimization a more robust choice across several application domains."
From MIT News
Abstracts Copyright © 2015 Information Inc., Bethesda, Maryland, USA