What is the best way to sell one or multiple indivisible items to maximize revenue? Should they be priced separately, priced in bundles, or sold using a more complex sales mechanism? Despite their practical importance, ancient history, and centuries of scientific study, these questions have remained unanswered. The challenge facing sellers and scientists alike is that there are myriad sales mechanisms. While we are used to price tags on items in stores, many other sales mechanisms exist. Writing a model encompassing all possible mechanisms is already onerous, let alone optimizing over them.
It thus came as a surprise when, in the 1980s, economists developed mathematics that could pinpoint the best way of selling a single item to one or multiple strategic buyers, given distributional information about their values for the item. When these values are independent and identically distributed, the optimum can be cast as the familiar ascending price auction with a reserve price, such as what eBay uses, and becomes a somewhat more exotic auction when values are not identically distributed. Myerson's extremely influential, Nobel Prize-winning work prescribes exactly how to set it up.
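As a concrete single-item illustration of Myerson's prescription (my worked example, not taken from the article): the reserve price is the point where the buyer's virtual value phi(v) = v - (1 - F(v))/f(v) crosses zero. For a value drawn uniformly from [0, 1], phi(v) = 2v - 1, so the optimal reserve is 1/2, which a simple grid search over posted prices confirms.

```python
import numpy as np

# Myerson's virtual value for a buyer with value v ~ F is
#   phi(v) = v - (1 - F(v)) / f(v),
# and the optimal reserve r* solves phi(r*) = 0.
# Illustration: v ~ Uniform[0, 1], so F(v) = v, f(v) = 1,
# hence phi(v) = 2v - 1 and r* = 1/2.

def virtual_value_uniform(v):
    """Virtual value for v ~ Uniform[0, 1]."""
    return 2.0 * v - 1.0

# A grid search over posted prices r maximizing expected revenue
# r * (1 - F(r)) = r * (1 - r) recovers the same reserve:
grid = np.linspace(0.0, 1.0, 100001)
revenue = grid * (1.0 - grid)
r_star = grid[np.argmax(revenue)]

assert abs(r_star - 0.5) < 1e-3
assert abs(virtual_value_uniform(r_star)) < 1e-3
```

The same zero-crossing recipe, applied to each buyer's virtual value, is what turns into an ascending auction with a reserve in the i.i.d. case.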
Alas, pushing this program to the multi-item setting has bedeviled economists and computer scientists. It should be appreciated that organizing the simultaneous sale of multiple items is neither an innocuous nor an esoteric problem. When governments sell radio spectrum, they simultaneously sell multiple licenses in one big auction. When ad exchanges sell banner ads on, say, the New York Times frontpage, they simultaneously sell all banners for each impression. When online retailers sell various items to competing buyers, they should organize their sales cleverly. Despite the importance of multi-item settings, optimal solutions remain elusive.
Indeed, extensive study has revealed that optimal multi-item mechanisms, even for selling two items to one buyer, are tremendously more complex than single-item ones. Bundling and randomized allocations are necessary, and the optimal mechanism may exhibit various counterintuitive properties, as I describe in a recent survey.2 At the time of writing, results characterizing optimal mechanisms exist only in the multi-item single-buyer setting.3 Those results use optimal transport theory to characterize the structure of the optimal mechanism in terms of stochastic dominance conditions satisfied by the buyer's value distribution, and they suggest that there is no one-size-fits-all optimal multi-item mechanism.
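The necessity of bundling can be checked numerically in a textbook instance (my illustration, not from the article): a single additive buyer with two i.i.d. Uniform[0, 1] item values. Selling each item separately at its Myerson-optimal price of 1/2 yields expected revenue 1/2, while a well-chosen price for the grand bundle does strictly better (roughly 0.544), so separate pricing is already suboptimal with two items.

```python
import numpy as np

# Monte Carlo comparison of separate sale vs. grand bundling for one
# additive buyer with two i.i.d. Uniform[0,1] item values.
rng = np.random.default_rng(1)
v = rng.uniform(size=(200_000, 2))

# Separate sale: each item posted at its single-item optimal price 1/2;
# by symmetry, total revenue is twice one item's revenue, about 0.5.
sep = 2 * 0.5 * np.mean(v[:, 0] >= 0.5)

# Grand bundle: post a single price p for both items and pick the best p.
grid = np.linspace(0.0, 2.0, 401)
bundle_rev = [p * np.mean(v.sum(axis=1) >= p) for p in grid]
best_bundle = max(bundle_rev)

assert best_bundle > sep  # bundling strictly beats separate pricing here
```

The optimal mechanism in such instances can be more complex still, randomizing over allocations via a menu of lotteries; this tiny experiment only shows that the single-item playbook already breaks.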
Absent a simple, universal structure in the optimal mechanism, there are two natural approaches, which both fit well with the CS aesthetic and have been extensively pursued. One is trading optimality for simplicity, as pursued in a growing body of work that has been identifying progressively simpler, approximately optimal mechanisms for progressively more complex multi-item multi-buyer settings.
Another approach is to develop frameworks for computing optimal mechanisms on an instance-by-instance basis, given the buyers' value distributions. Such frameworks are useful not only for their end products, that is, the mechanisms they compute, but also as exploratory tools for investigating the existence of structure in the optimal mechanism over a range of settings of interest. There is a large volume of work on this topic, computing optimal mechanisms in quite complex settings via convex programming.2
The following paper by Dütting et al. contributes a very interesting and forward-looking new take on this computational challenge, initiating the use of deep learning for mechanism design.
Rather than taking the value distributions as input, their framework receives samples from those distributions and trains a neural network representing the mechanism to attain good empirical revenue. The sample complexity of optimal multi-item mechanisms has been somewhat studied, so we have a reasonable understanding of which settings allow near-optimal mechanisms to be identified from polynomially many samples; in particular, it is known that some conditional independence across item values is needed for statistical efficiency.1
Rather than pursuing optimality guarantees for their mechanisms, they again deviate from prior work by pursuing best-effort guarantees, that is, the best revenue that gradient descent may deliver, even without endowing their neural networks with much inductive bias about the structure of the optimal mechanism. Moreover, rather than exact truthfulness, they seek approximate truthfulness, enforced via min-max training formulations for their neural networks.
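To make the min-max idea concrete, here is a sketch of the two empirical quantities such a training loss trades off: expected revenue and expected regret, the utility a buyer could gain by misreporting. The mechanism below is a deliberately non-truthful toy rule of my own (charge the report itself whenever it clears a price), not the authors' network; the inner maximization over misreports is done by grid search rather than gradient ascent.

```python
import numpy as np

# A direct mechanism maps a report b to an allocation a(b) in {0,1}
# and a payment t(b). Training in the min-max style maximizes empirical
# revenue while penalizing empirical regret, driving regret toward zero.

rng = np.random.default_rng(0)

def mechanism(b, price=0.5):
    """Toy NON-truthful rule (my illustration): sell whenever the
    report clears the price, but charge the report itself."""
    alloc = (b >= price).astype(float)
    pay = alloc * b
    return alloc, pay

def utility(v, b):
    """Utility of a buyer with true value v who reports b."""
    alloc, pay = mechanism(b)
    return v * alloc - pay

values = rng.uniform(0.0, 1.0, size=1000)      # samples, not a density
reports = np.linspace(0.0, 1.0, 201)           # inner max over misreports

truthful_u = utility(values, values)           # identically 0 for this rule
best_dev_u = np.max(utility(values[:, None], reports[None, :]), axis=1)
regret = np.mean(best_dev_u - truthful_u)      # gain from best misreport

_, payments = mechanism(values)
revenue = np.mean(payments)

# A training loss in this spirit combines the two, e.g.:
lam = 1.0
loss = -revenue + lam * regret

assert regret > 0.0  # the toy rule is far from truthful
```

In the actual framework, both the mechanism and the misreports are differentiable, so gradient descent on the network's weights shrinks the regret term while gradient ascent sharpens the misreport estimate.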
With those design choices, the authors provide a flexible computational framework, which readily accommodates continuous distributions and is more scalable in some dimensions of the problem. Despite the lack of strong inductive bias and the reliance on gradient-based optimization, empirical testing shows that the mechanisms identified by their framework attain close-to-optimal revenue in simple settings where optimal mechanisms have been otherwise identified, while in other settings, where the optimum had not been identified, it computes mechanisms that appear near-optimal. How robust are these findings in broader settings? And, if they are robust, what would be that elusive property of optimal multi-item mechanisms that enables relatively agnostic architectures to represent them and gradient descent to identify them?
To view the accompanying paper, visit doi.acm.org/10.1145/3470442
The Digital Library is published by the Association for Computing Machinery. Copyright © 2021 ACM, Inc.