Sorting and transmitting data are two of the most fundamental tasks for which we have employed digital computers. The following paper proves a remarkable connection between how efficiently computers can perform these two tasks, connecting a longstanding question about the optimality of Merge Sort and another, very different, open problem in the study of network coding for data transmission.
Merge Sort was one of the first programs written for digital computers. Though, as a comparison-based in-memory algorithm, it has since been superseded by other sorting algorithms with better memory usage and by algorithms, such as radix sort, that are faster than any comparison-based algorithm, Merge Sort has remained important for sorting large amounts of data that require external storage. It can be modified to merge multiple streams at once and only requires the sequential access common for many storage media. A natural question to ask is: Is (multiway) Merge Sort an optimal choice for an external-memory sorting algorithm, or can we do much better? With the ubiquity of large datasets, this could have many practical applications.