Home → Magazine Archive → June 2021 (Vol. 64, No. 6) → Getting Down to Basics → Full Text

Getting Down to Basics

By Neil Savage

Communications of the ACM, Vol. 64 No. 6, Pages 12-14
10.1145/3460734

[article image]

Save PDF

Writing the code to make a computer perform a particular job could be a Herculean task, back in the 1950s and 60s.

"In the early 1950s, people did numerical computation by writing assembly language programs," says Alfred V. Aho, professor emeritus of computer science at Columbia University. "Assembly language is a language very close to the operations of a computer, and it's a deadly way to program. It's slow, tedious, and expensive."

Of course, people can program at higher levels of abstraction, but that requires translating the higher-level language into a more basic set of instructions the machine can understand. Compilers that efficiently perform that translation exist nowadays in large part due to the work of Aho and Jeffrey D. Ullman, professor emeritus of computer science at Stanford University. Their contribution to both the theory and practice of computer languages has earned them the 2020 ACM A.M. Turing Award.

"Compilers are responsible for generating the software that the world uses today, these trillion lines of software," Aho says. "They've been around for a long time, and the modern world wouldn't exist without programming language translators."

The two men helped develop formal language theory. They invented efficient algorithms for performing lexical analysis, syntax analysis, code generation, and code optimization, all essential tasks for a compiler. And they put that all together into Compilers: Principles, Techniques, and Tools, the definitive textbook on the subject, now known simply as "The Dragon Book" for its colorful cover illustration.


"No matter what you do in life, you need to understand how to use computers, which usually means being able to write at least simple code in some appropriate language."


The two met in 1963 while waiting in line to register for the Ph.D. program at Princeton University. Ullman, now 78, received his B.S. from Columbia that year, and Aho, 79, had just graduated from the University of Toronto with a B.S. in engineering physics. After each received his doctorate from Princeton a few months apart, in 1966 and 1967, they both joined Bell Laboratories, a hotbed for science and innovation.

"Academia has sort of always been my game plan," Ullman says, adding that he didn't feel assistant professors were particularly well treated at the time. "Today, assistant professors are really treated well in academia, especially in the field of computer science, where we recognize it's often the young people who have the greatest ideas," he says. When he was offered a job at Columbia he jumped at the chance, but kept traveling to Bell Labs about once a week to continue collaborating with Aho, until he moved to Stanford University in 1979. Aho, meanwhile, stayed at Bell for nearly 30 years, until he joined the Columbia faculty in 1995.

The researchers laid out a generic framework that provided the roadmap for how a compiler works. First comes lexical analysis, in which the compiler scans the program to identify tokens, which are analogous to the individual words in a sentence. Next comes a parser, which analyzes the syntax of the program, and checks whether the tokens fit together. "Not every sequence of tokens is a well-constructed program," Aho says. Next comes semantic analysis, which checks whether the sequences make sense as a whole.

uf1.jpg
Figure. Alfred Aho and Jeffrey Ullman in the corridor of their former office spaces at Bell Labs.

The two shrunk the size and increased the speed of a left-right parser, which aims to read the program in sequence from left to right without doubling back, but which is allowed to peek ahead to determine meaning. For instance, if the program wants to perform the operation A+BxC, the parser can see the A+ and know it hasn't reached the end of that particular element, so it keeps going. It eventually sees that it has to multiply B and C first, then add A, just because the creators of the compiler defined that order of precedent for arithmetic operations.

The trio of lexical, syntax, and semantic analysis makes up the front end of the compiler, which produces an intermediate representation of the program, which can then be optimized. If it turns out, for instance, that C equals 1, so that BxC=B, the program can be simplified by changing it to merely A+B. "That is a kind of optimization that can be performed to make the ultimate target program run faster," Aho says. The last step is to map the optimized intermediate to an assembly language, which can translate it directly into machine language for the particular type of computer.

Back to Top

Enter the Dragon

Aho and Ullman spelled all this out in the Dragon Book, which has since had two more editions and added Ravi Sethi, and later Monica Lam, as coauthors. It was Ullman's idea to depict a knight battling a dragon on the cover. The dragon represents "complexity of compiler design," Ullman explains, and "many of the weapons of the knight are the tools that we advocated people use," such as data flow analysis and syntax-directed translation. The first version showed a green dragon, the second was red, and the third, drawn by Ullman's son Scott, is purple.

The dragon was red when Krysta Svore, now head of the quantum computing group at Microsoft Research, was a student in Aho's class at Columbia. Even as advanced an application as a quantum program relies on the same ideas the two developed in the 1960s and 1970s. "What they're winning for is part of the foundation of what I and my team are working on for this next type of computer," she says. "What they developed is still critical for writing programs across smartphones, across quantum computers, across your laptop. These technologies are very much forever critical."


"What they developed is still critical for writing programs across smartphones, across quantum computers, across your laptop. These technologies are very much forever critical."


In Aho's class, he asked students to write their own domain-specific language. Svore wrote a quantum computer language, and enjoyed the experience so much that she asked Aho to become one of her thesis advisers.

The men streamlined the process of creating compilers even further by developing tools such as the Lex lexical analyzer generator, and the yacc parser generator, both of which automate the building of compiler components. Aho, along with Peter Weinberger and Brian Kernighan, also developed AWK, a "little language" to simplify some common data processing tasks. "To do these operations in C would require you to write 100 or 200 lines of C," Aho says. "We could specify these routine data processing tasks with one or two lines of AWK."

Aho and Ullman each have written several other books, both together and with other authors, including The Design and Analysis of Computing Algorithms with previous Turing Award recipient John Hopcroft, which laid down a framework for analyzing and teaching algorithms. They've received other awards for their work, including IEEE's John von Neumann Medal in 2010 for outstanding contributions to computer science.

The Turing Award comes with a $1-million cash prize that Aho and Ullman will split. Neither is sure yet what he'll do with the money. After a year of laying low due to the coronavirus pandemic, Ullman says, "I wouldn't mind going out to a restaurant and having a nice meal."

Back to Top

Broader Applications

The men say young people studying computer science, and even those who are not, could benefit from considering how thinking algorithmically fits into the wider world. "The future of computer science lies in applying the ideas more broadly," Aho says. "They should study not just computer science, but another discipline like biology, because there are fascinating opportunities for the application of computer science ideas to biological problems like how does the body work, how does the disease work?" The most intriguing question computer scientists might address, he suggests, is "how does the brain work?"

Ullman says everybody should be able to think about the world computationally, no matter what they do, just as people who are not professional writers ought to know how to write. Similarly, he says, everyone should be able to describe things precisely, even at such a high level of abstraction that it may not look like programming. "You've got to say what you mean and mean what you say," he says. "No matter what you do in life, you need to understand how to use computers, which usually means being able to write at least simple code in some appropriate language."

uf2.jpg
Figure. Watch the recipients discuss their work in the exclusive Communications video. https://cacm.acm.org/videos/2020-acm-turing-award

Back to Top

Author

Neil Savage is a science and technology writer based in Lowell, MA, USA.


©2021 ACM  0001-0782/21/6

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from [email protected] or fax (212) 869-0481.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2021 ACM, Inc.

3 Comments

CACM Administrator

[The following letter is published in the Letters to the Editor pages of the August 2021 Communications of the ACM (https://cacm.acm.org/magazines/2021/8/254319). -- CACM Administrator]

I was thrilled when I opened my email weeks ago to discover Aho and Ullman were the ACM A.M. Turing Award 2020 recipients, and I was delighted to see Al Aho and Jeff Ullman on the cover of June Communications when it came in the mail today.

But I was dismayed when reading the article by Neil Savage, which contains two glaring mistakes.

Ullman joined the faculty of Princeton, not Columbia, from Bell Labs. I know because I had the great fortune to learn from Professor Ullman at Princeton -- he broke open the palace gates of computer science for his students, and sparked a lifelong passion for CS.

The article states that Aho and Ullman are being awarded for "their contribution to both the theory and practice of computer languages," but it is for so much more than that -- they were also instrumental in the foundations of the Analysis of Algorithms, as the Turing citation indicates. As students at Princeton, we enjoyed the inherent beauty of a gorgeous and complex subject explicated by some of the finest minds and authors, using the seminal text "The Design and Analysis of Computer Algorithms" by Aho, Hopcroft, and Ullman, known affectionately as AHU. In addition, Ullman was also instrumental in advancing database theory, as well as codifying VLSI theory, as his textbooks and courses at Stanford University will attest.

Also noteworthy, Ullman was able to bring Unix 6th edition to Princeton -- in 1975, we were set up in a small lab in the Engineering Quad on a PDP-11/45. With great support from Bell Labs, Peter Eichenberger, Tom Lyon, Eric Schmidt, and I started a port of Unix to the IBM 370 architecture. None of that would have been possible without JDU.

It is fitting that Aho and Ullman are recognized for their many seminal contributions -- let's appreciate all the great work they did -- it is a lot more than "just compilers."

Joseph P. Skudlarek
ACM Senior Member
Lake Oswego, OR, USA

CACM Administrator

[The following letter is published in the Letters to the Editor pages of the September 2021 Communications of the ACM (https://cacm.acm.org/magazines/2021/9/255043). -- CACM Administrator]

The cover photo of the June 2021 edition of Communications of the ACM depicts Jeffrey Ullman and Alfred Aho, winners of ACM's 2020 Turing Award, and editorial content in the issue celebrates this selection. As ACM and Communications are well aware, for many Iranian members of the computing community, Ullman is the face of discrimination in academia. For more than 15 years, Ullman maintained a Web page that denigrates Iranian students using demeaning and derogatory language and told them they are not welcome in the computing community because of political reasons. The text is so clear and direct that it is hard not to see it as violating ACM's Policy Against Harassment, but of course it was "just" published on Ullman's page for more than a decade rather than being delivered at an ACM event, so in ACM's eyes it does not count.

For context: shortly after the award announcement, a public letter(1) signed by more than 1,200 individuals from academia and industry including over 450 ACM members condemned the decision and shared details of Ullman's discriminatory correspondence over the years. On April 19, ACM officially confirmed receiving the letter and published a response,2 in which they did not even recognize the victims and that any harm was done. In parallel to publishing celebratory content about the winners, ACM's Executive Committee (EC), upon the request of Communications Editor-in-Chief to intervene, decided to reject an Op-Ed submitted by myself and five colleagues in which we criticized ACM leadership's, in our opinion, weak and inadequate response to the "CS for inclusion" letter and proposed concrete steps to be taken to help the cause.

To summarize:

1. Despite the claim that ACM EC and Communications were unaware of Ullman's long history of discriminatory rhetoric, they were officially made aware of this since mid-April;
2. Yet Communications published celebratory content about Ullman two months after the award;
3. In parallel, ACM EC and Communications reject (as far as we know the only) "communication" done by the community through "Communications of ACM" about such an important issue, based on the reason that "enough has been said about it already;"
4. ACM still claims to hold D&I as core value,(3) and that it "cannot accept any conduct that discriminates or denigrates an individual on the basis of citizenship or nationality."

Of course, Communications can come up with logistic excuses, but they cannot be relevant when it comes to any issue of such importance with more than a month left until the publication date, and evidently Communications and ACM's EC could in fact coordinate quickly when they choose to. I hope the ACM EC and Communications either will have the courage to accept the harm they furthered by Ullman's selection and Communications' celebratory coverage of it and correct their stance now that it matters most, or alternatively not claim to truly care about diversity and inclusion as core values or to be really against discrimination based on citizenship or nationality.

Mohammad Mahmoody
Charlottesville, VA, USA

EDITOR-IN-CHIEF'S RESPONSE:
To air community concerns raised that the June issue of Communications did not present a balanced view, we are publishing the above letter.

The ACM's official response to the letter from CSforinclusion can be found at https://www.acm.org/response-to-letter(2) and addresses the current criteria and process for Turing award selection, and the efforts under way to improve them to live up to the values ACM has articulated.

With respect to the actions of Communications, here is a bit of context for how the production logistics of the June issue operate. For each month, final production for ~120 pages of editorial content begins approximately 10 weeks before the issue date. This content production was well under way weeks before the Turing selection was made. Shortly after the award selection is announced, an exceptional process is triggered to collect photographic and interview information to add to the "Turing issue." So this overload process, for the Turing issue and consistent with monthly publication, produced final layouts for the issue by May 1. With our physical publishers and mail delays, this is a hard constraint. The claim that logistics cannot be relevant belies the reality they are a constraint in any practical endeavor.

We regret any upset that inclusion of traditional content for an ACM A.M. Turing Award in the June 2021 issue caused but chose to do so based on the judgment to proceed in usual fashion until greater clarity emerged.

Andrew A. Chien
Chicago, IL, USA

REFERENCES:
(1.) https://csforinclusion.wordpress.com/
(2.) https://www.acm.org/response-to-letter
(3.) https://www.acm.org/about-acm/mission-vision-values-goals

CACM Administrator

[The following letter is published in the Letters to the Editor pages of the September 2021 Communications of the ACM (https://cacm.acm.org/magazines/2021/9/255043). -- CACM Administrator]

In Neil Savage's June 2021 news article "Getting Down to Basics" (p. 12) describing the work of 2020 ACM A.M. Turing Award recipients Alfred Aho and Jeffrey Ullman, I was surprised to read they were credited with the development of LEX and YACC. LEX was the work of Mike Lesk and Eric Schmidt, and YACC was the work of Stephen (Steve) Johnson (Aho provided some insight and came up with the name)these tools were released as Unix utilities, and all three scientists were employees of Bell Labs.

David M. Abrahamson
Dublin, Ireland

EDITOR-IN-CHIEF'S RESPONSE:

Thanks, David.

Andrew A. Chien
Chicago, IL, USA

Displaying all 3 comments