Opinion
Computing Applications Viewpoints

The Business of Software: The Ontology of Paper

The next generation of software engineering will involve designing systems without using paper-based formats, instead using software to develop software.

It is somewhat ironic that in our business we don’t use much software to produce software. Excepting dynamic testing, we don’t generally employ executable knowledge to create executable knowledge. We have compilers that convert a text form of instructions understandable to (some) humans into machine instructions executable by a machine. But a compiler doesn’t know what we are supposed to do, and it cannot generate or validate the external real-world functional knowledge. We have CASE tools and modeling tools that allow us to store some of the knowledge we need in a variety of formats such as directed graphs, tables, and dictionaries. But we have to put the knowledge into these tools; the tools cannot self-populate. And often, the only thing we can do with the knowledge we have put into these tools is retrieve it and look at it.

Knowledge in most software development tools does not usually execute. A tool may verify that the knowledge format conforms to some conventional modeling rules, but few tools allow us to automatically validate that the contained knowledge is correct and applicable to the situation at hand in the real world.
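The difference between checking a format and validating knowledge can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the order rule, the field names, and the helper functions are all hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass
from typing import Callable

# "Electronic paper": the rule is stored as text. A tool can check that the
# record is well formed, but not that the rule holds in the real world.
paper_rule = {"id": "R1", "text": "An order total must equal the sum of its line items."}

def is_well_formed(rule: dict) -> bool:
    """Format check only: required fields exist and are non-empty strings."""
    return all(isinstance(rule.get(k), str) and rule[k] for k in ("id", "text"))

# Executable knowledge: the same rule also carries a predicate that can be
# run against observed cases. (Hypothetical structure, for illustration.)
@dataclass
class ExecutableRule:
    id: str
    text: str
    check: Callable[[dict], bool]

order_rule = ExecutableRule(
    id="R1",
    text=paper_rule["text"],
    check=lambda order: order["total"] == sum(order["line_items"]),
)

def failing_cases(rule: ExecutableRule, cases: list) -> list:
    """Return the observed cases that contradict the rule."""
    return [case for case in cases if not rule.check(case)]

# Made-up observations; the second one violates the rule.
observed_orders = [
    {"total": 30, "line_items": [10, 20]},
    {"total": 50, "line_items": [10, 20]},
]
```

Here `is_well_formed(paper_rule)` can only confirm the shape of the record, while `failing_cases(order_rule, observed_orders)` points at the specific observation that disagrees with the stated knowledge.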

Electronic Paper

As rich and clever (and expensive) as these tools may be, they mostly fail the test of executability. They are more like electronic paper than they are software. The real work of software development is done in word processors, list-format data catalogs, and simple constrained drawing tools. More correctly, most of the work in software development is done in the individual and collective brains of the developers—the word processors and other tools are the nonexecutable book-like places we put the knowledge once we have it.

An IDE Whose Time Has Not Quite Come

Integrated development environments (IDEs) have come a long way since the days of programming using the vi editor, but they are still mostly word processors with some specific look-up and retrieval processes and just a few executing functions such as condition checking (which, of course, we have to define).

We spend most of our systems development effort collecting pieces of knowledge and putting them down on pieces of paper (electronic or otherwise) and it is interesting to consider just what the physical act of putting knowledge onto pieces of paper does to the knowledge (see the figure here).

Words are in sequence. When we put information into words on pieces of paper we apply a sequence. Words are in sequence. We can change the sequence words are in but in sequence words are nonetheless. So even if what we are describing is not sequential, when we put it in words, it will be.

Text is control oriented. The act of reading (and writing) words is a control-oriented task. There is always an associated point of attention—what is being read at that moment in time. The point of control generally moves from one statement to another according to the sequential behavior described previously. Putting ideas onto paper forces this control focus on the ideas we put on the paper.

Reading is single tasking. People do not usually read just one word at a time; they actually absorb patterns or blocks of several words at the same time. But our attention is mostly fixed at one place. If I am reading this word, I am not reading that word; if I am looking at this page, I am not looking at that page. Even if what we describe in words is concurrent, event-driven, and multi-tasking, once it is in words it will look like it is single-tasked.

Text has few connections. The only real relationship that exists between physical words is next-prior. This is the sequence of the words and it is tracked in a single-tasking mode using the place-holding reading control mechanism. We have developed different flavors of next-prior at different levels of abstraction to approximate different types of connections. We group sets of concepts into sentences and then into paragraphs and into chapters. We assign different purposes to different chapters to use the physical proximity to model the logical relationship. We indent paragraphs to show decomposition or ownership and there is usually some implicit relationship of knowledge elements at the same indentation level. If we want to reinforce a sequential relationship we may assign numbers to the paragraphs. If we want to de-emphasize sequence and imply equivalence, we may annotate the paragraphs with bullets.

In the electronic version of the written word we are able to embed hyperlinks to “join” concepts together that are physically separated in the document. But while the hyperlink allows physical connection of ideas and quicker transition between them, it doesn’t clearly show proximity of the knowledge. On paper, the physical proximity of ideas is still the strongest mechanism for associating knowledge.
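The contrast between next-prior and richer connections can be made concrete with a small Python sketch. The four ideas and the relationship labels below are hypothetical, chosen only to illustrate the point.

```python
# Ideas as they appear on a page: position is the only relationship.
page = ["requirement", "design", "code", "test"]

def neighbors_on_page(idea: str):
    """Return the (prior, next) ideas, the only links a sequence provides."""
    i = page.index(idea)
    prior = page[i - 1] if i > 0 else None
    following = page[i + 1] if i < len(page) - 1 else None
    return prior, following

# The same ideas as a graph with named relationships (hypothetical labels).
links = [
    ("design", "satisfies", "requirement"),
    ("code", "implements", "design"),
    ("test", "verifies", "requirement"),
    ("test", "exercises", "code"),
]

def related(idea: str) -> set:
    """Return every (source, relation, target) link that touches the idea."""
    return {link for link in links if idea in (link[0], link[2])}
```

On the page, "test" sits three positions away from "requirement", so `neighbors_on_page("requirement")` never mentions it. In the graph, `related("requirement")` returns the "verifies" link directly, whatever order the ideas happen to be stored in.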


Words have limited scope. There is a relationship between the understandability of a document and the number of words. We manage this complexity by the mechanisms I’ve described and by adjusting the abstraction level of the words we use. We can use fewer words, but only if the meaning of these words (which is often located elsewhere) is more “knowledgeful.” We see this clearly in certain disciplines such as the medical profession, which has created a domain-specific lexicon to allow more efficient communication. We have the option of using fewer and simpler words but then we inevitably convey less meaning since the sentences must contain less knowledge.

Insular association. Lastly, within one document, it is very difficult to associate ideas that are contained in another document. We do this through indexes and citations, but the ideas are still “further away.” When we want to really bring them together we have to rewrite the ideas and include them in the source document (with attribution). This uses physical proximity of the words to emphasize the logical relationship of the ideas carried by the words.

Paper Is Good For…

If we were to use words and paper (in either the dead-wood or the electronic form) to describe the behavior of a system, we would expect this to work well for systems whose knowledge maps closely to the ontology of the medium. Specifically, these are systems that are sequential, single tasking, and control based; that consist mostly of localized knowledge (the proximate knowledge is related and related knowledge is proximate), with all the necessary knowledge available locally; and whose statements do not require extensive external reference, because either the definitions are simple or the complexity is abstracted into the language statements. And, of course, the knowledge would not be too “big” or too “complex.”

Paper Is Not Good For…

On the other hand, we should expect that describing the operation of systems using paper formats would not work well for systems that are multi-tasking and event-driven (especially where events arrive out of state or time sequence), that need to interface with other knowledge repositories (especially where the abstraction level has not been defined and standardized), and that are large and complex.

If we look at the types of systems we were building a couple of decades ago, they fit the strengths of paper well. They were siloed, insular, control-based sequential systems that were, compared to today’s systems, rather simple. But we don’t build these kinds of systems much anymore. The systems we are building now are those for which the ontology of paper does not work very well and we have to rethink how we think about systems. Given that software is a knowledge medium, the future of software engineering clearly lies in constructing software artifacts using media, tools, and representational forms that:

  • Can integrate knowledge both locally and remotely,
  • Use different (and preferably programmable) representational forms than text that better lend themselves to the time- and state-based structure of the problem we are describing,
  • Allow linking and manipulation of ideas remotely (for both machine operation and human understanding) and, most importantly…
  • Are executable.
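As one possible illustration of a state-based, executable representation, the following Python sketch stores a system's behavior as a transition table rather than as prose. The order lifecycle, its states, and its events are hypothetical; this is a minimal sketch under those assumptions, not a proposal for any specific tool.

```python
# Hypothetical order lifecycle as data: (state, event) -> next state.
TRANSITIONS = {
    ("created", "pay"): "paid",
    ("created", "cancel"): "cancelled",
    ("paid", "ship"): "shipped",
    ("paid", "cancel"): "refunded",
}

def run(events, state="created"):
    """Apply events in arrival order; reject any event not allowed in the current state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state

def unreachable_states(start="created"):
    """Return states named in the table that no event sequence from `start` can reach."""
    all_states = {s for s, _ in TRANSITIONS} | set(TRANSITIONS.values())
    seen, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for (source, _), target in TRANSITIONS.items():
            if source == current and target not in seen:
                seen.add(target)
                frontier.append(target)
    return all_states - seen
```

Because the description is data that executes, it can be exercised directly: `run(["pay", "ship"])` walks the model, an out-of-sequence event such as `run(["ship"])` is rejected rather than silently read past, and `unreachable_states()` checks a property of the model itself that a page of text would leave to the reader.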

Steam Engines

It is a common cliché that we are currently going through an “Information Revolution” that may be more profound in its consequences than the development of steam engines was in the Industrial Revolution. This is undoubtedly true, but it is missing one important point: the Industrial Revolution did not occur when we built steam engines, it occurred when we used steam engines to build steam engines.

The true information and computing revolution will not occur until we use software to build software; until we really use executable knowledge to create executable knowledge.

And lose that paper.

Figures

Figure. The ontology of paper.


Communications of the ACM (CACM) is now a fully Open Access publication.
