Seeing the Sort: The Aesthetic and Industrial Defense of “The Algorithm”

Christian Sandvig

Associate Professor, Department of Communication Studies and School of Information, University of Michigan

In the 1960s, computer programmers at TRW wrote to a leading journal to complain about the ambiguity of the word “algorithm.” They asked: Was it or was it not the same as a mathematical formula? [1] In reply, computer scientist Donald Knuth argued that an algorithm was not a formula, but rather a word computer science needed to describe a strategy or “an abstract method” for accomplishing a task with a computer. While a mathematical formula or a computer program can be thought of as a finite set of instructions represented in a particular way (e.g., in a computer programming language), an algorithm could be an idea “divorced” from a mechanism that implements it. [2] Even in this early debate the difficulty of representing algorithms troubled Knuth. He wrote that an algorithm must be more than a computer program, but also that “I am forced to admit that I don’t know any way to define any particular algorithm except in a programming language.” [3] This anecdote announces the complexity of representing algorithms – both what is intended and what is actually happening when a computer “runs” software (a metaphor in a domain teeming with metaphors). Understanding the “abstract method” behind what computers are doing is a daunting task when the software that implements the method is itself “almost intangible.” [4]

In this essay I argue that an important recent development in the struggle to represent algorithms is that computer algorithms now have their own public relations. That is, they have both a public-facing identity and new promotional discourses that depict them as efficient, valuable, powerful, and objective. It is vital that we understand how the algorithms that dominate our experience operate upon us. Yet commercial companies now systematically manage our image of algorithms and the information we receive about them – a recent phenomenon. Algorithms themselves, rather than just the companies that operate them, have become the subject of mass marketing claims. To make this clear, I analyze a variety of visual and multimedia depictions of algorithms. I begin by reviewing historical and contemporary attempts to represent algorithms for novices in educational settings, and then I compare these to recent commercial depictions. I conclude with a critique of current trends and a call for a countervisuality that can resist them.

Seeing the Sort

The most common application of the word “algorithm” is to a sorting task. While a computer science student might be taught a low-level algorithm that orders a set of disorganized data, e.g., sorting an array of numbers from lowest to highest, today there is a higher-level meaning of the phrase “the Google algorithm” in colloquial use. This “algorithm” refers to the overall process by which Google’s computers sort all Web pages known to Google and display some of them to the user in response to a keyword query. References to “the Google algorithm” (and the Facebook news feed algorithm, and the Netflix recommender algorithm, and so on) are typically shorthand for high-level assemblies of many sub-algorithms, all eventually implemented and running as computer programs. While algorithms are directed to many goals, sorting is an apt umbrella category to highlight because it evokes both the computational process of ordering information and the significance of sorting in social theory, where technology is often seen as a way to sort and classify people. [5]
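
To make the low-level sense of the word concrete, here is a minimal sketch in Python (one arbitrary representation among many, offered only as an illustration) of the kind of sorting algorithm a student might first encounter:

    def insertion_sort(data):
        # Sort a list of numbers from lowest to highest, in place.
        for i in range(1, len(data)):
            value = data[i]
            j = i - 1
            # Shift larger values one slot to the right...
            while j >= 0 and data[j] > value:
                data[j + 1] = data[j]
                j -= 1
            # ...and drop the value into the gap that remains.
            data[j + 1] = value
        return data

    print(insertion_sort([5, 2, 9, 1, 7]))  # prints [1, 2, 5, 7, 9]

The “abstract method” here – repeatedly inserting each value into an already-sorted prefix – could be expressed in any language; the Python above is only one representation of it, which is exactly Knuth’s point.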

The word “algorithm” as used in the phrase “the Google algorithm” was unknown in the popular consciousness two decades ago. In 1997, the Internet was already important, but an influential survey of ideas about the future of the Internet did not contain the word “algorithm,” or even particularly emphasize the role of the Internet in sorting or organizing information. [6] Instead, many experts believed that the future Internet would primarily be an infrastructure of connection, not of sorting or organization. They emphasized that the Internet would allow a user to interact with many distant people, to jointly experience fantastical and/or distant places, or to access library catalogs around the world.

This is not a surprising oversight. To the typical Internet user of that era, computers did not usually sort content in any way that was meaningful. They did not sort (recommend) music or movies, e-mail was not automatically highlighted as “important” or “spam,” and search engines were not particularly useful. (In fact, the first effective search engine, AltaVista, was not foreseen because at the time many commentators believed it would not be possible for a computer to efficiently sort the entire Web. [7]) Using a computer to get driving directions was impossible. Internet functionality similar to social media (e.g., Unix’s talk and finger with its .plan and .project files) and online publications existed, but computer interfaces were usually not personalized and the content they displayed was sorted by date. Many personal computers even lacked the ability to sort the user’s own files in a helpful way.

The Internet today is therefore the result of a major transformation, as sorting algorithms are now at the center of everyday user experience online. The design of these algorithms raises important normative questions. [8] Most or even all of the mediated content we experience is generated by an algorithmic sort. The use of search engines is now routine; news, video, and social media content are almost always ordered in a personalized way based on an algorithm’s judgment about relevance to the user, and most online experiences are bordered by long sidebars of advertisements chosen after an algorithm analyzes user demographics and behavior. Ubiquitous cultural products like video games and television shows now have “more in common with the finance software Quicken than” with the games or other media products of just a decade ago: many more aspects of the production, selection, and display of media products are now computed, leading Galloway to name our current moment an “algorithmic culture.” [9]

The increasing reliance on computing, and thus algorithms, means that production processes are now more difficult to see. The visitor to the Ford factory, that archetype of industrialization, is able to look down from a catwalk and follow the assembly line as parts are brought together to eventually form a new whole. In general terms the “algorithm” or strategy of the factory is available to the naked eye of the visitor. Although the “factories” of information that produce our online experience are industrial in scale, they reveal no clues about their inner processes. In a Google, Twitter, or Microsoft data center, racks of identical computers sit motionless. The servers do hum continuously, and a few LED lights decorate them, but these provide no indication as to how information is being manipulated. The algorithm is not available to the visitor. In fact, many data centers are visited in person only for maintenance or repair and as a consequence the interior spaces are usually not lit. [10] In the production processes of Internet platforms, the “black box” is a black box.

Flowcharting the Algorithm

Our everyday experience is now intertwined with algorithms, but understanding what any particular algorithm is doing can be a difficult intellectual problem. The representation of what is happening inside a computer has always been a problem for the computer programmer – the person who is supposed to be the expert on the situation. Diagrams and visualization have been seen as central to the task of programming. [11] Computing pioneers such as John von Neumann diagrammed programs in the 1940s [12] to explain them to other experts, and programmers even created a standardized system of symbols that would allow them to draw algorithms graphically (codified as ISO standard 1028 in 1973). [13] In the era when access to computer time was scarce, programmers drew a diagram of the intended program first with pencil and paper, sometimes using tools like the plastic flowcharting stencil issued to all programmers at IBM. [14] The IBM stencil even included a printed sleeve of drawing instructions (Fig. 1).

IBM Flowcharting Template, c. 1980. User:Wtshymanski. Photograph (detail). Wikipedia. Public Domain. http://en.wikipedia.org/wiki/File:Flowchart-template.jpg

These flowcharts may have evolved from industrial process engineering, and they improved on the list or recipe format of prose, at minimum by using arrows and two-dimensional space to handle the branching and parallelism that a program’s varying flows can produce at run time. Over the following decades the typical programmer gained easier access to computer time, and programming became more interactive. There was no longer a need to flowchart algorithms with pencil and paper in advance, and flowcharting fell into disuse. [15]

The art of flowcharting remains relevant as a way to “see” the invisible actions of computers in some contexts, and it has been broadly influential in computing. For instance, logical diagrams are commonly used to describe computer networks. In fact, the “cloud” symbol from the plastic template for computer networking has become a popular umbrella term in the industry. Today, “cloud computing” is a system architecture that stores and processes information at some point distant from the user’s device. While many commentators seem to think that this is called “the cloud” because this architecture shares common features with the clouds in the sky, in fact the term comes from the symbology of the network diagram, where a cloud symbol indicates a part of the diagram whose internal details are irrelevant. [16] This makes the popular use of “the cloud” a metonym and not a metaphor – cloud computing is not so named because it is fluffy or white – and demonstrates the influence of some of these otherwise obscure plastic stencils from the 1970s.

Another area where diagrams remain relevant is software patent applications. Patent applications are required to “clearly and precisely inform” the technical reader or they may be summarily rejected, [17] and many software patents turn to flowcharts to meet this requirement, even though the original programmers of these algorithms may never have used them. For example, in the late 1990s, Amazon changed online shopping by inventing and popularizing a series of algorithms known as item-to-item collaborative filters. These algorithms allowed Amazon to suggest new purchases (“Customers Who Bought This Item Also Bought”) and to mine each user’s purchase history to construct a list of personalized “instant recommendations.” Amazon represented these algorithms in patent applications with text-heavy flowcharts, and some of the text is even relatively accessible to non-experts (e.g., Fig. 2).

A sequence of steps that are performed by the recommendation process, 2001, Jennifer A. Jacobi, Eric A. Benson, and Gregory D. Linden. From: “Personalized recommendations of items represented within a database.” US Patent #US7113917 B2. Public Domain.

Patent applications – like the laws they serve – are strongholds of tradition. The flowcharts found in patents may describe the present or future but they hew to otherwise-vanished aesthetic traditions of drafting. The Amazon patents, for instance, show the pencil work of a draughtsman although they depict only a linear set of steps in a flowchart that could more easily have been typed as prose.

Sorting Demonstrations as a Learning Tool

In computer history, flowcharts and algorithmic visualizations were meant for expert viewers; computer users were not, or not often, the intended audience. More recently, however, a great deal of effort has been put into attracting the next generation of computer programmers to the profession, an effort that often involves depicting and explaining algorithms.

The most common use of algorithm visualization today is in computer science education, where simple algorithms are depicted graphically in introductory programming courses in order to inculcate “algorithmic thinking”: the idea that minor changes in how a program is constructed can dramatically alter the result, usually judged by its efficiency. Educational materials designed to bring newcomers into a professional field necessarily contain an element of public relations. Educational depictions of computers likely aim to present the technical details of computing as not just scientifically accurate or comprehensible but also fun, exciting, and interesting.

Visually comparing algorithms for fun has a long history. During the 1980s, the BASIC programming language introduced a generation of future programmers to programming personal computers via the IBM PC and its clones. The popular Microsoft QuickBASIC compiler was distributed with the free program SORTDEMO.BAS, which allowed the user to graphically compare sorting algorithms by using animations to represent the transformations of data in memory. [18]

An example run of SORTDEMO.BAS, 1988. Taken from the QuickBasic 4.5 Program Disks and run on the QB64 compiler by the author.

Unlike the flowchart of a single sorting algorithm, SORTDEMO.BAS helped beginning programmers understand that the computer could perform the same task in a variety of different ways. It did so by graphically representing example data being manipulated by the algorithm, not by representing the algorithm itself as a flowchart would. When run, SORTDEMO.BAS depicts an array of random numbers as a series of colored bars, differentiating each datum and implying its uniqueness by randomly varying its color. The width of each bar represents the size of the value. The visualization is technically a bar chart, with memory registers on the Y-axis and values on the X-axis. The user may then select one of six basic comparison sort algorithms (Insertion, Bubble, Heap, Exchange, Shell, and Quick) by pressing the first letter of the algorithm’s name. These choices included algorithms that were widely known at the time to be inefficient. [19]

When a key is pressed, the selected algorithm begins to organize the colored bars. The algorithm itself is unseen, but a simple sonification plays a brief note at the pitch representing the current row in the array where the algorithm is performing a comparison. The algorithm’s activity is thus heard while the results (the organization of the series) are seen. As each of the six possible algorithms performs comparisons in a different order, sonification dramatically distinguishes one algorithm from another, leading users to say, “heapsort is like techno [music]”. [20] A timer provides the excitement of a race and allows the user to compare the performance of each algorithm. Another user comments: “I loved playing with this…when I was five and didn’t understand the principles of what it was demonstrating.”
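
The mechanics of the demo are simple enough to reconstruct. The sketch below is my own rough Python approximation, not Microsoft’s BASIC code: it runs a bubble sort, prints a stand-in “note” whose pitch tracks the row being compared, and redraws the array as bars after each pass:

    import random

    def bubble_sort_demo(data):
        # A reconstruction of SORTDEMO's two channels: sound and bars.
        n = len(data)
        for i in range(n - 1):
            for j in range(n - 1 - i):
                # Sonification stand-in: pitch tracks the row compared.
                print("beep at %d Hz" % (220 + 20 * j))
                if data[j] > data[j + 1]:
                    data[j], data[j + 1] = data[j + 1], data[j]
            # Visualization: one bar per memory register, width = value.
            for value in data:
                print("#" * value)
            print()

    bubble_sort_demo([random.randint(1, 20) for _ in range(8)])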

The Educational Aesthetics of Algorithms

SORTDEMO.BAS employed the visual conventions of a bar graph, and indeed its output was very similar to some of the bar charts that could be produced in Lotus 1-2-3, the bestselling spreadsheet application on the PC at the time. Educational animations in computer science typically borrow the clean aesthetic of scientific visualization and the financial chart. Bostock’s recent visualizations of algorithms exemplify this style. Notably, in some of Bostock’s work the data involved in executing the algorithm is visualized along with the algorithm’s output to help the viewer understand the strategy involved. In Figure 4, for example, an algorithm draws possible dots (samples) in a two-dimensional space and judges the most useful dot (in red) based on its distance from the others – a decision shown here as an animated expanding circle. The visual style evokes both minimalism and a geometry textbook. [21]

Mitchell’s Best Candidate II, 2014, Mike Bostock. Screen capture from a sample run by the author. Javascript. To animate, see: http://bl.ocks.org/mbostock/d7bf3bd67d00ed79695b
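
The strategy Bostock animates is compact enough to state in code. Here is a minimal Python sketch of one step, assuming k random candidates per new sample (the function and parameter names are mine, not Bostock’s):

    import math
    import random

    def best_candidate(existing, k=10):
        # Mitchell's best-candidate step: draw k random candidates and
        # keep the one farthest from its nearest existing sample.
        def nearest_distance(p):
            return min(math.dist(p, q) for q in existing)
        candidates = [(random.random(), random.random()) for _ in range(k)]
        return max(candidates, key=nearest_distance)

    samples = [(random.random(), random.random())]  # seed with one sample
    for _ in range(20):
        samples.append(best_candidate(samples))

The expanding circle in Fig. 4 is this nearest-distance comparison made visible.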

Children’s television is another common point of reference. Visualizations of algorithms include children or children’s toys to imply that the complex processes inside the computer are easy to learn. Basic algorithms are illustrated for undergraduates by computer science graduate students who sort children’s blocks on a table [22] or by stop-motion construction of Lego bricks [23]. Children are hailed directly by chipper narrators who provide the voice-over for cartoons depicting cute robots [24]. The cheerful robot is a common character that represents the computer’s agency, as a robot can easily imply action without human intervention.

To be sure, the presentation of a computer as childlike and algorithms as playful is more likely to be a genuine depiction of feeling on the part of the visualization designers than a cynical move to drum up more future programmers. Many computer scientists see computers as toys, and to them algorithms are indeed whimsical and fun. The Algo-Rythmics Project is one example that exemplifies this affect and also demonstrates the wide range of form and aesthetic that is possible and useful in the representation of algorithms. [25]

The Algo-Rythmics project pairs basic sorting algorithms from an introductory computer science course with folk dances in Romania (Transylvania). Figure 5 depicts the same algorithm shown earlier by SORTDEMO.BAS (Fig. 3), but here human dancers replace the dancing colored bars and the movement of values is horizontal rather than vertical.

Quick-sort with Hungarian (Küküllőmenti legényes) folk dance, 2011, The Algo-Rythmics, YouTube, © 2011 The Algo-Rythmics. Used with permission.

In the video, each dancer embodies a number (which they are wearing) in an array (depicted via a projection on the curtain behind them). The computer’s clock is referenced by the vibrant Hungarian folk music scored over the dancers. The strategy of quicksort is then represented via the choreography of an adapted version of Küküllőmenti legényes, a men’s Hungarian folk dance involving a freestyle virtuoso performance in traditional tall boots and clothing performed in front of a band. For instance, a comparison between two values is indicated in the dance by downstage movement. The operands wear hats, and the first operand (in a quicksort, known as the pivot) wears a hat with a flower. If the comparison operator evaluates as false (the first operand is greater than the second one), one of the dancers executes a retrograde (reverse) phrase. [26] However, a comparison that evaluates as true (the first operand is less than the second) produces an extensive round of boot slapping. Successfully sorted dancers turn to face the upstage curtain and mostly stop moving.

Whenever the algorithm creates a partition (at 1:40 and elsewhere), the dancers shout “Oszd meg és uralkodj!” (“divide and conquer” in Hungarian). Hoare’s quicksort is one of many algorithms that adopt an overall strategy known in computer science as “divide and conquer” – breaking down a problem into identical sub-problems. Although programmers and applied mathematicians borrowed this exhortation from politics in the latter half of the 20th century, the video reconnects the phrase to its original context. Derived from the Latin divide et impera (divide and rule), to a present-day ethnic Hungarian the phrase and this particular style of dancing will invoke the “lost territories” ceded to Romania in the Treaty of Trianon at the end of World War I. [27] The subdivided array and the lateral movement of the dancers on stage thus represent both computer memory and national memory, territory on the storage medium and on a map of the world.
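
For readers who prefer code to choreography, the strategy the dancers perform can be sketched in a few lines of Python (a simplified rendering of the idea, not Hoare’s in-place version):

    def quicksort(values):
        # Divide and conquer: partition around a pivot, then recurse.
        if len(values) <= 1:
            return values  # a sorted dancer turns upstage and stops
        pivot, rest = values[0], values[1:]  # the pivot wears the flowered hat
        smaller = [v for v in rest if v < pivot]  # comparison true: boot slapping
        larger = [v for v in rest if v >= pivot]  # comparison false: retrograde
        # "Oszd meg es uralkodj!" -- solve each sub-problem the same way.
        return quicksort(smaller) + [pivot] + quicksort(larger)

    print(quicksort([3, 9, 1, 7, 5]))  # prints [1, 3, 5, 7, 9]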

Although this dance is intended as an educational scientific visualization, its ceaseless energy also evokes what art historian Linda Henderson has termed “vibratory modernism,” a movement in art marked by borrowings from science and by a fascination with using implied motion to represent invisible thoughts and energies. [28] Vibratory modernism is usually dated to the 19th and early 20th century and stems from the problem of representing invisible electromagnetic waves. [29] I do not mean to suggest that these algorithm visualizations are strict parallels to artwork from the birth of modernism, but rather that the problem of visually representing invisible activity is the same, and this produces similar representational tactics, such as vibration. The continual motion of the dancers – even simple swaying back and forth – is the flashing disk access light on the computer in the server room, an externalization of otherwise invisible energies. One key purpose of this dance is to imbue a boring box, or a software process that produces a seemingly instantaneous result, with a feeling of continual dynamism.

Commercial Depiction of “The Algorithm”

In contrast with the educational and scientific depictions of algorithms I have reviewed so far, in commercial depictions of algorithmic systems the algorithm itself has been almost entirely absent. This may be because computer science education aims to sensitize the viewer to difference and variation in process (e.g., by comparing algorithms), while commercial motivations typically allow for efficiency comparisons only with competitors.

There are also reasons not to advertise algorithms. Although sorting and filtering algorithms feature prominently in the user experience of all online platforms, the algorithms in wide use on the Internet have been sources of controversy and liability for the companies that operate them. For instance, the Google search algorithm is often under fire for censorship and bias: e.g., for sorting some Web search results as more relevant than others. [30] In 2014, it emerged that Facebook had manipulated its News Feed algorithm in an attempt to show that varying the positive and negative emotional content of a user’s feed would produce a variation in the user’s own emotions. [31] This modification of the Facebook algorithm produced a major public backlash against the manipulation, but also against the existence of the algorithm itself and its normal operation. [32] The Twitter “trending topics” sorting algorithm came under sustained public attack for allegedly misrepresenting the scale of public protests during the Occupy demonstrations in the USA. [33] Corporate algorithms are also considered to be trade secrets. [34]

When algorithms are mentioned at all, platform providers often encourage the notion that their algorithms operate without any human intervention, and that they are not designed but rather “discovered” or invented as the logical pinnacle of science and engineering research in the area. More than just computer programs, “algorithms are…stabilizers of trust, practical and symbolic assurances that their evaluations are fair and accurate, free from subjectivity, error, or attempted influence” [35]. For example, Edelman collected a set of public statements that Google made about its own search algorithm. Almost all of these conveyed the sentiment that “Our search results are generated completely objectively and are independent of the beliefs and preferences of those who work at Google.” [36] Edelman then used example queries to document that the Google search algorithm likely includes a hard-coded (in other words, unvarying) instruction to always display a particular site first (e.g., Google Health) when given certain keywords. This directly contradicts public relations statements made about the operation of the algorithm.
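
The difference Edelman documented is easy to express in code. The sketch below is entirely hypothetical – the function, table, and scoring rule are my inventions, not Google’s – but it shows the general shape of an unvarying exception grafted onto an otherwise “objective” sort:

    # Hypothetical sketch; not Google's code. All names are invented.
    HARD_CODED = {"health": "google.com/health"}

    def relevance_score(url):
        # Stand-in for the platform's real, secret scoring function.
        return len(url)  # arbitrary placeholder

    def rank_results(query, results):
        ranked = sorted(results, key=relevance_score, reverse=True)
        for keyword, site in HARD_CODED.items():
            # The unvarying instruction: certain keywords always put
            # a particular house property first.
            if keyword in query and site in ranked:
                ranked.remove(site)
                ranked.insert(0, site)
        return ranked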

Search engines have intentionally highlighted the performance of their search algorithms in advertising, but often in a general way that praises the platform as a whole or attacks a competitor as a whole. Such advertising does not market the process (and therefore does not have to explain the process). In a 2012 attempt to gain market share from Google search, the Bing search engine launched “Bing It On” – a Web site and an associated marketing campaign. [37] Bing It On invited visitors to run the same search on both Bing and Google and then compare the results: an algorithmic taste test. Yet the algorithm itself was not mentioned, simply the claim that “people prefer Bing over Google.”

Internet platforms periodically give their algorithms a major overhaul, and they name or number them in a way similar to operating systems. Google Hummingbird, Penguin, and Panda were all important algorithms, and as I write this, Google Pigeon returns my results when I search Google on the Web. [38] These names are sometimes only used internally, but some companies use them as opportunities for marketing. When the search engine Ask.com (formerly “Ask Jeeves”) launched its new “Edison” search algorithm, it also mounted an extensive and significant advertising campaign that foregrounded the notion of “the algorithm” to a mass audience for the first time.

Advertising history features Kentucky Fried Chicken’s Colonel Sanders, who, according to KFC advertisements, succeeded because of his secret recipe of 11 herbs and spices (the recipe is allegedly locked in a safe in Louisville, Kentucky). The Ask.com search engine boldly declared that its success was based on its secret algorithm. Like Colonel Sanders, Ask.com promoted its trade secret recipe as an idea without revealing it or explaining it. Ask.com attempted to insert the word “algorithm” into the popular consciousness with a series of mysterious billboards that were initially unbranded (Fig. 6). Billboards read:

THE ALGORITHM CONSTANTLY FINDS JESUS
THE ALGORITHM KILLED JEEVES
THE ALGORITHM IS BANNED IN CHINA
THE ALGORITHM IS FROM JERSEY
THE UNABOMBER HATES THE ALGORITHM

Jesus and the Algorithm, May 8, 2007, Thomas Hawke, Digital photograph, Silver Terrace, San Francisco, CA. Used with permission (CC BY-NC 2.0 licensed).

In one follow-up television commercial, a boy asked his father “Do you have a lame algorithm?” [39] At the time, the Ask.com CEO explained: “The goal [of the campaign] is to incite a consumer conversation around the importance of a search engine’s algorithm and its integral role in making one engine different from another.” [40] Ask.com also created a companion Web site at thealgorithm.com (now removed). Coincidentally, Ask.com also launched a television advertisement that employed dancers on a stage – perhaps prefiguring the Algo-Rythmics of Fig. 5 – to illustrate an Ask.com search for “chicks with swords.” However, although the advertisement employed the tagline “The Algorithm,” the Ask.com commercial used dancers (women with swords) as the search result and their movement did not illustrate the process of the algorithm, only its success. [41]

The Ask.com campaign received mixed reviews in the marketing trade press, but it signaled a new awareness of “the algorithm” even if it did not cause one. Over the six years after this campaign there would be a five-fold increase in the number of times the word “algorithm” appeared in the major newspapers of the world. [42] Discussion of an “algorithm” appeared in the press in the context of automated financial trading, mathematics, online dating, and music. This was also the era of the Netflix Prize, a public contest to refine the Netflix recommendation algorithm. [43] The idea of the algorithm or “algorithmic literacy” was previously only found in computer science education, but it was now referred to as something every user should know in order to understand – or protect themselves from – the computer systems that they used every day. [44] Still, media campaigns like Ask.com’s “Ask the algorithm” were more mystical than explanatory, and the need for a broader popular understanding of commercial algorithms continued to grow.

Algorithm Seers and Brokers

Given the vacuum of information about how the algorithms of major Internet platforms work, a spate of third-party marketing consultants, trade press publications, and software products rushed to fill the void. Some of these evolved into respected and useful sources of commentary on major Internet companies (e.g., Search Engine Watch). Others advertised their own services by offering advice about and depictions of important algorithms that are operated by others.

Two notable examples of the latter involve the Web sites EdgeRank.net and whatisedgerank.com. “EdgeRank” was the name of the Facebook news feed algorithm used until 2011. [45] These two sites are examples of the many marketing firms and third-party developers that offered consulting services and software claiming to optimize a user’s reach on Facebook – promising to leverage knowledge about the algorithm into a more valuable audience for Facebook posts.

Neither Web site was affiliated with Facebook, and both purported to describe how to achieve more attention for Facebook posts on the basis of information that was likely gleaned, without attribution, from a particular Facebook presentation at a 2010 developer conference. [46] As the Facebook algorithm is a secret, it is impossible to know how well these sites represent the algorithm – a universal problem for public depictions of commercial algorithms. (The site whatisedgerank.com [now defunct] launched after EdgeRank was retired at Facebook, providing some cause for skepticism.)

Presenting EdgeRank: A Guide to Facebook’s Newsfeed Algorithm (excerpt), 2011, Jeff Widman. Used with permission. © 2011 Jeff Widman.

Whatisedgerank.com employed a muted professional design of grey and white with sans serif headings that likely led many visitors to assume it was associated with Facebook. The Web design is similar to the Facebook help pages and developer documentation. [47] In contrast, EdgeRank.net evokes the visual style of simple personal home pages popular among early Web professionals, which were often littered with clip art and incongruous images. One section of the single dominant graphic at EdgeRank.net (Fig. 7) depicts the Facebook news feed curation algorithm as what is apparently a line drawing of a 19th Century belt-driven grist mill.

This figure is clearly not meant to be a literal technical diagram – no one is suggesting that the Facebook algorithm is a 19th Century mill. The mill’s appearance seems as whimsical as the dancing Hungarians discussed above. Yet the line drawing uses technical illustration as a form, recalling Figures 1 and 2. The fact that the components of the flowcharts discussed earlier are arbitrary symbols, and thus easier to draw, makes them no less illustrations than an exploded diagram of a car’s engine or this drawing of a mill – they all aim to visualize the inside of a technology for the purpose of explanation. This drawing of a mill combines that technical aesthetic with references to industrial production processes and mathematical formulae. Note, e.g., the subscripts meaning “each” in the context of a Greek sigma shown elsewhere on the page. The picture also implies simplicity by relying on a simple antique technology with few parts as its visual vocabulary.

The logic of the diagram is as follows: for each piece of content added to Facebook, an “affinity score” (a measure of the relationship between a user and the post), a “weight” (a way to privilege some types of content over others), and a “time” (a penalty for older content) are put into metal hoppers and inserted between a circular moving bed stone and a runner stone, both encased in barrel staves. This produces an EdgeRank (or flour) when mixed, although mixing is not shown. The image eschews the “vibratory” sense of motion discussed above but adds a looming feeling of industrial scale. In an actual gristmill each millstone could easily weigh 2,000 lbs. The giant data centers are the millstones of Facebook, and they act upon each datum they grind as a one-ton rock acts upon a tiny seed of wheat.
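
The sigma on the page refers to a formula that circulated after the 2010 developer presentation mentioned earlier [46]. As commonly reported – Facebook never confirmed the details, so this is a reconstruction from secondhand accounts – the score multiplies the three hopper inputs and sums the products over every “edge” e (every interaction) connecting a user to a post, where u_e is the affinity score, w_e the weight, and d_e the time decay:

    \text{EdgeRank} = \sum_{\text{edges } e} u_e \, w_e \, d_e

Grinding, in other words, stands in for multiplication and summation.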

This diagram was probably chosen and annotated rather than drawn, although the source is unattributed. A major consequence of the choice of this mill to illustrate the Facebook algorithm is that the action – the grinding – in this drawing cannot be seen. Other diagrams of mills are available where the machinery of the millstone is revealed via an exploded view. In the chosen drawing, in contrast, the millstones are not shown and although the inputs are visible the algorithm is referenced as a kind of black box. Legal scholar Frank Pasquale has called the prevalence of algorithms in daily life a regime of “secret judgments,” and this picture depicts a secret: machinery hidden behind the barrel staves. [48]

PageRank Public Relations

Pasquale and his co-author Oren Bracha addressed the normative implications of these “secret judgments” and argued that the public interest may require that the government legally compel Internet platforms to reveal their algorithms. [49] This threat of government regulation, coupled with widespread ignorance and misunderstanding about how algorithms work, provides a new commercial rationale for public relations about algorithms. If algorithms are voluntarily explained in broad terms and depicted as comprehensible (and not frightening), this could forestall complaints, shape public debate, and even deter government rules that would require their complete disclosure.

The final example I will discuss in this essay concerns “How Search Works: From Algorithms to Answers,” a silent interactive Javascript cartoon that Google released in 2013 to depict its search algorithm (an excerpt appears in Figure 8). The cartoon’s general structure is adapted from a 2010 Google video, also entitled How Search Works, which built upon the previously popularized algorithm “PageRank.” In the 2013 cartoon, algorithms are defined as “programs and formulas to deliver the best results possible.” The cartoon depicts the search algorithm as a vertical assembly line. At the top of the frame, the crawler algorithm’s progress across the Web is represented by a moving yellow line, which bounces from document to document toward the bottom of the page. It eventually stops at a massive blue filing cabinet identified as “the index,” containing drawers filled with cat videos and physics papers. In the next scene, the user’s action of typing a keyword (or perhaps some unspecified action; this is not clear) is embodied by the yellow line, which wraps around six circular algorithms that “get to work” on the query. They smash it with a small press, sparkle near it, and rotate through sequences of shapes separated by the “equal” or “not equal” symbol, suggesting both mathematical formulae and slot machines.

How Search Works: From Algorithms to Answers (excerpt), 2013, Google. © 2013 Google. Full cartoon: http://www.google.com/insidesearch/howsearchworks/thestory/

In the next section the yellow line becomes a factory assembly line that conveys Web pages. The arrangement of shapes in this section is similar to a flowchart, but formal flowcharting symbols are not used and there is no branching. The conveyor moves a single bundle of several pages from the filing cabinet. The bundle slides along the line to be sorted by a series of six cute green robots rendered as human body parts. The specifics of the sorting process are not depicted; instead, a robotic nose smells the pages and a robotic arm makes a tsk-tsk gesture at a page for being “adult.” The text implies that there are 200 possible robots, but only six are shown – this is similar to earlier public relations about Google Search, which explained the Google algorithm in words as the process of asking more than 200 questions. The sorted Web pages are then dropped vertically down into a computer screen, a smartphone screen, and an iPad, implying gravity and filtering via vertical movement. A set of green numbers indicates that different results appear on each device, with one exception (result #1).
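
Stripped of its robots and filing cabinets, the pipeline the cartoon narrates is the textbook one: crawl, index, rank. The toy Python sketch below makes that skeleton explicit; every page, name, and scoring rule is invented for illustration and bears no relation to Google’s actual systems:

    # Toy sketch of crawl -> index -> rank; all details invented.
    pages = {  # stand-in for the crawler's haul
        "cats.example/v1": "cat video cat",
        "physics.example/p1": "string theory paper",
    }

    # "The index": map each word to the pages that contain it.
    index = {}
    for url, text in pages.items():
        for word in text.split():
            index.setdefault(word, set()).add(url)

    def search(query):
        # Rank matching pages by how many query words each contains.
        words = query.split()
        matches = set().union(*(index.get(w, set()) for w in words))
        return sorted(matches,
                      key=lambda url: sum(w in pages[url].split() for w in words),
                      reverse=True)

    print(search("string theory"))  # prints ['physics.example/p1']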

It is surprising that Google should turn to the assembly line as the dominant metaphor for its cartoon. The computer industry has often used its marketing to suggest illogically that it has no industrial component. Computer firms refer to their facilities as “campuses” and not “offices,” and many keep the locations of the data centers that look like factories a secret. Public relations efforts have embraced the idea of a post-industrial economy with phrases like “clean industry.” [50] It is as though the companies involved are suggesting that the computer industry does not involve assembly lines at all, when in fact the industry simply displaces the problems of electronics manufacture – such as factory labor conditions and toxic waste – to poorer areas of the world where they are less visible to Western customers. [51]

Unlike the earlier images I have discussed, How Search Works is an extensive depiction of the overall search process containing ten pages of material. Two omissions from this big-picture, holistic view of search are then particularly significant. First, the screens of search results in the cartoon do not contain advertisements. Advertisements are not mentioned at all although they comprise the bedrock of the Google business model and account for 90-95% of Google revenue. [52] Advertising selection was included in earlier descriptions of Google Search, but such mentions have now been removed. [53] Second, people do not appear. The cartoon uses the construction “we” to refer to Google, and implies human agency by saying that “we write” algorithms. Yet there is no depiction of “we” other than the Google logo, and the algorithms appear to be mechanical. The user is also not embodied, though perhaps the abstract yellow line and animated words appearing in the search box (a text box contains a search for “string theory”) are meant to represent this otherwise invisible user.

The automatic use of user data in Google search was a major innovation to the algorithm, adopted in 2009, and user privacy is now a common public controversy facing Google. The absence of an embodied person is therefore important: the visual treatment of the user (or the lack of one) carefully avoids or minimizes the central controversy that now faces Google and this algorithm. To find any mention of data about the user in the cartoon, the viewer must puzzle out three oblique references. Some Web pages enter a small green house (the user’s house?) but others remain outside. Clicking produces the text: “User context provides more relevant results based on geographic region, Web history, and other factors.” (The user’s Web history?) The translation robot, if clicked, informs users that results depend on “your language and country.” Clicking on the words “Universal Search” produces a note that “personal content” is “blend[ed]” into the results.

In sum, the cartoon shows Google to be a complicated yet well-intentioned site of industrial production, where attractive robots work in a jokey way to satisfy search queries. They employ an enormous store of Web pages, but few pieces of user data and no advertising. Algorithms themselves are also said to be produced in the cartoon, but this is not shown. This visualization therefore admits the potential of a variety of algorithms for the same job, while still persistently asserting scientism and objectivity.

Conclusion: Toward a Countervisuality of Algorithms

I have reviewed a broad array of examples of how algorithms, and particularly sorting algorithms, have been depicted for the novice. I have argued that “the algorithm” is now an object of marketing, that the processes of algorithms are now the object of public relations, and that this produces distinctive representations of algorithms that have evolved from technical and educational imagery. As software is “almost intangible,” these depictions produce our knowledge of the computer, our interactions with it, and ultimately our understandings of important parts of our lives. We have seen a dazzling diversity in the possible representations of algorithms. This brings us to the concluding question of how algorithms should be depicted in a given situation, a normative question involving both practicality and politics. Google’s assembly line seems to me a reasonable portrayal of what people should know about algorithms from Google’s point of view. What about other points of view?

In his book The Right to Look: A Counterhistory of Visuality, Nicholas Mirzoeff explains that a “right to look” is “the claim to a subjectivity that has the autonomy to arrange the relations of the visible and the sayable.” Applying his concept here, I assert that the current state of algorithms requires a right to look back – a right to “mutual recognition” between the user and the algorithm. [54] The algorithms employed by Internet platforms are cloaked by the legal protections of intellectual property and corporate personhood, while the data about the individuals these systems operate on is extracted, commodified, and sold via a system of transactions that is largely invisible. To look back at the algorithm, an appropriate visualization of the production line in Figure 8 might therefore be a production line where the user is produced, as a commonplace saying reminds us: “when something online is free, you’re not the customer, you’re the product.” [55]

To clearly answer whether an algorithm should best be depicted as a flow chart, a series of questions, a bar graph, a folk dance, a gristmill, or a factory, we need to understand the operation of the algorithm but also the subjectivity that the depiction enacts. Understanding how to use an algorithmic system on its own terms is surely important. Researchers recently found that about one third of young adult searchers could not successfully complete a search task; a major impediment was a fundamental lack of understanding about what keywords are and how they are related to search results. [56]

Yet we require more than a depiction of how algorithms work: we need to know how corporations work algorithms. For instance, another study showed that the majority of online search engine users did not know there was a difference between paid and unpaid search results. [57] A key task of depicting algorithms may still be revealing their existence in the first place and explaining what they can do in simple terms: a third study found that the majority of a group of Facebook users did not know that their personalized news feed was filtered at all. [58] We cannot expect a corporate apparatus to honestly represent itself to users. Almost every major Internet platform operator has now entered into a consent decree with the US government because the government intended to bring suit against it for unfair or deceptive business practices. This includes Google, Twitter, Facebook, and Apple.

This opens a space for a countervisuality of algorithms. Artists and programmers have already constructed the beginnings of such a visuality, but we need to go further. In “Cat or Human?,” Shinseungback Kimyonghun display a series of human faces identified as cats by the KITTYDAR feline face detection algorithm alongside a series of cat faces identified as human by the OpenCV face detector. [59] While this is an excellent interrogation of both algorithms and humanity itself, projects like this one tend to focus on the result of an algorithm rather than its process. And prominent process-oriented work has largely eschewed the strategy of tactical or counter-visualization. In the influential CODeDOC exhibition commissioned by the Whitney Museum of American Art in 2002, the curatorial statement emphasized an interest in software as prose. Many algorithms were involved, but they were represented most prominently by their instantiation in prose computer programs. With a few exceptions, the projects focused on the expressive potential of programming code itself rather than on visual representations of software algorithms as critical interventions into existing systems. [60]

I have shown that algorithms are depicted using a combination of imagery from scientific visuality and children’s television, and that pictures of algorithms feature references to math, objectivity, flow, and inhuman agency. Depictions of algorithms slow down the action of a computer so that it can be seen, but they are also visually ‘vibratory,’ involving oscillation or other continuous movement to convey hidden energy. Newer commercial depictions of algorithms reinvigorate the image of the factory while recalling its history. They are much vaguer in their claims than their educational counterparts, and they represent ‘secret judgments’ while strategically omitting controversial details. A countervisuality of algorithms could continue the visual and sonic features highlighted in this essay or break away from them, but it is clear that algorithms need to be seen.

 

References

  1. T. Wangsness and J. Franklin, “‘Algorithm’ and ‘Formula’,” Communications of the ACM 9, no. 4 (1966): 243.
  2. Donald E. Knuth, “Algorithm and Program; Information and Data,” Communications of the ACM 9, no. 9 (1966): 654.
  3. Ibid., 654.
  4. Manfred Broy, “Software Engineering — From Auxiliary to Key Technology, ” in Software Pioneers: Contributions to Software Engineering, ed. Manfred Broy and Ernst Denert (Berlin: Springer, 2002), 11; on this notion, see also: Wendy Hui Kyong Chun, Programmed Visions: Software and Memory. (Cambridge, MA: MIT Press, 2011); Ian Bogost, Alien Phenomenology, or What It’s Like to Be a Thing (Minneapolis: University of Minnesota Press, 2012).
  5. Oscar H. Gandy, Jr. The Panoptic Sort: A Political Economy of Personal Information (Boulder, CO: Westview Press, 1993); Geoffrey C. Bowker and Susan Leigh Star, Sorting Things Out: Classification and Its Consequences (Cambridge, MA: MIT Press, 2002).
  6. Mark Stefik, ed., Internet Dreams: Archetypes, Myths, and Metaphors, (Cambridge, MA: MIT Press, 1997).
  7. Eric J. Ray, Deborah S. Ray, and Richard Seltzer, The AltaVista Search Revolution, (New York: Osborne/McGraw-Hill, 1998).
  8. Solon Barocas, Sophie Hood, and Malte Ziewitz, “Governing Algorithms: A Provocation Piece,” Paper presented to the symposium “Governing Algorithms,” (New York: New York University; March 29, 2013). http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2245322; Tarleton Gillespie, “The Relevance of Algorithms,” in Media Technologies, ed. Tarleton Gillespie, Pablo Boczkowski, and Kirsten Foot (Cambridge, MA: MIT Press, 2014), 167-194; Kevin Hamilton, Karrie Karahalios, Christian Sandvig, and Motahhare Eslami, “A Path to Understanding the Effects of Algorithm Awareness.” In CHI Extended Abstracts on Human Factors in Computing Systems (alt.CHI; 2014): 631-642.
  9. Alexander R. Galloway, Gaming: Essays on Algorithmic Culture. (Minneapolis, University of Minnesota Press, 2006), 6.
  10. Data Center Knowledge, “The Illustrated Data Center,” n.d. http://www.datacenterknowledge.com/the-illustrated-data-center/ (accessed September 25, 2014).
  11. For example, in the words of IBM, “The use of data processing equipment has focused attention on the necessity for an orderly representation of information flow.” From: IBM, Flowcharting Techniques. (White Plains, NY, International Business Machines, 1969), 1. Available online: http://www.fh-jena.de/~kleine/history/software/IBM-FlowchartingTechniques-GC20-8152-1.pdf
  12. e.g., see the figures in: Herman H. Goldstine and John von Neumann, Planning and Coding of Problems for an Electronic Computing Instrument, Part II, Volume II, (Princeton, Institute for Advanced Study, 1948). http://bitsavers.informatik.uni-stuttgart.de/pdf/ias/Planning_and_Coding_of_Problems_for_an_Electronic_Computing_Instrument_Part_II_Volume_II_Apr48.pdf
  13. S. J. Morris and O. J. C. Gotel, “The Diagram of Flow: Its Departure from Software Engineering and Its Return,” in Diagrammatic Representation and Interface eds. Philip T. Cox, Beryl Plimmer, and Peter Rodgers (Berlin, Springer-Verlag, 2012), 256–269.
  14. Computer History Museum, “IBM Programmer Drawing a Flowchart,” ca. 1965 (from the exhibition “Early Computer Companies”), accessed September 20, 2014, http://www.computerhistory.org/revolution/early-computer-companies/5/117/496
  15. Although diagrams were later revived in the practice of software engineering. “The Diagram of Flow.”
  16. “For decades, network diagrams have used a cloud-like symbol to reduce the entire infrastructure of a network into simple entry and exit points when the specific network architecture is not material to the illustration.” from: PC Magazine, “Encyclopedia of Computing,” n.d., accessed September 22, 2014, http://www.pcmag.com/encyclopedia/term/39847/cloud
  17. US Patent and Trademark Office, “Manual of Patent Examining Procedure,” n.d., Chapter 2100, Section 2173, accessed September 20, 2014, http://www.uspto.gov/web/offices/pac/mpep/s2173.html
  18. Michael Halvorson and David Rygmyr, Learn BASIC Now. (Redmond, WA, Microsoft Press, 1989), 286.
  19. Owen Astrachan, “Bubble Sort: An Archaeological Algorithmic Analysis,” Proceedings of the 34th SIGCSE technical symposium on Computer science education 35, no. 1 (2003): 1-5. doi: 10.1145/792548.611918
  20. Quotations are from the comments on a SORTDEMO.BAS video capture on YouTube, although there is a variety of SORTDEMO.BAS nostalgia around the Web. See: https://www.youtube.com/all_comments?v=leNaS9eJWqo
  21. For an example of Bostock’s visualization of sorting (rather than sampling), see: http://bl.ocks.org/mbostock/e1e1e7e2c360bec054ba
  22. Mark Grozen-Smith, “Bubble Sort,” YouTube, 2014, https://www.youtube.com/watch?v=aQiWF4E8flQ (accessed September 20, 2014).
  23. ollloolo, “Lego Bubble Sort,” YouTube, 2009, https://www.youtube.com/watch?v=MtcrEhrt_K0 (accessed September 18, 2014).
  24. For a video intended for middle-school students, see: Udi Aharoni, “Visualization of Quicksort,” YouTube, 2012, accessed September 20, 2014, https://www.youtube.com/watch?v=aXXWXz5rF64
  25. Zoltan Katai, “Intercultural computer science education,” Proceedings of the 2014 conference on Innovation & technology in computer science education (ITiCSE) 19, no. 1 (2014): 183-188. doi: 10.1145/2591708.2591744
  26. The original algorithm being illustrated is here: C. A. R. Hoare, “Algorithm 64: Quicksort,” Communications of the ACM 4, no. 7 (1961): 321; for additional explanation see also: C. A. R. Hoare, “Quicksort,” The Computer Journal 5, no. 1 (1962): 10-16.
  27. I am indebted to Julia Sonnevend for invaluable assistance with this interpretation. See also: The Oxford Dictionary of Proverbs (5th ed.), eds. John Simpson and Jennifer Speake, (Oxford: Oxford University Press, 2008): “divide and rule.”
  28. Linda Dalrymple Henderson, “Vibratory Modernism: Boccioni, Kupka, and the Ether of Space,” in From energy to information: representation in science and technology, art, and literature, ed. Bruce Clarke and Linda Dalrymple Henderson (Stanford, CA: Stanford University Press, 2002), 126-149.
  29. Anthony Enns and Shelley Trower, eds., Vibratory Modernism, (London: Palgrave-MacMillan, 2013).
  30. Lucas Introna and Helen Nissenbaum, “Shaping the Web: Why the Politics of Search Engines Matters,” The Information Society, 16 no. 3 (2000): 1-17.
  31. Adam D. I. Kramer, Jamie E. Guillory, and Jeffrey T. Hancock, “Experimental evidence of massive-scale emotional contagion through social networks,” Proceedings of the National Academy of Sciences 111 no. 24 (2014): 8788–8790.
  32. James Grimmelmann, “The Facebook Emotional Manipulation Study: Sources,” 2014, accessed September 29, 2014, http://laboratorium.net/archive/2014/06/30/the_facebook_emotional_manipulation_study_source
  33. Tarleton Gillespie, 2013, “Can an Algorithm be Wrong?” Limn 2, http://limn.it/can-an-algorithm-be-wrong/
  34. Frank Pasquale, “The Troubling Consequences of Trade Secret Protection of Search Engine Rankings,” in The Law and Theory of Trade Secrecy, eds. Rochelle Cooper Dreyfuss and Katherine Jo Strandburg (Cheltenham: Edward Elgar, 2011), 381.
  35. “The Relevance of Algorithms,” 179.
  36. Benjamin Edelman, “Hard-Coding Bias in Google ‘Algorithmic’ Search,” November 15, 2010, accessed September 5, 2014, http://www.benedelman.org/hardcoding/.
  37. The “Bing It On” survey is available at: http://www.bingiton.com/
  38. Moz, “Google Algorithm Change History,” n.d., accessed August 2, 2014, http://moz.com/google-algorithm-change
  39. Jessica E. Vascellaro and Lauren Tara Lacapra, “Ask.com hopes ads compute to buzz,” The Wall Street Journal (May 3, 2007). http://online.wsj.com/news/articles/SB117815708671990422?mg=reno64-wsj&url=http%3A%2F%2Fonline.wsj.com%2Farticle%2FSB117815708671990422.html
  40. Frank Watson, “Ask CEO Explains ‘The Algorithm,’” Search Engine Watch, (April 17, 2007), accessed September 20, 2014, http://searchenginewatch.com/article/2056433/Ask-CEO-Explains-The-Algorithm
  41. TechCrunch, “Ask.com Commercial,” YouTube (June 5, 2007). https://www.youtube.com/watch?v=yasBpCHHm2E
  42. I measured this by performing a LexisNexis search in the newspapers database on each year from 2006 (16 mentions of algorithms) to 2013 (84 mentions) while holding the indexed news sources constant.
  43. Blake Hallinan and Ted Striphas, “Recommended for you: The Netflix Prize and the production of algorithmic culture,” New Media & Society (OnlineFirst, June 23, 2014): 1-21.
  44. For example, Douglas Rushkoff, Program or Be Programmed. (Berkeley, CA: Soft Skull Press, 2011).
  45. Jason Kincaid, “EdgeRank: The Secret Sauce That Makes Facebook’s News Feed Tick,” Tech Crunch, (April 22, 2010). http://techcrunch.com/2010/04/22/facebook-edgerank/
  46. Ibid.
  47. This site was also launched at a moment when many online platforms were providing official brochure ware sites that explained how their platform works, such as Google’s “Inside Search,” discussed next. See an archived version at: http://web.archive.org/web/20120626101938/http://www.whatisedgerank.com/
  48. Frank Pasquale, The Black Box Society: The Secret Algorithms that Control Money and Information, (Cambridge, MA: Harvard University Press, 2015).
  49. If not revealed publicly, at least revealed to trusted third parties. See: Oren Bracha and Frank Pasquale, “Federal Search Commission? Access, Fairness, and Accountability in the Law of Search,” Cornell Law Review 93 (2008): 1149-1210.
  50. Anne Alvergue and Roger Sorkin, The Company Lawn. Video recording (8:30), (Stanford, CA: M.A. Program in Documentary Film and Video, Stanford University, 1999).
  51. Giles Slade, Made to Break: Technology and Obsolescence in America, (Cambridge, MA: Harvard University Press, 2007).
  52. Google, Inc., 2013 Annual Report. (Mountain View, CA: Google, Inc., 2014). http://investor.google.com/pdf/2013_google_annual_report.pdf
  53. Google, “How Search Works,” YouTube, accessed September 10, 2014, http://www.youtube.com/watch?v=BNHR6IQJGZs.
  54. Nicholas Mirzoeff, The Right to Look: A Counterhistory of Visuality, (Durham: Duke University Press, 2011).
  55. An effective paraphrase of: Dallas W. Smythe, “Communications: Blindspot of Western Marxism,” Canadian Journal of Social and Political Theory 1, no. 3 (1977): 1-27.
  56. Eszter Hargittai and Heather Young, “Searching for a ‘Plan B’: Young Adults’ Strategies for Finding Information about Emergency Contraception Online,” Policy & Internet 4, no. 2 (2012): article 4.
  57. Deborah Fallows, “Search Engine Users,” Pew Internet & American Life Project Research Report (Washington, DC: Pew Internet, February 3, 2005). http://www.pewinternet.org/2005/01/23/search-engine-users/
  58. Motahhare Eslami, Kevin Hamilton, Karrie Karahalios, Cedric Langbort, Aimee Rickman, and Christian Sandvig, “Uncovering Algorithms: Looking Inside the Facebook News Feed.” A lecture presented to the Berkman Center for Internet & Society, Harvard University (July 22, 2014). http://cyber.law.harvard.edu/events/luncheon/2014/07/sandvigkarahalios
  59. Shinseungback Kimyonghun, “Cat or Human,” (2013), accessed September 22, 2014, http://ssbkyh.com/works/cat_human/
  60. Whitney Museum of American Art, “CODeDOC,” (2002), accessed September 10, 2014, http://whitney.org/Exhibitions/Artport/Commissions/Codedoc

 

Bio

Christian Sandvig is Steelcase Research Professor and Associate Professor in both the Department of Communication Studies and the School of Information at the University of Michigan. He is also a faculty associate of the Berkman Center for Internet & Society at Harvard University. Sandvig is a researcher and teacher studying the implications of new Internet infrastructures. He is also a computer programmer with industry experience consulting for a Fortune 500 company, a regional government, and a San Francisco Bay Area software start-up. He holds the Ph.D. in Communication Research from Stanford University (2002) and received the US National Science Foundation’s Faculty Early-Career Development Award in the area of “human-centered computing.” Sandvig was previously named a “next-generation leader” in technology policy by the American Association for the Advancement of Science. His research has appeared in The New York Times, The Economist, New Scientist, National Public Radio, and CBS News.
www.niftyc.org