{"id":851,"date":"2017-08-06T17:44:26","date_gmt":"2017-08-06T21:44:26","guid":{"rendered":"https:\/\/itp.nyu.edu\/~jvc301\/wordpress\/?p=124"},"modified":"2017-08-06T17:44:26","modified_gmt":"2017-08-06T21:44:26","slug":"a-case-for-clarity-in-the-age-of-algorithmic-injustice","status":"publish","type":"post","link":"https:\/\/itp.nyu.edu\/adjacent\/issue-2\/a-case-for-clarity-in-the-age-of-algorithmic-injustice\/","title":{"rendered":"A Case for Clarity In the Age of Algorithmic Injustice"},"content":{"rendered":"<p><span class=\"first-letter\">O<\/span> n October 16th, 2017 the New York City Council heard testimony on Bill 1696, that would &#8220;require agencies that use algorithms or other automated processing methods that target services, impose penalties, or police persons to publish the source code used for such processing.&#8221;<a class=\"fn\" href=\"#fn-1\" name=\"fnn-1\">1<\/a><\/p>\n<p><span style=\"font-weight: 400\">Both advocates and detractors of the bill knew the stakes were high: As the first city council in the country to confront the existence of algorithms and their impact on public life, New York City was setting a precedent that could have far reaching implications. Public defenders, civil liberty and open data advocates, and privacy researchers filled the docket to push for the bill\u2019s passage. Meanwhile, the city\u2019s Department of Information Technology and Telecommunications (DoITT) urged caution, and the trade group Tech:NYC (paying members include Google and Facebook) suggested that \u201cthere could be better ways to address concerns underlying the proposed bill.<a class=\"fn\" href=\"#fn-2\" name=\"fnn-2\">2<\/a>\u201d<\/span><span style=\"font-weight: 400\"> So many wished to cast their stake in the hearing that it was forced to be relocated to a larger space, and even then, it was standing room only. \u201cThis is the largest attendance a Technology Committee meeting has ever had,<a class=\"fn\" href=\"#fn-3\" name=\"fnn-3\">3<\/a>\u201d<\/span><span style=\"font-weight: 400\"> council member James Vacca, who introduced the bill, gleefully noted. <\/span><\/p>\n<p><span style=\"font-weight: 400\">So what were the implications of Bill 1696 exactly? If passed, 71 words would be appended to New York City\u2019s existing open data law that would, in addition to requiring agencies to publish the source code of their automated processing methods, also require agencies to \u201cpermit a user to (i) submit data into such system for self-testing and (ii) receive the results of having such data processed by such system.\u201d<\/span><span style=\"font-weight: 400\"> Agencies would have four months from the bill\u2019s signing to comply.<\/span><\/p>\n<h1><span style=\"font-weight: 400\">Before They Called It \u2018Algorithmic\u2019<\/span><\/h1>\n<p><span style=\"font-weight: 400\">The road to algorithmic decision-making was paved by actuarial risk assessment and big data. In his 2007 book <\/span><i><span style=\"font-weight: 400\">Against Prediction<\/span><\/i><span style=\"font-weight: 400\">, attorney and Columbia University professor Bernard Harcourt examines the use of statistical risk assessments in the field of criminology. He notes that by the 1920s in the United States there was already \u201ca thirst for prediction\u2014a strong desire to place the study of social and legal behavior on scientific footing.\u201d In order to satisfy this desire, the \u201cprognostic score\u201d emerged as a means to predict whether criminals would reoffend. 

## Before They Called It ‘Algorithmic’

The road to algorithmic decision-making was paved by actuarial risk assessment and big data. In his 2007 book *Against Prediction*, attorney and Columbia University professor Bernard Harcourt examines the use of statistical risk assessments in the field of criminology. He notes that by the 1920s in the United States there was already "a thirst for prediction—a strong desire to place the study of social and legal behavior on scientific footing." To satisfy this desire, the "prognostic score" emerged as a means of predicting whether criminals would reoffend. By assigning risk values based on offenders' mental and physical characteristics, criminologists developed a score they felt to be meaningful and true.[5]

"A certain euphoria surrounded the prediction project, reflecting a shared sense of progress, of science, of modernity," Harcourt writes of criminal profiling in Massachusetts a hundred years ago.

More recently, the promise of Big Data seemed like it might serve the same purpose. Having more data, Big Data's advocates contend, improves predictive accuracy, and that data-backed 'accuracy' grants us license to apply data-based scoring to new domains.

Today, algorithms dominate the predictive technology scene. Councilman Vacca wants to know how those algorithms, many of which are used to aid city agencies in their decision-making processes, actually work. His motivation comes in part from frustration with the shortcomings of the current system. "How does somebody get an apartment in public housing?" Vacca asked at the City Council hearing. "I'm told that it's strictly done by computer…On what basis?"[6] If the algorithms were working correctly, he implies, people who applied for public housing would be assigned to apartments near their families and doctors, high school students would not be assigned to their sixth-choice school, and fire department resources would be allocated more fairly. At the very least, Vacca hopes that people might have recourse when "some inhuman computer is spitting them out and telling them where to go."[7] The councilman has a name for this: transparency.

## The Problem With ‘Transparency’

What kind of transparency might we get by asking government agencies to publish the source code of their automated processing methods?

In 2008, the Free Software Foundation, overseer of the GNU[8] General Public License (GPL), sued the maker of Linksys routers. Linksys used GPL-licensed software in its routers but hadn't published the improvements it made to that software, thereby violating the terms of the license. Rather than go to court, Linksys' parent company agreed to publish its source code and hire a free software[9] compliance officer.[10]

I went to the Linksys website to see if I could find the source code they'd agreed to publish. Buried within Linksys' support documentation is the GPL Code Center, a table of hardware model numbers with corresponding links to software files.[11] For no reason in particular, I chose to download the 277 MB file bundle for model CM3024.
The bundle contained a README ("Hitron GPL Compiling Guide") with instructions like "How to build both firmware and toolchain," "How to build & install toolchain alone," and "How to make image alone." Aside from the README, the words in the file bundle were meant for computers and didn't give me, a human, a better understanding of how the router worked.

Publishing source code the way the GPL mandates is itself a type of transparency, but it's not the kind that's meaningful to the general public. The GPL primarily exists to enforce its own, very specific philosophy of freedom,[12] which says that users have the freedom to run, copy, distribute, study, change, and improve the software; "free software" is thus a matter of liberty, not price. The license certainly doesn't exist to convey the meaning of the software it covers.

In September 2017, ProPublica filed a motion to unseal the source code of a DNA analysis algorithm used in thousands of court cases across the United States.[13] One could presume the publication was motivated by a similar goal: revealing the significance of the code to the humans it affects. ProPublica argued that the design of the algorithm might have resulted in sending innocent people to prison. A month later, a federal judge granted ProPublica's motion and the source code was released. As with the Linksys code, the release of the source code to the general public, though technically transparent, is still meaningless in its indecipherability. In practice, the implications of the DNA algorithm were conveyed in the form of a 60-page code review by a student pursuing a master's degree in computer science.

Addressing the issue of algorithmic transparency recently on a panel, data scientist Cathy O'Neil noted, "I don't think transparency per se is a very meaningful concept in this realm because, speaking as a data scientist, I can make something that's technically transparent but totally meaningless."[14]
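
O'Neil's distinction is easy to demonstrate. The hypothetical Python snippet below is fully published and runs as written, yet it explains nothing: the weights and threshold are arbitrary stand-ins for coefficients a vendor might disclose without any documented rationale.

```python
# "Technically transparent but totally meaningless": every line of
# this decision rule is public, yet nothing says where the weights
# came from, what the inputs measure, or what a True result costs
# the person being scored. All values are invented for illustration.

WEIGHTS = [0.173, -2.41, 0.0088, 1.6]
THRESHOLD = 0.52

def decide(inputs: list[float]) -> bool:
    """Returns the 'risk' decision for four undocumented inputs."""
    s = sum(w * x for w, x in zip(WEIGHTS, inputs))
    return s > THRESHOLD

print(decide([1.0, 0.2, 30.0, 0.1]))  # a decision, with no meaning attached
```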

The transparency that open source code provides is only meaningful when there are translators who can explain what the code does. Vacca's bill, while a step in the right direction, remains incomplete in that it proposes no way of demystifying the meaning of the code itself. In effect, the burden of deciphering the algorithm would still fall on the individuals it affects. Similarly, while ProPublica was able to get source code published, it became incumbent upon the publication's legal team to find experts who could decipher what they had gotten the court to unseal.

So how do we get corporations and government agencies to foot the bill for this work, as opposed to 'outsourcing' it to private individuals and investigative journalists?

*Overlapping Triangles (Image by Hye Ryeong Shin)*

## Data Controllers

The EU's General Data Protection Regulation (GDPR), which replaces a 1995 data protection initiative, was adopted in 2016 and goes into effect in May of this year. Written over the course of four years, its 11 chapters contain 99 articles that map out in great detail the digital rights of European Union citizens.[15]

In its final version, the GDPR states that large companies whose core activities include processing and monitoring personal data are required to hire a "data protection officer."[16] Data subjects (those whose data is run through an algorithm) "should have the right not to be subject to a decision" arrived at by the automated processing of their data, and in the event that they are, they have "the right to obtain human intervention, to express his or her point of view, to obtain an explanation of the decision reached after such assessment and to challenge the decision."[17]

The GDPR also expands its predecessor's jurisdiction and increases fines for noncompliance. The regulation "applies to all companies processing the personal data of data subjects residing in the Union, regardless of the company's location."[18] Those that don't comply risk being fined €20,000,000 or 4% of their annual revenue, whichever is higher.
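
The fine provision reduces to a maximum of two quantities. A short sketch of the arithmetic, with an invented revenue figure for illustration:

```python
# GDPR noncompliance fine: EUR 20,000,000 or 4% of annual revenue,
# whichever is higher. The revenue figure below is invented.

def gdpr_max_fine(annual_revenue_eur: float) -> float:
    return max(20_000_000, 0.04 * annual_revenue_eur)

print(gdpr_max_fine(90_000_000_000))  # 3600000000.0, i.e. EUR 3.6 billion
```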

The language used by the GDPR reflects a depth of understanding of, and a proactive engagement with, data and power dynamics that our City Council has not reached. To be fair, New York is "the first city and the first legislative body in [the United States] to take on this issue," as Vacca points out. But in scaffolding legal frameworks here, the City Council might benefit from borrowing some of the language the GDPR's authors developed over many years.

## A New Bill

In the end, Councilmember Vacca's tiny but mighty 71-word bill was not passed. Instead, the City Council passed a revised version of Bill 1696 that is more detailed, and also more measured. The revision calls for a task force, appointed by the mayor, that will spend 18 months producing a report. The report will recommend procedures for determining whether algorithms disproportionately impact people based on their identities and, similar to the GDPR, will come up with ways for people affected by an automated decision system to access an explanation of the system's inner workings.[19]

The requirement that agencies publish source code is notably missing from the amended bill, replaced with more meaningful measures of transparency that shift the burden of technical understanding and explanation from the general public back to the writers and wielders of algorithms themselves, or, as the GDPR calls them, "data controllers."

The proposed scope of the report is daunting. "I mean, if you use a computer program, you're using an algorithm," Craig Campbell of the city's Department of Information Technology and Telecommunications sighed under questioning by Vacca.[20] It remains to be seen how members of the task force will differentiate between the decision-making part of a computer program and the rest of its functions.

Perhaps most daunting is the prospect of eliciting participation from the city agencies most responsible for disproportionately harming New Yorkers, with and without the help of algorithms. The revised bill notes that compliance with the task force's recommendations is not required when that compliance might interfere with a law enforcement investigation or "result in the disclosure of proprietary information."[21]

## Outcomes. And Then Algorithms

Maybe the way the term 'algorithm' is used in this conversation is a contemporary manifestation of the same fantasy about scientific progress that Harcourt describes in his book. When data controllers use the term, it creates opacity, letting them wriggle out of questions about responsibility by pointing to data, science, and the objectivity of statistics, without ever having to acknowledge that they might not actually know what they've built. When data subjects use the term, it's to speak in the terms by which they've been harmed.

Underlying this revised New York City Council bill is the belief that knowing what goes into the decision-making machine will make redress for harmful decisions more possible.
That's hopefully true, at least in part, and it's why data subjects should insist on multiple transparencies, and push back when data controllers argue that algorithms are too complex for laypeople to understand or untangle, that it's a security risk to have more eyes on them, or that, conveniently, algorithms are private property and therefore not available for public scrutiny.

It's also important for programmers to understand and reckon with the fact that, far from the well-lit offices where they write code, that same code might be used in a program that determines whether "New Yorkers go home to their families and communities or, instead, sit for days, weeks, or months on Rikers Island."[22]

But it's also true that developing language to regulate algorithms, algorithmic decision-making, data controllers, and data subjects is just one of many strategies for addressing what lies at the heart of the matter: inequality. Those facing injustice, algorithmic or otherwise, will need, as they have always needed, infrastructure for finding each other, sharing their stories, and developing their own demands on their own terms. We need strategies for fighting injustice that have nothing to do with the technical details of algorithms. Maybe that's why organizer and filmmaker Astra Taylor, contesting the premise of this entire article, said of organizing around algorithmic injustice: "I don't think you should lead with the algorithms. Outcomes. And then algorithms."[23]