{"id":853,"date":"2017-08-06T15:45:29","date_gmt":"2017-08-06T19:45:29","guid":{"rendered":"https:\/\/itp.nyu.edu\/~jvc301\/wordpress\/?p=129"},"modified":"2017-08-06T15:45:29","modified_gmt":"2017-08-06T19:45:29","slug":"what-you-see-may-not-be-there","status":"publish","type":"post","link":"https:\/\/itp.nyu.edu\/adjacent\/issue-2\/what-you-see-may-not-be-there\/","title":{"rendered":"What You See May Not Be There: The State of the Subject Amid Pervasive Computer Vision"},"content":{"rendered":"<p><span class=\"first-letter\">P<\/span>hotography has since its dawn maintained a strong degree of truth-authority over other representational mediums. As philosopher Vilem Flusser<a class=\"fn\" href=\"#fn-1\" name=\"fnn-1\">1<\/a> puts it, the contents of an image appear \u201cto be on the same level of reality as their significance,\u201d and photographs are more readily accepted as an objective window. By their very existence, pictures claim to offer a trustworthy record of a subject being somewhere, doing something. The tenacity of this \u2018truthfulness\u2019 illusion is still on display throughout our transition into the digital age. But the ways in which it manifests are changing.<\/p>\n<p><span style=\"font-weight: 400\">Today\u2019s computer vision algorithms are complicating the nature of fidelity present in the still and moving image. 
This is well demonstrated through a line of computer vision research referred to as predictive vision, concerned in part with \u201c<\/span><a href=\"http:\/\/carlvondrick.com\/tinyvideo\/\"><span style=\"font-weight: 400\">predicting plausible futures of static images<\/span><\/a><span style=\"font-weight: 400\">.\u201d Carl Vondrick and his colleagues, at the forefront of this discipline, created an AI software program called <\/span><a href=\"http:\/\/carlvondrick.com\/tinyvideo\/\"><i><span style=\"font-weight: 400\">Tiny Video<\/span><\/i><\/a><span style=\"font-weight: 400\">, which analyzes a still frame of video to predict what might happen next in the scene and turns its predictions into algorithmically-generated (i.e., not \u2018real\u2019) video content. Trained on over 5,000 hours of Flickr footage of newborn babies, people playing golf, and other relatively banal subjects, the program generates short GIFs of pixel blobs that move like wiggling babies or walking figures amid a grassy backdrop. The machine-hallucinated animations don\u2019t yet look exactly realistic\u2014in fact they are slightly disturbing to look at\u2014but as the researchers state, \u201cthe motions are fairly reasonable\u201d for these types of human gestures. <\/span><i><span style=\"font-weight: 400\">Tiny Video<\/span><\/i><span style=\"font-weight: 400\"> points to the trajectory of computer vision advancements: programs increasingly able not only to process an image or video but to generate a realistic copy of its subject\u2019s looks and movements\u2014essentially a 2D or 3D avatar\u2014based on speculation about their actions deduced from patchy data.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The discordance between Flusser\u2019s observation and <\/span><i><span style=\"font-weight: 400\">Tiny Video\u2019s<\/span><\/i><span style=\"font-weight: 400\"> outputs captures the current state of the digital image. 
Our visual culture assumes the veracity of the photos and videos circulated within it. Meanwhile, the image today undergoes increasing layers of abstraction via computer vision algorithmic operations, their yields based more on conjecture than fact. While <\/span><i><span style=\"font-weight: 400\">Tiny Video<\/span><\/i><span style=\"font-weight: 400\"> is an academic research experiment, when technologies of this sort reach a certain threshold of believability, they immediately get adopted by the market and their use cases proliferate. The implications of this evolution of both technical prowess and market adoption rest predominantly with the subject\u2014all of us featured in photos shared online or captured via any number of pervasive body or security cameras \u2018in the wild\u2019\u2014whose physical and behavioral identities get co-opted and interpreted by these algorithmic methods. <\/span><\/p>\n<p><span style=\"font-weight: 400\">These increased layers of abstraction create a distorted portrait of the subject that, because of its computational genesis, gets treated as ontological fact. Algorithms have a good chance of getting the subject wrong outright by mismatching a face to an identity. And they invariably introduce their own guesses and biases when drawing more subjective inferences about one\u2019s behavior or character or future actions. The impact on the subject is inevitable\u2014the job they get or don\u2019t get, their insurance rates, or whether they are implicated in a crime will depend more and more on data collected via computer vision software and hardware. Not only are most of these practices exempt from privacy laws, but we don\u2019t currently have a standard mode of accountability for the conclusions they draw.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Photography has long had the authority to appropriate a subject\u2019s identity and affect their lives for years to come. 
In a notorious 1957 photograph of desegregation at Little Rock Central High School, Elizabeth Eckford, the African American student, and Hazel Bryan, the white antagonist with the hateful expression behind her, <\/span><a href=\"https:\/\/www.vanityfair.com\/news\/2007\/09\/littlerock200709\"><span style=\"font-weight: 400\">had their lives changed by the image, taken by Will Counts, for the half century since<\/span><\/a><span style=\"font-weight: 400\">. Theorist Roland Barthes described photography as a <\/span><i><span style=\"font-weight: 400\">lamination<\/span><\/i><span style=\"font-weight: 400\"> procedure, in which the subject and their 2D likeness get permanently attached to one another.<a class=\"fn\" href=\"#fn-2\" name=\"fnn-2\">2<\/a> Today, this lamination continues, but under more persistent and chaotic conditions. Anyone who has ever Google Image-searched themselves knows how surreptitiously digital images can travel across the internet to unexpected websites and servers without their knowledge or consent.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-1466 aligncenter\" src=\"https:\/\/itp.nyu.edu\/adjacent\/issue-2\/wp-content\/uploads\/sites\/7\/2018\/02\/AdjacentIllustration7V2_FINAL-1-1024x576.jpg\" alt=\"\" width=\"928\" height=\"558\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400\">A picture\u2019s susceptibility to being tampered with is also not a condition unique to AI-assisted automation. 
<\/span><a href=\"https:\/\/thesocietypages.org\/socimages\/2011\/03\/30\/representation-of-the-primitive\/\"><span style=\"font-weight: 400\">The analog photograph has a long history of being edited to leave things out<\/span><\/a><span style=\"font-weight: 400\">, and of course <\/span><a href=\"http:\/\/time.com\/3967329\/sandra-bland-video-continuity\/\"><span style=\"font-weight: 400\">frames of a video can easily be dropped<\/span><\/a><span style=\"font-weight: 400\"> to tell a different story. From the notorious 1922 film <\/span><a href=\"https:\/\/en.wikipedia.org\/wiki\/Nanook_of_the_North#Controversies\"><i><span style=\"font-weight: 400\">Nanook of the North<\/span><\/i><\/a><span style=\"font-weight: 400\">\u2014an early pioneer of the documentary form\u2014to<\/span><a href=\"https:\/\/www.theroot.com\/baltimore-drops-43-police-cases-after-cops-fake-another-1798338576\"><span style=\"font-weight: 400\"> recent police body cam videos<\/span><\/a><span style=\"font-weight: 400\">, that which is accepted as documentation may very well have been staged in the first place. However, computer vision goes a step, or multiple steps, further in the scope of its augmentations. <\/span><\/p>\n<p><span style=\"font-weight: 400\">From facial recognition, to emotion detection, to predictive vision, the original capture is just a jumping-off point from which the subject\u2019s identity gets appropriated, operated upon, and generated anew. 
Very often these algorithms are proprietary, in the service of industries that have a special interest in creating a colored-in, fully fleshed-out picture of the average citizen for the purposes of targeted advertising or law enforcement.<\/span><\/p>\n<p><span style=\"font-weight: 400\">While in many cases computer vision applications operate in isolation from one another, it\u2019s conceivable that they will be, if they aren\u2019t already, combined in any number of ways that increase their guesswork and exacerbate misrepresentations of a subject. It\u2019s as if your likeness had a clandestine Second Life, getting rendered and puppeted around via dubious corporate transactions.<\/span><\/p>\n<p>&nbsp;<\/p>\n<h1><span style=\"font-weight: 400\">Facial Recognition and Identity Verification<\/span><\/h1>\n<p><span style=\"font-weight: 400\">The first layer of abstraction involves matching a subject in an image to an identity. Facial recognition software is becoming almost as ubiquitous as cameras, which means a subject will be recognizable to systems\u2019 proprietors nearly any time they leave their house, or even <\/span><a href=\"https:\/\/www.engadget.com\/2017\/04\/27\/what-amazon-gets-out-of-putting-a-camera-in-your-closet\/\"><span style=\"font-weight: 400\">in their own bedrooms with their own devices<\/span><\/a><span style=\"font-weight: 400\">. The process is also fraught with complex, overconfident mathematical models and biased or incomplete datasets.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Originating in the 1960s, facial recognition programs have evolved from passive scanners of front-facing headshots to involved algorithms that perform computational gymnastics. From attempting to match two photos taken from wildly different angles to literally filling in the missing pixels of occluded parts of a subject\u2019s face, today\u2019s operations call for significant guesswork on the part of the computer. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">A potential method for comprehensive facial recognition is the creation of a 3D model of a person\u2019s bust to serve as a reference for identity verification. In 2014, <\/span><a href=\"https:\/\/research.fb.com\/wp-content\/uploads\/2016\/11\/deepface-closing-the-gap-to-human-level-performance-in-face-verification.pdf?\"><span style=\"font-weight: 400\">Facebook revealed<\/span><\/a><span style=\"font-weight: 400\"> it employed such a procedure with its DeepFace algorithm. Though the company hasn\u2019t disclosed use of this specific approach since, it <\/span><a href=\"https:\/\/www.recode.net\/2017\/12\/19\/16793538\/facebook-facial-recognition-pictures-update-identity\"><span style=\"font-weight: 400\">recently acknowledged<\/span><\/a><span style=\"font-weight: 400\"> that it knows when you are present in a photo whether you\u2019re tagged or not. But what about software that has access to far less photographic data, or even none at all? A recent <\/span><a href=\"https:\/\/arxiv.org\/pdf\/1703.07834.pdf\"><span style=\"font-weight: 400\">study<\/span><\/a><span style=\"font-weight: 400\"> and <\/span><a href=\"http:\/\/cvl-demos.cs.nott.ac.uk\/vrn\/\"><span style=\"font-weight: 400\">online demo<\/span><\/a><span style=\"font-weight: 400\"> released by the University of Nottingham introduce an algorithm that extrapolates a 3D model from just a single 2D photograph. And as we see with <\/span><a href=\"http:\/\/deweyhagborg.com\/projects\/stranger-visions\"><span style=\"font-weight: 400\">Heather Dewey-Hagborg\u2019s speculative project Stranger Visions<\/span><\/a><span style=\"font-weight: 400\">, such a rendering can be derived from just a hair or other biomatter from which DNA can be extracted, requiring no original photo whatsoever. 
Regardless of the mode by which it was conceived, a 3D model of one\u2019s head is often the result of hypothetical geometric information filled in to piece together a photorealistic avatar. While researchers are at work devising ways to identify subjects with limited information, industry is rife with efforts to capture more facial data.<\/span><\/p>\n<p>&nbsp;<\/p>\n<div id=\"attachment_1460\" style=\"width: 404px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-1460\" class=\"wp-image-1460\" src=\"https:\/\/itp.nyu.edu\/adjacent\/issue-2\/wp-content\/uploads\/sites\/7\/2018\/02\/julia_3d.gif\" alt=\"\" width=\"394\" height=\"421\" \/><p id=\"caption-attachment-1460\" class=\"wp-caption-text\"><a href=\"http:\/\/cvl-demos.cs.nott.ac.uk\/vrn\/\" target=\"_blank\" rel=\"noopener\">3D face reconstruction<\/a> from the author&#8217;s photo<\/p><\/div>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400\">As our daily lives increasingly take place in the presence of cameras, image tagging opportunities are plentiful. More and more, facial recognition capabilities are seamlessly enmeshed with security and body camera systems. Movidius, an artificial intelligence company specializing in machine vision, <\/span><a href=\"https:\/\/www.movidius.com\/news\/movidius-strikes-deal-with-hikvision-to-bring-artificial-intelligence-to-in\"><span style=\"font-weight: 400\">recently had its technology integrated into Hikvision security cameras<\/span><\/a><span style=\"font-weight: 400\"> to make video analytics possible in real time locally on the device. 
Suppliers of police body cameras promise live facial recognition \u201c<\/span><a href=\"https:\/\/www.documentcloud.org\/documents\/3679537-Taser-2017-Law-Enforcement-Technology-Report.html\"><span style=\"font-weight: 400\">to tell almost immediately if someone has an outstanding warrant against them<\/span><\/a><span style=\"font-weight: 400\">.\u201d And it\u2019s only a matter of time before facial recognition systems are part and parcel of every retail experience. <\/span><a href=\"http:\/\/pdfpiw.uspto.gov\/.piw?PageNum=0&amp;docid=09299084&amp;IDKey=C8DE7EC2D24B&amp;HomeUrl=http:\/\/patft1.uspto.gov\/netacgi\/\"><span style=\"font-weight: 400\">Walmart was recently granted a patent<\/span><\/a><span style=\"font-weight: 400\"> to integrate a custom facial tracking system at its registers. And Amazon <\/span><a href=\"https:\/\/www.recode.net\/2018\/1\/21\/16914188\/amazon-go-grocery-convenience-store-opening-seattle-dilip-kumar\"><span style=\"font-weight: 400\">opened its first Amazon Go<\/span><\/a><span style=\"font-weight: 400\"> store this January, where an array of in-store cameras and real-time facial recognition technology takes the place of the checkout experience altogether. The opportunities for these systems to fail us escalate in sync with their deployment.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-1465 aligncenter\" src=\"https:\/\/itp.nyu.edu\/adjacent\/issue-2\/wp-content\/uploads\/sites\/7\/2018\/02\/AdjcentIllustration_FINAL-1024x576.jpg\" alt=\"\" width=\"912\" height=\"529\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400\">Far from a binary, objective practice, identity verification via facial recognition involves significant room for error. 
<\/span><a href=\"https:\/\/theintercept.com\/2016\/10\/13\/how-a-facial-recognition-mismatch-can-ruin-your-life\/\"><span style=\"font-weight: 400\">This story<\/span><\/a><span style=\"font-weight: 400\"> from 2016 about a man from Denver twice wrongly implicated because of mismatched security footage reveals the extent to which systems that involve computer modeling can produce false positives. Identifying suspects in photos is a practice known to be extremely difficult for humans, and not necessarily easier for computers. <\/span><\/p>\n<p><span style=\"font-weight: 400\">The data sets used to both train an algorithm and serve as a reference for an identity match are often incomplete and include disproportionate representations of a given population. As a reflection of the uneven makeup of their training sets, <\/span><a href=\"https:\/\/www.theatlantic.com\/technology\/archive\/2016\/04\/the-underlying-bias-of-facial-recognition-systems\/476991\/\"><span style=\"font-weight: 400\">algorithms are known to misidentify African Americans at a higher rate than Caucasians<\/span><\/a><span style=\"font-weight: 400\">. The FBI\u2019s facial recognition technology database, which was compiled via questionable if not illegal means and includes nearly half of American adults without their knowledge, <\/span><a href=\"https:\/\/oversight.house.gov\/hearing\/law-enforcements-use-facial-recognition-technology\/\"><span style=\"font-weight: 400\">was found to follow this same trend of misidentification<\/span><\/a><span style=\"font-weight: 400\">. But even when it correctly names the subject, facial recognition has nuanced and insidious overtones.<\/span><\/p>\n<p><span style=\"font-weight: 400\">When we go online, we expect that our browsing behavior will be recorded, archived, and shared. When we travel with our phones, we generally expect our location data to be tracked. 
But we don\u2019t expect our faces to be identified by countless inconspicuous imaging devices scattered throughout public spaces. And even if sophisticated facial recognition apparatuses were to correctly identify us at every capture point, they would still amass an incomplete, erroneous portrait of the individual, much as analysis of one\u2019s browsing history inevitably does.<\/span><\/p>\n<p>&nbsp;<\/p>\n<h1><span style=\"font-weight: 400\">Emotion Detection and Analytics<\/span><\/h1>\n<p><span style=\"font-weight: 400\">Not only will these cameras recognize where and when a person is present, but they will also log their emotions and supposed intentions at each capture. Emotion detection draws dramatic inferences from a photo with the aim of unveiling the inner life of a subject. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Like many other computer vision technologies, emotion detection relies on a model that is reductive, yet celebrated and normalized. The accepted standard for this software is the <\/span><a href=\"https:\/\/www.paulekman.com\/product-category\/facs\/\"><span style=\"font-weight: 400\">Facial Action Coding<\/span><\/a><span style=\"font-weight: 400\"> system, developed by psychologist Paul Ekman beginning in the 1960s and used to read human emotion via facial anatomy. It claims the entire human emotional range can be schematized into seven discrete expressions\u2014happiness, sadness, surprise, fear, anger, disgust, contempt\u2014that can be unveiled via examination of frontal snapshots. 
Ekman was commissioned by the US Department of Defense to develop this research and went on to apply it as a means for detecting deception, an application widely embraced by the law enforcement community.<a class=\"fn\" href=\"#fn-3\" name=\"fnn-3\">3<\/a> Despite its dark history and extreme simplification of complicated human qualities, the Facial Action Coding system <\/span><a href=\"https:\/\/nordicapis.com\/20-emotion-recognition-apis-that-will-leave-you-impressed-and-concerned\/\"><span style=\"font-weight: 400\">forms the basis for most services out there<\/span><\/a><span style=\"font-weight: 400\">.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Unlike facial identification, emotion detection has no definitive right or wrong. As Alessandro Vinciarelli, a researcher in the related field of <\/span><a href=\"http:\/\/www.dcs.gla.ac.uk\/~vincia\/papers\/sspsurvey.pdf\"><span style=\"font-weight: 400\">social signal processing<\/span><\/a><span style=\"font-weight: 400\">, explains, \u201cnonverbal behavior is the physical, machine-detectable evidence of social and psychological phenomena that cannot be observed directly\u201d but instead only inferred. Vinciarelli makes his inferences from physical measurements of people\u2019s faces on camera and via other sensors. Mapping anatomical features onto subjective emotional or personality traits harkens back to the centuries-old practice of physiognomy, in which measurement instruments would be used to assess one\u2019s character. Physiognomy was long ago debunked as pseudoscience and criticized for its propagation of institutional racism. Well before the aid of computational devices, the practice involved bridging a significant logic gap between the physical and psychological. The fact that a computer can scan photographic details at the pixel level and perform statistical operations does not mean it can shrink the logic gap inherent in this model. 
However, the algorithms\u2019 assumptions are upheld as credible data, and the scope of their applications is now expanding.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-1467 aligncenter\" src=\"https:\/\/itp.nyu.edu\/adjacent\/issue-2\/wp-content\/uploads\/sites\/7\/2018\/02\/AdjacentIllustration2_FINAL-1.jpg\" alt=\"\" width=\"837\" height=\"499\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400\">With ubiquitous image capture come innumerable opportunities for emotion analytics to be put to work on snapshots of citizens. The <\/span><a href=\"https:\/\/www.forbes.com\/sites\/retailwire\/2017\/07\/27\/walmarts-facial-recognition-tech-would-overstep-boundaries\/#224530bf45f8\"><span style=\"font-weight: 400\">facial recognition system Walmart intends to implement<\/span><\/a><span style=\"font-weight: 400\"> will scan shoppers\u2019 faces for signs of unhappiness. Startup <\/span><a href=\"https:\/\/www.faception.com\/\"><span style=\"font-weight: 400\">Faception<\/span><\/a><span style=\"font-weight: 400\"> targets law enforcement agencies, among other industries, claiming its personality-reading computer vision product can be used to \u201capprehend potential terrorists or criminals before they have the opportunity to do harm.\u201d Unlike a fingerprint used to match evidence to an identity, this software looks to one\u2019s <i><span style=\"font-weight: 400\">imagined<\/span><\/i> <i><span style=\"font-weight: 400\">personality<\/span><\/i><span style=\"font-weight: 400\"> as revealed by anatomical measurements in a photo to predict someone\u2019s likelihood of being a terrorist. 
A similar claim was made by researchers at Shanghai Jiao Tong University upon release of their study titled <\/span><a href=\"https:\/\/arxiv.org\/pdf\/1611.04135v1.pdf\"><i><span style=\"font-weight: 400\">Automated Inference on Criminality using Face Images<\/span><\/i><\/a><span style=\"font-weight: 400\">. The conjecture embraced by Faception and these researchers, and the inevitable discrimination that will result, are cause for great concern, especially because the very arbitrary nature of the data provides no basis for recourse.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Over time, it\u2019s likely that any one of the companies performing these analyses could begin to formulate an individualized psychometric signature based solely on a subject\u2019s expressions in a set of photos. The burgeoning trend toward emotion and personality analytics can be seen in adjacent fields such as speech recognition and natural language processing with tools like <\/span><a href=\"https:\/\/www.ibm.com\/watson\/services\/personality-insights\/\"><span style=\"font-weight: 400\">IBM Watson\u2019s Personality Insights API<\/span><\/a><span style=\"font-weight: 400\">. A subject\u2019s supposed emotional data could be traded or combined in any number of ways. And because emotion analysis is based on \u201csoft\u201d data, such profiles will become increasingly speculative with each added data point, all the while being exempt from judgment as to their accuracy or equity.<\/span><\/p>\n<p>&nbsp;<\/p>\n<h1>Predictive Vision<\/h1>\n<p><span style=\"font-weight: 400\">Finally, as we see with the nascent stages of <\/span><i><span style=\"font-weight: 400\">Tiny Video<\/span><\/i><span style=\"font-weight: 400\">, any snap of a subject can eventually be used to speculate about their future actions. 
An algorithm capable of generating video scenes of things that never happened invites exploitation by hackers and others interested in spreading misinformation at the subject\u2019s expense. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Carl Vondrick of <\/span><i><span style=\"font-weight: 400\">Tiny Video<\/span><\/i><span style=\"font-weight: 400\"> also worked on training algorithms to anticipate human gestures. <\/span><a href=\"http:\/\/news.mit.edu\/2016\/teaching-machines-to-predict-the-future-0621\"><span style=\"font-weight: 400\">His system<\/span><\/a><span style=\"font-weight: 400\"> watched hundreds of hours of television shows like <\/span><i><span style=\"font-weight: 400\">The Office<\/span><\/i><span style=\"font-weight: 400\"> and looked for innocuous, easy-to-spot gestures like a handshake, hug, or high five. It was eventually given a single frame from which to predict which gesture would occur within the next five seconds, doing so with an accuracy rate about half that of a human. In isolation, these computer-generated forecasts indicate increasing abilities for AI systems to read and understand human behavior. They also come to bear at a time when predictive vision could very well become the mantra of police forces and data scientists already captivated by the <\/span><a href=\"https:\/\/www.nij.gov\/topics\/law-enforcement\/strategies\/predictive-policing\/Pages\/welcome.aspx\"><span style=\"font-weight: 400\">promise of prediction<\/span><\/a><span style=\"font-weight: 400\">. Algorithms similar to Vondrick\u2019s could be integrated into police body cameras to analyze gestures in real time and claim to prevent crimes before they happen. 
The room for error in these predictions is vast due to the speculation on which they are founded\u2014the future is difficult to prove.<\/span><\/p>\n<p>&nbsp;<\/p>\n<h1>A New Model<\/h1>\n<p><span style=\"font-weight: 400\">Regardless of where they stand on the spectrum of abstraction, computer vision algorithms, because of the way they are wielded, often work against the subject in the frame. Facial recognition and identity verification programs, while operating with a binary relationship to accuracy\u2014a subject is either a certain person or someone else\u2014often get the wrong answer, and the stakes can include prison time for a misidentified subject. And even for subjects who won\u2019t encounter such extreme circumstances, the notion that one\u2019s face is captured and identified perpetually as one moves through space is a new, uncomfortable normal.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Meanwhile, algorithms that make more subtle inferences, like extracting emotion from a photo or anticipating what a subject will do next, are exempt from judgment over accuracy because their models and inputs are arbitrary. 
As Carl Vondrick describes <\/span><i><span style=\"font-weight: 400\">Tiny Video\u2019s<\/span><\/i><span style=\"font-weight: 400\"> results: they are <\/span><i><span style=\"font-weight: 400\">plausible<\/span><\/i><span style=\"font-weight: 400\">.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-1464 aligncenter\" src=\"https:\/\/itp.nyu.edu\/adjacent\/issue-2\/wp-content\/uploads\/sites\/7\/2018\/02\/AdjacentIllustration6_FINAL-1024x576.jpg\" alt=\"\" width=\"841\" height=\"502\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400\">Computer vision products and services are part of a market-driven system, what the <\/span><i><span style=\"font-weight: 400\">Economist<\/span><\/i><span style=\"font-weight: 400\"> calls the <\/span><a href=\"https:\/\/www.economist.com\/news\/business\/21728654-chinas-megvii-has-used-government-collected-data-lead-sector-ever-better-and-cheaper\"><span style=\"font-weight: 400\">Facial-Industrial Complex<\/span><\/a><span style=\"font-weight: 400\">. The computer vision market is expected to grow almost <\/span><a href=\"https:\/\/www.tractica.com\/newsroom\/press-releases\/computer-vision-hardware-and-software-market-to-reach-48-6-billion-by-2022\/\"><span style=\"font-weight: 400\">650% between 2015 and 2022<\/span><\/a><span style=\"font-weight: 400\">. The outputs of these omnipresent systems\u2014their presumptions about a subject\u2019s most intimate proclivities\u2014are the means of their transactions. An industry hungry to grow has no incentive to pause and consider when data stops being synonymous with truth.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Since computer vision\u2019s dawn over a half century ago, we\u2019ve seen a significant shift in the function of a computer\u2014from a machine once employed as a superfast calculator to now an interpreter of nuanced, subjective information about the way humans behave in different situations. 
Once on the hook for numeric precision and accuracy, computer programs are entering the realm of the humanities, the metric moving from a binary certainty to a soft plausibility. And without the criteria to assess them for accuracy or fairness, and in the absence of any agency or even awareness on the part of the subject, the algorithms\u2019 deductions can be wielded to tell any story their proprietors want, with no consequence. Such a scenario calls for more thoughtful discourse and regulation to protect the subject\u2019s rights to their own representation.<\/span><\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A critical look at how computer algorithms are increasingly altering our perception of truth<\/p>\n","protected":false},"author":1,"featured_media":1770,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"class_list":["post-853","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-issue-2"],"_links":{"self":[{"href":"https:\/\/itp.nyu.edu\/adjacent\/wp-json\/wp\/v2\/posts\/853"}],"collection":[{"href":"https:\/\/itp.nyu.edu\/adjacent\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/itp.nyu.edu\/adjacent\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/itp.nyu.edu\/adjacent\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/itp.nyu.edu\/adjacent\/wp-json\/wp\/v2\/comments?post=853"}],"version-history":[{"count":0,"href":"https:\/\/itp.nyu.edu\/adjacent\/wp-json\/wp\/v2\/posts\/853\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/itp.nyu.edu\/adjacent\/wp-json\/"}],"wp:attachment":[{"href":"https:\/\/itp.nyu.edu\/adjacent\/wp-json\/wp\/v2\/media?parent=853"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/itp.nyu.edu\/adjacent\/wp-json\/wp\/v2\/categories?post=853"},{"taxonomy":"post_tag
","embeddable":true,"href":"https:\/\/itp.nyu.edu\/adjacent\/wp-json\/wp\/v2\/tags?post=853"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}