I created "Face un-recognition" in SFPC's Machine Language class, in response to the class prompt asking how we might vary a particular example of machine language.
Face un-recognition explores the idea of facial recognition as a negation of the camera, rather than an addition to it or the purpose of it. Facial recognition technology is used to track and surveil people at ever finer granularity, but could it instead be used to set boundaries on the camera and on surveillance? What is the point of such a camera? Who would use it?
Face un-recognition also plays on the idea that in order for a face to be tracked and surveilled, it must first be recognized as a face. In Machine Language, one of my classmates, Blair Johnson, explored the question of "what makes a face?" This tool invites the user to explore that same question through distortion and re-making of the face. At what point does a face become "un-recognized", to you and to the camera? When does the camera misidentify a face? What kind of face doesn't look like a face?
To me, the point of such a camera is all about control: the feeling that the camera is subordinate to you, the person being “watched”, rather than the other way around. In class, Xizi Hua showed us a home security camera that learns to recognize different faces and detects only new faces as “intruders.” I wonder if a camera like the one here could be used in home security, one that automatically turns off when you enter your home and switches back on when you leave, in an attempt to gain security without the privacy invasion of constantly being watched. Would this give me solace enough to install a security camera inside my home?
While pondering this camera, I thought about a possible feature: if there were a face in the frame but it wasn’t detected, you could click a button that says “There’s a face in this frame!” and draw a square around the face. This new data could be incorporated into the algorithm to make it “better” at detecting faces. But again I think about how that might be used against the original intentions: why would someone try to make this camera better? So that it could turn off more of the time? This leads me to think that the less a technology infers about who its user is and what they are trying to do, the more control is given back to the user. Manual intervention inherently shifts the power away from the machine and back to the human. For example, I could imagine this camera switching on every 5 seconds to check if a face is in the frame again (indeed, this is what I started with), but it turns out that the camera that gives the most control is also the one that has the smallest feature set: a simple on/off button for the user to decide.
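That whole "feature set" is small enough to fit in a few lines. Here is a minimal sketch of what the button might look like; the element ids ("webcam", "toggle") and the button labels are hypothetical, not the exact code of this page:

```js
// A sketch of the one manual control this camera offers: an on/off button.
// Element ids are assumed, not taken from the actual page.
const video = document.getElementById('webcam');
const toggle = document.getElementById('toggle');
let cameraOn = true;

toggle.addEventListener('click', () => {
  cameraOn = !cameraOn;                                     // the human decides, not the machine
  video.style.visibility = cameraOn ? 'visible' : 'hidden';
  toggle.textContent = cameraOn ? 'Turn camera off' : 'Turn camera on';
});
```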
This project was created using opencv.js, with a Haar feature-based classifier. Given a window of grayscale pixels, a Haar feature refers to two adjacent rectangles with a particular difference in intensity (i.e. brightness). In images of faces, these features tend to appear in specific regions, for example around the eyes, eyebrows, cheeks, nose, and lips. The algorithm used on this page is a Haar-based Cascade Classifier. Cascade Classifiers are a class of machine learning algorithm typically trained on ~hundreds of positive examples (the algorithm returns "true") and an arbitrary number of negative examples (the algorithm returns "false") to derive the specific parameters at which a new input would return true or false; the "cascade" arranges the learned features in stages, so most non-face windows are rejected after only a few cheap checks. Data used for various Haar-based classifiers, including the one used on this page, can be found here.
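For a concrete picture, here is a minimal sketch of such a detection loop in opencv.js. It assumes the cascade XML has already been written into Emscripten's virtual filesystem (e.g. with cv.FS_createDataFile) and that a playing &lt;video&gt; element fed by getUserMedia exists; the element id and the "turn off on detection" wiring are assumptions for illustration, not this page's exact code:

```js
// Hedged sketch of a Haar-cascade detection loop with opencv.js.
// Assumes "haarcascade_frontalface_default.xml" is already in the
// Emscripten virtual filesystem and #webcam is a playing <video> element.
const video = document.getElementById('webcam');
const src = new cv.Mat(video.height, video.width, cv.CV_8UC4); // RGBA frame buffer
const gray = new cv.Mat();
const faces = new cv.RectVector();
const classifier = new cv.CascadeClassifier();
classifier.load('haarcascade_frontalface_default.xml');
const cap = new cv.VideoCapture(video);

function tick() {
  cap.read(src);                                        // grab the current frame
  cv.cvtColor(src, gray, cv.COLOR_RGBA2GRAY, 0);        // Haar features work on grayscale
  classifier.detectMultiScale(gray, faces, 1.1, 3, 0);  // scaleFactor, minNeighbors, flags
  // The inversion: a detected face switches the feed off,
  // rather than drawing a tracking box around it.
  video.style.visibility = faces.size() > 0 ? 'hidden' : 'visible';
  requestAnimationFrame(tick);
}
requestAnimationFrame(tick);
```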
I love this example because it almost perfectly demonstrates how machines (computers) work in ways that humans don't, and a way in which they often go awry. To this little program, despite any amount of theorizing I may do in the description, "what makes a face" is very simple: a certain number of pixels in a certain range of patterns within a 2D image. The "number" and "patterns" emerge not naturally, but from manually labelling a narrow set of 2D (frontal) human faces, yet they can be applied to an image of any individual. You'll notice that in this experiment, the side of a face is not a face, because the labelled images from which this algorithm was developed did not include such examples. And yet, when running this code on a 30 fps live stream, it almost feels as if the machine could, maybe, make a face.
All of the code can be viewed by inspecting the page source in the browser, and also here.