Data, Bias, and ... Giraffes: AI Weirdness interview with Janelle Shane
As data-driven (driving? consuming?) AI becomes ubiquitous across the computing landscape, we're beginning to realize that it's not just doing work quietly and efficiently in some corner cubicle, a mild-mannered office colleague who does what we want even if it's not what we asked. Instead, it's the one yelling out odd and interesting "facts" every five minutes, the one with pictures of everybody else's family on its desk, the one wearing a black cape to work and pulling giraffes out of its hat. AI is downright weird.
One of the researchers at the forefront of describing and discussing this "AI weirdness" is Janelle Shane, author of the book You Look Like a Thing and I Love You and fascinating blog and Twitter posts, and proponent of a general adoption of the Plush Giraffe Perturbation test ... because AI and giraffes go together much more closely than you'd think ...
Happily, we were able to interview her for this AI-themed Penumbric.
* * *
How did you become interested in what's now "AI weirdness"?
Janelle Shane's book You Look Like a Thing and I Love You.
This was about the same time as your magazine was starting up [the original Penumbric, early 2000s], in 2002, and I was graduating from high school and starting an undergrad at Michigan State University and happened to attend a lecture by Professor Eric Goodman, who was talking about his work on machine learning algorithms and how they would solve problems in the weirdest, most unexpected, and sometimes really mysterious ways. He'd describe how they would use a genetic algorithm to try and evolve new shapes for a car bumper, so that it would crumple nicely during an accident, and the thing it would come up with would be this weird looking organic-ish structure that no human would have designed, and yet it worked and they weren't sure exactly how it worked or how to reproduce that in another design, but it was still really cool. Or there was a much less useful case where there were some people doing chemistry experiments trying to figure out the arrangements of atoms in a molecule that would be the lowest, most stable energy configuration. And then the algorithm found a really low energy, really stable configuration, but when they looked at it more closely they found that they hadn't told it that it couldn't put all of the atoms in the same singular point in space, so it basically stuffed them into a singularity or something ...
A black hole!
Yeah, exactly. I guess that's stable, but ... So it really did capture me, this sort of combination of really useful and unexpected and also hilarious and unexpected, and looking at the world very differently.
Now, I didn't do my research in that area. I'm actually working as a laser scientist right now, ... but I've always had this kind of interest.
Yeah, I think it says in your bio that you're in optics.
Yeah, I actually get to work on a bunch of cool projects. I get to kinda choose what I work on. There was one project where we were looking at building a virtual reality arena for Mantis shrimp, because they have really interesting eyes that are way different from human eyes, and see polarization and see all these colors, but it's kind of tough for us to figure out what they can see and what they can't see because we don't have screens that really tax their vision, so this was in part a project trying to come up with ways of showing polarization images to them. I've done some projects having to do with putting experiments up on the space station, so it's been fun to read about what astronauts have to keep in mind when they're using stuff in the space station, what different stuff you have to do when you're building something to operate in low gravity. For example, heat dissipation. Normally, if something's hot, hot air rises off of it and dissipates away because there are all these convective forces; it has to do with density and gravity and less dense air rising. In space you don't have the less dense air rising, so instead each hot object sort of stays in this hot envelope of stagnant air around it, so you have to have active fans to take the place of this convective cooling that no longer operates up there.
Stuff in space always acts so differently, like anything to do with sound waves. There's a lot of "noisy space" in science fiction movies that just wouldn't happen ...
Yeah, I think there's a lot of richness still to be mined out of the ways in which living in space is really weird. You get some fiction that just wants to get on with the story, and we have, OK, artificial gravity, and we have a pill we can take to counteract the effects of radiation or whatever; but if you think through, well, would things be weird if we're still dealing with this stuff? Or ideas of getting anywhere taking a really long amount of time if you can't beat the speed of light, and what does that mean if that is your limit and you can have all the technology you want, but you're still limited by the speed of light? How does that make your technological workarounds really weird, and your experience of the universe really weird?
Have you read The End of Everything by Katie Mack?
That one's good, because it focuses on various scenarios for how the universe will end, and how do we know the details of these various scenarios, and what would it be like, and how long would it take, and would it hurt? So the Big Crunch--not a great time, as it turns out [laughs].
Squeezing and squeezing. Yeah ...
Actually, before you get squeezed, according to this book, all the light that has ever been emitted, all the radiation that has ever been emitted is now compressed back down into a smaller area, and it gets hot. They can calculate the point at which it would ignite the surfaces of stars. So there's some weird times.
How did you end up with the Giraffe Perturbation?
Plush Giraffe Perturbation as it applies to this book.
Giraffes, as it turns out, are a bit of a running joke in machine learning now. People would notice that image recognition algorithms would tend to label images as giraffes that clearly were not giraffes, but it seemed to be one of the possibilities that would pop out when the algorithm really didn't know. It'd be like, well, I've seen lots of giraffes, so maybe this is a giraffe. And then the plush giraffes in particular came about because there was a paper by OpenAI where they were having a robot hand manipulate a Rubik's Cube, and the point they wanted to make was that the robot hand could deal with cubes that were slightly bigger or smaller in size, or where there were changes in texture and stuff, because normally robot hands work only under one specific condition that is only found in simulation, and as soon as it goes to the real world it's like "Agh!" So they were showing that this can handle real world cubes and slight changes in the real world cubes, and then one of the tests they did, they labeled it the "Plush Giraffe Perturbation Test," and they had a giant giraffe just gently nuzzling the cube in the robot's hand, and the robot did not drop the cube. (I don't think it made much progress on the actual puzzle that time.) And so I have now been a proponent of having every paper that comes out include a "plush giraffe perturbation test" in some regard. You've got an image recognition algorithm? Let's see if it can recognize a plush giraffe. Or you've got something that will take a video of one person dancing and use it to make a video of someone else dancing; well, can it make a giraffe dance?
I'd like to see more weird pushing of the edges of what these models can do when they're a little outside their comfort zone, because I think it's also a good illustration of, yes, this thing is really amazingly good at this specific task and this particular kind of data, but once you get it a little outside of that, you can start to see, oh yeah, this is a narrow intelligence, it can only do so much.
It's like you talk about in your book, once things get slightly more complicated than something simple, the AI gets even weirder.
And that's kind of the area I like to operate in as well. Especially when you get some of these algorithms that can generate text now that, yeah, is a grammatical sentence, and furthermore, these sentences follow from one another now--like, three years ago that was not the case, now it can do that, but the sentences can tend to be boring, as what it is trying to do is be as predictable and unremarkable as it possibly can and try and blend in to the pile of internet text it's read. So getting something that's actually interesting to read or has something new to say or puts it in some interesting way, that actually requires poking at the algorithm in such a way that it begins to act non-human again, and you can start to see this, oh, yeah, it does not understand this part of the universe.
Figure courtesy of Voracious.
Or you can ask it lots of questions. [For example, you can ask it] who the first president of the United States was. You can ask it all of these fact-based questions, and it does pretty well on these trivia-type questions--it sees a lot of those on the internet--and then you ask it, "How many eyes does a horse have?" and it will answer, "Three. Two in front and one in back." [laughs] Or it will say, "Four. Two on the outside and two on the inside." This thing is making up facts. When you ask for the right sort of facts in the right sort of way, you can see that it's not really concerned with being correct; ... it's been trained to sound correct, and that is what it's been rewarded for. So you get very well-phrased, elegant sentences and complex vocabulary all put together very well and it can be imparting utter nonsense. Like you could ask it about the evolution of whales and it would sound very good--if you knew nothing about the evolution of whales. Like, wait a second, dolphins do not live in the desert.
You can catch it out with these obvious mistakes sometimes, but most of the time it's a less obvious one, like, oh yeah, [whales evolved] 50 million years ago. But whales have not been around that long, it's more like 10 million, but you'd have to know about whales. It's made-up facts. [To the AI,] 10 million years ago is almost the same as 50 million years ago; there's only one little character different, so it is 95% correct even though it is actually quite incorrect.
So in effect, when it's doing something like that, it's just looking at characters within the sentences and how it thinks they ought to go together, and not really understanding per se what it's saying.
Yeah. It's being rewarded for getting certain things right. It's not being rewarded for getting facts correct about the universe. It's being rewarded for predicting which characters come next in a sentence, so if it's guessing most of the sentence, and it really closely matches its training data, one little character slightly different, oh yeah, great job, really well-rewarded! It's wrong, but it's rewarded for sounding right.
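The gap between sounding right and being right can be made concrete with a toy score. The sketch below is only an illustration, not how any real language model is trained or evaluated: `char_match_fraction` is a hypothetical helper that compares two equal-length strings position by position, using the two phrases from the whale example above.

```python
def char_match_fraction(predicted: str, reference: str) -> float:
    """Fraction of positions where two equal-length strings agree."""
    if len(predicted) != len(reference):
        raise ValueError("this toy score only compares equal-length strings")
    if not reference:
        return 1.0  # two empty strings agree trivially
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# The two phrases differ in a single character out of twenty.
score = char_match_fraction("50 million years ago", "10 million years ago")
print(f"{score:.0%}")  # prints "95%"
```

A score like this rewards the text for looking like the reference; it has no way to notice that the one mismatched character changes the claim by 40 million years.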
So recently I feel like the definition of "AI" has sort of changed, or perhaps the predominant definition of AI. It used to be AI was the sci-fi version where it was Roy Batty, it was some kind of "more human than human" kind of creature, or it was a Terminator if you wanted to go that way. And now, in the last 10 or 15 years, it seems to have caught on as a sort of really advanced data analytics. Do you think that that's the kind of AI we have now and that's what will predominate, or do you see any future to trying to create a human-like AI?
I would say it's weird because we have both definitions going on at the same time, and ... that can be a bit confusing when you say, "Ah yes, we have used this AI to figure out who deserves a loan from this company" or something, and if you have one picture of AI as the superhuman or human intelligence, you may say ah, it has probably made a good decision, perhaps a superior decision to that of humans, whereas the reality is more like this is a spreadsheet, and it's weighing some variables and coming up with an answer, and it may be dead wrong, and it doesn't know if it's copying biased human decisions, for example. So I think we as a society have already run into some trouble when people see "AI" and think of something that's very smart, and don't realize that this AI is the same sort of technology that's sorting spam emails or doing their voicemail transcriptions, where you can see that it kinda works but it can be buggy, and it can have systematic problems too.
So that's where we are with AI is having these two different definitions at once, but having the really simple algorithms in practice, and from what I've seen in talking to other people in the field I'm of the opinion that we're going to have the real simple AI for the foreseeable future. I think that we're not giving biological brains and human brains enough credit for just how complex they are, and we're in the habit of counting neurons and comparing those to our number of virtual neurons, while ignoring a lot of the complexity that's in human neurons. Like each individual human neuron is actually like several of our artificial neurons put together, and there's this chemical signaling that's going on at different levels, and all this other stuff that we don't even understand yet in biological brains. We've had this sort of repeated optimism, and reevaluation, and more optimism when we think about what computers are going to be able to do, and we're not great at figuring out what's going to be an easy task and what's a complex task--in part, I think, because in our society, we've got this weird valuation of, say, ah, well, chess, that is a very complex task, only done by the very well-educated complex people, so we are astounded a computer can do this, and then we think, ah, but, answering the phone, why, that is not a complex task, we don't pay people very much for that at all, so of course a computer can do that. And then, no, computers cannot answer the phone very well at all, despite there being a lot of financial incentive to design them to do that. Same thing with cleaning your house. We just sort of, kind of got vacuuming down, but you still have to rescue the Roomba from the closet or whatever. ...
We've got people looking toward that future, whether it's a few years away as some people are thinking, or many decades or many hundreds of years away, like science fiction level future, as I'm more inclined to think. Yeah, we do have people kind of looking at trying to build something more general. I think trying to, if we're talking science fiction, if we're talking morality, if we're talking about a machine intelligence that's self-aware like a human, or as intelligent as a human, I think that would be a bad thing to do, because what would be the motivation for doing that? Somebody wants to profit off of it, or study and research, which doesn't sound very fair to this person that's just been created. So I'm sort of relieved that I don't see any of this kind of intelligence on the horizon, because I do think it would be a bad thing to make.
As it is, we've got lots of biases that we see in this kind of AI that we're doing now, and that's just based on what humans are doing, so if we were to create one that was more human-like, it would probably have those biases built in, or it would have the morals or understanding of whoever built it, which might or might not be a good thing.
Yeah. If we're talking fully human level, then it would also have some capacity to learn and correct its own biases, whereas today's algorithms have no hope of doing that; they can be told to do the most ludicrous things, and won't say, hey, wait a minute ...
Yeah, that's true. So, a lot of companies kind of blur that line in their advertising, like IBM, when you see advertisements for Watson, it is always talking just like a person, or even Alexa or Siri or something like that, they always have them answering people in a completely human-like manner and they never show any mistakes happening. Do you think AI is becoming a marketing gimmick, like "our thing uses AI" even if it kinda doesn't?
Yeah, I think there are definitely some places that are taking advantage of the sort of blurry idea of what AI is, because AI sounds cool and futuristic, and it might be better than a person--[but] you know the AI might be a really simple algorithm that isn't what a computer programmer would call AI or machine learning. ... So we're definitely seeing ... we're seeing people who may be genuinely using AI but are kind of overhyping the capabilities of this AI. You'll see things in advertising copy like, "This thing is programmed to be unbiased" or "We don't tell it anything about race, so how could it possibly be biased?" Well there's a lot of ways for it to figure out how to emulate human biases without knowing race. It could know ZIP codes, for example. And so you get a lot of marketing copy that doesn't really dig into that, or makes claims about being infallible without doing the research to find out if there's bias or if there's some kind of systematic mistakes in there. You definitely see that.
There's another class of algorithms, usually these sorts of things are not released by tech companies, but you'll get some robotics company that is like, "Here is our humanoid robot that is now a citizen of Saudi Arabia, and they answer these deep questions about humanity." And if you talk to the people who work there, you can talk to writers, and they're like, "Oh yeah, this is scripted, it's a puppet, you know. There is no AI, this is just a performance puppet. Everyone knows that, right?" And I've definitely sat through lectures where people did not know this, and it is sort of surreal to see how worried they are that general AI is here already and that this puppet deserves rights, and kind of not realize that this is a game someone is playing with them.
Right. It's funny, I can think of lots of science fiction stories that do similar things. There's an anime I've watched [Macross Plus] where something's presented as a fully-fledged human-like AI, but really there's a person behind it.
Yeah, you see Wizard of Oz sort of AIs sometimes. And companies have gotten in trouble with that; it's easier than ever. From the outside, the function to have an actual AI versus connect to the cloud and hire a remote human worker to do the thing, they can look pretty similar, and you get down to one level, and this gets routed to a random human, and you get companies who are in trouble because their customers don't realize that humans are seeing the data they are feeding to the supposed AI. You get crowd-sourced workers who are saying, "I just saw so-and-so's SSN," or "I just saw someone's bank account information. I don't think we're supposed to be seeing this." So that is a problem, too.
And I think it's dangerous to blur the lines in customer service as well. That's another [area] where I see this ... you're chatting with somebody; are you chatting with a human? Are you chatting with a bot? It's not going to tell, but that kinda opens the door to people accidentally cussing out a human employee who is trying to help, and the customer thinks it's a bot and gets mad and treats it badly, and the human's going, hey, I'm a human here. So yeah, I think that blurring the lines can be a problem. It can be fun to, if you know you're interacting with something that's a bot, to sort of play around, to see what you can get it to do, so I don't think that having people be 100% serious and never mouth off to a bot is the right approach, but it should be crystal clear what is what, what's human and what's not.
So going back to the kinds of biases we put in, sometimes without even realizing it, how easily do you think that can be "fixed"?
There are a few different approaches you can use to sort of look for bias and to correct it once you find it. A lot of places don't go with Step One, which is look to see if it's there. They're like, oh, well, we didn't tell it about race or gender, so how could it be biased? And these algorithms are so sneaky about trying to figure out any information, any correlations that are going to help them imitate the humans, because that's their goal, is to imitate human decisions or to predict human decisions. So they [might] uncover some signal, like, "I don't know why, but we should always score this group of humans less than this other group of humans, and we'll really be predicting how humans behave," and they are therefore rewarded for implementing racism.
Since these algorithms are so sneaky about finding those subtle signals, you can see the unfair distributions of decisions happening in a lot of different algorithms, so one way [to deal with it] is to just kind of expect it. You're training this on human data, it's going to pick up human bad stuff too, and we need to look for it. And assume that [bias] is probably there and run a bunch of test decisions, for example, to see what it does. There are ways [to correct it], once you discover a bias, depending on what kind of a problem you are trying to solve. There may be ways, for example, to then go in and say, "OK, we've trained it on loan decisions from the human world, but we know that in a fair world these two groups would have gotten equal numbers of approvals, so we are going to edit the data so that these two groups have equal numbers of approvals, and train our algorithm on that."
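One crude way to "edit the data" along these lines is sketched below. It is an illustration under stated assumptions, not a method described in the interview: the `(group, approved)` record layout and the `equalize_approvals` helper are hypothetical, and randomly dropping approvals from the over-approved group is only one of several possible edits (relabeling or reweighting rows are others).

```python
import random
from collections import defaultdict


def equalize_approvals(records, seed=0):
    """Return a copy of `records` in which every group has the same number
    of approved rows, by randomly dropping surplus approvals.

    `records` is a list of (group, approved) pairs (an assumed layout).
    Denied rows are always kept. If some group has no approvals at all,
    every approval is dropped, which is a sign this edit is too blunt.
    """
    rng = random.Random(seed)  # seeded so the edit is reproducible
    approved_idx = defaultdict(list)
    groups = set()
    for i, (group, approved) in enumerate(records):
        groups.add(group)
        if approved:
            approved_idx[group].append(i)
    if not groups:
        return []
    # Every group is cut down to the smallest approval count.
    target = min(len(approved_idx[g]) for g in groups)
    keep = set()
    for g in groups:
        keep.update(rng.sample(approved_idx[g], target))
    return [rec for i, rec in enumerate(records) if not rec[1] or i in keep]


# Made-up historical decisions: group A approved 6 of 10, group B 3 of 10.
loans = ([("A", True)] * 6 + [("A", False)] * 4
         + [("B", True)] * 3 + [("B", False)] * 7)
balanced = equalize_approvals(loans)
```

After the edit both groups have three approvals each, and a model trained on `balanced` no longer sees one group approved twice as often as the other. Whether equal counts is the right notion of fairness depends on the problem being solved.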
There are other cases where it's a lot more difficult to tease out the subtle sources in the data. If you're thinking of hiring algorithms, for example, Amazon recently released kind of a case study of an algorithm they were working on that they ended up not rolling out because it found just so many ways to discriminate against the resumes of women. Even if they weren't telling it that this was from a female candidate, it would go in and look at extracurriculars, it would look at subtleties of word choice, and it was just really hard to stop it from copying that particular signal that it was seeing so strongly from human behavior it was trying to imitate.
Is there a way to very easily look into that black box of what goes on when it's figuring out these things?
Sometimes. It depends on the algorithm. There may be a way to get some kind of intermediate decisions output, but in a lot of cases, your main source of feedback is just test it on a whole bunch of real-world scenarios and see if there are any trends. So for a resume sorting algorithm, for example, give it resumes of candidates from different backgrounds and see if there are trends that shouldn't be there and how it's rating those resumes.
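That kind of black-box test can be sketched in a few lines. Everything here is hypothetical and chosen only for illustration: the resume fields, the group labels, and `toy_scorer`, a deliberately biased stand-in for whatever model is being audited. Only `mean_score_by_group` is the audit itself; it needs nothing but the model's outputs.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_group(candidates, score_fn):
    """Average the model's score for each group of test candidates."""
    scores = defaultdict(list)
    for cand in candidates:
        scores[cand["group"]].append(score_fn(cand))
    return {group: mean(vals) for group, vals in scores.items()}


def toy_scorer(resume):
    """Stand-in for the model under test (hypothetical): it has quietly
    learned to penalize one extracurricular."""
    score = float(resume["years_experience"])
    if "softball" in resume["text"]:
        score -= 2.0
    return score


# Matched test resumes: same experience in each group, different wording.
resumes = [
    {"group": "A", "years_experience": 5, "text": "captain of the chess club"},
    {"group": "A", "years_experience": 3, "text": "volunteer tutor"},
    {"group": "B", "years_experience": 5, "text": "captain of the softball team"},
    {"group": "B", "years_experience": 3, "text": "volunteer tutor"},
]
averages = mean_score_by_group(resumes, toy_scorer)
gap = averages["A"] - averages["B"]
print(averages, gap)
```

Because the two groups were built to be equally qualified, any nonzero `gap` is a trend that shouldn't be there and a reason to look closer at what the scorer is keying on.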
What do you think the future of the AI we're working on now is? Will there just be deeper and deeper neural networks as computers become more powerful, or do you think there are different kinds of machine learning that are going to become more important?
I think what we'll be seeing as the most useful implementations, and what are becoming more ubiquitous, are the kinds of AI applications that are really specialized and really narrow. In fact we're seeing them already in our cell phones. You look at your cell phone one day and now it can do voice transcription. Well, in the past you used to have a human do that to do a decent job and now, the computer's not perfect, but you can kind of get the gist, and maybe you don't have to listen to the voice mail yourself, and that, I think, is going to get better and better. Or our phone cameras are going to get better and better; a lot of the cameras in our phones right now, they are a lot better than they have any right to be based on the optics themselves, but we have a lot of software that knows what a good photo should look like, and will alter your photo so that it is focused in the right spot, or so that it has the right sort of balanced lighting, and this is all machine learning doing the heavy lifting. So I think you'll be able to pretty soon hold up your phone to a scene and just kind of wave it around and it will say, "Oh, I think I know what you were trying to take a picture of, and I have taken the liberty of composing your shot, all in the rule of thirds, with a nice foreground interest. Or you didn't have enough foreground interest, so I've taken the liberty of Photoshopping in a few artfully placed rocks so as to draw the eye ..."
We've added a giraffe to this image ...
[laughs] Yeah, we've added giraffes. Oh, I would totally buy a spontaneous giraffe filter for my phone.
Google's already part of the way to this. If you do a Google search for a random animal, and it's like, oh you're on your phone. Would you like to have that animal projected into your room with you, and yes, I actually would like to see a tiger standing next to me, that would be cool. It would be cool to see that, or like, ancient animals, or ancient plants and stuff. ... They could do more and more of this. And each one of these is going to be a really specialized sort of algorithm. So you have one algorithm figuring out where your room is, and how everything's laid out in 3D, so it can figure out where to put the tiger, or will a brontosaurus fit, and if not, what are we going to do here? And we have other algorithms figuring out lighting ... and so there's a lot of individual algorithms that are going to be individually getting better and better. AI is still, at the end of the day, a narrow intelligence, narrow artificial intelligence. It's best at narrow problems. So I don't think we're going to have AI solving any big problems, like content moderation, or how do we use AI to fix society's problems. No, I can get AI to add sparkles to my face or something, or maybe compute me a better route, or make certain routine tasks a lot easier. But yeah, we're going to have a better toolbox, but we're still going to need human supervision.
So there won't really be a sort of master algorithm that can look at all the other sub-algorithms and coordinate them or something?
Yeah, I think that'll be hard to build.
So I was going to ask you if AI could write passable short stories and things like that, but I think we've kind of already answered that ...
It depends on your definition of passable. Like if you want something to generate a really boring conversation or like an extended fight scene, as long as you don't mind that, like, Captain Kirk might show up and make a cameo, or that it's not going to sound anything like your voice, or ... It's just tough to control right now. You can get a short story structure out right now, but it's not very interesting or fun to read. So I really do think the sort of area that shows the most promise is the cracks where the text does not appear very human-like, but is weird in interesting and unexpected ways.
Janelle Shane writes about AI weirdness at aiweirdness.com and on Twitter at @JanelleCShane. Some of her most popular experiments include computer programs that try to invent recipes, paint colors, cat names, and candy heart messages. Her book, You Look Like a Thing and I Love You, is widely available (and excellent reading!).