Video: What Does A.I. Have To Do With This Selfie?
[MUSIC PLAYING] NAT: So if you're using the internet these days, you've probably seen people posting these photos and videos that look like works of art by famous artists. Seeing this stuff made me want to know how this is happening. So I read these research papers, talked to some people at Google also doing research in this field, and this is what I found out. The way that photo filters typically work is you're just adding a new layer on top of your image that might have some new colors or new textures in it. But these artsy filters are so much more amazing and so much more complicated than that.
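To make that concrete, here's a minimal sketch, in Python with NumPy, of what a conventional filter amounts to: a per-pixel blend of your photo with an overlay of new colors or textures. The `apply_filter` name, the toy image, and the alpha value are my own illustration, not any particular app's code.

```python
import numpy as np

def apply_filter(photo, overlay, alpha=0.4):
    """Blend a color/texture overlay onto a photo, the way a
    conventional (non-AI) filter works: no understanding of the
    image, just per-pixel arithmetic."""
    blended = (1 - alpha) * photo + alpha * overlay
    return np.clip(blended, 0, 255).astype(np.uint8)

# A toy 2x2 RGB "photo" of flat gray pixels.
photo = np.full((2, 2, 3), 100, dtype=np.float64)

# A warm orange overlay layer.
overlay = np.zeros((2, 2, 3), dtype=np.float64)
overlay[..., 0] = 255  # red channel
overlay[..., 1] = 128  # green channel

filtered = apply_filter(photo, overlay)
print(filtered[0, 0])  # every pixel shifts toward the overlay color
```

The point is that nothing here looks at what is actually in the photo, which is exactly the limitation style transfer gets past.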
In fact, they're not even filters. What's happening is something called style transfer. When you use one of these apps to make your photo look like Van Gogh's "Starry Night," what's happening is a machine-learning algorithm is taking your photo, separating out its content, and at the same time taking "Starry Night" and separating out its style, and then making a brand-new digital image that's a combination of these two things. How does it do this exactly? How does it know what's style in one image and content in another? The short answer is deep neural networks. Deep neural networks are based very roughly off of how our own brains work. Our brains have individual neurons arranged in layers, and deep neural networks have little mathematical functions arranged in hot dogs.
I was just seeing if you were paying attention there. They are also arranged in layers. And when these layers of math neurons work together, they can do some pretty cool stuff– like recognize speech, and know what this handwriting says, and find all the dog photos I've ever taken. And about a year ago, people started experimenting with using the kind of neural nets that can recognize dogs, and sunglasses, and pizzas, and tons of other objects to make style transfer images like these. This inspired lots of other coders and artists to try out their own experiments– like this one from our friends at work that uses 3D imagery from Google Maps and this one from a few research scientists at Google that works with a webcam and combines styles from multiple paintings. So depending on what app you're using or what experiment results you're looking at, the details of how they do their thing are going to be different. But in general, here's how style transfer works– which means more talk about deep neural nets.
First, a deep neural net is called "deep" because it has many different layers of neurons, and each layer is good at detecting different features and patterns. So for example, when a neural net looks at this photo of my dog and tries to figure out what it is, a layer at the bottom might just be able to recognize edges. The next layer up might be good at figuring out where those edges meet to form corners or curves, and maybe the next layer, basic shapes, because as you move up through the layers, they become better at recognizing more sophisticated things. So the very top layers might be able to detect things like whether there's a dog nose or a dog face anywhere in this image. What features the different layers detect isn't something that's hard-coded in by an engineer.
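As a rough illustration of what one of those bottom, edge-recognizing layers does, here's a toy sketch of the convolution operation a layer of neurons applies. The hand-written kernel here is my own stand-in; in a real network, the kernel values aren't written by hand at all, they're learned.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (strictly, cross-correlation): the
    basic operation one layer of neurons applies to its input."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny image: dark on the left, bright on the right.
image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# A hand-written vertical-edge kernel (for illustration only).
edge_kernel = np.array([[-1.0, 1.0]])

response = conv2d(image, edge_kernel)
print(response)  # the strongest response sits right at the dark-to-bright edge
```

Stacking many of these layers, with learned kernels and nonlinearities in between, is what lets the upper layers respond to corners, shapes, and eventually dog noses.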
It's just that the neural nets have learned to detect them based on lots of examples. So if I wanted to make that same photo have the style of "White Zig Zag," I pass it through the neural net, and the features the higher layers detect are going to be best at representing its content. I can do the same thing with that piece of art. And the lower and middle layers are going to be better at representing its style, because the features they pick up on look more similar to things like brush strokes, colors, and textures. Then, to take that style and combine it with this content, there are different approaches. But one way is to take a random, pixely white-noise image and keep adjusting the pixels until you can pass it through the same neural net and get lower-level features similar to the style image and higher-level features similar to the content image. But of course, when you're using one of these apps or someone's experiment online, you upload a photo, pick a style, and give the math anywhere from a few nanoseconds to a few hours to do its thing. And then, voila, style has been transferred.
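That adjust-the-noise loop can be sketched with toy linear "layers" standing in for a real network. Everything below is a made-up stand-in for illustration: the random matrices play the role of the lower (style) and higher (content) layers, and a real network would be deep, convolutional, and nonlinear. But the loop is the same idea: start from noise, measure how far the image's low-level features are from the style targets and its high-level features from the content targets, and nudge the pixels downhill.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 16  # size of a tiny flattened "image"

# Stand-ins for a trained network's layers: fixed random linear maps.
W_low = rng.normal(size=(8, D))   # plays the lower (style) layers
W_high = rng.normal(size=(8, D))  # plays the higher (content) layers

style = rng.normal(size=D)    # stands in for the painting
content = rng.normal(size=D)  # stands in for the photo

target_low = W_low @ style      # style features to match
target_high = W_high @ content  # content features to match

def loss(x):
    """How far x's features are from the style and content targets."""
    return (np.sum((W_low @ x - target_low) ** 2)
            + np.sum((W_high @ x - target_high) ** 2))

# Start from random "white noise," as described above.
x = rng.normal(size=D)
start = loss(x)

# Step size chosen from the loss's curvature so plain gradient
# descent stays stable.
L = 2 * (np.linalg.norm(W_low, 2) ** 2 + np.linalg.norm(W_high, 2) ** 2)
lr = 1.0 / L

for _ in range(500):
    grad = (2 * W_low.T @ (W_low @ x - target_low)
            + 2 * W_high.T @ (W_high @ x - target_high))
    x -= lr * grad

print(f"loss: {start:.1f} -> {loss(x):.3f}")  # the mismatch shrinks
```

The real thing optimizes hundreds of thousands of pixels against deep convolutional features, which is why it can take anywhere from moments to hours.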
I am very curious to know what you think about style transfer and what you think it could be used for in the future. So please leave a comment below with your thoughts, and thanks for watching. [MUSIC PLAYING].