I remember being in my CMPUT 466 Machine Learning class in Fall of 2011, when the prof started explaining deep learning. For a brief shining moment it felt like I had understood how deep learning worked… and then the math and the understanding largely abandoned me. Despite getting an A- in that course, I never felt confident in the area.
My interests of course were drawn elsewhere, but I had many opportunities to explore machine learning in various forms. Fairly consistently, though, where the opportunity was machine learning, I turned it down. For whatever reason, deep learning and its applications never spoke to me, and never really attracted me. Much of it felt like smoke and mirrors -- part of this was watching so many projects consume large amounts of resources, only to fail to find deployment. The places where it seemed to work never stood out to me. I am absolutely certain that to practitioners deep learning models felt revolutionary, but I didn't see it myself and so didn't feel compelled to pay attention.
ChatGPT, DALL-E and Midjourney have forced me to acknowledge: We have crossed some sort of Rubicon with these large model technologies. I no longer have the option of ignoring them.
Yet, despite knowing that I have to pay attention… I have struggled mightily to form coherent thoughts here: since I haven't paid attention, I feel a bit like Rip van Winkle, awaking after twenty years into a future I barely understand. There are so many dimensions here that it's hard to figure out what to think on any of them -- certainly there are many ways in which the dimensions cross.
I want to write down some thoughts about all of this (like every other nerd on the internet), so expect a few blog posts on this subject over a little while. Expect me to alternate hugely between wonder and loathing. Talking this over with a friend, one thing stood out hugely in our conversation: You can draw wildly different conclusions about this technology depending on whether or not you start from the presumption of capitalism. This is probably true of all automation technology, but it's pretty clear already that the image generation technology is going to put some artists out of work, and in a world where these artists need to make art to eat, that's an upsetting outcome.
I have much to wrestle with, and it's challenging to sort through my thoughts on this. I think the best way for me to organize myself is to divide this initial thinking into two pieces: First, I will cover image generation using tools like Stable Diffusion, DALL-E and Midjourney. Next time I will write about ChatGPT.