BTS on GPT-3: How does it actually work? Why all the AI hype?
Have you heard the hype but don't really know what's going on behind the scenes? Maybe you've played around with the Playground or the API, or just seen funny snippets of "AI generated Seinfeld" on Twitter. Whatever your background, I hope I can shed some light on what's actually going on, how it differs from what came before, and whether any of it matters for you.
This is a talk I gave in the summer of 2020 to a small group of enthusiasts, investors, and founders, shortly after GPT-3 first came out. It goes into the nuts and bolts of what GPT-3 is derived from, how it differs from its predecessors, and, finally, what it means for AI. It's designed to be accessible to engineers, scientists, artists, savvy investors, and anyone in between. The more recent work on DALL-E still relies on the same building blocks, so this should remain relevant and informative for some time.
For background, I worked at OpenAI back in 2016 and contributed to the line of work that, four years later, eventually became GPT-3. Since then, I've remained active in the research community, but have shifted mostly to computer vision work.