Explore types of generative AI models like GANs, variational autoencoders, and transformers, and learn about the different types of content they're capable of generating.
![[Featured Image] A person sits at a desk and works on their laptop using generative AI to create content for their website.](https://d3njjcbhbojbot.cloudfront.net/api/utilities/v1/imageproxy/https://images.ctfassets.net/wp1lcwdav1p1/7yiPgoWB1FzwsBl34YJLN9/d90eff87ebb9f53905a1c8f095be9a8c/GettyImages-1513343950.jpg?w=1500&h=680&q=60&fit=fill&f=faces&fm=jpg&fl=progressive&auto=format%2Ccompress&dpr=1&w=1000)
There are two ways to think about the different types of generative AI: by the type of model and by the type of content each one can generate.
Generative AI models each employ a distinct architectural approach to achieve the same goal: generating original content.
The type of generative AI model you use will depend on which type of content you want to generate.
Explore the diverse types of genAI models and the types of content they can produce. Afterward, strengthen your genAI knowledge with Google Cloud's Generative AI Leader Professional Certificate, where you'll learn how to identify the core layers of the genAI landscape: infrastructure, models, platforms, agents, and applications.
A genAI model is any AI system that can create or generate new content. The term is a broad category, with each specific model representing a unique technical approach. Four common types of generative AI models include:
A generative adversarial network (GAN) consists of two dueling neural networks: the generator and the discriminator. The generator aims to create fake content indistinguishable from the training data, while the discriminator tries to detect the fakes. The two networks go back and forth until the generator wins, meaning the discriminator can no longer tell the difference between a generated item and the training data.
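To make that back-and-forth concrete, here is a minimal Python sketch. The single-number "generator" value `g` and the distance-based `discriminator` are hypothetical stand-ins for the neural networks a real GAN trains with gradient descent:

```python
import random

random.seed(0)

# "Training data": real samples clustered around 5.0.
real_samples = [random.gauss(5.0, 0.1) for _ in range(100)]
real_mean = sum(real_samples) / len(real_samples)

# Hypothetical discriminator: flags a sample as fake when it sits
# far from the real data. A real discriminator is a neural network.
def discriminator(x, threshold=0.5):
    return abs(x - real_mean) > threshold  # True means "looks fake"

# Hypothetical generator: a single number that nudges itself toward
# whatever fools the discriminator (real GANs use gradient descent).
g = 0.0
step = 0.1
while discriminator(g):
    g += step if g < real_mean else -step

# Training stops once the discriminator can't tell real from fake.
fooled = not discriminator(g)
```

In an actual GAN, both networks update together and "winning" is measured with a loss function rather than a fixed distance threshold, but the adversarial loop has the same shape.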
A variational autoencoder (VAE) encodes or compresses data into a simplified version that contains the most critical elements and omits details. After encoding, the decoder rebuilds the original data set, generating new details around the most essential elements. This introduces a bit of randomness that helps create unique items or variations on the original input.
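As a rough illustration, the toy Python sketch below compresses a small data set to just two numbers (its mean and spread), then rebuilds it with a dash of randomness. Real VAEs learn the encoder and decoder as neural networks; these two functions are simplified stand-ins:

```python
import random
import statistics

random.seed(42)

# Hypothetical "encoder": compress a data set down to its most
# essential structure -- here, just its mean and spread.
def encode(data):
    return statistics.mean(data), statistics.stdev(data)

# Hypothetical "decoder": rebuild a data set from that compressed
# code by sampling around the mean. The randomness is what lets a
# VAE produce variations rather than exact copies.
def decode(mean, spread, n):
    return [random.gauss(mean, spread) for _ in range(n)]

original = [4.8, 5.1, 5.0, 4.9, 5.2]
mean, spread = encode(original)
reconstruction = decode(mean, spread, 5)
```

The reconstruction resembles the original without matching it exactly, which mirrors how a VAE generates unique variations on its input.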
A transformer model is a deep learning model that understands text by breaking it down into tokens, which are small components of text that can contain a character, a part of a word, or a short phrase. The model then converts the tokens into numerical vectors and analyzes how they relate to each other. Transformers also have a self-attention mechanism that enables them to understand the relative importance of certain words in a sentence compared to others. Large language models (LLMs), like ChatGPT, use transformer-based models (GPT stands for “generative pre-trained transformer”).
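Here is a toy Python sketch of those two ideas: splitting text into tokens and scoring how strongly each token relates to another. The two-dimensional vectors are invented for illustration; real transformers learn much larger embeddings and compare separate query and key projections:

```python
import math

# Hypothetical toy vectors for each token; real models learn
# thousand-dimensional embeddings from data.
embeddings = {
    "the": [0.1, 0.0],
    "cat": [0.9, 0.3],
    "sat": [0.4, 0.8],
}

def tokenize(text):
    # Real tokenizers split into subword pieces, not whole words.
    return text.lower().split()

def attention_weights(query, tokens):
    # Score each token by its dot product with the query vector,
    # then normalize with softmax -- the core of self-attention.
    q = embeddings[query]
    scores = [sum(a * b for a, b in zip(q, embeddings[t])) for t in tokens]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

tokens = tokenize("the cat sat")
weights = attention_weights("cat", tokens)
```

The weights sum to 1, and the token most similar to the query gets the largest share, which is how attention decides which words matter most to each other.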
Diffusion models are generative AI models that add noise (random sets of data points) to the input to distort it, learn how that process alters the data, and then reverse the diffusion to rebuild a version of the original input. This process helps the model understand how the data's elements relate to one another. After training, the model applies the patterns it learned from its training materials to generate content that meets the request in your prompt.
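The forward-and-reverse process can be sketched in a few lines of Python. Note that the reverse pass below "cheats" by replaying the recorded noise exactly; a trained diffusion model instead learns to predict the noise at each step:

```python
import random

random.seed(7)

# Forward diffusion: repeatedly add small amounts of noise until
# the original signal is buried.
def add_noise(x, steps, scale=0.5):
    history = [x]
    for _ in range(steps):
        x = x + random.gauss(0.0, scale)
        history.append(x)
    return history

# Reverse diffusion: strip the noise back out, step by step. Here we
# subtract the exact recorded noise; a real model learns to predict it.
def remove_noise(history):
    x = history[-1]
    for i in range(len(history) - 1, 0, -1):
        noise = history[i] - history[i - 1]
        x = x - noise
    return x

original = 5.0
noisy_history = add_noise(original, steps=10)
recovered = remove_noise(noisy_history)
```

Because the model learns what each denoising step should look like, it can later start from pure noise and "reverse" its way to brand-new content.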
Beyond the common types outlined above, two additional genAI model types require a deeper understanding of machine learning principles:
A flow-based model learns to transform simple, random data into complex, realistic content through a series of reversible steps. Think of it like following a recipe that you can also run in reverse—the model learns each step of transforming random noise into a final image or piece of content. Because each step is reversible, it can also work backwards from the final result to understand exactly how it was created. This reversibility enables flow-based models to generate new content while also calculating the exact likelihood of that content occurring, making them useful for applications where understanding probability is crucial.
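The "reversible recipe" idea can be shown with a single hypothetical invertible step, f(x) = 2x + 1, standing in for the stack of learned reversible layers in a real flow-based model:

```python
# One reversible transformation; real flows chain many learned
# invertible layers like this one.
def forward(x):
    return 2.0 * x + 1.0

def inverse(y):
    return (y - 1.0) / 2.0  # exact reverse of every forward step

noise = 3.0                  # start from simple random input
sample = forward(noise)      # transform it into "content"
recovered = inverse(sample)  # run the recipe backwards
```

Because every step is invertible with a known slope (2.0 here), the model can run backwards exactly and, via the change-of-variables formula, compute the precise likelihood of any output it produces.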
An autoregressive model generates content one piece at a time, using what it has already created to decide what comes next. Similar to how you might write a story by looking at the previous sentence to determine the next word, these models build content sequentially, with each new element depending on all the elements that came before it. Many popular AI systems employ autoregressive generation—for example, ChatGPT generates responses one word at a time—and some image generators create pictures one pixel or section at a time, always considering what they've already generated.
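A tiny bigram model makes the one-piece-at-a-time idea concrete. The nine-word corpus below is an invented toy example; each new word is chosen by looking only at what came before it, which is the defining trait of autoregressive generation:

```python
import random

random.seed(1)

# Toy training text; a real model learns from billions of words.
corpus = "the cat sat on the mat the cat ran".split()

# Build a bigram table: for each word, record which words followed it.
table = {}
for prev, nxt in zip(corpus, corpus[1:]):
    table.setdefault(prev, []).append(nxt)

def generate(start, length):
    # Each new word depends on the word just generated -- the model
    # conditions on its own previous output.
    out = [start]
    for _ in range(length - 1):
        options = table.get(out[-1])
        if not options:
            break  # no known continuation; stop early
        out.append(random.choice(options))
    return out

sentence = generate("the", 5)
```

Modern LLMs do the same thing at vastly larger scale, conditioning on the entire preceding context rather than just the last word.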
Another way to think about types of generative AI is to consider the main task a model can accomplish. Some generative AI models can generate multiple types of content, while others are limited in what they can provide or excel only in specific use cases. Industries apply these capabilities across many different applications.
Some of the types of content genAI can generate include:
Text generation: Generative AI can help you produce text explaining concepts or answering questions. It can also create drafts of documents or assemble pieces of documents, like outlines or citations. Examples of models you might use to create text include ChatGPT, Google Gemini, and Perplexity AI.
Image generation: You can use generative AI to create, alter, and produce images in various styles. You can even prompt AI to employ a specific medium, such as a sketch, a photorealistic image, or an oil painting. You may use an AI model like DALL-E 2 from OpenAI, Midjourney, or Stable Diffusion from Stability AI to create images.
Music and audio generation: You can use generative AI to create audio data and music. Just as with text or images, generative AI can detect and understand the patterns in music and create similar, original works. You can also use generative AI to make an audio speech file based on a text prompt. Two examples of models you may use for this task are Aiva AI and Soundful.
Video generation: Generative AI can create videos, animations, and special effects. The functionality of the AI you use depends on what material was used during training. So, for example, an AI trained in video editing could help you add special effects, while other algorithms can create original videos. Examples of video-generation AI apps include Canva, DeepAI, and Invideo.
Code generation: You can use generative AI to create code from scratch or to autocomplete code as you write it. The documentation for many programming languages is available online, making it possible for developers to use this data to train AI models that are fluent in these languages (similar to how AI models can learn multiple human languages). Examples of AI apps you may use to generate code include Google’s Gemini, Vertex AI, and Sonar.
Learn more: 7 Generative AI Use Cases
Whether you want to develop a new skill, get comfortable with an in-demand technology, or advance your abilities, keep growing with a Coursera Plus subscription. You’ll get access to over 10,000 flexible courses from over 350 top universities and companies.
Editorial Team
Coursera’s editorial team comprises highly experienced professional editors, writers, and fact...
This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.