What is Generative AI?

In previous courses, you learned what AI is and how to use it in everyday life. Now we're diving into one of the most exciting subfields: generative AI, artificial intelligence that creates new content. Generative AI can produce text, images, videos, music, code, and 3D models, and the quality has reached levels that were unthinkable just a few years ago.

But what exactly does "generative" mean? And how does generative AI differ from other forms of artificial intelligence? In this lesson, you'll get the overview you need to fully understand the following lessons in this course.

What Does "Generative" Mean?

The word "generative" comes from the Latin "generare": to produce, to create. Generative AI refers to AI systems that can create new, original content that didn't exist in the training data. Unlike analytical AI, which classifies existing data or recognizes patterns, generative AI creates something new. When you ask ChatGPT to write a story or have Midjourney generate an image, you're working with generative AI.

Generative vs. Discriminative Models

To truly understand generative AI, it helps to draw a distinction. In AI research, we differentiate between two fundamental model types:

Discriminative Models

Analyze and classify existing data. They answer the question: "What is this?"

Examples:

  • Spam filter: Is this email spam or not?
  • Image recognition: Is there a cat or a dog in the photo?
  • Credit assessment: Is this application high-risk?
  • Medical diagnostics: Does the scan show an anomaly?

Discriminative models make decisions about existing data but do not generate new content.

Generative Models

Create new data that resembles the training data. They answer the question: "What could exist?"

Examples:

  • ChatGPT/Claude: Generates new text that sounds human
  • DALL-E/Midjourney: Generates new images from text descriptions
  • Sora: Generates new videos from prompts
  • Suno/Udio: Generates new music pieces

Generative models learn the structure and patterns of training data and can create something new from them.
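The contrast between the two model types can be sketched in a few lines of code. The example below is purely illustrative: a toy rule-based spam check stands in for a discriminative model, and a tiny character-bigram model stands in for a generative one. The word lists and rules are invented for this sketch and bear no resemblance to production systems.

```python
import random

# Discriminative: decides "what is this?" about existing data.
def is_spam(email: str) -> bool:
    spam_words = {"winner", "free", "prize"}  # illustrative rule, not a real filter
    return any(word in email.lower() for word in spam_words)

# Generative: learns which character follows which, then samples new data.
def train_bigrams(words):
    table = {}
    for word in words:
        # ^ marks the start of a word, $ marks the end.
        for a, b in zip("^" + word, word + "$"):
            table.setdefault(a, []).append(b)
    return table

def generate(table, max_len=10):
    out, ch = "", "^"
    while len(out) < max_len:
        ch = random.choice(table[ch])  # sample the next character
        if ch == "$":
            break
        out += ch
    return out

random.seed(0)
table = train_bigrams(["luna", "lara", "nora", "lana"])
print(is_spam("You are a WINNER of a free prize!"))  # True
print(generate(table))  # a new, name-like string that was not in the training data
```

The generative half captures the core idea in miniature: the model learns the statistical structure of its training data (which character follows which) and then samples something new that resembles, but need not repeat, what it has seen.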

What Content Can Generative AI Create?

The range of generative AI today is enormous. Here are the most important content types that AI systems can produce in 2026:

Text Generation

Large Language Models (LLMs) like GPT-4o, Claude, Gemini, or Llama generate all kinds of text: articles, emails, summaries, translations, creative writing, analyses. Quality in 2026 is so high that AI-generated text is often indistinguishable from human-written content.

Status 2026: Texts up to several thousand words in consistent quality, support for over 100 languages, contextual memory across long conversations.

Image Generation

Models like DALL-E 3, Midjourney v7, Stable Diffusion 3, and Flux generate photorealistic images, illustrations, art, and designs from text descriptions. From product photos to artistic paintings, the quality now surpasses many professional stock photos.

Status 2026: Photorealism at professional level, consistent characters, reliable text rendering in images, resolutions up to 4K+.

Video Generation

Text-to-video models like Sora, Runway Gen-3, Kling, and Pika can generate short video clips from text descriptions. Development has been rapid: In 2023, results were barely usable; in 2026, they're production-ready for many use cases.

Status 2026: Clips up to 60 seconds in HD, improved physical consistency, basic camera control, early progress on longer narratives.

Audio and Music Generation

AI can generate speech, music, and sound effects. Text-to-speech (TTS) models like ElevenLabs produce natural-sounding voices. Music generators like Suno and Udio create complete songs with vocals in various genres.

Status 2026: TTS nearly indistinguishable from real voices, personalized voice clones, complete songs with arrangements and vocals in studio quality.

Code Generation

AI assistants like GitHub Copilot, Claude Code, and Cursor write functional code in dozens of programming languages. They can generate entire functions, tests, documentation, and even complete small applications.

Status 2026: Reliable generation of functions and classes, context-aware suggestions, automated bug fixing, agent-based development of entire features.

3D Generation

Newer models can generate 3D objects and scenes from text or images. A game-changer for game development, architecture, and product design. Examples: Meshy, Tripo, Luma AI.

Status 2026: Usable 3D models from single images or text descriptions, automatic texturing, early progress on animated 3D characters.

How Does Generation Work? (Simplified)

Without diving too deep into the mathematics: generative models learn the statistical structure of their training data during training. They build an internal model – a so-called latent space – that maps the essential features and relationships of the data in compressed form.

When you then enter a prompt, the model navigates this latent space and samples (selects) new data points that match your request. For language models, this means the next token is sampled, one at a time, from a probability distribution over possible continuations. For image models, an image is gradually refined out of random noise. Because sampling is probabilistic, the result differs slightly each time, which is why identical prompts can produce different outputs.
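The token-by-token loop at the heart of a language model can be sketched in miniature. The code below assumes a made-up four-word vocabulary with invented scores (logits); real models score tens of thousands of tokens at every step, but the sampling idea is the same, including the "temperature" knob that controls how adventurous the choices are.

```python
import math
import random

def sample_next(logits: dict, temperature: float = 1.0) -> str:
    """Turn raw scores into probabilities (softmax), then sample one token."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    m = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - m) for tok, s in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    # Sample: usually the likely token, occasionally a less likely one.
    r = random.random()
    cumulative = 0.0
    for tok, p in probs.items():
        cumulative += p
        if r < cumulative:
            return tok
    return tok  # fallback for floating-point rounding

# Hypothetical scores for the word after "The cat sat on the".
logits = {"mat": 3.0, "sofa": 2.0, "roof": 1.0, "moon": -1.0}
random.seed(42)
print([sample_next(logits, temperature=0.8) for _ in range(5)])
```

Run it a few times without the fixed seed and the list changes: "mat" dominates, but "sofa" or "roof" appear now and then. Lower the temperature toward zero and the model almost always picks the top-scoring token; raise it and the output becomes more varied. That is precisely why identical prompts yield different results.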

Milestones of Generative AI

The development has been exponentially fast. Here are the most important milestones:

  • 2020 – GPT-3: OpenAI releases GPT-3 with 175 billion parameters. For the first time, machines can generate coherent, convincing text. The world is amazed, but access is limited.
  • 2021 – DALL-E & Codex: AI generates usable images from text for the first time. Codex (basis for GitHub Copilot) shows that AI can also write code.
  • 2022 – The Breakthrough Year: Stable Diffusion makes image generation open-source. Midjourney delivers artistic images. DALL-E 2 goes public. And in November 2022, ChatGPT arrives and changes everything. 100 million users in 2 months.
  • 2023 – GPT-4 & Multimodality: GPT-4 shows massively improved capabilities. Claude 2 and Gemini enter the arena. Models become multimodal (text + image).
  • 2024 – Video & Audio: Sora is announced, music generators like Suno gain popularity. Open-source models (Llama, Mixtral) catch up to the leaders.
  • 2025–2026 – Agents & Integration: AI agents can independently complete multi-step tasks. Generative AI is integrated into operating systems, office suites, and industry-specific software. The generative AI market exceeds $100 billion.

Economic Significance

Generative AI is no toy. It's transforming entire industries. McKinsey estimates the economic value of generative AI at $2.6 to $4.4 trillion per year. Most affected: marketing and creative industries, software development, customer service, education, healthcare, and legal consulting. Companies that adopt generative AI early and wisely gain a measurable competitive advantage.

Quick check: What is the core difference between generative and discriminative AI models?

Generative models create new data (text, images, videos), while discriminative models analyze existing data and sort it into categories (e.g., detecting spam, categorizing images). Both can use deep learning and exist in various sizes; the core difference lies in their function.
Key Takeaways:
  • Generative AI creates new, original content: text, images, videos, audio, code, and 3D models.
  • Unlike discriminative models (which classify data), generative models create something new from learned patterns.
  • Generation works through a latent space where the model samples new data points that match your prompt.
  • Development has been explosive: from GPT-3 (2020) to ChatGPT (2022) to AI agents and integration into everyday software (2026).
  • The economic impact is enormous: generative AI is fundamentally transforming industries like marketing, software, education, and healthcare.