AI Basics · Part 3

Generative AI, Mapped Out — Text, Images, Video, and Audio: What Exists and What to Use

May 18, 2026 · AI Note Lab

The five branches of generative AI and their flagship services

"I hear AI can draw pictures and make videos now — but what do I actually use, and where?" It's the question I get asked most often. In this post, I'll draw a full map of generative AI, organized by the kind of content it creates, along with the leading services in each field.

① Text Generation — The Most Widely Used Field

This is the field that writes, summarizes, translates, and even codes. The tools people casually call "AI chatbots" all live here.

Service | Made by | What stands out
ChatGPT | OpenAI | The most famous, with the broadest user base and a wide range of features
Claude | Anthropic | Strong at handling long documents and writing naturally
Gemini | Google | Strong integration with Google Search, Gmail, and Docs

Use cases: drafting emails, summarizing reports, translating, outlining blog posts, writing code. If you work an office job, this is where you'll feel the impact most.

② Image Generation — Painting with Sentences

Type a sentence (a prompt) like "a cat walking along a beach at sunset, watercolor style," and it produces the picture for you.

Use cases: blog illustrations, presentation visuals, concept sketches, logo ideas. For commercial use, always check each service's licensing policy first.

③ Video Generation — The Fastest-Moving Field

Feed it a sentence or an image and it produces a short video. This field has advanced faster than any other over the past year or two, led by OpenAI's Sora, Google's Veo, and Runway. For now it's better suited to clips of a few seconds to a few dozen seconds than to long-form video, but real-world use has already begun in ads, music videos, and product prototypes.

④ Voice and Music Generation

Use cases: YouTube narration, podcasts, background music. That said, voice cloning raises real abuse concerns (voice-phishing scams and the like), and regulators around the world are actively debating how to handle it.

⑤ Code Generation — The Developer's New Colleague

This field writes, fixes, and explains programming code. GitHub Copilot, Claude Code, and Cursor are the flagship tools. Things have progressed to the point where you can describe what you want in plain words and have the tool build a working app, even with little coding knowledge. These days that approach even has a name: vibe coding.

Not sure where to start? I recommend this order: ① text (the free tier of ChatGPT or Claude) → ② images (the image generator built into ChatGPT). If you pick ChatGPT, a single account lets you try both fields.

Today's Takeaways

In the next post, we'll cover the skill that determines output quality no matter which generative AI you use: writing prompts.
