Coursera
Analyze & Deploy Scalable LLM Architectures

Instructor: LearningMate

Included with Coursera Plus

Gain insight into a topic and learn the fundamentals.
Intermediate level

2 hours to complete
Flexible schedule
Learn at your own pace

There are 3 modules in this course

This module establishes the foundational mindset that "performance lives in the pipeline." Learners will discover that a large language model (LLM) application is a multi-stage system where overall speed is dictated by the slowest component. They will learn to deconstruct a complex Retrieval-Augmented Generation (RAG) architecture, trace a user request through it, and use system diagrams to form an evidence-based hypothesis about the primary performance bottleneck.
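As a rough illustration of forming a bottleneck hypothesis, the sketch below ranks the stages of one RAG request by their share of end-to-end latency. The stage names and millisecond figures are hypothetical placeholders, not data from the course.

```python
# Hypothetical per-stage latencies (milliseconds) for a single RAG request.
# Stage names and numbers are illustrative assumptions, not course data.
stage_latency_ms = {
    "embed_query": 20,
    "vector_search": 120,
    "rerank": 60,
    "llm_generate": 800,
}


def find_bottleneck(latencies):
    """Return (stage, share_of_total) for the slowest stage."""
    total = sum(latencies.values())
    slowest = max(latencies, key=latencies.get)
    return slowest, latencies[slowest] / total


stage, share = find_bottleneck(stage_latency_ms)
print(f"{stage} accounts for {share:.0%} of end-to-end latency")
```

With these made-up numbers, the generation stage dominates, so it would be the first candidate to investigate before tuning anything else.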

What's included

2 videos, 1 reading, 2 assignments

In this module, learners move from hypothesis to evidence. They will learn to use system logging and profiling data to quantify the precise latency contribution of each stage in an LLM pipeline. The focus is on designing small, reversible, and hypothesis-driven experiments to prove or disprove their initial findings and distinguish a performance bottleneck's root cause from its symptoms.
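One way to quantify each stage's latency contribution is to wrap every stage in a timer and aggregate the samples. The sketch below is a minimal version of that idea using only the Python standard library; the `timed` helper, the `handle_request` function, and the `time.sleep` calls standing in for retrieval and generation are all hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Collected wall-clock samples (seconds), keyed by stage name.
timings = defaultdict(list)


@contextmanager
def timed(stage):
    """Record how long the wrapped block takes under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage].append(time.perf_counter() - start)


def handle_request(query):
    # The sleeps are stand-ins for real work (assumed, not from the course).
    with timed("retrieve"):
        time.sleep(0.01)
    with timed("generate"):
        time.sleep(0.03)
    return "answer"


for _ in range(3):
    handle_request("example query")

# Mean latency per stage across the sampled requests.
report = {stage: sum(samples) / len(samples) for stage, samples in timings.items()}
for stage, mean_s in sorted(report.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{stage}: {mean_s * 1000:.1f} ms")
```

Because the timer is a single small wrapper, it is easy to add before an experiment and remove afterwards, which fits the module's emphasis on small, reversible changes.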

What's included

1 video, 2 readings, 2 assignments

This module bridges the gap between a working prototype and a resilient, production-ready service. Learners will design and manage declarative deployments using Helm and Kubernetes, package a multi-component RAG stack, and implement Horizontal Pod Autoscaling (HPA) for dynamic, cost-efficient scaling. They will also master the critical operational skills of performing controlled, zero-downtime rollouts and rapid rollbacks.
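To make the autoscaling behavior concrete, the sketch below reproduces the core replica calculation described in the Kubernetes Horizontal Pod Autoscaler documentation: the desired count is the current count scaled by the ratio of the observed metric to its target, rounded up. It is a simplification that ignores the controller's tolerance band, stabilization windows, and pod readiness handling; the function name and the example bounds are assumptions chosen for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Approximate the HPA scaling rule, clamped to the configured bounds.

    Simplified: the real controller also applies a tolerance band and
    stabilization behavior that are not modeled here.
    """
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 4 pods at 90% average CPU against a 60% target: scale out.
print(desired_replicas(4, 90, 60))
# 4 pods at 30% average CPU against a 60% target: scale in.
print(desired_replicas(4, 30, 60))
# A large spike is capped at max_replicas.
print(desired_replicas(4, 300, 60))
```

The clamp to `min_replicas` and `max_replicas` is what keeps scaling cost-bounded: load spikes cannot push the deployment past the ceiling set in the autoscaler spec.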

What's included

2 videos, 2 readings, 2 assignments

Instructor

LearningMate
Coursera
78 courses, 1,109 learners

Offered by

Coursera

¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.