In this course, you will investigate the internal workings of transformer-based language models by exploring how embeddings, attention, and model architecture shape textual outputs. You'll begin by building a neural search engine that retrieves documents through vector similarity, then move on to extracting token-level representations and visualizing attention patterns across different layers and heads.

As you progress, you will analyze how tokens interact with each other in a large language model (LLM), compare encoder-based architectures with decoder-based architectures, and trace how a single word's meaning can shift from input to output. By mastering techniques like plotting similarity matrices and identifying key influencers in the attention process, you'll gain insights that enable you to decode model behaviors and apply advanced strategies for more accurate, context-aware text generation.

Before taking this course, you must have completed the following courses or have equivalent experience:

  • LLM Tools, Platforms, and Prompts
  • Language Models and Next-Word Prediction
  • Fine-Tuning LLMs
  • Language Models and Language Data
 

How It Works

Course Length
2 weeks

Effort
6 to 8 hours of study per week

Format
100% online, instructor-led

Who Should Enroll
  • Engineers
  • Developers
  • Analysts
  • Data scientists
  • AI engineers
  • Entrepreneurs
  • Data journalists
  • Product managers
  • Researchers
  • Policymakers
  • Legal professionals

Get It Done 100% Online
Our programs are expressly designed to fit the lives of busy professionals like you.

Learn From Cornell's Top Minds
Courses are personally developed by faculty experts to help you gain today's most in-demand skills.

Power Your Career
Cornell's internationally recognized standard of excellence can set you apart.
