Overview and Courses

From data to decision, R is quickly becoming one of the most popular and effective programming languages of data science.

In this program, you’ll apply data science tools to the collection of data and the translation of data into information, constructing models that can be used to address the questions that you're investigating. You’ll have the opportunity to apply data analytics as a four-part process: gathering data, looking for patterns in that data, finding insights in any patterns you discover, and using those insights to make decisions. This process does not make decisions for you, but it will help you to better understand the effects of the decisions you might make. By examining real-world data sets and different modeling techniques, and by taking an in-depth look at how the programming language R can help you find patterns and derive insights, you will gain valuable experience working in each stage of the data analytics process. That experience will help you and your organization make better decisions and gain a sound scientific understanding of why you are making them.

To be successful in this program, you will need experience in at least one programming language, knowledge of basic probability and statistics concepts, and college-level calculus.

The courses in this certificate program must be completed in the order in which they appear.

Understanding Data Analytics

By some estimates, 90% of the data that has ever existed has been created in the last two years. This is a staggering figure and has given rise to new challenges and opportunities in almost every industry: what kind of data do you need to collect to compete, and how can you make sense of it once you have collected it? As technology evolves and the volume of data increases, how can you make the best use of all this information? How can you use the data to help drive your decision-making? How can you make data work for you? How can you ensure your data accurately reflects the population in which you're interested?

In this course, you will determine the types of engineering and business questions you can answer, the kinds of problems you can solve, and the decisions you can make, all through using data analytics. You will explore best practices for collecting information so that you can make informed predictions, develop insights, and better inform organizational decision-making. You will also explore best practices for sampling and examine how different types of sampling are each suited for different situations. Finally, you will see real-world examples that demonstrate how these practices work, apply some of the concepts to your own work, and practice sampling techniques in case study scenarios.

Course Start Dates

Apr 22, 2026
Jul 1, 2026
Sep 9, 2026
Nov 18, 2026

Finding Patterns in Data Using Association Rules, PCA, and Factor Analysis

Visualization is one of the simplest and most effective ways to find patterns in data. It can help answer questions such as: What is the general range and shape of the data set? Are there any clusters of observations? Which variables correlate with each other? Are there any obvious outliers?

As your data set grows in terms of the number of data points and variables, however, it becomes increasingly difficult to visualize all this information at once. At most, you can plot data points on three-dimensional axes and add further distinctions of size, color, shape, and so on. Yet this can easily become too busy and difficult to read. How, then, do we find patterns in really big data sets?

In this course, you will explore several powerful and commonly utilized techniques for distilling patterns from data. You will implement each of these techniques using the free and open-source statistical programming language R with real-world data sets. The focus will be on making these methods accessible for you in your own work.

You are required to have completed the following course or have equivalent experience before taking this course:

Understanding Data Analytics

Course Start Dates

Feb 25, 2026
May 6, 2026
Jul 15, 2026
Sep 23, 2026
Dec 2, 2026

Finding Patterns in Data Using Cluster and Hotspot Analysis

When you have large groups of objects, it is often helpful to split them into meaningful groups or clusters. One example of this would be to identify different types of customers so that a company can more efficiently route their calls to a helpline. As a second example, suppose an automobile manufacturer wanted to segment its market to target its ads more carefully. One approach might be to take a database of recent car sales, including the social demographics associated with each customer, and segment the population purchasing each type of automobile into meaningful groups.

Specialized approaches exist if your data contains information that relates to time and geography. You can use this additional information to identify geographical and temporal hotspots. Hotspots are regions of high activity or a high value of a particular variable. These results can help you focus your attention on a particular region where a problem is occurring more than usual, such as the incidence of asthma in a large city. In both cluster and hotspot analysis, the results can help you discover new and interesting features, problems, and red flags regarding the data being analyzed.

In this course, you will explore several powerful and commonly utilized techniques for performing both cluster and hotspot analysis. You will implement these techniques using the free and open-source statistical programming language R with real-world data sets. The focus will be on making these methods accessible and applicable to your work.

You are required to have completed the following courses or have equivalent experience before taking this course:

Understanding Data Analytics
Finding Patterns in Data Using Association Rules, PCA, and Factor Analysis

Course Start Dates

Mar 11, 2026
May 20, 2026
Jul 29, 2026
Oct 7, 2026
Dec 16, 2026

Regression Analysis and Discrete Choice Models

A story can play an important role in understanding data. It can help distill complex information into something manageable: something we can think about easily, relate to, and use to make decisions. For many problems that we encounter globally, however, a story that describes what already happened is not precise enough for the job we want to perform. Often, we would like to use available data to make numerically accurate predictions about what might happen in the future. This task requires the construction of mathematical models that are well suited to our real-world problems.

In this course, you will explore several types of statistical models used with data to make predictions. These models bring with them a whole batch of important concerns, such as estimation and validation, that make the entire process both an art and a science. You will implement each of these techniques using the free and open-source statistical programming language R with real-world data sets. The focus will be on making these methods accessible for you in your own work.

You are required to have completed the following courses or have equivalent experience before taking this course:

Understanding Data Analytics
Finding Patterns in Data Using Association Rules, PCA, and Factor Analysis
Finding Patterns in Data Using Cluster and Hotspot Analysis

Course Start Dates

Mar 25, 2026
Jun 3, 2026
Aug 12, 2026
Oct 21, 2026
Dec 30, 2026

Supervised Learning Techniques

Supervised learning is a general term for any machine learning technique that attempts to discover the relationship between a data set and some associated labels for prediction. In regression, the labels are continuous numbers. This course will focus on classification, where the labels are taken from a finite set of numbers or characters. The prototypical and perhaps most well-known example of classification is image recognition. The goal is to take an image (represented by its pixel values) and determine what objects are in the image. Is it a dog? A grapefruit? A stop sign?

There are many practical classification tasks, such as determining whether an individual's financial history makes them high risk for a loan, whether there is a defect in a material based on some sensor readings, or whether a new email is spam or not. These problems share the same basic form and can be solved with many different types of mathematical, statistical, and probabilistic models developed by the machine learning community.

In this course, you will explore several powerful and commonly utilized techniques for supervised learning. You will implement each of these techniques using the free and open-source statistical programming language R with real-world data sets. The focus will be on making these methods accessible for you in your own work.

You are required to have completed the following courses or have equivalent experience before taking this course:

Understanding Data Analytics
Finding Patterns in Data Using Association Rules, PCA, and Factor Analysis
Finding Patterns in Data Using Cluster and Hotspot Analysis
Regression Analysis and Discrete Choice Models

Course Start Dates

Apr 8, 2026
Jun 17, 2026
Aug 26, 2026
Nov 4, 2026

Neural Networks and Machine Learning

Neural networks, a nonlinear supervised learning modeling tool, have become hugely popular within the last two decades because they have been successfully applied to a wide range of problems, including automatic language processing, image classification, object detection, speech recognition, and pattern recognition. They are mathematical models loosely based on an analogy to the interconnected neurons in the brain. They take in a vector or matrix of input data and output either a classification value or an approximation to a functional value. The beauty is that the relationships between the inputs and outputs can be highly nonlinear and complex.

In this course, you will explore the mechanics of neural networks and the intricacies involved in fitting them to data for prediction. Using packages in the free and open-source statistical programming language R with real-world data sets, you will implement these techniques. The focus will be on making these methods accessible for you in your own work.

You are required to have completed the following courses or have equivalent experience before taking this course:

Understanding Data Analytics
Finding Patterns in Data Using Association Rules, PCA, and Factor Analysis
Finding Patterns in Data Using Cluster and Hotspot Analysis
Regression Analysis and Discrete Choice Models
Supervised Learning Techniques

Course Start Dates

Apr 22, 2026
Jul 1, 2026
Sep 9, 2026
Nov 18, 2026

How It Works

Format

All Online

Time Commitment

4 months with 8 to 10 hours of study per week

Learn From Top Minds

Courses are developed by Cornell faculty.

Power Your Career

Gain today’s most in-demand skills to stand apart.

Flexibility Fits Your Life

Learn on your schedule without stepping out of your job.

Small-class Experience

Participate in facilitated discussions and live sessions with industry peers.

Real-world Projects

Apply learnings and insights to your work to make an impact right away.

Personalized Feedback

Enjoy meaningful feedback on assignments from expert facilitators.

Key Course Takeaways

Explore the data analytics process and examine the tools available to improve decision making
Use unsupervised learning techniques to help identify patterns in data and create visualizations to better spot those patterns
Categorize data using supervised learning algorithms
Predict the value of continuous variables with linear regression
Use neural networks to make predictions about new data
Make forecasts from data collected over time and measure their accuracy

“I like to think outside of the box, and this program from eCornell helped me conceptualize how I want to approach data problems going forward. I was able to actually apply new course concepts to my work, rather than simply repeat steps with different values.”

- Mark T.

What You'll Earn

Data Science Certificate from Cornell Duffield College of Engineering
120 Professional Development Hours (12 CEUs)
Offset 2 Credit Hours when you apply to the Master of Engineering in Engineering Management (M. Eng.) degree in Cornell’s College of Engineering

Who Should Enroll

Current and aspiring data scientists
Analysts
Engineers
Researchers
Technical managers

Select Payment Method	Cost
Determine Your Own Course Schedule	$3,750
Learn and Pay as You Go

Explore Related Programs

Machine Learning

AI Law and Policy

Generative AI for Productivity

Marketing AI

AI Strategy

AI in Finance

Agentic AI Architecture

Cybersecurity and AI Strategy

Designing and Building AI Solutions

AI in Hospitality

Precision Nutrition and AI

Applied Machine Learning and AI

AI 360

AI in Healthcare

Natural Language Processing With Python

NLP for Finance

AI Skills for Workplace Communications

Text Analysis

Large Language Model Fundamentals

Address:	950 Danby Rd.
	Suite 150
	Ithaca, NY 14850