CS

Data Science and Statistics

Acquire the skills necessary to analyze, interpret, and exploit massive amounts of data. Through the lenses of statistics, machine learning, and stochastic modeling, learn how to draw strong inferences about the world around us.

Core courses

Data Science and Statistics (Computational Sciences Minor)

CS110 / Problem Solving with Data Structures and Algorithms

Apply core concepts in design and analysis of algorithms, data structures, and computational problem-solving techniques to address complex problems. Hashing, searching, sorting, tree algorithms, dynamic programming, greedy algorithms, divide and conquer, backtracking, random number generation, and randomized algorithms are examples of algorithms you will learn to exploit to solve problems ranging from logistics to route optimization to DNA sequencing.

Prerequisite: CS51 / Formal Analyses; CS51B / Programming

Corequisite:

CS111 / Single and Multivariable Calculus

Learn to use principles of single and multivariable calculus to solve relevant problems from across STEM. Traditional calculus courses focus on the techniques needed to perform complex computations by hand, and evaluate students primarily on their ability to do so quickly. This course takes a different approach by shifting the focus to applying foundational calculus concepts to analyze and solve problems in practical contexts while building the facility to take full advantage of technologies such as Sage to perform complex computations. In addition to honing skills in critical and creative thinking, an emphasis is placed on effective collaborative problem-solving and communication of technical processes and results to appropriate audiences. Note: This course was previously CS111A.

Prerequisite: CS51 / Formal Analyses

Corequisite:

CS113 / Theory and Applications of Linear Algebra

This course develops the tools necessary for the analysis of linear systems. The emphasis is on both abstract notions, such as vector spaces, linear maps between them, and their matrix representations, and concrete applications, such as Markov chains and graphical network analysis. Students apply their knowledge to explore a wide variety of problems such as PageRank, least squares fitting, and traffic modeling. Note: This course was previously CS111B. In addition to the listed prerequisites, the following course is recommended prior to taking this course: CS111.

Prerequisite: CS51 / Formal Analyses

Corequisite:

Concentration Courses

Data Science and Statistics (Computational Sciences Minor)

CS130 / Statistical Modeling: Prediction and Causal Inference

The course focuses on the application of predictive and causal statistical inference for decision making across a wide range of scenarios and contexts. The first part of the course focuses on parametric and non-parametric predictive modeling (regression, cross-validation, bootstrapping, random forests, etc.). The second part of the course focuses on causal inference in randomized controlled trials and observational studies (statistical matching, synthetic control methods, encouragement design/instrumental variables, regression discontinuity design, etc.). Technical aspects of the course focus on computational approaches and real-world challenges, drawing cases from the life sciences, public policy and political science, education, and business. This course also emphasizes the importance of being able to articulate one’s findings effectively and tailor methodology and policy/decision-relevant recommendations for different audiences. Note: CS130 may be substituted for a tutorial in CS/NS/SS and can count like a cross-listed tutorial for double majors in SS, NS, or CS (any pairwise combination). This course was previously CS112.

Prerequisite: CS51 / Formal Analyses

Corequisite:

CS146 / Computational Methods for Bayesian Statistics

Learn to apply Bayesian inference, the mathematical framework for using observed data to update the information we have about a system. The course proceeds from the fundamentals of probability theory and Bayesian inference to the data modeling process, covering various real-world scenarios from sports, medicine, vehicle tracking, social sciences, and more. The second half of the course covers approximate methods for automating inference in the form of variational inference (approximations using functions) and Monte Carlo methods (approximations using random samples). These methods allow us to work with large models containing many unknown variables and large data sets.

Prerequisite:

Corequisite:

CS156 / Finding Patterns in Data with Machine Learning

Students learn to apply core machine learning techniques (such as classification, perceptrons, neural networks, support vector machines, hidden Markov models, and nonparametric models of clustering) as well as fundamental concepts such as feature selection, cross-validation, and overfitting. Students program machine learning algorithms to make sense of a wide range of data, such as genetic data, data used to perform customer segmentation, or data used to predict the outcome of elections. Note: In addition to the listed prerequisites, the following course is recommended prior to taking this course: CS110.

Prerequisite: CS113 / Theory and Applications of Linear Algebra; CS114 / Probability and Statistics and the Structure of Randomness

Corequisite:

CS166 / Modeling and Analysis of Complex Systems

Learn how to apply advanced modeling techniques to analyze and predict the behavior of social, physical and economic systems. You will learn from specific examples applied to portfolio management, traffic flow management, and analyzing social networks. The course covers three modeling frameworks — cellular automata for modeling interactions on grids of cells, networks for more general interactions between nodes in a graph, and Monte Carlo simulations showing how we can use simulation to generate random numbers and how we can use random numbers to drive simulations of complex phenomena. The course covers the theoretical (mathematical) and practical (implementation) aspects of each of the three frameworks.

Prerequisite: CS110 / Problem Solving with Data Structures and Algorithms; CS114 / Probability and Statistics and the Structure of Randomness; CS130 / Statistical Modeling: Prediction and Causal Inference

Corequisite: