Data Science and Statistics

Acquire the skills necessary to analyze, interpret, and exploit massive amounts of data. Through the lenses of statistics, machine learning, and stochastic modeling, learn how to draw strong inferences about the world around us.

Core courses

Data Science and Statistics (Computational Sciences Major)

In their second year, Computational Sciences majors enroll in core courses that provide the foundation for the Computational Sciences concentrations. They also take electives from core courses offered in other majors.

CS110 / Problem Solving with Data Structures and Algorithms

Apply core concepts in design and analysis of algorithms, data structures, and computational problem-solving techniques to address complex problems. Hashing, searching, sorting, tree algorithms, dynamic programming, greedy algorithms, divide and conquer, backtracking, random number generation, and randomized algorithms are examples of algorithms you will learn to exploit to solve problems ranging from logistics to route optimization to DNA sequencing.

Prerequisites: CS51 / Formal Analyses; CS51B / Programming


CS111 / Single and Multivariable Calculus

Learn to use principles of single and multivariable calculus to solve relevant problems from across STEM. Traditional calculus courses focus on the techniques needed to perform complex computations by hand, and evaluate students primarily on their ability to do so quickly. This course takes a different approach, shifting the focus to applying foundational calculus concepts to analyze and solve problems in practical contexts while building the facility to take full advantage of technologies such as Sage to perform complex computations. In addition to honing skills in critical and creative thinking, the course emphasizes effective collaborative problem-solving and communication of technical processes and results to appropriate audiences. Note: This course was previously CS111A.

Prerequisite: CS51 / Formal Analyses


CS113 / Theory and Applications of Linear Algebra

This course develops the tools necessary for the analysis of linear systems. The emphasis is on both abstract notions, such as vector spaces, linear maps between them, and their matrix representations, and concrete applications, such as Markov chains and graphical network analysis. Students apply their knowledge to explore a wide variety of problems such as PageRank, least squares fitting, and traffic modeling. Note: This course was previously CS111B. In addition to the listed prerequisites, the following course is recommended prior to taking this course: CS111.

Prerequisite: CS51 / Formal Analyses


CS114 / Probability and Statistics and the Structure of Randomness

When can you find patterns in seemingly random noise? Or determine when an observed pattern is likely due to chance? This course focuses on the concepts from probability and statistics used to extract meaning from data. In addition to building a strong theoretical foundation, students learn how to apply these tools to understand real-world scenarios. Formal topics include sample spaces, conditional probability and independence, Bayes’ theorem, discrete and continuous random variables, joint distributions, the law of large numbers, and the central limit theorem, among others. These techniques are then used in applications such as statistical learning, linear regression, simulation, maximum likelihood, and least squares.

Prerequisite: CS111 / Single and Multivariable Calculus


Concentration courses

In their third year, Computational Sciences majors select a concentration, begin taking courses within it, and begin work on their capstone courses. They also take electives chosen from other Minerva courses (other concentration courses in Computational Sciences, and core and concentration courses in other colleges). Computational Sciences offers the concentrations shown in the table below.

In the fourth year, Computational Sciences majors enroll in additional electives chosen from Minerva’s course offerings within or outside the major. Additionally, they take senior tutorials in the major, and finish their capstone courses.

CS146 / Computational Methods for Bayesian Statistics

Learn to apply Bayesian inference, the mathematical framework for using observed data to update what we know about a system. The course proceeds from the fundamentals of probability theory and Bayesian inference to the data modeling process, covering real-world scenarios from sports, medicine, vehicle tracking, the social sciences, and more. The second half of the course covers approximate methods for automating inference: variational inference (approximations using functions) and Monte Carlo methods (approximations using random samples). These methods allow us to work with large models containing many unknown variables and with large data sets.



CS156 / Finding Patterns in Data with Machine Learning

Students learn to apply core machine learning techniques (such as classification, perceptrons, neural networks, support vector machines, hidden Markov models, and nonparametric models of clustering) as well as fundamental concepts such as feature selection, cross-validation, and overfitting. Students program machine learning algorithms to make sense of a wide range of data, such as genetic data, customer segmentation data, or data used to predict the outcome of elections. Note: In addition to the listed prerequisites, the following course is recommended prior to taking this course: CS110.

Prerequisites: CS113 / Theory and Applications of Linear Algebra; CS114 / Probability and Statistics and the Structure of Randomness


CS166 / Modeling and Analysis of Complex Systems

Learn how to apply advanced modeling techniques to analyze and predict the behavior of social, physical, and economic systems. You will learn from specific examples drawn from portfolio management, traffic flow management, and social network analysis. The course covers three modeling frameworks: cellular automata for modeling interactions on grids of cells, networks for more general interactions between nodes in a graph, and Monte Carlo simulations, which show how we can use simulation to generate random numbers and how we can use random numbers to drive simulations of complex phenomena. The course covers the theoretical (mathematical) and practical (implementation) aspects of each of the three frameworks.

Prerequisites: CS110 / Problem Solving with Data Structures and Algorithms; CS114 / Probability and Statistics and the Structure of Randomness; CS130 / Statistical Modeling: Prediction and Causal Inference