Welcome to DATA 110!

Originally developed with Dr. Rick Marks (F24), then modified with Dr. Can Chen and Dr. Chudi Zhong (S25).

The Introduction to Data Science course is a broad, high-level survey of the major aspects of data science, including ethics, best practices in communication (e.g., data visualization), mathematical/statistical concepts, and computational thinking. Students will gain an understanding of the fundamentals of data science to support the more in-depth, advanced coursework required for the BA and BS in Data Science. The curriculum and format are designed specifically for students who are considering a major in data science and may not have taken statistics or computer science courses. The course is a requirement of the BA and BS in Data Science.


Course Schedule (Fall 25)

The schedule may change during the semester. See the syllabus on Canvas for the most up-to-date version.

Introduction to Data Science

Week Date Lec Topic(s)
1 Aug. 18 (M) 1 Syllabus, What is data science?
Aug. 20 (W) 2 Data modality, Tables
Aug. 22 (F) Lab 1

Python Programming

Week Date Lec Topic(s)
2 Aug. 25 (M) 3 What is programming? Variables, Data types
Aug. 27 (W) 4 Control, If/else, Comparisons
Aug. 29 (F) Lab 2
3 Sept. 1 (M) No Class
Sept. 3 (W) 5 List, Indexing, Slicing, For
Sept. 5 (F) Lab 3
4 Sept. 8 (M) 6 Table column operations
Sept. 10 (W) 7 Table row operations
Sept. 12 (F) Lab 4

Data Communication

Week Date Lec Topic(s)
5 Sept. 15 (M) No Class
Sept. 17 (W) 8 Line plots, Scatter plots
Sept. 19 (F) Quiz 1
6 Sept. 22 (M) 9 Bar charts, Histograms
Sept. 24 (W) 10 Guest Lecture: Billy Fryer (SMT)
Sept. 26 (F) Lab 5
7 Sept. 29 (M) 11 Summary Statistics, Boxplot

Statistics

Week Date Lec Topic(s)
Oct. 1 (W) 12 Probability, Uniform distribution
Oct. 3 (F) Lab 6
8 Oct. 6 (M) 13 Sampling
Oct. 8 (W) 14 Association, Correlation, Causality
Oct. 10 (F) Lab 7
9 Oct. 13 (M) 15 Gaussian distribution, Inference
Oct. 15 (W) 16 Hypothesis testing
Oct. 17 (F) No Class
10 Oct. 20 (M) 17 Hypothesis testing
Oct. 22 (W) Quiz 2
Oct. 24 (F) Lab 8

Prediction

Week Date Lec Topic(s)
11 Oct. 27 (M) 18 Data Science life cycle, Modeling
Oct. 29 (W) 19 Intro to ML (supervised learning)
Oct. 31 (F) Lab 9
12 Nov. 3 (M) 20 Decision trees
Nov. 5 (W) 21 Linear regression
Nov. 7 (F) Lab 10
13 Nov. 10 (M) 22 Underfitting and overfitting
Nov. 12 (W) 23 Class imbalance
Nov. 14 (F) Lab 11
14 Nov. 17 (M) 24 KNN
Nov. 19 (W) Quiz 3 Review
Nov. 21 (F) Quiz 3

Next Steps in Data Science

Week Date Lec Topic(s)
15 Nov. 24 (M) 25 Unsupervised and self-supervised learning
Nov. 26 (W) No Class
Nov. 28 (F) No Class
16 Dec. 1 (M) 26 Detecting AI, Prompt engineering
Dec. 3 (W) Project Work Time

Assignments

Weekly Lab Exercises [10%]

Lab assignments are small-group exercises intended to be completed during Friday recitations, which are led by the TAs. Submit your work individually at the end of the 50 minutes; at the latest, it must be submitted by 11:59pm the same day. The lowest of the 11 lab scores will be dropped automatically. Excused absences policy and request form: link. Do not request extensions or excused absences via email.

Homework [30%]

You may discuss problems with other students/course staff, but complete and submit independently. Due dates for every Homework assignment are provided on the course syllabus and course schedule. Unless otherwise stated, assignments are due on those days at 11:59pm. Submit them to Gradescope, which can be accessed via Canvas. Homework grades will not be dropped.

  • HW1 (5%): Introduction to Data Science
  • HW2 (5%): Introduction to Programming and Pandas
  • HW3 (5%): Data Visualization and Communication
  • HW4 (5%): Probability, Statistics, and Sampling
  • HW5 (5%): Inference & Hypothesis Testing
  • HW6 (5%): Machine Learning

You have 3 late days that you can freely use throughout the semester. You can use up to 2 days for a single HW. For example, you can use:

  • 1 late day each on HW2, HW4, and HW5, or
  • 1 late day on HW3 and 2 late days on HW6.

You do not need to request permission to use late days. They will be counted at the end of the semester from Gradescope submission timestamps. You cannot use fractions of a late day. For example, submissions that are 10 hours late and 24 hours late both count as having spent 1 late day.
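
The late-day arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the course's actual accounting: the function names are hypothetical, lateness is assumed to be measured in hours past the deadline, and the sketch only reports whether a pattern of usage fits the stated limits (3 days total, at most 2 per homework). What happens beyond those limits is not covered here.

```python
import math

TOTAL_LATE_DAYS = 3    # semester budget stated in the policy
MAX_PER_HOMEWORK = 2   # cap for a single homework stated in the policy


def late_days_used(hours_late: float) -> int:
    """Whole late days consumed by one submission; any partial day rounds up."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def check_semester(hours_late_by_hw: dict) -> tuple:
    """Return (total late days used, True if usage fits the stated limits)."""
    used = {hw: late_days_used(h) for hw, h in hours_late_by_hw.items()}
    total = sum(used.values())
    within_limits = (
        total <= TOTAL_LATE_DAYS
        and all(days <= MAX_PER_HOMEWORK for days in used.values())
    )
    return total, within_limits


print(late_days_used(10))  # 1
print(late_days_used(24))  # 1
print(late_days_used(25))  # 2
print(check_semester({"HW3": 10, "HW6": 30}))  # (3, True)
```

The last call mirrors the second example in the list above: 1 late day on HW3 and 2 late days on HW6.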

Quizzes [35%]

  • Quiz 1 (10%) will cover week 1 through week 4.
  • Quiz 2 (10%) will cover week 5 through week 9 (Gaussian distribution and inference).
  • Quiz 3 (15%) will cover week 9 through week 13.

All quizzes will be held during the usual lecture/lab time in the usual lecture/lab room. They are pen-and-paper exams. Question types include multiple choice, True/False, matching, fill-in-the-blank, and short answer (no more than a paragraph). Quiz grades will not be dropped.

Project [25%]

The presentation and peer review take place during the final exam block. The project will be assessed based on:

  1. Group project proposal
  2. Group poster presentation
  3. Group write-up and code
  4. Individual reflection and teamwork assessment
  5. Individual feedback to other teams

If a member does not contribute to the project due to lack of effort, they will receive an F on the group assignments.

Extra Credit Opportunities

At the end of the semester, you can earn up to 0.75% extra credit added to your final numerical score, which will then be used to calculate your final letter grade.

  1. If you attend at least 21 of the 26 lectures, you will receive 0.5% extra credit. Since this is extra credit and there is a generous buffer, I will not accept any non-UAA excuses or respond to such emails. If you attend 20 or fewer lectures, you will not receive any extra credit.
  2. Students who regularly and actively participate in class and lab discussions may receive up to 0.25% extra credit. This includes asking thoughtful questions during guest lectures.
  3. I do not offer individual extra credit assignments under any circumstances.
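
As a rough illustration of how the two extra-credit components add up, here is a minimal Python sketch. The function name and the `participation_credit` parameter are hypothetical, and the sketch assumes the attendance and participation components are awarded independently of each other; the instructor decides the participation amount.

```python
def extra_credit(lectures_attended: int, participation_credit: float = 0.0) -> float:
    """Extra credit in percentage points, capped at 0.75 overall."""
    # Attendance: all-or-nothing 0.5 points at 21 or more of the 26 lectures.
    attendance = 0.5 if lectures_attended >= 21 else 0.0
    # Participation: assumed to be instructor-assigned, clamped to [0, 0.25].
    participation = min(max(participation_credit, 0.0), 0.25)
    return attendance + participation


print(extra_credit(21))        # 0.5
print(extra_credit(20))        # 0.0
print(extra_credit(26, 0.25))  # 0.75
```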

Grading Policy

Numeric Grade (%) Letter Grade
94 and above A
90 – 93.9 A-
87 – 89.9 B+
83 – 86.9 B
80 – 82.9 B-
77 – 79.9 C+
73 – 76.9 C
70 – 72.9 C-
67 – 69.9 D+
60 – 66.9 D
Below 60 F
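
To estimate where you stand, the category weights listed under Assignments and the cutoffs in the table above can be combined as in the Python sketch below. It rests on several assumptions: the names are hypothetical, each category score is an already-aggregated percentage (for example, the quiz score already reflects the 10/10/15 split), cutoffs are treated as lower bounds, and no rounding is applied. The official calculation may differ.

```python
# Category weights in percentage points, taken from the Assignments section.
WEIGHTS = {"labs": 10, "homework": 30, "quizzes": 35, "project": 25}

# (lower bound, letter) pairs from the grading table, highest first.
CUTOFFS = [
    (94, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (60, "D"),
]


def final_score(category_percents: dict, extra_credit: float = 0.0) -> float:
    """Weighted average of category percentages plus any extra credit."""
    weighted = sum(WEIGHTS[name] * category_percents[name] for name in WEIGHTS) / 100
    return weighted + extra_credit


def letter_grade(score: float) -> str:
    """First cutoff the score meets or exceeds; below 60 is an F."""
    for lower_bound, letter in CUTOFFS:
        if score >= lower_bound:
            return letter
    return "F"


score = final_score({"labs": 90, "homework": 85, "quizzes": 80, "project": 95})
print(score)                # 86.25
print(letter_grade(score))  # B
```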

Course Goals & Student Learning Outcomes (SLOs)

The goal of this course is to lay the foundation for subsequent courses required for the BS and BA in Data Science, as well as to introduce core concepts and ideas in the field of data science to any student, regardless of major. The course provides a high-level survey of current and emerging concepts in key data science domains, including computational thinking, mathematics/statistical skills, data management, communication best practices, and ethics. The course will offer hands-on analysis of real-world datasets, exposing students to the type of insights and problem-solving that the field of data science can deliver.

Accessibility and Equity

Students from all backgrounds should be able to take Intro to Data Science. As such, no prerequisites in statistics or programming are required for the course; only basic high-school algebra is necessary.

Diversity

Intro to Data Science can be taken by students from any major across campus and should be acceptable as a potential prerequisite for many statistics, math, or computing majors.

Pedagogical Clarity

Intro to Data Science first teaches introductory programming, then statistics through a computational lens, and ultimately concludes with basic methods in inference.

Core Concepts and Learning Outcomes

  • Data Management: Describe differences in types of data and the ways in which individuals and organizations store, manage, and interact with data. Identify and appropriately acknowledge sources of data. Apply basic data cleaning techniques to prepare data for analysis.

  • Mathematical and Statistical Foundations: Select and use appropriate data analytics and statistical techniques to discover new relationships, deliver insights into research problems or organizational processes, and support decision-making. Draw accurate and useful conclusions from data analysis.

  • Computational Thinking: Build and understand algorithms for analyzing large data sets and for accurately modeling problems numerically.

  • Communication: Convey data analyses through written and oral communication skills as well as select the appropriate tools to visually display data.

  • Responsible Data Science: Identify security, privacy protection, governance, and ethical considerations in data management. Differentiate between ethical and unethical uses of data science.


This course is designed to meet the general education requirement of Quantitative Reasoning. Below are the corresponding learning outcomes and student questions from UNC Chapel Hill’s IDEAs in Action General Education Curriculum.

IDEAs in Action General Education Curriculum

FC-QUANT

Student Learning Outcomes:

  1. Summarize, interpret, and present quantitative data in mathematical forms, such as graphs, diagrams, tables, or mathematical text.
  2. Develop or compute representations of data using mathematical forms or equations as models and use statistical methods to assess their validity.
  3. Make and evaluate important assumptions in the estimation, modeling, and analysis of data, and recognize the limitations of the results.
  4. Apply mathematical concepts, data, procedures, and solutions to make judgments and draw conclusions.
  5. Synthesize and present quantitative data to others to explain findings or to provide quantitative evidence in support of a position.

Questions for Students:

  1. What is the role of mathematics in organizing and interpreting measurements of the world?
  2. How can mathematical models and quantitative analysis be used to summarize or synthesize data into knowledge and predictions?
  3. What methodology can we apply to validate or reject mathematical models or to express our degree of confidence in them?