📖 Syllabus

Table of contents

  1. Overview
    1. Instructor
    2. Content
    3. Credit and Workload
    4. Prerequisites
  2. Getting Started
    1. Computer and Network Recommendations
    2. Websites
    3. Development Environment
    4. Forms
  3. Communication
  4. Course Components
    1. Lectures
    2. Homeworks
    3. Final Project
    4. Study Sessions
    5. Office Hours
  5. Exams and Interviews
    1. Exams
    2. Technical Interview
  6. Grades
    1. Assignment Weights
    2. Letter Grades
    3. Late Policy, Slip Days, and Drops
    4. Regrade Requests
  7. Collaboration and Academic Integrity
    1. Use of Generative Artificial Intelligence
    2. Honor Code
  8. Student Support and Well-Being
    1. Accommodations
    2. Diversity and Inclusion
    3. Campus Resources
  9. Acknowledgements
  10. Disclaimer

Overview

Instructor

See the 👩‍🏫 Staff page for contact information.

Content

Watch this video to learn more about the course.

This course covers the skills and tools for building practical data science projects, along with their theoretical underpinnings. Topics include pandas, numpy, scikit-learn, BeautifulSoup, and Jupyter Notebooks, as well as the math behind loss functions, gradient descent, linear and logistic regression, and other key ideas in machine learning.

This course will train students to use industry-standard tools to solve real-world problems, while giving them an understanding of how these tools work under the hood. After taking this course, students will be prepared to build data science portfolios, participate in research across campus, and succeed in data science internships. Refer to the course homepage for a more specific list of topics.

Credit and Workload

This course is worth 4 credits, and counts for the following major requirements:

  • Computer Science: Upper-Level CS (ULCS) Elective
  • Data Science: Advanced Technical Elective or Application Elective
  • Electrical Engineering: Flexible Technical Elective

Refer to the Fall 2024 and Winter 2025 (Section 1, Section 2) course evaluations to see what students from last year said about the course. Student perceptions of workload from Fall 2024 are shown below; we expect this offering’s workload to be similar, with the caveat that this Spring offering is being taught at twice the speed of Fall/Winter terms.

Prerequisites

The course is open to students from all majors.

The enforced prerequisites are discrete math (EECS 203), programming (EECS 280), calculus I, calculus II, and linear algebra. A probability and statistics course is an advisory prerequisite. Options include DATASCI 101, STATS 206, STATS 250, STATS 280, STATS 412, IOE 265, or ECON 451.

That said, we have allowed interested students to override into the course without meeting the prerequisites. In particular, many students are missing the linear algebra prerequisite. We have prepared linear algebra review materials specifically for this course; these resources are available here, and will be referred to as the semester progresses. Students missing the linear algebra prerequisite are expected to self-study this material (and come to us for help!).


Getting Started

The course website, practicaldsc.org, will contain links to all course content. There are also a few things you’ll need to do to get set up.

Computer and Network Recommendations

Make sure you have a laptop consistent with CAEN recommendations.

Test your internet connection with the UM Custom Speedtest website and make sure it meets the minimum requirements for any UM service. You’ll need more bandwidth if there will be multiple simultaneous users in your household.

Resources for help with computing equipment:

You may also use computer workstations in CAEN labs on campus or connect remotely.

Websites

You’ll need to make accounts on the following sites:

  • Ed: We’ll be using Ed as our course message and discussion board. (Think of Ed as a replacement for Piazza.) More details are in the Communication section below. If you didn’t already get an invitation to our Ed course, sign up here.

  • Gradescope: You’ll submit all assignments to Gradescope. This is where all of your grades will live as well. Most of the assignments will be coding assignments. Parts of these assignments will be manually graded, but most of them will be autograded. You should have received an email invitation for Gradescope, but if not please let us know as soon as possible (preferably via Ed). Note that we will not be using the EECS-specific “Autograder” platform.

  • GitHub: You will access all course content (lecture slides and assignments) by pulling our course GitHub repository. That repo is here: github.com/practicaldsc/sp25. Don’t worry if you’ve never used Git before – the ⚙️ Environment Setup page will walk you through all of the necessary steps.

Note that we will not be using Canvas for anything this semester (so please don’t try to send us messages on Canvas!).

Development Environment

As soon as you are able to, go follow the steps in the ⚙️ Environment Setup page of the course website to set up your programming environment for the course.

Forms

Please fill out the required Welcome Survey no later than Friday, May 9th (the same day that Homework 1 is due). The survey tells us a bit more about your background and whether you need an alternate exam.


Communication

This semester, we’ll be using Ed as our course message board. You will be added to Ed automatically; use this invite link if you weren’t added.

If you have a question about anything to do with the course — if you’re stuck on a problem, didn’t understand something from lecture, want clarification on course logistics, or just have a general question about data science — you can make a post on Ed. We only ask that if your question includes some or all of an answer (even if you’re not sure it’s right), please make your post private so that others cannot see it. You can also post anonymously to other students if you prefer.

Course staff will regularly check Ed and try to answer any questions that you have. You’re also encouraged to answer questions asked by other students. Explaining something is a great way to solidify your understanding of it!

Please don’t email individual staff members; instead, make a private or public Ed post.


Course Components

Lectures

Lectures will be held in-person on Tuesdays and Thursdays from 2-5PM in 1690 BBB. On each lecture day, we will cover two standard 80-minute lectures from Fall/Winter terms, with a ~10-minute break in between.

Attendance is not required, though you are encouraged to attend in-person if you are able to. Lectures will be recorded. On at least one class day, Suraj will be out of town; on that day, we will post recordings from previous terms.

Recordings will be made publicly available so that students who are not enrolled – including students not at Michigan – can benefit from the recordings. As part of your participation in this course, you may be recorded (e.g. if you answer a question). If you do not wish to be recorded, please email Suraj to discuss alternate arrangements.

Lecture slides/notebooks will be your main resource in this class. You can access them, along with all course materials, by pulling from the course GitHub repository, github.com/practicaldsc/sp25; instructions on how to do this are in the ⚙️ Environment Setup page. We will also link HTML previews of each lecture notebook from the course homepage; you can use these to annotate the lecture notebooks with a tablet, if you’d like.

For some topics, we have prepared Guides that you’re also expected to read and reference while working on assignments.

Supplementary readings will be made available on the course homepage, drawn from a variety of online resources. Additional supplementary resources can be found in the Readings section of the Resources tab of the course website.

Homeworks

There will be 11 homework assignments due throughout the semester. Expect each homework assignment to take ~8-10 hours to complete. Even though this is a programming-heavy class, expect it to have the workload patterns of a more theoretical class – that is, expect a constant, moderate level of work each week, rather than having some weeks with very little work and some extremely heavy weeks.

Homeworks will be distributed in Jupyter Notebooks, in which you will write Python code. For programming problems, public tests will be provided to make sure you’re on the right track; however, your submission will be graded using an autograder with hidden tests. Many homeworks will also include written questions, which must be answered with pen and paper; these allow us to assess your understanding of more theoretical ideas underpinning core data science techniques taught in lecture. Autograded questions and written questions must be submitted to two separate Gradescope assignments, but count as one big assignment for the purposes of grading.

Each homework is worth the same amount in your overall grade, no matter how many points it is worth. For instance, Homework 1 could be out of 50 points and Homework 2 could be out of 20 points, but they will still both be worth the same amount when calculating your overall grade.

Your lowest two homeworks will be dropped when calculating your final score. Homeworks will be released at least a week before the due date. You will access homeworks by pulling the course GitHub repository, and will submit your completed homework as Jupyter Notebooks to Gradescope (we will not be using the EECS autograder). You may turn them in as many times as you like before the deadline, and only the most recent submission will be graded, so it’s a good idea to submit early and often. For homeworks with both an autograded and written submission, your submission time will be the later of the two submission times.
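To make the equal weighting and the two drops concrete, here is a minimal sketch of how a homework average could be computed. The function name homework_average and the example scores are hypothetical, and treating an unsubmitted homework as a score of 0 is an assumption; this is not the course's actual grading script.

```python
def homework_average(scores, num_dropped=2):
    """Average homework score as a fraction between 0 and 1.

    scores: list of (points_earned, points_possible) tuples, one per homework.
    An unsubmitted homework is assumed to be recorded as 0 points earned.
    """
    # Convert each homework to a fraction first, so every homework carries
    # the same weight no matter how many points it is out of.
    fractions = [earned / possible for earned, possible in scores]
    # Drop the lowest `num_dropped` fractions, then average what remains.
    kept = sorted(fractions)[num_dropped:]
    return sum(kept) / len(kept)


# Hypothetical scores for four homeworks with different point totals.
example_scores = [(50, 50), (10, 20), (0, 10), (20, 20)]
print(homework_average(example_scores))
```

Because each homework is converted to a fraction before averaging, a 50-point homework and a 20-point homework contribute equally, as described above.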

Note that all homeworks are to be completed individually. See the Collaboration and Academic Integrity section for more details. That said, you’re encouraged to discuss all assignments with others at a conceptual level in office hours and study groups.

Final Project

In the Final Project, you’ll work on an open-ended investigation of a dataset of your choosing, using the tools from throughout the semester. Your work will culminate in a public-facing website that you can share with friends, family, and on your resume. Visit the Showcase to see Final Project submissions from prior semesters. (In Fall 2024, the assignment was called the “Portfolio Homework”; it has not changed in spirit, despite the new name.)

We will release the spec for the Final Project by the end of the second week of the semester, so you’ll have over a month to complete it. For now, note that the Final Project is different from homeworks in the following ways:

  • You can work with one partner, but aren’t required to.
  • You cannot use slip days on the Final Project, since it’s due at the end of the semester and we need time to grade it.
  • The Final Project will not be autograded at all; it will be fully manually graded.

Study Sessions

In typical Fall/Winter terms, discussion sections are used to prepare students for exams, by having students work through worksheets with theoretical practice problems from related courses. Given the faster pace of the Spring term and the pace of homework deadlines (most weeks have 2 homeworks due), we have made the decision to not hold discussion sections this term.

Instead, we’re planning on replacing the discussion section slot with a new type of meeting, called a “study session.” Study sessions are like group office hours, where you can come to work on homeworks, meet your classmates, and get help from course staff. There will almost always be a homework due on the day of a study session, which makes them a great place to wrap up the currently-due homework.

Based on availability shared in the Welcome Survey, we’ve decided not to hold study sessions from 2-4PM on Wednesdays (the originally-scheduled discussion section slot), but rather, hold them later in the evenings on Wednesdays. See the Calendar tab of the course website for the most up-to-date schedule.

Study session attendance is optional. In Week 1, the study session is a great place to come to get started on Homework 1 and iron out any issues you ran into while following the steps in the ⚙️ Environment Setup page.

Office Hours

To get help on assignments and concepts, course staff will be hosting several office hours per week. To provide added flexibility, given the nature of the Spring term, we’ll provide a mix of in-person office hours in BBB and remote office hours on Zoom. See the Calendar tab of the course website for the most up-to-date schedule and directions.

We use the term “office hours” but really, office hours are held in a common room where you can come to work on assignments, meet your classmates, and get help from course staff. We don’t bite, and we would love to see you in office hours!

Office hours are your chance to ask for general help, clarification on homeworks, and to review previous homeworks. Course staff will not tell you if your answer is correct, and it is inappropriate to ask. Here are some really good questions to ask instead:

  • I got confused about a concept in class. Can you explain it?
  • When the assignment says X, does it mean A or B?
  • My code is giving a weird error - can you help me understand why?
  • I can’t get this test to pass, so I must be doing something wrong. Can you help me figure it out?
  • My code is doing something different than what I expected. Can you explain what is happening?

Questions that you should never ask us:

  • Is this the right answer?
  • Can you check my code and make sure it is right?
  • What is the answer?
  • What’s going to be on the exam?

Your primary motivation when interacting with course staff should be learning.


Exams and Interviews

Exams

This class has one Midterm Exam and one Final Exam. The Final Exam is not cumulative, in that the focus will be on content from the second half of the semester, though the content in the second half of the course builds on the first half. The exams will feature a mix of multiple choice, select all, short answer, and long answer questions, including questions that require you to write code and do math. See study.practicaldsc.org for links to the exams from previous offerings.

  • Midterm Exam: Wednesday, May 28th, 2-4PM (held during the discussion section slot)

  • Final Exam: Tuesday, June 24th, 1:30-3:30PM

Both exams will be administered in-person and on paper, and will be closed-note, with the exception of one double-sided, handwritten notes sheet that you prepare yourself. If you have conflicts with either of the exams, please let us know on the Welcome Survey. We may provide alternate exam times for students with a valid, documented conflict with a required activity in another course or official university-affiliated activity, or to help students avoid negative academic consequences when their religious obligations conflict with academic requirements.

Given that we’ve replaced discussion sections with study sessions, we have decided to post all Fall/Winter discussion worksheets at the start of the semester, and encourage you to dedicate several hours per week to working through these worksheets.

You can find all practice worksheets at study.practicaldsc.org.

Specifically, after each day of lecture, you should spend an hour working through problems from the relevant practice worksheet. Work through the first few problems in a simulated exam-like environment: with a time limit, without access to the internet, without any music, in an uncomfortable chair in a quiet-ish area, etc., and only look at the solutions once you’ve faithfully attempted the problems. Then, work through the remaining problems in the worksheet in a similar fashion.

Technical Interview

Given the smaller scale of the course, we’re offering a new type of evaluative assessment, in addition to the standard exams. The Technical Interview is an optional, pass/fail, 30-minute, in-person interview with Suraj, covering content from Lectures 1-8 (i.e. the content on data wrangling and visualization).

In the Technical Interview, you will be presented with a dataset in a Jupyter Notebook, and will be asked to write several lines of code to answer 5 verbal questions that are asked of you (e.g. “Find the average height of the most common dog species in the dataset.”). Like in a job interview, you’ll also be expected to explain your thought process and reasoning. The interview will be graded pass/fail; to earn a “pass”, you must successfully answer at least 4 of the 5 questions that are asked. If you fail the Technical Interview, you cannot retake it.
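As an illustration of the style of question described above, here is one possible pandas answer to the sample dog question. The DataFrame dogs and its column names species and height are hypothetical; the actual interview datasets and questions will differ, and this is only one of several reasonable approaches.

```python
import pandas as pd

# Hypothetical dataset; the column names "species" and "height" are assumptions.
dogs = pd.DataFrame({
    "species": ["beagle", "poodle", "beagle", "husky", "beagle", "poodle"],
    "height": [15.0, 18.0, 14.0, 23.0, 16.0, 20.0],
})

# Step 1: find the species that appears most often in the dataset.
most_common = dogs["species"].value_counts().idxmax()

# Step 2: keep only the rows for that species and average their heights.
avg_height = dogs.loc[dogs["species"] == most_common, "height"].mean()

print(most_common, avg_height)
```

In an interview setting, narrating the two steps (first find the most common species, then filter and aggregate) matters as much as the code itself.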

The Technical Interview is designed to supplement the Midterm Exam. The Technical Interview is completely optional, and cannot hurt your grade, even if you fail it. Here’s how:

  • If you don’t take the Technical Interview, or fail it, the Midterm Exam will count for 25% of your overall course grade.
  • If you pass the Technical Interview, the Midterm Exam will count for 15% of your overall course grade, and the Technical Interview will count for 10% of your overall course grade (but since the Technical Interview is pass/fail, passing it will give you all 10% of this credit).

This means that you can still earn 100% on the Midterm Exam and 100% in the course overall, even if you don’t take the Technical Interview or fail it. So, it doesn’t hurt you at all to take the Technical Interview!
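The weighting rule can be sketched as a short calculation, using the weights listed in the Assignment Weights section. The function name overall_grade and the example scores are hypothetical. Since letter grades are not determined by a fixed mapping, the result is only a weighted percentage, not a letter grade.

```python
def overall_grade(homework, final_project, midterm, final_exam,
                  passed_interview=False):
    """Weighted course percentage; every score is a fraction between 0 and 1."""
    if passed_interview:
        # Passing moves 10% of the grade from the Midterm Exam to the
        # Technical Interview, which is then earned in full.
        midterm_weight, interview_credit = 0.15, 0.10
    else:
        # Not taking (or failing) the interview leaves the Midterm at 25%.
        midterm_weight, interview_credit = 0.25, 0.0
    return (0.42 * homework
            + 0.08 * final_project
            + midterm_weight * midterm
            + interview_credit
            + 0.25 * final_exam)


# Hypothetical student: the same scores, with and without a passed interview.
without_interview = overall_grade(0.9, 0.9, 0.6, 0.8)
with_interview = overall_grade(0.9, 0.9, 0.6, 0.8, passed_interview=True)
print(without_interview, with_interview)
```

Since a Midterm Exam score can never exceed 100%, replacing 10% of its weight with full credit can only raise or preserve the total, which is why taking the interview cannot hurt.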

To help prepare for the Technical Interview, Suraj will offer several 20-minute “practice” interview slots – some virtual, some in-person – which you can schedule with Suraj on one of the dates listed below. It is highly recommended to schedule a practice interview if you plan on doing the Technical Interview, to make sure you’re familiar with the style of questions that will be asked.

Signups for both Technical Interviews and practice interviews will open on May 16th via Google Calendar, for the following dates:

  • Technical Interviews (30 minutes): May 22nd, May 23rd, May 27th (all in-person)
  • Practice Interviews (20 minutes): May 19th (Zoom), May 20th (in-person), May 21st (Zoom)

If there are changes to these dates, we will post an update on Ed, but note that all slots for the Technical Interview are before the Midterm Exam. If you would like to schedule either a Technical Interview or a practice interview, but aren’t available at the times listed, please let Suraj know.

Note that each student will be given a different dataset to work with in the Technical Interview, and the questions themselves will be slightly different for each student, so there is no disadvantage to scheduling an earlier interview or advantage to scheduling a later interview.


Grades

Assignment Weights

Component | Weight | Notes
Homeworks | 42% | 11 total; 2 lowest scores dropped; 8 slip days available to use, with a max of 2 per homework
Final Project | 8% | No slip days allowed
Midterm Exam | 25% (or 15%, if passed Technical Interview) |
Technical Interview | 10% (or 0%, if opted out or failed) | See Technical Interview for details
Final Exam | 25% |

Letter Grades

Grading for this class is not curved in the sense that the average is set at (say) a B+ and half of the class must receive a grade lower than that. If everyone does well and shows mastery of the material, everyone can receive an A (this would be awesome!). If no one does well (this is unlikely), then everyone can receive a C.

Grading for this class is curved in the sense that we do not have a pre-defined mapping from project and exam scores to a final GPA. There is no pre-determined score (e.g., 90% of all possible points) that earns an A or a B or a C or any other grade. To determine the final grade, we will ask questions like “Did this student master the material?”. With that said, grades will not be any stricter than the standard grading scale (where an A+ is a 97+, A is 93+, A- is 90+, etc). For instance, the threshold for an “A” will never be higher than 93%.

The distribution of letter grades in Fall 2024 is shown below; the grade distribution in Winter 2025 was very similar. Our distribution will likely be similar, though not identical. (The majority of “E” grades came from students who did not participate in the course.)

Try your best not to worry about grades, and we’ll reciprocate by being fair. We’re in this together ❤️.

Late Policy, Slip Days, and Drops

All homeworks must be submitted by 11:59PM Ann Arbor time on the due date to be considered on time. You may turn them in as many times as you like before the deadline, and only the most recent submission will be graded, so it’s a good habit to submit early and often. If you make a submission after the deadline, your assignment will be counted as late.

You have 8 “slip days” to use throughout the semester. A slip day extends the deadline of a homework by 24 hours. You may use up to 2 slip days on any one homework assignment, meaning it is impossible to submit a homework more than 48 hours late. You cannot use slip days on the Final Project. Note that you do not use up any slip days on homeworks you don’t submit, e.g. if you choose not to submit Homework 2, it costs you 0 slip days.

Slip days are designed to be a transparent and predictable source of leniency in deadlines. You can use a slip day if you are too busy to complete a homework on its original due date (or if you forgot about it). But slip days are also meant for things like the internet going down at 11:58PM just as you go to submit your homework, or if you are sick and have fallen behind with the material.

Slip days are meant to be used in exceptional circumstances, so you probably should not need to use all 8, but if you have something going on in your life that is impeding your ability to do your classwork on time, please reach out to us as soon as possible so we can work something out. The earlier you let us know that something’s going on, the more we can do to help, so please reach out. If – and only if! – you have already used your 8 slip days, you may petition for an additional slip day in cases of illness or emergency. Requests will be considered on a case-by-case basis.

Slip days are applied automatically at the end of the semester, and you don’t need to ask in order to use one. It’s your responsibility to keep track of how many you have left. If you’ve run out of slip days and submit an assignment late, that homework may still be graded, but you will receive a 0 on it when we calculate overall grades at the end of the semester.
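The slip day rules can be sketched as a small bookkeeping routine, which may help you track your own balance. The function name apply_slip_days is hypothetical. Two details are assumptions not stated in this syllabus: that a partial day late rounds up to a full slip day, and that slip days are allocated in due-date order.

```python
import math

TOTAL_SLIP_DAYS = 8    # slip days available for the whole semester
MAX_PER_HOMEWORK = 2   # at most 2 slip days (48 hours) on any one homework


def apply_slip_days(hours_late):
    """Track slip day usage across homeworks, in due-date order.

    hours_late: one entry per homework; 0 for on time, None for not submitted.
    Returns (accepted, remaining): a list of booleans saying whether each
    homework counts, and the number of slip days left over.
    """
    remaining = TOTAL_SLIP_DAYS
    accepted = []
    for hours in hours_late:
        if hours is None:
            # Homeworks that are never submitted cost 0 slip days.
            accepted.append(False)
            continue
        # Assumption: any part of a 24-hour period uses a whole slip day.
        needed = math.ceil(hours / 24)
        if needed <= MAX_PER_HOMEWORK and needed <= remaining:
            remaining -= needed
            accepted.append(True)
        else:
            # Too late, or out of slip days: the homework receives a 0.
            accepted.append(False)
    return accepted, remaining


# Hypothetical semester: on time, 5 hours late, skipped, 30 hours late, 50 hours late.
print(apply_slip_days([0, 5, None, 30, 50]))
```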

Regrade Requests

For most homeworks, we will accept regrade requests. If you feel that there is an error in the autograder or that the manual grader has made a mistake, you may submit a regrade request within one week of the grades being released. If you do not submit a regrade request within one week, your original grade will be final.

Regrade Requests for Manually Graded Problems

To submit a regrade request for a manually graded problem, make the regrade request directly on Gradescope. Note that part of your grade is clarity, so if your answer was mostly right but unclear you may still not be eligible for full credit.

Regrade Requests for Autograded Problems

To submit an autograder regrade request, please fill out the Autograder Regrade Request Form.

Note that it’s rare that something is wrong with the autograder, and if that’s the case, we’ll typically fix the necessary test cases and re-run the autograder for the entire class.


Collaboration and Academic Integrity

This will be a tough, but rewarding course. While you will be challenged this semester, we will be offering you plenty of support through office hours and Ed. Make good use of these resources, and you will be able to succeed in this course.

In this course, you can read books, surf the web, and talk to your friends and course staff to get help understanding the concepts you need to know to complete your assignments. However, all work you submit must be your own, original work; collaboration must not result in solutions that are identifiably similar to other solutions, past or present.

Encouraged Collaboration:

  • Discussing the general approach to homeworks
  • Talking about problem-solving strategies or issues you ran into and how you solved them
  • Discussing the answers to exams with other students who have already taken the exam after the exam is complete
  • Using code provided in class, by the textbook or any other assigned reading or video, with attribution
  • Google searching for documentation on Python or pandas
  • Working together with other students on homeworks without copying or sharing answers
  • Posting a question about your approach to a problem on Ed, without sharing your code

Unacceptable Collaboration:

  • Using or submitting code acquired from other students, the web, or any other resource not officially sanctioned by this course
  • Posting your code online, including on Ed, unless privately to course staff only
  • Having any other person complete any part of your assignment on your behalf
  • Completing an assignment on behalf of someone else
  • Providing code, exam questions, or solutions to any other student in the course
  • Collaborating with others on exams

If you are unsure about what constitutes an honor code violation, please contact the course staff with questions. The best way to avoid problems is by using your best judgement. Here are some suggestions for completing your work:

  • Don’t look at or discuss the details of another student’s code for a homework you are working on, and don’t let another student look at your code.
  • Don’t start with someone else’s code and make changes to it, or in any way share code with other students.
  • If you are talking to another student about a homework, don’t take notes, and wait an hour afterward before you write any code.

Use of Generative Artificial Intelligence

Our course policy on the use of GenAI tools for homeworks is simple: you can use these tools to build an understanding of course material and to assist you on assignments, keeping in mind that no tool is a substitute for a strong understanding of course concepts.

Some examples of responsible use of generative AI include autocompleting repetitive/boilerplate code and suggesting edge cases. Creating large sections of code you do not understand yourself is using generative AI in an irresponsible way, and is likely to be detrimental when it comes to showing what you know on exams, which are worth 50% of your course grade and do not allow generative AI tools.

Honor Code

We report suspected violations to the Engineering Honor Council. To identify violations, we use both manual inspection and automated software to compare present solutions with each other, with past solutions, and with code found online. The Honor Council determines whether a violation of academic standards has occurred, as well as any sanctions. Read the Honor Code for detailed definitions of cheating, plagiarism, and other forms of academic misconduct.


Student Support and Well-Being

Accommodations

If you need, or think you might need, an accommodation for a disability, please let us know during the first three weeks of the semester. Some aspects of this course may be modified to facilitate your participation and progress. As soon as you make us aware of your needs, we can work with the Services for Students with Disabilities (SSD) office to help us determine appropriate academic accommodations. SSD (ssd.umich.edu; 734-763-3000) recommends accommodations through a Verified Individualized Services and Accommodations (VISA) form. Any information you provide is private and confidential and will be treated as such.

Diversity and Inclusion

It is our intention that students from all backgrounds and perspectives will be well served by this course, and that the diversity that students bring to this class will be viewed as an asset. We welcome individuals of all ages, backgrounds, beliefs, ethnicities, genders, gender identities, gender expressions, national origins, religious affiliations, sexual orientations, socioeconomic background, family education level, ability - and other visible and nonvisible differences. All members of this class are expected to contribute to a respectful, welcoming, and inclusive environment for every other member of the class. Your suggestions are encouraged and appreciated.

Campus Resources

As a student, you may experience a range of issues that can negatively impact your learning, such as anxiety, depression, interpersonal or sexual violence, difficulty eating or sleeping, loss/grief, and/or alcohol/drug problems. These mental health concerns or stressful events may lead to diminished academic performance and affect your ability to participate in day-to-day activities.

In order to support you during such challenging times, the University of Michigan provides a number of confidential resources to all enrolled students, many of which are listed here. Some particularly useful resources include:


Acknowledgements

This course is being offered for the third time at the University of Michigan (and the first time in the accelerated Spring term). With that said, many of the materials we will use are adapted from content created by countless other instructors for courses at other institutions, in particular:

  • DSC 10, DSC 40A, and DSC 80 at the University of California, San Diego
  • Data 6 and Data 100 at the University of California, Berkeley

Language in this syllabus has been adopted from other courses as well, including EECS 203, EECS 280, EECS 376, and EECS 485 here at the University of Michigan, and CSE 160 at the University of Washington.


Disclaimer

While we try to do our best to plan ahead, unfortunately, sometimes circumstances do arise that necessitate a policy change. When this happens, the change will be announced, and this document will be updated with the new policy.

We appreciate any and all feedback, given that this course is new and evolving (and this Spring offering is especially new). If you’d like to provide us with anonymous feedback at any point, you can do so at this form. Thank you!