📖 Syllabus

Table of contents

  1. Overview
    1. Instructor
    2. Content
    3. Credit and Workload
    4. Prerequisites
  2. Getting Started
    1. Computer and Network Recommendations
    2. Websites
    3. Development Environment
    4. Forms
  3. Communication
  4. Course Components
    1. Lectures
    2. Discussions
    3. Homeworks
    4. Final Project
    5. Office Hours
    6. Weekly Schedule
  5. Exams
  6. Grades
    1. Assignment Weights
    2. Letter Grades
    3. Late Policy, Slip Days, and Drops
    4. Regrade Requests
  7. Collaboration and Academic Integrity
    1. Use of Generative Artificial Intelligence
    2. Honor Code
  8. Student Support and Well-Being
    1. Accommodations
    2. Diversity and Inclusion
    3. Campus Resources
  9. Acknowledgements
  10. Disclaimer

Overview

Instructor

See the 👩‍🏫 Staff page for contact information.

Content

Watch this video to learn more about the course.

This course covers the skills and tools needed to build practical data science projects, along with their theoretical underpinnings. Topics include pandas, numpy, scikit-learn, BeautifulSoup, and Jupyter Notebooks, as well as the math behind loss functions, gradient descent, linear and logistic regression, and other key ideas in machine learning.

This course will train students to use industry-standard tools to solve real-world problems, while giving them an understanding of how these tools work under the hood. After taking this course, students will be prepared to build data science portfolios, participate in research across campus, and succeed in data science internships. Refer to the course homepage for a more specific list of topics.

Credit and Workload

This course is worth 4 credits, and counts for the following major requirements:

  • Computer Science: Upper-Level CS (ULCS) Elective
  • Data Science: Advanced Technical Elective or Application Elective
  • Electrical Engineering: Flexible Technical Elective

Refer to the Fall 2024 course evaluations to see what students from Fall 2024 said about the course. Student perceptions of workload from Fall 2024 are shown below; we expect this offering’s workload to be similar.

Prerequisites

The course is open to students from all majors.

The enforced prerequisites are discrete math (EECS 203), programming (EECS 280), calculus I, calculus II, and linear algebra. A probability and statistics course is an advisory prerequisite. Options include DATASCI 101, STATS 206, STATS 250, STATS 280, STATS 412, IOE 265, or ECON 451.

That said, we have allowed interested students to override into the course without meeting the prerequisites. In particular, many students are missing the linear algebra prerequisite. We will be providing linear algebra review materials as the semester progresses; students missing the prerequisite background are expected to self-study this material (and come to us for help!).


Getting Started

The course website, practicaldsc.org, will contain links to all course content. There are also a few things you’ll need to do to get set up.

Computer and Network Recommendations

Make sure you have a laptop consistent with CAEN recommendations.

Test your internet connection with the UM Custom Speedtest website and make sure it meets the minimum requirements for any UM service. You’ll need more bandwidth if there will be multiple simultaneous users in your household.

Resources for help with computing equipment:

You may also use computer workstations in CAEN labs on campus or connect remotely.

Websites

You’ll need to make accounts on the following sites:

  • Ed: We’ll be using Ed as our course message and discussion board. (Think of Ed as a replacement for Piazza.) More details are in the Communication section below. If you didn’t already get an invitation to our Ed course, sign up here.

  • Gradescope: You’ll submit all assignments to Gradescope. This is where all of your grades will live as well. Most of the assignments will be coding assignments. Parts of these assignments will be manually graded, but most of them will be autograded. You should have received an email invitation for Gradescope, but if not please let us know as soon as possible (preferably via Ed). Note that we will not be using the EECS-specific “Autograder” platform.

  • GitHub: You will access all course content (lecture slides and assignments) by pulling our course GitHub repository. That repo is here: github.com/practicaldsc/wn25. Don’t worry if you’ve never used Git before – the ⚙️ Environment Setup page will walk you through all of the necessary steps.

Note that we will not be using Canvas for anything this semester (so please don’t try to send us messages on Canvas!).

Development Environment

As soon as you are able to, go follow the steps in the ⚙️ Environment Setup page of the course website to set up your programming environment for the course. Discussion 1, held on Thursday, January 9th, will be dedicated to making sure you’ve followed these steps.

Forms

Please fill out the required Welcome Survey no later than Wednesday, January 15th. The survey tells us a bit more about your background and whether you need an alternate exam.


Communication

This semester, we’ll be using Ed as our course message board. You will be added to Ed automatically; use the invite link in the section above if you weren’t added.

If you have a question about anything to do with the course — if you’re stuck on a problem, didn’t understand something from lecture, want clarification on course logistics, or just have a general question about data science — you can make a post on Ed. We only ask that if your question includes some or all of an answer (even if you’re not sure it’s right), please make your post private so that others cannot see it. You can also post anonymously to other students if you prefer.

Course staff will regularly check Ed and try to answer any questions that you have. You’re also encouraged to answer questions asked by other students. Explaining something is a great way to solidify your understanding of it!

Please don’t email individual staff members; make a private or public Ed post instead.


Course Components

Lectures

Lectures will be held in-person on Mondays and Wednesdays from 3-4:30PM in 1670 BBB. Attendance is not required, though you are encouraged to attend in-person if you are able to, no matter which section you’re enrolled in (i.e. students in LEC 002 and LEC 003 can both attend lecture). Lectures will be recorded.

Recordings will be made publicly available so that students who are not enrolled – including students not at Michigan – can benefit from the recordings. As part of your participation in this course, you may be recorded (e.g. if you answer a question). If you do not wish to be recorded, please email Suraj to discuss alternate arrangements.

Lecture slides/notebooks will be your main resource in this class. You can access them, along with all course materials, by pulling from the course GitHub repository, github.com/practicaldsc/wn25; instructions on how to do this are in the ⚙️ Environment Setup page. We will also link HTML previews of each lecture notebook from the course homepage; you can use these to annotate the lecture notebooks with a tablet, if you’d like.

For some topics, we will prepare official “Guides” that you’re also expected to read and reference while working on assignments.

Supplementary readings will be made available on the course homepage, drawn from a variety of online resources. Additional supplementary resources can be found in the Readings section of the Resources tab of the course website.

Discussions

There are four discussion sections on Thursdays. Their times and locations can be found on the Calendar tab of the course website. You can attend any discussion session, but if space fills up, priority will be given to students officially enrolled in that section.

The first discussion, held in the first week of class, includes some useful instruction and tips for using Jupyter Notebooks, the programming environment we’ll be using in this course. It should be helpful to get you set up and comfortable with the technology you’ll be using all semester.

Subsequent discussion sections will be focused on exam preparation. Students will work through problems from past exams in related courses and be able to get help from course staff. Attending discussion and working through practice problems gives you direct experience with the types of theoretical questions you will see on exams. Discussion sections will not be recorded. The purpose of discussion is to give you hands-on problem-solving experience, so you really need to attend and participate to reap the benefits.

In order to incentivize you to attend discussion, discussion attendance can optionally count towards your course grade. There will be 13 discussion sections total, and we will take attendance in each one. Each week you attend discussion will earn you 1 “discussion point”, up to a maximum of 10 discussion points. Your discussion score will be the number of discussion points you earn out of 10. This means that you can miss up to 3 discussions for any reason (late add, extenuating circumstances, etc.) and still earn a full discussion score. Details can be found in the Grades section below.
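
As a rough illustration of the scoring rule above, the following Python sketch turns an attendance count into a discussion score. The function name `discussion_score` is hypothetical and is not part of any course tooling; the cap of 10 points comes from the paragraph above.

```python
def discussion_score(sections_attended: int, max_points: int = 10) -> float:
    """Return the discussion score as a fraction between 0 and 1.

    Assumes one point per attended section, capped at `max_points`.
    The function name is hypothetical, used only for illustration.
    """
    if sections_attended < 0:
        raise ValueError("sections_attended must be non-negative")
    return min(sections_attended, max_points) / max_points


# Attending 10 of the 13 sections already earns a full discussion score.
print(discussion_score(10))  # 1.0
print(discussion_score(13))  # 1.0
print(discussion_score(7))   # 0.7
```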

Homeworks

There will be 11 homework assignments due throughout the semester, usually on Tuesdays (except for Homework 1). Expect each homework assignment to take ~8-10 hours to complete. Even though this is a programming-heavy class, expect it to have the workload patterns of a more theoretical class – that is, expect a constant, moderate level of work each week, rather than having some weeks with very little work and some extremely heavy weeks.

Homeworks will be distributed in Jupyter Notebooks, in which you will write Python code. For programming problems, public tests will be provided to make sure you’re on the right track; however, your submission will be graded using an autograder with hidden tests. Many homeworks will also include written questions which must be answered with pen and paper; these allow us to assess your understanding of more theoretical ideas underpinning core data science techniques taught in lecture. Autograded questions and written questions must be submitted to two separate Gradescope assignments but count as one big assignment for the purposes of grading. The Example Homework shows the typical format of a homework assignment and explains the submission workflow.

Each homework is worth the same amount, but the lowest two homeworks will be dropped when calculating your final score. Homeworks will be released at least a week before the due date. You will access homeworks by pulling the course GitHub repository, and will submit your completed homework as Jupyter Notebooks to Gradescope (we will not be using the EECS autograder). You may turn them in as many times as you like before the deadline, and only the most recent submission will be graded, so it’s a good idea to submit early and often. For homeworks with both an autograded and written submission, your submission time will be the later of the two submission times.
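
To make the drop rule concrete, here is a minimal Python sketch of an equally weighted homework average with the two lowest scores dropped. The function name `homework_average` and the sample scores are made up for illustration; this is not how the course staff necessarily compute grades.

```python
def homework_average(scores: list[float], num_dropped: int = 2) -> float:
    """Average equally weighted homework scores after dropping the lowest ones.

    `scores` holds one fraction in [0, 1] per homework. The function name
    and the example data below are hypothetical.
    """
    if len(scores) <= num_dropped:
        raise ValueError("need more scores than dropped assignments")
    # Sort ascending and discard the `num_dropped` lowest scores.
    kept = sorted(scores)[num_dropped:]
    return sum(kept) / len(kept)


# 11 homeworks: one missed (0.0) and one weak score (0.5) are dropped,
# so the average is taken over the remaining 9 scores.
example_scores = [1.0, 0.9, 0.0, 0.8, 1.0, 0.5, 1.0, 1.0, 0.9, 1.0, 0.9]
print(round(homework_average(example_scores), 3))
```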

Note that all homeworks are to be completed individually. See the Collaboration and Academic Integrity section for more details. That said, you’re encouraged to discuss all assignments with others at a conceptual level in office hours and study groups.

Final Project

In the Final Project, you’ll work on an open-ended investigation of a dataset of your choosing from a fixed set of options, using the tools from throughout the semester. Your work will culminate in a public-facing website that you can share with friends, family, and on your resume. Visit the Showcase to see Final Project submissions from prior semesters. (Previously, the assignment was called the “Portfolio Homework”; it has not changed in spirit, despite the new name.)

We will release the spec for the Final Project before the Midterm Exam, and it will not be due until the last week of classes, so you’ll have over a month to complete it. For now, note that the Final Project is different than homeworks in the following ways:

  • You can work with one partner, but aren’t required to.
  • You cannot use slip days on the Final Project, since it’s due at the end of the semester and we need time to grade it.
  • The Final Project will not be autograded at all; it will be fully manually graded.

Office Hours

To get help on assignments and concepts, course staff will be hosting several office hours per week. The vast majority of office hours will be held in-person on North Campus, though we will hold a few Central Campus and remote office hours, depending on staff availability. See the Calendar tab of the course website for the most up-to-date schedule and directions.

We use the term “office hours” but really, office hours are held in a common room where you can come to work on assignments, meet your classmates, and get help from course staff. We don’t bite and we would love to see you in office hours!

Office hours are your chance to ask for general help, clarification on homeworks, and to review previous homeworks. Course staff will not tell you if your answer is correct, and it is inappropriate to ask. Here are some really good questions to ask instead:

  • I got confused about a concept in class. Can you explain it?
  • When the assignment says X, does it mean A or B?
  • My code is giving a weird error - can you help me understand why?
  • I can’t get this test to pass, so I must be doing something wrong. Can you help me figure it out?
  • My code is doing something different than what I expected. Can you explain what is happening?

Questions that you should never ask us:

  • Is this the right answer?
  • Can you check my code and make sure it is right?
  • What is the answer?
  • What’s going to be on the exam?

Your primary motivation when interacting with course staff should be learning.

Weekly Schedule

To summarize all of the events and deadlines, refer to this general weekly schedule (which is subject to change in any given week):

| Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|
| Lecture | Homework N - 1 due 11:59PM; Homework N released | Lecture | Discussion | |

Exams

This class has one Midterm Exam and one Final Exam. The Final Exam is not cumulative, in that the focus will be on content from the second half of the semester, though the content in the second half of the course builds on the first half. The exams will feature a mix of multiple choice, select all, short answer, and long answer questions, including questions that require you to write code and do math. See study.practicaldsc.org for links to the exams from previous offerings.

  • Midterm Exam: Tuesday, February 25th, 7-9PM

  • Final Exam: Monday, April 28th, 10:30AM-12:30PM

Both exams will be administered in-person and on paper. If you have conflicts with either of the exams, please let us know on the Welcome Survey. We may provide alternate exam times for students with a valid, documented conflict with a required activity in another course or official university-affiliated activity, or to help students avoid negative academic consequences when their religious obligations conflict with academic requirements.


Grades

Assignment Weights

The default grading scheme is shown below:

| Component | Weight | Notes |
|---|---|---|
| Homeworks | 42% | 11 total; 2 lowest scores dropped. 8 slip days available to use, with a max of 2 per homework. |
| Final Project | 8% | No slip days allowed |
| Midterm Exam | 25% | |
| Final Exam | 25% | |

We will compute your grade as follows:

\[\boxed{\max(\text{default}, 97\% \cdot \text{default} + 3\% \cdot \text{discussion points out of 10})}\]

This effectively makes discussion attendance optional, since you can still earn 100% of the available points without ever attending discussion. See the Discussions section above for details on how discussion scoring works.
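
The formula above can be sketched in Python as follows. The weights are taken from the table in this section; the names `WEIGHTS`, `default_grade`, and `course_grade`, and the sample scores, are hypothetical and only illustrate the arithmetic. All scores are fractions between 0 and 1.

```python
# Default weights from the Assignment Weights table.
WEIGHTS = {
    "homeworks": 0.42,
    "final_project": 0.08,
    "midterm": 0.25,
    "final_exam": 0.25,
}


def default_grade(components: dict[str, float]) -> float:
    """Weighted sum of component scores under the default scheme."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def course_grade(components: dict[str, float], discussion: float) -> float:
    """Take the better of the default scheme and the discussion scheme."""
    default = default_grade(components)
    with_discussion = 0.97 * default + 0.03 * discussion
    return max(default, with_discussion)


# Hypothetical student: discussion only helps, it never lowers the grade.
student = {"homeworks": 0.95, "final_project": 0.90, "midterm": 0.80, "final_exam": 0.85}
print(round(course_grade(student, discussion=1.0), 4))
print(round(course_grade(student, discussion=0.0), 4))
```

Because of the `max`, a student who never attends discussion simply receives the default grade.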

Letter Grades

Grading for this class is not curved in the sense that the average is set at (say) a B+ and half of the class must receive a grade lower than that. If everyone does well and shows mastery of the material, everyone can receive an A (this would be awesome!). If no one does well (this is unlikely), then everyone can receive a C.

Grading for this class is curved in the sense that we do not have a pre-defined mapping from project and exam scores to a final GPA. There is no pre-determined score (e.g., 90% of all possible points) that earns an A or a B or a C or any other grade. To determine the final grade, we will ask questions like “Did this student master the material?”. With that said, grades will not be any stricter than the standard grading scale (where an A+ is a 97+, A is 93+, A- is 90+, etc). For instance, the threshold for an “A” will never be higher than 93%.

The distribution of letter grades in Fall 2024 is shown below. Our distribution will likely be similar, though not identical. (The majority of “E” grades came from students who did not participate in the course.)

Try your best not to worry about grades, and we’ll reciprocate by being fair. We’re in this together ❤️.

Late Policy, Slip Days, and Drops

All homeworks must be submitted by 11:59PM Ann Arbor time on the due date to be considered on time. You may turn them in as many times as you like before the deadline, and only the most recent submission will be graded, so it’s a good habit to submit early and often. If you make a submission after the deadline, your assignment will be counted as late.

You have 8 “slip days” to use throughout the semester. A slip day extends the deadline of a homework by 24 hours. You may use up to 2 slip days on any one homework assignment. You cannot use slip days on the Final Project.

Slip days are designed to be a transparent and predictable source of leniency in deadlines. You can use a slip day if you are too busy to complete a homework on its original due date (or if you forgot about it). But slip days are also meant for things like the internet going down at 11:58PM just as you go to submit your homework. Slip days are meant to be used in exceptional circumstances, so you probably should not need to use all 8, but if you have something going on in your life that is impeding your ability to do your classwork on time, please reach out to us as soon as possible so we can work something out. The earlier you let us know that something’s going on, the more we can do to help, so please reach out.

Slip days are applied automatically at the end of the semester, and you don’t need to ask in order to use one. It’s your responsibility to keep track of how many you have left. If you’ve run out of slip days and submit an assignment late, that homework may still be graded, but you will receive a 0 on it when we calculate grades at the end of the semester.
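
If you want to keep track of your own slip days, a sketch like the one below may help. It rests on several assumptions that the policy above does not spell out: slip days are charged in chronological order, any fraction of a 24-hour period counts as a full slip day, and a submission more than 2 days late is not accepted. The names `apply_slip_days`, `TOTAL_SLIP_DAYS`, and `MAX_PER_HOMEWORK` are hypothetical.

```python
import math

TOTAL_SLIP_DAYS = 8   # slip days available for the semester
MAX_PER_HOMEWORK = 2  # maximum slip days usable on one homework


def apply_slip_days(hours_late_per_homework: list[float]) -> tuple[list[bool], int]:
    """Return (accepted flag per homework, slip days remaining).

    Assumption: homeworks are processed in the order given, and each
    started 24-hour period after the deadline costs one slip day.
    """
    remaining = TOTAL_SLIP_DAYS
    accepted = []
    for hours_late in hours_late_per_homework:
        needed = math.ceil(hours_late / 24) if hours_late > 0 else 0
        if needed <= MAX_PER_HOMEWORK and needed <= remaining:
            remaining -= needed
            accepted.append(True)
        else:
            # Too late, or out of slip days: treated as a 0 in this sketch.
            accepted.append(False)
    return accepted, remaining


# On time, 3 hours late, 30 hours late, 50 hours late, on time.
flags, left = apply_slip_days([0, 3, 30, 50, 0])
print(flags)  # [True, True, True, False, True]
print(left)   # 5
```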

Regrade Requests

For most homeworks, we will accept regrade requests. If you feel that there is an error in the autograder or that the manual grader has made a mistake, you may submit a regrade request within one week of the grades being released. If you do not submit a regrade request within one week, your original grade will be final.

Regrade Requests for Manually Graded Problems

To submit a regrade request for a manually graded problem, make the regrade request directly on Gradescope. Note that part of your grade is clarity, so if your answer was mostly right but unclear you may still not be eligible for full credit.

Regrade Requests for Autograded Problems

To submit an autograder regrade request, please fill out the Autograder Regrade Request Form.

Note that it’s rare that something is wrong with the autograder, and if that’s the case, we’ll typically fix the necessary test cases and re-run the autograder for the entire class.


Collaboration and Academic Integrity

This will be a tough but rewarding course. While you will be challenged this semester, we will be offering you plenty of support through office hours and Ed. Make good use of these resources, and you will be able to succeed in this course.

In this course, you can read books, surf the web, talk to your friends and course staff to get help understanding the concepts you need to know to complete your assignments. However, all work you submit must be your own, original work; collaboration must not result in solutions that are identifiably similar to other solutions, past or present.

Encouraged Collaboration

  • Discussing the general approach to homeworks
  • Talking about problem-solving strategies or issues you ran into and how you solved them
  • Discussing the answers to exams with other students who have already taken the exam after the exam is complete
  • Using code provided in class, by the textbook or any other assigned reading or video, with attribution
  • Google searching for documentation on Python or pandas
  • Working together with other students on homeworks without copying or sharing answers
  • Posting a question about your approach to a problem on Ed, without sharing your code

Unacceptable Collaboration

  • Using or submitting code acquired from other students, the web, or any other resource not officially sanctioned by this course
  • Posting your code online, including on Ed, unless privately to course staff only
  • Having any other person complete any part of your assignment on your behalf
  • Completing an assignment on behalf of someone else
  • Providing code, exam questions, or solutions to any other student in the course
  • Collaborating with others on exams

If you are unsure about what constitutes an honor code violation, please contact the course staff with questions. The best way to avoid problems is by using your best judgement. Here are some suggestions for completing your work:

  • Don’t look at or discuss the details of another student’s code for a homework you are working on, and don’t let another student look at your code.
  • Don’t start with someone else’s code and make changes to it, or in any way share code with other students.
  • If you are talking to another student about a homework, don’t take notes, and wait an hour afterward before you write any code.

Use of Generative Artificial Intelligence

Our course policy on the use of GenAI tools for homeworks is simple: you can use these tools to build an understanding of course material and to assist you on assignments, keeping in mind that no tool is a substitute for a strong understanding of course concepts.

Some examples of responsible use of generative AI include autocompleting repetitive/boilerplate code and suggesting edge cases. Creating large sections of code you do not understand yourself is using generative AI in an irresponsible way, and is likely to be detrimental when it comes to showing what you know on exams, which are worth 50% of your course grade and do not allow generative AI tools.

Honor Code

We report suspected violations to the Engineering Honor Council. To identify violations, we use both manual inspection and automated software to compare present solutions with each other, with past solutions, and with code found online. The Honor Council determines whether a violation of academic standards has occurred, as well as any sanctions. Read the Honor Code for detailed definitions of cheating, plagiarism, and other forms of academic misconduct.


Student Support and Well-Being

Accommodations

If you need, or think you might need, an accommodation for a disability, please let us know during the first three weeks of the semester. Some aspects of this course may be modified to facilitate your participation and progress. As soon as you make us aware of your needs, we can work with the Services for Students with Disabilities (SSD) office to help us determine appropriate academic accommodations. SSD (ssd.umich.edu; 734-763-3000) recommends accommodations through a Verified Individualized Services and Accommodations (VISA) form. Any information you provide is private and confidential and will be treated as such.

Diversity and Inclusion

It is our intention that students from all backgrounds and perspectives will be well served by this course, and that the diversity that students bring to this class will be viewed as an asset. We welcome individuals of all ages, backgrounds, beliefs, ethnicities, genders, gender identities, gender expressions, national origins, religious affiliations, sexual orientations, socioeconomic background, family education level, ability - and other visible and nonvisible differences. All members of this class are expected to contribute to a respectful, welcoming, and inclusive environment for every other member of the class. Your suggestions are encouraged and appreciated.

Campus Resources

As a student, you may experience a range of issues that can negatively impact your learning, such as anxiety, depression, interpersonal or sexual violence, difficulty eating or sleeping, loss/grief, and/or alcohol/drug problems. These mental health concerns or stressful events may lead to diminished academic performance and affect your ability to participate in day-to-day activities.

In order to support you during such challenging times, the University of Michigan provides a number of confidential resources to all enrolled students, many of which are listed here. Some particularly useful resources include:


Acknowledgements

This course is being offered for the second time at the University of Michigan. With that said, many of the materials we will use are adapted from content created by countless other instructors for courses at other institutions, in particular:

  • DSC 10, DSC 40A, and DSC 80 at the University of California, San Diego
  • Data 6 and Data 100 at the University of California, Berkeley

Language in this syllabus has been adapted from other courses as well, including EECS 203, EECS 280, EECS 376, and EECS 485 here at the University of Michigan, and CSE 160 at the University of Washington.


Disclaimer

While we try to do our best to plan ahead, unfortunately, sometimes circumstances do arise that necessitate a policy change. When this happens, the change will be announced, and this document will be updated with the new policy.

We appreciate any and all feedback, given that this course is new and evolving. If you’d like to provide us with anonymous feedback at any point, you can do so at this form. Thank you!