MOOC: Data Science and Predictive Analytics (University of Michigan)
The Data Science and Predictive Analytics (DSPA) MOOC aims to build computational abilities, inferential thinking, and practical skills for tackling core data scientific challenges. It explores foundational concepts in data management, processing, statistical computing, and dynamic visualization using modern programming tools and agile web-services. Concepts, ideas, and protocols are illustrated through examples of real observational, simulated and research-derived datasets. Some prior quantitative experience in programming, calculus, statistics, mathematical models, or linear algebra will be necessary.
This open graduate course will provide a general overview of the principles, concepts, techniques, tools and services for managing, harmonizing, aggregating, preprocessing, modeling, analyzing and interpreting large, multi-source, incomplete, incongruent, and heterogeneous data (Big Data). The focus will be to expose students to common challenges related to handling Big Data and present the enormous opportunities and power associated with our ability to interrogate such complex datasets, extract useful information, derive knowledge, and provide actionable forecasting. Biomedical, healthcare, and social datasets will provide context for addressing specific driving challenges. Students will learn about modern data analytic techniques and develop skills for importing and exporting, cleaning and fusing, modeling and visualizing, analyzing and synthesizing complex datasets. The collaborative design, implementation, sharing and community validation of high-throughput analytic workflows will be emphasized throughout the course.
To summarize, students should have prior experience with college-level (undergraduate) mathematical modeling, statistical analysis, or programming courses, or permission of the instructor. Some MOOCs may be taken as prerequisites, e.g., Coursera, EdX1, EdX2. Additional examples of remediation courses are provided in the self-assessment (pretest).
Trainees successfully completing the course will:
- Gain understanding of the computational foundations in Big Data Science.
- Develop critical inferential thinking.
- Gather a tool chest of R libraries for managing and interrogating raw and derived, observed, experimental, and simulated big healthcare datasets.
- Possess practical skills for handling complex datasets.
This course is appropriate for trainees who have a significant interest in learning data-scientific and predictive analytic methods and who can commit a substantial amount of time and undivided attention to studying, practicing, and interacting with other trainees for the duration of the MOOC.
- The MOOC runs continuously.
- Registration: University of Michigan affiliates can directly register for the course using their UMich credentials and the Enrollment link below. Non-affiliated learners and students outside the University of Michigan need to first obtain a UMich friend account (using an outside email), then complete a registration form to be added to the DSPA course. Learning modules, assignments, datasets, case-studies, audio and video materials are available under each chapter of the DSPA course.
DSPA MOOC Course Certification
Course mastery certificates for completion of the entire DSPA MOOC will be issued to all students who successfully complete all course sections, modules, and assignments on time. This dynamic flowchart shows pathways to obtaining partial DSPA MOOC completion certificates.
UMich Graduate Credit
To obtain UMich graduate credit and a course grade for completing HS650, students must enroll in HS650 through the registrar's office and complete all requirements on time. This option is only available to currently enrolled University of Michigan graduate students.
The DSPA MOOC is made possible with substantial support from the Michigan Institute for Data Science (MIDAS), the Statistics Online Computational Resource (SOCR), the Department of Computational Medicine and Bioinformatics (DCMB), and the Department of Health Behavior and Biological Sciences (HBBS/UMSN).
Ideas, scripts, software, code, protocols and documentation from the broad and diverse R statistical computing community have been utilized throughout the DSPA materials.
Many colleagues, students, researchers, and fellows have shared their constructive expertise, valuable time, and critical assessment for generating, validating, and enhancing these open-science resources. Among these are Christopher Aakre, Simeone Marino, Jiachen Xu, Ming Tang, Nina Zhou, Chao Gao, Alex Kalinin, Syed Husain, Brady Zhu, Farshid Sepehrband, Lu Zhao, Sam Hobel, Hanbo Sun, Tuo Wang, Brian Athey, and many others.
Fair Use Licensing
Dinov, ID. (2018) Data Science and Predictive Analytics: Biomedical and Health Applications using R, Springer (ISBN 978-3-319-72346-4).