HS 650 001 WN 2018

Overview

The Data Science and Predictive Analytics (DSPA) course aims to build computational abilities, inferential thinking, and practical skills for tackling core data scientific challenges. It explores foundational concepts in data management, processing, statistical computing, and dynamic visualization using modern programming tools and agile web-services. Concepts, ideas, and protocols are illustrated through examples of real observational, simulated and research-derived datasets. Some prior quantitative experience in programming, calculus, statistics, mathematical models, or linear algebra will be necessary.


This open graduate course will provide a general overview of the principles, concepts, techniques, tools and services for managing, harmonizing, aggregating, preprocessing, modeling, analyzing and interpreting large, multi-source, incomplete, incongruent, and heterogeneous data (Big Data). The focus will be to expose students to common challenges related to handling Big Data and present the enormous opportunities and power associated with our ability to interrogate such complex datasets, extract useful information, derive knowledge, and provide actionable forecasting. Biomedical, healthcare, and social datasets will provide context for addressing specific driving challenges. Students will learn about modern data analytic techniques and develop skills for importing and exporting, cleaning and fusing, modeling and visualizing, analyzing and synthesizing complex datasets. The collaborative design, implementation, sharing and community validation of high-throughput analytic workflows will be emphasized throughout the course.

Prerequisites

You can view the General DSPA Prerequisites. To ensure students are comfortable in this DSPA course, consider taking the self-assessment (pretest) prior to enrolling in the course.

To summarize, students should have prior experience with college-level (undergraduate) mathematical modeling, statistical analysis, or programming courses, or permission of the instructor. Some MOOCs may be taken as prerequisites, e.g., Coursera, EdX1, EdX2. Additional examples of remediation courses are provided in the self-assessment (pretest).

Course Objectives

Trainees successfully completing the course will:

  1. Gain understanding of the computational foundations in Big Data Science
  2. Develop critical inferential thinking
  3. Gather a tool chest of R libraries for managing and interrogating raw and derived, observed, experimental, and simulated big healthcare datasets
  4. Possess practical skills for handling complex datasets

 

Target Audience

This course is appropriate for trainees who have a significant interest in learning data scientific and predictive analytic methods and who can commit a substantial amount of time and undivided attention to studying, practicing, and interacting with other trainees in the course.

Notes

Class notes, software code, learning materials and assignments are provided here.

Instructor

Ivo D. Dinov

Competencies

This course is designed to build specific data science skills and predictive analytic competencies.

Logistics

University of Michigan affiliates can register directly for the course using their UMich credentials and the Enrollment link below. Non-affiliated learners and students outside the University of Michigan first need to obtain a UMich friend account (using an outside email), which can then be used to register for the DSPA course. The course meets January-April 2018, Monday & Wednesday, 8:30-10:30 AM, in SNB 1240.

Acknowledgments

The DSPA course is made possible with substantial support from the Michigan Institute for Data Science (MIDAS), Statistics Online Computational Resources (SOCR), Health Behavior and Biological Sciences (HBBS/UMSN), and Computational Medicine and Bioinformatics (DCM&B).

Fair Use Licensing

Like all SOCR resources, and to support open science, these resources (learning materials and source code) are CC-BY and LGPL licensed.

This course content is offered under a CC Attribution Share Alike license. Content in this course can be considered under this license unless otherwise noted.