---
title: "SOCR Case-Study: Deaths in Guatemala (2009-2016)"
subtitle: "Fall 2018 SOCR Health Analytics Training Workshop"
author: "SOCR/MIDAS (Ivo Dinov)"
date: "`r format(Sys.time(), '%B %Y')`"
tags: [DSPA, SOCR, MIDAS, Big Data, Predictive Analytics]
output:
  html_document:
    theme: spacelab
    highlight: tango
    toc: true
    number_sections: true
    toc_depth: 2
    toc_float:
      collapsed: false
      smooth_scroll: true
---
# Import, plot, summarize and save data
Load the two SPSS (`*.sav`) datasets, generate summary statistics for all variables, and plot some features (e.g., histograms, box plots, and density plots) of several variables.
* [Case-Study: Deaths in Guatemala (2009-2016)](https://umich.instructure.com/courses/38100/files/folder/Case_Studies/15_ALS_CaseStudy),
* [Other SOCR Case-Studies](https://umich.instructure.com/courses/38100/files/folder/Case_Studies).
```{r message=FALSE, warning=FALSE}
# install.packages("foreign")
library("foreign")
pathToZip <- tempfile()
download.file("https://umich.instructure.com/files/8882923/download?download_frd=1", pathToZip, mode = "wb")
# Check the ZIP file contents (list only, nothing is extracted)
unzip(pathToZip, list = TRUE)
# 2009
dataset_2009 <- read.spss(unzip(pathToZip, files = "2009vitales.sav", list = F, overwrite = TRUE), to.data.frame=TRUE)
dim(dataset_2009)
## 71707 25
# str(dataset_2009)
# View(dataset_2009)
summary(dataset_2009)
str(dataset_2009)
# 2016
dataset_2016 <- read.spss(unzip(pathToZip, files = "2016vitales.sav", list = F, overwrite = TRUE), to.data.frame=TRUE)
dim(dataset_2016) # 82565 28
summary(dataset_2016)
str(dataset_2016)
# Data Dictionary and Challenges (DDC)
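# install.packages("readxl")
dataset_DDC <- readxl::read_xlsx(unzip(pathToZip, files = "DataDictionary_Challenges.xlsx", overwrite = TRUE))
dim(dataset_DDC)
# View() needs an interactive session, so skip it when knitting
if (interactive()) View(dataset_DDC)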
unlink(pathToZip)
library("DT")
datatable(dataset_2016)
```
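The chunk above imports and summarizes the data but does not yet plot or save anything. The following is a minimal sketch of those two steps, not a definitive analysis. It assumes nothing about the variable names in the Guatemala files: it simply picks the first numeric and the first factor column it finds, which may turn out to be an identifier or code rather than a meaningful measurement. The names `plot_data`, `toy_numeric`, `toy_factor`, and `out_file` are hypothetical, and the synthetic fallback data exist only so that the chunk also runs when `dataset_2016` has not been loaded.

```{r message=FALSE, warning=FALSE}
# Use the imported 2016 data when available; otherwise fall back to a small
# synthetic stand-in (hypothetical columns) so this chunk runs on its own.
if (exists("dataset_2016")) {
  plot_data <- dataset_2016
} else {
  set.seed(1234)
  plot_data <- data.frame(
    toy_numeric = rnorm(200, mean = 50, sd = 15),
    toy_factor  = factor(sample(c("A", "B", "C"), 200, replace = TRUE))
  )
}

# Find numeric and factor columns without assuming any variable names
num_cols <- names(plot_data)[vapply(plot_data, is.numeric, logical(1))]
fac_cols <- names(plot_data)[vapply(plot_data, is.factor, logical(1))]

# Histogram, density plot, and box plot of the first numeric column
if (length(num_cols) > 0) {
  x <- plot_data[[num_cols[1]]]
  x <- x[is.finite(x)]   # drop NA/Inf values before plotting
  if (length(x) > 1) {
    hist(x, main = paste("Histogram of", num_cols[1]), xlab = num_cols[1])
    plot(density(x), main = paste("Density of", num_cols[1]))
    boxplot(x, horizontal = TRUE, main = paste("Box plot of", num_cols[1]))
  }
}

# Bar plot of category counts for the first factor column
if (length(fac_cols) > 0) {
  barplot(table(plot_data[[fac_cols[1]]]), las = 2,
          main = paste("Counts of", fac_cols[1]))
}

# Save the data frame in R's native RDS format and read it back as a check
out_file <- file.path(tempdir(), "plot_data.rds")
saveRDS(plot_data, out_file)
reloaded <- readRDS(out_file)
identical(dim(reloaded), dim(plot_data))
```

Writing to `tempdir()` keeps the sketch free of side effects. For a lasting copy, point `out_file` at a project folder instead.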
# Descriptive statistics and graphs of the data
Explore these data with some `exploratory` and `quantitative` analytics, using the following materials as a guide:
* [DSPA Chapter 2: Data Management](http://www.socr.umich.edu/people/dinov/courses/DSPA_notes/02_ManagingData.html)
* [DSPA Chapter 3: Visualization](http://www.socr.umich.edu/people/dinov/courses/DSPA_notes/03_DataVisualization.html)
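
As one possible starting point, the sketch below compares the two yearly files. The `dim()` comments in the import chunk suggest 25 columns in 2009 and 28 in 2016, so a reasonable first step is to check which variables are shared, how much of each is missing, and how the shared numeric variables differ between years. This is an illustration under stated assumptions, not a prescribed workflow: `d09`, `d16`, `missing_pct`, and the `toy_*` columns are hypothetical names, and the synthetic fallback data are used only when the real datasets are not in the workspace.

```{r message=FALSE, warning=FALSE}
# Use the imported datasets when available; otherwise fall back to small
# synthetic stand-ins (hypothetical columns) so this chunk runs on its own.
if (exists("dataset_2009") && exists("dataset_2016")) {
  d09 <- dataset_2009
  d16 <- dataset_2016
} else {
  set.seed(2018)
  d09 <- data.frame(
    toy_age   = c(rpois(99, 60), NA),
    toy_group = factor(sample(c("A", "B"), 100, replace = TRUE))
  )
  d16 <- data.frame(
    toy_age   = rpois(120, 62),
    toy_group = factor(sample(c("A", "B"), 120, replace = TRUE)),
    toy_extra = runif(120)
  )
}

# Which variables appear in both years, and which only in 2016?
shared    <- intersect(names(d09), names(d16))
only_2016 <- setdiff(names(d16), names(d09))

# Percentage of missing values per column
missing_pct <- function(df) round(100 * colMeans(is.na(df)), 2)
missing_tbl <- data.frame(
  variable         = shared,
  pct_missing_2009 = missing_pct(d09[shared]),
  pct_missing_2016 = missing_pct(d16[shared]),
  row.names        = NULL
)

# Means and standard deviations of variables that are numeric in both years
num_shared <- shared[vapply(d09[shared], is.numeric, logical(1)) &
                     vapply(d16[shared], is.numeric, logical(1))]
num_summary <- do.call(rbind, lapply(num_shared, function(v) {
  data.frame(
    variable  = v,
    mean_2009 = mean(d09[[v]], na.rm = TRUE),
    mean_2016 = mean(d16[[v]], na.rm = TRUE),
    sd_2009   = sd(d09[[v]], na.rm = TRUE),
    sd_2016   = sd(d16[[v]], na.rm = TRUE)
  )
}))

only_2016
missing_tbl
num_summary
```

Variables listed in `only_2016` cannot be compared across years directly, so any year-over-year analysis would need to be restricted to `shared`.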
...