Capstone Project
- Due No Due Date
- Points 100
- Submitting a file upload
This is an optional capstone project (not required). It aims to demonstrate your thinking about conducting a real scientific study using some of the techniques we presented in class. Review your own data and other publicly available data archives, as well as the SOCR Data and Case-Studies, and come up with an interesting project that you complete and submit as a final term paper. Your report should include a relevant healthcare application along with the data analytics and scientific methods that you chose to interrogate the data. Below are the basic requirements your project must satisfy:
- Please submit your project as a self-contained PDF, including text/tables/graphs. No hand-written reports will be accepted.
- You can use any dataset/case-study that has over 100 cases and over 20 features, including data from the case-studies on the Canvas site.
- Examples of online resources containing interesting data:
- SOCR Wiki Data Resource: SOCR Data
- Nature Scientific Data
- US EPA http://www.epa.gov/safewater/databases/index.html
- Human Development Reports http://hdr.undp.org/en/
- National Center for Health Statistics http://www.cdc.gov/nchs/express.htm
- Washington State Department of Health http://www.doh.wa.gov/Data/data.htm
- Database for Phenotypes and Genotypes (dbGaP)
- UCLA Statistics Department Data Archive http://www.stat.ucla.edu/data
- US Census Bureau (econ, population, geographic, health data) http://www.census.gov/
- Bureau of Justice Statistics http://bjs.ojp.usdoj.gov/
- Statistics Resources Online http://www.lesley.edu/library/guides/research/statistics.html
- USA Federal Government Data http://www.data.gov
- Google Public Datasets http://www.google.com/publicdata/directory
- Format:
- The front page should include your student ID, name, course ID and instructor. Start with a one-paragraph abstract, followed by an intro/background of the problem, methods, results, discussion/conclusion, acknowledgments, and references, in that order. Clearly state the problem you have chosen to investigate. List the resources you used to come up with the project and reference all sources you used to complete the project.
- Use data science techniques from the list of techniques we have discussed in the course to convey your approach.
- Interpret your results in the context of the problem using lay (plain, non-technical) language. Write conclusions and discussions at the end of your report and acknowledge outside help. Describe how this project can be extended in the future.
- One or two people can work on a project as a team. If two people work together, both must contribute equally to its completion, and each must submit a separate copy of the project with the names of both students at the top. Expectations of team projects are higher.
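The dataset size requirement above (over 100 cases and over 20 features) can be checked before you commit to a dataset. The following is only a minimal sketch in Python, assuming the data are in a CSV file with a single header row, that each non-blank row is one case, and that every column counts as a feature (whether an ID column counts is a judgment call). The names `dataset_shape` and `meets_requirement` are hypothetical helpers, not part of any course tooling, and the data are synthetic.

```python
import csv
import io

MIN_CASES = 100     # requirement: more than 100 cases (rows)
MIN_FEATURES = 20   # requirement: more than 20 features (columns)


def dataset_shape(csv_file):
    """Return (n_cases, n_features) for a CSV file object with one header row."""
    reader = csv.reader(csv_file)
    header = next(reader)                        # first row holds the feature names
    n_cases = sum(1 for row in reader if row)    # count non-blank data rows
    return n_cases, len(header)


def meets_requirement(n_cases, n_features):
    """True only if both counts strictly exceed the stated minimums."""
    return n_cases > MIN_CASES and n_features > MIN_FEATURES


# Synthetic stand-in for a real file: 150 cases, 25 features.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([f"feature_{j}" for j in range(25)])
for i in range(150):
    writer.writerow([i * j for j in range(25)])
buffer.seek(0)

n_cases, n_features = dataset_shape(buffer)
print(n_cases, n_features, meets_requirement(n_cases, n_features))
```

For a real dataset, one would replace the synthetic buffer with something like `open("my_data.csv", newline="")`, where the file name is a placeholder.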
Rubric
Criteria | Ratings | Pts
---|---|---
Correctness and scientific validity | | --
Result reproducibility | | --
Content focus, presentation style, and clarity | | --
Total Points: 100