Exploratory Data Analysis Assignment


This week, you will perform the first three steps in the process of exploratory data analysis (EDA) using curated data. Document your workflow as you go. The assignment has two parts: Part 1 covers the EDA you perform on your selected dataset, and Part 2 discusses EDA in general.

Follow these steps to complete the assignment.

Part 1:

Curate data from one of the scenarios provided in the Week 1 Resources Data Repository.
Using either Python or R, complete the entire process of exploratory data analysis (EDA) on the selected dataset. Use descriptive and meaningful comments in the code or process file as documentation.
Describe the first three steps in the process of exploratory data analysis (EDA) that you used on the dataset.
Explain your general observations, including an understanding of the types of variables present, the extent to which the dataset is complete, and additional insight gained about the dataset. Describe patterns discovered and anomalies detected. Embed at least two annotated visualizations to support your findings.
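
As a rough illustration of the kinds of checks named in the steps above (variable types, completeness, and anomalies), the sketch below uses only the Python standard library. The sample data, the column names, and the helper functions `load_rows`, `profile`, and `iqr_outliers` are all hypothetical and are not part of the Week 1 Resources Data Repository. The 1.5 x IQR rule shown here is one common rule of thumb for flagging unusual values, not a required method.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for a curated dataset (an assumption for
# illustration only; replace with the data from your chosen scenario).
SAMPLE_CSV = """id,age,region,income
1,34,north,52000
2,41,south,61000
3,,east,58000
4,29,north,
5,38,west,940000
6,45,south,63000
7,31,east,55000
"""


def load_rows(text):
    """Read CSV text into a list of dicts, one per record."""
    return list(csv.DictReader(io.StringIO(text)))


def is_number(value):
    """Return True if the string can be parsed as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def profile(rows):
    """Summarize each column: inferred variable type and missing-value count."""
    summary = {}
    for col in rows[0].keys():
        present = [r[col] for r in rows if r[col] != ""]
        numeric = bool(present) and all(is_number(v) for v in present)
        summary[col] = {
            "type": "numeric" if numeric else "categorical",
            "missing": len(rows) - len(present),
        }
    return summary


def iqr_outliers(rows, col):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    values = [float(r[col]) for r in rows if r[col] != ""]
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


rows = load_rows(SAMPLE_CSV)
print(profile(rows))                 # variable types and completeness
print(iqr_outliers(rows, "income"))  # candidate anomalies to investigate
```

Flagged values are candidates for investigation rather than errors to delete; the write-up should explain how each one was handled and why.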

Part 2:

Describe the typical process of exploratory data analysis (EDA) and the software tools commonly involved.
Explain the purpose of exploratory data analysis and its importance to predictive model construction.
Discuss why EDA is critical for investigating data to find patterns, identify anomalies, test hypotheses, and check assumptions.
Describe the specific steps (and related goals) involved in a typical EDA before confirmatory or final models are developed.

Length: 5 to 7 pages, not including title and reference pages

References: Include at least two scholarly references.

The completed assignment should address all of the assignment requirements, exhibit evidence of concept knowledge, and demonstrate thoughtful consideration of the content presented in the course.

