Explore-and-Summarize-Data

View project on GitHub

Welcome to my “Explore and Summarize Data” Project Page!

I completed this project for Udacity’s Data Analyst Nanodegree “Data Analysis with R” course.

In this course, I learned how to:

  • Understand the distribution of a variable and check for anomalies and outliers
  • Quantify and visualize individual variables within a data set using appropriate plots such as scatter plots, histograms, bar charts, and box plots
  • Explore variables to identify the most important variables and relationships within a data set before building predictive models, including calculating correlations and investigating conditional means
  • Examine relationships among multiple variables by reshaping data frames and using aesthetics like color and shape to uncover more information

For this project, I used R in RStudio to explore a data set of financial contributions made by California residents during the 2016 U.S. presidential election.

Both the original .Rmd file and the generated .html file are available through the “View project on GitHub” link above.

Enjoy!