Chapter 2 Data sources
Data source: https://www.gunviolencearchive.org/
Every member of our group helped collect data, because we wanted to work on a topic that all of us are interested in. Legal gun ownership is a long-standing and controversial issue in the United States, so we searched for data on this topic to analyze some of the negative effects of gun rights, and we eventually found the Gun Violence Archive website listed above. The site provides comprehensive data, including the total number of gun violence incidents categorized by deaths and injuries, victim age, cause of the shooting, and so on. We will use these data for exploratory analysis and data visualization tasks, such as comparing the number of gun violence incidents across categories over time, or analyzing how the number of incidents varies by geographic region, which is closely related to state laws and crime rates.
The data set contains the numerical variables “# Killed” and “# Injured”, which give the number of people killed or injured in each gun violence incident. It also contains the date variable “Incident Date” and the categorical variables “State” and “City or County”, which give the date and location of each incident. The number of records differs from year to year: the dataset has 40594 total records in 2014, 49546 total records in 2015, 56280 total records in 2016, 59038 total records in 2017, and 56280 total records in 2018.
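As a minimal sketch of how these variables could be read and summarized, the Python snippet below parses a few rows and counts records per year and casualties per state. The sample rows are made up for illustration and are not real records. The date format ("January 1, 2017") and the helper names `load_incidents`, `records_per_year`, and `casualties_by_state` are our assumptions, not part of the source data.

```python
import csv
import io
from collections import Counter, defaultdict
from datetime import datetime

# Made-up sample rows for illustration only, NOT real records.
# Column names follow the variables described in the text.
SAMPLE_CSV = """Incident Date,State,City or County,# Killed,# Injured
"January 1, 2017",Ohio,Columbus,1,0
"March 5, 2017",Texas,Houston,0,2
"July 4, 2018",Ohio,Cleveland,2,1
"""


def load_incidents(text, date_format="%B %d, %Y"):
    """Parse CSV text into a list of dicts with typed fields.

    The default date_format is an assumption about how dates are written.
    """
    incidents = []
    for row in csv.DictReader(io.StringIO(text)):
        incidents.append({
            "date": datetime.strptime(row["Incident Date"], date_format).date(),
            "state": row["State"],
            "place": row["City or County"],
            "killed": int(row["# Killed"]),
            "injured": int(row["# Injured"]),
        })
    return incidents


def records_per_year(incidents):
    """Count how many incident records fall in each calendar year."""
    return Counter(incident["date"].year for incident in incidents)


def casualties_by_state(incidents):
    """Sum the killed and injured counts for each state."""
    totals = defaultdict(lambda: {"killed": 0, "injured": 0})
    for incident in incidents:
        totals[incident["state"]]["killed"] += incident["killed"]
        totals[incident["state"]]["injured"] += incident["injured"]
    return dict(totals)


incidents = load_incidents(SAMPLE_CSV)
print(records_per_year(incidents))
print(casualties_by_state(incidents))
```

Running the same functions on the full yearly files would be one way to check the record counts reported above.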
One problem with the datasets on this website is that incidents with specific characteristics, such as officer-involved incidents or mass shootings, are recorded in separate files, which means additional data transformation is needed before analysis and visualization. Moreover, the data contain specific records only for children (ages 0-11) and teens (ages 12-17) killed or injured in gun violence incidents, with no detailed records for other age groups such as the elderly. This lack of information for the remaining age groups will make our age-based analysis of victims harder.
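One possible form of this transformation is to combine the separate files into a single table with one true/false column per characteristic. The sketch below assumes that every file carries an "Incident ID" column that identifies the same incident across files. That key, the category names, and the function `combine_category_files` are hypothetical, and the rows are made up for illustration.

```python
import csv
import io

# Made-up rows for illustration only, NOT real records.
# "Incident ID" is an assumed key shared by all files.
MASS_SHOOTINGS_CSV = """Incident ID,Incident Date,State,# Killed,# Injured
102,"January 2, 2018",Texas,0,4
103,"January 3, 2018",Iowa,1,3
"""

OFFICER_INVOLVED_CSV = """Incident ID,Incident Date,State,# Killed,# Injured
101,"January 1, 2018",Ohio,1,0
102,"January 2, 2018",Texas,0,4
"""


def read_rows(text):
    """Read CSV text into a list of dicts, one per record."""
    return list(csv.DictReader(io.StringIO(text)))


def combine_category_files(category_rows, key="Incident ID"):
    """Merge per-category record lists into one list of records.

    Each output record keeps the original columns and gains one boolean
    column per category name. An incident that appears in several files
    is kept once, with every matching flag set to True.
    """
    names = list(category_rows)
    combined = {}
    for name, rows in category_rows.items():
        for row in rows:
            # First sighting: copy the row and start with all flags False.
            record = combined.setdefault(
                row[key], {**row, **{n: False for n in names}}
            )
            record[name] = True
    return list(combined.values())


combined = combine_category_files({
    "mass_shooting": read_rows(MASS_SHOOTINGS_CSV),
    "officer_involved": read_rows(OFFICER_INVOLVED_CSV),
})
for record in combined:
    print(record["Incident ID"], record["mass_shooting"], record["officer_involved"])
```

Keeping a single row per incident avoids double counting deaths and injuries when an incident belongs to more than one category.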