Understanding the American Community Survey (ACS) 2013-2017 Summary File: A Step-by-Step Guide to Downloading ACS Data for Kansas Block Groups.

Understanding the American Community Survey (ACS) 2013-2017 Summary File

The American Community Survey (ACS) 2013-2017 summary file contains the ACS 5-year estimates: demographic and socioeconomic data for geographic areas across the United States. The data is collected by the US Census Bureau and is used to guide policy, plan programs, and support business decisions.

In this article, we will focus on downloading all variables from all tables in the ACS 2013-2017 summary file for all census block groups in a state, specifically Kansas.

Working with the totalcensus Package

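The totalcensus package is an R package for extracting data from Census Bureau summary files, including the ACS 5-year summary files. It reads the summary files and returns the requested variables for the chosen geographic areas, and it includes helper functions for searching the available tables and variables.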

To start, we need to install and load the totalcensus package in R:

install.packages("totalcensus")
library(totalcensus)

Understanding the read_acs5year Function

The read_acs5year function is used to import ACS data from a summary file. The function takes several arguments, including:

  • year: The end year of the 5-year period (2017 for the 2013-2017 data).
  • states: A vector of state abbreviations for which to retrieve data (e.g., "KS").
  • table_contents: A character vector of references to the variables to extract (e.g., "B02001_001").
  • summary_level: The summary level to keep (e.g., block group, tract).
  • with_margin: A logical indicating whether to include margins of error in the data.
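
As a rough sketch of how these arguments fit together, the call can be wrapped in a small helper. The helper name read_ks_block_groups and its parameters are hypothetical and not part of totalcensus; only read_acs5year and the argument names listed above come from the package. Defining the function does not read any data; that only happens when it is called.

```r
# Hypothetical helper (not part of totalcensus): bundles the arguments
# described above into one call for Kansas block groups.
read_ks_block_groups <- function(refs, with_margin = FALSE) {
    totalcensus::read_acs5year(
        year = 2017,                    # end year of the 2013-2017 data
        states = "KS",
        table_contents = refs,          # character vector of references
        summary_level = "block group",
        with_margin = with_margin       # TRUE also requests margins of error
    )
}
```

Keeping the references as a parameter makes it easy to reuse the same call for different groups of variables.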

Understanding the table_contents Argument

The table_contents argument specifies which variables to extract. Each element is a reference to a single line of a table, written as the table ID, an underscore, and a three-digit line number. Variables have to be requested explicitly by reference; a table ID on its own, such as "B02001", is not a variable reference.

For example, to retrieve the first three lines of table B02001 (Race), we would use:

table_contents = c("B02001_001", "B02001_002", "B02001_003")

This code retrieves the following variables:

  • B02001_001: Total population (all races).
  • B02001_002: White alone.
  • B02001_003: Black or African American alone.
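
ACS variable references follow a regular pattern: the table ID, an underscore, and a three-digit line number (for example B02001_001). As a small sketch, such references can be generated rather than typed by hand. The helper make_refs is hypothetical and not a totalcensus function.

```r
# Build references of the form <table ID>_<three-digit line number>.
# make_refs is a hypothetical helper, not a totalcensus function.
make_refs <- function(table_id, lines) {
    sprintf("%s_%03d", table_id, as.integer(lines))
}

race_refs <- make_refs("B02001", 1:3)
print(race_refs)
# [1] "B02001_001" "B02001_002" "B02001_003"
```

The resulting character vector has the shape that the table_contents argument expects.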

Finding Available Table Contents

To find the table contents available for the ACS 5-year data, we can use the search_tablecontents function:

search_tablecontents("acs5")

This code lists the table contents available for the ACS 5-year data, which cover the 2013-2017 summary file.
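
A listing like this is usually narrowed down to one table before use. The sketch below does that in base R on a toy data frame that stands in for the search result. The column names reference and table_content are assumptions made for this illustration and may not match what search_tablecontents actually returns.

```r
# Toy stand-in for a search result; the column names "reference" and
# "table_content" are assumptions made for this illustration.
contents <- data.frame(
    reference = c("B02001_001", "B02001_002", "B02001_003", "B01001_001"),
    table_content = c("Total", "White alone",
                      "Black or African American alone", "Total"),
    stringsAsFactors = FALSE
)

# Keep only the rows that belong to table B02001.
race_rows <- contents[grepl("^B02001_", contents$reference), ]

# The references of the kept rows, ready to use as table_contents.
race_refs <- race_rows$reference
```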

Extracting Variables from Tables

Once we have identified the tables we want to include in our data, we can extract the variables using the read_acs5year function. We can create a vector of table contents and pass it to the function:

# References to the first three lines of table B02001 (Race)
table_contents <- c("B02001_001", "B02001_002", "B02001_003")

# Read the selected variables for every block group in Kansas
trying_acs <- read_acs5year(
    year = 2017,
    states = "KS",
    table_contents = table_contents,
    summary_level = "block group"
)

This code extracts the three selected variables of table B02001 for Kansas block groups in the ACS 2013-2017 data.

Handling a Large Number of Variables

With over 26,000 available variables, extracting all of them into one data.table is not practical. Instead, request the variables in manageable groups and use the dplyr package to keep only the columns needed from each result:

library(dplyr)

# Keep only the columns whose names start with the table ID B02001
trying_acs %>%
    select(starts_with("B02001"))

This code keeps only the columns of trying_acs that hold variables from table B02001.
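
One way to keep each request to a manageable size is to split a long vector of references into batches. The following is a minimal base R sketch; the helper split_into_batches, the batch size of 4, and the ten-element stand-in vector are all illustrative choices, not part of totalcensus.

```r
# Stand-in for a long vector of references; a real one would come from
# the package's table-content listing.
all_refs <- sprintf("B02001_%03d", 1:10)

# split_into_batches is a hypothetical helper, not a totalcensus function.
# It assigns each reference a batch number and splits on that number.
split_into_batches <- function(refs, batch_size) {
    batch_id <- ceiling(seq_along(refs) / batch_size)
    unname(split(refs, batch_id))
}

batches <- split_into_batches(all_refs, batch_size = 4)
print(lengths(batches))
# [1] 4 4 2
```

Each element of batches could then be passed as table_contents in its own read_acs5year call, with the results combined afterwards on a shared geographic identifier column.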

Conclusion

Downloading all variables from all tables in the ACS 2013-2017 summary file for all census block groups in a state requires careful consideration of the available data and how to extract it. By using the totalcensus package, understanding the role of the table_contents argument, and combining the search_tablecontents function with the dplyr package, we can efficiently retrieve the desired data.

Remember, working with large datasets requires patience, persistence, and a willingness to experiment with different approaches. With practice and experience, you’ll become proficient in handling even the most daunting datasets.


Last modified on 2023-10-08