Skip to contents

The dataset comes from CouncilStat, which is used by many NYC Council district offices to enter and track constituent cases that can range from issues around affordable housing, to potholes and pedestrian safety. This dataset aggregates the information that individual staff have entered. However, district staffs handle a wide range of complex issues. Each offices uses the program differently, and thus records cases, differently and so comparisons between accounts may be difficult. Not all offices use the program. For more info see http://labs.council.nyc/districts/data/.

This is a teaching package; the data are not clean—in particular there are spurious zip codes in the complaints data, as well as other issues.

Data sourced from NYC Open Data.

The package also includes nyc_zips and census_vars. The former is a table of NYC zip codes. The latter is a table of the names of some Census Bureau ACS variables.

Installation

You can install nycomplaints from GitHub with:

remotes::install_github("kjhealy/nycomplaints@main")

Alternatively, install this package from my r-universe:

install.packages(
  "nycomplaints",
  repos = c("https://kjhealy.r-universe.dev", "https://cloud.r-project.org")
)

Including https://cloud.r-project.org ensures dependencies on CRAN are resolved automatically.

Loading the Data

library(tidyverse) # Optional but strongly recommended
library(nycomplaints)

nycomplaints
#> # A tibble: 341,299 × 11
#>    unique_key   account opendate   closedate  complaint_type    descriptor zip  
#>    <chr>        <chr>   <date>     <date>     <chr>             <chr>      <chr>
#>  1 NYCC34519748 NYCC34  2025-01-09 NA         Governmental Ope… Voting In… 11385
#>  2 NYCC34519746 NYCC34  2025-01-09 2025-01-09 Utilities         Con Edison 11237
#>  3 NYCC34519742 NYCC34  2025-01-08 2025-01-08 Housing and Buil… Heat/Hot … 11249
#>  4 NYCC34519741 NYCC34  2025-01-06 2025-01-08 Housing and Buil… Heat/Hot … 11211
#>  5 NYCC34519743 NYCC34  2024-12-23 2025-01-09 Housing and Buil… Maintenan… 11211
#>  6 NYCC34519744 NYCC34  2024-12-11 2025-01-09 General Welfare   SSI and S… 11211
#>  7 NYCC34519745 NYCC34  2024-09-05 2025-01-09 Finance           Tax Prepa… 11385
#>  8 NYCC31505454 NYCC31  2024-06-04 NA         Sanitation        OVERGROWN… <NA> 
#>  9 NYCC31505447 NYCC31  2024-05-30 NA         <NA>              <NA>       11413
#> 10 NYCC31505450 NYCC31  2024-05-30 NA         <NA>              <NA>       11413
#> # ℹ 341,289 more rows
#> # ℹ 4 more variables: borough <chr>, city <chr>, council_dist <chr>,
#> #   community_board <chr>

nyc_zips
#> # A tibble: 211 × 6
#>      zip borough       city          county          long_county    short_county
#>    <dbl> <chr>         <chr>         <chr>           <chr>          <chr>       
#>  1 11368 Queens        Corona        Queens County   Queens County… Queens      
#>  2 11208 Brooklyn      Brooklyn      Kings County    Kings County,… Kings       
#>  3 11385 Queens        Ridgewood     Queens County   Queens County… Queens      
#>  4 11373 Queens        Elmhurst      Queens County   Queens County… Queens      
#>  5 11226 Brooklyn      Brooklyn      Kings County    Kings County,… Kings       
#>  6 11236 Brooklyn      Brooklyn      Kings County    Kings County,… Kings       
#>  7 10467 Bronx         Bronx         Bronx County    Bronx County,… Bronx       
#>  8 10025 Manhattan     New York      New York County New York Coun… New York    
#>  9 11207 Brooklyn      Brooklyn      Kings County    Kings County,… Kings       
#> 10 10314 Staten Island Staten Island Richmond County Richmond Coun… Richmond    
#> # ℹ 201 more rows

census_vars
#> # A tibble: 9 × 2
#>   variable   varname          
#>   <chr>      <chr>            
#> 1 B01001_001 population       
#> 2 B02001_002 white_alone      
#> 3 B02001_003 black_alone      
#> 4 B02001_005 asian_alone      
#> 5 B02001_008 two_or_more_races
#> 6 B03003_001 hispanic         
#> 7 B03002_003 nonhispanic_white
#> 8 B03002_004 nonhispanic_black
#> 9 B19013_001 med_hhinc