A2-RaymondLuong

Final Visualization

Rayluong a2 final.png

San Francisco city employees can barely afford to live in the city where they work. Using an average rent of $3,000 per month (averaged over 2011-2014) and the rule of thumb that a third of one’s income should go toward rent, one needs an annual salary of at least $108,000 ($3,000 × 12 months × 3). Across most departments and categories, the median salary falls below that figure. As rents in San Francisco continue to rise, the income inequality surrounding the growing tech scene is likely to be exacerbated further.
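The benchmark arithmetic can be checked in a few lines. This is only a minimal sketch: the function name `required_annual_salary` and its parameter names are illustrative and were not part of the original analysis, which was done in Excel and Tableau.

```python
def required_annual_salary(monthly_rent, income_to_rent_ratio=3):
    """Annual salary needed if rent should be 1/income_to_rent_ratio of income.

    Both parameter names are hypothetical; the default of 3 encodes the
    "a third of income goes to rent" rule of thumb.
    """
    annual_rent = monthly_rent * 12
    return annual_rent * income_to_rent_ratio


# Average monthly rent used in the write-up.
benchmark = required_annual_salary(3000)
print(benchmark)
```

Changing `monthly_rent` shows how quickly the benchmark moves as rents rise.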

Design Choices

  • I chose to make my refined question the title of my graph, because it immediately encourages my user to consider the message and social context that I am trying to convey.
  • The benchmark line is crucial to the message of my visualization, so I used a dotted red line. The red immediately catches the reader’s eye but the dotted nature prevents the line from conflicting with the rest of the chart.
  • I used a series of box-and-whisker plots to easily visualize the distributions of data and compare statistical data such as medians and quartiles.
  • I sorted the job categories on the chart by decreasing median annual salary so it is easy to see at a glance which departments pay the most and least, as well as to look for trends across job categories.
  • I left the box-and-whisker plots unlabelled because numbers would make the chart too cluttered and draw attention away from the overall message.
  • Even though box-and-whisker plots usually display only medians and quartiles, I decided to include the individual data points as well. Showing them allows the viewer to quickly spot outliers and to see how the data are spread within each category. Because the individual data points are not as important as the summary statistics, I used a cooler blue so they would blend into the background.
  • I removed the default gridlines Tableau creates as well as the top/right borders of the chart because they are chartjunk.
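The summary statistics behind the sorted box-and-whisker plots can be sketched in Python. This is an illustration under stated assumptions, not how the chart was built (Tableau handled that): the category names and salary values below are invented, and `statistics.quantiles` with `method="inclusive"` is just one quartile convention, which may differ from Tableau's.

```python
from statistics import quantiles

BENCHMARK = 108_000  # salary benchmark from the rent rule of thumb

# Invented salaries purely for illustration; not values from the dataset.
salaries_by_category = {
    "Category A": [60_000, 75_000, 90_000, 120_000, 150_000],
    "Category B": [40_000, 50_000, 55_000, 70_000, 95_000],
    "Category C": [100_000, 110_000, 125_000, 140_000, 180_000],
}


def summarize(salaries):
    """Return the quartiles that a box plot draws for one category."""
    q1, q2, q3 = quantiles(salaries, n=4, method="inclusive")
    return {"q1": q1, "median": q2, "q3": q3}


summaries = {name: summarize(vals) for name, vals in salaries_by_category.items()}

# Order categories by decreasing median, as on the chart's axis.
ordered = sorted(summaries, key=lambda name: summaries[name]["median"], reverse=True)

# Categories whose median falls below the benchmark line.
below_benchmark = [name for name in ordered if summaries[name]["median"] < BENCHMARK]
```

Sorting on the median rather than the mean keeps a few very high earners from reordering the categories.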

Process

Iteration 1: Initial Domain of Marriage in the United States

When I first started this project, I wanted to look at data pertaining to same-sex marriage in the United States. However, given that same-sex marriage has only recently been legalized across the United States as a whole, comprehensive data on this specific topic was very difficult to find. From here, I decided to look at the institution of marriage in general in the United States. I’ve heard marriage rates have been on the decline and that millennials are less interested in marriage, and one relationship I’ve always wanted to investigate was the connection between age of marriage and education level. My initial question was: how are marriage rates and education level related?

I found a dataset from 538 (coincidentally the same repository that housed the data from Assignment 1), and created this chart that depicts the decline in overall marriage rates among 25- to 34-year-olds, categorized by education level. It’s interesting to note that those with lower levels of educational attainment used to be the most likely to be married, but now they are the least likely.

However, from here, there wasn’t much else to explore because the data set had been preprocessed by 538 and I wanted to work with raw data. As a result, I decided to pivot my exploration.

Rayluong a2 1.png

Iteration 2: Pivot to SF City Employee Salaries

Another big topic that came to mind was salaries of workers in San Francisco. Much of the current dialogue on this topic highlights the extravagant salaries that can be earned by working at one of the numerous tech companies in San Francisco, luring in hordes of young professionals who all want a slice of the pie. However, I wanted to approach the issue from another angle. What are the salaries like for San Francisco city employees, the ones who keep the city running so that the tech scene can exist? What departments or groups of city employees make the most? Least?

The results of my first iteration with this dataset were intriguing. The highest average salaries are in the Fire, Attorney, and Police categories. Given how tech-heavy San Francisco is, it surprised me that engineering was only the fifth highest.

Rayluong a2 2.png

Iteration 3: Exploring SF City Employee Salary Data

My next question was: have these salaries changed in the past few years?

In my second iteration with this dataset, I plotted average total salary against year for each category of jobs. Overall, salaries appear to be increasing, which makes sense given economic conditions and inflation rates. Notable exceptions were Court salaries, whose average spiked from 2013 to 2014, and Fire salaries, which actually declined.
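The aggregation behind a chart like this (average total salary per category per year) could be sketched as follows. Everything here is an assumption for illustration: the row layout `(category, year, total_pay)`, the function name, and the numbers are invented and do not come from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (category, year, total_pay). All values are invented.
rows = [
    ("Category A", 2013, 150_000),
    ("Category A", 2013, 130_000),
    ("Category A", 2014, 135_000),
    ("Category B", 2013, 70_000),
    ("Category B", 2014, 90_000),
    ("Category B", 2014, 100_000),
]


def average_pay_by_category_and_year(rows):
    """Group rows by (category, year) and average total pay in each group."""
    groups = defaultdict(list)
    for category, year, total_pay in rows:
        groups[(category, year)].append(total_pay)
    return {key: mean(values) for key, values in groups.items()}


averages = average_pay_by_category_and_year(rows)
```

Plotting each category's averages against year then gives one line per category.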

This data was interesting, but I didn’t find it meaningful enough. What do these salaries mean in the context of the tech boom and rising housing costs in San Francisco? From my first iteration, I liked the idea of depicting which categories and departments made the most and least, but I also wanted to see the distribution for each category to gauge spread and outliers. I decided to go for a visualization using a series of box-and-whisker plots sorted by median income. I also decided to include a benchmark for how much one’s income should be in order to afford the average rent in San Francisco. This benchmark serves to contextualize the data in the dialogue about rising housing costs. This brings me to my final iteration.

Rayluong a2 3.png

Dataset and Tools

I obtained this data from Kaggle. The dataset contains salary records for city employees, including job title, benefits, salary, and year.

The raw data contained only job titles, so I had to bucket the job titles into categories. I used the categories proposed by one subscriber of the dataset. I manipulated the data and created these category buckets using Microsoft Excel. The visualizations were created with Tableau Desktop 9.3, and vector reformatting was done with Sketch 3.
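For readers who would rather script the bucketing step than do it in Excel, a keyword-based approach might look like the sketch below. The keyword rules, the `categorize` function, and the `"Other"` fallback are all hypothetical; the actual category definitions came from the Kaggle user's proposal and are not reproduced here.

```python
# Hypothetical keyword-to-category rules, checked in order. These are not
# the rules used in the original Excel workflow.
CATEGORY_KEYWORDS = [
    ("Fire", ("fire",)),
    ("Police", ("police",)),
    ("Attorney", ("attorney",)),
    ("Engineering", ("engineer",)),
]


def categorize(job_title, default="Other"):
    """Return the first category whose keyword appears in the job title."""
    title = job_title.lower()
    for category, keywords in CATEGORY_KEYWORDS:
        if any(keyword in title for keyword in keywords):
            return category
    return default


print(categorize("FIREFIGHTER"))
print(categorize("Police Officer 3"))
print(categorize("Gardener"))
```

Because the first matching rule wins, the order of `CATEGORY_KEYWORDS` matters when a title contains more than one keyword.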