{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Discussion 2: Data visualization and vibecoding\n",
    "\n",
    "STATS 60 / STATS 160 / PSYCH 10\n",
    "\n",
    "<div class=\"layout\" style=\"display: flex; align-items: center; justify-content: space-around;\">\n",
    "\n",
    "<div style=\"flex: 1;\">\n",
    "\n",
    "**Today's section**\n",
    "\n",
    "- Recap of lecture material.\n",
    "- Week 2 practice quiz 1.\n",
    "- Making visualizations with vibecoding.\n",
    "    - <a href = \"https://colab.research.google.com/drive/1nLfjax5gc1s5xnf5g_QJiQGXTNEho3gz?authuser=1\">Notebook link</a>.\n",
    "\n",
    "</div>\n",
    "<div style=\"flex: 1;\">\n",
    "\n",
    "\n",
    "</div>\n",
    "</div>\n",
    "\n",
    "# Recap\n",
    "\n",
    "## Data terminology\n",
    "\n",
    "\n",
    "- **Observational units** are the individual entities on which data are recorded.\n",
    "\n",
    "\n",
    "- **Variables** are the different characteristics/measurements that are recorded for each observational unit.\n",
    "    - Variables can be **quantitative** (like height) or **categorical** (like eye color).\n",
    "\n",
    "\n",
    "- The **distribution** of a variable describes the pattern of the variable across the different observational units.\n",
    "    - The distribution can be represented with a visualization or with summary statistics (like the mean).\n",
    "\n",
    "\n",
    "\n",
    "## Visualizations\n",
    "\n",
    "- When making a visualization, think about the **number of variables** and the **type of variable** (quantitative or categorical).\n",
    "\n",
    "\n",
    "- For a **single** variable:\n",
    "    - **Categorical**: bar chart or pie chart.\n",
    "    - **Quantitative**: histogram.\n",
    "\n",
    "\n",
    "- For **multiple** variables:\n",
    "    - **Two categorical**: stacked bar chart.\n",
    "    - **Two quantitative**: scatter plot.\n",
    "    - **One quantitative, one categorical**: side-by-side histograms.\n",
    "\n",
    "\n",
    "- For a variable that changes **over time**: line chart.\n",
    "\n",
    "\n",
    "- For a variable that changes **over locations**: dot map or chloropelth (maps)\n",
    "\n",
    "\n",
    "# Practice quiz 1\n",
    "\n",
    "## Bird nests and cigarette butts\n",
    "\n",
    "Practice quiz \\#1 is about a <a href = \"https://doi.org/10.1098/rsbl.2012.0931\">2013 study</a> that found that bird nests that contained cigarette butts typically contained fewer parasites. This was done by measuring the number of parasites and the weight of cigarette butts in different bird nests.\n",
    "\n",
    "<div class=\"layout\" style=\"display: flex; align-items: center; justify-content: space-around;\">\n",
    "\n",
    "<div style=\"flex: 1;\">\n",
    "<figure><img src=\"../figures/house-finch.jpg\" alt=\"The study featured house finches...\"style=\"width:100%;\"><figcaption>The study featured <a href = \"https://www.allaboutbirds.org/guide/House_Finch/\">House Finches</a>...</figcaption></figure>\n",
    "\n",
    "</div>\n",
    "<div style=\"flex: 1;\">\n",
    "\n",
    "<figure><img src=\"../figures/house-sparrow.jpg\" alt=\"and House Sparrows\" style=\"width:100%;\"><figcaption>and <a href = \"https://www.allaboutbirds.org/guide/House_Sparrow/\">House Sparrows</a>.</figcaption></figure>\n",
    "</div>\n",
    "</div>\n",
    "\n",
    "\n",
    "\n",
    "## Bird nests and cigarette butts\n",
    "\n",
    "- What are the observational units?\n",
    "- What variables will be relevant for the study?\n",
    "\n",
    "\n",
    "- The observational units are bird nests.\n",
    "- Relevant variables are:\n",
    "    - Number of cigarette butts.\n",
    "    - Number of nest parasites.\n",
    "    - Weight of cigarette butts.\n",
    "    - Species of bird.\n",
    "    - and more!\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "## Bird nests and cigarette butts\n",
    "\n",
    "What type of visualization would be best to see the relationship between the weight of cigarette butts and the number of nest parasites?\n",
    "\n",
    "\n",
    "A scatter plot would be best. This is because the weight of cigarette butts and the number of pests are both quantitative variables, and we want to see the relationship between these two variables.\n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/birds_cigarettes.png\" alt=\"A scatter plot showing the relationship between the weight of cigarette butts and the number of parasites in different nests.\" style=\"width:45%;\"><figcaption></figcaption></figure>\n",
    "\n",
    "## Weight of cigarette butts\n",
    "\n",
    "Below is a histogram of the weight of cigarettes found in different birds nests. Based on the histogram, is the mean or median weight larger? Explain why.\n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/butts_weight_hist.png\" alt=\"\" style=\"width:45%;\"><figcaption></figcaption></figure>\n",
    "\n",
    "\n",
    "\n",
    "The mean weight is larger than the median weight. This is because the mean is more sensitive to outliers and so the small number of nests with a large weight will increase the mean more than the median.\n",
    "\n",
    "# Vibe-coding\n",
    "\n",
    "## NYC waste data\n",
    "\n",
    "<div class=\"layout\" style=\"display: flex; align-items: center; justify-content: space-around;\">\n",
    "\n",
    "<div style=\"flex: 1;\">\n",
    "\n",
    "- We will vibe-code (ask AI to write code for us) in order to visualize data on waste collection in NYC.\n",
    "\n",
    "- Details are in the Colab Notebook. \n",
    "\n",
    "- You can access the notebook <a href = \"https://colab.research.google.com/drive/1nLfjax5gc1s5xnf5g_QJiQGXTNEho3gz?usp=sharing\">here</a> or under \"notebook link\" on the course web page for <a href = \"https://web.stanford.edu/class/stats60/discussion/02-discussion.html\">discussion 2</a>.\n",
    "\n",
    "</div>\n",
    "<div style=\"flex: 1;\">\n",
    "\n",
    "<figure><img src=\"../figures/street_cleaning_nyc.png\" alt=\"A history of NYC street cleaning by Julia Wertz.\" style=\"width:100%;\"><figcaption><a href = \"https://www.newyorker.com/culture/culture-desk/the-n-y-c-mystery-history-hour-street-cleaning\">A history of NYC street cleaning by Julia Wertz.</a></figcaption></figure>\n",
    "\n",
    "</div>\n",
    "</div>\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Phython (JB)",
   "language": "python",
   "name": "jb-python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
