{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ClimateMatchAcademy/course-content/blob/main/tutorials/W2D4_ClimateResponse-Extremes&Variability/W2D4_Tutorial5.ipynb)   \"Open" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "# **Tutorial 5: Non-stationarity in Historical Records**\n", "\n", "**Week 2, Day 4, Extremes & Vulnerability**\n", "\n", "**Content creators:** Matthias Aengenheyster, Joeri Reinders\n", "\n", "**Content reviewers:** Yosemley Bermúdez, Younkap Nina Duplex, Sloane Garelick, Zahra Khodakaramimaghsoud, Peter Ohue, Laura Paccini, Jenna Pearson, Derick Temfack, Peizhen Yang, Cheng Zhang, Chi Zhang, Ohad Zivan\n", "\n", "**Content editors:** Jenna Pearson, Chi Zhang, Ohad Zivan\n", "\n", "**Production editors:** Wesley Banfield, Jenna Pearson, Chi Zhang, Ohad Zivan\n", "\n", "**Our 2023 Sponsors:** NASA TOPS and Google DeepMind" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "# **Tutorial Objectives**" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "In this tutorial, we will analyze the annual maximum sea level heights in Washington DC. Coastal storms, particularly when combined with high tides, can result in exceptionally high sea levels, posing significant challenges for coastal cities. Understanding the magnitude of extreme events, such as the X-year storm, is crucial for implementing effective safety measures. We will examine the annual maximum sea level data from a measurement station near Washington DC.\n", "\n", "By the end of this tutorial, you will gain the following abilities:\n", "\n", "- Analyze timeseries data during different climate normal periods.\n", "- Evaluate changes in statistical moments and parameter values over time to identify non-stationarity." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "# **Setup**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {}, "executionInfo": { "elapsed": 1430, "status": "ok", "timestamp": 1681921157378, "user": { "displayName": "Matthias Aengenheyster", "userId": "16322208118439170907" }, "user_tz": -60 }, "tags": [] }, "outputs": [], "source": [ "# imports\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "import pandas as pd\n", "import cartopy.crs as ccrs\n", "from scipy import stats\n", "from scipy.stats import genextreme as gev\n", "import os\n", "import pooch\n", "import tempfile" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Figure Settings\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Figure Settings\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "# @title Figure Settings\n", "import ipywidgets as widgets # interactive display\n", "\n", "%config InlineBackend.figure_format = 'retina'\n", "plt.style.use(\n", " \"https://raw.githubusercontent.com/ClimateMatchAcademy/course-content/main/cma.mplstyle\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Video 1: Speaker Introduction\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Video 1: Speaker Introduction\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {}, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "# @title Video 1: Speaker Introduction\n", "# Tech team will add code to format and display the video" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {}, "tags": [] }, "outputs": [], "source": [ "# helper functions\n", "\n", "\n", "def pooch_load(filelocation=None, filename=None, processor=None):\n", " shared_location = \"/home/jovyan/shared/Data/tutorials/W2D4_ClimateResponse-Extremes&Variability\" # this is different for each day\n", " user_temp_cache = tempfile.gettempdir()\n", "\n", " if os.path.exists(os.path.join(shared_location, filename)):\n", " file = os.path.join(shared_location, filename)\n", " else:\n", " file = pooch.retrieve(\n", " filelocation,\n", " known_hash=None,\n", " fname=os.path.join(user_temp_cache, filename),\n", " processor=processor,\n", " )\n", "\n", " return file" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "# **Section 1**\n", "Let's inspect the annual maximum sea surface height data and create a plot over time. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {}, "executionInfo": { "elapsed": 472, "status": "ok", "timestamp": 1681921184150, "user": { "displayName": "Matthias Aengenheyster", "userId": "16322208118439170907" }, "user_tz": -60 }, "tags": [] }, "outputs": [], "source": [ "# download file: 'WashingtonDCSSH1930-2022.csv'\n", "\n", "filename_WashingtonDCSSH1 = \"WashingtonDCSSH1930-2022.csv\"\n", "url_WashingtonDCSSH1 = \"https://osf.io/4zynp/download\"\n", "\n", "data = pd.read_csv(\n", " pooch_load(url_WashingtonDCSSH1, filename_WashingtonDCSSH1), index_col=0\n", ").set_index(\"years\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {}, "tags": [] }, "outputs": [], "source": [ "data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {}, "executionInfo": { "elapsed": 1097, "status": "ok", "timestamp": 1680721204016, "user": { "displayName": "Yosmely Tamira Bermudez Gutierrez", "userId": "07776907551108334395" }, "user_tz": 180 }, "tags": [] }, "outputs": [], "source": [ "data.ssh.plot(\n", " linestyle=\"-\", marker=\".\", ylabel=\"Annual Maximum Sea Surface Height (mm)\"\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "\n", "Take a close look at the plot of your recorded data. There appears to be an increasing trend, which can be attributed to the rising sea surface temperatures caused by climate change. In this case, the trend seems to follow a linear pattern. It is important to consider this non-stationarity when analyzing the data.\n", "\n", "In previous tutorials, we assumed that the probability density function (PDF) shape remains constant over time. In other words, the precipitation values are derived from the same distribution regardless of the timeframe. However, in today's world heavily influenced by climate change, we cannot assume that the PDF remains stable. For instance, global temperatures are increasing, causing a shift in the distribution's location. Additionally, local precipitation patterns are becoming more variable, leading to a widening of the distribution. Moreover, extreme events are becoming more severe, resulting in thicker tails of the distribution. We refer to this phenomenon as **non-stationarity**.\n", "\n", "To further investigate this, we can group our data into three 30-year periods known as \"**climate normals**\". We can create one record for the period from 1931 to 1960, another for 1961 to 1990, and a third for 1991 to 2020. By plotting the histograms of each dataset within the same frame, we can gain a more comprehensive understanding of the changes over time.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {}, "tags": [] }, "outputs": [], "source": [ "# 1931-1960\n", "data_period1 = data.iloc[0:30]\n", "# 1961-1990\n", "data_period2 = data.iloc[30:60]\n", "# 1990-2020\n", "data_period3 = data.iloc[60:90]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {}, "executionInfo": { "elapsed": 798, "status": "ok", "timestamp": 1680721212023, "user": { "displayName": "Yosmely Tamira Bermudez Gutierrez", "userId": "07776907551108334395" }, "user_tz": 180 }, "tags": [] }, "outputs": [], "source": [ "# plot the histograms for each climate normal identified above\n", "fig, ax = plt.subplots()\n", "sns.histplot(\n", " data_period1.ssh,\n", " bins=np.arange(-200, 650, 50),\n", " color=\"C0\",\n", " element=\"step\",\n", " alpha=0.5,\n", " kde=True,\n", " label=\"1931-1960\",\n", " ax=ax,\n", ")\n", "sns.histplot(\n", " data_period2.ssh,\n", " bins=np.arange(-200, 650, 50),\n", " color=\"C1\",\n", " element=\"step\",\n", " alpha=0.5,\n", " kde=True,\n", " label=\"1961-1990\",\n", " ax=ax,\n", ")\n", "sns.histplot(\n", " data_period3.ssh,\n", " bins=np.arange(-200, 650, 50),\n", " color=\"C2\",\n", " element=\"step\",\n", " alpha=0.5,\n", " kde=True,\n", " label=\"1991-2020\",\n", " ax=ax,\n", ")\n", "ax.legend()\n", "ax.set_xlabel(\"Annual Maximum Sea Surface Height (mm)\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Let's also calculate the moments each climate normal period:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {}, "executionInfo": { "elapsed": 561, "status": "ok", "timestamp": 1680721216790, "user": { "displayName": "Yosmely Tamira Bermudez Gutierrez", "userId": "07776907551108334395" }, "user_tz": 180 }, "tags": [] }, "outputs": [], "source": [ "# setup pandas dataframe\n", "periods_stats = pd.DataFrame(index=[\"Mean\", \"Standard Deviation\", \"Skew\"])\n", "\n", "# add info for each climate normal period\n", "periods_stats[\"1931-1960\"] = [\n", " data_period1.ssh.mean(),\n", " data_period1.ssh.std(),\n", " data_period1.ssh.skew(),\n", "]\n", "periods_stats[\"1961-1990\"] = [\n", " data_period2.ssh.mean(),\n", " data_period2.ssh.std(),\n", " data_period2.ssh.skew(),\n", "]\n", "periods_stats[\"1991-2020\"] = [\n", " data_period3.ssh.mean(),\n", " data_period3.ssh.std(),\n", " data_period3.ssh.skew(),\n", "]\n", "\n", "periods_stats = periods_stats.T\n", "periods_stats" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "The mean increases as well as the standard deviation. Conversely, the skewness remains relatively stable over time, just decreasing slightly. This observation indicates that the dataset is non-stationary. To visualize the overall shape of the distribution changes, we can fit a Generalized Extreme Value (GEV) distribution to the data for each time period and plot the corresponding probability density function (PDF)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {}, "tags": [] }, "outputs": [], "source": [ "# 1931-1960\n", "params_period1 = gev.fit(data_period1.ssh.values, 0)\n", "shape_period1, loc_period1, scale_period1 = params_period1\n", "\n", "# 1961-1990\n", "params_period2 = gev.fit(data_period2.ssh.values, 0)\n", "shape_period2, loc_period2, scale_period2 = params_period2\n", "\n", "# 1991-2020\n", "params_period3 = gev.fit(data_period3.ssh.values, 0)\n", "shape_period3, loc_period3, scale_period3 = params_period3" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {}, "tags": [] }, "outputs": [], "source": [ "# plot PDFs for each time period\n", "\n", "fig, ax = plt.subplots()\n", "x = np.linspace(-200, 600, 1000)\n", "ax.plot(\n", " x,\n", " gev.pdf(x, shape_period1, loc=loc_period1, scale=scale_period1),\n", " c=\"C0\",\n", " lw=3,\n", " label=\"1931-1960\",\n", ")\n", "ax.plot(\n", " x,\n", " gev.pdf(x, shape_period2, loc=loc_period2, scale=scale_period2),\n", " c=\"C1\",\n", " lw=3,\n", " label=\"1961-1990\",\n", ")\n", "ax.plot(\n", " x,\n", " gev.pdf(x, shape_period3, loc=loc_period3, scale=scale_period3),\n", " c=\"C2\",\n", " lw=3,\n", " label=\"1991-2020\",\n", ")\n", "ax.legend()\n", "ax.set_xlabel(\"Annual Maximum Sea Surface Height (mm)\")\n", "ax.set_ylabel(\"Density\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Now, let's examine the changes in the GEV parameters. This analysis will provide valuable insights into how we can incorporate non-stationarity into our model in one of the upcoming tutorials." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "## **Question 1**\n", "\n", "1. Look at the plot above. Just by visual inspection, describe how the distribution changes between time periods. Which parameters of the GEV dsitribution do you think are responsible? How and why? " ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "execution": {} }, "source": [ "[*Click for solution*](https://github.com/ClimateMatchAcademy/course-content/tree/main/tutorials/W2D4_ClimateResponse-Extremes&Variability/solutions/W2D4_Tutorial5_Solution_a4064872.py)\n", "\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "## **Coding Exercise 1** \n", "\n", "1. Compare the location, scale and shape parameters of the fitted distribution for the three time periods. How do they change? Compare with your answers to the question above." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "# setup dataframe with titles for each parameter\n", "parameters = pd.DataFrame(index=[\"Location\", \"Scale\", \"Shape\"])\n", "\n", "# add in 1931-1960 parameters\n", "parameters[\"1931-1960\"] = ...\n", "\n", "# add in 1961-1990 parameters\n", "parameters[\"1961-1990\"] = ...\n", "\n", "# add in 1991-202 parameters\n", "parameters[\"1991-2020\"] = ...\n", "\n", "# transpose the dataset so the time periods are rows\n", "parameters = ...\n", "\n", "# round the values for viewing\n", "_ = ..." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "execution": {}, "executionInfo": { "elapsed": 9, "status": "ok", "timestamp": 1680721280152, "user": { "displayName": "Yosmely Tamira Bermudez Gutierrez", "userId": "07776907551108334395" }, "user_tz": 180 }, "tags": [] }, "source": [ "[*Click for solution*](https://github.com/ClimateMatchAcademy/course-content/tree/main/tutorials/W2D4_ClimateResponse-Extremes&Variability/solutions/W2D4_Tutorial5_Solution_15b0fea6.py)\n", "\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "# **Summary**\n", "In this tutorial, you focused on the analysis of annual maximum sea surface heights in Washington DC, specifically considering the impact of **non-stationarity** due to climate change. You've learned how to analyze time series data across different climate normal periods, and how to evaluate changes in statistical moments and parameter values over time to identify this non-stationarity. By segmenting our data into 30-year \"**climate normals**\" periods, we were able to compare changes in sea level trends over time and understand the increasing severity of extreme events. " ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "# **Resources**\n", "\n", "Original data from this tutorial can be found [here](https://climexp.knmi.nl/getsealev.cgi?id=someone@somewhere&WMO=360&STATION=WASHINGTON_DC&extraargs=). Note the data used in this tutorial were preprocessed to extract the annual maximum values." ] } ], "metadata": { "colab": { "collapsed_sections": [], "include_colab_link": true, "name": "W2D4_Tutorial5", "provenance": [ { "file_id": "1q9gfBO_7bG3xrQfyFisZAKq8lK_RN2BX", "timestamp": 1680619878447 } ], "toc_visible": true }, "kernel": { "display_name": "Python 3", "language": "python", "name": "python3" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 4 }