Regional Precipitation Variability and Extreme Events#
Content creators: Laura Paccini, Raphael Rocha
Content reviewers: Marguerite Brown, Ohad Zivan, Jenna Pearson, Chi Zhang
Content editors: Zane Mitrevica, Natalie Steinemann, Douglas Rao, Jenna Pearson, Chi Zhang, Ohad Zivan
Production editors: Wesley Banfield, Jenna Pearson, Chi Zhang, Ohad Zivan
Our 2023 Sponsors: NASA TOPS and Google DeepMind
# @title Project Background

from ipywidgets import widgets
from IPython.display import YouTubeVideo
from IPython.display import IFrame
from IPython.display import display


class PlayVideo(IFrame):
    def __init__(self, id, source, page=1, width=400, height=300, **kwargs):
        self.id = id
        if source == "Bilibili":
            src = f"https://player.bilibili.com/player.html?bvid={id}&page={page}"
        elif source == "Osf":
            src = f"https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render"
        super(PlayVideo, self).__init__(src, width, height, **kwargs)


def display_videos(video_ids, W=400, H=300, fs=1):
    tab_contents = []
    for i, video_id in enumerate(video_ids):
        out = widgets.Output()
        with out:
            if video_ids[i][0] == "Youtube":
                video = YouTubeVideo(
                    id=video_ids[i][1], width=W, height=H, fs=fs, rel=0
                )
                print(f"Video available at https://youtube.com/watch?v={video.id}")
            else:
                video = PlayVideo(
                    id=video_ids[i][1],
                    source=video_ids[i][0],
                    width=W,
                    height=H,
                    fs=fs,
                    autoplay=False,
                )
                if video_ids[i][0] == "Bilibili":
                    print(
                        f"Video available at https://www.bilibili.com/video/{video.id}"
                    )
                elif video_ids[i][0] == "Osf":
                    print(f"Video available at https://osf.io/{video.id}")
            display(video)
        tab_contents.append(out)
    return tab_contents


video_ids = [("Youtube", "49XHRe61LI8"), ("Bilibili", "BV1Au411L7fo")]
tab_contents = display_videos(video_ids, W=730, H=410)
tabs = widgets.Tab()
tabs.children = tab_contents
for i in range(len(tab_contents)):
    tabs.set_title(i, video_ids[i][0])
display(tabs)
# @title Tutorial slides
# @markdown These are the slides for the videos in all tutorials today
from IPython.display import IFrame

link_id = "a53b4"
# render the slides from OSF (the render-URL pattern is assumed to match the Osf pattern used above)
IFrame(src=f"https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{link_id}/?direct%26mode=render", width=730, height=410)
In this project, you will explore rain gauge and satellite data from CHIRPS and NOAA Terrestrial Climate Data Records NDVI datasets to extract rain estimates and land surface reflectance, respectively. This data will enable identification of extreme events in your region of interest. Besides investigating the relationships between these variables, you are encouraged to study the impact of extreme events on changes in vegetation.
Project Template#
Note: The dashed boxes are socio-economic questions.
Data Exploration Notebook#
Project Setup#
# imports
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr
import pooch
import os
import tempfile
import pandas as pd
import s3fs
import boto3
import botocore
import datetime
# helper functions


def pooch_load(filelocation=None, filename=None, processor=None):
    shared_location = "/home/jovyan/shared/Data/Projects/Precipitation"  # this is different for each day
    user_temp_cache = tempfile.gettempdir()

    if os.path.exists(os.path.join(shared_location, filename)):
        file = os.path.join(shared_location, filename)
    else:
        file = pooch.retrieve(
            filelocation,
            known_hash=None,
            fname=os.path.join(user_temp_cache, filename),
            processor=processor,
        )

    return file
CHIRPS Version 2.0 Global Daily 0.25°#
The Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) is a high-resolution precipitation dataset developed by the Climate Hazards Group at the University of California, Santa Barbara. It combines satellite-derived precipitation estimates with ground-based station data to provide gridded precipitation data at a quasi-global scale between 50°S and 50°N.
Read more about CHIRPS here.
Indices for Extreme Events#
The Expert Team on Climate Change Detection and Indices (ETCCDI) has defined various indices that focus on different aspects such as duration or intensity of extreme events. The following functions provide examples of how to compute indices for each category. You can modify these functions to suit your specific needs or create your own custom functions. Here are some tips you can use:
Most of the indices require daily data, so in order to select a specific season you can just use xarray to subset the data. Example:
daily_precip_DJF = data_chirps.sel(time=data_chirps['time.season']=='DJF')
A common threshold for a wet event is precipitation greater than or equal to 1mm/day, while a dry (or non-precipitating) event is defined as precipitation less than 1mm/day.
Some of the indices are based on percentiles. You can define a base-period climatology to calculate percentile thresholds, such as the 5th, 10th, 90th, and 95th percentiles, to determine extreme events, as in the sketch below.
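For instance, here is a minimal sketch of the percentile approach (assuming a daily precipitation DataArray named daily_precip; the name and base period are illustrative, not part of the datasets below):
# hypothetical sketch: percentile thresholds from a base-period climatology
base_period = daily_precip.sel(time=slice("1981-01-01", "2010-12-31"))
wet_base = base_period.where(base_period >= 1)  # keep only wet days (>= 1 mm/day)
p95 = wet_base.quantile(0.95, dim="time")  # 95th-percentile threshold per grid cell
# note: for dask-backed data, .quantile may require loading the array first (e.g. .load())
very_wet_days = daily_precip.where(daily_precip > p95)  # days exceeding the threshold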
def calculate_sdii_index(data):
    """
    This function calculates the Simple Daily Intensity Index (SDII), which
    represents the average amount of precipitation on wet days (days with
    precipitation greater than or equal to 1 mm) for each year in the input
    data. The input data should be a Dataset with time coordinates, and the
    function returns a Dataset with the SDII index values for each year in
    the data.

    Parameters:
    ----------
    - data (xarray.Dataset): Input dataset containing daily precipitation data.

    Returns:
    -------
    - xarray.Dataset: Dataset containing the SDII index for each year.
    """
    # select the daily precipitation amounts on wet days (PR >= 1 mm)
    wet_days = data.where(data >= 1)

    # group by year and calculate the total precipitation on wet days
    sum_wet_days_grouped = wet_days.groupby("time.year").sum(dim="time")

    # count the number of wet days in each year
    w = wet_days.groupby("time.year").count(dim="time")

    # divide the total by the number of wet days to get the SDII index
    sdii = sum_wet_days_grouped / w

    return sdii
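A hedged usage sketch (the variable name precip and the coordinate names latitude/longitude are assumed from the CHIRPS files, and chirps_data is loaded further below; check your dataset before running):
# hypothetical usage: yearly SDII averaged over a small region
precip_region = chirps_data.precip.sel(latitude=slice(-10, 0), longitude=slice(-60, -50))
sdii_region = calculate_sdii_index(precip_region)
sdii_region.mean(dim=["latitude", "longitude"]).plot()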
def calculate_cdd_index(data):
    """
    This function takes a daily precipitation dataset as input and calculates
    the Consecutive Dry Days (CDD) index, which represents the longest sequence
    of consecutive days with precipitation less than 1 mm. The input data should
    be a DataArray with time coordinates, and the function returns a DataArray
    with the CDD values for each unique year in the input data.

    Parameters:
    ----------
    - data (xarray.DataArray): The input daily precipitation data. This should
      be a DataArray (e.g., for chirps_data the DataArray would be
      chirps_data.precip).

    Returns:
    -------
    - cdd (xarray.DataArray): The calculated CDD index.
    """
    # create a boolean array for dry days (PR < 1 mm)
    dry_days = data < 1

    # initialize the CDD array
    cdd = np.zeros(len(data.groupby("time.year")))

    # get the unique years as a list
    unique_years = list(data.groupby("time.year").groups.keys())

    # iterate over each year
    for i, year in enumerate(unique_years):
        consecutive_trues = []
        current_count = 0
        for day in dry_days.sel(time=data["time.year"] == year).values:
            if day:
                current_count += 1
            else:
                if current_count > 0:
                    consecutive_trues.append(current_count)
                current_count = 0
        if current_count > 0:
            consecutive_trues.append(current_count)

        # CDD is the longest run of consecutive dry days
        cdd[i] = np.max(consecutive_trues)

    # wrap the result in a DataArray indexed by year
    cdd_da = xr.DataArray(cdd, coords={"year": unique_years}, dims="year")

    return cdd_da
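Similarly, a sketch for the CDD index at a single grid point (again assuming chirps_data, loaded below, with the precip/latitude/longitude names; the coordinates chosen are illustrative):
# hypothetical usage: CDD time series at one grid point
precip_point = chirps_data.precip.sel(latitude=-5.125, longitude=-55.125, method="nearest")
cdd_point = calculate_cdd_index(precip_point)
cdd_point.plot(marker="o")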
# code to retrieve and load the data
years = range(1981, 2024)  # the years we want: 1981 through 2023
file_paths = [
    "https://data.chc.ucsb.edu/products/CHIRPS-2.0/global_daily/netcdf/p25/chirps-v2.0."
    + str(year)
    + ".days_p25.nc"
    for year in years
]  # the format of the file URLs
filenames = ["chirps-v2.0." + str(year) + ".days_p25.nc" for year in years]  # the format of the files
downloaded_files = [
    pooch_load(fpath, fname) for (fpath, fname) in zip(file_paths, filenames)
]  # download all of the files

# open data as xarray
chirps_data = xr.open_mfdataset(
    downloaded_files, combine="by_coords"
)  # open the files as one dataset
We can now visualize the content of the dataset.
# code to print the shape, array names, etc of the dataset
chirps_data
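As a quick sanity check, you could plot a map of mean daily precipitation for a single year (a sketch; the variable name precip is assumed from the CHIRPS files):
# hypothetical quick look: mean daily precipitation for 1981
chirps_data.precip.sel(time="1981").mean(dim="time").plot(cmap="Blues")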
NOAA Fundamental Climate Data Records (FCDR) AVHRR Land Bundle - Surface Reflectance and Normalized Difference Vegetation Index#
As we learned in the W1D3 tutorials, all the National Oceanic and Atmospheric Administration Climate Data Record (NOAA-CDR) datasets are available both at the NOAA National Centers for Environmental Information (NCEI) and on commercial cloud platforms. We are accessing the data directly via the Amazon Web Services (AWS) cloud storage.
For this project we recommend using the Normalized Difference Vegetation Index (NDVI). It is one of the most commonly used remotely sensed indices. It measures the “greeness” of vegetation, and is useful in understanding vegetation density and assessing changes in plant health. For example, NDVI can be used to study the impacts of droughts, heatwaves, and insect infestations on plants covering Earth’s surface. A good overview of this index from this particular sensor can be accessed here.
Recall from W1D3 Tutorial 3 that the data files on AWS are named systematically:
Sensor name: AVHRR
Product category: Land
Product version: v005
Product code: AVH13C1
Satellite platform: NOAA-07
Date of the data: 19810624
Processing time: c20170610041337 (this will change for each file based on when the file was processed)
File format: .nc (netCDF-4 format)
In other words, if we are looking for the data of a specific day, we can easily locate where the file might be.
For example, if we want to find the AVHRR data for 2002-03-12 (March 12, 2002), we can use:
s3://noaa-cdr-ndvi-pds/data/2002/AVHRR-Land_v005_AVH13C1_*_20020312_c*.nc
The reason we put * in the path above is that we are not sure which satellite platform the data came from or when it was processed. The * is called a wildcard, and is used because we want all the files that match our specific criteria without having to specify the pieces of the filename we are not yet sure about. It should return all the data satisfying the initial criteria, and you can refine further once you see what is available. Essentially, this first step helps to narrow down the data search, and you can then use the result to create datasets from the timeframe of your choosing.
# we can use the data from W1D3 Tutorial 3
# to access the NDVI data from the AWS S3 bucket, we first need to connect to it
fs = s3fs.S3FileSystem(anon=True)

# initialize the AWS S3 bucket client
client = boto3.client(
    "s3", config=botocore.client.Config(signature_version=botocore.UNSIGNED)
)

date_sel = datetime.datetime(2001, 1, 1, 0)

# the files we want for a whole year
file_location = fs.glob(
    "s3://noaa-cdr-ndvi-pds/data/"
    + date_sel.strftime("%Y")
    + "/AVHRR-Land_v005_AVH13C1_*_c*.nc"
)

files_for_pooch = [
    pooch_load("http://s3.amazonaws.com/" + file, file) for file in file_location
]

ds = xr.open_mfdataset(files_for_pooch, combine="by_coords")  # open the files as one dataset
ds
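From here, a minimal first-analysis sketch (the variable name NDVI and the latitude/longitude coordinate names are assumptions about the AVH13C1 files; inspect ds before relying on them):
# hypothetical sketch: area-mean NDVI time series for a region of interest
ndvi_region = ds.NDVI.sel(latitude=slice(0, -10), longitude=slice(-60, -50))  # latitude assumed descending
ndvi_region.mean(dim=["latitude", "longitude"]).plot()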
Alternative NDVI source: MODIS/Terra Vegetation Indices (MOD13C2) Version 6.1 L3 Global 0.05° CMG#
Global MODIS (Moderate Resolution Imaging Spectroradiometer) vegetation indices are designed to provide consistent spatial and temporal comparisons of vegetation conditions. Blue, red, and near-infrared reflectances, centered at 469-nanometers, 645-nanometers, and 858-nanometers, respectively, are used to determine the MODIS vegetation indices.
The MODIS Normalized Difference Vegetation Index (NDVI) complements NOAA’s Advanced Very High Resolution Radiometer (AVHRR) NDVI products providing continuity for time series applications over this rich historical archive.
Global MOD13C2 data are cloud-free spatial composites of the gridded 16-day 1-kilometer MOD13A2, and are provided as a level-3 product projected on a 0.05 degree (5600-meter) geographic Climate Modeling Grid (CMG). These files have also been pre-processed to match the grid of the precipitation data.
years = range(2000, 2024)  # the years of NDVI data we have available
# note: these are relative paths into the shared data directory (resolved by pooch_load), not URLs
file_paths = ["MODIS/NDVI_" + str(year) + ".nc" for year in years]  # the format of the files
filenames = ["MODIS/NDVI_" + str(year) + ".nc" for year in years]
downloaded_files = [
    pooch_load(fpath, fname) for (fpath, fname) in zip(file_paths, filenames)
]  # load all of the files

# load data
modis_data = xr.open_mfdataset(downloaded_files, combine="by_coords")
modis_data
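Since the MODIS files have been pre-processed to match the CHIRPS grid, a natural first comparison is between monthly precipitation and NDVI. A minimal sketch (the variable names precip and NDVI are assumptions; adjust to what the datasets actually contain):
# hypothetical sketch: grid-cell correlation between monthly precipitation and NDVI
precip_monthly = chirps_data.precip.resample(time="1MS").mean()  # monthly means
# note: this assumes the two datasets share identical monthly time stamps
precip_aligned, ndvi_aligned = xr.align(precip_monthly, modis_data.NDVI, join="inner")
correlation = xr.corr(precip_aligned, ndvi_aligned, dim="time")
correlation.plot(cmap="RdBu_r", vmin=-1, vmax=1)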
World Bank Data: Cereal Production#
Cereal production is a crucial component of global agriculture and food security. The World Bank collects and provides data on cereal production, which includes crops such as wheat, rice, maize, barley, oats, rye, sorghum, millet, and mixed grains. The data covers various indicators such as production quantity, area harvested, yield, and production value.
The World Bank also collects data on land under cereals production, which refers to the area of land that is being used to grow cereal crops. This information can be valuable for assessing the productivity and efficiency of cereal production systems in different regions, as well as identifying potential areas for improvement. Overall, the World Bank’s data on cereal production and land under cereals production is an important resource for policymakers, researchers, and other stakeholders who are interested in understanding global trends in agriculture and food security.
# code to retrieve and load the data
filename_cereal = "data_cereal_land.csv"
url_cereal = "https://raw.githubusercontent.com/Sshamekh/Heatwave/f85f43997e3d6ae61e5d729bf77cfcc188fbf2fd/data_cereal_land.csv"
ds_cereal_land = pd.read_csv(pooch_load(url_cereal, filename_cereal))
ds_cereal_land.head()
# example
ds_cereal_land[(ds_cereal_land["Country Name"] == "Brazil")].reset_index(
drop=True
).iloc[0].transpose()
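World Bank tables are typically wide, with one column per year. Here is a sketch for turning one country's record into a year-indexed series (the assumption that year columns are digit-named strings follows the standard World Bank CSV layout; verify with ds_cereal_land.columns):
# hypothetical sketch: reshape one country's wide-format record into a time series
brazil_row = ds_cereal_land[ds_cereal_land["Country Name"] == "Brazil"].iloc[0]
year_cols = [c for c in ds_cereal_land.columns if c.isdigit()]  # columns like '1961'
brazil_series = brazil_row[year_cols].astype(float)
brazil_series.index = brazil_series.index.astype(int)
brazil_series.plot()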
Now you are all set to address the questions you are interested in! Just be mindful of the specific coordinate names to avoid any issues.
You can use the provided functions as examples to compute various indices for extreme events based on duration or intensity. Don’t hesitate to modify them according to your specific needs or create your own custom functions.
Happy exploring and analyzing precipitation variability and extreme events in your project!
Further Reading#
Anyamba, A. and Tucker, C.J., 2012. Historical perspective of AVHRR NDVI and vegetation drought monitoring. Remote Sensing of Drought: Innovative Monitoring Approaches, 23, p. 20. https://digitalcommons.unl.edu/nasapub/217/
Zhang, X., Alexander, L., Hegerl, G.C., Jones, P., Tank, A.K., Peterson, T.C., Trewin, B. and Zwiers, F.W., 2011. Indices for monitoring changes in extremes based on daily temperature and precipitation data. Wiley Interdisciplinary Reviews: Climate Change, 2(6), pp.851-870.
Schultz, P. A., and M. S. Halpert. “Global correlation of temperature, NDVI and precipitation.” Advances in Space Research 13.5 (1993): 277-280.
Seneviratne, S.I. et al., 2021: Weather and Climate Extreme Events in a Changing Climate. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [Masson-Delmotte, V. et al. (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp. 1513–1766, https://www.ipcc.ch/report/ar6/wg1/chapter/chapter-11/
IPCC, 2021: Annex VI: Climatic Impact-driver and Extreme Indices [Gutiérrez J.M. et al.(eds.)]. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [Masson-Delmotte, V. et al. (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp. 2205–2214, https://www.ipcc.ch/report/ar6/wg1/downloads/report/IPCC_AR6_WGI_AnnexVI.pdf