{ "cells": [ { "cell_type": "markdown", "source": [ "# Matched-filters\n", "\n", "This notebook looks at using EQcorrscan's Tribe objects for matched-filter detection of earthquakes.\n", "\n", "This notebook builds on the ideas covered in the [Quick Start](quick_start.ipynb) notebook. In particular, this\n", "notebook also covers:\n", "1. Concurrent execution of detection workflows for more efficient compute utilisation with large datasets;\n", "2. Use of local waveform databases using [obsplus](https://github.com/niosh-mining/obsplus);\n", "3. Cross-correlation pick-correction using the `lag_calc` method." ], "metadata": { "collapsed": false }, "id": "96a39ad3defaeedc" }, { "cell_type": "markdown", "source": [ "## Set up\n", "\n", "In this notebook we are going to focus on using local data. For examples of how to directly use data from online providers,\n", "see the [Quick Start](quick_start.ipynb) notebook.\n", "\n", "To start off we will download some data - in your case this is likely data that you have either downloaded from one or more\n", "online providers, or data that you have collected yourself. At the moment we don't care how those data are organised, as long\n", "as you have continuous data on disk somewhere. We will use [obsplus](https://github.com/niosh-mining/obsplus) to work out\n", "what data are where and to provide us with the data that we need when we need it.\n", "\n", "Obsplus is great and has more functionality than we expose here - if you make use of obsplus, please cite the\n", "paper by [Chambers et al. (2021)](https://joss.theoj.org/papers/10.21105/joss.02696).\n", "\n", "As in the [Quick Start](quick_start.ipynb) example, we will control the output level from EQcorrscan using logging." 
], "metadata": { "collapsed": false }, "id": "7c244daa3adde5ea" }, { "cell_type": "code", "execution_count": 1, "outputs": [], "source": [ "import logging\n", "\n", "logging.basicConfig(\n", "    level=logging.WARNING,\n", "    format=\"%(asctime)s\\t%(name)s\\t%(levelname)s\\t%(message)s\")\n", "\n", "Logger = logging.getLogger(\"TutorialLogger\")" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:18:30.192025542Z", "start_time": "2023-12-01T00:18:30.187031966Z" } }, "id": "afb90fba397b3674" }, { "cell_type": "markdown", "source": [ "We will use the March 2023 Kawarau swarm as our case study. This was an energetic swarm that\n", "was reported by New Zealand's GeoNet monitoring agency and discussed in a news article [here](https://www.geonet.org.nz/response/VJW80CGEPtq0JPCBHlNaR).\n", "\n", "We will use data from ten stations over a duration of two days. The swarm lasted longer than this, but\n", "we need to limit compute resources for this tutorial! Feel free to change the end date below to run\n", "for longer. To be kind to GeoNet and not repeatedly get data from their FDSN service, we are going to get data from the AWS open data bucket. 
If you don't already have boto3 installed, you will need to install it for this section (`conda install boto3` or `pip install boto3`).\n", "\n", "NB: If you want to access the GeoNet data bucket using Python, a drop-in replacement for FDSN clients exists [here](https://github.com/calum-chamberlain/cjc-utilities/blob/main/src/cjc_utilities/get_data/geonet_aws_client.py)." ], "metadata": { "collapsed": false }, "id": "b4baffba897550b8" },
{ "cell_type": "code", "execution_count": 2, "outputs": [], "source": [ "def get_geonet_data(starttime, endtime, stations, outdir):\n", "    \"\"\"Download day-long miniSEED files from the GeoNet open data bucket.\"\"\"\n", "    import os\n", "    import boto3\n", "    from botocore import UNSIGNED\n", "    from botocore.config import Config\n", "\n", "    GEONET_AWS = \"geonet-open-data\"\n", "\n", "    DAY_STRUCT = \"waveforms/miniseed/{date.year}/{date.year}.{date.julday:03d}\"\n", "    CHAN_STRUCT = (\"{station}.{network}/{date.year}.{date.julday:03d}.\"\n", "                   \"{station}.{location}-{channel}.{network}.D\")\n", "    if not os.path.isdir(outdir):\n", "        os.makedirs(outdir)\n", "\n", "    # Unsigned requests: the open data bucket is read without credentials\n", "    bob = boto3.resource('s3', config=Config(signature_version=UNSIGNED))\n", "    s3 = bob.Bucket(GEONET_AWS)\n", "\n", "    # Step through the requested period one day at a time\n", "    date = starttime\n", "    while date < endtime:\n", "        day_path = DAY_STRUCT.format(date=date)\n", "        for station in stations:\n", "            # Try both HH? and EH? channels for every component code\n", "            for instrument in \"HE\":\n", "                for component in \"ZNE12\":\n", "                    channel = f\"{instrument}H{component}\"\n", "                    chan_path = CHAN_STRUCT.format(\n", "                        station=station, network=\"NZ\",\n", "                        date=date, location=\"10\", channel=channel)\n", "                    local_path = os.path.join(outdir, chan_path)\n", "                    if os.path.isfile(local_path):\n", "                        Logger.info(f\"Skipping {local_path}: exists\")\n", "                        continue\n", "                    os.makedirs(os.path.dirname(local_path), exist_ok=True)\n", "                    remote = \"/\".join([day_path, chan_path])\n", "                    Logger.debug(f\"Downloading from {remote}\")\n", "                    try:\n", "                        s3.download_file(remote, local_path)\n", "                    except Exception as e:\n", "                        # Most channel codes will not exist for a given station\n", "                        Logger.debug(f\"Could not download {remote} due to {e}\")\n", "                        continue\n", "                    Logger.info(f\"Downloaded {remote}\")\n", "        date += 86400" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:18:33.554168095Z", "start_time": "2023-12-01T00:18:33.546942071Z" } }, "id": "a5e81f234705ab54" },
{ "cell_type": "code", "execution_count": 3, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "from obspy import UTCDateTime\n", "\n", "starttime, endtime = UTCDateTime(2023, 3, 17), UTCDateTime(2023, 3, 19)\n", "stations = ['EDRZ', 'LIRZ', 'MARZ', 'MKRZ', 'OMRZ', 'OPRZ', 'TARZ', 'WKHS', 'HNCZ', 'KARZ']\n", "\n", "outdir = \"tutorial_waveforms\"\n", "\n", "get_geonet_data(starttime=starttime, endtime=endtime, stations=stations, outdir=outdir)" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:19:54.148732360Z", "start_time": "2023-12-01T00:18:34.938266088Z" } }, "id": "a4182117cbf6692c" },
{ "cell_type": "markdown", "source": [ "Great, now we have some data. EQcorrscan is well set up to use clients for data access.\n", "Using clients allows EQcorrscan to request the data that it needs and to take care of\n", "overlapping chunks of data so that no data are missed. This matters because network-based\n", "matched-filters apply a delay-and-stack step to the correlations from individual\n", "channels. This increases the signal-to-noise ratio of the correlation sum. However,\n", "because of the delays, the stacks made at the start and end of each chunk of waveform\n", "data do not use the full network. To get around this *you should overlap your data*.\n", "\n", "If you use client-based access to data, EQcorrscan will take care of this for you.\n", "\n", "So how do you use clients for local data? Make a local database using obsplus.\n", "\n", "If you don't have obsplus installed, you should install it now (`conda install obsplus`\n", "or `pip install obsplus`)." 
], "metadata": { "collapsed": false }, "id": "864c0532837b9fc" }, { "cell_type": "code", "execution_count": 4, "outputs": [ { "data": { "text/plain": " network station location channel starttime \\\n0 NZ EDRZ 10 EHE 2023-03-17 00:00:03.528394 \n1 NZ EDRZ 10 EHN 2023-03-17 00:00:05.458394 \n2 NZ EDRZ 10 EHZ 2023-03-17 00:00:03.528394 \n3 NZ KARZ 10 EHE 2023-03-17 00:00:02.963130 \n4 NZ KARZ 10 EHN 2023-03-17 00:00:00.093130 \n5 NZ KARZ 10 EHZ 2023-03-17 00:00:05.823130 \n6 NZ LIRZ 10 EHE 2023-03-17 00:00:01.753132 \n7 NZ LIRZ 10 EHN 2023-03-17 00:00:02.913132 \n8 NZ LIRZ 10 EHZ 2023-03-17 00:00:01.463132 \n9 NZ MARZ 10 EHE 2023-03-17 00:00:01.553130 \n10 NZ MARZ 10 EHN 2023-03-17 00:00:01.683130 \n11 NZ MARZ 10 EHZ 2023-03-17 00:00:00.963130 \n12 NZ MKRZ 10 EHE 2023-03-17 00:00:01.673129 \n13 NZ MKRZ 10 EHN 2023-03-17 00:00:00.143129 \n14 NZ MKRZ 10 EHZ 2023-03-17 00:00:00.053129 \n15 NZ OMRZ 10 EHE 2023-03-17 00:00:02.740000 \n16 NZ OMRZ 10 EHN 2023-03-17 00:00:00.580000 \n17 NZ OMRZ 10 EHZ 2023-03-17 00:00:04.110000 \n18 NZ OPRZ 10 HHE 2023-03-17 00:00:02.993132 \n19 NZ OPRZ 10 HHN 2023-03-17 00:00:03.473132 \n20 NZ OPRZ 10 HHZ 2023-03-17 00:00:01.963132 \n21 NZ TARZ 10 EHE 2023-03-17 00:00:01.850000 \n22 NZ TARZ 10 EHN 2023-03-17 00:00:00.760000 \n23 NZ TARZ 10 EHZ 2023-03-17 00:00:00.630000 \n\n endtime \n0 2023-03-19 00:00:00.098393 \n1 2023-03-19 00:00:04.518393 \n2 2023-03-19 00:00:03.588393 \n3 2023-03-19 00:00:01.273126 \n4 2023-03-19 00:00:00.303126 \n5 2023-03-19 00:00:03.653126 \n6 2023-03-19 00:00:03.523130 \n7 2023-03-19 00:00:04.253130 \n8 2023-03-19 00:00:00.313130 \n9 2023-03-19 00:00:01.593131 \n10 2023-03-19 00:00:04.163131 \n11 2023-03-19 00:00:05.063131 \n12 2023-03-19 00:00:01.763133 \n13 2023-03-19 00:00:02.463133 \n14 2023-03-19 00:00:02.363133 \n15 2023-03-19 00:00:02.470000 \n16 2023-03-19 00:00:00.550000 \n17 2023-03-19 00:00:03.820000 \n18 2023-03-19 00:00:01.243131 \n19 2023-03-19 00:00:04.443131 \n20 2023-03-19 00:00:00.143131 \n21 
2023-03-19 00:00:01.580000 \n22 2023-03-19 00:00:00.820000 \n23 2023-03-19 00:00:03.830000 ", "text/html": "
|    | network | station | location | channel | starttime | endtime |
|---|---|---|---|---|---|---|
| 0 | NZ | EDRZ | 10 | EHE | 2023-03-17 00:00:03.528394 | 2023-03-19 00:00:00.098393 |
| 1 | NZ | EDRZ | 10 | EHN | 2023-03-17 00:00:05.458394 | 2023-03-19 00:00:04.518393 |
| 2 | NZ | EDRZ | 10 | EHZ | 2023-03-17 00:00:03.528394 | 2023-03-19 00:00:03.588393 |
| 3 | NZ | KARZ | 10 | EHE | 2023-03-17 00:00:02.963130 | 2023-03-19 00:00:01.273126 |
| 4 | NZ | KARZ | 10 | EHN | 2023-03-17 00:00:00.093130 | 2023-03-19 00:00:00.303126 |
| 5 | NZ | KARZ | 10 | EHZ | 2023-03-17 00:00:05.823130 | 2023-03-19 00:00:03.653126 |
| 6 | NZ | LIRZ | 10 | EHE | 2023-03-17 00:00:01.753132 | 2023-03-19 00:00:03.523130 |
| 7 | NZ | LIRZ | 10 | EHN | 2023-03-17 00:00:02.913132 | 2023-03-19 00:00:04.253130 |
| 8 | NZ | LIRZ | 10 | EHZ | 2023-03-17 00:00:01.463132 | 2023-03-19 00:00:00.313130 |
| 9 | NZ | MARZ | 10 | EHE | 2023-03-17 00:00:01.553130 | 2023-03-19 00:00:01.593131 |
| 10 | NZ | MARZ | 10 | EHN | 2023-03-17 00:00:01.683130 | 2023-03-19 00:00:04.163131 |
| 11 | NZ | MARZ | 10 | EHZ | 2023-03-17 00:00:00.963130 | 2023-03-19 00:00:05.063131 |
| 12 | NZ | MKRZ | 10 | EHE | 2023-03-17 00:00:01.673129 | 2023-03-19 00:00:01.763133 |
| 13 | NZ | MKRZ | 10 | EHN | 2023-03-17 00:00:00.143129 | 2023-03-19 00:00:02.463133 |
| 14 | NZ | MKRZ | 10 | EHZ | 2023-03-17 00:00:00.053129 | 2023-03-19 00:00:02.363133 |
| 15 | NZ | OMRZ | 10 | EHE | 2023-03-17 00:00:02.740000 | 2023-03-19 00:00:02.470000 |
| 16 | NZ | OMRZ | 10 | EHN | 2023-03-17 00:00:00.580000 | 2023-03-19 00:00:00.550000 |
| 17 | NZ | OMRZ | 10 | EHZ | 2023-03-17 00:00:04.110000 | 2023-03-19 00:00:03.820000 |
| 18 | NZ | OPRZ | 10 | HHE | 2023-03-17 00:00:02.993132 | 2023-03-19 00:00:01.243131 |
| 19 | NZ | OPRZ | 10 | HHN | 2023-03-17 00:00:03.473132 | 2023-03-19 00:00:04.443131 |
| 20 | NZ | OPRZ | 10 | HHZ | 2023-03-17 00:00:01.963132 | 2023-03-19 00:00:00.143131 |
| 21 | NZ | TARZ | 10 | EHE | 2023-03-17 00:00:01.850000 | 2023-03-19 00:00:01.580000 |
| 22 | NZ | TARZ | 10 | EHN | 2023-03-17 00:00:00.760000 | 2023-03-19 00:00:00.820000 |
| 23 | NZ | TARZ | 10 | EHZ | 2023-03-17 00:00:00.630000 | 2023-03-19 00:00:03.830000 |