{ "cells": [ { "cell_type": "markdown", "source": [ "# Matched-filters\n", "\n", "This notebook will provide a look at using EQcorrscan's Tribe objects for matched-filter detection of earthquakes.\n", "\n", "This notebook builds on the ideas covered in the [Quick Start](quick_start.ipynb) notebook. In particular this\n", "notebook also covers:\n", "1. Concurrent execution of detection workflows for more efficient compute utilisation with large datasets;\n", "2. Use of local waveform databases using [obsplus](https://github.com/niosh-mining/obsplus);\n", "3. Cross-correlation pick-correction using the `lag_calc` method." ], "metadata": { "collapsed": false }, "id": "96a39ad3defaeedc" }, { "cell_type": "markdown", "source": [ "## Set up\n", "\n", "We are going to focus in this notebook on using local data. For examples of how to directly use data from online providers\n", "see the [Quick Start](quick_start.ipynb) notebook. \n", "\n", "To start off we will download some data - in your case this is likely data that you have either downloaded from one or more\n", "online providers, or data that you have collected yourself. At the moment we don't care how those data are organised, as long\n", "as you have continuous data on disk somewhere. We will use [obsplus](https://github.com/niosh-mining/obsplus) to work out\n", "what data are where and provide us with the data that we need when we need it.\n", "\n", "Obsplus is great and has more functionality than we expose here - if you make use of obsplus, please cite the \n", "paper by [Chambers et al., (2021)](https://joss.theoj.org/papers/10.21105/joss.02696).\n", "\n", "As in the [Quick Start](quick_start.ipynb) example, we will control the output level from EQcorrscan using logging." 
], "metadata": { "collapsed": false }, "id": "7c244daa3adde5ea" }, { "cell_type": "code", "execution_count": 1, "outputs": [], "source": [ "import logging\n", "\n", "logging.basicConfig(\n", " level=logging.WARNING,\n", " format=\"%(asctime)s\\t%(name)s\\t%(levelname)s\\t%(message)s\")\n", "\n", "Logger = logging.getLogger(\"TutorialLogger\")" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:18:30.192025542Z", "start_time": "2023-12-01T00:18:30.187031966Z" } }, "id": "afb90fba397b3674" }, { "cell_type": "markdown", "source": [ "We will use the March 2023 Kawarau swarm as our case-study for this. This was an energetic swarm that\n", "was reported by New Zealand's GeoNet monitoring agency and discussed in a news article [here](https://www.geonet.org.nz/response/VJW80CGEPtq0JPCBHlNaR).\n", "\n", "We will use data from ten stations over a duration of two days. The swarm lasted longer than this, but\n", "we need to limit compute resources for this tutorial! Feel free to change the end-date below to run\n", "for longer. To be kind to GeoNet and not repeatedly get data from their FDSN service we are going to get data from the AWS open data bucket. 
If you don't already have boto3 installed you will need to install that for this section (`conda install boto3` or `pip install boto3`).\n", "\n", "NB: If you actually want to access the GeoNet data bucket using Python, a drop-in replacement for FDSN clients exists [here](https://github.com/calum-chamberlain/cjc-utilities/blob/main/src/cjc_utilities/get_data/geonet_aws_client.py)" ], "metadata": { "collapsed": false }, "id": "b4baffba897550b8" }, { "cell_type": "code", "execution_count": 2, "outputs": [], "source": [ "def get_geonet_data(starttime, endtime, stations, outdir):\n", "    import os\n", "    import boto3\n", "    from botocore import UNSIGNED\n", "    from botocore.config import Config\n", "\n", "    GEONET_AWS = \"geonet-open-data\"\n", "\n", "    DAY_STRUCT = \"waveforms/miniseed/{date.year}/{date.year}.{date.julday:03d}\"\n", "    CHAN_STRUCT = (\"{station}.{network}/{date.year}.{date.julday:03d}.\"\n", "                   \"{station}.{location}-{channel}.{network}.D\")\n", "    if not os.path.isdir(outdir):\n", "        os.makedirs(outdir)\n", "\n", "    bob = boto3.resource('s3', config=Config(signature_version=UNSIGNED))\n", "    s3 = bob.Bucket(GEONET_AWS)\n", "\n", "    date = starttime\n", "    while date < endtime:\n", "        day_path = DAY_STRUCT.format(date=date)\n", "        for station in stations:\n", "            for instrument in \"HE\":\n", "                for component in \"ZNE12\":\n", "                    channel = f\"{instrument}H{component}\"\n", "                    chan_path = CHAN_STRUCT.format(\n", "                        station=station, network=\"NZ\",\n", "                        date=date, location=\"10\", channel=channel)\n", "                    local_path = os.path.join(outdir, chan_path)\n", "                    if os.path.isfile(local_path):\n", "                        Logger.info(f\"Skipping {local_path}: exists\")\n", "                        continue\n", "                    os.makedirs(os.path.dirname(local_path), exist_ok=True)\n", "                    remote = \"/\".join([day_path, chan_path])\n", "                    Logger.debug(f\"Downloading from {remote}\")\n", "                    try:\n", "                        s3.download_file(remote, local_path)\n", "                    except Exception as e:\n", "                        Logger.debug(f\"Could not download {remote} due to {e}\")\n", "                        
continue\n", "                    Logger.info(f\"Downloaded {remote}\")\n", "        date += 86400" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:18:33.554168095Z", "start_time": "2023-12-01T00:18:33.546942071Z" } }, "id": "a5e81f234705ab54" }, { "cell_type": "code", "execution_count": 3, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "from obspy import UTCDateTime\n", "\n", "starttime, endtime = UTCDateTime(2023, 3, 17), UTCDateTime(2023, 3, 19)\n", "stations = ['EDRZ', 'LIRZ', 'MARZ', 'MKRZ', 'OMRZ', 'OPRZ', 'TARZ', 'WKHS', 'HNCZ', 'KARZ']\n", "\n", "outdir = \"tutorial_waveforms\"\n", "\n", "get_geonet_data(starttime=starttime, endtime=endtime, stations=stations, outdir=outdir)" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:19:54.148732360Z", "start_time": "2023-12-01T00:18:34.938266088Z" } }, "id": "a4182117cbf6692c" }, { "cell_type": "markdown", "source": [ "Great, now we have some data. EQcorrscan is well set up to use clients for data access.\n", "Using clients allows EQcorrscan to request the data that it needs and take care of\n", "overlapping chunks of data to ensure that no data are missed. Network-based\n", "matched-filters apply a delay-and-stack step to the correlations from individual\n", "channels. This increases the signal-to-noise ratio of the correlation sum. However,\n", "because of the delay part, the stacks made at the start and end of chunks of waveform\n", "data do not use the full network. To get around this *you should overlap your data*.\n", "\n", "If you use client-based access to data, EQcorrscan will take care of this for you.\n", "\n", "So how do you use clients for local data? Make a local database using obsplus.\n", "\n", "If you don't have obsplus installed you should install it now (`conda install obsplus`\n", "or `pip install obsplus`)." 
], "metadata": { "collapsed": false }, "id": "864c0532837b9fc" }, { "cell_type": "code", "execution_count": 4, "outputs": [ { "data": { "text/plain": " network station location channel starttime \\\n0 NZ EDRZ 10 EHE 2023-03-17 00:00:03.528394 \n1 NZ EDRZ 10 EHN 2023-03-17 00:00:05.458394 \n2 NZ EDRZ 10 EHZ 2023-03-17 00:00:03.528394 \n3 NZ KARZ 10 EHE 2023-03-17 00:00:02.963130 \n4 NZ KARZ 10 EHN 2023-03-17 00:00:00.093130 \n5 NZ KARZ 10 EHZ 2023-03-17 00:00:05.823130 \n6 NZ LIRZ 10 EHE 2023-03-17 00:00:01.753132 \n7 NZ LIRZ 10 EHN 2023-03-17 00:00:02.913132 \n8 NZ LIRZ 10 EHZ 2023-03-17 00:00:01.463132 \n9 NZ MARZ 10 EHE 2023-03-17 00:00:01.553130 \n10 NZ MARZ 10 EHN 2023-03-17 00:00:01.683130 \n11 NZ MARZ 10 EHZ 2023-03-17 00:00:00.963130 \n12 NZ MKRZ 10 EHE 2023-03-17 00:00:01.673129 \n13 NZ MKRZ 10 EHN 2023-03-17 00:00:00.143129 \n14 NZ MKRZ 10 EHZ 2023-03-17 00:00:00.053129 \n15 NZ OMRZ 10 EHE 2023-03-17 00:00:02.740000 \n16 NZ OMRZ 10 EHN 2023-03-17 00:00:00.580000 \n17 NZ OMRZ 10 EHZ 2023-03-17 00:00:04.110000 \n18 NZ OPRZ 10 HHE 2023-03-17 00:00:02.993132 \n19 NZ OPRZ 10 HHN 2023-03-17 00:00:03.473132 \n20 NZ OPRZ 10 HHZ 2023-03-17 00:00:01.963132 \n21 NZ TARZ 10 EHE 2023-03-17 00:00:01.850000 \n22 NZ TARZ 10 EHN 2023-03-17 00:00:00.760000 \n23 NZ TARZ 10 EHZ 2023-03-17 00:00:00.630000 \n\n endtime \n0 2023-03-19 00:00:00.098393 \n1 2023-03-19 00:00:04.518393 \n2 2023-03-19 00:00:03.588393 \n3 2023-03-19 00:00:01.273126 \n4 2023-03-19 00:00:00.303126 \n5 2023-03-19 00:00:03.653126 \n6 2023-03-19 00:00:03.523130 \n7 2023-03-19 00:00:04.253130 \n8 2023-03-19 00:00:00.313130 \n9 2023-03-19 00:00:01.593131 \n10 2023-03-19 00:00:04.163131 \n11 2023-03-19 00:00:05.063131 \n12 2023-03-19 00:00:01.763133 \n13 2023-03-19 00:00:02.463133 \n14 2023-03-19 00:00:02.363133 \n15 2023-03-19 00:00:02.470000 \n16 2023-03-19 00:00:00.550000 \n17 2023-03-19 00:00:03.820000 \n18 2023-03-19 00:00:01.243131 \n19 2023-03-19 00:00:04.443131 \n20 2023-03-19 00:00:00.143131 \n21 
2023-03-19 00:00:01.580000 \n22 2023-03-19 00:00:00.820000 \n23 2023-03-19 00:00:03.830000 " }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from obsplus import WaveBank\n", "\n", "bank = WaveBank(outdir)\n", "\n", "bank.get_availability_df()" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:19:56.010304625Z", "start_time": "2023-12-01T00:19:54.151398148Z" } }, "id": "422d39dd855950a6" }, { "cell_type": "markdown", "source": [ "Obsplus has now scanned the waveforms that we just downloaded and made a table\n", "of what is there. Great. These `WaveBank` objects have a similar API to obspy\n", "`Client` objects, so we can use them as a drop-in replacement.\n", "\n", "Now we are nearly ready to make some templates.\n", "\n", "## Template creation\n", "\n", "To make templates you need two things:\n", "1. Continuous waveform data;\n", "2. A catalogue of events with picks.\n", "\n", "We already have (1). For (2), the catalogue of events, we could use GeoNet picked\n", "events. However, if you have events with picks locally and want to use those\n", "events as templates you should save those events in a format readable by obspy.\n", "You can then skip ahead to read those picks back in.\n", "\n", "In the worst case scenario, where you have times that you know that you want your\n", "template to start at, but they are not in any standard format readable by obspy,\n", "you can construct events from scratch as below. Note in this example I am just\n", "populating the picks as this is all we need. You do need to be careful about\n", "the `waveform_id`: this should match the seed id of the continuous data\n", "exactly, otherwise the picks will not be used." 
], "metadata": { "collapsed": false }, "id": "9f4dd2504fb18202" }, { "cell_type": "code", "execution_count": 5, "outputs": [], "source": [ "from obspy.core.event import (\n", "    Catalog, Event, Pick, WaveformStreamID)\n", "from obspy import UTCDateTime\n", "\n", "# Make the picks for the event:\n", "picks = [\n", "    Pick(\n", "        time=UTCDateTime(2023, 3, 18, 7, 46, 15, 593125),\n", "        waveform_id=WaveformStreamID(\n", "            network_code='NZ', station_code='MARZ',\n", "            channel_code='EHZ', location_code='10'),\n", "        phase_hint='P'),\n", "    Pick(\n", "        time=UTCDateTime(2023, 3, 18, 7, 46, 17, 633115),\n", "        waveform_id=WaveformStreamID(\n", "            network_code='NZ', station_code='MKRZ',\n", "            channel_code='EHZ', location_code='10'),\n", "        phase_hint='P'),\n", "    Pick(\n", "        time=UTCDateTime(2023, 3, 18, 7, 46, 18, 110000),\n", "        waveform_id=WaveformStreamID(\n", "            network_code='NZ', station_code='OMRZ',\n", "            channel_code='EHZ', location_code='10'),\n", "        phase_hint='P'),\n", "]\n", "# Add as many picks as you have - you might want to loop\n", "# and/or make a function to parse your picks to obspy Picks.\n", "\n", "# Make the event\n", "event = Event(picks=picks)\n", "# Make the catalog\n", "catalog = Catalog([event])" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:19:56.010572545Z", "start_time": "2023-12-01T00:19:56.004559606Z" } }, "id": "cf480f12b889f227" }, { "cell_type": "markdown", "source": [ "For this example we are going to use a catalogue of events picked by GeoNet - we will download those data and write them to disk to mimic you using local files:" ], "metadata": { "collapsed": false }, "id": "d482d389c5260ca6" }, { "cell_type": "code", "execution_count": 6, "outputs": [], "source": [ "from obspy.clients.fdsn import Client\n", "\n", "client = Client(\"GEONET\")\n", "\n", "cat = client.get_events(\n", "    starttime=UTCDateTime(2023, 3, 17),\n", "    endtime=UTCDateTime(2023, 3, 19),\n", "    latitude=-38.05, longitude=176.73,\n", "    
maxradius=0.5, minmagnitude=3.0)  # Limited set of relevant events\n", "\n", "cat.write(\"tutorial_catalog.xml\", format=\"QUAKEML\")" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:20:11.091149707Z", "start_time": "2023-12-01T00:19:56.007304525Z" } }, "id": "70ec5a6c46c54884" }, { "cell_type": "markdown", "source": [ "## Template creation with local files\n", "\n", "Now that we have the events and waveforms we need, we can make our Tribe of templates.\n", "\n", "First we have to read in the events that we want to use as templates:" ], "metadata": { "collapsed": false }, "id": "2d8590e8fd52ed0c" }, { "cell_type": "code", "execution_count": 7, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "49 Event(s) in Catalog:\n", "2023-03-17T14:29:34.921582Z | -38.067, +176.689 | 3.40 MLv | manual\n", "2023-03-17T14:56:17.215087Z | -38.061, +176.679 | 3.07 MLv | manual\n", "...\n", "2023-03-18T20:20:52.842474Z | -38.045, +176.734 | 3.17 MLv | manual\n", "2023-03-18T21:42:39.943071Z | -38.051, +176.735 | 4.25 MLv | manual\n", "To see all events call 'print(CatalogObject.__str__(print_all=True))'\n" ] } ], "source": [ "from obspy import read_events\n", "\n", "cat = read_events(\"tutorial_catalog.xml\")\n", "print(cat)" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:20:40.878853704Z", "start_time": "2023-12-01T00:20:28.064446183Z" } }, "id": "bfce77794a5d15f8" }, { "cell_type": "markdown", "source": [ "### Pick curation\n", "\n", "You may want to limit what picks you actually use for your templates. Any picks that you provide will\n", "be used for cutting waveforms - this may include amplitude picks! You should not need to restrict\n", "what stations you have picks for, but it doesn't do any harm to.\n", "\n", "Below we select picks from the stations that we set earlier, and only P and S picks. 
We also limit\n", "to only one P and one S pick per station - you may not want to do that, but it can get messy if you\n", "have multiple picks of the same phase." ], "metadata": { "collapsed": false }, "id": "d119b8c9c37581f5" }, { "cell_type": "code", "execution_count": 8, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Matched-filter GPU is not compiled! Should be here: /home/chambeca/my_programs/Building/fast_matched_filter/fast_matched_filter/lib/matched_filter_GPU.so\n" ] } ], "source": [ "from eqcorrscan.utils.catalog_utils import filter_picks\n", "\n", "cat = filter_picks(\n", "    cat,\n", "    stations=stations,\n", "    phase_hints=[\"P\", \"S\"],\n", "    enforce_single_pick=\"earliest\")" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:20:45.250392572Z", "start_time": "2023-12-01T00:20:42.068040579Z" } }, "id": "7d9ac720ccf4aa2f" }, { "cell_type": "markdown", "source": [ "### Template creation decisions\n", "\n", "We now have everything needed to create a tribe of templates. At this point you\n", "have to make some decisions about parameters:\n", "1. What filters to use;\n", "2. What sampling-rate to use;\n", "3. How long your template waveforms should be;\n", "4. When to start your template waveforms relative to your pick times;\n", "5. Whether to use separate P and S windows or not.\n", "\n", "Your choices for 2 and 3 should be based somewhat on your choice of what filters \n", "to use (1). There is little point in using a sampling-rate significantly above\n", "2.5x your high-cut frequency (2.5x because of the roll-offs used in the\n", "Butterworth filters used by EQcorrscan). 
Lower sampling-rates will result in\n", "fewer correlations and hence faster compute time, but most of the time in EQcorrscan's\n", "matched-filter runs is spent in the pre-processing of the data rather than the\n", "actual correlation computation.\n", "\n", "When deciding on filtering parameters you may find it helpful to look at what\n", "frequencies have the best signal-to-noise ratio. There are functions in the\n", "eqcorrscan.utils.plotting module to help with this.\n", "\n", "We will use some relatively standard, but un-tuned parameters for this example.\n", "You should spend some time deciding: these decisions strongly affect the quality\n", "of your results. You can also set the minimum signal-to-noise ratio for traces\n", "to be included in your templates. Again, this should be tested.\n", "\n", "The `process_len` parameter controls how much data will be processed at once.\n", "Because EQcorrscan computes resampling in the frequency domain, and can compute\n", "correlations in the frequency domain, changing this length between construction \n", "and detection affects the accuracy of the Fourier transforms, which affects the\n", "final correlations. For this reason the `process_len` is maintained throughout\n", "the workflow by EQcorrscan. Here we use one hour (3600 seconds), but it is common\n", "to use one day (86400 seconds).\n", "\n", "You will note that we use the `from_client` method for construction: this is\n", "because we have a `WaveBank` that emulates a client, making this really simple." 
], "metadata": { "collapsed": false }, "id": "df11e06b7dce0cce" }, { "cell_type": "code", "execution_count": 9, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2023-12-01 13:20:51,205\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 2.876741406237065 below threshold for KARZ.EHZ, not using\n", "2023-12-01 13:20:51,206\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.KARZ.10.EHZ\n", "2023-12-01 13:20:51,208\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 1.9540223699044974 below threshold for OMRZ.EHZ, not using\n", "2023-12-01 13:20:51,209\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.OMRZ.10.EHZ\n", "2023-12-01 13:20:51,274\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 1.5737322242199878 below threshold for LIRZ.EHZ, not using\n", "2023-12-01 13:20:51,275\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.LIRZ.10.EHZ\n", "2023-12-01 13:20:51,277\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 0.8545790599562736 below threshold for MKRZ.EHN, not using\n", "2023-12-01 13:20:51,278\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.MKRZ.10.EHN\n", "2023-12-01 13:20:51,280\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 1.5224417059935123 below threshold for OMRZ.EHZ, not using\n", "2023-12-01 13:20:51,280\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.OMRZ.10.EHZ\n", "2023-12-01 13:20:51,283\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 0.9704676413162124 below threshold for TARZ.EHZ, not using\n", "2023-12-01 13:20:51,283\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.TARZ.10.EHZ\n", "2023-12-01 13:20:51,315\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 3.6693293207326403 below threshold for LIRZ.EHZ, not using\n", "2023-12-01 13:20:51,316\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.LIRZ.10.EHZ\n", "2023-12-01 13:20:51,318\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 
3.241518566731996 below threshold for OMRZ.EHZ, not using\n", "2023-12-01 13:20:51,319\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.OMRZ.10.EHZ\n", "2023-12-01 13:20:51,322\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 2.657470431167782 below threshold for TARZ.EHZ, not using\n", "2023-12-01 13:20:51,322\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.TARZ.10.EHZ\n", "2023-12-01 13:20:52,446\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 0.5290305215816998 below threshold for LIRZ.EHZ, not using\n", "2023-12-01 13:20:52,446\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.LIRZ.10.EHZ\n", "2023-12-01 13:20:52,448\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 1.0144665058782114 below threshold for TARZ.EHZ, not using\n", "2023-12-01 13:20:52,448\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.TARZ.10.EHZ\n", "2023-12-01 13:20:53,107\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 2.400855581315436 below threshold for LIRZ.EHZ, not using\n", "2023-12-01 13:20:53,108\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.LIRZ.10.EHZ\n", "2023-12-01 13:20:53,109\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 2.3176851693587057 below threshold for MKRZ.EHZ, not using\n", "2023-12-01 13:20:53,110\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.MKRZ.10.EHZ\n", "2023-12-01 13:20:53,112\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 1.3894705992621912 below threshold for OMRZ.EHE, not using\n", "2023-12-01 13:20:53,112\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.OMRZ.10.EHE\n", "2023-12-01 13:20:53,114\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 3.884291218472121 below threshold for OPRZ.HHZ, not using\n", "2023-12-01 13:20:53,115\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.OPRZ.10.HHZ\n", "2023-12-01 13:20:53,173\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 3.1825458574844174 below threshold 
for MARZ.EHN, not using\n", "2023-12-01 13:20:53,174\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.MARZ.10.EHN\n", "2023-12-01 13:20:53,177\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 3.851222353660108 below threshold for OPRZ.HHN, not using\n", "2023-12-01 13:20:53,178\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.OPRZ.10.HHN\n", "2023-12-01 13:20:53,697\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 0.4135717198623312 below threshold for KARZ.EHZ, not using\n", "2023-12-01 13:20:53,697\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.KARZ.10.EHZ\n", "2023-12-01 13:20:53,699\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 2.3877145548728342 below threshold for LIRZ.EHE, not using\n", "2023-12-01 13:20:53,699\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.LIRZ.10.EHE\n", "2023-12-01 13:20:53,701\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 1.1370695659576338 below threshold for MARZ.EHZ, not using\n", "2023-12-01 13:20:53,701\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.MARZ.10.EHZ\n", "2023-12-01 13:20:53,703\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 0.22436151966111806 below threshold for OMRZ.EHZ, not using\n", "2023-12-01 13:20:53,704\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.OMRZ.10.EHZ\n", "2023-12-01 13:20:53,705\teqcorrscan.core.template_gen\tWARNING\tSignal-to-noise ratio 1.487347592301636 below threshold for OPRZ.HHZ, not using\n", "2023-12-01 13:20:53,706\teqcorrscan.core.template_gen\tWARNING\tNo pick for NZ.OPRZ.10.HHZ\n", "2023-12-01 13:20:57,350\teqcorrscan.core.match_filter.tribe\tERROR\tEmpty Template\n", "2023-12-01 13:20:57,352\teqcorrscan.core.match_filter.tribe\tERROR\tEmpty Template\n" ] } ], "source": [ "from eqcorrscan import Tribe\n", "\n", "tribe = Tribe().construct(\n", " method=\"from_client\",\n", " client_id=bank,\n", " catalog=cat,\n", " lowcut=2.0,\n", " highcut=15.0,\n", " samp_rate=50.0,\n", " 
filt_order=4,\n", "    length=3.0,\n", "    prepick=0.5,\n", "    swin=\"all\",\n", "    process_len=3600,\n", "    all_horiz=True,\n", "    min_snr=4.0,\n", "    parallel=True\n", ")" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:20:57.362410907Z", "start_time": "2023-12-01T00:20:49.826135441Z" } }, "id": "fb993ce01cf4377a" }, { "cell_type": "markdown", "source": [ "You should see an ERROR message about empty templates: some of the events in our catalog\n", "do not have useful data in our wavebank. We might want to set a minimum number of stations\n", "used for our templates to ensure that our templates are all of reasonable quality.\n", "In this case we will only retain templates with at least five stations:" ], "metadata": { "collapsed": false }, "id": "474f94eb5efa59d5" }, { "cell_type": "code", "execution_count": 10, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Tribe of 33 templates\n" ] } ], "source": [ "tribe.templates = [t for t in tribe if len({tr.stats.station for tr in t.st}) >= 5]\n", "print(tribe)" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:20:59.634894692Z", "start_time": "2023-12-01T00:20:59.629687355Z" } }, "id": "aa7ce144ced8823a" }, { "cell_type": "markdown", "source": [ "### Matched-filter detection\n", "\n", "Now that we have our tribe we can use it to detect new earthquakes. Again we\n", "will make use of our local `WaveBank`. This is preferred to feeding one stream\n", "of data to the code at a time for two reasons:\n", "1. EQcorrscan will overlap your chunks of data (in this case every hour of data)\n", "   to ensure that all of the data have correlations from all channels after the\n", "   delay-and-stack correlation sums.\n", "2. EQcorrscan can pre-emptively process the next chunk of data in parallel while\n", "   computing detections in the current chunk. 
This can significantly speed up\n", " processing, and makes better use of compute resources.\n", "\n", "\n", "The main decisions that you have to make at this stage are around thresholds.\n", "Generally it is better to start with a relatively low threshold: you can increase\n", "the threshold later using the `Party.rethreshold` method, but you can't lower\n", "it without re-running the whole detection workflow.\n", "\n", "It is common to use `MAD` thresholding, but you should experiment with your\n", "dataset to see what works best." ], "metadata": { "collapsed": false }, "id": "231442eea4647717" }, { "cell_type": "code", "execution_count": 11, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2023-12-01 13:21:02,031\teqcorrscan.core.match_filter.tribe\tWARNING\tUsing concurrent_processing=True can be faster ifdownloading your data takes a long time. See https://github.com/eqcorrscan/EQcorrscan/pull/544for benchmarks.\n", "2023-12-01 13:22:27,562\teqcorrscan.core.match_filter.helpers.tribe\tWARNING\tRemoved data for NZ.EDRZ.10.EHE NZ.EDRZ.10.EHN NZ.EDRZ.10.EHZ NZ.KARZ.10.EHE NZ.KARZ.10.EHN NZ.KARZ.10.EHZ NZ.LIRZ.10.EHE NZ.LIRZ.10.EHN NZ.LIRZ.10.EHZ NZ.MARZ.10.EHE NZ.MARZ.10.EHN NZ.MARZ.10.EHZ NZ.MKRZ.10.EHE NZ.MKRZ.10.EHN NZ.MKRZ.10.EHZ NZ.OMRZ.10.EHE NZ.OMRZ.10.EHN NZ.OMRZ.10.EHZ NZ.OPRZ.10.HHE NZ.OPRZ.10.HHN NZ.OPRZ.10.HHZ NZ.TARZ.10.EHN NZ.TARZ.10.EHZ due to less than 80% of the required length.\n", "2023-12-01 13:22:27,563\teqcorrscan.core.match_filter.tribe\tWARNING\tNo suitable data between 2023-03-18T23:50:43.199856Z and 2023-03-19T00:51:03.199856Z, skipping\n" ] } ], "source": [ "party = tribe.client_detect(\n", " client=bank,\n", " starttime=starttime,\n", " endtime=endtime,\n", " threshold=10.0,\n", " threshold_type=\"MAD\",\n", " trig_int=1.0,\n", ")" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:22:27.605964306Z", "start_time": "2023-12-01T00:21:02.016248197Z" } }, "id": "b7f7332ffa3ffde3" }, { 
"cell_type": "code", "execution_count": 12, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "125\n" ] } ], "source": [ "print(len(party))" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:42:38.678673425Z", "start_time": "2023-12-01T00:42:38.668416131Z" } }, "id": "45dd26bad62d8e17" }, { "cell_type": "code", "execution_count": 13, "outputs": [ { "data": { "text/plain": "
", "image/png": "\n" }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "party.plot(plot_grouped=True)" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2023-12-01T00:42:41.270249073Z", "start_time": "2023-12-01T00:42:41.038983113Z" } }, "id": "ca178d93cec54650" }, { "cell_type": "markdown", "source": [ "## Note on concurrent processing\n", "\n", "As of EQcorrscan versions > 0.4.4, detect methods support concurrent processing\n", "of intermediate steps in the matched-filter process when running multiple\n", "chunks of data (e.g. when `endtime - starttime > process_len`). By default this\n", "is disabled as it does increase memory use. However, for cases when downloading\n", "data from a client is a major bottleneck, or processing data is slow (e.g. when you\n", "have limited CPU threads available, but do have a GPU for the FMF correlation\n", "backend), and can cope with the extra memory requirements, this can be much faster.\n", "\n", "To see examples of the speed-ups and memory consumption, look at the benchmarks\n", "in the pull-request [here](https://github.com/eqcorrscan/EQcorrscan/pull/544).\n", "\n", "To enable concurrent processing, use the `concurrent_processing` argument\n", "for `.client_detect` or `.detect` methods on `Tribe` objects." ], "metadata": { "collapsed": false }, "id": "ff1e851b71fc452d" }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [], "metadata": { "collapsed": false }, "id": "69047ddc15f2f2c9" } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 5 }