4.1. Template creation

Note: These tutorials are for the functional workflow - for details on creating Template and Tribe objects see the matched filter docs page.

4.1.1. Simple example

Within template_gen there are lots of methods for generating templates depending on the type of pick data and waveform data you are using. Have a look at those to check their simple examples.

Example of how to generate a useful template from data available via FDSN (see obspy.clients.fdsn for a list of possible clients):

>>> from obspy.clients.fdsn import Client
>>> from obspy.core.event import Catalog
>>> from eqcorrscan.core.template_gen import from_client
>>> client = Client('NCEDC')
>>> catalog = client.get_events(eventid='72572665', includearrivals=True)
>>> templates = from_client(catalog=catalog, client_id='NCEDC',
...                         lowcut=2.0, highcut=9.0, samp_rate=20.0,
...                         filt_order=4, length=3.0, prepick=0.15,
...                         swin='all', process_len=200)
Pre-processing data

This will download data for a single event (given by eventid) from the NCEDC database, then use that information to download relevant waveform data. These data will then be filtered and cut according the parameters set.

It is often wise to set these parameters once as variables at the start of your scripts to ensure that the same parameters are used for both template creation and matched-filtering.

The method from_client can cope with multiple events (hence the use of a Catalog object), so you can download a lot of events from your chosen client and generate templates for them all.

4.1.2. Useful considerations

With these data-mining techniques, memory consumption is often an issue, as well as speed. To reduce memory consumption and increase efficiency it is often worth using a subset of picks. The advanced example below shows one way to do this. In practice, five picks (and therefore traces in a template) is often sufficient for matched-filter detections. However, you should test this on your own data.

Some other things that you might want to consider when generating templates include:

  • Template-length, you probably only want to include the real earthquake signal in your template, so really long templates are probably not the best idea.
  • On the same note, don’t include much (if any) data before the P-phase, unless you have good reason to - assuming your noise is random, including noise will reduce the correlations.
  • Consider your frequency band - look for peak power in the chosen waveform relative to the noise.
  • Coda waves often describe scatterers - scattered waves are very interesting, but may reduce the generality of your templates. If this is what you want, include coda, if you want a more general template, I would suggest not including coda. For examples of this you could try generating a lot of templates from a sequence and computing the SVD of the templates to see where the most coherent energy is (in the first basis vector), or just computing the stack of the waveforms.

4.1.3. Storing templates

Templates are returned as obspy Stream objects. You will likely want to store these templates on disk. This is usually best done by saving them as miniseed files. Miniseed uses an efficient compression algorithm and allows multiplexing, which is useful for template storage. However we do not constrain you to this.

>>> templates[0].write('template.ms', format="MSEED")

4.1.4. Advanced example

In this example we will download some data and picks from the New Zealand GeoNet database and make use of the functions in EQcorrscan to quickly and simply generate templates for use in matched-filter detection. In the example we are looking at an earthquake sequence on the east coast of New Zealand’s North Island that occurred on the 4th of January 2016. We will take a set of four template events from the sequence that have been picked by GeoNet, of a range of magnitudes. You can decide if these were good templates or not. You could easily extend this by choosing more template events (the mainshock in the sequence is a M 5 and you can get the information by clicking here).

You do not need to use data from an online server, many pick formats can be parsed, either by obspy, or (for seisan pick files) through the Sfile_utils module in EQcorrscan.

This template script is written to be general, so you should be able to give command line arguments to the script to generate templates from other FDSN databases. Note that some data-centers do not support full FDSN quakeml files, and working out which do is quite painful.

Try this example for another, Northern California Data Center earthquake:

python template_creation.py NCEDC 72572665

This will (unfortunately for you) generate a warning about un-labelled picks, as many data-centers do not provide phase-hints with their picks. We care about which phase is which when we have to run our cross-correlation re-picker (yet to be completed), which means that we, by convention, assign P-picks to the vertical channel and S-picks to both horizontal channels.

This will also show template waveforms for both the automatic and the manual picks - you can change this by removing either automatic or manual picks y setting the evaluation_mode flag in eqcorrscan.utils.catalog_utils.filter_picks().

Note: To run this script and for all map plotting you will need to install matplotlib-toolkits basemap package. Install instructions and a link to the download are here. We recommend that you install the package using your system package manager if it is available.

Important considerations

In this tutorial we enforce downloading of day-long data for the template generation. This is to ensure that the data we make the template from, and the data we use for detection are processed in exactly the same way. If we were to only download a short segment of data around the event and process this we would find that the resampling process would result in minor differences between the templates and the continuous data. This has the effect that, for self-detections, the cross-correlation values are less than 1.

This is an important effect and something that you should consider when generating your own templates. You MUST process your templates in the exact same way (using the same routines, same filters, same resampling, and same data length) as your continuous data. It can have a very significant impact to your results.

The functions provided in eqcorrscan.core.template_gen are there to aid you, but if you look at the source code, all they are doing is:

  • Detrending;
  • Resampling;
  • Filtering;
  • and cutting.

If you want to do these things another way you are more then welcome to!

"""
Simple tutorial detailing how to generate a series of templates from catalog\
data available online.
"""

from obspy.clients.fdsn import Client
from obspy import read_events
from obspy.core.event import Catalog

from eqcorrscan.utils.catalog_utils import filter_picks
from eqcorrscan.core import template_gen


def mktemplates(network_code='GEONET',
                publicIDs=['2016p008122', '2016p008353', '2016p008155',
                           '2016p008194'], plot=True):
    """Functional wrapper to make templates"""
    # We want to download some QuakeML files from the New Zealand GeoNet
    # network, GeoNet currently doesn't support FDSN event queries, so we
    # have to work around to download quakeml from their quakeml.geonet site.

    client = Client(network_code)
    # We want to download a few events from an earthquake sequence, these are
    # identified by publiID numbers, given as arguments

    catalog = Catalog()
    for publicID in publicIDs:
        if network_code == 'GEONET':
            data_stream = client._download(
                'http://quakeml.geonet.org.nz/quakeml/1.2/' + publicID)
            data_stream.seek(0, 0)
            catalog += read_events(data_stream, format="quakeml")
            data_stream.close()
        else:
            catalog += client.get_events(
                eventid=publicID, includearrivals=True)

    # Lets plot the catalog to see what we have
    if plot:
        catalog.plot(projection='local', resolution='h')

    # We don't need all the picks, lets take the information from the
    # five most used stations - note that this is done to reduce computational
    # costs.
    catalog = filter_picks(catalog, top_n_picks=5)
    # We only want the P picks in this example, but you can use others or all
    #  picks if you want.
    for event in catalog:
        for pick in event.picks:
            if pick.phase_hint == 'S':
                event.picks.remove(pick)

    # Now we can generate the templates
    templates = template_gen.from_client(
        catalog=catalog, client_id=network_code, lowcut=2.0, highcut=9.0,
        samp_rate=20.0, filt_order=4, length=3.0, prepick=0.15, swin='all',
        process_len=3600, debug=0, plot=plot)

    # We now have a series of templates! Using Obspy's Stream.write() method we
    # can save these to disk for later use.  We will do that now for use in the
    # following tutorials.
    for i, template in enumerate(templates):
        template.write('tutorial_template_' + str(i) + '.ms', format='MSEED')
        # Note that this will warn you about data types.  As we don't care
        # at the moment, whatever obspy chooses is fine.
    return


if __name__ == '__main__':
    """Wrapper for template creation"""
    import sys
    import warnings
    if not len(sys.argv) > 1:
        warnings.warn('Needs a network ID followed by a list of event IDs, ' +
                      'will run the test case instead')
        mktemplates()
    else:
        net_code = sys.argv[1]
        idlist = list(sys.argv)[2:]
        print(idlist)
        mktemplates(net_code, idlist)