Catalogue processing
For a more general introduction to the QuakeT tool please refer to the online documentation available at https://gemsciencetools.github.io/quakeT/
This notebook demonstrates how to convert a Stochastic Event Set (SES) into a catalogue format compatible with the Hazard Modeller’s Toolkit (HMTK). An SES is a set of ruptures describing the seismicity that could occur within a given period of time according to a PSHA input model; it is generated by the event-based PSHA calculation workflow of the OpenQuake (OQ) Engine. For a general description of the HMTK, please refer to the QuakeT documentation.
The main steps of the workflow described in this notebook are summarized below:
- Load the raw SES event and SES rupture files,
- Merge event and rupture data using the `build_hmtk_ses_catalogue` utility,
- Assign random datetimes to events (Months, Days, Hours, Minutes, Seconds),
- Transform the current catalogue format into HMTK format.
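The steps above can be illustrated with a minimal pandas sketch. This is not the implementation of `build_hmtk_ses_catalogue`; it only shows the general idea on toy in-memory tables. The input column names (`event_id`, `rup_id`, `year`, `mag`, `centroid_lon`, `centroid_lat`, `centroid_depth`) and the fixed random seed are assumptions made for illustration, and the files exported by your OQ Engine version may use different names.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the exported event and rupture files.
# All column names here are assumptions, not a guaranteed OQ Engine schema.
events = pd.DataFrame({
    "event_id": [0, 1, 2, 3],
    "rup_id": [10, 10, 11, 12],
    "year": [1, 1, 7, 42],
})
ruptures = pd.DataFrame({
    "rup_id": [10, 11, 12],
    "mag": [4.5, 5.1, 6.3],
    "centroid_lon": [10.0, 10.5, 11.0],
    "centroid_lat": [45.0, 45.2, 45.4],
    "centroid_depth": [5.0, 15.0, 27.5],
})

# Merge: attach the rupture attributes to every event through the shared
# rupture identifier (several events may reuse the same rupture).
catalogue = events.merge(ruptures, on="rup_id", how="left", validate="many_to_one")

# Assign a random time of year to each event.
rng = np.random.default_rng(seed=42)
n = len(catalogue)
catalogue["month"] = rng.integers(1, 13, size=n)   # 1..12
catalogue["day"] = rng.integers(1, 29, size=n)     # 1..28, valid in every month
catalogue["hour"] = rng.integers(0, 24, size=n)
catalogue["minute"] = rng.integers(0, 60, size=n)
catalogue["second"] = rng.integers(0, 60, size=n)

# Rename to HMTK-style column names.
catalogue = catalogue.rename(columns={
    "event_id": "eventID",
    "mag": "magnitude",
    "centroid_lon": "longitude",
    "centroid_lat": "latitude",
    "centroid_depth": "depth",
})
print(catalogue.drop(columns="rup_id"))
```

Limiting the day to 1..28 is a simplification that avoids invalid dates such as 30 February; the actual utility may handle dates differently.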
[1]:
import os
import pandas as pd
from openquake.man.ses_cat import build_hmtk_ses_catalogue
Merging rupture and event sets
[2]:
# Paths to the files produced by the OQ Engine, usually exported with the following command:
# `oq engine --eos <calculation ID>`
events = '../data/aux/ses/output-241-events_62.csv'
ruptures = '../data/aux/ses/output-244-ruptures_62.csv'
output_folder = os.path.join("..", "output")
output = os.path.join(output_folder, "hmtk_sample_catalogue.csv")
# Run build_hmtk_ses_catalogue function
result = build_hmtk_ses_catalogue(events, ruptures, output)
print(f"Done!\n\nOutput file: {result}")
Done!
Output file: ../output/hmtk_sample_catalogue.csv
Statistical summary
This block loads the catalogue and generates a descriptive statistics table to provide an overview of the catalogue’s range and distribution.
[4]:
# Load the generated HMTK catalogue into a `pandas.DataFrame` instance
df = pd.read_csv(result)
# Columns for the summary
summary_cols = ["magnitude", "depth", "year"]
# Generate descriptive statistics
stats_summary = df[summary_cols].describe().T
stats_summary['range'] = stats_summary['max'] - stats_summary['min']
stats_summary.columns = [
'Count', 'Mean', 'Std Dev', 'Min', '25%', '50% (Median)', '75%', 'Max', 'Range'
]
print("Statistical Summary of the catalogue: ")
display(stats_summary.style.format("{:.2f}"))
print(f"\nTotal Number of Events: {len(df)}")
print(f"Time Span: {df['year'].min():.0f} to {df['year'].max():.0f} ({df['year'].max() - df['year'].min():.0f} years)")
Statistical Summary of the catalogue:
| | Count | Mean | Std Dev | Min | 25% | 50% (Median) | 75% | Max | Range |
|---|---|---|---|---|---|---|---|---|---|
| magnitude | 59501.00 | 3.99 | 0.48 | 3.55 | 3.65 | 3.85 | 4.15 | 7.45 | 3.90 |
| depth | 59501.00 | 9.56 | 7.06 | 5.00 | 5.00 | 5.00 | 15.00 | 27.50 | 22.50 |
| year | 59501.00 | 5009.78 | 2891.54 | 1.00 | 2518.00 | 5003.00 | 7529.00 | 10000.00 | 9999.00 |
Total Number of Events: 59501
Time Span: 1 to 10000 (9999 years)
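One point worth keeping in mind when reading this summary: the printed time span is the difference between the last and first year (9999), whereas the number of calendar years covered, counting both ends, is 10000. The short sketch below works through that arithmetic with the values printed above and derives a mean annual event count. It assumes that the event set spans every year from 1 to 10000 inclusive, which the output suggests but does not confirm.

```python
# Values copied from the summary printed above
n_events = 59501
first_year, last_year = 1, 10000

# Difference between the last and first year, as printed by the notebook
year_range = last_year - first_year

# Number of calendar years covered, counting both the first and the last year
# (assumption: the event set covers the whole interval)
n_years = last_year - first_year + 1

# Mean number of events per year over the covered duration
mean_annual_rate = n_events / n_years

print(f"Range: {year_range} years, duration: {n_years} years")
print(f"Mean annual number of events: {mean_annual_rate:.2f}")
# prints:
# Range: 9999 years, duration: 10000 years
# Mean annual number of events: 5.95
```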