Model Testing Frameworks

Hamlet provides four frameworks that can be used independently or combined in a single Hamlet run. Each framework is specified as a key under config.model_framework in the YAML configuration file.
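As a rough illustration of this layout, the sketch below shows how such a configuration might look once the YAML has been loaded into a Python dictionary. The framework keys ("gem", "relm", "sanity"), the test key N_test, and all parameter values are assumptions chosen for illustration. The helper enabled_tests is hypothetical and not part of Hamlet.

```python
# Hypothetical configuration as a Python dict, mirroring the YAML layout.
# Framework keys ("gem", "relm", "sanity") and all values are assumed.
cfg = {
    "config": {
        "model_framework": {
            "gem": {
                "N_test": {"conf_interval": 0.95, "prob_model": "poisson"},
                "M_test": {"critical_frac": 0.25, "n_iters": 1000},
                "model_mfd": {},  # empty mapping: use default parameters
            },
            "relm": {
                "N_test": {
                    "conf_interval": 0.95,
                    "prob_model": "poisson",
                    "investigation_time": 40.0,
                },
            },
            "sanity": {"max_check": {"warn": True}},
        }
    }
}


def enabled_tests(cfg):
    """Return (framework, test_name) pairs in configuration order."""
    frameworks = cfg["config"]["model_framework"]
    return [(fw, test) for fw, tests in frameworks.items() for test in tests]
```

Listing the enabled tests this way makes it easy to see which frameworks are combined in a single run.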

GEM Tests and Evaluations

These tests are developed by GEM, some based on the literature (e.g. Zechar et al. 2010), some based on GEM’s own ideas and implementations. See gem for the function documentation. Sanity checks (as detailed below) are also available from the GEM testing framework, for convenience during the workflow.

The GEM framework includes the N, M, S, and L consistency tests (similar to the RELM/CSEP versions but with configurable handling of unmodeled cells/bins), as well as several additional evaluations: MFD comparison, moment rate analysis, rupture matching, cumulative occurrence, and ground motion evaluation.

Note

The likelihood test is deprecated and should not be used. Use the M_test, S_test, and L_test instead, which provide more robust and well-characterized likelihood-based consistency tests.

Statistical Consistency Tests

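These tests evaluate whether the observed catalog is statistically consistent with the model. The M-, S-, and L-tests use Monte Carlo simulation: each generates many stochastic catalogs from the model and compares a test statistic of the observed catalog against the distribution of that statistic across the stochastic catalogs. The N-test instead checks the observed earthquake count against a Poisson confidence interval.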

N-Test

Compares the total number of observed earthquakes to the number predicted by the model. The observed count is checked against a confidence interval derived from the Poisson (or cumulative Poisson) distribution.

Parameters:

conf_interval

Confidence interval for the test (e.g. 0.95 means 95%).

prob_model

Probability model: "poisson" or "poisson_cum".

prospective

Optional. If True, use the prospective catalog instead of the retrospective catalog. Default: False.
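The idea behind the N-test can be sketched with a generic two-sided Poisson interval. This is a minimal illustration, not Hamlet's implementation: the function names are hypothetical, the expected count is assumed to be the model's annual rate multiplied by the catalog duration, and how the "poisson" and "poisson_cum" options differ internally is not covered here.

```python
import math


def poisson_pmf(k, rate):
    """P(N = k) for a Poisson distribution with mean `rate` (> 0)."""
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))


def poisson_interval(rate, conf_interval):
    """Lowest and highest counts inside the central confidence interval."""
    if not 0.0 < conf_interval < 1.0:
        raise ValueError("conf_interval must be between 0 and 1")
    tail = (1.0 - conf_interval) / 2.0
    cdf, k, lower = 0.0, 0, None
    while True:
        cdf += poisson_pmf(k, rate)
        if lower is None and cdf > tail:
            lower = k  # first count whose cumulative probability exceeds the lower tail
        if cdf >= 1.0 - tail:
            return lower, k
        k += 1


def n_test(n_obs, expected_count, conf_interval=0.95):
    """Pass if the observed count lies inside the Poisson interval."""
    lower, upper = poisson_interval(expected_count, conf_interval)
    return lower <= n_obs <= upper
```

For example, with an expected count of 10 and conf_interval of 0.95, observed counts well outside the central interval (such as 0 or 30) fail the check.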

M-Test

Evaluates the consistency of the magnitude-frequency distribution of the model vs. the observations. The log-likelihood of the observed earthquakes given the model forecast is compared with the log-likelihoods of stochastic catalogs generated from the same forecast. If the observed log-likelihood falls below the critical_frac threshold of the stochastic distribution, the test fails.

The likelihood of each magnitude bin is calculated using the Poisson distribution. The bin likelihoods are then aggregated as their geometric mean, which is equivalent to taking the arithmetic mean of the bin log-likelihoods.

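This test is based on Zechar et al. (2010), with two differences: (1) the total number of earthquakes in each stochastic simulation is not fixed, and (2) the bin likelihoods are combined as a geometric mean rather than a product. The second difference does not change how catalogs rank against the observation, because for a fixed number of bins the geometric mean is a monotonic function of the product. The first is not expected to affect pass/fail outcomes meaningfully.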

Parameters:

critical_frac

Critical fraction of the stochastic log-likelihood distribution. The test fails if the observed log-likelihood falls below this quantile (e.g. 0.025 for the 2.5th percentile).

n_iters

Number of Monte Carlo iterations.

prospective

Optional. Default: False.

normalize_n_eqs

Optional. Normalize the number of earthquakes in stochastic catalogs to match the observed count. Default: True.

not_modeled_likelihood

Optional. Likelihood assigned to magnitude bins with zero modeled rate but observed earthquakes. Default: 1e-5.
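The M-test logic described above can be sketched as follows. This is a simplified illustration under stated assumptions, not Hamlet's code: the function names are hypothetical, rates are expected counts per magnitude bin over the catalog duration, stochastic catalogs are drawn bin by bin rather than rupture by rupture, and normalize_n_eqs is not implemented.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate (Knuth's method; suitable for small rates)."""
    limit, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def bin_log_likelihood(n, rate, not_modeled_likelihood=1e-5):
    """Poisson log-likelihood of n events in a bin with the given mean rate."""
    if rate <= 0.0:
        # Bin is not modeled: certain if empty, otherwise use the fallback.
        return 0.0 if n == 0 else math.log(not_modeled_likelihood)
    return n * math.log(rate) - rate - math.lgamma(n + 1)


def catalog_log_likelihood(counts, rates, not_modeled_likelihood=1e-5):
    """Log of the geometric mean of the bin likelihoods."""
    lls = [
        bin_log_likelihood(n, r, not_modeled_likelihood)
        for n, r in zip(counts, rates)
    ]
    return sum(lls) / len(lls)


def m_test(obs_counts, rates, critical_frac, n_iters, seed=0):
    """Return (passed, fraction of stochastic catalogs at or below observed)."""
    rng = random.Random(seed)
    obs_ll = catalog_log_likelihood(obs_counts, rates)
    sim_lls = [
        catalog_log_likelihood([sample_poisson(r, rng) for r in rates], rates)
        for _ in range(n_iters)
    ]
    frac = sum(ll <= obs_ll for ll in sim_lls) / n_iters
    return frac >= critical_frac, frac
```

A catalog close to the model's expected counts sits high in the stochastic distribution and passes. A catalog with far too many events in some bins falls below the critical fraction and fails.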

S-Test

Evaluates the spatial consistency of the model by comparing per-cell likelihoods of the observed catalog against stochastic catalogs. This highlights spatial cells where the model over- or under-predicts seismicity.

Parameters:

critical_frac

Fraction threshold for test failure.

n_iters

Number of Monte Carlo iterations.

prospective

Optional. Default: False.

normalize_n_eqs

Optional. Normalize by number of earthquakes. Default: False.

not_modeled_likelihood

Optional. Default: 1e-5.

likelihood_function

Optional. The likelihood function to use for per-cell evaluation. Options: "mfd" (default) or "conf_interval_poisson".

L-Test

Joint likelihood test combining spatial and magnitude information. This is the most comprehensive consistency test, evaluating the overall likelihood of the observed catalog given the model.

Parameters:

critical_frac

Fraction threshold for test failure.

n_iters

Number of Monte Carlo iterations.

prospective

Optional. Default: False.

not_modeled_likelihood

Optional. Default: 1e-5.

Magnitude-Frequency Distribution Evaluations

Model MFD Evaluation (model_mfd)

Sums up the MFDs from all spatial cells to produce a total model MFD, which is compared to the observed MFD from the earthquake catalog. This produces a figure in the report showing both MFDs. Use {} for default parameters.

Parameters (all optional):

investigation_time

Duration in years. Defaults to the seismic catalog duration.

annualize

If True, annualize the rates. Default: True.

Maximum Magnitude Check (max_mag_check)

A sanity check that verifies the model can produce earthquakes as large as the largest observed earthquake in each spatial cell. Note that there can be issues with very large earthquakes (with ruptures larger than the cell size), as the hypocenter for an observed event may be in a different cell than the most compatible source.

Parameters:

append_check

Optional. Boolean. If True, append pass/fail results to the bin data.

warn

Optional. Boolean. If True, log warnings for each failing cell.

Other Evaluations

Moment Over/Under Evaluation (moment_over_under)

Generates many stochastic catalogs and compares the total seismic moment release in each spatial cell to the observed moment release. This helps highlight areas that are more or less seismically productive than the observations support.

Parameters:

investigation_time

Duration of the catalog in years.

n_iters

Number of stochastic event sets to generate.

min_mag

Optional. Minimum magnitude for moment calculation. Defaults to mfd_bin_min.

max_mag

Optional. Maximum magnitude for moment calculation. Defaults to mfd_bin_max.
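A minimal sketch of the comparison for a single cell follows. It assumes the standard Hanks and Kanamori relation between moment magnitude and scalar moment (in N·m); whether Hamlet uses exactly this constant is an assumption, and the function names are hypothetical.

```python
def mag_to_moment(mag):
    """Scalar seismic moment in N·m from moment magnitude (Hanks & Kanamori)."""
    return 10.0 ** (1.5 * mag + 9.05)


def total_moment(mags, min_mag, max_mag):
    """Summed moment of the earthquakes within the magnitude range."""
    return sum(mag_to_moment(m) for m in mags if min_mag <= m <= max_mag)


def moment_rank(obs_mags, stochastic_mag_sets, min_mag, max_mag):
    """Fraction of stochastic event sets releasing less moment than observed."""
    obs = total_moment(obs_mags, min_mag, max_mag)
    sims = [total_moment(s, min_mag, max_mag) for s in stochastic_mag_sets]
    return sum(s < obs for s in sims) / len(sims)
```

In this sketch, a fraction near 1 means the observed moment release exceeds nearly all stochastic event sets (the model may be under-productive in that cell), and a fraction near 0 means the opposite.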

Rupture Matching Evaluation (rupture_matching_eval)

Matches observed earthquakes to modeled ruptures based on proximity, magnitude similarity, and (optionally) geometric similarity (attitude and rake). This evaluation helps assess whether the model contains ruptures that are consistent with the observed earthquakes.

Parameters (all optional, with defaults):

use_occurrence_rate

Weight matches by rupture occurrence rate. Default: False.

distance_lambda

Distance decay parameter for the matching function. Default: 1.0.

mag_window

Magnitude window for considering candidate ruptures. Default: 1.0.

group_return_threshold

Threshold for group matching. Default: 0.9.

min_likelihood

Minimum match likelihood. Default: 0.1.

no_attitude_default_like

Default likelihood when a rupture has no attitude data. Default: 0.5.

no_rake_default_like

Default likelihood when a rupture has no rake data. Default: 0.5.

return_one

"best" to return only the best match, or "all" for all matches above the threshold. Default: "best".

parallel

Use parallel processing for matching. Default: False.

Cumulative Occurrence Evaluation (cumulative_occurrence_eval)

Evaluates the cumulative earthquake occurrence over time for each magnitude bin, comparing the observed temporal pattern to the model’s predicted rate. This is useful for identifying temporal clustering or quiescence relative to the model.

Takes no configuration parameters (use {}).
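For a single magnitude bin, the comparison amounts to tracking the observed cumulative count against the count expected from a constant rate. The sketch below illustrates this with a hypothetical function; event times are assumed to be decimal years and the model rate is assumed to be annual.

```python
def cumulative_comparison(event_times, annual_rate, start, end):
    """Return (time, observed_cumulative, expected_cumulative) at each event."""
    in_window = sorted(t for t in event_times if start <= t <= end)
    return [
        (t, i, annual_rate * (t - start))
        for i, t in enumerate(in_window, start=1)
    ]
```

Stretches where the observed count rises much faster than the expected count suggest clustering, and long flat stretches suggest quiescence.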

Catalog Ground Motion Evaluation (catalog_ground_motion_eval)

Compares observed ground motions from a flatfile with model predictions. This requires a flatfile to be specified under input.flatfile in the configuration.

Parameters:

match_rups

Optional. If True, match earthquakes to model ruptures before computing ground motion comparisons. If False, generate new ruptures from the earthquake information in the flatfile. Matching ruptures gives a better understanding of whether the model reproduces the ground motions from specific earthquakes. It also makes the assignment of an earthquake to a tectonic region type much more reliable, because the best-matching rupture already has one defined. Default: False.

gmf_method

Optional. Method for ground motion calculation. Default: "ground_motion_fields".

RELM/CSEP Tests

The RELM (Regional Earthquake Likelihood Models) / CSEP tests are implementations of the standard CSEP consistency tests. These are similar to the GEM tests but differ in some statistical assumptions – notably, not_modeled_likelihood is hardcoded to 0.0 in the RELM framework, meaning that any observations in unmodeled cells or magnitude bins will cause the test to fail.
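The effect of this setting can be illustrated with a small sketch (a hypothetical helper, not Hamlet's code). A bin with zero modeled rate that contains observed earthquakes contributes the logarithm of not_modeled_likelihood. With a value of 0.0 that contribution is negative infinity, which drags the total log-likelihood of the observed catalog to negative infinity.

```python
import math


def unmodeled_log_likelihood(n_obs, not_modeled_likelihood):
    """Log-likelihood contribution of a cell or bin with zero modeled rate."""
    if n_obs == 0:
        return 0.0  # nothing observed where nothing is modeled
    if not_modeled_likelihood == 0.0:
        return -math.inf  # RELM-style setting: an impossible observation
    return math.log(not_modeled_likelihood)


# Two modeled bins plus one unmodeled bin that holds one earthquake.
modeled = [-1.2, -0.8]
gem_total = sum(modeled) + unmodeled_log_likelihood(1, 1e-5)
relm_total = sum(modeled) + unmodeled_log_likelihood(1, 0.0)
```

Here gem_total stays finite, while relm_total is negative infinity and therefore falls below every stochastic catalog.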

Available tests:

N-Test

Compares the total number of observed earthquakes to the model prediction. See N-Test for details on the test logic.

Parameters:

conf_interval

Confidence interval for the test (e.g. 0.95).

prob_model

Probability model: "poisson" or "poisson_cum".

investigation_time

Duration in years.

prospective

Optional. Use prospective catalog. Default: False.

M-Test

Evaluates the consistency of the magnitude-frequency distribution. See M-Test for details on the test logic.

Parameters:

critical_frac

Fraction threshold for test failure (e.g. 0.25).

n_iters

Number of Monte Carlo iterations.

investigation_time

Duration in years.

prospective

Optional. Use prospective catalog. Default: False.

S-Test

Evaluates the spatial consistency of the model. See S-Test for details on the test logic.

Parameters:

critical_frac

Fraction threshold for test failure.

n_iters

Number of Monte Carlo iterations.

investigation_time

Duration in years.

prospective

Optional. Default: False.

likelihood_function

Optional. "mfd" or "conf_interval_poisson". Default: "mfd".

normalize_n_eqs

Optional. Normalize by number of earthquakes. Default: False.

L-Test

Joint likelihood test combining spatial and magnitude information.

Parameters:

critical_frac

Fraction threshold for test failure.

n_iters

Number of Monte Carlo iterations.

investigation_time

Duration in years.

prospective

Optional. Default: False.

Sanity Checks

Sanity checks are basic tests to verify that the model is internally consistent and matches the observations at a gross level.

Maximum Magnitude Check (max_check)

Evaluates the observed seismicity and MFD inside each spatial cell to check whether the maximum magnitude of the MFD is larger than the largest observed earthquake. Cells where the observed maximum magnitude exceeds the model maximum magnitude are flagged.

Parameters:

warn

Optional. Boolean. If True, log a warning for each cell that fails the check. Default: True.
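The check can be sketched as follows. The per-cell data structure and the function name are hypothetical and chosen for illustration only.

```python
import logging


def max_check_sketch(cells, warn=True):
    """Return IDs of cells whose largest observed magnitude exceeds the model's."""
    failing = []
    for cell_id, cell in cells.items():
        observed = cell["observed_mags"]
        if not observed:
            continue  # no observations, so nothing to compare
        if max(observed) > cell["model_max_mag"]:
            failing.append(cell_id)
            if warn:
                logging.warning(
                    "cell %s: observed Mmax %.2f exceeds model Mmax %.2f",
                    cell_id, max(observed), cell["model_max_mag"],
                )
    return failing
```

Cells without observed earthquakes are skipped in this sketch, so they never fail the check.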