The YAML configuration file¶
A Hamlet evaluation is configured using a YAML file, which describes the tests and evaluations to run (with any required parameters), the location of the input files, and the outputs to write. The file is parsed into a dictionary that guides the Hamlet run, so the order of items is not important.
A simple example of a YAML configuration file is given at the bottom of this
page, and a comprehensive reference configuration (with comments explaining
every option) is available in the source repository at
doc_src/source/example_configs/reference_config.yml.
Metadata (meta)¶
This optional section describes the model. The information is used when writing reports.
meta:
  model_name: Model Name
  description: "words describing the model"
Test and Evaluation Configuration (config)¶
This section specifies which frameworks, tests, and evaluations to run, along with their parameters.
model_framework¶
This is a dictionary of frameworks to run. Each framework key contains its own tests and evaluations as sub-keys. Multiple frameworks can be run in a single Hamlet run. Available frameworks are:
gem – GEM tests and evaluations (see GEM Tests and Evaluations). The most supported and/or configurable versions of all tests and evaluations are available here. It is recommended to use these.
relm – RELM/CSEP tests (see RELM/CSEP Tests). These ‘letter tests’ are provided with specific defaults set.
sanity – Basic sanity checks (see Sanity Checks).
Tests and evaluations are nested directly under their framework, with their
configuration parameters as sub-keys. Use an empty dictionary ({}) for
those that need no configuration.
config:
  model_framework:
    gem:
      model_mfd: {}
      max_mag_check:
        append_check: True
        warn: True
      N_test:
        conf_interval: 0.95
        prob_model: poisson
      M_test:
        critical_frac: 0.025
        n_iters: 1000
    relm:
      S_test:
        critical_frac: 0.25
        n_iters: 1000
        investigation_time: 40.
parallel
    A Boolean (True or False) flag that determines whether parallel algorithms are used for loading the seismic source model and performing the more time-intensive tests and evaluations (such as Monte Carlo based consistency tests and moment analysis).
    This flag should be True for medium to large models, unless RAM is a major limitation. For small models, it may be faster to run on a single core, because the overhead in instantiating multiple processes can be substantial.
rand_seed
    An optional integer random seed for reproducible random sampling (which is fundamental in Hamlet, as stochastic catalogs are generated for most tests and evaluations).
log_file
    An optional path to a log file. If given, log output is written to this file in addition to the console.
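As a minimal sketch, the three global options can be set together as shown here. Their placement under config, alongside model_framework, follows the reference configuration; the seed value is an arbitrary placeholder chosen for illustration.

```yaml
config:
  model_framework:
    gem:
      model_mfd: {}
  # Global options sit next to model_framework, not inside a framework.
  parallel: True            # use multiprocessing for loading and heavy tests
  rand_seed: 42             # arbitrary integer, for illustration only
  log_file: hamlet_run.log  # log output is still written to the console too
```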
Inputs (input)¶
This section describes the inputs into a Hamlet run: the seismic source model, the observed earthquake catalog, and the configuration of the spatial bins.
Seismic Source Model (ssm)¶
The SSM must be in the modern OpenQuake format. It can be specified either with
a logic tree XML file or an OpenQuake job.ini file. All SSM parameters default
to null when not specified (see cfg_defaults in
openquake.hme.core.core).
Sub-parameters:
ssm_dir
    Filepath (absolute or relative) to the directory containing the seismic source model logic tree file.
ssm_lt_file
    The name of the logic tree XML file for the SSM, e.g. ssmLT.xml for most GEM Hazard Mosaic model repositories.
job_ini_file
    Alternative to ssm_dir + ssm_lt_file: path to an OpenQuake job.ini file that specifies the source model. This is useful when the source model configuration is complex.
branch
    Specifies which logic tree branch to evaluate. Because different branches are often mutually exclusive, alternative descriptions of earthquake occurrence, evaluating multiple branches at once may greatly increase the forecasted occurrence rates and result in inaccurate evaluation.
    Set to "iterate" to evaluate each branch independently in a single Hamlet run. This loads the earthquake catalog once and then evaluates each branch separately, producing combined results.
tectonic_region_types
    Optional filter specifying which Tectonic Region Type(s) should be evaluated. The types must correspond to those in the SSM. Pass as a YAML list or omit/set to null to include all types.
    tectonic_region_types:
      - Active Shallow Crust
      - Stable Continental
source_types
    Optional filter specifying which source types to evaluate. Options include simple_fault, complex_fault, area, point, multipoint, and MultiFaultSource. Pass as a YAML list, or null for all types.
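The fragment below sketches the job.ini alternative together with branch iteration and a source-type filter. The file path is a placeholder, and combining these particular options is an illustrative assumption rather than a recommended setup.

```yaml
input:
  ssm:
    job_ini_file: path/to/job.ini  # placeholder; used instead of ssm_dir + ssm_lt_file
    branch: "iterate"              # evaluate each logic tree branch independently
    tectonic_region_types: null    # null (or omitted) includes all types
    source_types:                  # only these source types are evaluated
      - simple_fault
      - area
```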
Depth Filtering (Optional)¶
min_depth
    Optional. Minimum source depth in km. Sources shallower than this are excluded from the evaluation. Default: 0.0. Specified at the input level (not inside ssm).
max_depth
    Optional. Maximum source depth in km. Sources deeper than this are excluded from the evaluation. Default: no limit. Specified at the input level (not inside ssm).
input:
  min_depth: 0
  max_depth: 40
Observed Earthquake Catalog (seis_catalog)¶
This set of parameters determines how the seismic catalog will be found and parsed so that it can be compared to the source model.
seis_catalog_file
    Relative or absolute filepath to the CSV earthquake catalog file.
Temporal Parameters¶
Provide any two of start_date, stop_date, and duration; the third
will be calculated automatically. Alternatively, use a completeness_table.
start_date
    Start of the catalog time window. Can be an integer (interpreted as January 1 of that year) or a date string (e.g. "1976-01-01").
stop_date
    End of the catalog time window. Same format as start_date.
duration
    Duration of the catalog in years (float).
completeness_table
    A list of [year, magnitude] pairs defining the completeness threshold over time. Each pair means “the catalog is complete above this magnitude from this year onward.” When used, the effective duration varies by magnitude bin, which enables more accurate evaluation of models across the magnitude range.
    completeness_table:
      - [1960, 5.0]
      - [1900, 7.2]
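As an illustration of the "any two of three" rule, the sketch below gives start_date and stop_date and leaves duration to be derived. The dates are placeholders; with these values the derived duration would be about 40 years.

```yaml
input:
  seis_catalog:
    seis_catalog_file: path/to/catalog.csv
    start_date: 1978         # integer year, read as January 1, 1978
    stop_date: "2018-01-01"  # date string
    # duration is omitted here and calculated from the two dates above
```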
Column Mappings (columns)¶
Maps expected fields to actual column names in the CSV file. Only specify columns whose names differ from the defaults.
x_col – Defaults to longitude.
y_col – Defaults to latitude.
depth – Defaults to depth.
magnitude – Defaults to magnitude.
source – No default. The institutional source of the earthquake (e.g. Agency).
event_id – No default. The earthquake ID column (e.g. eventID).
time – Either a single column name containing a parseable timestamp, or an ordered list of columns to construct the time:
time:
  - year
  - month
  - day
  - hour
  - minute
  - second
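For a catalog whose column names differ from the defaults and whose origin time is stored in a single column, the mapping might look like the sketch below. The CSV column names lon, lat, mw, and datetime are hypothetical; Agency and eventID reuse the example names given above.

```yaml
input:
  seis_catalog:
    seis_catalog_file: path/to/catalog.csv
    columns:
      x_col: lon         # hypothetical CSV column holding longitude
      y_col: lat         # hypothetical CSV column holding latitude
      magnitude: mw      # hypothetical CSV column holding magnitude
      source: Agency
      event_id: eventID
      time: datetime     # hypothetical single column with a parseable timestamp
      # depth is omitted because the CSV column is assumed to be named "depth"
```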
Bins (bins)¶
This section configures the spatial and magnitude binning used throughout Hamlet.
Default values for many parameters (including h3_res, simple_ruptures,
subset, and rupture_file options) are defined in
openquake.hme.core.core.cfg_defaults and are applied automatically when
not specified in the YAML file.
h3_res
    The resolution of the H3 hexagonal grid used for spatial binning. Values range from 0 (coarsest, ~1,280 km average edge length) to 15 (finest). Default: 3 (~69 km average edge length). H3 hexagons are generated automatically based on the spatial extent of the source model.
mfd_bin_min
    Minimum magnitude for the MFD. Required.
mfd_bin_max
    Maximum magnitude for the MFD. Required.
mfd_bin_width
    Width of the magnitude bins. Required.
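A bins block that sets the three required magnitude parameters and overrides the default grid resolution could look like the following. The magnitude values mirror the examples elsewhere on this page, and the h3_res value is an arbitrary choice for illustration.

```yaml
input:
  bins:
    mfd_bin_min: 6.0    # required
    mfd_bin_max: 9.0    # required
    mfd_bin_width: 0.2  # required
    h3_res: 4           # optional; one level finer than the default of 3
```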
Rupture File Caching (rupture_file)¶
Loading ruptures from a source model can be slow for large models. These optional parameters let you save processed ruptures to disk and reload them in subsequent runs, if you have not changed the model and are just changing the test configuration (branches, depth ranges, catalog completeness, etc.).
rupture_file:
  read_rupture_file: false
  save_rupture_file: false
  rupture_file_path: ./ruptures.hdf5
read_rupture_file
    If true, read ruptures from the file at rupture_file_path instead of processing the SSM. Default: false.
save_rupture_file
    If true, save processed ruptures to rupture_file_path. Default: false.
rupture_file_path
    Path to the rupture file. Supported formats: .hdf5, .feather, .csv.
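One way to use these options, sketched here on the assumption that rupture_file sits at the input level as in the reference configuration, is to save the ruptures on the first run and read them back on later runs of the same model.

```yaml
input:
  rupture_file:
    rupture_file_path: ./ruptures.hdf5
    # First run: build ruptures from the SSM and save them to disk.
    read_rupture_file: false
    save_rupture_file: true
    # Later runs with an unchanged model: swap the two flags above, i.e.
    # read_rupture_file: true
    # save_rupture_file: false
```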
Spatial Subsetting (subset)¶
Optional parameters to restrict the evaluation to a geographic subset of the model domain.
subset:
  file: path/to/subset.geojson
  buffer: 0.0
file
    Path to a GIS file containing the subset geometry. Default: null.
buffer
    Buffer distance around the subset geometry (in the units of the GIS file’s coordinate reference system). Default: 0.0.
Flatfile (flatfile)¶
Path to a ground motion flatfile (CSV). Required for the
catalog_ground_motion_eval evaluation. Specified at the input level.
input:
  flatfile: path/to/flatfile.csv
Note that this requires a flatfile that is constructed with a format that is used internally at GEM (and currently undocumented, pending publication). Until the format stabilizes, there will almost certainly be problems.
Other Input Options¶
simple_ruptures
    Boolean. Use simplified rupture representations (hypocenter point sources) for faster processing. Default: True.
Reporting (report)¶
This optional section configures report generation. Currently there is one
option, the basic HTML report, which aggregates all test and evaluation
results into maps, plots, and tables.
report:
  basic:
    outfile: outputs/report.html
JSON Output (json)¶
Optional. Write test results as a JSON file.
json:
  outfile: outputs/results.json
Minimal Example¶
###############################################################################
# Hamlet Minimal Configuration Example
#
# This is the simplest useful configuration: it runs the model MFD comparison,
# the N-test, and the maximum magnitude sanity check for a single logic tree
# branch.
###############################################################################
meta:
  model_name: My Model
  description: "Minimal example evaluation"
config:
  model_framework:
    gem:
      model_mfd: {}
      max_mag_check:
        append_check: True
        warn: True
      N_test:
        conf_interval: 0.95
        prob_model: poisson
  parallel: False
input:
  bins:
    mfd_bin_min: 6.0
    mfd_bin_max: 9.0
    mfd_bin_width: 0.2
  ssm:
    ssm_dir: path/to/ssm/
    ssm_lt_file: ssmLT.xml
    branch: b1
    tectonic_region_types:
      - Active Shallow Crust
  seis_catalog:
    seis_catalog_file: path/to/catalog.csv
    stop_date: 2018-01-01
    duration: 40.
    columns:
      event_id: eventID
      time:
        - year
        - month
        - day
report:
  basic:
    outfile: outputs/report.html
Comprehensive Reference¶
The complete reference configuration with all options documented is available
at doc_src/source/example_configs/reference_config.yml in the repository.
###############################################################################
# Hamlet Reference Configuration
#
# This file documents every available configuration option. It is not meant
# to be used as-is; copy it and remove or modify the sections you need.
# Options marked [optional] can be omitted; defaults are noted where they exist.
# Many defaults are defined in openquake.hme.core.core.cfg_defaults and are
# applied automatically when not specified.
###############################################################################
# ---------------------------------------------------------------------------
# Metadata [optional]
# Used in report titles and descriptions.
# ---------------------------------------------------------------------------
meta:
  model_name: My Hazard Model
  description: "Description of the model being evaluated"
# ---------------------------------------------------------------------------
# Test and Evaluation Configuration
# ---------------------------------------------------------------------------
config:
  # -------------------------------------------------------------------------
  # model_framework
  #
  # A dictionary of frameworks to run. Each framework contains its own
  # tests and evaluations as keys. Multiple frameworks can be run in a
  # single Hamlet run. Available frameworks: gem, relm, sanity.
  # -------------------------------------------------------------------------
  model_framework:
    # -----------------------------------------------------------------------
    # GEM Framework
    # Tests developed by GEM, some based on the literature, some original.
    # -----------------------------------------------------------------------
    gem:
      # Model MFD Evaluation
      # Compares the total model magnitude-frequency distribution to the
      # observed MFD from the earthquake catalog. Produces a figure in
      # the report. Use {} for default parameters.
      model_mfd:
        investigation_time: 40.   # [optional] Duration in years; defaults
                                  # to seis_catalog duration
        annualize: True           # [optional] Annualize rates; default True
      # Maximum Magnitude Check (sanity check)
      # Checks whether the model can produce earthquakes as large as the
      # largest observed earthquake in each spatial cell.
      max_mag_check:
        append_check: True        # [optional] Append pass/fail to bin data
        warn: True                # [optional] Log warnings for failures
      # N-Test: Number of Earthquakes Test
      # Compares the total observed earthquake count to the model prediction.
      N_test:
        conf_interval: 0.95       # Confidence interval for the test
        prob_model: poisson       # Probability model: "poisson" or "poisson_cum"
        prospective: False        # [optional] Use prospective catalog; default False
        # investigation_time: 40. # [optional] Overridden by catalog duration
        #                         # when not prospective
      # M-Test: Magnitude-Frequency Distribution Consistency Test
      # Evaluates consistency of the model MFD vs. observations using
      # log-likelihood comparisons with stochastic catalogs.
      M_test:
        critical_frac: 0.025            # Fraction of simulations below which the
                                        # test fails (e.g. 0.025 = 2.5th percentile)
        prospective: False              # [optional] Use prospective catalog
        n_iters: 1000                   # Number of Monte Carlo iterations
        normalize_n_eqs: False          # [optional] Normalize by number of
                                        # earthquakes; default True
        not_modeled_likelihood: 1.0e-5  # [optional] Likelihood for unmodeled
                                        # bins; default 1e-5
        # investigation_time: 40.       # [optional] Required if prospective: True
      # S-Test: Spatial Consistency Test
      # Evaluates spatial consistency of the model by comparing per-cell
      # likelihoods of observed vs. stochastic catalogs.
      S_test:
        critical_frac: 0.25             # Fraction threshold for test failure
        prospective: False              # [optional] Use prospective catalog
        n_iters: 1000                   # Number of Monte Carlo iterations
        normalize_n_eqs: False          # [optional] Normalize by number of
                                        # earthquakes; default False
        not_modeled_likelihood: 1.0e-5  # [optional] Likelihood for unmodeled
                                        # cells; default 1e-5
        likelihood_function: mfd        # [optional] Likelihood function to use;
                                        # options: "mfd", "conf_interval_poisson"
        # investigation_time: 40.       # [optional] Required if prospective: True
      # L-Test: Likelihood Test
      # Joint likelihood test combining spatial and magnitude information.
      L_test:
        critical_frac: 0.025            # Fraction threshold for test failure
        prospective: False              # [optional] Use prospective catalog
        n_iters: 1000                   # Number of Monte Carlo iterations
        not_modeled_likelihood: 1.0e-5  # [optional] default 1e-5
        # investigation_time: 40.       # [optional] Required if prospective: True
      # Moment Over/Under Evaluation
      # Compares observed vs. stochastic moment release in each spatial cell.
      moment_over_under:
        investigation_time: 40.   # Duration in years
        n_iters: 100              # Number of stochastic event sets
        # min_mag: 6.0            # [optional] Defaults to mfd_bin_min
        # max_mag: 8.5            # [optional] Defaults to mfd_bin_max
      # Rupture Matching Evaluation
      # Matches observed earthquakes to modeled ruptures based on location,
      # magnitude, and (optionally) geometry.
      rupture_matching_eval:
        use_occurrence_rate: False     # [optional] Weight by occurrence rate;
                                       # default False
        distance_lambda: 1.0           # [optional] Distance decay parameter
        mag_window: 1.0                # [optional] Magnitude window for matching
        group_return_threshold: 0.9    # [optional] Threshold for group matching
        min_likelihood: 0.1            # [optional] Minimum match likelihood
        no_attitude_default_like: 0.5  # [optional] Default likelihood when
                                       # rupture has no attitude data
        no_rake_default_like: 0.5      # [optional] Default likelihood when
                                       # rupture has no rake data
        return_one: best               # [optional] "best" or "all"
        parallel: False                # [optional] Parallel matching; default False
      # Cumulative Occurrence Evaluation
      # Evaluates cumulative earthquake occurrence over time by magnitude bin.
      # Use {} for default parameters (no configuration needed).
      cumulative_occurrence_eval: {}
      # Catalog Ground Motion Evaluation
      # Compares observed ground motions from a flatfile with model predictions.
      # Requires a flatfile to be specified under input.
      catalog_ground_motion_eval:
        match_rups: False                   # [optional] Match ruptures to earthquakes
                                            # before ground motion comparison
        # gmf_method: ground_motion_fields  # [optional] Method for ground
        #                                   # motion calculation
    # -----------------------------------------------------------------------
    # RELM/CSEP Framework
    # Implementation of the RELM (Regional Earthquake Likelihood Models)
    # tests. These are similar to the GEM tests but use different statistical
    # assumptions (e.g. not_modeled_likelihood is hardcoded to 0.0).
    # -----------------------------------------------------------------------
    relm:
      N_test:
        conf_interval: 0.95
        prob_model: poisson       # "poisson" or "poisson_cum"
        investigation_time: 40.
        prospective: False        # [optional]
      M_test:
        critical_frac: 0.25
        n_iters: 1000
        investigation_time: 40.
        prospective: False        # [optional]
      S_test:
        critical_frac: 0.25
        n_iters: 1000
        investigation_time: 40.
        prospective: False        # [optional]
        likelihood_function: mfd  # [optional] "mfd" or "conf_interval_poisson"
        normalize_n_eqs: False    # [optional]
      L_test:
        critical_frac: 0.25
        n_iters: 1000
        investigation_time: 40.
        prospective: False        # [optional]
    # -----------------------------------------------------------------------
    # Sanity Checks Framework
    # Basic checks for model internal consistency.
    # -----------------------------------------------------------------------
    sanity:
      max_check:
        warn: True                # [optional] Log warnings; default True
  # -------------------------------------------------------------------------
  # Global config options
  # -------------------------------------------------------------------------
  parallel: False           # Use multiprocessing for source loading
                            # and intensive tests. Recommended for
                            # medium to large models.
  rand_seed: 69             # [optional] Random seed for reproducible
                            # Monte Carlo simulations. Must be an integer.
  log_file: hamlet_run.log  # [optional] Path to log file output
# ---------------------------------------------------------------------------
# Inputs
# ---------------------------------------------------------------------------
input:
  # -------------------------------------------------------------------------
  # Spatial and magnitude binning
  # -------------------------------------------------------------------------
  bins:
    mfd_bin_min: 6.0    # Minimum magnitude (required)
    mfd_bin_max: 9.0    # Maximum magnitude (required)
    mfd_bin_width: 0.2  # Bin width (required)
    h3_res: 3           # [optional] H3 hexagonal grid resolution
                        # (0-15, where 0 is coarsest); default 3
  # -------------------------------------------------------------------------
  # Seismic Source Model (SSM)
  #
  # The SSM must be in OpenQuake format. Specify either ssm_dir + ssm_lt_file,
  # or job_ini_file.
  # -------------------------------------------------------------------------
  ssm:
    ssm_dir: path/to/ssm/     # Directory containing the SSM files
    ssm_lt_file: ssmLT.xml    # Logic tree XML file name
    # job_ini_file: job.ini   # [optional] Alternative: specify an
    #                         # OpenQuake job.ini file instead of
    #                         # ssm_dir + ssm_lt_file. Default: null
    branch: b1                # [optional] Logic tree branch to evaluate.
                              # Set to "iterate" to evaluate each branch
                              # independently in a single run. Default: null
    tectonic_region_types:    # [optional] Filter by tectonic region type.
      - Active Shallow Crust  # Must match types in the SSM. Default: null
                              # (all types included).
    source_types: null        # [optional] Filter by source type. Default: null. Options:
                              # simple_fault, complex_fault, area, point,
                              # multipoint, MultiFaultSource.
                              # Pass as a list or null for all.
  # -------------------------------------------------------------------------
  # Depth filtering [optional]
  # Sources outside this depth range are excluded from evaluation.
  # -------------------------------------------------------------------------
  min_depth: 0   # [optional] Minimum source depth in km; default: 0.0
  max_depth: 40  # [optional] Maximum source depth in km; default: no limit
  # -------------------------------------------------------------------------
  # Rupture file caching [optional]
  # Loading ruptures from a source model can be slow. These options let you
  # save processed ruptures to disk and reload them in subsequent runs.
  # -------------------------------------------------------------------------
  rupture_file:
    read_rupture_file: false            # Read ruptures from file instead of SSM; default: false
    save_rupture_file: false            # Save processed ruptures to file; default: false
    rupture_file_path: ./ruptures.hdf5  # Path to the rupture file; default: null
                                        # Supported formats: .hdf5, .feather, .csv
  # -------------------------------------------------------------------------
  # Spatial subsetting [optional]
  # Restrict the evaluation to a geographic subset of the model domain.
  # -------------------------------------------------------------------------
  subset:
    file: path/to/subset.geojson  # GIS file with subset geometry; default: null
    buffer: 0.0                   # Buffer distance around the subset geometry; default: 0.0
  # -------------------------------------------------------------------------
  # Flatfile for ground motion evaluation [optional]
  # Required for catalog_ground_motion_eval test.
  # -------------------------------------------------------------------------
  flatfile: path/to/flatfile.csv
  # -------------------------------------------------------------------------
  # Simple ruptures [optional]
  # Use simplified rupture representations (point sources) for faster
  # processing. Default: true (from cfg_defaults).
  # -------------------------------------------------------------------------
  simple_ruptures: true
  # -------------------------------------------------------------------------
  # Observed Earthquake Catalog
  # -------------------------------------------------------------------------
  seis_catalog:
    seis_catalog_file: path/to/catalog.csv
    # Temporal parameters: provide any two of start_date, stop_date, duration.
    # The third will be calculated. Alternatively, use a completeness_table.
    # start_date: 1976     # Can be an integer (year) or date string
    stop_date: 2018-01-01  # Date string (YYYY-MM-DD)
    # duration: 40.        # Duration in years
    # Completeness table [optional]
    # A list of [year, magnitude] pairs defining the completeness threshold
    # over time. Each pair means "the catalog is complete above this
    # magnitude from this year onward." When used, the effective duration
    # varies by magnitude bin.
    completeness_table:
      - [1960, 5.0]
      - [1900, 7.2]
    # Column mappings [optional]
    # Map expected fields to actual column names in the CSV. Only specify
    # columns whose names differ from the defaults.
    columns:
      # x_col: longitude      # default: longitude
      # y_col: latitude       # default: latitude
      # depth: depth          # default: depth
      # magnitude: magnitude  # default: magnitude
      # source: Agency        # no default; institutional source of the eq
      event_id: eventID       # no default; earthquake ID column
      # Time column(s): either a single column name or a list of components
      time:
        - year
        - month
        - day
        - hour
        - minute
        - second
  # -------------------------------------------------------------------------
  # Prospective catalog [optional]
  # A separate catalog for prospective (forward-looking) evaluation.
  # Uses the same format/columns as seis_catalog.
  # -------------------------------------------------------------------------
  # prospective_catalog:
  #   seis_catalog_file: path/to/prospective_catalog.csv
  #   duration: 5.
# ---------------------------------------------------------------------------
# Report [optional]
# Generate an HTML report summarizing all test results.
# ---------------------------------------------------------------------------
report:
  basic:
    outfile: outputs/report.html
# ---------------------------------------------------------------------------
# JSON output [optional]
# Write test results as a JSON file.
# ---------------------------------------------------------------------------
json:
  outfile: outputs/results.json