The YAML configuration file

A Hamlet evaluation is configured using a YAML file, which describes the tests and evaluations to run (with any required parameters), the location of the input files, and the outputs to write. The file is parsed into a dictionary that guides the Hamlet run, so the order of items does not matter.

A simple example of a YAML configuration file is given in the Minimal Example section near the end of this page. A comprehensive reference configuration (with comments explaining every option) follows it, and is also available in the source repository at doc_src/source/example_configs/reference_config.yml.

Metadata (meta)

This optional section describes the model. The information is used when writing reports.

meta:
    model_name: Model Name
    description: "words describing the model"

Test and Evaluation Configuration (config)

This section specifies which frameworks, tests, and evaluations to run, along with their parameters.

model_framework

This is a dictionary of frameworks to run. Each framework key contains its own tests and evaluations as sub-keys. Multiple frameworks can be run in a single Hamlet run. Available frameworks are:

  • gem – GEM tests and evaluations (see GEM Tests and Evaluations). This framework has the best-supported and most configurable versions of all tests and evaluations, and is the recommended choice.

  • relm – RELM/CSEP tests (see RELM/CSEP Tests). These ‘letter tests’ (N, M, S, and L) are implemented with specific defaults fixed.

  • sanity – Basic sanity checks (see Sanity Checks).

Tests and evaluations are nested directly under their framework, with their configuration parameters as sub-keys. Use an empty dictionary ({}) for those that need no configuration.

config:
  model_framework:
    gem:
      model_mfd: {}
      max_mag_check:
        append_check: True
        warn: True
      N_test:
        conf_interval: 0.95
        prob_model: poisson
      M_test:
        critical_frac: 0.025
        n_iters: 1000
    relm:
      S_test:
        critical_frac: 0.25
        n_iters: 1000
        investigation_time: 40.

parallel

A Boolean (True or False) flag that determines whether parallel algorithms are used for loading the seismic source model and performing the more time-intensive tests and evaluations (such as Monte Carlo based consistency tests and moment analysis).

This flag should be True for medium to large models, unless RAM is a major limitation. For small models, it may be faster to run on a single core, because the overhead in instantiating multiple processes can be substantial.

rand_seed

An optional integer random seed for reproducible random sampling (which is fundamental in Hamlet, as stochastic catalogs are generated for most tests and evaluations).

log_file

An optional path to a log file. If given, log output is written to this file in addition to the console.
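
As a sketch of where these three options sit, the fragment below places them under config alongside model_framework, following the layout of the reference configuration at the end of this page. The seed value and log file name are arbitrary placeholders.

```yaml
config:
  model_framework:
    gem:
      model_mfd: {}

  parallel: True            # multiple processes; suited to medium or large models
  rand_seed: 12345          # any integer (placeholder value)
  log_file: hamlet_run.log  # placeholder path; output still goes to the console
```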

Inputs (input)

This section describes the inputs into a Hamlet run: the seismic source model, the observed earthquake catalog, and the configuration of the spatial bins.

Seismic Source Model (ssm)

The SSM must be in the modern OpenQuake format. It can be specified either with a logic tree XML file or an OpenQuake job.ini file. All SSM parameters default to null when not specified (see cfg_defaults in openquake.hme.core.core).

Sub-parameters:

ssm_dir

Filepath (absolute or relative) to the directory containing the seismic source model logic tree file.

ssm_lt_file

The name of the logic tree XML file for the SSM, e.g. ssmLT.xml for most GEM Hazard Mosaic model repositories.

job_ini_file

Alternative to ssm_dir + ssm_lt_file: path to an OpenQuake job.ini file that specifies the source model. This is useful when the source model configuration is complex.
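
A minimal sketch of this alternative, assuming a job.ini file at a hypothetical path; ssm_dir and ssm_lt_file are simply left out:

```yaml
input:
  ssm:
    job_ini_file: path/to/job.ini   # hypothetical path
    branch: b1                      # branch ID taken from the examples on this page
```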

branch

Specifies which logic tree branch to evaluate. Different branches are often mutually exclusive, alternative descriptions of earthquake occurrence, so evaluating several branches together may greatly inflate the forecast occurrence rates and produce an inaccurate evaluation.

Set to "iterate" to evaluate each branch independently in a single Hamlet run. This loads the earthquake catalog once and then evaluates each branch separately, producing combined results.
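
For illustration, a fragment that requests per-branch evaluation might look like this (the paths are placeholders reused from the examples on this page):

```yaml
input:
  ssm:
    ssm_dir: path/to/ssm/
    ssm_lt_file: ssmLT.xml
    branch: iterate   # evaluate each logic tree branch separately
```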

tectonic_region_types

Optional filter specifying which Tectonic Region Type(s) should be evaluated. The types must correspond to those in the SSM. Pass as a YAML list or omit/set to null to include all types.

tectonic_region_types:
    - Active Shallow Crust
    - Stable Continental

source_types

Optional filter specifying which source types to evaluate. Options include simple_fault, complex_fault, area, point, multipoint, and MultiFaultSource. Pass as a YAML list, or null for all types.
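
As a sketch, restricting an evaluation to fault sources could be written as below. The choice of the two types is only an example; the key is placed under ssm, as in the reference configuration.

```yaml
input:
  ssm:
    source_types:
      - simple_fault
      - complex_fault
```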

Depth Filtering (Optional)

min_depth

Optional. Minimum source depth in km. Sources shallower than this are excluded from the evaluation. Default: 0.0. Specified at the input level (not inside ssm).

max_depth

Optional. Maximum source depth in km. Sources deeper than this are excluded from the evaluation. Default: no limit. Specified at the input level (not inside ssm).

input:
  min_depth: 0
  max_depth: 40

Observed Earthquake Catalog (seis_catalog)

This set of parameters determines how the seismic catalog will be found and parsed so that it can be compared to the source model.

seis_catalog_file

Relative or absolute filepath to the CSV earthquake catalog file.

Temporal Parameters

Provide any two of start_date, stop_date, and duration; the third will be calculated automatically. Alternatively, use a completeness_table.

start_date

Start of the catalog time window. Can be an integer (interpreted as January 1 of that year) or a date string (e.g. "1976-01-01").

stop_date

End of the catalog time window. Same format as start_date.

duration

Duration of the catalog in years (float).
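
A short sketch of the two-of-three rule, here giving start_date and duration and leaving stop_date to be derived. The values are illustrative only.

```yaml
input:
  seis_catalog:
    seis_catalog_file: path/to/catalog.csv
    start_date: 1976   # integer year, read as January 1, 1976
    duration: 40.      # years; stop_date is left for Hamlet to calculate
```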

completeness_table

A list of [year, magnitude] pairs defining the completeness threshold over time. Each pair means “the catalog is complete above this magnitude from this year onward.” When used, the effective duration varies by magnitude bin, which enables more accurate evaluation of models across the magnitude range.

completeness_table:
  - [1960, 5.0]
  - [1900, 7.2]
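
One way to read this table is sketched below. It assumes the catalog window ends at a stop_date of 2018-01-01 (a value borrowed from the other examples on this page) and that the effective duration of a magnitude bin runs from its completeness year to that date. The durations in the comments are simple year differences under those assumptions, not output from Hamlet.

```yaml
input:
  seis_catalog:
    stop_date: 2018-01-01
    completeness_table:
      - [1960, 5.0]   # M >= 5.0 complete from 1960: about 58 years to 2018
      - [1900, 7.2]   # M >= 7.2 complete from 1900: about 118 years to 2018
```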

Column Mappings (columns)

Maps expected fields to actual column names in the CSV file. Only specify columns whose names differ from the defaults.

  • x_col – Defaults to longitude.

  • y_col – Defaults to latitude.

  • depth – Defaults to depth.

  • magnitude – Defaults to magnitude.

  • source – No default. The institutional source of the earthquake (e.g. Agency).

  • event_id – No default. The earthquake ID column (e.g. eventID).

  • time – Either a single column name containing a parseable timestamp, or an ordered list of columns to construct the time:

    time:
        - year
        - month
        - day
        - hour
        - minute
        - second
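
For comparison, a sketch of the single-column form of time, combined with a few renamed fields. All CSV column names on the right-hand side (lon, lat, Mw, datetime) are hypothetical.

```yaml
input:
  seis_catalog:
    columns:
      x_col: lon        # hypothetical column names in the CSV
      y_col: lat
      magnitude: Mw
      time: datetime    # one column holding a parseable timestamp
```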
    

Bins (bins)

This section configures the spatial and magnitude binning used throughout Hamlet.

Default values for many parameters (including h3_res, simple_ruptures, subset, and rupture_file options) are defined in openquake.hme.core.core.cfg_defaults and are applied automatically when not specified in the YAML file.

h3_res

The resolution of the H3 hexagonal grid used for spatial binning. Values range from 0 (coarsest, ~1,280 km average edge length) to 15 (finest). Default: 3 (~69 km average edge length). H3 hexagons are generated automatically based on the spatial extent of the source model.

mfd_bin_min

Minimum magnitude for the MFD. Required.

mfd_bin_max

Maximum magnitude for the MFD. Required.

mfd_bin_width

Width of the magnitude bins. Required.
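
Put together, a bins section might look like the sketch below, which reuses the values from the examples at the end of this page:

```yaml
input:
  bins:
    mfd_bin_min: 6.0
    mfd_bin_max: 9.0
    mfd_bin_width: 0.2
    h3_res: 3          # optional; 3 is the default
```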

Rupture File Caching (rupture_file)

Loading ruptures from a source model can be slow for large models. These optional parameters let you save processed ruptures to disk and reload them in later runs. This helps when the model itself is unchanged and only the test configuration (branches, depth ranges, catalog completeness, etc.) differs between runs.

rupture_file:
  read_rupture_file: false
  save_rupture_file: false
  rupture_file_path: ./ruptures.hdf5

read_rupture_file

If true, read ruptures from the file at rupture_file_path instead of processing the SSM. Default: false.

save_rupture_file

If true, save processed ruptures to rupture_file_path. Default: false.

rupture_file_path

Path to the rupture file. Supported formats: .hdf5, .feather, .csv.
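
A typical two-step use of these options might look like the sketch below. The two fragments belong to two separate configuration files (or two successive edits of the same file) and are shown together only for comparison; the `---` separator is there for display and is not meant to suggest a multi-document config.

```yaml
# Run 1: process the SSM and write the ruptures to disk
input:
  rupture_file:
    read_rupture_file: false
    save_rupture_file: true
    rupture_file_path: ./ruptures.hdf5
---
# Run 2 (same model, different test settings): reload the saved ruptures
input:
  rupture_file:
    read_rupture_file: true
    save_rupture_file: false
    rupture_file_path: ./ruptures.hdf5
```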

Spatial Subsetting (subset)

Optional parameters to restrict the evaluation to a geographic subset of the model domain.

subset:
  file: path/to/subset.geojson
  buffer: 0.0

file

Path to a GIS file containing the subset geometry. Default: null.

buffer

Buffer distance around the subset geometry (in the units of the GIS file’s coordinate reference system). Default: 0.0.

Flatfile (flatfile)

Path to a ground motion flatfile (CSV). Required for the catalog_ground_motion_eval evaluation. Specified at the input level.

input:
  flatfile: path/to/flatfile.csv

Note that this requires a flatfile in a format used internally at GEM, which is currently undocumented pending publication. Until the format stabilizes, problems are very likely.
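
Because the flatfile is only used by catalog_ground_motion_eval, the two settings normally appear together. A sketch, with a placeholder path and the match_rups value taken from the reference configuration:

```yaml
config:
  model_framework:
    gem:
      catalog_ground_motion_eval:
        match_rups: False

input:
  flatfile: path/to/flatfile.csv   # placeholder path
```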

Other Input Options

simple_ruptures

Boolean. Use simplified rupture representations (hypocenter point sources) for faster processing. Default: True.

Reporting (report)

This optional section configures report generation. Currently there is one option, the basic HTML report, which aggregates all test and evaluation results into maps, plots, and tables.

report:
  basic:
    outfile: outputs/report.html

JSON Output (json)

Optional. Write test results as a JSON file.

json:
  outfile: outputs/results.json

Minimal Example

###############################################################################
# Hamlet Minimal Configuration Example
#
# This is the simplest useful configuration: it runs the model MFD comparison,
# the N-test, and the maximum magnitude sanity check for a single logic tree
# branch.
###############################################################################

meta:
  model_name: My Model
  description: "Minimal example evaluation"

config:
  model_framework:
    gem:
      model_mfd: {}
      max_mag_check:
        append_check: True
        warn: True
      N_test:
        conf_interval: 0.95
        prob_model: poisson

  parallel: False

input:
  bins:
    mfd_bin_min: 6.0
    mfd_bin_max: 9.0
    mfd_bin_width: 0.2

  ssm:
    ssm_dir: path/to/ssm/
    ssm_lt_file: ssmLT.xml
    branch: b1
    tectonic_region_types:
      - Active Shallow Crust

  seis_catalog:
    seis_catalog_file: path/to/catalog.csv
    stop_date: 2018-01-01
    duration: 40.
    columns:
      event_id: eventID
      time:
        - year
        - month
        - day

report:
  basic:
    outfile: outputs/report.html

Comprehensive Reference

The complete reference configuration with all options documented is available at doc_src/source/example_configs/reference_config.yml in the repository.

###############################################################################
# Hamlet Reference Configuration
#
# This file documents every available configuration option. It is not meant
# to be used as-is; copy it and remove or modify the sections you need.
# Options marked [optional] can be omitted; defaults are noted where they exist.
# Many defaults are defined in openquake.hme.core.core.cfg_defaults and are
# applied automatically when not specified.
###############################################################################

# ---------------------------------------------------------------------------
# Metadata [optional]
# Used in report titles and descriptions.
# ---------------------------------------------------------------------------
meta:
  model_name: My Hazard Model
  description: "Description of the model being evaluated"

# ---------------------------------------------------------------------------
# Test and Evaluation Configuration
# ---------------------------------------------------------------------------
config:

  # -------------------------------------------------------------------------
  # model_framework
  #
  # A dictionary of frameworks to run. Each framework contains its own
  # tests and evaluations as keys. Multiple frameworks can be run in a
  # single Hamlet run. Available frameworks: gem, relm, sanity.
  # -------------------------------------------------------------------------
  model_framework:

    # -----------------------------------------------------------------------
    # GEM Framework
    # Tests developed by GEM, some based on the literature, some original.
    # -----------------------------------------------------------------------
    gem:

      # Model MFD Evaluation
      # Compares the total model magnitude-frequency distribution to the
      # observed MFD from the earthquake catalog. Produces a figure in
      # the report. Use {} for default parameters.
      model_mfd:
        investigation_time: 40.    # [optional] Duration in years; defaults
                                   # to seis_catalog duration
        annualize: True            # [optional] Annualize rates; default True

      # Maximum Magnitude Check (sanity check)
      # Checks whether the model can produce earthquakes as large as the
      # largest observed earthquake in each spatial cell.
      max_mag_check:
        append_check: True         # [optional] Append pass/fail to bin data
        warn: True                 # [optional] Log warnings for failures

      # N-Test: Number of Earthquakes Test
      # Compares the total observed earthquake count to the model prediction.
      N_test:
        conf_interval: 0.95        # Confidence interval for the test
        prob_model: poisson        # Probability model: "poisson" or "poisson_cum"
        prospective: False         # [optional] Use prospective catalog; default False
        # investigation_time: 40.  # [optional] Overridden by catalog duration
                                   # when not prospective

      # M-Test: Magnitude-Frequency Distribution Consistency Test
      # Evaluates consistency of the model MFD vs. observations using
      # log-likelihood comparisons with stochastic catalogs.
      M_test:
        critical_frac: 0.025       # Fraction of simulations below which the
                                   # test fails (e.g. 0.025 = 2.5th percentile)
        prospective: False         # [optional] Use prospective catalog
        n_iters: 1000              # Number of Monte Carlo iterations
        normalize_n_eqs: False     # [optional] Normalize by number of
                                   # earthquakes; default True
        not_modeled_likelihood: 1.0e-5  # [optional] Likelihood for unmodeled
                                        # bins; default 1e-5
        # investigation_time: 40.  # [optional] Required if prospective: True

      # S-Test: Spatial Consistency Test
      # Evaluates spatial consistency of the model by comparing per-cell
      # likelihoods of observed vs. stochastic catalogs.
      S_test:
        critical_frac: 0.25        # Fraction threshold for test failure
        prospective: False         # [optional] Use prospective catalog
        n_iters: 1000              # Number of Monte Carlo iterations
        normalize_n_eqs: False     # [optional] Normalize by number of
                                   # earthquakes; default False
        not_modeled_likelihood: 1.0e-5  # [optional] Likelihood for unmodeled
                                        # cells; default 1e-5
        likelihood_function: mfd   # [optional] Likelihood function to use;
                                   # options: "mfd", "conf_interval_poisson"
        # investigation_time: 40.  # [optional] Required if prospective: True

      # L-Test: Likelihood Test
      # Joint likelihood test combining spatial and magnitude information.
      L_test:
        critical_frac: 0.025       # Fraction threshold for test failure
        prospective: False         # [optional] Use prospective catalog
        n_iters: 1000              # Number of Monte Carlo iterations
        not_modeled_likelihood: 1.0e-5  # [optional] default 1e-5
        # investigation_time: 40.  # [optional] Required if prospective: True

      # Moment Over/Under Evaluation
      # Compares observed vs. stochastic moment release in each spatial cell.
      moment_over_under:
        investigation_time: 40.    # Duration in years
        n_iters: 100               # Number of stochastic event sets
        # min_mag: 6.0             # [optional] Defaults to mfd_bin_min
        # max_mag: 8.5             # [optional] Defaults to mfd_bin_max

      # Rupture Matching Evaluation
      # Matches observed earthquakes to modeled ruptures based on location,
      # magnitude, and (optionally) geometry.
      rupture_matching_eval:
        use_occurrence_rate: False  # [optional] Weight by occurrence rate;
                                   # default False
        distance_lambda: 1.0       # [optional] Distance decay parameter
        mag_window: 1.0            # [optional] Magnitude window for matching
        group_return_threshold: 0.9  # [optional] Threshold for group matching
        min_likelihood: 0.1        # [optional] Minimum match likelihood
        no_attitude_default_like: 0.5  # [optional] Default likelihood when
                                       # rupture has no attitude data
        no_rake_default_like: 0.5  # [optional] Default likelihood when
                                   # rupture has no rake data
        return_one: best           # [optional] "best" or "all"
        parallel: False            # [optional] Parallel matching; default False

      # Cumulative Occurrence Evaluation
      # Evaluates cumulative earthquake occurrence over time by magnitude bin.
      # Use {} for default parameters (no configuration needed).
      cumulative_occurrence_eval: {}

      # Catalog Ground Motion Evaluation
      # Compares observed ground motions from a flatfile with model predictions.
      # Requires a flatfile to be specified under input.
      catalog_ground_motion_eval:
        match_rups: False          # [optional] Match ruptures to earthquakes
                                   # before ground motion comparison
        # gmf_method: ground_motion_fields  # [optional] Method for ground
                                            # motion calculation


    # -----------------------------------------------------------------------
    # RELM/CSEP Framework
    # Implementation of the RELM (Regional Earthquake Likelihood Models)
    # tests. These are similar to the GEM tests but use different statistical
    # assumptions (e.g. not_modeled_likelihood is hardcoded to 0.0).
    # -----------------------------------------------------------------------
    relm:

      N_test:
        conf_interval: 0.95
        prob_model: poisson        # "poisson" or "poisson_cum"
        investigation_time: 40.
        prospective: False         # [optional]

      M_test:
        critical_frac: 0.25
        n_iters: 1000
        investigation_time: 40.
        prospective: False         # [optional]

      S_test:
        critical_frac: 0.25
        n_iters: 1000
        investigation_time: 40.
        prospective: False         # [optional]
        likelihood_function: mfd   # [optional] "mfd" or "conf_interval_poisson"
        normalize_n_eqs: False     # [optional]

      L_test:
        critical_frac: 0.25
        n_iters: 1000
        investigation_time: 40.
        prospective: False         # [optional]


    # -----------------------------------------------------------------------
    # Sanity Checks Framework
    # Basic checks for model internal consistency.
    # -----------------------------------------------------------------------
    sanity:
      max_check:
        warn: True                 # [optional] Log warnings; default True



  # -------------------------------------------------------------------------
  # Global config options
  # -------------------------------------------------------------------------
  parallel: False                  # Use multiprocessing for source loading
                                   # and intensive tests. Recommended for
                                   # medium to large models.

  rand_seed: 69                    # [optional] Random seed for reproducible
                                   # Monte Carlo simulations. Must be an integer.

  log_file: hamlet_run.log         # [optional] Path to log file output


# ---------------------------------------------------------------------------
# Inputs
# ---------------------------------------------------------------------------
input:

  # -------------------------------------------------------------------------
  # Spatial and magnitude binning
  # -------------------------------------------------------------------------
  bins:
    mfd_bin_min: 6.0               # Minimum magnitude (required)
    mfd_bin_max: 9.0               # Maximum magnitude (required)
    mfd_bin_width: 0.2             # Bin width (required)
    h3_res: 3                      # [optional] H3 hexagonal grid resolution
                                   # (0-15, where 0 is coarsest); default 3

  # -------------------------------------------------------------------------
  # Seismic Source Model (SSM)
  #
  # The SSM must be in OpenQuake format. Specify either ssm_dir + ssm_lt_file,
  # or job_ini_file.
  # -------------------------------------------------------------------------
  ssm:
    ssm_dir: path/to/ssm/          # Directory containing the SSM files
    ssm_lt_file: ssmLT.xml         # Logic tree XML file name
    # job_ini_file: job.ini        # [optional] Alternative: specify an
                                   # OpenQuake job.ini file instead of
                                   # ssm_dir + ssm_lt_file. Default: null

    branch: b1                     # [optional] Logic tree branch to evaluate.
                                   # Set to "iterate" to evaluate each branch
                                   # independently in a single run. Default: null

    tectonic_region_types:         # [optional] Filter by tectonic region type.
      - Active Shallow Crust       # Must match types in the SSM. Default: null
                                   # (all types included).

    source_types: null             # [optional] Filter by source type. Default: null. Options:
                                   # simple_fault, complex_fault, area, point,
                                   # multipoint, MultiFaultSource.
                                   # Pass as a list or null for all.

  # -------------------------------------------------------------------------
  # Depth filtering [optional]
  # Sources outside this depth range are excluded from evaluation.
  # -------------------------------------------------------------------------
  min_depth: 0                     # [optional] Minimum source depth in km; default: 0.0
  max_depth: 40                    # [optional] Maximum source depth in km; default: no limit

  # -------------------------------------------------------------------------
  # Rupture file caching [optional]
  # Loading ruptures from a source model can be slow. These options let you
  # save processed ruptures to disk and reload them in subsequent runs.
  # -------------------------------------------------------------------------
  rupture_file:
    read_rupture_file: false       # Read ruptures from file instead of SSM; default: false
    save_rupture_file: false       # Save processed ruptures to file; default: false
    rupture_file_path: ./ruptures.hdf5  # Path to the rupture file; default: null
                                   # Supported formats: .hdf5, .feather, .csv

  # -------------------------------------------------------------------------
  # Spatial subsetting [optional]
  # Restrict the evaluation to a geographic subset of the model domain.
  # -------------------------------------------------------------------------
  subset:
    file: path/to/subset.geojson   # GIS file with subset geometry; default: null
    buffer: 0.0                    # Buffer distance around the subset geometry; default: 0.0

  # -------------------------------------------------------------------------
  # Flatfile for ground motion evaluation [optional]
  # Required for catalog_ground_motion_eval test.
  # -------------------------------------------------------------------------
  flatfile: path/to/flatfile.csv

  # -------------------------------------------------------------------------
  # Simple ruptures [optional]
  # Use simplified rupture representations (point sources) for faster
  # processing. Default: true (from cfg_defaults).
  # -------------------------------------------------------------------------
  simple_ruptures: true

  # -------------------------------------------------------------------------
  # Observed Earthquake Catalog
  # -------------------------------------------------------------------------
  seis_catalog:
    seis_catalog_file: path/to/catalog.csv

    # Temporal parameters: provide any two of start_date, stop_date, duration.
    # The third will be calculated. Alternatively, use a completeness_table.
    # start_date: 1976             # Can be an integer (year) or date string
    stop_date: 2018-01-01          # Date string (YYYY-MM-DD)
    # duration: 40.                # Duration in years

    # Completeness table [optional]
    # A list of [year, magnitude] pairs defining the completeness threshold
    # over time. Each pair means "the catalog is complete above this
    # magnitude from this year onward." When used, the effective duration
    # varies by magnitude bin.
    completeness_table:
      - [1960, 5.0]
      - [1900, 7.2]

    # Column mappings [optional]
    # Map expected fields to actual column names in the CSV. Only specify
    # columns whose names differ from the defaults.
    columns:
      # x_col: longitude           # default: longitude
      # y_col: latitude            # default: latitude
      # depth: depth               # default: depth
      # magnitude: magnitude       # default: magnitude
      # source: Agency             # no default; institutional source of the eq
      event_id: eventID            # no default; earthquake ID column

      # Time column(s): either a single column name or a list of components
      time:
        - year
        - month
        - day
        - hour
        - minute
        - second

  # -------------------------------------------------------------------------
  # Prospective catalog [optional]
  # A separate catalog for prospective (forward-looking) evaluation.
  # Uses the same format/columns as seis_catalog.
  # -------------------------------------------------------------------------
  # prospective_catalog:
  #   seis_catalog_file: path/to/prospective_catalog.csv
  #   duration: 5.


# ---------------------------------------------------------------------------
# Report [optional]
# Generate an HTML report summarizing all test results.
# ---------------------------------------------------------------------------
report:
  basic:
    outfile: outputs/report.html


# ---------------------------------------------------------------------------
# JSON output [optional]
# Write test results as a JSON file.
# ---------------------------------------------------------------------------
json:
  outfile: outputs/results.json