Tephra2 Inversion

06 Jun 2018 tephra2utils tephra2 inversion codes utilities

The current set of functions helps running the advection-diffusion model Tephra2 in inversion mode to estimate the best eruption source parameters (ESP) of a tephra deposit. The functions contain two sections. A first section contains a mixture of bash and python scripts for running the inversion of OpenPBS and SLURM clusters. A second section contains Matlab scripts for processing the inversion output designed to help its interpretation.

Updates - November 2018

The inversion scripts have been updated in November 2018. To find the scripts to post-process runs performed before this date, please download the previous version here.

Table of content

Introversion
- Single vs. batch runs
- Caveats
Installation
Getting started
- Case-study files
- Configuration file
Running the inversion
- Single vs. batch vs. seed runs
- OpenPBS
Post-processing
- Output figures
  - 1. Spaces of solutions
  - 2. Everything against everything!

Introversion

The inversion searches for the set of ESP that best reproduce observed values of tephra accumulations (kg/m²) with the computed values produced by Tephra2. As for any optimization problem, the many degrees of freedom require a critical interpretation of the inversion results. These scripts help building an empirical vision of the results in the perspective of the knowledge of the studied eruption.

For a purpose of illustration, let’s consider that the fit depends mainly on plume height and eruption mass. We will therefore attempt to find the combination of plume height and eruption mass that best reproduce the deposit. To do so, the Tephra2 uses a downhill simplex method, which searches for the minimum of a function - here the fit between observed and computed values - between user-defined ranges of [height_min - height_max] and [mass_min - mass_max].

Unfortunately, this method can suffer from numerical artifacts resulting in “fake” minima, especially when initial user-defined ranges are too large. For instance let’s consider two sets of plume heights and mass, height₁, mass₁ and height₂, mass₂, both giving a similarly good solution from a numerical perspective, but height₁, mass₁ being irrealistic from a volcanological point of view. We therefore need a method to subjectively discard this solution from the inversion result.

Single vs. batch runs

To avoid this problem, it is possible to run the inversion using three approaches, the single, the batch and the seed modes. The single mode searches for a minimum within the entire space defined by the ranges of plume heights and masses and consists of one single inversion run. The batch mode splits the space of plume heights and masses into smaller domains, and one inversion run is performed for each domain. The seed mode, for a lack of a better term, is pretty much the equivalent of the single mode repeated multiple times varying the seed. The seed is a number that controls the reproducability of stochastic runs, and in the case of the inversion it will control the starting point of the optimisation process. In case of unconstrained deposits, varying the seed is likely to affect the results. It is a good procedure to assess the sensitivity of the results to the input seed.

So what are the advantages and disadvantages of each method?

Single
- Pros: Provides the most accurate minimum
- Cons: Can get stuck in local minima
Batch
- Pros: Provides an overview of all minima on the height-mass space
- Cons: Small sub-spaces do not provide enough flexibility to accuratly define minima
Seed
- Pros: Advantages of the single approach without disadvantages
- Cons: Euuh… Not sure, it’s pretty much my prefered approach now

Caveats

Below are a few things to keep in mind when using the inversion:

The mass is always better constrained than the plume height
Constraining the plume height is more difficult because of the fact that in Tephra2, the plume height is intrinsically related to the computation of parameters controlling i) the grain size distribution, ii) the distribution of mass in the plume and iii) empirical parameters

Installation

Two components are needed, both available on GitHub:

Tephra2 source code, available here
Tephra2Utils, available here

For Tephra2, as far as we are concerned here, we only need the two executables created by the compilation of Tephra2 (tephra2012_inversion and tephra2-2012). For Tephra2Utils, we need the entire Inversion/ directory and the post processing files (plotBetaPlume.m, plotT2.m and processT2Inversion.m).

Organize your working folder like this:

ROOT
├── Inversion/
│       ├── _example_folder/                    -> Example of input files
│       │        ├── forwardGrid.utm
│       │        ├── inversionConfig.conf
│       │        ├── inversionInput.txt
│       │        ├── inversionWind.txt
│       │        └── runInversion.sh
│       ├── _scripts/                           -> Scripts used during inversion*
│       │        ├── genConfig.py
│       ├── _templates/                        -> Templates used during inversion*
│       │        ├── forwardConfTemplate.conf
│       │        └── inversionConfTemplate.conf
├── dependencies/                               -> Dependencies for post-processing*
│       └── xlwrite/
├── plotBetaPlume.m                             -> Used during post-processing*
├── plotT2.m                                    -> Used during post-processing*
├── processT2Inversion.m                        -> Main post-processing function*
├── plotT2Inversion.m                           -> Plot results of batch and seed runs*
├── tephra2012_inversion                        -> Tephra2 inversion exec
└── tephra2-2012                                -> Tephra2 forward exec

* Should not have to be edited

Getting started

Duplicate the folder Inversion/example_folder/ and rename it to whatever name makes you happy. I usually name this folder after the tephra deposit I am inverting. Note that you can have as many folders for as many deposits of as many volcanoes as you want here. Let’s now edit the required files.

Case-study files

Replace the following files by those for your own case-study.

inversionInput.txt:
This is the main file containing your field observations. It is a 4-columns, tab-delimited file containing:
- Column 1: Easting coordinates of the outcrop (UTM, WGS84)
- Column 2: Northing coordinates of the outcrop (UTM, WGS84)
- Column 3: Altitude (m asl). Note that Tephra2 can not accurately use the elevation. Instead, a mean elevation of all outcrops should be estimated and used as a unique value for all outcrops
- Column 4: The mass accumulation (kg/m²) estimated at each outcrop. Note that this is a mass accumulation and not a thickness
forwardGrid.utm:
The calculation grid used by the forward solution of Tephra2, and used during the inversion process to compare the deposit modelled by the inversion with field data. You can use TephraProb to create a grid following this tutorial.
inversionWind.txt:
If a wind observation is available for the studied eruption, it can be used to constrain the inversion. If no wind is available, ranges of wind speed and directions can be defined instead (described later), but please keep then the inversionWind.txt file in the folder. If the eruption is within the time interval provided by Reanalysis datasets, TephraProb can be used to retrieve it. The wind file should be a 3-columns, tab-delimited file containing:
- Column 1: Altitude (m asl)
- Column 2: Wind speed (m/s)
- Column 3: Wind direction (i.e. the direction the wind blows to; degrees from N)

Configuration file

The main inversion file is inversionConfig.conf, which must not be renamed. As a tip, replace the ERUPTION_NAME in the second line by the name of the eruption you are modelling. That my save you precious time when looking for data years from now!

Below is a summary of the different sections to fill in.

Inversion behaviour

This section controls the general behaviour of the inversion runs.

Variable	Description
`BATCH`	Enter 0 for single runs, 1 for batch runs and 2 for seed rund
`fixedWind`	Enter 0 to use ranges of wind direction/speed, 1 to use a wind profile
`SEED`	Used to control the seed for reproducibility. To randomely choose a seed, enter -1

Input files

This section defines the path to the input files previously described.

Variable	Description
`inputFile`	File containing the field observations
`windFile`	File containing the wind profile. Leave inversionWind.txt if fixedWind=0
`gridFile`	Path to the grid file

Vent properties

This section defines the vent geometry. Coordinates are in UTM and should be projected on the same zone as the field data.

Variable	Description
`ventE`	Vent easting (UTM, WGS84)
`ventN`	Vent northin (UTM, WGS84)
`ventA`	Vent elevation (m asl)
`ventZ`	UTM zone of the vent. Only the numeric part, negative in S hemisphere

Ranges to invert

This section contains all the variable parameters that are optimized during the inversion. Each parameter contains min/max values.

Variable	Description
`Ht`	Plume heights (m asl)
`Mass`	Total mass (log10, kg). Used only when BATCH=0
`Diff`	Diffusion coefficient for large particles (m²/s)
`FTT`	Fall-time threshold(s)
`MedPhi`	Median of the TGSD (Phi). Note that minMedPhi is the coarsest diameter
`SigPhi`	Standard deviation of the TGSD (Phi)
`Alpha`	Alpha parameter of the Beta function controlling the mass distribution in the plume
`Beta`	Beta parameter of the Beta function controlling the mass distribution in the plume
`WindSpeed`	Wind speed (m/s) - used when fixedWind=0
`WindDir`	Wind direction (degree from N) - used when fixedWind=0

Constant parameters

This section contains parameters that are kept constant during the inversion.

Variable	Description
`lithicDensity`	Density of the lithics (kg/m³)
`pumiceDensity`	Density of the pumices (kg/m³)
`minPhi`	Minimum bound (coarsest diameter; phi) of the TGSD
`maxPhi`	Maximum bound (finest diameter; phi) of the TGSD

Batch parameters

This section contains parameters that control the behavior of the batch inversion. It is still experimental, so it is recommended to keep the proposed parameters for now.

Variable	Description
`deltaMass`	Mass interval between each sub-run of the batch inversion (log10; kg)
`incrMass`	Mass increment between each sub-run of the batch inversion (log10; kg)
`deltaHt`	Height interval between each sub-run of the batch inversion (m asl)
`incrHt`	Height increment between each sub-run of the batch inversion (m asl)

Tephra2 advanced parameters

Variable	Description
`colSteps`	Number of integration steps along the height of the column
`partStep`	Number of integration steps along the TGSD
`fitTest`	Merit test: 0 (chi-squared test), 1 (root mean squared error), 2 (Tokyo log test)
`eddy`	Eddy constant
`plumeModel`	Mass distribution in the plume (2 is a beta distribution)
`windLevels`	Number of wind levels to consider. Only if fixedWind=0

Running the inversion

Tephra2 is parallelised using OpenMPI and commonly run on clusters. Different clusters have different architectures, which will probably require some measure of editing. If your files are prepared locally, now is the time to upload everything on the server. Be sure to keep the directory tree specified before.

The main script used to run the inversion is runInversion.sh. The _template/ folder contains templates for different cluster architectures, namely Slurm, OpenPBS and PBSPro. Edit the header of the script and copy it to the _example_folder/ folder. Navigate to _example_folder/ to start running the script.

Single vs. batch vs. seed runs

For parallelisation purposes, single runs, batch runs and seed runs are submitted in slightly different ways.

For single runs, the mass range is defined in the inversionConfig.conf file;
For batch runs use job arrays, the mass range is specified on the command line when submitting the job, which will allow each mass increment to be sent to a different node;
For seed runs, the mass is specified in inversionConfig.conf (as for single runs), but the number of seeds is specified on the command line when submitting the job.

OpenPBS

For a single inversion run, set BATCH=0 in inversionConfig.conf and type:

qsub runInversion.sh

For a batch inversion run with a mass range between 10⁹ and 10¹¹ kg, set BATCH=1 in inversionConfig.conf and use the -t flag:

qsub -t 9-11 runInversion.sh

For a seed inversion performing 100 runs with a seed number varying between 1-100, type:

qsub -t 1-100 runInversion.sh

Post-processing

Upon successful (yey!) completion of an inversion run, one (single run) or multiple (batch run) folders are created inside _example_folder/, named massM_htH, where M and H are the lower intervals of the mass (log10 kg) and plume height (km) ranges, respectively. For post processing, follow these steps:

In the _example_folder/ on your local computer, create a folder named n/, where n is the number of your inversion attempt (start at 1/ and name the subsequent folders in a continuous way). By doing so, each new inversion attempt doesn’t have to delete the previous one, and it becomes easier to track different attempts
Copy the content of _example_folder/ on the cluster into your local n/ folder
In Matlab, run the processInversion.m script. When asked, go and select your n/ folder and let the script work

This process will first create four new files in each folder:

obsVScomp.pdf: Plot of the computed values of tephra accumulation against their observed equivalents;
plume.pdf: The plume distribution
tephra2.out: Result of the forward solution
wind.pdf: Wind profile

Additionally, each inversion attempt is recorded in an Excel file at the root of the Inversion/ folder in a different sheet named after n/.

Output figures

1. Spaces of solutions

In the case of batch or seed runs, the function plotT2Inversion can be used to explore spaces of results. A plane of fit values is computed and interpolated between two variable of choice. In the case of batch runs, this visualisation method is useful to assess the location(s) of minima. In the case of seed runs, these figures are helpeful to assess the stability of the inversion output to the seed number.

2. Everything against everything!

In the case of seed runs, the script produces a matrix plot (type doc plotmatrix in the Matlab command line). I find this plot to be the most informative because:

It allows indentifying populations of solutions from the histograms
It allows indentifying relationships between parameters

2020 Feb

2018 Dec

2018 Nov

2018 Oct

2018 Jul

2018 Jun

2018 Feb

2017 Dec

2017 Oct

2017 Sep

Tephra2 Inversion

Updates - November 2018

Table of content

Introversion

Single vs. batch runs

Caveats

Installation

Getting started

Case-study files

Configuration file

Running the inversion

Single vs. batch vs. seed runs

OpenPBS

Post-processing

Output figures

1. Spaces of solutions

2. Everything against everything!

2020 Feb

2018 Dec

2018 Nov

2018 Oct

2018 Jul

2018 Jun

2018 Feb

2017 Dec

2017 Oct

2017 Sep

Tephra2 Inversion

Updates - November 2018

Table of content

Introversion

Single vs. batch runs

Caveats

Installation

Getting started

Case-study files

Configuration file

Running the inversion

Single vs. batch vs. seed runs

OpenPBS

Post-processing

Output figures

1. Spaces of solutions

2. Everything against everything!

Related Posts

＿ TephraProb v.1.7 21 Feb 2020

＿ TephraProb troubleshoot 20 Dec 2018

＿ Random tools 13 Dec 2018