Tephra2 Inversion
06 Jun 2018tephra2utils
tephra2
inversion
codes
utilities
The current set of functions helps running the advection-diffusion model Tephra2 in inversion mode to estimate the best eruption source parameters (ESP) of a tephra deposit. The functions contain two sections. A first section contains a mixture of bash and python scripts for running the inversion of OpenPBS and SLURM clusters. A second section contains Matlab scripts for processing the inversion output designed to help its interpretation.
Updates - November 2018
The inversion scripts have been updated in November 2018. To find the scripts to post-process runs performed before this date, please download the previous version here.
Table of content
Introversion
The inversion searches for the set of ESP that best reproduce observed values of tephra accumulations (kg/m2) with the computed values produced by Tephra2. As for any optimization problem, the many degrees of freedom require a critical interpretation of the inversion results. These scripts help building an empirical vision of the results in the perspective of the knowledge of the studied eruption.
For a purpose of illustration, let’s consider that the fit depends mainly on plume height and eruption mass. We will therefore attempt to find the combination of plume height and eruption mass that best reproduce the deposit. To do so, the Tephra2 uses a downhill simplex method, which searches for the minimum of a function - here the fit between observed and computed values - between user-defined ranges of [heightmin - heightmax] and [massmin - massmax].
Unfortunately, this method can suffer from numerical artifacts resulting in “fake” minima, especially when initial user-defined ranges are too large. For instance let’s consider two sets of plume heights and mass, height1, mass1 and height2, mass2, both giving a similarly good solution from a numerical perspective, but height1, mass1 being irrealistic from a volcanological point of view. We therefore need a method to subjectively discard this solution from the inversion result.
Single vs. batch runs
To avoid this problem, it is possible to run the inversion using three approaches, the single, the batch and the seed modes. The single mode searches for a minimum within the entire space defined by the ranges of plume heights and masses and consists of one single inversion run. The batch mode splits the space of plume heights and masses into smaller domains, and one inversion run is performed for each domain. The seed mode, for a lack of a better term, is pretty much the equivalent of the single mode repeated multiple times varying the seed. The seed is a number that controls the reproducability of stochastic runs, and in the case of the inversion it will control the starting point of the optimisation process. In case of unconstrained deposits, varying the seed is likely to affect the results. It is a good procedure to assess the sensitivity of the results to the input seed.
So what are the advantages and disadvantages of each method?
- Single
- Pros: Provides the most accurate minimum
- Cons: Can get stuck in local minima
- Batch
- Pros: Provides an overview of all minima on the height-mass space
- Cons: Small sub-spaces do not provide enough flexibility to accuratly define minima
- Seed
- Pros: Advantages of the single approach without disadvantages
- Cons: Euuh… Not sure, it’s pretty much my prefered approach now
Caveats
Below are a few things to keep in mind when using the inversion:
- The mass is always better constrained than the plume height
- Constraining the plume height is more difficult because of the fact that in Tephra2, the plume height is intrinsically related to the computation of parameters controlling i) the grain size distribution, ii) the distribution of mass in the plume and iii) empirical parameters
Installation
Two components are needed, both available on GitHub:
For Tephra2, as far as we are concerned here, we only need the two executables created by the compilation of Tephra2 (tephra2012_inversion and tephra2-2012). For Tephra2Utils, we need the entire
Organize your working folder like this:
ROOT ├── Inversion/ │ ├── _example_folder/ -> Example of input files │ │ ├── forwardGrid.utm │ │ ├── inversionConfig.conf │ │ ├── inversionInput.txt │ │ ├── inversionWind.txt │ │ └── runInversion.sh │ ├── _scripts/ -> Scripts used during inversion* │ │ ├── genConfig.py │ ├── _templates/ -> Templates used during inversion* │ │ ├── forwardConfTemplate.conf │ │ └── inversionConfTemplate.conf ├── dependencies/ -> Dependencies for post-processing* │ └── xlwrite/ ├── plotBetaPlume.m -> Used during post-processing* ├── plotT2.m -> Used during post-processing* ├── processT2Inversion.m -> Main post-processing function* ├── plotT2Inversion.m -> Plot results of batch and seed runs* ├── tephra2012_inversion -> Tephra2 inversion exec └── tephra2-2012 -> Tephra2 forward exec * Should not have to be edited
Getting started
Duplicate the folder
Case-study files
Replace the following files by those for your own case-study.
-
inversionInput.txt: This is the main file containing your field observations. It is a 4-columns, tab-delimited file containing:
- Column 1: Easting coordinates of the outcrop (UTM, WGS84)
- Column 2: Northing coordinates of the outcrop (UTM, WGS84)
- Column 3: Altitude (m asl). Note that Tephra2 can not accurately use the elevation. Instead, a mean elevation of all outcrops should be estimated and used as a unique value for all outcrops
- Column 4: The mass accumulation (kg/m2) estimated at each outcrop. Note that this is a mass accumulation and not a thickness
-
forwardGrid.utm: The calculation grid used by the forward solution of Tephra2, and used during the inversion process to compare the deposit modelled by the inversion with field data. You can use TephraProb to create a grid following this tutorial.
-
inversionWind.txt: If a wind observation is available for the studied eruption, it can be used to constrain the inversion. If no wind is available, ranges of wind speed and directions can be defined instead (described later), but please keep then the inversionWind.txt file in the folder. If the eruption is within the time interval provided by Reanalysis datasets, TephraProb can be used to retrieve it. The wind file should be a 3-columns, tab-delimited file containing:
- Column 1: Altitude (m asl)
- Column 2: Wind speed (m/s)
- Column 3: Wind direction (i.e. the direction the wind blows to; degrees from N)
Configuration file
The main inversion file is
Below is a summary of the different sections to fill in.
Inversion behaviour
This section controls the general behaviour of the inversion runs.
Variable | Description |
---|---|
BATCH | Enter 0 for single runs, 1 for batch runs and 2 for seed rund |
fixedWind | Enter 0 to use ranges of wind direction/speed, 1 to use a wind profile |
SEED | Used to control the seed for reproducibility. To randomely choose a seed, enter |
Input files
This section defines the path to the input files previously described.
Variable | Description |
---|---|
inputFile | File containing the field observations |
windFile | File containing the wind profile. Leave |
gridFile | Path to the grid file |
Vent properties
This section defines the vent geometry. Coordinates are in UTM and should be projected on the same zone as the field data.
Variable | Description |
---|---|
ventE | Vent easting (UTM, WGS84) |
ventN | Vent northin (UTM, WGS84) |
ventA | Vent elevation (m asl) |
ventZ | UTM zone of the vent. Only the numeric part, negative in S hemisphere |
Ranges to invert
This section contains all the variable parameters that are optimized during the inversion. Each parameter contains min/max values.
Variable | Description |
---|---|
Ht | Plume heights (m asl) |
Mass | Total mass (log10, kg). Used only when BATCH=0 |
Diff | Diffusion coefficient for large particles (m2/s) |
FTT | Fall-time threshold(s) |
MedPhi | Median of the TGSD (Phi). Note that minMedPhi is the coarsest diameter |
SigPhi | Standard deviation of the TGSD (Phi) |
Alpha | Alpha parameter of the Beta function controlling the mass distribution in the plume |
Beta | Beta parameter of the Beta function controlling the mass distribution in the plume |
WindSpeed | Wind speed (m/s) - used when fixedWind=0 |
WindDir | Wind direction (degree from N) - used when fixedWind=0 |
Constant parameters
This section contains parameters that are kept constant during the inversion.
Variable | Description |
---|---|
lithicDensity | Density of the lithics (kg/m3) |
pumiceDensity | Density of the pumices (kg/m3) |
minPhi | Minimum bound (coarsest diameter; phi) of the TGSD |
maxPhi | Maximum bound (finest diameter; phi) of the TGSD |
Batch parameters
This section contains parameters that control the behavior of the batch inversion. It is still experimental, so it is recommended to keep the proposed parameters for now.
Variable | Description |
---|---|
deltaMass | Mass interval between each sub-run of the batch inversion (log10; kg) |
incrMass | Mass increment between each sub-run of the batch inversion (log10; kg) |
deltaHt | Height interval between each sub-run of the batch inversion (m asl) |
incrHt | Height increment between each sub-run of the batch inversion (m asl) |
Tephra2 advanced parameters
Variable | Description |
---|---|
colSteps | Number of integration steps along the height of the column |
partStep | Number of integration steps along the TGSD |
fitTest | Merit test: 0 (chi-squared test), 1 (root mean squared error), 2 (Tokyo log test) |
eddy | Eddy constant |
plumeModel | Mass distribution in the plume (2 is a beta distribution) |
windLevels | Number of wind levels to consider. Only if fixedWind=0 |
Running the inversion
Tephra2 is parallelised using OpenMPI and commonly run on clusters. Different clusters have different architectures, which will probably require some measure of editing. If your files are prepared locally, now is the time to upload everything on the server. Be sure to keep the directory tree specified before.
The main script used to run the inversion is runInversion.sh. The
Single vs. batch vs. seed runs
For parallelisation purposes, single runs, batch runs and seed runs are submitted in slightly different ways.
- For single runs, the mass range is defined in the
inversionConfig.conf file; - For batch runs use job arrays, the mass range is specified on the command line when submitting the job, which will allow each mass increment to be sent to a different node;
- For seed runs, the mass is specified in
inversionConfig.conf (as for single runs), but the number of seeds is specified on the command line when submitting the job.
OpenPBS
For a single inversion run, set BATCH=0 in
qsub runInversion.sh
For a batch inversion run with a mass range between 109 and 1011 kg, set BATCH=1 in
qsub -t 9-11 runInversion.sh
For a seed inversion performing 100 runs with a seed number varying between 1-100, type:
qsub -t 1-100 runInversion.sh
Post-processing
Upon successful (yey!) completion of an inversion run, one (single run) or multiple (batch run) folders are created inside
- In the
_example_folder/ on your local computer, create a folder namedn/ , where n is the number of your inversion attempt (start at 1/ and name the subsequent folders in a continuous way). By doing so, each new inversion attempt doesn’t have to delete the previous one, and it becomes easier to track different attempts - Copy the content of
_example_folder/ on the cluster into your localn/ folder - In Matlab, run the
processInversion.m script. When asked, go and select yourn/ folder and let the script work
This process will first create four new files in each folder:
- obsVScomp.pdf: Plot of the computed values of tephra accumulation against their observed equivalents;
- plume.pdf: The plume distribution
- tephra2.out: Result of the forward solution
- wind.pdf: Wind profile
Additionally, each inversion attempt is recorded in an Excel file at the root of the
Output figures
1. Spaces of solutions
In the case of batch or seed runs, the function
2. Everything against everything!
In the case of seed runs, the script produces a matrix plot (type
- It allows indentifying populations of solutions from the histograms
- It allows indentifying relationships between parameters