
Developing and data mining with machine learning a volcanic ash database (VolcAshDB)

Damià Benet1, Fidel Costa1, Kévin Migadel1, Daniel W. J. Lee2, Claudia D'Oriano3, Massimo Pompilio3, Dini Nurfiani4, Hamdi Rifai5

  • Affiliations: 1: Institut de Physique du Globe de Paris, Université Paris Cité, Paris, France; 2: Lamont-Doherty Earth Observatory, Columbia University, New York, NY, USA; 3: Istituto Nazionale di Geofisica e Vulcanologia, sezione di Pisa, Pisa, Italy; 4: National Research and Innovation Agency (BRIN), Bandung, Indonesia; 5: Physics Department, Faculty of Mathematics and Natural Sciences, Universitas Negeri Padang, Indonesia

  • Presentation type: Poster

  • Presentation time: Monday 16:30 - 18:30, Room Poster Hall

  • Poster Board Number: 152

  • Programme No: 3.1.44

  • Theme 3 > Session 1


Abstract

Volcanic ash provides direct evidence of the interior of volcanoes during explosive eruptions, offering unique insights into the state of the volcano and likely transitions in eruptive style. Petrologists typically classify ash particles into different types: fragments from the magma driving the eruption (juvenile), older material from the volcanic edifice (lithic), weathered or hydrothermally altered material, and free crystals. However, different researchers may assign the same particle to different classes, because diagnostic particle features vary across eruptions and volcanoes. This lack of standardization makes it difficult to compare ash particles and their interpretations between eruptions and observatories. To address this, we developed a database of volcanic ash particles (VolcAshDB) with ash samples from a range of magma compositions and eruptive styles. The database contains 12,044 particles, each with a multi-focused binocular image and an array of features that characterize its shape, texture and color (available at https://volcashdb.ipgp.fr/). We applied machine learning (ML) for automatic particle classification and compared performance across algorithms of the tree-based family (Decision Trees, Random Forests, Extreme Gradient Boosting (XGBoost)), deep learning models (ResNet, ConvNeXt) and a Vision Transformer (ViT). The ViT achieved the highest accuracy, at 93%, and the XGBoost model highlighted the most diagnostic features for classifying particles from different eruptive styles. Through our web platform, users can browse the dataset and obtain visualization summaries, and we plan to allow users to contribute to the database to further improve the robustness of our models.
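As a rough illustration of how a classification accuracy such as the one quoted above can be computed and broken down by particle type, the following minimal Python sketch scores a set of predicted labels against reference labels. The function names and the toy labels are hypothetical and chosen only for illustration; they are not taken from VolcAshDB or from the authors' code, and the class names simply follow the four particle types listed in the abstract.

```python
from collections import Counter

# The four particle types named in the abstract (labels are an assumption).
CLASSES = ["juvenile", "lithic", "altered", "free crystal"]


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (reference, predicted) label pair occurs."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    return Counter(zip(true_labels, predicted_labels))


def accuracy(true_labels, predicted_labels):
    """Fraction of particles whose predicted class matches the reference class."""
    if not true_labels:
        raise ValueError("no labels to evaluate")
    counts = confusion_counts(true_labels, predicted_labels)
    correct = sum(n for (t, p), n in counts.items() if t == p)
    return correct / len(true_labels)


def per_class_recall(true_labels, predicted_labels):
    """For each class present in the reference labels, the fraction recovered."""
    counts = confusion_counts(true_labels, predicted_labels)
    totals = Counter(true_labels)
    return {c: counts[(c, c)] / totals[c] for c in CLASSES if totals[c]}


# Hypothetical toy labels, for illustration only (not VolcAshDB data).
y_true = ["juvenile", "juvenile", "lithic", "lithic",
          "altered", "altered", "free crystal", "free crystal"]
y_pred = ["juvenile", "lithic", "lithic", "lithic",
          "altered", "juvenile", "free crystal", "free crystal"]

print(accuracy(y_true, y_pred))          # 6 of 8 correct -> 0.75
print(per_class_recall(y_true, y_pred))
```

Reporting per-class recall alongside overall accuracy is one way to see whether a single figure hides weak performance on a particular particle type, which matters when classes are unevenly represented.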