{ "cells": [ { "cell_type": "markdown", "id": "499e16c5", "metadata": {}, "source": [ "# Chemical Analysis\n", "Chemical networks are complex systems where the interplay between many elements often means that small changes in one aspect of the network can greatly effect the outcome in unexpected ways. Nevertheless, there are cases where a simple chemical explanation can be found for some observed behaviour in the model outputs. This tutorial demonstrates how to use some of the functionality of the UCLCHEM library to analyse model outputs and discover these explanations.\n", "\n", "We do recommend caution when following the approach laid out in this tutorial. There are many pitfalls which we will try to point out as we go. Ultimately, a comprehensive view of how important a reaction is to the outcome of your model requires detailed statistical calculations such as the use of [SHAP values](https://github.com/slundberg/shap) to assign meaningful scores to how much various reactions in a network contribute to a species' abundance. Therefore, care must be taken by the user to ensure that the conclusions they draw from a simpler approach are sound. Examples of papers doing this for UCLCHEM include:\n", "- [Understanding molecular abundances in star-forming regions using interpretable machine learning Open Access](https://ui.adsabs.harvard.edu/abs/2023MNRAS.526..404H/abstract) Heyl, J., Butterworth, J., & Viti, S. 2023, MNRAS, 526, 404\n", "- [A statistical and machine learning approach to the study of astrochemistry](https://ui.adsabs.harvard.edu/abs/2023FaDi..245..569H/abstract) Heyl, J., Viti, S., & Vermariƫn, G. 2023, Faraday Discussions, 245,\n", "569\n", "- [Understanding molecular ratios in the carbon and oxygen poor outer Milky Way with interpretable machine learning](https://ui.adsabs.harvard.edu/abs/2025arXiv250508410V/abstract) Vermariƫn, G., Viti, S., Heyl, J., Fontani, F., 2025, A&A, 699, A18 \n", "\n", "We'll use an example from work that was published in 2022 [Energizing Star Formation: The Cosmic-Ray Ionization Rate in NGC 253 Derived from ALCHEMI Measurements of H3O+ and SO](https://ui.adsabs.harvard.edu/abs/2022ApJ...931...89H/abstract) to demonstrate the use of the rates coming out of UCLCHEM and how it can be used to draw conclusions about the most important reactions in a network for a given species/behaviour." ] }, { "cell_type": "code", "execution_count": 1, "id": "a561c97d", "metadata": { "execution": { "iopub.execute_input": "2026-01-23T13:19:59.467270Z", "iopub.status.busy": "2026-01-23T13:19:59.467009Z", "iopub.status.idle": "2026-01-23T13:20:01.355084Z", "shell.execute_reply": "2026-01-23T13:20:01.354228Z" } }, "outputs": [], "source": [ "import uclchem\n", "from glob import glob\n", "from joblib import Parallel, delayed\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "7a5bc54f", "metadata": {}, "source": [ "## H3O+ and SO\n", "\n", "In a piece of inference work in which we measured the cosmic ray ionization rate (CRIR) in NGC 253 [(Holdship et al. 2022)](https://ui.adsabs.harvard.edu/abs/2022arXiv220403668H/abstract). We found that both H3O+ and SO were sensitive to the ionization rate. Furthermore, since H3O+ was increased in abundance by increasing CRIR and SO was destroyed, their ratio was extremely sensitive to the rate. \n", "\n", "In the work, we present the plot below which shows how the equilibrium abundance of each species changes with the CRIR as well as the ratio. We plot this for a range of temperatures to show that this behaviour is not particularly sensitive to the gas temperature.\n", "\n", "\n", "\n", "\n", "In a sense, this is all the information we need. A complex array of reactions all compete and contribute to produce the outcome of the model. Whatever they are, we see the abundance of these species are very sensitive to the CR over a wide range of temperature and density. However, we can only trust this conclusion as far as we trust the entire chemical network since we don't know what exactly causes this behaviour.\n", "\n", "If we use the analysis function, we may find all of this is driven by a small handful of reactions. The benefit would then be that our trust in our conclusions only depends on how much we trust the rates of those specific reactions. If it is very important, we can evaluate those reactions and readers of our work can make the same decisions as information from experiment changes.\n", "\n", "### 1. Generate Test Cases\n", "Ideally, we should run a huge grid of models and find some rigorous way to evaluate how important each reaction is to the outcome of the model. However, if we run a small grid representative of different conditions then any simple chemical explanation will be evident if it exists. If it no simple explanation is appropriate, then the analysis module is not appropriate for our task and we could consider more complex approaches.\n", "\n", "Let's run a simple grid with all possible combinations of the following:\n", "- A low CRIR (zeta=1) and high CRIR (zeta=1e4)\n", "- A typical cloud density (n=1e4) and high density (n=1e6)\n", "- The lower temperature bound of NGC 253 CMZ (75 K)* and a high temperature (250 K) \n", "* The lower boundary is a bit lower, but the computational time of 50K models is a lot longer than 75K so we stick with a bit higher values for speed\n", "\n", "and that will give us enough to work with for our analysis.\n", "\n", "When we run the model and want to interact with the rates directly after running, UCLCHEM must be told to return it to the user. This \n", "can be done using both `return_dataframe=True` and `return_rates=True`. The model will then return the\n", "physics (temperature, density etc), abundances and rates as a function of time." ] }, { "cell_type": "code", "execution_count": 2, "id": "941db005", "metadata": { "execution": { "iopub.execute_input": "2026-01-23T13:20:01.357590Z", "iopub.status.busy": "2026-01-23T13:20:01.357273Z", "iopub.status.idle": "2026-01-23T13:20:01.372097Z", "shell.execute_reply": "2026-01-23T13:20:01.371424Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "8 models to run\n" ] }, { "data": { "text/html": [ "
| \n", " | temperature | \n", "density | \n", "zeta | \n", "
|---|---|---|---|
| model_0 | \n", "50.0 | \n", "10000.0 | \n", "10.0 | \n", "
| model_1 | \n", "50.0 | \n", "10000.0 | \n", "10000.0 | \n", "
| model_2 | \n", "250.0 | \n", "10000.0 | \n", "10.0 | \n", "
| model_3 | \n", "250.0 | \n", "10000.0 | \n", "10000.0 | \n", "
| model_4 | \n", "50.0 | \n", "1000000.0 | \n", "10.0 | \n", "
| model_5 | \n", "50.0 | \n", "1000000.0 | \n", "10000.0 | \n", "
| model_6 | \n", "250.0 | \n", "1000000.0 | \n", "10.0 | \n", "
| model_7 | \n", "250.0 | \n", "1000000.0 | \n", "10000.0 | \n", "
| \n", " | CH+ + H2O -> H3O+ + C | \n", "CH4 + H2O+ -> H3O+ + CH3 | \n", "CH4 + OH+ -> H3O+ + CH2 | \n", "CH4+ + H2O -> H3O+ + CH3 | \n", "CH5+ + H2O -> H3O+ + CH4 | \n", "H2 + H2O+ -> H3O+ + H | \n", "H2+ + H2O -> H3O+ + H | \n", "H2O + C2H2+ -> C2H + H3O+ | \n", "H2O + H2CO+ -> HCO + H3O+ | \n", "H2O + H2CL+ -> HCL + H3O+ | \n", "... | \n", "H2O+ + HCO -> CO + H3O+ | \n", "H3+ + CH3CHO -> H3O+ + C2H4 | \n", "H3+ + H2O -> H3O+ + H2 | \n", "H3+ + HCOOH -> H3O+ + CO + H2 | \n", "NH + H2O+ -> H3O+ + N | \n", "NH+ + H2O -> H3O+ + N | \n", "NH2+ + H2O -> H3O+ + NH | \n", "O + CH5+ -> H3O+ + CH2 | \n", "OH + H2O+ -> H3O+ + O | \n", "OH+ + H2O -> H3O+ + O | \n", "
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 42 | \n", "6.353582e-10 | \n", "1.400000e-09 | \n", "1.310000e-09 | \n", "2.848157e-09 | \n", "4.053147e-09 | \n", "0.0 | \n", "3.724513e-09 | \n", "2.409979e-10 | \n", "2.848157e-09 | \n", "2.190890e-09 | \n", "... | \n", "3.067246e-10 | \n", "1.139263e-09 | \n", "6.463126e-09 | \n", "1.971801e-09 | \n", "7.777660e-10 | \n", "1.150217e-09 | \n", "3.023429e-09 | \n", "2.200000e-10 | \n", "7.558571e-10 | \n", "1.424079e-09 | \n", "
| 43 | \n", "6.353582e-10 | \n", "1.400000e-09 | \n", "1.310000e-09 | \n", "2.848157e-09 | \n", "4.053147e-09 | \n", "0.0 | \n", "3.724513e-09 | \n", "2.409979e-10 | \n", "2.848157e-09 | \n", "2.190890e-09 | \n", "... | \n", "3.067246e-10 | \n", "1.139263e-09 | \n", "6.463126e-09 | \n", "1.971801e-09 | \n", "7.777660e-10 | \n", "1.150217e-09 | \n", "3.023429e-09 | \n", "2.200000e-10 | \n", "7.558571e-10 | \n", "1.424079e-09 | \n", "
| 44 | \n", "6.353582e-10 | \n", "1.400000e-09 | \n", "1.310000e-09 | \n", "2.848157e-09 | \n", "4.053147e-09 | \n", "0.0 | \n", "3.724513e-09 | \n", "2.409979e-10 | \n", "2.848157e-09 | \n", "2.190890e-09 | \n", "... | \n", "3.067246e-10 | \n", "1.139263e-09 | \n", "6.463126e-09 | \n", "1.971801e-09 | \n", "7.777660e-10 | \n", "1.150217e-09 | \n", "3.023429e-09 | \n", "2.200000e-10 | \n", "7.558571e-10 | \n", "1.424079e-09 | \n", "
| 45 | \n", "6.353582e-10 | \n", "1.400000e-09 | \n", "1.310000e-09 | \n", "2.848157e-09 | \n", "4.053147e-09 | \n", "0.0 | \n", "3.724513e-09 | \n", "2.409979e-10 | \n", "2.848157e-09 | \n", "2.190890e-09 | \n", "... | \n", "3.067246e-10 | \n", "1.139263e-09 | \n", "6.463126e-09 | \n", "1.971801e-09 | \n", "7.777660e-10 | \n", "1.150217e-09 | \n", "3.023429e-09 | \n", "2.200000e-10 | \n", "7.558571e-10 | \n", "1.424079e-09 | \n", "
| 46 | \n", "6.353582e-10 | \n", "1.400000e-09 | \n", "1.310000e-09 | \n", "2.848157e-09 | \n", "4.053147e-09 | \n", "0.0 | \n", "3.724513e-09 | \n", "2.409979e-10 | \n", "2.848157e-09 | \n", "2.190890e-09 | \n", "... | \n", "3.067246e-10 | \n", "1.139263e-09 | \n", "6.463126e-09 | \n", "1.971801e-09 | \n", "7.777660e-10 | \n", "1.150217e-09 | \n", "3.023429e-09 | \n", "2.200000e-10 | \n", "7.558571e-10 | \n", "1.424079e-09 | \n", "
5 rows Ć 36 columns
\n", "