diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index d203cbae..fa45fde0 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -55,10 +55,9 @@ You can see your local version by using a web-browser to navigate to `http://loc

[gh-fork-pull]: https://reflectoring.io/github-fork-and-pull/

-```eval_rst
-.. toctree::
-    :hidden:
+```{toctree}
+:hidden:

-    CONDUCT.md
-    LICENSE.md
+CONDUCT.md
+LICENSE.md
```
diff --git a/README.md b/README.md
index 5bb4c185..8d0a63e5 100644
--- a/README.md
+++ b/README.md
@@ -1,33 +1,39 @@
# The LHCb Starterkit lessons

![Build Status](https://github.com/lhcb/starterkit-lessons/actions/workflows/build.yml/badge.svg)
+
These are the lessons historically taught during the [LHCb Starterkit][starterkit].
These lessons focus on LHCb software from Runs 1 and 2. For Run 3, the software changed a lot and a [new set of Starterkit lessons][run3-starterkit] was written.
If you'd like to join the next workshop, visit [the website][starterkit] to find out when that will be and how to sign up.
+
If you'd just like to learn about how to use the LHCb software, [read on](first-analysis-steps/README.md)!
+
[starterkit]: https://lhcb.github.io/starterkit
[run3-starterkit]: https://lhcb-starterkit-run3.docs.cern.ch/
[first-analysis-steps]: https://lhcb.github.io/starterkit-lessons/first-analysis-steps/

-```eval_rst
-.. toctree::
-    :maxdepth: 3
-    :includehidden:
-    :caption: Contents:
-
-    first-analysis-steps/README.md
-    second-analysis-steps/README.md
-    self-guided-lessons/README.md
-    CONTRIBUTING.md
+```{toctree}
+:maxdepth: 3
+:includehidden:
+:caption: Contents:

-.. toctree::
-    :maxdepth: 2
-    :includehidden:
-    :caption: External links:

-    interesting-links/analysis-essentials.md
-    LHCb Core documentation
-    LHCb glossary
+first-analysis-steps/README
+second-analysis-steps/README
+self-guided-lessons/README
+CONTRIBUTING
```
+
+
+```{toctree}
+:maxdepth: 2
+:includehidden:
+:caption: External links:
+
+
+interesting-links/analysis-essentials
+LHCb Core documentation
+LHCb glossary
+```
\ No newline at end of file
diff --git a/first-analysis-steps/README.md b/first-analysis-steps/README.md
index 90683735..72ec40b2 100644
--- a/first-analysis-steps/README.md
+++ b/first-analysis-steps/README.md
@@ -13,33 +13,32 @@ The [analysis essentials course](https://hsf-training.github.io/analysis-essenti

{% endprereq %}

-```eval_rst
-.. 
toctree:: - :hidden: - :caption: Contents: - - prerequisites.md - introduction-to-course.md - physics-at-lhcb.md - dataflow.md - run-2-data-flow.md - analysisflow.md - davinci.md - bookkeeping.md - files-from-grid.md - interactive-dst.md - minimal-dv-job.md - loki-functors.md - add-tupletools.md - decay-tree-fitter.md - analysis-productions.md - davinci-grid.md - split-jobs.md - ganga-data.md - eos-storage.md - lhcb-dev.md - dataflow-run3.md - asking-questions.md - ecgd.md - contributing-lesson.md +```{toctree} +:hidden: +:caption: Contents: + +prerequisites.md +introduction-to-course.md +physics-at-lhcb.md +dataflow.md +run-2-data-flow.md +analysisflow.md +davinci.md +bookkeeping.md +files-from-grid.md +interactive-dst.md +minimal-dv-job.md +loki-functors.md +add-tupletools.md +decay-tree-fitter.md +analysis-productions.md +davinci-grid.md +split-jobs.md +ganga-data.md +eos-storage.md +lhcb-dev.md +dataflow-run3.md +asking-questions.md +ecgd.md +contributing-lesson.md ``` diff --git a/first-analysis-steps/add-tupletools.md b/first-analysis-steps/add-tupletools.md index 492bfe56..c50b81c4 100644 --- a/first-analysis-steps/add-tupletools.md +++ b/first-analysis-steps/add-tupletools.md @@ -164,7 +164,7 @@ To add LoKi-based leaves to the tree, we need to use the `LoKi::Hybrid::TupleToo 1. Its *name*, specified in the `addTupleTool` call after a `/`. This is very useful (and recommended) if we want to have different `LoKi::Hybrid::TupleTool` for each of our branches. For instance, we may - want to add different information for the D*, the D0 and the soft `$ \pi $`: + want to add different information for the D*, the D0 and the soft $\pi$: ```python dstar_hybrid = dtt.Dstar.addTupleTool('LoKi::Hybrid::TupleTool/LoKi_Dstar') diff --git a/first-analysis-steps/analysis-productions.md b/first-analysis-steps/analysis-productions.md index 95be5ffd..0b00544c 100644 --- a/first-analysis-steps/analysis-productions.md +++ b/first-analysis-steps/analysis-productions.md @@ -59,7 +59,7 @@ Before making any edits, you should create a branch for your changes, and switch git checkout -b ${USER}/starterkit-practice ``` -Now we need to create a folder to store all the things we're going to add for our new production. For this practice production, we'll continue with the `$ B^+ \to (J/\psi \to \mu^+ \mu^-) K^+ $` decays used in the previous few lessons, so we should name the folder appropriately: +Now we need to create a folder to store all the things we're going to add for our new production. For this practice production, we'll continue with the $B^+ \to (J/\psi \to \mu^+ \mu^-) K^+$ decays used in the previous few lessons, so we should name the folder appropriately: ```bash mkdir starterkit @@ -99,7 +99,7 @@ Bu2JpsiK_24c4_MagDown: Here, the unindented lines are the names of jobs (although `defaults` has a special function), and the indented lines are the options we're applying to those jobs. Using this file will create one job called `Bu2JpsiK_24c4_MagDown`, that will read in data from the provided bookkeeping path. All the options applied under `defaults` are automatically applied to all other jobs - very useful for avoiding repetition. The options we're using here are copied from the Run 3 DaVinci lesson: * **application**: the version of DaVinci to use. Here we choose v64r12, see [here](http://lhcbdoc.web.cern.ch/lhcbdoc/davinci/) to check what versions are available. -* **wg**: the working group this production is a part of. 
Since this is a `$ B^+ \to (J/\psi \to \mu^+ \mu^-) K^+ $` decay, we'll set this to `B2CC`. +* **wg**: the working group this production is a part of. Since this is a $B^+ \to (J/\psi \to \mu^+ \mu^-) K^+$ decay, we'll set this to `B2CC`. * **inform**: optionally, you can enter your email address to receive updates on the status of your jobs. * **options**: the settings to use when running DaVinci. These are copied from the Run 3 DaVinci lesson. * **output**: the name of the output `.root` ntuples. These will get registered in bookkeeping as well. diff --git a/first-analysis-steps/analysisflow.md b/first-analysis-steps/analysisflow.md index 613683f1..7e610601 100644 --- a/first-analysis-steps/analysisflow.md +++ b/first-analysis-steps/analysisflow.md @@ -20,7 +20,7 @@ This is done using the software package called DaVinci. {% endcallout %} -### Getting data files +## Getting data files After preselecting data either in the Stripping, Sprucing or triggering step, users can produce ROOT files containing _ntuples_, running the DaVinci application. An ntuple is a (often complex) data structure typically stored within a (ROOT) file, which contains information about events or candidates in the data sample, such as the candidate mass or trigger decision flags. @@ -39,7 +39,7 @@ We will discuss the concept of analysis preservation a bit later in this lesson. In first analysis steps we cover both running DaVinci on [Ganga](https://lhcb.github.io/starterkit-lessons/first-analysis-steps/davinci-grid.html) and via [Analysis Productions](https://lhcb.github.io/starterkit-lessons/first-analysis-steps/analysis-productions.html). -### Useful high energy physics analysis tools +## Useful high energy physics analysis tools After getting the ntuples a user usually develops new analysis code or expands an existing code, that their collaborators use. Analysis code is usually based on the popular high-energy physics software tools or on the more general data analysis tools, like [numpy](https://numpy.org/) or [pandas](https://pandas.pydata.org/). @@ -66,7 +66,7 @@ This list is by no means exhaustive, so if there are any other tools you use oft Discussions on the new analysis tools that might be useful for the LHCb community are held in the [Work Package 4](https://lhcb-dpa.web.cern.ch/lhcb-dpa/wp4/index.html) of the Data Processing & Analysis project (DPA). -### Analysis Preservation +## Analysis Preservation When the samples are ready one can proceed with developing the necessary macros and scripts to perform the analysis steps, such as applying additional selections, fitting distributions, computing efficiencies and acceptances, etc. Starting from the ntuples a typical analysis will consist of the following steps: diff --git a/first-analysis-steps/bookkeeping.md b/first-analysis-steps/bookkeeping.md index 35ee22d2..eed8024f 100644 --- a/first-analysis-steps/bookkeeping.md +++ b/first-analysis-steps/bookkeeping.md @@ -10,8 +10,8 @@ After this, a tree of various application and processing versions will eventually lead to the data you need. So, before we can run our first DaVinci job we need to locate some events. In -this tutorial we will use the decay `$ D^{* +} \to D^{0}\pi^{+} $` as an example, -where the `$ D^{0} $` decays to `$ K^{-} K^{+} $`. +this tutorial we will use the decay $ D^{* +} \to D^{0}\pi^{+} $ as an example, +where the $D^{0}$ decays to $K^{-} K^{+}$. {% objectives "Learning Objectives" %} @@ -40,8 +40,8 @@ representation of the [event type](https://cds.cern.ch/record/855452?ln=en). 
The text is the human-readable version of that.

-This sample of simulated events will only contain events where a `$ D^{* +} \to
-D^{0}(\to K^{-}K^{+})\pi^{+} $` was generated within the LHCb acceptance,
+This sample of simulated events will only contain events where a $D^{* +} \to
+D^{0}(\to K^{-}K^{+})\pi^{+}$ was generated within the LHCb acceptance,
although the decay might not have been fully reconstructed. (Not all
simulated samples have the same requirements made on the signal decay.)

@@ -109,9 +109,9 @@ by typing this path and pressing the `Go` button.

Think of a decay and try to find a Monte Carlo sample for it. You could use
the decay that your analysis is about, or if you don't have any ideas you
-could look for the semileptonic decay `$ \Lambda_{b}^{0} \to
-\Lambda_{c}^{+}\mu^{-}\bar{\nu}_{\mu} $`, where the `$ \Lambda_{c}^{+} $` decays
-to `$ pK^{-}\pi^{+} $`.
+could look for the semileptonic decay $\Lambda_{b}^{0} \to
+\Lambda_{c}^{+}\mu^{-}\bar{\nu}_{\mu}$, where the $\Lambda_{c}^{+}$ decays
+to $pK^{-}\pi^{+}$.

{% endchallenge %}

diff --git a/first-analysis-steps/dataflow-run3.md b/first-analysis-steps/dataflow-run3.md
index 68e13e9e..059d3831 100644
--- a/first-analysis-steps/dataflow-run3.md
+++ b/first-analysis-steps/dataflow-run3.md
@@ -11,14 +11,14 @@ In Run 1 & 2 LHCb proved itself not only to be a high-precision heavy flavour
physics experiment, but also extended the core physics programme to many
different areas such as electroweak physics and fixed-target experiments. This incredible precision led to
-over 500 papers including breakthroughs such as the first discovery of `$ C\!P $`-violation in charm and the first observation of the decay `$ B_s^0\to \mu^+\mu^- $` among many others.
+over 500 papers, including breakthroughs such as the first discovery of $C\!P$ violation in charm and the first observation of the decay $B_s^0\to \mu^+\mu^-$, among many others.

-In order to reach even higher precision the experiment aims to take `$ 50\,\mathrm{fb}^{-1} $` of
+In order to reach even higher precision the experiment aims to take $50\,\mathrm{fb}^{-1}$ of
data in Run 3 by increasing the instantaneous luminosity by a factor of five. To be capable of dealing with the higher detector occupancy the experiment will be equipped with an entirely new set of tracking detectors with higher granularity and improved radiation tolerance. One of the most important upgrades in Run 3 will be the removal of the LHCb L0 hardware triggers. As described in the following, this will bring significant changes in the data flow of the experiment both for online and offline processing. It also means that the front-end and readout electronics of all sub-detectors, as well as the photodetectors of the RICH1 detector, will be replaced to be able
-to operate at the bunch crossing rate of `$ 40\,\mathrm{MHz} $`, as well as the photodetectors of the RICH1 detector.
+to operate at the bunch crossing rate of $40\,\mathrm{MHz}$.

## Upgrade of the LHCb trigger system

The trigger layout for Run 3 will look like this:

!["Data processing chain for Run 3"](img/hidef_RTA_dataflow_widescreen.png)

-The LHCb trigger system will be fully redesigned by removing the L0 hardware trigger and moving to a fully-software based trigger. The hardware trigger has a rate limit of 1 MHz, which would be a limitation with the increase in luminosity. 
Such a low rate could be only achieved by having tight hardware trigger thresholds on `$ p_\mathrm{T} $` and `$ E_\mathrm{T} $` which is inefficient especially for fully hadronic decay modes. The removal of this bottleneck means that the full detector readout as well as running the HLT1 needs to be enabled at the average non-empty bunch crossing rate in LHCb of `$ 30\,\mathrm{MHz} $`, a not so trivial computing challenge!
+The LHCb trigger system will be fully redesigned by removing the L0 hardware trigger and moving to a fully software-based trigger. The hardware trigger has a rate limit of 1 MHz, which would be a limitation with the increase in luminosity. Such a low rate could only be achieved by having tight hardware trigger thresholds on $p_\mathrm{T}$ and $E_\mathrm{T}$, which is inefficient, especially for fully hadronic decay modes. The removal of this bottleneck means that the full detector readout as well as running the HLT1 needs to be enabled at the average non-empty bunch crossing rate in LHCb of $30\,\mathrm{MHz}$, a not so trivial computing challenge!

As we saw already in the [Run 1 and Run 2 dataflow lecture](https://lhcb.github.io/starterkit-lessons/first-analysis-steps/dataflow.html), the software trigger is implemented in two steps: HLT1, which performs partial event reconstruction and simple trigger decisions to reduce the data rate, and HLT2, which performs the more computationally expensive full reconstruction and complete trigger selection. One of the most important tasks of building the events is track reconstruction, which is an inherently parallelizable process. For this purpose, HLT1 in Run 3 is implemented as part of the `Allen` project and run on GPUs.

@@ -38,7 +38,7 @@ Having the HLT1 run on GPUs imposes some different requirements on the code deve

The raw data of events selected by HLT1 is passed on to the buffer system and stored there. The buffering of events enables running the real-time alignment and calibration before events enter HLT2. This is crucial because in this way the calibration and alignment constants obtained can be used in the full event reconstruction performed in HLT2. For the determination of these constants, HLT1 selects dedicated calibration samples. More information about the real-time alignment and calibration can be found in the [Upgrade Alignment TWIKI page](https://twiki.cern.ch/twiki/bin/view/LHCb/AlignmentInUpgrade).

-The total bandwidth that can be saved from HLT2 to tape is limited to `$ 10\,\mathrm{GB}/s $`. An important change in the HLT2 selections with respect to the Run 2 will be the increased use of the Turbo model. Wherever possible, the Turbo will be the baseline, so that in total for approximately 2/3 of data only the data of the signal candidate (raw and reconstructed) will be saved and no further offline reconstruction will be possible. This results in significantly smaller event sizes so that more events can be saved.
+The total bandwidth that can be saved from HLT2 to tape is limited to $10\,\mathrm{GB}/s$. An important change in the HLT2 selections with respect to Run 2 will be the increased use of the Turbo model. Wherever possible, Turbo will be the baseline, so that in total for approximately 2/3 of the data only the data of the signal candidate (raw and reconstructed) will be saved and no further offline reconstruction will be possible. This results in significantly smaller event sizes so that more events can be saved.
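To get a feeling for what the smaller Turbo events buy, here is a back-of-envelope sketch; the event sizes are illustrative assumptions, not official figures:

```python
# Rough sketch: events per second that fit into a fixed 10 GB/s HLT2
# output bandwidth, for two assumed (illustrative!) event sizes.
BANDWIDTH = 10e9  # bytes per second to tape

for model, event_size in [("Full stream", 100e3), ("Turbo", 15e3)]:
    rate = BANDWIDTH / event_size  # events per second at this event size
    print(f"{model:>11}: ~{rate / 1e3:.0f} kHz at {event_size / 1e3:.0f} kB per event")
```

The real sizes vary by stream and selection, but the point stands: a several-times-smaller event size translates directly into a several-times-higher recordable event rate.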
For details and tutorials on how to develop HLT2 line selections, as well as how to check their efficiencies and data rates, follow the [Moore documentation](https://lhcbdoc.web.cern.ch/lhcbdoc/moore/master/index.html).

diff --git a/first-analysis-steps/decay-tree-fitter.md b/first-analysis-steps/decay-tree-fitter.md
index 828cf434..56482715 100644
--- a/first-analysis-steps/decay-tree-fitter.md
+++ b/first-analysis-steps/decay-tree-fitter.md
@@ -111,7 +111,7 @@ a vertex fit acts like a vertex constraint, improving the opening-angle resoluti

{% endcallout %}

-Now let us look at the refitted mass of the `$ D^{*+} $`, with the `$ D^0 $` constrained to its nominal mass.
+Now let us look at the refitted mass of the $D^{*+}$, with the $D^0$ constrained to its nominal mass.
It is stored in the variable `Dstar_ConsD_M`. If you plot this you
will note that some values are unphysical. So, let's restrict the range we
look at to something that makes sense.

diff --git a/first-analysis-steps/interactive-dst.md b/first-analysis-steps/interactive-dst.md
index d3981d99..aa23c074 100644
--- a/first-analysis-steps/interactive-dst.md
+++ b/first-analysis-steps/interactive-dst.md
@@ -176,9 +176,9 @@ one with:

>>> print(candidates[0])
```

This will print out some information about the [Particle](https://lhcb-doxygen.web.cern.ch/lhcb-doxygen/davinci/latest/d0/d13/class_l_h_cb_1_1_particle.html). In our case a $D^{* +}$ ([particle ID number](http://pdg.lbl.gov/2019/reviews/rpp2018-rev-monte-carlo-numbering.pdf) 413). You can access its decay products with
`candidates[0].daughtersVector()[0]` and `candidates[0].daughtersVector()[1]`,
-which will be a `$ D^{0} $` and a `$ \pi^{+} $`.
+which will be a $D^{0}$ and a $\pi^{+}$.

There is a useful tool for printing out decay trees; you can pass it the top-level particle and it will print out the full decay tree:

diff --git a/first-analysis-steps/loki-functors.md b/first-analysis-steps/loki-functors.md
index dbeb3f11..8f5d0e99 100644
--- a/first-analysis-steps/loki-functors.md
+++ b/first-analysis-steps/loki-functors.md
@@ -66,7 +66,7 @@ candidate = candidates[0]

The object `candidate`, loaded from the DST, is of type `LHCb::Particle` and we are looking at its representation via Python bindings. We can do `help(candidate)` to find out which functions are available.

-We can try to get very simple properties of the `$ D^{* +} $` candidate. Let's start from the components of its momentum.
+We can try to get very simple properties of the $D^{* +}$ candidate. Let's start from the components of its momentum.
This can be done by calling the function `momentum()` for our candidate in the following way:
```python
p_x = candidate.momentum().X()
@@ -134,8 +134,8 @@ print (PT(candidate)/GeV)

{% endcallout %}

-If we want to get the properties of the `$ D^{* +} $` vertex, for example its fit
-quality (`$ \chi^2 $`), we need to pass a vertex object to the vertex functor.
+If we want to get the properties of the $D^{* +}$ vertex, for example its fit
+quality ($\chi^2$), we need to pass a vertex object to the vertex functor. 
```python
from LoKiPhys.decorators import VCHI2
@@ -219,11 +219,11 @@ The list can be overwhelming, so it's also worth checking a more curated selecti

{% endcallout %}

So far we've only looked at the properties of the head of the decay (that is,
-the `$ D^{* +} $`), but what if we want to get information about its decay products? As
+the $D^{* +}$), but what if we want to get information about its decay products? As
an example, let's get the largest transverse momentum of the final state
particles. A simple solution would be to navigate the tree and calculate the maximum
-`$ p_{\text{T}} $`.
+$p_{\text{T}}$.

```python
def find_tracks(particle):
@@ -311,7 +311,7 @@ Out[10]:
```
we know that `D0` is the first child and `pi+` is the second.
-Therefore, to access the mass of the `$ D^{0} $` we have 2 options:
+Therefore, to access the mass of the $D^{0}$ we have two options:
```python
from LoKiPhys.decorators import CHILD
# Option 1
@@ -328,7 +328,7 @@ Evaluate the quality of the D0 decay vertex.

{% endchallenge %}

-In the similar way, we may access properties of child of the child: for example, a kaon from the `$ D^{0} $` decay:
+In a similar way, we may access properties of a child of the child: for example, a kaon from the $D^{0}$ decay:
```python
from LoKiPhys.decorators import CHILD
mass_kaon = CHILD(CHILD(M, 1),1)(candidate)
diff --git a/first-analysis-steps/physics-at-lhcb.md b/first-analysis-steps/physics-at-lhcb.md
index b5b426a5..abec11c4 100644
--- a/first-analysis-steps/physics-at-lhcb.md
+++ b/first-analysis-steps/physics-at-lhcb.md
@@ -25,12 +25,12 @@ fraction can traverse the full detector, and so we typically consider these
particles as ‘stable’. Unstable objects, with much shorter lifetimes, are
formed as combinations of these ‘stable’ particles[^1]:

-1. Charged pions `$ \pi^{\pm} $`
-2. Charged kaons `$ K^{\pm} $`
-3. Protons `$ p/\bar{p} $`
-4. Electrons `$ e^{\pm} $`
-5. Muons `$ \mu^{\pm} $`
-6. Photons `$ \gamma $`
+1. Charged pions $\pi^{\pm}$
+2. Charged kaons $K^{\pm}$
+3. Protons $p/\bar{p}$
+4. Electrons $e^{\pm}$
+5. Muons $\mu^{\pm}$
+6. Photons $\gamma$
7. Deuterons ([deuterium nuclei][deuterium])

Many properties of these objects, such as their momentum and charge, are
@@ -75,8 +75,8 @@ a real trajectory.

Given this, we can never know anything with complete certainty. Instead we
must infer properties _statistically_ based on ensembles of objects and
events.

-Let’s say we want to count the number of [`$ J/\psi $` mesons][pdgjpsi] that
-decay to two muons, `$ \mu^{+}\mu^{-} $`. This proceeds via three steps:
+Let’s say we want to count the number of [$J/\psi$ mesons][pdgjpsi] that
+decay to two muons, $\mu^{+}\mu^{-}$. This proceeds via three steps:

1. Select tracks created by the reconstruction;
2. Create pairs of oppositely-charged tracks;
@@ -86,8 +86,8 @@ decay to two muons, `$ \mu^{+}\mu^{-} $`. This proceeds via three steps:

We’ll go over each of these steps, starting with tracks produced by the
reconstruction. Naturally, there are many of these in any given event,
typically hundreds, so we begin by applying _selections_ on the tracks based on
-our physics understanding. For example, the `$ J/\psi $` is quite heavy (at
-around `$ 3.1\,\mathrm{GeV}/c^{2} $`), so we might expect its decay products to have a
+our physics understanding. For example, the $J/\psi$ is quite heavy (at
+around $3.1\,\mathrm{GeV}/c^{2}$), so we might expect its decay products to have a
higher momentum on average than objects produced from soft processes in the collision. 
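To make these steps concrete, here is a toy sketch of the selection logic in plain Python; the four-vectors and the threshold are invented numbers for illustration only, not LHCb software:

```python
import numpy as np

# Toy four-vectors (E, px, py, pz) in GeV for an opposite-sign muon pair.
# The numbers are made up so that the pair lands near the J/psi mass.
mu_plus = np.array([10.0, 2.0, 0.5, 9.785])
mu_minus = np.array([8.0, -1.2, -0.2, 7.906])


def pt(p):
    """Transverse momentum of an (E, px, py, pz) four-vector."""
    return np.hypot(p[1], p[2])


# Step 1: keep only reasonably hard tracks (threshold is illustrative).
if pt(mu_plus) > 0.5 and pt(mu_minus) > 0.5:
    # Steps 2-3: combine the pair and compute its invariant mass.
    p = mu_plus + mu_minus
    mass = np.sqrt(p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2)
    print(f"dimuon invariant mass: {mass:.2f} GeV")  # ~3.2 GeV here
```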
The reconstruction gives us some probability-like information for a given track to have been created by a true muon, so we might @@ -97,21 +97,21 @@ With a reduced set of a tracks, we can create all pairs of opposite-sign muons. We know that a real particle decay happens at a point in space, so we could require that the distance of closest approach between the two muons does not exceed some maximum value. We could further require that the invariant mass of -the dimuon combination is close to the known mass of the `$ J/\psi $`, invoking +the dimuon combination is close to the known mass of the $J/\psi$, invoking the conservation of momentum. -With selected dimuon pairs, we can _fit_ a `$ J/\psi \to \mu^{+}\mu^{-} $` decay +With selected dimuon pairs, we can _fit_ a $J/\psi \to \mu^{+}\mu^{-}$ decay vertex. This is done by expressing the hypothesis that there is a common origin vertex of both tracks as an optimisation problem, and then varying the measured -`$ \mu^{+} $` and `$ \mu^{-} $` four-momenta within their measured uncertainties to +$\mu^{+}$ and $\mu^{-}$ four-momenta within their measured uncertainties to best fit that hypothesis. The result is a vertex object which has, for example, -a fit `$ \chi^{2} $` associated to it. The quality of the fit can be used in a +a fit $\chi^{2}$ associated to it. The quality of the fit can be used in a selection. Finally, with the fitted muon four-vectors, we can form the four-vector of the -`$ J/\psi $` as their sum, creating the `$ J/\psi $` _candidate_[^2]. Now we get -back to our original goal of measuring the number of true `$ J/\psi \to -\mu^{+}\mu^{-} $`. By plotting the `$ J/\psi $` invariant mass values as a +$J/\psi$ as their sum, creating the $J/\psi$ _candidate_[^2]. Now we get +back to our original goal of measuring the number of true $J/\psi \to +\mu^{+}\mu^{-}$. By plotting the $J/\psi$ invariant mass values as a histogram, we might hope to see a signal component. An example is shown in the following plot, created using simulated toy data. @@ -124,17 +124,17 @@ decays, within some uncertainty. ## Building more complex decays -With a set of `$ J/\psi $` candidates, we could build more complex decay chains -such as `$ B_{s}^{0} \to J/\psi\phi(1020) $`. -The `$ J/\psi $` and the [`$ \phi $` meson][pdgphi] can decay in various ways, -but we might choose to reconstruct them in the `$ \mu^{+}\mu^{-} $` and -`$ K^{+}K^{-} $` final states, respectively. -The `$ \phi \to K^{+}K^{-} $` candidates can then be reconstructed in a similar -manner to that described previously for the `$ J/\psi $` decay. -With a set of `$ J/\psi $` and `$ \phi $` candidates, we can then build -[`$ B_{s}^{0} $` meson][pdgbs] candidates by combining the two decay products in +With a set of $J/\psi$ candidates, we could build more complex decay chains +such as $B_{s}^{0} \to J/\psi\phi(1020)$. +The $J/\psi$ and the [$\phi$ meson][pdgphi] can decay in various ways, +but we might choose to reconstruct them in the $\mu^{+}\mu^{-}$ and +$K^{+}K^{-}$ final states, respectively. +The $\phi \to K^{+}K^{-}$ candidates can then be reconstructed in a similar +manner to that described previously for the $J/\psi$ decay. +With a set of $J/\psi$ and $\phi$ candidates, we can then build +[$B_{s}^{0}$ meson][pdgbs] candidates by combining the two decay products in another vertex fit. If our selection is clean enough, we may then see a -`$ B_s^{0} $` signal peak, shown below (again, using toy data). +$B_s^{0}$ signal peak, shown below (again, using toy data). 
![Four-body invariant mass spectrum](img/jpsiphi_mass.png)

@@ -150,6 +150,6 @@ properties that are relevant for your analysis.

[detdesc]: https://doi.org/10.1088/1748-0221/3/08/S08005
[run1perf]: https://arxiv.org/abs/1412.6352

-[^1]: Other ‘stable’ particles under this definition include neutrons `$ n $` and the long-lived neutral kaon weak eigenstate `$ K_{\mathrm{L}}^{0} $`, but these are not part of standard reconstruction output.
+[^1]: Other ‘stable’ particles under this definition include neutrons $n$ and the long-lived neutral kaon weak eigenstate $K_{\mathrm{L}}^{0}$, but these are not part of standard reconstruction output.
[^2]: ‘Candidate’ because, again, we never know anything with complete certainty; this could be a combination of muons that just happen to pass our selection criteria.
diff --git a/second-analysis-steps/README.md b/second-analysis-steps/README.md
index f4511039..597403f2 100644
--- a/second-analysis-steps/README.md
+++ b/second-analysis-steps/README.md
@@ -18,22 +18,21 @@ Before starting, you should be familiar with the [first analysis steps](/first-a

[lessons-issues]: https://github.com/lhcb/starterkit-lessons/issues
[lessons-repo]: https://github.com/lhcb/starterkit-lessons

-```eval_rst
-.. toctree::
-    :hidden:
-    :caption: Contents:
-
-    lb-git.md
-    building-decays.md
-    fixing-errors.md
-    rerun-stripping.md
-    switch-mass-hypo.md
-    filter-in-trees.md
-    simulation.md
-    hlt-intro.md
-    tistos-diy.md
-    ganga-scripting.md
-    managing-files-with-ganga.md
-    advanced-dirac.md
-    containers.md
+```{toctree}
+:hidden:
+:caption: Contents:
+
+lb-git.md
+building-decays.md
+fixing-errors.md
+rerun-stripping.md
+switch-mass-hypo.md
+filter-in-trees.md
+simulation.md
+hlt-intro.md
+tistos-diy.md
+ganga-scripting.md
+managing-files-with-ganga.md
+advanced-dirac.md
+containers.md
```
diff --git a/second-analysis-steps/advanced-dirac.md b/second-analysis-steps/advanced-dirac.md
index 2c516dd0..e7ce995d 100644
--- a/second-analysis-steps/advanced-dirac.md
+++ b/second-analysis-steps/advanced-dirac.md
@@ -16,7 +16,7 @@ This tutorial will be based on a couple of python files. Please download the fol

{% endcallout %}

-### GangaTasks
+## GangaTasks

The first and most important package to introduce is GangaTasks. This package is designed to stop busy analysts from spending more time managing GRID jobs than working on physics. It has the following core features.

@@ -150,7 +150,7 @@ tasks(task_num).overview()

{% endcallout %}

-### Alternative Backends - DIRAC (Python [bugged](https://github.com/ganga-devs/ganga/pull/1896))
+## Alternative Backends - DIRAC (Python [bugged](https://github.com/ganga-devs/ganga/pull/1896))

So far we have only run Tasks on the `Localhost`. Naturally this will not be appropriate for many of the jobs you will need to do. So firstly let's get our Python scripts running on `DIRAC` rather than `Localhost`. First we need to ensure that our `DIRAC` submission can access lb-conda. This is done using `Tags`, which allow us to configure the behind-the-scenes behaviour of `DIRAC`. As such we need to add the following snippet to our code.

```
source /cvmfs/lhcb.cern.ch/lib/LbEnv
```

This is needed since any sites that are not at CERN will not source it by default.

-### Alternative Backends - DIRAC (DaVinci)
+## Alternative Backends - DIRAC (DaVinci)

As you can imagine, it is also useful to be able to include DaVinci jobs as Transforms in certain analysis chains. As mentioned earlier, Transforms have the following advantages over traditional jobs. 
@@ -192,7 +192,7 @@ trf1.backend = Dirac()

For more details on how to prepare DaVinci jobs for GRID submission please refer to the [Running DaVinci on the GRID](../first-analysis-steps/davinci-grid.md) lesson.

-### Alternative Backends - Condor
+## Alternative Backends - Condor

Transforms can also be set to run on the Condor backend. For those of you familiar with Condor, you should recognise the `requirements` object that allows you to set requirements for host selection. These include `opsys`, `arch`, `memory` and others, and can be inspected directly through the `IPython` interface. Changes to the choice of Condor universe can also be made directly by changing the contents of `backend.universe`. An example of using the Condor backend is as follows.

diff --git a/second-analysis-steps/building-decays-part0.md b/second-analysis-steps/building-decays-part0.md
index 014aeef9..a31e64f5 100644
--- a/second-analysis-steps/building-decays-part0.md
+++ b/second-analysis-steps/building-decays-part0.md
@@ -1,4 +1,4 @@
-## The Selection Framework
+# The Selection Framework

{% objectives "Learning Objectives" %}

@@ -10,13 +10,13 @@ In order to perform most physics analyses we need to build a *decay chain* with
reconstructed particles that represents the physics process we want to study. In LHCb, this decay chain can be built through `LHCb::Particle` and `LHCb::MCParticle` objects that represent individual particles and contain links to their children, also represented by the same type of object.

We'll learn all the concepts involved by running through our usual full example of the $D^\ast\rightarrow D^0(\rightarrow K^{-} K^{+}) \pi$ decay chain.

-The LHCb approach to building decays is from the bottom up. Therefore, to build `$ D^\ast\rightarrow D^0(\rightarrow K^{-} K^{+}) \pi $` we need to
+The LHCb approach to building decays is from the bottom up. Therefore, to build $D^\ast\rightarrow D^0(\rightarrow K^{-} K^{+}) \pi$ we need to

 1. Get input pions and kaons and filter them according to our physics needs.
- 2. Combine two kaons to build a `$ D^0 $`, and apply selection cuts to it.
- 3. Combine this `$ D^0 $` with a pion to build the `$ D^\ast $`, again filtering when necessary.
+ 2. Combine two kaons to build a $D^0$, and apply selection cuts to it.
+ 3. Combine this $D^0$ with a pion to build the $D^\ast$, again filtering when necessary.

To do that, we need to know a little bit more about how the LHCb analysis framework works.

diff --git a/second-analysis-steps/building-decays-part1.md b/second-analysis-steps/building-decays-part1.md
index ae381a78..0daca02f 100644
--- a/second-analysis-steps/building-decays-part1.md
+++ b/second-analysis-steps/building-decays-part1.md
@@ -1,4 +1,4 @@
-## A Historical Approach
+# A Historical Approach

{% objectives "Learning Objectives" %}

@@ -17,7 +17,7 @@ At the end of this lesson its shortcomings will be highlighted and a better way

{% endcallout %}

Now we'll learn to apply the concepts of the Selection Framework by running through a full example:
-using the DST files from the [Downloading a file from the Grid](../first-analysis-steps/files-from-grid.md) lesson, we will build our own `$ D^\ast\rightarrow D^0(\rightarrow K^{-} K^{+}) \pi $` decay chain from scratch. 
+using the DST files from the [Downloading a file from the Grid](../first-analysis-steps/files-from-grid.md) lesson, we will build our own $D^\ast\rightarrow D^0(\rightarrow K^{-} K^{+}) \pi$ decay chain from scratch.
Get your [LoKi skills](../first-analysis-steps/loki-functors.md) ready and let's start.

{% callout "Getting started" %}

@@ -59,7 +59,7 @@ Kaons = AutomaticData('Phys/StdAllLooseKaons/Particles')

{% endcallout %}

Once we have the input kaons, we can combine them to build a $D^0$ by means of the `CombineParticles` algorithm.
This algorithm performs the combinatorics for us according to a given decay descriptor and puts the resulting particles in the TES, also allowing us to apply some cuts on them:

- `DaughtersCuts` is a dictionary that maps each child particle type to a LoKi
@@ -127,7 +127,7 @@ We can already see that this two-step process (building the `CombineParticles` a

This can be simplified using a `SimpleSelection` object, which will be discussed in the next lesson. For the time being, let's finish building our candidates.

Now we can use another `CombineParticles` to build the $D^\ast$ with pions and the $D^0$'s as inputs, applying a filter only to the soft pion:

```python
dstar_decay_products = {'pi+': '(TRCHI2DOF < 3) & (PT > 100*MeV)'}
diff --git a/second-analysis-steps/building-decays-part2.md b/second-analysis-steps/building-decays-part2.md
index 01af04d4..2d776b07 100644
--- a/second-analysis-steps/building-decays-part2.md
+++ b/second-analysis-steps/building-decays-part2.md
@@ -1,4 +1,4 @@
-## Modern Selection Framework
+# Modern Selection Framework

{% objectives "Learning Objectives" %}

diff --git a/second-analysis-steps/building-decays.md b/second-analysis-steps/building-decays.md
index 6c109ea8..bc12947c 100644
--- a/second-analysis-steps/building-decays.md
+++ b/second-analysis-steps/building-decays.md
@@ -33,12 +33,11 @@ combinations in a separate step?

{% endchallenge %}

-```eval_rst
-.. toctree::
-    :maxdepth: 3
-    :caption: Contents:
-
-    building-decays-part0.md
-    building-decays-part1.md
-    building-decays-part2.md
+```{toctree}
+:maxdepth: 3
+:caption: Contents:
+
+building-decays-part0.md
+building-decays-part1.md
+building-decays-part2.md
```
diff --git a/second-analysis-steps/filter-in-trees.md b/second-analysis-steps/filter-in-trees.md
index 4b74870f..98228ef6 100644
--- a/second-analysis-steps/filter-in-trees.md
+++ b/second-analysis-steps/filter-in-trees.md
@@ -11,8 +11,8 @@ Sometimes we want to extract a portion of the decay tree in order to build a dif

To do that, we need to put the particles we're interested in in a new container so they can afterwards be used as inputs to a `CombineParticles` instance (as we saw in [the selection framework lesson](/second-analysis-steps/building-decays-part0)). To achieve this we can use the `FilterInTrees` algorithm, a simple variation of `FilterDesktop` ([doxygen](https://lhcb-doxygen.web.cern.ch/lhcb-doxygen/davinci/latest/d0/d0c/class_filter_desktop.html)).
-Let's start from the example in [the selection framework lesson](/second-analysis-steps/building-decays-part0) and let's check that the `$ K^+ $` child of the `$ D^0 $` does not come from a `$ K^{*}(892)^{0} \to K^{+}\pi^{-} $`. 
-To do that, we have to extract the `$ K^+ $` from `([D0 -> K+ K-]CC)` and combine it with all pions in `Phys/StdAllNoPIDsPions/Particles`.
+Let's start from the example in [the selection framework lesson](/second-analysis-steps/building-decays-part0) and check that the $K^+$ child of the $D^0$ does not come from a $K^{*}(892)^{0} \to K^{+}\pi^{-}$.
+To do that, we have to extract the $K^+$ from `([D0 -> K+ K-]CC)` and combine it with all pions in `Phys/StdAllNoPIDsPions/Particles`.

Using `FilterInTrees` is done in the same way we would use `FilterDesktop`:

@@ -30,7 +30,7 @@ kaons_from_d0_sel = Selection("kaons_from_d0_sel",
RequiredSelections=[DataOnDemand(Location=tesLoc)])
```

-The output of `kaons_from_d0_sel` is a container with all the kaons coming from the `$ D^0 $`.
+The output of `kaons_from_d0_sel` is a container with all the kaons coming from the $D^0$.

The final step is easy, very similar to [building your own decay](/second-analysis-steps/building-decays-part0):

diff --git a/second-analysis-steps/fixing-errors.md b/second-analysis-steps/fixing-errors.md
index 6cf75ebb..7d204702 100644
--- a/second-analysis-steps/fixing-errors.md
+++ b/second-analysis-steps/fixing-errors.md
@@ -86,7 +86,7 @@ Sel_D0 SUCCESS Number of counters : 10

It's easy to see that we have 0 input kaons and only get input pions!

-Another problem: we messed up with a cut, for example in building the `$ D^* $`,
+Another problem: we messed up a cut, for example in building the $D^*$,

```python
dstar_mother = (
diff --git a/second-analysis-steps/ganga-scripting.md b/second-analysis-steps/ganga-scripting.md
index defee1e1..6e19b4e8 100644
--- a/second-analysis-steps/ganga-scripting.md
+++ b/second-analysis-steps/ganga-scripting.md
@@ -13,7 +13,7 @@ writing job definition scripts, and exploring how we can define utility
functions that will be available across all of our Ganga sessions.

-### Defining jobs with scripts
+## Defining jobs with scripts

The `ganga` executable is similar to the `python` and `ipython` executables in a couple of ways.

@@ -182,8 +182,6 @@ The `argparse` module can do a lot, being able to parse complex sets of
arguments without much difficulty. It's a useful tool to know in general, so we
recommend that you check out the [documentation][argparse] to learn more.

-[argparse]: https://docs.python.org/2/library/argparse.html
-
{% endcallout %}

When we do supply all the necessary arguments, the values are then available in
@@ -214,7 +212,7 @@ Above we added the `--test` flag as an example: if this is `True`, you could
run the application over only a single data file, and run the job locally
rather than on the Grid (setting `j.backend` appropriately).

-### Adding helpers functions
+## Adding helper functions

We've seen above how giving a script to `ganga` makes the variables defined in
those scripts available interactively.

diff --git a/second-analysis-steps/hlt-intro.md b/second-analysis-steps/hlt-intro.md
index 0a8cbd8d..82302429 100644
--- a/second-analysis-steps/hlt-intro.md
+++ b/second-analysis-steps/hlt-intro.md
@@ -40,7 +40,7 @@ line which goes to mdst.

For Run 3, LHCb is removing the hardware trigger, and the first stage of the software trigger will run at 30 MHz. The HLT is being developed in the [Real-Time-Analysis project](https://twiki.cern.ch/twiki/bin/viewauth/LHCb/RealTimeAnalysis). 
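Before turning to the ntuple tools, it helps to keep the rough scale of each trigger stage in mind. The sketch below uses round numbers in the ballpark of the Run 2 figures; they are for orientation only and should not be quoted:

```python
# Approximate trigger-stage rates (round, unofficial numbers).
stages = [
    ("Bunch crossings", 40_000_000),  # LHC clock
    ("L0 (hardware)", 1_000_000),     # hard readout limit
    ("HLT1", 110_000),
    ("HLT2 output", 12_500),
]
for (name, rate), (_, next_rate) in zip(stages, stages[1:]):
    print(f"{name:>15} at {rate:>11,} Hz -> next stage keeps ~1/{rate // next_rate}")
```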
-### Add trigger information to your ntuple
+## Add trigger information to your ntuple

In this section and the following, we will explain how to find out which trigger lines selected your signal. Copy the following DaVinci script to get an ntuple.

@@ -122,7 +122,7 @@ Find out what the hexadecimal presentation of the Hlt1 and Hlt2 TCK is.

{% endchallenge %}

-### Exploring a TCK: List of trigger lines
+## Exploring a TCK: List of trigger lines

To get a list of all available TCKs one can use TCKsh, which is a Python shell with predefined functions to explore TCKs; do

@@ -143,7 +143,7 @@ What are the names of the topological trigger lines in Run 1 and Run 2?

{% endchallenge %}

-### Add trigger information to your ntuple, continued
+## Add trigger information to your ntuple, continued

Once you have a list of trigger lines, you can add this to `TupleToolTrigger`:
```python
triggerList = [
@@ -173,7 +173,7 @@ ttt.TriggerList = triggerList

Be aware that you have to append `Decision` to the name of the trigger line.

-### Add TISTOS information to your ntuple
+## Add TISTOS information to your ntuple

For analysis purposes it is important to know if your signal candidate was part of the trigger decision or not, as the two categories have different efficiencies. The following categories exist [ [LHCb-PUB-2014-039](https://cds.cern.ch/record/1701134/files/LHCb-PUB-2014-039.pdf) ]:
1. Triggered On Signal (TOS): events for which the presence of the signal is sufficient to generate a positive trigger decision.
2. Triggered Independent of Signal (TIS): the “rest” of the event is sufficient to generate a positive trigger decision, where the rest of the event is defined through an operational procedure consisting of removing the signal and all detector hits belonging to it.
@@ -194,7 +194,7 @@ Explain why a single particle cannot be TOS on the Hlt1TwoTrackMVA line.

{% endchallenge %}

-### Exploring a TCK: Properties of trigger lines
+## Exploring a TCK: Properties of trigger lines

If you want to get an overview of an Hlt line and its algorithms, you can do

@@ -216,7 +216,7 @@ The regex is needed to search through all algorithms connected to a line.
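To close the TIS/TOS discussion with something concrete, here is a minimal sketch of one common data-driven efficiency estimate, eff(TOS) = N(TIS and TOS) / N(TIS), evaluated on the branches that a `TupleToolTISTOS` configuration writes out. The file, tree and branch names below are hypothetical placeholders; adapt them to your own ntuple:

```python
# Minimal sketch of the data-driven TISTOS efficiency estimate.
# All names here are assumptions, not the lesson's actual output.
import uproot

tree = uproot.open("DVntuple.root:TupleDstToD0pi/DecayTree")
branches = tree.arrays(
    ["Dstar_Hlt1TrackMVADecision_TIS", "Dstar_Hlt1TrackMVADecision_TOS"],
    library="np",
)

tis = branches["Dstar_Hlt1TrackMVADecision_TIS"].astype(bool)
tos = branches["Dstar_Hlt1TrackMVADecision_TOS"].astype(bool)

# The TIS subsample is (approximately) unbiased with respect to the signal,
# so the fraction of it that is also TOS estimates the line's efficiency.
eff_tos = (tis & tos).sum() / tis.sum()
print(f"TOS efficiency measured on the TIS sample: {eff_tos:.3f}")
```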