A Positivity-Preserving Relaxation Algorithm

Thomas Izgin (1,3), izgin@mathematik.uni-kassel.de
Hendrik Ranocha (2), hendrik.ranocha@uni-mainz.de
Chi-Wang Shu (3), chi-wang_shu@brown.edu

(1) Department of Mathematics, University of Kassel, Heinrich-Plett-Str. 40, 34132 Kassel, Germany
(2) Institute of Mathematics, Johannes Gutenberg University Mainz, Staudingerweg 9, 55128 Mainz, Germany
(3) Division of Applied Mathematics, Brown University, Providence, Rhode Island 02906, USA

Abstract

We combine Patankar-type methods with suitable relaxation procedures that are capable of ensuring correct dissipation or conservation of functionals such as entropy or energy while producing unconditionally positive and conservative approximations. To that end, we adapt the relaxation algorithm to enforce positivity by using either ideas from the dense output framework when a linear invariant must be preserved, or simply a geometric mean if the only constraint is positivity preservation. The latter merely requires the solution of a scalar nonlinear equation, while the former results in a coupled linear-nonlinear system of equations. We present sufficient conditions for the solvability of the respective equations. Several applications in the context of ordinary and partial differential equations are presented, and the theoretical findings are validated numerically.

Keywords: Positivity preservation, Relaxation methods, Entropy stability

MSC Classification: 65M06, 65M08, 65M20, 65M22

1 Introduction

We consider initial-value problems (IVPs)

\mathbf{u}^{\prime}(t)=\mathbf{f}(\mathbf{u}(t)),\quad\mathbf{u}(t_{0})=\mathbf{u}^{0}\in\mathbb{R}^{d},

(1)

either as a classical ordinary differential equation (ODE) model on its own, or more typically obtained after discretizing a partial differential equation (PDE) in space. We are interested in two types of structures of the IVP (1). First, many applications require positive solutions, i.e., $\mathbf{u}(t)>\mathbf{0}$ for all $t\geq t_{0}$ if $\mathbf{u}^{0}>\mathbf{0}$, where inequalities are understood component-wise. This occurs, for example, when modeling chemical reactions, population dynamics, or the density of fluids. Second, many problems are equipped with additional functionals of interest, such as Lyapunov functionals, energy, or entropy. We say that the IVP (1) is dissipative with respect to a smooth functional $\eta$ if $\eta^{\prime}(\mathbf{u})\mathbf{f}(\mathbf{u})\leq 0$, i.e.,

\frac{\mathrm{d}}{\mathrm{d}t}\eta(\mathbf{u}(t))\leq 0

for all solutions $\mathbf{u}(t)$ of (1). Similarly, (1) is conservative with respect to $\eta$ if $\eta^{\prime}(\mathbf{u})\mathbf{f}(\mathbf{u})=0$, i.e.,

\frac{\mathrm{d}}{\mathrm{d}t}\eta(\mathbf{u}(t))=0.

In addition to the (typically nonlinear) functional $\eta$ , many problems also conserve additional linear invariants, such as mass or momentum, which we also want to preserve on the discrete level.

When discretizing (1) in time using a one-step method, we would like to preserve these properties, i.e., we would like to have an unconditionally positive method satisfying

\mathbf{u}^{0}>\mathbf{0}\implies\mathbf{u}^{n}>\mathbf{0}\quad\text{for all }n\geq 0.

For many positive ODEs/PDEs, avoiding negative approximations is critical; such artifacts can lead to qualitatively incorrect solutions or the total failure of the numerical method [BBKS2007, sandu2001positive, STKB2005, SSPMPRK2]. Moreover, for dissipative problems, we would like to use a dissipative method that satisfies

\eta(\mathbf{u}^{n})\leq\eta(\mathbf{u}^{n-1})\leq\dotsc\leq\eta(\mathbf{u}^{0}).

(2)

Similarly, a conservative method applied to a conservative problem should satisfy

\eta(\mathbf{u}^{n})=\eta(\mathbf{u}^{n-1})=\dotsc=\eta(\mathbf{u}^{0}).

(3)

For convex $\eta$ , the implicit Euler method is a well-known example of an unconditionally positive and dissipative method. However, this analysis neglects possible positivity issues that can arise while solving the implicit equations as well as remaining errors of the nonlinear iterative solver.

Concerning positivity, the implicit Euler method is essentially the best method one can use in the class of general linear methods, since any unconditionally positive method can be at most first-order accurate [bolley1978conservation]. To address this challenge, several strategies have been proposed:

1. Clipping techniques, which forcibly set negative values to zero, either result in a mass-shifting optimization problem or otherwise compromise conservation of linear invariants, and, to date, lack a proof of stability [BIM2022].
2. Projection techniques [sandu2001positive, nusslein2021positivity] can be positive and conserve linear invariants, but they may result in step size constraints and/or reduced accuracy.
3. Fully implicit, nonlinear methods [HR2020, ricchiuto2011habilitation] can enforce positivity but require costly iterative solvers, which may fail to converge (to a positive solution), and thus, still produce nonphysical results.
4. Diagonally split Runge–Kutta (DSRK) methods [horvath_positivity_1998] can be unconditionally positive with order higher than one. However, they are typically less accurate than the implicit Euler method in practice [macdonald2007].
5. Adaptive methods [STKB2005] use root-finding procedures and adapt the time step size. This can be effective, but the resulting schemes are only conditionally positive.
6. Strong stability preserving (SSP) methods [GKS2011] are positive if the explicit Euler method is positive under a certain time step restriction. However, only the implicit Euler method leads to unconditional positivity, and thus, all other SSP methods are only conditionally positive.
7. Patankar-type methods represent a family of explicit or linearly implicit yet nonlinear schemes, which are unconditionally positive and can preserve certain linear invariants [Patankar1980, BDM2003, MCD2020, KM18, AKM2020].

In this work, we focus on Patankar-type schemes. The main idea behind them is to modify an existing time-stepping method by introducing nonlinear weights in such a way that the resulting numerical scheme becomes unconditionally positive. The primary challenge lies in designing these weights so that the modified scheme preserves the accuracy of the original (baseline) method. This nonlinear modification is achieved using the so-called Patankar-trick [Patankar1980], which gives this family of methods its name. A notable example is the incorporation of modified Patankar (MP) weights into classical Runge–Kutta (RK) schemes, leading to the development of modified Patankar–Runge–Kutta (MPRK) methods [BDM2003, KM18, KM18Order3], which in addition to being unconditionally positive, are also conservative. Motivated by their strong numerical performance, the Patankar-trick has since been successfully extended to a variety of time integration frameworks, including SSP Runge–Kutta (SSPRK) methods [SSPMPRK2, SSPMPRK3], arbitrary high-order Deferred Correction (DeC) schemes [MPDeC], generalized BBKS methods [AKM2020], GeCo schemes [MCD2020], and linear multistep methods [IMPV2025]. The resulting modified schemes all belong to the broader Patankar-type family, and they can themselves be recast as non-standard additive Runge–Kutta (NSARK) methods, see [NSARK, IzginThesis].
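The Patankar-trick can be illustrated with the first-order modified Patankar–Euler scheme from [BDM2003]. The following is a minimal sketch for a conservative two-species production-destruction system; the function name `mpe_step` and the test problem $u_1^{\prime}=-u_1u_2$, $u_2^{\prime}=u_1u_2$ are assumptions made for illustration and are not taken from this work.

```python
def mpe_step(u, dt, p12, p21):
    """One modified Patankar-Euler step for a conservative 2-species system.

    p12(u): production of species 1 from species 2 (equals destruction d21)
    p21(u): production of species 2 from species 1 (equals destruction d12)
    Each production/destruction term is weighted by u^{n+1}/u^n of the
    species that is destroyed, which yields a linear system for u^{n+1}.
    """
    u1, u2 = u
    a = dt * p21(u) / u1  # coefficient of u1^{n+1}
    b = dt * p12(u) / u2  # coefficient of u2^{n+1}
    # System: (1 + a) v1 - b v2 = u1,  -a v1 + (1 + b) v2 = u2
    det = 1.0 + a + b     # determinant, always positive for positive data
    v1 = ((1.0 + b) * u1 + b * u2) / det
    v2 = (a * u1 + (1.0 + a) * u2) / det
    return [v1, v2]


# Hypothetical test problem: u1' = -u1*u2, u2' = u1*u2
p12 = lambda u: 0.0
p21 = lambda u: u[0] * u[1]

u0 = [1.0, 2.0]
dt = 10.0  # deliberately large step
u_mpe = mpe_step(u0, dt, p12, p21)
# Explicit Euler for comparison: first component becomes negative
u_euler = [u0[0] - dt * u0[0] * u0[1], u0[1] + dt * u0[0] * u0[1]]
```

Even for this large step, the sketch returns positive values whose sum equals that of the initial data, whereas the explicit Euler update does not stay positive.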

Concerning the preservation (conservation/dissipation) of functionals $\eta$ , several results are available for linear schemes such as RK methods applied to linear problems [tadmor2002semidiscrete, ranocha2018L2stability, sun2017stability, sun2019strong, achleitner2024necessary, tadmor2025stability, sun2022energy] and fully-implicit methods [lefloch2002fully, friedrich2019entropy, burrage1979stability, burrage1980nonlinear, dahlby2011preserving]. There are also positive results on dissipative schemes if the problem is sufficiently dissipative [higueras2005monotonicity, jungel2015entropy, jungel2017entropy]. In the general case including conservative problems, however, results are restrictive and include many negative results [ranocha2021strong, ranocha2020energy]. Similarly to positivity preservation, postprocessing/projection methods can be used to enforce the desired conservation/dissipation properties of time integration methods [Shampine1986, grimm2005geometric, calvo2006preservation, calvo2010projection, laburta2015numerical]. In this work, we focus on the relaxation approach [ketcheson2019relaxation, ranocha2020relaxation, ranocha2020general], which can be used to enforce conservation/dissipation of functionals while preserving all linear invariants. The basic idea of relaxation methods goes back to [sanzserna1982explicit] and [dekker1984stability, pp. 265–266].

Thus, there are several studies and methods devoted to either positivity preservation or the preservation of functionals such as entropy, but to the best of our knowledge, there is no high-order method that can guarantee both properties simultaneously. The main contribution of this work is to design a modified relaxation algorithm capable of simultaneously preserving positivity and conservation/dissipation of functionals. To that end, we first equip unconditionally positivity-preserving NSARK methods with suitable estimates for dissipative entropies by applying the relaxation framework from [ranocha2020general]. While relaxation can be rendered positivity-preserving for dissipative problems with minor adjustments (see Remark 3), entropy-conservative problems require more sophisticated treatment. Leveraging dense output formulae for MPRK methods [izgin2024], we propose a modified relaxation step that ensures unconditional positivity. Furthermore, we introduce a bootstrapping technique to achieve arbitrarily high-order accuracy in time for MPRK schemes.

The remainder of the paper is structured as follows. We recall the relaxation technique from [ranocha2020general] in Section 2.1. In Section 2.2 we give a brief introduction to NSARK methods. After that, we explain in Section 3.1 how to apply the relaxation algorithm for NSARK schemes and entropy dissipative problems. The main result is given for the entropy-conservative case, see Section 3.2, where we equip different families of MP schemes with a positivity-preserving relaxation algorithm and present a bootstrapping technique to obtain arbitrarily high order (in time) for MPRK schemes. Finally, we present several examples of ordinary and partial differential equations and validate our findings for second- and third-order MP schemes.

2 Preliminaries

In this section, we briefly review relaxation methods to preserve functionals $\eta$ and non-standard additive Runge–Kutta (NSARK) methods, which include Patankar-type methods as a special case.

2.1 Classical Relaxation

One way to guarantee dissipation (2) or conservation (3) of functionals $\eta$ is the relaxation procedure explained in [ranocha2020general]. We are given a numerical one-step method of order $p\geq 2$ generating approximations $\mathbf{u}^{n}$ to $\mathbf{u}(t_{n})$ with a time step size of $\Delta t$ . We then have to repeat the following steps, starting with $n=0$ .

1. Define the quantities $(t_{\text{old}},\mathbf{u}_{\text{old}},\eta_{\text{old}})\coloneqq(t_{n},\mathbf{u}^{n},\eta(\mathbf{u}^{n}))$ as well as $(t_{\text{new}},\mathbf{u}_{\text{new}})\coloneqq(t_{n+1},\mathbf{u}^{n+1})$.
2. Choose $\eta_{\text{new}}$:
   - For dissipative problems (1), compute a suitable estimate

     $\eta_{\text{new}}=\eta(\mathbf{u}_{\text{new}})+\mathcal{O}(\Delta t^{p+1}),\quad\Delta t\to 0.$
   - For conservative problems, we can simply set $\eta_{\text{new}}\coloneqq\eta_{\text{old}}$, since we arrive at $\eta_{\text{new}}=\eta(\mathbf{u}_{\text{old}})=\eta(\mathbf{u}(t_{n+1}))=\eta(\mathbf{u}_{\text{new}})+\mathcal{O}(\Delta t^{p+1})$ by means of an induction over $n$.
3. Solve the system

   \begin{pmatrix}t_{\gamma}^{n}\\ \mathbf{u}_{\gamma}^{n}\\ \eta(\mathbf{u}_{\gamma}^{n})\end{pmatrix}=\begin{pmatrix}t_{\text{old}}\\ \mathbf{u}_{\text{old}}\\ \eta_{\text{old}}\end{pmatrix}+\gamma\begin{pmatrix}t_{\text{new}}-t_{\text{old}}\\ \mathbf{u}_{\text{new}}-\mathbf{u}_{\text{old}}\\ \eta_{\text{new}}-\eta_{\text{old}}\end{pmatrix}

   (4)

   by inserting $\mathbf{u}_{\gamma}^{n}$ into the last equation and solving for $\gamma\approx 1$, and then computing $t_{\gamma}^{n}$ and $\mathbf{u}_{\gamma}^{n}$ according to the remaining equations.
4. Proceed with the numerical scheme using $t_{\gamma}^{n}$ and $\mathbf{u}_{\gamma}^{n}$ instead of $t_{n+1}$ and $\mathbf{u}^{n+1}$.
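The steps above can be sketched for a conservative problem as follows. This is a minimal illustration under stated assumptions: the scalar equation is solved by plain bisection on a fixed bracket, Heun's method serves as the second-order baseline, and the harmonic oscillator with $\eta(\mathbf{u})=\frac{1}{2}\|\mathbf{u}\|^{2}$ is a hypothetical test problem. The names `relaxation_parameter` and `heun_step` are introduced here only for illustration.

```python
def relaxation_parameter(eta, u_old, u_new, eta_old, eta_new,
                         lo=0.5, hi=1.5, tol=1e-14):
    """Find gamma with eta(u_old + gamma*(u_new - u_old))
    = eta_old + gamma*(eta_new - eta_old) by bisection on [lo, hi]."""
    def r(gamma):
        u_gamma = [uo + gamma * (un - uo) for uo, un in zip(u_old, u_new)]
        return eta(u_gamma) - (eta_old + gamma * (eta_new - eta_old))

    r_lo, r_hi = r(lo), r(hi)
    if r_lo * r_hi > 0.0:
        raise ValueError("no sign change on the bracket; try a smaller dt")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        r_mid = r(mid)
        if r_lo * r_mid <= 0.0:
            hi = mid               # root lies in [lo, mid]
        else:
            lo, r_lo = mid, r_mid  # root lies in [mid, hi]
    return 0.5 * (lo + hi)


def heun_step(f, u, dt):
    """Second-order explicit baseline method (Heun)."""
    k1 = f(u)
    u_star = [ui + dt * ki for ui, ki in zip(u, k1)]
    k2 = f(u_star)
    return [ui + 0.5 * dt * (a + b) for ui, a, b in zip(u, k1, k2)]


# Hypothetical conservative test problem: harmonic oscillator
f = lambda u: [-u[1], u[0]]
eta = lambda u: 0.5 * (u[0] ** 2 + u[1] ** 2)

t_old, dt = 0.0, 0.1
u_old = [1.0, 0.0]                      # step 1
u_new = heun_step(f, u_old, dt)
eta_old = eta(u_old)
eta_new = eta_old                       # step 2, conservative case
gamma = relaxation_parameter(eta, u_old, u_new, eta_old, eta_new)  # step 3
t_gamma = t_old + gamma * dt
u_gamma = [uo + gamma * (un - uo) for uo, un in zip(u_old, u_new)]
# step 4: continue the time loop from (t_gamma, u_gamma)
```

In practice a faster bracketing solver would typically replace the bisection loop; the bracket `[0.5, 1.5]` is a choice made for this sketch and is not prescribed by the theory.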

For dissipative problems, the “suitable estimate $\eta_{\text{new}}$ ” must guarantee the discrete dissipativity (2) for the approximations from the relaxation procedure. We will introduce such a suitable estimate for NSARK methods that are based on ARK methods with a non-negative extended Butcher tableau in Section 3.1. For now, let us proceed by revisiting the main results from [ranocha2020general], assuming we have such an $\eta_{\text{new}}$ at hand.

Theorem 1 ([ranocha2020general, Theorem 2.13, Theorem 2.14]).

Consider the relaxation procedure (4) with a numerical method of order $p$ and $\Delta t>0$ sufficiently small. If

\eta\text{ is convex and }\eta^{\prime\prime}(\mathbf{u}_{\text{old}})(\mathbf{f}(\mathbf{u}_{\text{old}}),\mathbf{f}(\mathbf{u}_{\text{old}}))\neq 0\quad\text{ or }

(5a)

\eta^{\prime}(\mathbf{u}_{\text{new}})\frac{\mathbf{u}_{\text{new}}-\mathbf{u}_{\text{old}}}{\|\mathbf{u}_{\text{new}}-\mathbf{u}_{\text{old}}\|}=c(\mathbf{u}_{\text{old}})\Delta t+\mathcal{O}(\Delta t^{2})\text{ with }c\neq 0,

(5b)

then there exists a unique $\gamma=1+\mathcal{O}(\Delta t^{p-1})$ that satisfies (4). Additionally, the relaxation method is of order $p$ , that is $\mathbf{u}_{\gamma}^{n}=\mathbf{u}(t_{\gamma}^{n})+\mathcal{O}(\Delta t^{p+1})$ . In particular, there exist $\gamma_{1},\gamma_{2}>0$ such that

r(\gamma)\coloneqq\eta(\mathbf{u}_{\text{old}}+\gamma(\mathbf{u}_{\text{new}}-\mathbf{u}_{\text{old}}))-\left(\eta_{\text{old}}+\gamma(\eta_{\text{new}}-\eta_{\text{old}})\right)

(6)

satisfies $r(\gamma_{1})r(\gamma_{2})<0$ for $\Delta t>0$ small enough.

This theorem is the theoretical basis for the existence and uniqueness of the solution of the relaxation procedure (4). Unfortunately, the theorem does not give bounds on $\Delta t$ for the existence of the solution, so that computations may be rejected due to $\Delta t$ being too large.

Remark 1 (Issue with positivity).

The main issue of positivity-preservation with the above relaxation algorithm is that the update

\mathbf{u}^{n}_{\gamma}=\mathbf{u}^{n}+\gamma(\mathbf{u}^{n+1}-\mathbf{u}^{n})=\gamma\mathbf{u}^{n+1}+(1-\gamma)\mathbf{u}^{n}

is not necessarily positivity-preserving for $\gamma>1$ , even if the baseline method is positive.
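A small numerical illustration of this issue, using hypothetical data chosen only to make the effect visible: both $\mathbf{u}^{n}$ and $\mathbf{u}^{n+1}$ are positive, yet the linear extrapolation with $\gamma>1$ produces a negative component, while the linear invariant (the sum of the components) is still preserved.

```python
def linear_relaxation_update(u_old, u_new, gamma):
    """Classical relaxation update u_old + gamma * (u_new - u_old)."""
    return [uo + gamma * (un - uo) for uo, un in zip(u_old, u_new)]


# Hypothetical positive data with the same total mass 1.5
u_old = [1.0, 0.5]
u_new = [0.01, 1.49]

u_extrapolated = linear_relaxation_update(u_old, u_new, 1.02)  # gamma > 1
u_interpolated = linear_relaxation_update(u_old, u_new, 0.98)  # gamma < 1
```

For $\gamma=1.02$ the first component is $1-1.02\cdot 0.99<0$, whereas for $\gamma=0.98$ the update is a convex combination and stays positive.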

To overcome this issue, we propose to use unconditionally positive time integrators, that is, methods for which $\mathbf{u}^{n}>\mathbf{0}$ component-wise implies $\mathbf{u}^{n+1}>\mathbf{0}$ for all $\Delta t>0$. In the upcoming sections, we introduce the methods of interest for this work, all of which may be recast as so-called non-standard additive Runge–Kutta schemes.

2.2 Non-standard Additive Runge–Kutta Methods

Non-standard additive Runge–Kutta (NSARK) methods are applied to an IVP (1), where the right-hand side is split into a sum, that is

\mathbf{u}^{\prime}(t)=\mathbf{f}(\mathbf{u}(t))=\sum_{\nu=1}^{N}\mathbf{f}^{[\nu]}(\mathbf{u}(t)),\quad\mathbf{u}(t_{0})=\mathbf{u}^{0}\in\mathbb{R}^{d}.

(7)

Already for traditional additive Runge–Kutta (ARK) methods, including Implicit-Explicit (IMEX) Runge–Kutta (RK) methods [Crouzeix1980, ARS1997], the main idea is to apply very different RK schemes determined by $\mathbf{A}^{[\nu]}=(a_{ij}^{[\nu]})_{i,j=1,\dotsc,s}$ , $\mathbf{b}^{[\nu]}=(b_{1}^{[\nu]},\dotsc,b_{s}^{[\nu]})$ , $\mathbf{c}^{[\nu]}=(c_{1}^{[\nu]},\dotsc,c_{s}^{[\nu]})^{T}$ to the different addends $\mathbf{f}^{[\nu]}$ . For internal consistency, we require that the different RK schemes actually do not differ in the abscissa, i.e.

c_{i}=c_{i}^{[\nu]}=\sum_{j=1}^{s}a_{ij}^{[\nu]}

(8)

for $i=1,\dotsc,s$ and $\nu=1,\dotsc,N$ , see [SG2015]. However, for autonomous IVPs (7), this has no effect on the resulting ARK method, which in this case reads

\begin{aligned}\mathbf{u}^{(i)}&=\mathbf{u}^{n}+\Delta t\sum_{j=1}^{s}\sum_{\nu=1}^{N}a^{[\nu]}_{ij}\mathbf{f}^{[\nu]}(\mathbf{u}^{(j)}),\quad i=1,\dotsc,s,\\ \mathbf{u}^{n+1}&=\mathbf{u}^{n}+\Delta t\sum_{j=1}^{s}\sum_{\nu=1}^{N}b^{[\nu]}_{j}\mathbf{f}^{[\nu]}(\mathbf{u}^{(j)}),\end{aligned}

(9)

and the corresponding extended Butcher tableau is given by

\begin{array}{c|c|c|c|c}\mathbf{c}&\mathbf{A}^{[1]}&\mathbf{A}^{[2]}&\cdots&\mathbf{A}^{[N]}\\ \hline&\mathbf{b}^{[1]}&\mathbf{b}^{[2]}&\cdots&\mathbf{b}^{[N]}\end{array}

with $\mathbf{c}=(c_{1},\dotsc,c_{s})^{T}$ .
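A minimal sketch of one step of (9) is given below. It assumes that all tableaux $\mathbf{A}^{[\nu]}$ are strictly lower triangular, so the stages can be computed explicitly; the general formula (9) also covers implicit tableaux, which this sketch does not handle. The function name `ark_step` and the test splitting are hypothetical.

```python
def ark_step(fs, A, b, u, dt):
    """One explicit additive Runge-Kutta step for u' = sum_nu fs[nu](u).

    fs: list of N right-hand side functions f^[nu]
    A:  list of N strictly lower-triangular s x s coefficient matrices
    b:  list of N weight vectors of length s
    """
    s, d = len(b[0]), len(u)
    stages = []
    for i in range(s):
        ui = list(u)
        for j in range(i):  # explicit: only earlier stages contribute
            for nu, f in enumerate(fs):
                fj = f(stages[j])
                for k in range(d):
                    ui[k] += dt * A[nu][i][j] * fj[k]
        stages.append(ui)
    u_next = list(u)
    for j in range(s):
        for nu, f in enumerate(fs):
            fj = f(stages[j])
            for k in range(d):
                u_next[k] += dt * b[nu][j] * fj[k]
    return u_next


# Hypothetical splitting of u' = -0.5*u into two addends,
# both treated with Heun's tableau
f1 = lambda u: [-u[0]]
f2 = lambda u: [0.5 * u[0]]
A_heun = [[0.0, 0.0], [1.0, 0.0]]
b_heun = [0.5, 0.5]

u_next = ark_step([f1, f2], [A_heun, A_heun], [b_heun, b_heun], [1.0], 0.1)
```

Since both addends use the same tableau here, the result coincides with Heun's method applied to the unsplit right-hand side, which gives $1-0.05+0.00125=0.95125$.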

NSARK methods now differ from ARK schemes (9) in that their extended Butcher tableau is allowed to also depend on the step size and the solution. In particular, NSARK methods applied to (7) are of the form

\begin{aligned}\mathbf{u}^{(i)}&=\mathbf{u}^{n}+\Delta t\sum_{j=1}^{s}\sum_{\nu=1}^{N}a^{[\nu]}_{ij}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{f}^{[\nu]}(\mathbf{u}^{(j)}),\quad i=1,\dotsc,s,\\ \mathbf{u}^{n+1}&=\mathbf{u}^{n}+\Delta t\sum_{j=1}^{s}\sum_{\nu=1}^{N}b^{[\nu]}_{j}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{f}^{[\nu]}(\mathbf{u}^{(j)}),\end{aligned}

(10)

where $\mathbf{U}^{n}=(\mathbf{u}^{n}\mid\mathbf{u}^{(1)}\mid\dotsc\mid\mathbf{u}^{(s)}\mid\mathbf{u}^{n+1})\in\mathbb{R}^{d\times(s+2)}$.

In the case of gBBKS [AKM2020] and Geometric Conservative (GeCo) [MCD2020] methods, both of which may be interpreted as NSRK schemes, as well as modified Patankar–Runge–Kutta (MPRK) [KM18, KM18Order3] methods, the same RK scheme is used for the treatment of the different addends in (7) and only the solution-dependent terms vary. For MP strong-stability-preserving RK (MPSSPRK) schemes, the situation is different, see Section 2.2.3. In this work, we focus on modified Patankar (MP) schemes in the entropy-conservative case and leave gBBKS and GeCo methods for future work.

2.2.1 Production-Destruction-Rest Systems

The application of modified Patankar (MP) schemes is restricted to production-destruction-rest systems (PDRS)

u_{k}^{\prime}(t)=r^{P}_{k}(\mathbf{u}(t))-r^{D}_{k}(\mathbf{u}(t))+\sum_{\nu=1}^{d}(p_{k\nu}(\mathbf{u}(t))-d_{k\nu}(\mathbf{u}(t))),\quad k=1,\dotsc,d

with $p_{k\nu}=d_{\nu k}$ and $r^{P}_{k},r^{D}_{k},p_{k\nu},d_{k\nu}\geq 0$ on $\mathbb{R}^{d}_{>0}$ . We note that this is only a formal restriction since every autonomous system with real-valued right-hand sides can be rewritten as such a PDRS [IR2023]. Now, one can recover the function $\mathbf{f}^{[\nu]}$ in (7) and specify the solution-dependent Butcher coefficients. Indeed, according to [IzginThesis, Remark 2.25], a PDRS can be written in terms of (7) by using the convention $p_{kk}=d_{kk}=0$ and choosing $N=d+1$ as well as

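\begin{aligned}\mathbf{f}^{[N]}(\mathbf{u}(t))&=(r_{1}^{P}(\mathbf{u}(t)),\dotsc,r_{d}^{P}(\mathbf{u}(t)))^{T},\\ f^{[\nu]}_{k}(\mathbf{u}(t))&=\begin{cases}p_{k\nu}(\mathbf{u}(t)),&k\neq\nu,\\ -\left(r_{k}^{D}(\mathbf{u}(t))+\sum_{\mu=1}^{d}d_{k\mu}(\mathbf{u}(t))\right),&k=\nu\end{cases}\end{aligned}

(11)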

for $k,\nu=1,\dotsc,d$ .

2.2.2 Modified Patankar–Runge–Kutta Schemes

With (11), every MPRK scheme [BDM2003, KM18, KM18Order3] that is based on a single explicit RK method with a non-negative Butcher array can be expressed in terms of an NSARK scheme using

\begin{aligned}a^{[\nu]}_{ij}(\mathbf{U}^{n},t_{n},\Delta t)&=a_{ij}\gamma_{\nu}^{[i]}(\mathbf{U}^{n},t_{n},\Delta t),\\ b^{[\nu]}_{j}(\mathbf{U}^{n},t_{n},\Delta t)&=b_{j}\delta_{\nu}(\mathbf{U}^{n},t_{n},\Delta t),\end{aligned}

(12)

where

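\begin{aligned}\gamma_{\nu}^{[i]}(\mathbf{U}^{n},t_{n},\Delta t)&=\begin{cases}\frac{u_{\nu}^{(i)}}{\pi_{\nu}^{(i)}},&\nu=1,\dotsc,d,\\ 1,&\nu=N,\end{cases}\quad\text{ and }\\ \delta_{\nu}(\mathbf{U}^{n},t_{n},\Delta t)&=\begin{cases}\frac{u_{\nu}^{n+1}}{\sigma_{\nu}},&\nu=1,\dotsc,d,\\ 1,&\nu=N\end{cases}\end{aligned}

(13)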

are the so-called non-standard weights (NSWs). Here, $\pi_{\nu}^{(i)}$ and $\sigma_{\nu}$ denote the so-called Patankar-weight denominators (PWDs) and can be chosen for the particular MPRK method to ensure stability and accuracy, see [KM18, IzginThesis] for more insights. If the Butcher array contains negative entries, more care is needed when defining the MPRK method, see e.g. [MPDeC].

Example 1 (Second-order Family).

The second-order family of MPRK schemes, denoted by MPRK22( $\alpha$ ), is given by

\begin{aligned}u_{k}^{(1)}={}&u_{k}^{n},\\ u_{k}^{(2)}={}&u_{k}^{n}+\alpha\Delta t\left(r^{P}_{k}(\mathbf{u}^{(1)})+\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(1)})\frac{u_{\nu}^{(2)}}{u_{\nu}^{n}}-\left(r^{D}_{k}(\mathbf{u}^{(1)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(1)})\right)\frac{u_{k}^{(2)}}{u_{k}^{n}}\right),\\ u_{k}^{n+1}={}&u_{k}^{n}+\Delta t\sum_{j=1}^{2}b_{j}\left(r_{k}^{P}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{u_{\nu}^{n+1}}{(u_{\nu}^{(2)})^{\frac{1}{\alpha}}(u_{\nu}^{n})^{1-\frac{1}{\alpha}}}\right.\\ &\hphantom{u_{k}^{n}+\Delta t\sum_{j=1}^{2}b_{j}}-\left.\left(r^{D}_{k}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{u_{k}^{n+1}}{(u_{k}^{(2)})^{\frac{1}{\alpha}}(u_{k}^{n})^{1-\frac{1}{\alpha}}}\right)\end{aligned}

with $k=1,\dotsc,d$ , $\alpha\geq\frac{1}{2}$ and $b_{2}=\tfrac{1}{2\alpha}$ as well as $b_{1}=1-b_{2}$ . In terms of the previous notation, we are using the Butcher array

\begin{array}{c|cc}0&&\\ \alpha&\alpha&\\ \hline&1-\frac{1}{2\alpha}&\frac{1}{2\alpha}\end{array}

and the PWDs

\pi_{k}^{(2)}=u_{k}^{n},\quad\sigma_{k}=(u_{k}^{(2)})^{\frac{1}{\alpha}}(u_{k}^{n})^{1-\frac{1}{\alpha}}.

In this work, we will focus on $\alpha=1$ as suggested by [IssuesMPRK].

Example 2 (Third-order Family).

There are two third-order families of MPRK schemes, see [KM18Order3]. One of them is based on the Butcher array

\begin{array}{c|ccc}0&&&\\ \alpha&\alpha&&\\ \beta&\frac{3\alpha\beta(1-\alpha)-\beta^{2}}{\alpha(2-3\alpha)}&\frac{\beta(\beta-\alpha)}{\alpha(2-3\alpha)}&\\ \hline&1+\frac{2-3(\alpha+\beta)}{6\alpha\beta}&\frac{3\beta-2}{6\alpha(\beta-\alpha)}&\frac{2-3\alpha}{6\beta(\beta-\alpha)}\end{array}

(14)

see [KM18Order3] for more details on the domain of $\alpha$ and $\beta$ . The PWDs are given by

\begin{aligned}\pi_{\nu}^{(2)}={}&u_{\nu}^{n},\qquad\pi_{\nu}^{(3)}=(u_{\nu}^{(2)})^{\frac{1}{p}}(u_{\nu}^{n})^{1-\frac{1}{p}},\\ \sigma_{k}={}&u_{k}^{n}+\Delta t\sum_{j=1}^{2}\beta_{j}\left(r_{k}^{P}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{\sigma_{\nu}}{(u_{\nu}^{(2)})^{\frac{1}{a_{21}}}(u_{\nu}^{n})^{1-\frac{1}{a_{21}}}}\right.\\ &\qquad-\left.\left(r_{k}^{D}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{\sigma_{k}}{(u_{k}^{(2)})^{\frac{1}{a_{21}}}(u_{k}^{n})^{1-\frac{1}{a_{21}}}}\right)\end{aligned}

(15)

for $\nu,k=1,\dotsc,d$ , where $\beta_{1}=1-\beta_{2}$ , $\beta_{2}=\frac{1}{2a_{21}}$ , and $p=3a_{21}\left(a_{31}+a_{32}\right)b_{3}$ . Note that $\bm{\sigma}$ requires the solution of another linear system, which is why this family is denoted by MPRK43I( $\alpha,\beta$ ). In this work we focus on $\alpha=0.5$ and $\beta=0.75$ .

In any case we point out that the schemes are implicit due to the numerators in (13). Indeed, they are linearly implicit as the PWDs $\pi_{\nu}^{(i)}$ and $\sigma_{\nu}$ are required to be independent of the numerator [BDM2003, IzginThesis]. Consequently, an MPRK scheme can be written in matrix-vector notation as follows.

\begin{aligned}\mathbf{M}^{(i)}\mathbf{u}^{(i)}&=\mathbf{u}^{n}+\Delta t\sum_{j=1}^{i-1}a_{ij}\mathbf{r}^{P}(\mathbf{u}^{(j)}),\quad i=1,\dotsc,s,\\ \mathbf{M}\mathbf{u}^{n+1}&=\mathbf{u}^{n}+\Delta t\sum_{j=1}^{s}b_{j}\mathbf{r}^{P}(\mathbf{u}^{(j)}),\end{aligned}

(16)

where $\mathbf{r}^{P}=(r_{1}^{P},\dotsc,r_{d}^{P})^{T}$ and $\mathbf{M}^{(i)}=(m^{(i)}_{k\nu})_{1\leq k,\nu\leq d}$ with

\begin{aligned}m^{(i)}_{kk}&=1+\Delta t\sum_{j=1}^{i-1}a_{ij}\left(r_{k}^{D}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{1}{\pi_{k}^{(i)}},\\ m^{(i)}_{k\nu}&=-\Delta t\sum_{j=1}^{i-1}a_{ij}p_{k\nu}(\mathbf{u}^{(j)})\frac{1}{\pi_{\nu}^{(i)}},\quad k\neq\nu\end{aligned}

(17)

as well as, using $\mathbf{M}=(m_{k\nu})_{1\leq k,\nu\leq d}$ ,

\begin{aligned}m_{kk}&=1+\Delta t\sum_{j=1}^{s}b_{j}\left(r_{k}^{D}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{1}{\sigma_{k}},\\ m_{k\nu}&=-\Delta t\sum_{j=1}^{s}b_{j}p_{k\nu}(\mathbf{u}^{(j)})\frac{1}{\sigma_{\nu}},\quad k\neq\nu.\end{aligned}

(18)
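The matrix-vector form can be sketched for MPRK22(1) applied to a conservative two-species system without rest terms, so that $\mathbf{r}^{P}=\mathbf{r}^{D}=\mathbf{0}$ and the $2\times 2$ systems can be solved in closed form. The function names and the linear test problem $u_1^{\prime}=u_2-5u_1$, $u_2^{\prime}=5u_1-u_2$ are assumptions made for this illustration.

```python
import math


def solve_patankar_system(u, dt, P12, P21, denom):
    """Solve M v = u for two species, with M assembled as in (17)/(18).

    P12, P21: stage-weighted sums of the productions p_12 and p_21
    denom:    Patankar-weight denominators (pi^(i) or sigma)
    """
    a = dt * P21 / denom[0]  # p_21 = d_12 acts on species 1
    b = dt * P12 / denom[1]  # p_12 = d_21 acts on species 2
    # M = [[1 + a, -b], [-a, 1 + b]] with determinant 1 + a + b > 0
    det = 1.0 + a + b
    v1 = ((1.0 + b) * u[0] + b * u[1]) / det
    v2 = (a * u[0] + (1.0 + a) * u[1]) / det
    return [v1, v2]


def mprk22_step(u, dt, p12, p21):
    """One step of MPRK22(alpha) with alpha = 1, i.e. b_1 = b_2 = 1/2."""
    # Stage 2: a_21 = 1 and pi^(2) = u^n
    u2 = solve_patankar_system(u, dt, p12(u), p21(u), u)
    # Update: sigma = u^(2) for alpha = 1
    P12 = 0.5 * (p12(u) + p12(u2))
    P21 = 0.5 * (p21(u) + p21(u2))
    return solve_patankar_system(u, dt, P12, P21, u2)


# Hypothetical linear test problem with steady state (1/6, 5/6)
p12 = lambda u: u[1]
p21 = lambda u: 5.0 * u[0]

u_big = mprk22_step([0.9, 0.1], 100.0, p12, p21)  # very large step

u = [0.9, 0.1]
for _ in range(100):  # integrate to t = 1 with dt = 0.01
    u = mprk22_step(u, 0.01, p12, p21)
u1_exact = 1.0 / 6.0 + (0.9 - 1.0 / 6.0) * math.exp(-6.0)
```

Both linear solves have a matrix with positive diagonal, non-positive off-diagonal entries, and column sums equal to one, which is what makes the result positive and conservative for any step size.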

2.2.3 MP Strong-Stability-Preserving-Runge–Kutta Schemes

Although there exist second- and third-order MPSSPRK schemes, see [SSPMPRK2, SSPMPRK3], we focus for simplicity on the second-order method and the conservative PDS case. The generalization to non-conservative PDRS is straightforward but complicates the formulae. The consideration of third-order MPSSPRK schemes is left for future work. The two-parameter family of second-order MPSSPRK schemes from [SSPMPRK2] is given by

\begin{aligned}\mathbf{u}^{(1)}={}&\mathbf{u}^{n},\\ u_{i}^{(2)}={}&u_{i}^{n}+\beta\Delta t\left(\sum_{j=1}^{d}p_{ij}(\mathbf{u}^{n})\frac{u_{j}^{(2)}}{u_{j}^{n}}-\sum_{j=1}^{d}d_{ij}(\mathbf{u}^{n})\frac{u_{i}^{(2)}}{u_{i}^{n}}\right),\\ u_{i}^{n+1}={}&(1-\alpha)u_{i}^{n}+\alpha u_{i}^{(2)}+\Delta t\Biggl(\sum_{j=1}^{d}\left(\beta_{20}p_{ij}(\mathbf{u}^{n})+\beta_{21}p_{ij}(\mathbf{u}^{(2)})\right)\frac{u_{j}^{n+1}}{(u_{j}^{n})^{1-s}(u_{j}^{(2)})^{s}}\\ &-\sum_{j=1}^{d}\left(\beta_{20}d_{ij}(\mathbf{u}^{n})+\beta_{21}d_{ij}(\mathbf{u}^{(2)})\right)\frac{u_{i}^{n+1}}{(u_{i}^{n})^{1-s}(u_{i}^{(2)})^{s}}\Biggr),\end{aligned}

(19)

where $\beta_{20}=1-\frac{1}{2\beta}-\alpha\beta$ , $\beta_{21}=\frac{1}{2\beta}$ and $s=\frac{1-\alpha\beta+\alpha\beta^{2}}{\beta(1-\alpha\beta)}$ . There, the free parameters $\alpha$ and $\beta$ are subject to

0\leq\alpha\leq 1,\quad\beta>0,\quad\alpha\beta+\frac{1}{2\beta}\leq 1.

(20)

We refer to the above scheme as MPSSPRK2( $\alpha,\beta$ ). For numerical experiments we use $\alpha=\frac{1}{2}$ and $\beta=1$ [SSPMPRK2].

Substituting the second stage into the update, we can collect production and destruction terms. Hence, in the notation of (11), the solution-dependent coefficients for the conservative PDS case are

\begin{aligned}a_{21}^{[\nu]}(\mathbf{U}^{n},t_{n},\Delta t)&=\beta\frac{u_{\nu}^{(2)}}{u^{n}_{\nu}},\\ b_{1}^{[\nu]}(\mathbf{U}^{n},t_{n},\Delta t)&=\alpha\beta\frac{u_{\nu}^{(2)}}{u^{n}_{\nu}}+\beta_{20}\frac{u_{\nu}^{n+1}}{\sigma_{\nu}},\quad b_{2}^{[\nu]}(\mathbf{U}^{n},t_{n},\Delta t)=\beta_{21}\frac{u_{\nu}^{n+1}}{\sigma_{\nu}},\end{aligned}

(21)

where $\sigma_{\nu}=(u_{\nu}^{n})^{1-s}(u_{\nu}^{(2)})^{s}$ .
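As a quick check of the parameter formulas, the derived coefficients of MPSSPRK2( $\alpha,\beta$ ) can be evaluated directly from the definitions above together with the constraints (20). The helper name `mpssprk2_parameters` is hypothetical.

```python
def mpssprk2_parameters(alpha, beta):
    """Return (beta20, beta21, s) for MPSSPRK2(alpha, beta).

    Raises ValueError if the constraints (20) are violated.
    """
    admissible = (0.0 <= alpha <= 1.0 and beta > 0.0
                  and alpha * beta + 1.0 / (2.0 * beta) <= 1.0)
    if not admissible:
        raise ValueError("parameters violate the constraints (20)")
    beta20 = 1.0 - 1.0 / (2.0 * beta) - alpha * beta
    beta21 = 1.0 / (2.0 * beta)
    s = (1.0 - alpha * beta + alpha * beta ** 2) / (beta * (1.0 - alpha * beta))
    return beta20, beta21, s


# Parameters used for the numerical experiments
beta20, beta21, s = mpssprk2_parameters(0.5, 1.0)
```

For $\alpha=\frac{1}{2}$ and $\beta=1$ this gives $\beta_{20}=0$, $\beta_{21}=\frac{1}{2}$ and $s=2$, and the last constraint in (20) holds with equality.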

3 Positivity-Preserving Relaxation Technique

In what follows, we adapt the classical relaxation algorithm from Section 2.1 such that it becomes positivity-preserving.

3.1 Entropy Dissipative Case

First of all, we present a suitable estimate $\eta_{\text{new}}$ for a general NSARK scheme. In order to minimize the computational effort, we propose to re-use the computed stage values of the NSARK scheme satisfying (12), that is

\begin{aligned}\mathbf{u}^{(i)}&=\mathbf{u}^{n}+\Delta t\sum_{j=1}^{s}\sum_{\nu=1}^{N}a_{ij}\gamma^{[i]}_{\nu}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{f}^{[\nu]}(\mathbf{u}^{(j)}),\quad i=1,\dotsc,s,\\ \mathbf{u}^{n+1}&=\mathbf{u}^{n}+\Delta t\sum_{j=1}^{s}\sum_{\nu=1}^{N}b_{j}\delta_{\nu}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{f}^{[\nu]}(\mathbf{u}^{(j)}),\\ \eta_{\text{new}}&=\eta(\mathbf{u}^{n})+\Delta t\sum_{j=1}^{s}b_{j}(\eta^{\prime}\mathbf{f})(\mathbf{u}^{(j)}),\end{aligned}

(22)

which can be interpreted as computing the numerical approximation of the augmented system

\frac{\mathrm{d}}{\mathrm{d}t}\begin{pmatrix}\mathbf{u}(t)\\ \eta(\mathbf{u}(t))\end{pmatrix}=\sum_{\begin{subarray}{c}\nu=1\end{subarray}}^{N}\underbrace{\begin{pmatrix}\mathbf{f}^{[\nu]}(\mathbf{u}(t))\\ 0\end{pmatrix}}_{\eqqcolon\hat{\mathbf{f}}^{[\nu]}(\mathbf{u}(t))}+\underbrace{\begin{pmatrix}\mathbf{0}\\ (\eta^{\prime}\mathbf{f})(\mathbf{u}(t))\end{pmatrix}}_{\eqqcolon\hat{\mathbf{f}}^{[N+1]}(\mathbf{u}(t))}

using an NSARK method with the extended Butcher tableau

\begin{array}{c|c|c|c|c|c}\mathbf{c}&\bm{\Gamma}_{1}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{A}&\bm{\Gamma}_{2}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{A}&\cdots&\bm{\Gamma}_{N}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{A}&\mathbf{A}\\ \hline&\delta_{1}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{b}&\delta_{2}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{b}&\cdots&\delta_{N}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{b}&\mathbf{b}\end{array},

(23)

where $\bm{\Gamma}_{\nu}\coloneqq\operatorname{diag}(\gamma^{[1]}_{\nu},\dotsc,\gamma^{[s]}_{\nu})$ . Assuming that the two corresponding base methods described by the Butcher tableaux

\begin{array}{c|c|c|c|c}\mathbf{c}&\bm{\Gamma}_{1}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{A}&\bm{\Gamma}_{2}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{A}&\cdots&\bm{\Gamma}_{N}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{A}\\ \hline&\delta_{1}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{b}&\delta_{2}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{b}&\cdots&\delta_{N}(\mathbf{U}^{n},t_{n},\Delta t)\mathbf{b}\end{array}\quad\text{ and }\quad\begin{array}{c|c}\mathbf{c}&\mathbf{A}\\ \hline&\mathbf{b}\end{array}

are both of order $p$ for some $p\in\{2,3,4\}$, it can be seen from [NSARK, Theorem 18, Lemma 25, Lemma 26] that the overall scheme (22) is of order $p$, since the respective order conditions are decoupled with respect to the columns of the tableau (23).

Remark 2.

The NSWs from MPSSPRK schemes, see (21), also satisfy (12) after multiplying and dividing by the respective Butcher coefficient.

As a result, we indeed obtain that $\eta_{\text{new}}=\eta(\mathbf{u}_{\text{new}})+\mathcal{O}(\Delta t^{p+1})$ , and additionally,

\eta_{\text{new}}\leq\eta(\mathbf{u}^{n})

(24)

whenever $b_{j}\geq 0$ for $j=1,\dotsc,s$ . If $b_{j}<0$ for some $j=1,\dotsc,s$ , one can still use Gauß quadrature, as suggested in [ranocha2020general, Page 866] together with the unconditionally positive dense output formulae derived in [izgin2024] to obtain the approximations needed for the quadrature formula.
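The estimate $\eta_{\text{new}}$ only needs the stage values that the scheme has already computed. A minimal sketch, assuming non-negative weights $b_{j}$ and using hypothetical stage values for the dissipative scalar problem $u^{\prime}=-u$ with $\eta(u)=\frac{1}{2}u^{2}$, could look as follows; the function name `entropy_estimate` is introduced only for this illustration.

```python
def entropy_estimate(eta_old, dt, b, stages, eta_prime, f):
    """eta_new = eta(u^n) + dt * sum_j b_j * (eta' f)(u^(j))."""
    eta_new = eta_old
    for bj, uj in zip(b, stages):
        # (eta' f)(u^(j)) is the inner product of the gradient of eta
        # with the right-hand side, both evaluated at the stage value
        rate = sum(g * fk for g, fk in zip(eta_prime(uj), f(uj)))
        eta_new += dt * bj * rate
    return eta_new


# Hypothetical dissipative problem: u' = -u with eta(u) = u^2 / 2
f = lambda u: [-u[0]]
eta_prime = lambda u: [u[0]]

eta_old = 0.5                       # eta(u^n) for u^n = 1
stages = [[1.0], [0.9]]             # assumed stage values of a 2-stage scheme
eta_new = entropy_estimate(eta_old, 0.1, [0.5, 0.5], stages, eta_prime, f)
```

Because every term $(\eta^{\prime}\mathbf{f})(\mathbf{u}^{(j)})$ is non-positive for a dissipative problem, non-negative weights immediately give (24).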

In view of Remark 1, the relaxation technique is in danger of not being positivity-preserving for $\gamma>1$. The following corollary gives a workaround for dissipative problems, as we will discuss in the upcoming Remark 3.

Corollary 1 ([ranocha2020general, Pages 882-883]).

If $\eta$ is convex with $\eta^{\prime\prime}(\mathbf{u}_{\text{old}})(\mathbf{f}(\mathbf{u}_{\text{old}}),\mathbf{f}(\mathbf{u}_{\text{old}}))\neq 0$, then $r$ from (6) is convex and satisfies $r(0)=0$, $r^{\prime}(0)<0$ and $r^{\prime}(\gamma)>0$ for all $\gamma\geq 1$ and $\Delta t>0$ small enough.

Remark 3 (Positivity-preserving relaxation for convex $\eta$ ).

Suppose that $\gamma^{*}>1$ is the solution to (4), i.e., $r(\gamma^{*})=0$ , so that the positivity of the relaxation update $\mathbf{u}_{\gamma}^{n}$ is no longer guaranteed. By Corollary 1, $r$ is convex with $r(0)=0=r(\gamma^{*})$ , so that $r(\gamma)\leq r(\gamma^{*})=0$ for all $\gamma\in[1,\gamma^{*}]$ . In particular, $r(1)\leq 0$ follows, i.e., for $\gamma=1$ we obtain from (6) the relation

\eta(\mathbf{u}_{\gamma}^{n})=\eta(\mathbf{u}_{\text{new}})\leq\eta_{\text{new}}.

This means that, due to (24), only additional dissipation is introduced by using

\gamma=\min\{\gamma^{*},1\}\in(0,1],

where $\gamma^{*}$ is the solution to (4). But with that choice, $\mathbf{u}_{\gamma}^{n}$ is again a convex combination of positive data, and hence, positivity preservation is guaranteed for dissipative problems with a convex $\eta$ .
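The clipping described in Remark 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, $r$ is assumed to have the form $r(\gamma)=\eta(\mathbf{u}_{\text{old}}+\gamma(\mathbf{u}_{\text{new}}-\mathbf{u}_{\text{old}}))-\eta_{\text{old}}-\gamma(\eta_{\text{new}}-\eta_{\text{old}})$ , the toy entropy is $\eta(\mathbf{u})=\tfrac{1}{2}\lVert\mathbf{u}\rVert_{2}^{2}$ , the data are made up, and plain bisection stands in for whatever root finder is used in practice.

```python
from typing import Callable, List, Tuple


def eta(u: List[float]) -> float:
    """Toy convex entropy 0.5 * ||u||^2 (an assumption of this sketch)."""
    return 0.5 * sum(x * x for x in u)


def relaxed_state(u_old: List[float], u_new: List[float], gamma: float) -> List[float]:
    """Classical relaxation update u_old + gamma * (u_new - u_old)."""
    return [a + gamma * (b - a) for a, b in zip(u_old, u_new)]


def bisect(f: Callable[[float], float], lo: float, hi: float, tol: float = 1e-14) -> float:
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def clipped_gamma(u_old: List[float], u_new: List[float], eta_old: float,
                  eta_new: float, bracket: Tuple[float, float] = (0.5, 2.0)) -> Tuple[float, float]:
    """Return (gamma_star, min(gamma_star, 1)); the bracket excludes the trivial root 0."""
    def r(gamma: float) -> float:
        return (eta(relaxed_state(u_old, u_new, gamma)) - eta_old
                - gamma * (eta_new - eta_old))
    gamma_star = bisect(r, bracket[0], bracket[1])
    return gamma_star, min(gamma_star, 1.0)


u_old = [1.0, 2.0]
u_new = [0.1, 1.9]            # positive baseline update (made-up data)
eta_old = eta(u_old)
eta_new = eta_old - 0.608     # made-up dissipative entropy estimate
gamma_star, gamma = clipped_gamma(u_old, u_new, eta_old, eta_new)
print(gamma_star, relaxed_state(u_old, u_new, gamma_star))  # first component is negative
print(gamma, relaxed_state(u_old, u_new, gamma))            # all components positive
```

In this toy setting the unclipped root exceeds one and the extrapolated state leaves the positive orthant, whereas the clipped value falls back to the positive baseline update.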

3.2 Entropy-Conservative Case

As $\mathbf{u}_{\gamma}^{n}$ in (4) is not guaranteed to be positivity-preserving for $\gamma>1$ , see Remark 1, we propose to replace the update formula by a positivity-preserving variant. To indicate this difference in our notation we will write $\mathbf{u}^{n+\gamma}$ for a positivity-preserving approximation rather than $\mathbf{u}_{\gamma}^{n}$ .

3.2.1 Explicit Positivity-Preserving Procedure

If we are only interested in preserving positivity and a single nonlinear invariant, but no further linear invariants, we may apply a Patankar–Runge–Kutta method to guarantee the positivity of the update and combine it with the geometric mean

\mathbf{u}^{n+\gamma}=(\mathbf{u}^{n+1})^{\gamma}(\mathbf{u}^{n})^{1-\gamma}.

(25)

In logarithmic variables, this reduces to

\ln(\mathbf{u}^{n+\gamma})=\ln(\mathbf{u}^{n})+\gamma(\ln(\mathbf{u}^{n+1})-\ln(\mathbf{u}^{n})),

where we can find a solution to the relaxation problem in logarithmic variables according to the classical theory. Now, if $\eta$ is convex and non-decreasing in each argument, the composition $\eta\circ\exp$ is also convex [boyd2004convex, Section 2.3.4], and hence

\eta(\mathbf{u}^{n+\gamma})=\eta_{\text{old}}

possesses a positive solution $\gamma=1+\mathcal{O}(\Delta t^{p-1})$ .
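The explicit procedure can be sketched as follows. This is only an illustration under stated assumptions: the function names, the quadratic toy functional, and the data are hypothetical, and bisection on a bracket around $\gamma=1$ replaces the scalar solver one would use in practice. Note that $\gamma=0$ always solves the equation, since $\mathbf{u}^{n+0}=\mathbf{u}^{n}$ , so the bracket stays away from zero.

```python
import math
from typing import Callable, List


def geometric_update(u_old: List[float], u_new: List[float], gamma: float) -> List[float]:
    """Component-wise geometric mean (25): (u_new)^gamma * (u_old)^(1 - gamma)."""
    return [math.exp((1.0 - gamma) * math.log(a) + gamma * math.log(b))
            for a, b in zip(u_old, u_new)]


def solve_gamma(eta: Callable[[List[float]], float], u_old: List[float],
                u_new: List[float], eta_old: float,
                lo: float = 0.5, hi: float = 1.5, tol: float = 1e-14) -> float:
    """Bisection for eta(u^{n+gamma}) = eta_old on a bracket around gamma = 1."""
    def g(gamma: float) -> float:
        return eta(geometric_update(u_old, u_new, gamma)) - eta_old

    g_lo = g(lo)
    if g_lo * g(hi) > 0.0:
        raise ValueError("no sign change on the bracket; adjust lo and hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0.0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


def energy(u: List[float]) -> float:
    """Toy functional, convex and non-decreasing for positive arguments."""
    return 0.5 * sum(x * x for x in u)


u_old = [1.0, 2.0]
u_new = [1.2, 1.89]          # made-up positive baseline update
gamma = solve_gamma(energy, u_old, u_new, energy(u_old))
u_relaxed = geometric_update(u_old, u_new, gamma)
print(gamma, u_relaxed, energy(u_relaxed) - energy(u_old))
```

Because the update is an exponential of a linear combination of logarithms, every component of `u_relaxed` is positive for any real `gamma`, which is the point of working in logarithmic variables.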

3.2.2 Implicit Positivity-Preserving Procedure for Conservative PDS

One possible candidate for computing $\mathbf{u}^{n+\gamma}$ is to use dense output formulae recently developed in [izgin2024], which we briefly recall in the upcoming section.

Positivity-Preserving Dense Output

We first focus on Runge–Kutta methods and the MP variant, but the ideas can be carried out for MPSSPRK schemes in a straightforward manner as we will see.

The main idea is to replace $b_{j}\in\mathbb{R}$ by a function $\bar{b}_{j}\colon[0,1]\to\mathbb{R}$ such that

u^{n+\gamma}_{k}=u^{n}_{k}+\Delta t\sum_{j=1}^{s}\bar{b}_{j}(\gamma)\left(r_{k}^{P}(\mathbf{u}^{(j)})-r_{k}^{D}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}\left(p_{k\nu}(\mathbf{u}^{(j)})-d_{k\nu}(\mathbf{u}^{(j)})\right)\right)

approximates $u_{k}(t^{n}+\gamma\Delta t)$ . We impose

\bar{b}_{j}(0)=0\quad\text{and}\quad\bar{b}_{j}(1)=b_{j}

to recover

\mathbf{u}^{n+\gamma}=\begin{cases}\mathbf{u}^{n},&\gamma=0,\\ \mathbf{u}^{n+1},&\gamma=1.\end{cases}

Example 3 (Second-order dense output for MPRK22( $\alpha$ )).

Using $\bar{b}_{j}(\gamma)=\gamma b_{j}$ and

u^{n+\gamma}_{k}=u_{k}^{n}+\Delta t\sum_{j=1}^{s}\bar{b}_{j}(\gamma)\left(r_{k}^{P}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{u_{\nu}^{n+1}}{\sigma_{\nu}}-\left(r^{D}_{k}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{u_{k}^{n+1}}{\sigma_{k}}\right)

yields a positivity-preserving dense output. Indeed, we find $\mathbf{u}^{n+\gamma}=(1-\gamma)\mathbf{u}^{n}+\gamma\mathbf{u}^{n+1}$ in this case, which coincides with the relaxation update. For $\gamma>1$ this is not necessarily positivity-preserving. For our purposes, we want to ensure positivity even for $\gamma>1$ , which can be done using

u^{n+\gamma}_{k}=u_{k}^{n}+\Delta t\sum_{j=1}^{s}\bar{b}_{j}(\gamma)\left(r_{k}^{P}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}-\left(r^{D}_{k}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{u_{k}^{n+\gamma}}{\bar{\sigma}_{k}(\gamma)}\right)

(26)

together with either

\bar{\sigma}_{k}(\gamma)=\sigma_{k}=(u_{k}^{(2)})^{\frac{1}{\alpha}}(u_{k}^{n})^{1-\frac{1}{\alpha}}

(27)

or

\bar{\sigma}_{k}(\gamma)=(u_{k}^{(2)})^{\frac{\gamma}{\alpha}}(u_{k}^{n})^{1-\frac{\gamma}{\alpha}}.

(28)
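The two choices (27) and (28) for $\bar{\bm{\sigma}}(\gamma)$ differ only in whether $\gamma$ enters the exponents. The following sketch, with hypothetical function names and made-up stage values, evaluates both and shows that they coincide at $\gamma=1$ and remain positive for $\gamma>1$ whenever the inputs are positive.

```python
from typing import List


def sigma_fixed(u_stage2: List[float], u_n: List[float], alpha: float) -> List[float]:
    """Choice (27): (u^(2))^(1/alpha) * (u^n)^(1 - 1/alpha), independent of gamma."""
    return [s ** (1.0 / alpha) * u ** (1.0 - 1.0 / alpha)
            for s, u in zip(u_stage2, u_n)]


def sigma_gamma(u_stage2: List[float], u_n: List[float], alpha: float,
                gamma: float) -> List[float]:
    """Choice (28): (u^(2))^(gamma/alpha) * (u^n)^(1 - gamma/alpha)."""
    return [s ** (gamma / alpha) * u ** (1.0 - gamma / alpha)
            for s, u in zip(u_stage2, u_n)]


u_n = [1.0, 2.0]
u_stage2 = [0.9, 2.1]   # made-up positive second-stage values
alpha = 0.5
print(sigma_fixed(u_stage2, u_n, alpha))
for gamma in (0.0, 1.0, 1.5):
    print(gamma, sigma_gamma(u_stage2, u_n, alpha, gamma))
```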

Example 4 (Higher order positive dense output for MPRK).

In general, one may use $\bar{b}_{j}(\gamma)$ from the classical dense output formula paired with the update (26). Then, the only quantity to define is $\bar{\bm{\sigma}}(\gamma)$ . According to [izgin2024], it is sufficient to use a lower order dense output MPRK scheme for the computation of $\bar{\bm{\sigma}}(\gamma)$ . For instance, we note that third-order MPRK schemes are equipped with

\bar{b}_{1}(\gamma)=\gamma-(1-b_{1})\gamma^{2},\quad\bar{b}_{j}(\gamma)=\gamma^{2}b_{j},\quad j=2,\dotsc,s

for the dense output. However, we will discuss a different approach in this work and thus do not recall $\bar{\bm{\sigma}}$ from [izgin2024] here.

Example 5 (Second-order dense output for MPSSPRK).

Let us incorporate the $\gamma$ -dependency in (21), which gives

\displaystyle b_{1}^{[\nu]}(\mathbf{U}^{n},t_{n},\Delta t,\gamma)

\displaystyle=\gamma\left(\alpha\beta\frac{u_{\nu}^{(2)}}{u^{n}_{\nu}}+\beta_{20}\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}\right),\quad b_{2}^{[\nu]}(\mathbf{U}^{n},t_{n},\Delta t,\gamma)=\gamma\beta_{21}\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)},

where we restrict to the choice $\bar{\sigma}_{\nu}(\gamma)=(u_{\nu}^{n})^{1-\gamma s}(u_{\nu}^{(2)})^{\gamma s}$ . Then

\mathbf{u}^{n+\gamma}=\mathbf{u}^{n}+\Delta t\sum_{j=1}^{s}\sum_{\nu=1}^{d}b_{j}^{[\nu]}(\mathbf{U}^{n},t_{n},\Delta t,\gamma)\mathbf{f}^{[\nu]}(\mathbf{u}^{(j)}).

Remark 4 (Use of dense output for relaxation).

As we will show, we can use the dense output from Example 3 for the relaxation algorithm. However, proving the existence of a solution to the relaxation equation for (27) is more complex than for (28), which is due to the respective truncation errors. Moreover, as illustrated in Example 4, the bootstrapping for higher order positive dense output involves higher degree polynomials for $\bar{b}_{j}$ , which may not be positive for $\gamma>1$ . This is crucial since the solvability for the linear systems (16) relies on positive Butcher coefficients. While we could implement a trick [MPDeC] to overcome this issue, we rather focus on a different bootstrapping approach to keep the overall algorithm simple.
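The sign issue mentioned in Remark 4 is easy to observe numerically. The sketch below evaluates the third-order weights from Example 4 for an illustrative weight vector $\mathbf{b}$ (an assumption; any weights summing to one with $0<b_{1}<1$ behave the same way). Since $\bar{b}_{1}(\gamma)=\gamma\bigl(1-(1-b_{1})\gamma\bigr)$ , the first weight changes sign at $\gamma=1/(1-b_{1})>1$ .

```python
from typing import List


def dense_weights_order3(b: List[float], gamma: float) -> List[float]:
    """Weights from Example 4: b1_bar = gamma - (1 - b_1) gamma^2, bj_bar = gamma^2 b_j."""
    first = gamma - (1.0 - b[0]) * gamma ** 2
    return [first] + [gamma ** 2 * b_j for b_j in b[1:]]


b = [1.0 / 6.0, 1.0 / 6.0, 2.0 / 3.0]   # illustrative weights summing to one
for gamma in (0.0, 0.5, 1.0, 1.2, 1.5):
    weights = dense_weights_order3(b, gamma)
    print(gamma, weights, sum(weights))
```

The weights interpolate between zero at $\gamma=0$ and $\mathbf{b}$ at $\gamma=1$ and always sum to $\gamma$ , but the first one is negative for sufficiently large $\gamma>1$ , which is the situation the linear scaling $\bar{b}_{j}(\gamma)=\gamma b_{j}$ avoids when $b_{j}\geq 0$ .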

Preparatory Results for MPRK22( $\alpha$ )

We proceed to develop a relaxation technique using (26)-(27), which is more complicated than using (28) but on the other hand motivates us to derive more general results. To that end, we note that the scheme with (26)-(27) can be written in matrix-vector notation as

$\displaystyle\mathbf{u}^{(1)}$	$\displaystyle=\mathbf{u}^{n}$	(29)
$\displaystyle\mathbf{M}^{(2)}(\mathbf{u}^{n})\mathbf{u}^{(2)}$	$\displaystyle=\mathbf{u}^{n}+\alpha\Delta t\mathbf{r}^{P}(\mathbf{u}^{n})$
$\displaystyle\mathbf{M}_{\gamma}(\mathbf{u}^{n})\mathbf{u}^{n+\gamma}$	$\displaystyle=\mathbf{u}^{n}+\gamma\Delta t\sum_{j=1}^{s}b_{j}\mathbf{r}^{P}(\mathbf{u}^{(j)}),\quad\gamma>0,$

where $\mathbf{M}^{(2)}$ can be obtained from (17) and

\mathbf{M}_{\gamma}=\gamma(\mathbf{M}-\mathbf{I})+\mathbf{I}

(30)

with $\mathbf{M}$ from (18).

Finally, the relaxation step (4) for entropy-conservative problems is now updated to

\begin{pmatrix}t_{\gamma}^{n}\\ \mathbf{M}_{\gamma}(\mathbf{u}_{\text{old}})\mathbf{u}^{n+\gamma}\\ \eta(\mathbf{u}^{n+\gamma})\end{pmatrix}=\begin{pmatrix}t_{\text{old}}\\ \mathbf{u}_{\text{old}}\\ \eta_{\text{old}}\end{pmatrix}+\gamma\begin{pmatrix}t_{\text{new}}-t_{\text{old}}\\ \Delta t\sum_{j=1}^{2}b_{j}\mathbf{r}^{P}(\mathbf{u}^{(j)})\\ 0\end{pmatrix},

(31)

resulting in a coupled linear-nonlinear system for the simultaneous computation of $\gamma$ and $\mathbf{u}^{n+\gamma}$ . Note that if such a $\gamma>0$ exists, the relaxation method for MPRK22( $\alpha$ ) naturally is of the correct order for all $\gamma$ as we are using an appropriate dense output formula.
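For a fixed $\gamma$ , the middle row of (31) is a linear system with the matrix $\mathbf{M}_{\gamma}$ from (30). The following sketch illustrates this linear part only, under simplifying assumptions: a conservative problem with $\mathbf{r}^{P}=\mathbf{0}$ , so that the right-hand side is $\mathbf{u}^{n}$ , and a made-up $2\times 2$ matrix $\mathbf{M}$ with unit column sums and non-positive off-diagonal entries standing in for the matrix from (18). The helper names are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relaxed_matrix(M: Matrix, gamma: float) -> Matrix:
    """M_gamma = gamma * (M - I) + I, see (30)."""
    n = len(M)
    return [[gamma * (M[i][j] - (1.0 if i == j else 0.0)) + (1.0 if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def solve_linear(A: Matrix, b: Vector) -> Vector:
    """Gaussian elimination with partial pivoting (inputs are not modified)."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


M = [[1.3, -0.2], [-0.3, 1.2]]   # made-up matrix with unit column sums
u_n = [1.0, 2.0]
for gamma in (0.5, 1.0, 1.5, 3.0):
    u_gamma = solve_linear(relaxed_matrix(M, gamma), u_n)
    print(gamma, u_gamma, sum(u_gamma))
```

In this toy setting the column sums of $\mathbf{M}_{\gamma}$ stay equal to one and its off-diagonal entries stay non-positive for every $\gamma>0$ , so the solution remains positive and its sum is preserved also for $\gamma>1$ .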

Since we allow for a truncation error of $\mathcal{O}(\Delta t^{3})$ for the second-order MPRK22( $\alpha$ ) scheme, it is beneficial to prove the following

Lemma 1.

If $\bar{b}_{j}(\gamma)=\gamma b_{j}$ and $\bar{\sigma}_{k}(\gamma)=\sigma_{k}$ , then the MPRK22( $\alpha$ ) scheme (26) satisfies

\mathbf{u}^{n+\gamma}=\mathbf{u}^{n+1}+(\gamma-1)\Delta t\mathbf{d}^{n}_{\gamma}+\mathcal{O}(\Delta t^{3}),

(32)

where

	$\displaystyle d^{n}_{\gamma,k}$	$\displaystyle=\frac{u_{k}^{n+1}-u_{k}^{n}}{\Delta t}+\Delta t\gamma\sum_{j=1}^{s}b_{j}\left(\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{f_{\nu}(\mathbf{u}^{n})}{u_{\nu}^{n}}-\left(r_{k}^{D}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{f_{k}(\mathbf{u}^{n})}{u_{k}^{n}}\right)$		(33)
		$\displaystyle=f_{k}(\mathbf{u}^{n})+\mathcal{O}(\Delta t).$		(33)

Proof.

Utilizing [KM2019, Lemma 2, Lemma 3], we observe

\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}=\frac{u_{\nu}^{n+\gamma}}{(u_{\nu}^{(2)})^{\frac{1}{\alpha}}(u_{\nu}^{n})^{1-\frac{1}{\alpha}}}=1+(\gamma-1)\Delta t\frac{f_{\nu}(\mathbf{u}^{n})}{u_{\nu}^{n}}+\mathcal{O}(\Delta t^{2})=\frac{u_{\nu}^{n+1}}{\sigma_{\nu}}+(\gamma-1)\Delta t\frac{f_{\nu}(\mathbf{u}^{n})}{u_{\nu}^{n}}+\mathcal{O}(\Delta t^{2})

(34)

as $\frac{u_{\nu}^{n+1}}{\sigma_{\nu}}=1+\mathcal{O}(\Delta t^{2})$ [NSARK]. Substituting this into (26), we obtain

	$\displaystyle u^{n+\gamma}_{k}$	$\displaystyle=u_{k}^{n}+\gamma\Delta t\sum_{j=1}^{s}b_{j}\left(r_{k}^{P}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}-\left(r^{D}_{k}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{u_{k}^{n+\gamma}}{\bar{\sigma}_{k}(\gamma)}\right)$
		$\displaystyle=u_{k}^{n}+\gamma(u_{k}^{n+1}-u_{k}^{n})$
		$\displaystyle+\gamma(\gamma-1)\Delta t^{2}\sum_{j=1}^{s}b_{j}\left(\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{f_{\nu}(\mathbf{u}^{n})}{u_{\nu}^{n}}-\left(r^{D}_{k}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{f_{k}(\mathbf{u}^{n})}{u_{k}^{n}}\right)+\mathcal{O}(\Delta t^{3})$
		$\displaystyle=u_{k}^{n+1}+(\gamma-1)(u_{k}^{n+1}-u_{k}^{n})$
		$\displaystyle+\gamma(\gamma-1)\Delta t^{2}\sum_{j=1}^{s}b_{j}\left(\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{f_{\nu}(\mathbf{u}^{n})}{u_{\nu}^{n}}-\left(r^{D}_{k}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{f_{k}(\mathbf{u}^{n})}{u_{k}^{n}}\right)+\mathcal{O}(\Delta t^{3})$
		$\displaystyle=u_{k}^{n+1}+(\gamma-1)\Delta td^{n}_{\gamma,k}+\mathcal{O}(\Delta t^{3}).$

∎

Remark 5 (Influence of $\bar{\bm{\sigma}}$ and application to MPSSPRK).

In the situation of Lemma 1, if we use (28) rather than (27), then instead of (34) we obtain

\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}=1+\mathcal{O}(\Delta t^{2})=\frac{u_{\nu}^{n+1}}{\sigma_{\nu}}+\mathcal{O}(\Delta t^{2})

using the same technique, and finally

\mathbf{u}^{n+\gamma}=\mathbf{u}^{n+1}+(\gamma-1)\Delta t\underbrace{\frac{\mathbf{u}^{n+1}-\mathbf{u}^{n}}{\Delta t}}_{\eqqcolon\mathbf{d}^{n}}+\mathcal{O}(\Delta t^{3}).

(35)

Here $\mathbf{d}^{n}_{\gamma}=\mathbf{d}^{n}$ is independent of $\gamma$ . We note that the PWDs for MPSSPRK2 are similar to the MPRK case, see Section 2.2.3. Indeed, one can show that (35) also holds for the second-order MPSSPRK family.

While the scheme using (26)-(27) is the motivation for the general formulation of our main result in the upcoming section, the scheme (26),(28) will be the basis for the bootstrapping algorithm to obtain higher order, see Section 3.2.2.

Main Result for Entropy Conservation and Positivity Preservation

We assume that the method can be written in the form

\mathbf{u}^{n+\gamma}=\mathbf{u}^{n+1}+(\gamma-1)\Delta t\mathbf{d}^{n}_{\gamma}(\Delta t)+\mathcal{O}(\Delta t^{p+1})

with $p\geq 2$ being the order of the baseline method and some suitable $\mathbf{d}^{n}_{\gamma}(\Delta t)$ depending on the method. Now, since $\mathbf{d}$ generally also depends on $\gamma$ , we need the preparatory

Lemma 2.

Let $h\colon U\times V\to\mathbb{R}$ ,

h(\lambda,\gamma)\coloneqq\lambda-(\gamma-1)\Delta t\lVert\mathbf{d}_{\gamma}^{n}(\Delta t)\rVert_{2}

(36)

where $U\times V$ is an open neighborhood of $(0,1)$ .

If $\mathbf{d}_{\gamma}^{n}(\Delta t)$ is a $\mathcal{C}^{1}$ map on $V$ w.r.t. $\gamma$ with $\lVert\mathbf{d}_{1}^{n}(\Delta t)\rVert_{2}\neq 0$ for $\Delta t$ small enough, then there exist a neighborhood $\widetilde{U}$ of $0$ and a continuous function $\widetilde{\gamma}\colon\widetilde{U}\to\mathbb{R}$ such that $h(\lambda,\widetilde{\gamma}(\lambda))=0$ for all $\lambda\in\widetilde{U}$ and $\Delta t>0$ small enough.

Proof.

We have $h(0,1)=0$ and

\partial_{\gamma}h(\lambda,\gamma)=-\Delta t\lVert\mathbf{d}_{\gamma}^{n}(\Delta t)\rVert_{2}-(\gamma-1)\Delta t\frac{(\partial_{\gamma}\mathbf{d}_{\gamma}^{n}(\Delta t))^{T}\mathbf{d}_{\gamma}^{n}(\Delta t)}{\lVert\mathbf{d}_{\gamma}^{n}(\Delta t)\rVert_{2}}.

Thus, $\partial_{\gamma}h(0,1)=-\Delta t\lVert\mathbf{d}_{1}^{n}(\Delta t)\rVert_{2}\neq 0$ for $\Delta t$ small enough. The assertion then follows from the implicit function theorem. ∎

With that, we are positioned to prove the main theorem for the relaxation technique.

Theorem 2.

In the situation of Lemma 2, let

\mathbf{u}^{n+\gamma}=\mathbf{u}^{n+1}+(\gamma-1)\Delta t\mathbf{d}^{n}_{\gamma}(\Delta t)+\mathcal{O}(\Delta t^{p+1})

with $p\geq 2$ being the order of the baseline method and suppose $\eta\in\mathcal{C}^{2}$ with

\eta^{\prime}(\mathbf{u}^{n+1})\cdot\frac{\mathbf{d}^{n}_{\widetilde{\gamma}(\Delta t\mu)}(\Delta t)}{\lVert\mathbf{d}^{n}_{\widetilde{\gamma}(\Delta t\mu)}(\Delta t)\rVert_{2}}=c(\mathbf{u}^{n},\mu)\Delta t+\mathcal{O}(\Delta t^{2})

for $\lvert\mu\rvert$ small enough and

\lim_{\mu\to 0}c(\mathbf{u}^{n},\mu)\eqqcolon\widetilde{c}(\mathbf{u}^{n})\neq 0.

Then the equation

\eta(\mathbf{u}^{n+\gamma})-\eta_{\text{old}}=0

possesses a positive solution $\gamma$ . Furthermore, if $\lVert\mathbf{d}_{\gamma}^{n}(\Delta t)\rVert_{2}=\mathcal{O}(\Delta t^{q})$ , then there exists a unique positive solution satisfying $\gamma=1+\mathcal{O}(\Delta t^{p-1-q})$ .

Proof.

We set

\mathbf{w}_{\gamma}^{n}(\Delta t)\coloneqq\frac{\mathbf{d}_{\gamma}^{n}(\Delta t)}{\lVert\mathbf{d}_{\gamma}^{n}(\Delta t)\rVert_{2}}

and follow the proof of [CLM2010, Theorem 2] by analyzing the function

z(\Delta t,\mu)\coloneqq\Delta t^{-2}\left(\eta\left(\mathbf{u}^{n+\widetilde{\gamma}(\Delta t\mu)}\right)-\eta_{\text{old}}\right),\quad\Delta t\neq 0.

(37)

The idea is to deduce that

z(\Delta t,\mu)=\mu c(\mathbf{u}^{n},\mu)+\frac{\mu^{2}}{2}(\mathbf{w}^{n}_{\widetilde{\gamma}(\Delta t\mu)}(\Delta t))^{T}H_{\eta}(\mathbf{u}^{n+1})\mathbf{w}^{n}_{\widetilde{\gamma}(\Delta t\mu)}(\Delta t)+\mathcal{O}(\Delta t),

(38)

where $H_{\eta}$ denotes the Hessian. Then, since $\widetilde{\gamma}(0)=1$ , we have

z(0,\mu)\coloneqq\lim_{\Delta t\to 0}z(\Delta t,\mu)=\mu\widetilde{c}(\mathbf{u}^{n})+\frac{\mu^{2}}{2}(\mathbf{w}^{n}_{1}(0))^{T}H_{\eta}(\mathbf{u}^{n})\mathbf{w}^{n}_{1}(0)\quad\text{and}\quad\partial_{\mu}z(0,0)=\widetilde{c}(\mathbf{u}^{n})\neq 0.

According to the proof of [CLM2010, Theorem 2], there exist $\Delta t^{*}>0$ and a unique function $\mu\colon[0,\Delta t^{*}]\to\mathbb{R}$ such that $z(\Delta t,\mu(\Delta t))=0$ for all $0\leq\Delta t\leq\Delta t^{*}$ . Indeed, because of

z(\Delta t,0)=\Delta t^{-2}\left(\eta(\mathbf{u}^{n+1})-\eta_{\text{old}}\right)

it can be deduced along the same lines that $\mu=\mathcal{O}(\Delta t^{p-1})$ .

To prove (38) we first note that

\mathbf{u}^{n+\gamma}=\mathbf{u}^{n+1}+\lambda\mathbf{w}_{\gamma}^{n}(\Delta t)+\mathcal{O}(\Delta t^{p+1}),

(39)

where

\lambda\coloneqq(\gamma-1)\Delta t\lVert\mathbf{d}_{\gamma}^{n}(\Delta t)\rVert_{2}.

(40)

Next, as $h(\lambda,\gamma)=0$ we can use Lemma 2 to solve (40) for $\gamma$ and plug it into (39) resulting in a function of $\lambda$ only, i.e.

\mathbf{u}^{n+\widetilde{\gamma}(\lambda)}=\mathbf{u}^{n+1}+\lambda\mathbf{w}_{\widetilde{\gamma}(\lambda)}^{n}(\Delta t)+\mathcal{O}(\Delta t^{p+1}).

In the following, we denote by $\mathbf{u}(t)$ the exact local solution at $t$ satisfying $\mathbf{u}(t_{n})=\mathbf{u}^{n}$ and recall that we are considering an entropy-conservative problem. Hence, with $\lambda\eqqcolon\Delta t\mu$ and the assumptions on $\eta^{\prime}$ , we obtain

	$\displaystyle z(\Delta t,\mu)$	$\displaystyle=\Delta t^{-2}\left(\eta(\mathbf{u}^{n+\widetilde{\gamma}(\Delta t\mu)})-\eta(\mathbf{u}(t_{n}+\Delta t))\right)$
		$\displaystyle=\Delta t^{-2}\left(\eta(\mathbf{u}^{n+1}+\Delta t\mu\mathbf{w}_{\widetilde{\gamma}(\Delta t\mu)}^{n}(\Delta t))-\eta(\mathbf{u}(t_{n}+\Delta t))\right)+\mathcal{O}(\Delta t^{p-1})$
		$\displaystyle=\frac{\eta(\mathbf{u}^{n+1})-\eta(\mathbf{u}(t_{n}+\Delta t))+\Delta t\mu\eta^{\prime}(\mathbf{u}^{n+1})\mathbf{w}_{\widetilde{\gamma}(\Delta t\mu)}^{n}(\Delta t)}{\Delta t^{2}}$
		$\displaystyle\hphantom{==}+\frac{\mu^{2}}{2}(\mathbf{w}^{n}_{\widetilde{\gamma}(\Delta t\mu)}(\Delta t))^{T}H_{\eta}(\mathbf{u}^{n+1})\mathbf{w}^{n}_{\widetilde{\gamma}(\Delta t\mu)}(\Delta t)+\mathcal{O}(\Delta t)$
		$\displaystyle=\mu c(\mathbf{u}^{n},\mu)+\frac{\mu^{2}}{2}(\mathbf{w}^{n}_{\widetilde{\gamma}(\Delta t\mu)}(\Delta t))^{T}H_{\eta}(\mathbf{u}^{n+1})\mathbf{w}^{n}_{\widetilde{\gamma}(\Delta t\mu)}(\Delta t)+\mathcal{O}(\Delta t).$

Finally, since $\mathcal{O}(\Delta t^{p-1})=\mu=\frac{\lambda}{\Delta t}=(\gamma-1)\lVert\mathbf{d}_{\gamma}^{n}(\Delta t)\rVert_{2}$ we deduce that $\gamma=1+\mathcal{O}(\Delta t^{p-1-q})$ . ∎

Now if we are given such a solution $\gamma=1+\mathcal{O}(\Delta t^{p-1})$ we can deduce that the relaxation update is of order $p$ as the following lemma shows.

Lemma 3.

If $\mathbf{u}^{n+\gamma}=\mathbf{u}^{n+1}+(\gamma-1)(\mathbf{u}^{n+1}-\mathbf{u}^{n})+\mathcal{O}(\Delta t^{p+1})$ with a $p$ -th order baseline method and $\gamma=1+\mathcal{O}(\Delta t^{p-1})$ , then

\mathbf{u}^{n+\gamma}=\mathbf{u}(t_{n}+\gamma\Delta t)+\mathcal{O}(\Delta t^{p+1}).

Proof.

This is just [ranocha2020general, Lemma 2.7], where the relaxation method is perturbed by an additive error of $\mathcal{O}(\Delta t^{p+1})$ . ∎

Bootstrapping Algorithm for Positivity-Preserving Relaxation

The main idea for generalizing the relaxation technique to higher order is to use the observation from Remark 5, where

\mathbf{u}^{n+\gamma}=\mathbf{u}^{n+1}+(\gamma-1)\Delta t\underbrace{\frac{\mathbf{u}^{n+1}-\mathbf{u}^{n}}{\Delta t}}_{\eqqcolon\mathbf{d}^{n}}+\mathcal{O}(\Delta t^{p+1}).

Since $\mathbf{d}^{n}$ is now independent of $\gamma$ , there is no need for Lemma 2: we have the explicit expression $\widetilde{\gamma}(\lambda)=\frac{\lambda+\Delta t\lVert\mathbf{d}^{n}\rVert_{2}}{\Delta t\lVert\mathbf{d}^{n}\rVert_{2}}$ , and we can apply Theorem 2, which gives $\gamma=1+\mathcal{O}(\Delta t^{p-1})$ . Moreover, Remark 4 motivates us to start a bootstrapping algorithm using the functions $\bar{b}_{j}(\gamma)=\gamma b_{j}$ for higher order as well. This seems to contradict the theory on dense output, but in combination with relaxation the issue can be resolved, see Remark 6 below.

Now in view of the following lemma, we can bootstrap the relaxation technique to higher orders.

Lemma 4.

Consider a scheme of the form (26) with $\bar{b}_{j}(\gamma)=\gamma b_{j}$ and

\displaystyle\bar{\bm{\sigma}}(\gamma)=\mathbf{u}(t_{n}+\Delta t)+(\gamma-1)(\mathbf{u}(t_{n}+\Delta t)-\mathbf{u}(t_{n}))+\mathcal{O}(\Delta t^{p})

(41)

for all $\gamma$ in a neighborhood $V$ of $1$ . Then

\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}=1+\mathcal{O}(\Delta t^{p})=\frac{u_{\nu}^{n+1}}{\sigma_{\nu}}+\mathcal{O}(\Delta t^{p}),

and in particular,

\mathbf{u}^{n+\gamma}=\mathbf{u}^{n+1}+(\gamma-1)(\mathbf{u}^{n+1}-\mathbf{u}^{n})+\mathcal{O}(\Delta t^{p+1})

for all $\gamma\in V$ .

Remark 6.

Before we prove this lemma, we want to stress two things. First, using (41) with $\bar{b}_{j}(\gamma)=\gamma b_{j}$ does not result in a higher-order dense output formula, but only guarantees to obtain the desired order at the root $\gamma$ of $\eta(\mathbf{u}^{n+\gamma})=\eta_{\text{old}}$ , see Theorem 2 and Lemma 3.

Secondly, the bootstrapping process consists of using $\mathbf{u}^{n+\gamma}$ from the scheme of order $p-1$ as the new $\bar{\bm{\sigma}}(\gamma)$ resulting in a new method of order $p$ . We can start the bootstrapping process by using the second-order MPRK22( $\alpha$ ) scheme as a baseline method together with (28). Note that this naturally results in nested functions that depend on $\gamma$ , which should be kept in mind when implementing the Newton iteration.

Proof of Lemma 4.

We prove this claim by induction over $q=1,\dotsc,p$ and exploit [IzginThesis, Lemma 4.6,Lemma 4.8] to justify the implication

\mathbf{u}^{n+\gamma}=\bar{\bm{\sigma}}(\gamma)+\mathcal{O}(\Delta t^{q})\Longrightarrow\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}=1+\mathcal{O}(\Delta t^{q}),\quad\nu=1,\dotsc,N.

For $q=1$ we find $\mathbf{u}^{n+\gamma}=\mathbf{u}^{n}+\mathcal{O}(\Delta t)$ and $\bar{\bm{\sigma}}(\gamma)=\mathbf{u}^{n}+\mathcal{O}(\Delta t)$ since $\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}=\mathcal{O}(1)$ due to [IzginThesis, Lemma 4.6].

Now suppose that (41) holds with $\mathcal{O}(\Delta t^{q})$ and some $q\geq 2$ . By the induction hypothesis we have

\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}=1+\mathcal{O}(\Delta t^{q-1})=\frac{u_{\nu}^{n+1}}{\sigma_{\nu}}+\mathcal{O}(\Delta t^{q-1})

where we used $\frac{u_{\nu}^{n+1}}{\sigma_{\nu}}=1+\mathcal{O}(\Delta t^{p})$ and $p\geq q$ for the last equality, see e.g. [NSARK, Lemma 16]. Substituting this into (26), we see

\mathbf{u}^{n+\gamma}=\mathbf{u}^{n+1}+(\gamma-1)(\mathbf{u}^{n+1}-\mathbf{u}^{n})+\mathcal{O}(\Delta t^{q}).

Finally, since (41) holds with $\mathcal{O}(\Delta t^{q})$ , we deduce

\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}=1+\mathcal{O}(\Delta t^{q})=\frac{u_{\nu}^{n+1}}{\sigma_{\nu}}+\mathcal{O}(\Delta t^{q}).

∎

Example 6 (Third-Order Relaxation for Conservative Problems using MPRK).

Looking at the third-order MPRK family from Example 2, the relaxation scheme is fully defined by setting

	$\displaystyle\bar{\sigma}_{k}(\gamma)=$	$\displaystyle u_{k}^{n}+\gamma\Delta t\sum_{j=1}^{2}\beta_{j}\left(r_{k}^{P}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{\bar{\sigma}_{\nu}(\gamma)}{(u_{\nu}^{(2)})^{\frac{\gamma}{a_{21}}}(u_{\nu}^{n})^{1-\frac{\gamma}{a_{21}}}}\right.$		(42)
		$\displaystyle\hskip 56.9055pt-\left.\left(r_{k}^{D}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{\bar{\sigma}_{k}(\gamma)}{(u_{k}^{(2)})^{\frac{\gamma}{a_{21}}}(u_{k}^{n})^{1-\frac{\gamma}{a_{21}}}}\right)$		(42)

for $k=1,\dotsc,d$ , $\beta_{1}=1-\beta_{2}$ , and $\beta_{2}=\frac{1}{2a_{21}}$ .

Applying Newton’s Method

As an illustrative example, we focus on (26) with $\bar{b}_{j}(\gamma)=\gamma b_{j}$ , i.e.,

u^{n+\gamma}_{k}=u_{k}^{n}+\Delta t\gamma\sum_{j=1}^{s}b_{j}\left(r_{k}^{P}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}-\left(r^{D}_{k}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{u_{k}^{n+\gamma}}{\bar{\sigma}_{k}(\gamma)}\right),

(43)

and note that the entries of $\mathbf{M}_{\gamma}$ in (31) are given by

	$\displaystyle(\mathbf{M}_{\gamma})_{kk}$	$\displaystyle=1+\gamma\Delta t\sum_{j=1}^{s}b_{j}\left(r_{k}^{D}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{1}{\bar{\sigma}_{k}(\gamma)},$		(44)
	$\displaystyle(\mathbf{M}_{\gamma})_{k\nu}$	$\displaystyle=-\gamma\Delta t\sum_{j=1}^{s}b_{j}p_{k\nu}(\mathbf{u}^{(j)})\frac{1}{\bar{\sigma}_{\nu}(\gamma)},\quad k\neq\nu.$		(44)

Starting with (43) we deduce

	$\displaystyle\frac{\mathrm{d}}{\mathrm{d}\gamma}u^{n+\gamma}_{k}$	$\displaystyle=\Delta t\sum_{j=1}^{s}\bar{b}_{j}^{\prime}(\gamma)\left(r_{k}^{P}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\frac{u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}-\left(r^{D}_{k}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\frac{u_{k}^{n+\gamma}}{\bar{\sigma}_{k}(\gamma)}\right)$
		$\displaystyle+\Delta t\sum_{j=1}^{s}\bar{b}_{j}(\gamma)\left(\sum_{\nu=1}^{d}p_{k\nu}(\mathbf{u}^{(j)})\left(\frac{\frac{\mathrm{d}}{\mathrm{d}\gamma}u_{\nu}^{n+\gamma}}{\bar{\sigma}_{\nu}(\gamma)}-\frac{u_{\nu}^{n+\gamma}\bar{\sigma}_{\nu}^{\prime}(\gamma)}{(\bar{\sigma}_{\nu}(\gamma))^{2}}\right)\right.$
		$\displaystyle\left.-\left(r^{D}_{k}(\mathbf{u}^{(j)})+\sum_{\nu=1}^{d}d_{k\nu}(\mathbf{u}^{(j)})\right)\left(\frac{\frac{\mathrm{d}}{\mathrm{d}\gamma}u_{k}^{n+\gamma}}{\bar{\sigma}_{k}(\gamma)}-\frac{u_{k}^{n+\gamma}\bar{\sigma}_{k}^{\prime}(\gamma)}{(\bar{\sigma}_{k}(\gamma))^{2}}\right)\right).$

To rewrite this in matrix-vector notation, we denote by “ $\oslash$ ” and “ $\odot$ ” the component-wise division and multiplication (Hadamard division and product), respectively. Then, using

\mathbf{v}(\gamma)\coloneqq\mathbf{u}^{n+\gamma}\odot\bar{\bm{\sigma}}^{\prime}(\gamma)\oslash(\bar{\bm{\sigma}}(\gamma)),

we end up with

\displaystyle\mathbf{M}_{\gamma}(\mathbf{u}^{n})\frac{\mathrm{d}}{\mathrm{d}\gamma}\mathbf{u}^{n+\gamma}=

\displaystyle\frac{1}{\gamma}(\mathbf{u}^{n+\gamma}-\mathbf{u}^{n})+(\mathbf{M}_{\gamma}(\mathbf{u}^{n})-\mathbf{I})\mathbf{v}(\gamma).

(45)

Note that if $\bar{\bm{\sigma}}(\gamma)=\bm{\sigma}$ , then $\mathbf{v}(\gamma)=\mathbf{0}$ . Also note that the derivative of $\bar{\bm{\sigma}}$ from (42) itself satisfies an equation analogous to (45), as it represents an MPRK22( $\alpha$ ) relaxation method of order $2$ .

Also, the system for MPSSPRK22 is the same; one only has to plug in the expressions $\mathbf{M}_{\gamma}$ and $\bar{\bm{\sigma}}$ for that particular scheme.
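A Newton iteration for $\gamma$ based on (45) can be sketched as follows for the simplest case $\bar{\bm{\sigma}}(\gamma)=\bm{\sigma}$ , where $\mathbf{v}(\gamma)=\mathbf{0}$ . The sketch rests on several assumptions that are not part of the method itself: $\mathbf{r}^{P}=\mathbf{0}$ , a made-up constant $2\times 2$ matrix $\mathbf{M}$ with $\mathbf{M}_{\gamma}=\gamma(\mathbf{M}-\mathbf{I})+\mathbf{I}$ , and a toy functional chosen only so that the scalar equation has a root near $\gamma=1$ (it is not an entropy of any model considered here). All function names are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def solve_linear(A: Matrix, b: Vector) -> Vector:
    """Gaussian elimination with partial pivoting (inputs are not modified)."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def relaxed_matrix(M: Matrix, gamma: float) -> Matrix:
    """M_gamma = gamma * (M - I) + I."""
    n = len(M)
    return [[gamma * (M[i][j] - (1.0 if i == j else 0.0)) + (1.0 if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def state_and_derivative(M: Matrix, u_n: Vector, gamma: float) -> Tuple[Vector, Vector]:
    """Solve M_gamma u = u_n, then (45) with v = 0: M_gamma u' = (u - u_n) / gamma."""
    M_gamma = relaxed_matrix(M, gamma)
    u = solve_linear(M_gamma, u_n)
    rhs = [(a - b) / gamma for a, b in zip(u, u_n)]
    return u, solve_linear(M_gamma, rhs)


def newton_gamma(eta: Callable[[Vector], float], grad_eta: Callable[[Vector], Vector],
                 M: Matrix, u_n: Vector, eta_old: float, gamma: float = 1.0,
                 tol: float = 1e-13, max_iter: int = 50) -> float:
    """Newton's method for eta(u^{n+gamma}) - eta_old = 0, started at gamma = 1."""
    for _ in range(max_iter):
        u, du = state_and_derivative(M, u_n, gamma)
        residual = eta(u) - eta_old
        slope = sum(g * d for g, d in zip(grad_eta(u), du))  # chain rule
        step = residual / slope
        gamma -= step
        if abs(step) < tol:
            break
    return gamma


def toy_eta(u: Vector) -> float:
    return (u[0] - 1.04) ** 2


def toy_grad_eta(u: Vector) -> Vector:
    return [2.0 * (u[0] - 1.04), 0.0]


M = [[1.3, -0.2], [-0.3, 1.2]]
u_n = [1.0, 2.0]
gamma = newton_gamma(toy_eta, toy_grad_eta, M, u_n, toy_eta(u_n))
print(gamma, state_and_derivative(M, u_n, gamma)[0])
```

Each Newton step costs two linear solves with the same matrix $\mathbf{M}_{\gamma}$ , one for the state and one for its derivative; for $\gamma$ -dependent $\bar{\bm{\sigma}}$ the term involving $\mathbf{v}(\gamma)$ in (45) would have to be added to the second right-hand side.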

4 Numerical Experiments

In this section, we apply our new relaxation algorithm to dissipative and conservative problems to validate our theoretical findings and to experimentally test the constraints on $\Delta t$ for solving the system (31). Unless stated otherwise, we use Newton's method for the computation of $\gamma$ and (27) as the default for MPRK22( $\alpha$ ). In addition, we may use a PID controller with parameters $\beta_{1}=0.7$ , $\beta_{2}=0.4$ , and $\beta_{3}=0$ , see [IR2023] for more details; the resulting method is denoted by MPRK22adap. Our implementation of the relaxation algorithm, which can be found in our reproducibility repository [IRS2026repository], is also adaptive in the sense that successful relaxation steps increase $\Delta t$ by 1%, while unsuccessful searches for $\gamma$ result in a 10% decrease of the time step size. We refer to the repository [IRS2026repository] for the implemented termination criteria.
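The simple step-size rule tied to the success of the search for $\gamma$ can be written down directly. The following is only a sketch of that rule with a hypothetical function name; it is not taken from the repository and leaves out the PID controller and the termination criteria.

```python
def adapt_time_step(dt: float, relaxation_succeeded: bool,
                    growth: float = 1.01, shrink: float = 0.90) -> float:
    """Increase dt by 1% after a successful relaxation step, decrease it by 10% otherwise."""
    return dt * (growth if relaxation_succeeded else shrink)


dt = 0.1
for succeeded in (True, True, False, True):
    dt = adapt_time_step(dt, succeeded)
    print(succeeded, dt)
```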

4.1 Lotka-Volterra System

The classical Lotka-Volterra system

	$\displaystyle u_{1}^{\prime}(t)$	$\displaystyle=2u_{1}(t)-u_{1}(t)u_{2}(t),$
	$\displaystyle u_{2}^{\prime}(t)$	$\displaystyle=u_{1}(t)u_{2}(t)-u_{2}(t),\quad\mathbf{u}(0)=(2,2)^{T}$

can be written as a non-conservative PDRS with

r_{1}^{P}=2u_{1},\quad p_{21}=u_{1}u_{2}=d_{12},\quad r_{2}^{D}=u_{2}.

The entropy

\eta(\mathbf{u})=\log(u_{1})-u_{1}+2\log(u_{2})-u_{2}

is conserved. Since the Lotka-Volterra system has periodic orbits, we expect improved numerical results using relaxation to conserve the entropy [cano1997error, cano1998error, calvo2011error].
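The conservation of $\eta$ can be checked directly from $\eta^{\prime}(\mathbf{u})\mathbf{f}(\mathbf{u})=(1-u_{1})(2-u_{2})+(2-u_{2})(u_{1}-1)=0$ . The short sketch below, with hypothetical function names, evaluates this product at a few positive sample points.

```python
import math
from typing import List


def rhs(u: List[float]) -> List[float]:
    """Right-hand side of the Lotka-Volterra system above."""
    u1, u2 = u
    return [2.0 * u1 - u1 * u2, u1 * u2 - u2]


def eta(u: List[float]) -> float:
    """Entropy log(u1) - u1 + 2 log(u2) - u2."""
    u1, u2 = u
    return math.log(u1) - u1 + 2.0 * math.log(u2) - u2


def grad_eta(u: List[float]) -> List[float]:
    """Gradient of eta; its components change sign at u1 = 1 and u2 = 2."""
    u1, u2 = u
    return [1.0 / u1 - 1.0, 2.0 / u2 - 1.0]


def entropy_production(u: List[float]) -> float:
    """eta'(u) f(u), which vanishes for an entropy-conservative problem."""
    return sum(g * f for g, f in zip(grad_eta(u), rhs(u)))


for u in ([2.0, 2.0], [0.5, 3.0], [1.7, 0.4]):
    print(u, eta(u), entropy_production(u))
```

The sign change of the gradient components also shows that $\eta$ is not monotone in each argument on the positive orthant.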

Figure 1: Numerical solution of the Lotka-Volterra problem using MPRK22(1) (top) and $\Delta t=1$ . The error is $\text{err}^{n}=\max(\lvert u_{1}^{n}-u_{1}^{\text{ref},n}\rvert,\lvert u_{2}^{n}-u_{2}^{\text{ref},n}\rvert)$ . Left: without relaxation. Right: with relaxation.

Although there are only positivity constraints, $\eta$ is not non-decreasing for all $\mathbf{u}>\mathbf{0}$ in this example, which is why we use the default relaxation algorithm. As expected, we observe that relaxation improves the error growth of the base method from quadratic to linear, see Figure 1.

4.2 Stratospheric Reaction Problem

The stratospheric reaction problem [sandu2001positive] is a stiff system of ODEs $\mathbf{w}^{\prime}(t)=\mathbf{f}(t,\mathbf{w}(t))$ describing the interaction of the constituents $\mathbf{w}=(w_{1},\dotsc,w_{6})=(O^{1D},O,O_{3},O_{2},NO,NO_{2})$ . This non-conservative PDS possesses two linear invariants determined by the vectors $\widetilde{\mathbf{n}}_{1}=(1,1,3,2,1,2)^{T}$ and $\widetilde{\mathbf{n}}_{2}=(0,0,0,0,1,1)^{T}$ . In order to apply MPRK schemes to this problem, we scale the corresponding differential equations writing

\operatorname{diag}(\widetilde{\mathbf{n}}_{1})\mathbf{w}^{\prime}(t)=\operatorname{diag}(\widetilde{\mathbf{n}}_{1})\mathbf{f}(t,\operatorname{diag}(\widetilde{\mathbf{n}}_{1})^{-1}\operatorname{diag}(\widetilde{\mathbf{n}}_{1})\mathbf{w}(t)).

Hence, introducing $\mathbf{u}(t)=\operatorname{diag}(\widetilde{\mathbf{n}}_{1})\mathbf{w}(t)$ , the two linear invariants of the differential equations $\mathbf{u}^{\prime}(t)=\mathbf{f}(t,\operatorname{diag}(\widetilde{\mathbf{n}}_{1})^{-1}\mathbf{u}(t))$ are $\mathbf{n}_{1}=(1,1,1,1,1,1)^{T}$ and $\mathbf{n}_{2}=(0,0,0,0,1,\tfrac{1}{2})^{T}$ . Moreover, the scaled system takes the form

$\displaystyle u_{1}^{\prime}$	$\displaystyle=\tfrac{1}{3}r_{5}(t,\mathbf{u})-(r_{6}(\mathbf{u})+\tfrac{1}{3}r_{7}(\mathbf{u})),$	(46)
$\displaystyle u_{2}^{\prime}$	$\displaystyle=r_{1}(t,\mathbf{u})+\tfrac{1}{3}r_{3}(t,\mathbf{u})+r_{6}(\mathbf{u})+\tfrac{1}{2}r_{10}(t,\mathbf{u})-(\tfrac{1}{2}r_{2}(\mathbf{u})+\tfrac{1}{3}r_{4}(\mathbf{u})+\tfrac{1}{2}r_{9}(\mathbf{u})+r_{11}(\mathbf{u})),$
$\displaystyle u_{3}^{\prime}$	$\displaystyle=\tfrac{3}{2}r_{2}(\mathbf{u})-(r_{3}(t,\mathbf{u})+r_{4}(\mathbf{u})+r_{5}(t,\mathbf{u})+r_{7}(\mathbf{u})+r_{8}(\mathbf{u})),$
$\displaystyle u_{4}^{\prime}$	$\displaystyle=\tfrac{2}{3}r_{3}(t,\mathbf{u})+\tfrac{4}{3}r_{4}(\mathbf{u})+\tfrac{2}{3}r_{5}(t,\mathbf{u})+\tfrac{4}{3}r_{7}(\mathbf{u})+\tfrac{2}{3}r_{8}(\mathbf{u})+r_{9}(\mathbf{u})-(r_{1}(t,\mathbf{u})+r_{2}(\mathbf{u})),$
$\displaystyle u_{5}^{\prime}$	$\displaystyle=\tfrac{1}{2}r_{9}(\mathbf{u})+\tfrac{1}{2}r_{10}(t,\mathbf{u})-(\tfrac{1}{3}r_{8}(\mathbf{u})+r_{11}(\mathbf{u})),$
$\displaystyle u_{6}^{\prime}$	$\displaystyle=\tfrac{2}{3}r_{8}(\mathbf{u})+2r_{11}(\mathbf{u})-(r_{9}(\mathbf{u})+r_{10}(t,\mathbf{u})),$

where

$\displaystyle r_{1}(t,\mathbf{u})$	$\displaystyle=k_{1}(t)u_{4},$	$\displaystyle k_{1}(t)$	$\displaystyle=\sigma(T(t))^{3}\cdot 2.643\cdot 10^{-10},$
$\displaystyle r_{2}(t,\mathbf{u})$	$\displaystyle=k_{2}u_{2}u_{4},$	$\displaystyle k_{2}$	$\displaystyle=8.018\cdot 10^{-17},$
$\displaystyle r_{3}(t,\mathbf{u})$	$\displaystyle=k_{3}(t)u_{3},$	$\displaystyle k_{3}(t)$	$\displaystyle=\sigma(T(t))\cdot 6.120\cdot 10^{-4},$
$\displaystyle r_{4}(t,\mathbf{u})$	$\displaystyle=k_{4}u_{2}u_{3},$	$\displaystyle k_{4}$	$\displaystyle=1.576\cdot 10^{-15},$
$\displaystyle r_{5}(t,\mathbf{u})$	$\displaystyle=k_{5}(t)u_{3},$	$\displaystyle k_{5}(t)$	$\displaystyle=\sigma(T(t))^{2}\cdot 1.070\cdot 10^{-3},$
$\displaystyle r_{6}(t,\mathbf{u})$	$\displaystyle=k_{6}Mu_{1},$	$\displaystyle k_{6}$	$\displaystyle=7.110\cdot 10^{-11},$	$\displaystyle M$	$\displaystyle=8.120\cdot 10^{16},$
$\displaystyle r_{7}(t,\mathbf{u})$	$\displaystyle=k_{7}u_{1}u_{3},$	$\displaystyle k_{7}$	$\displaystyle=1.200\cdot 10^{-10},$
$\displaystyle r_{8}(t,\mathbf{u})$	$\displaystyle=k_{8}u_{3}u_{5},$	$\displaystyle k_{8}$	$\displaystyle=6.062\cdot 10^{-15},$
$\displaystyle r_{9}(t,\mathbf{u})$	$\displaystyle=k_{9}u_{2}u_{6},$	$\displaystyle k_{9}$	$\displaystyle=1.069\cdot 10^{-11},$
$\displaystyle r_{10}(t,\mathbf{u})$	$\displaystyle=k_{10}(t)u_{6},$	$\displaystyle k_{10}(t)$	$\displaystyle=\sigma(T(t))\cdot 1.289\cdot 10^{-2},$
$\displaystyle r_{11}(t,\mathbf{u})$	$\displaystyle=k_{11}u_{2}u_{5},$	$\displaystyle k_{11}$	$\displaystyle=10^{-8}$

as well as

	$\displaystyle\sigma(T(t))$	$\displaystyle=$
	$\displaystyle T(t)$	$\displaystyle=\frac{t}{3600}\mod 24,\quad T_{r}=4.5,\quad T_{s}=19.5.$

The non-zero production and destruction terms of the system (46) are given by

$\displaystyle d_{12}(t,\mathbf{u})$	$\displaystyle=r_{6}(\mathbf{u}),$	$\displaystyle d_{14}(t,\mathbf{u})$	$\displaystyle=\tfrac{1}{3}r_{7}(\mathbf{u}),$	$\displaystyle d_{23}(t,\mathbf{u})$	$\displaystyle=\tfrac{1}{2}r_{2}(\mathbf{u}),$
$\displaystyle d_{24}(t,\mathbf{u})$	$\displaystyle=\tfrac{1}{3}r_{4}(\mathbf{u}),$	$\displaystyle d_{25}(t,\mathbf{u})$	$\displaystyle=\tfrac{1}{2}r_{9}(\mathbf{u}),$	$\displaystyle d_{26}(t,\mathbf{u})$	$\displaystyle=r_{11}(\mathbf{u}),$
$\displaystyle d_{31}(t,\mathbf{u})$	$\displaystyle=\tfrac{1}{3}r_{5}(t,\mathbf{u}),$	$\displaystyle d_{32}(t,\mathbf{u})$	$\displaystyle=\tfrac{1}{3}r_{3}(t,\mathbf{u}),$	$\displaystyle d_{36}(t,\mathbf{u})$	$\displaystyle=\tfrac{1}{3}r_{8}(\mathbf{u}),$
$\displaystyle d_{34}(t,\mathbf{u})$	$\displaystyle=\tfrac{2}{3}r_{3}(t,\mathbf{u})+r_{4}(\mathbf{u})+\tfrac{2}{3}r_{5}(t,\mathbf{u})+r_{7}(\mathbf{u})+\tfrac{2}{3}r_{8}(\mathbf{u}),$
$\displaystyle d_{42}(t,\mathbf{u})$	$\displaystyle=r_{1}(t,\mathbf{u}),$	$\displaystyle d_{43}(t,\mathbf{u})$	$\displaystyle=r_{2}(\mathbf{u}),$	$\displaystyle d_{56}(t,\mathbf{u})$	$\displaystyle=r_{11}(\mathbf{u})+\tfrac{1}{3}r_{8}(\mathbf{u}),$
$\displaystyle d_{62}(t,\mathbf{u})$	$\displaystyle=\tfrac{1}{2}r_{10}(t,\mathbf{u}),$	$\displaystyle d_{64}(t,\mathbf{u})$	$\displaystyle=r_{9}(\mathbf{u}),$	$\displaystyle d_{65}(t,\mathbf{u})$	$\displaystyle=\tfrac{1}{2}r_{10}(t,\mathbf{u}),$

and $p_{ij}=d_{ji}$ . The solution to this problem will be approximated over the time interval $[12\cdot 3600,84\cdot 3600]$ , where a unit of time represents a second. A reference solution of the scaled problem is depicted in Figure 2.

As we will see, MPRK schemes do not conserve the second linear invariant, which is why

\mathbf{n}_{2}^{T}\frac{\mathbf{u}^{n+1}-\mathbf{u}^{n}}{\lVert\mathbf{u}^{n+1}-\mathbf{u}^{n}\rVert}=c(\mathbf{u}^{n})\Delta t+\mathcal{O}(\Delta t^{2})

with $c\neq 0$ . Hence, we may use

\eta(\mathbf{u})=\mathbf{n}_{2}^{T}\mathbf{u}

as an entropy function satisfying (5) to preserve also the second linear invariant with our relaxation technique for conservative problems. As can be seen in Figure 3, using relaxation improves the accuracy of the solution significantly. However, we note that Newton’s method, while working in principle, sometimes fails to find a solution $\gamma\approx 1$ in our implementation. Indeed, we observed cases in which it returned $\gamma\approx 10^{-12}$ , and we thus decided to use the regula falsi method as a solver.
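The following Python sketch illustrates why a bracketing solver can be more robust in such a situation. It does not reproduce the stratospheric problem or the positivity-preserving relaxation of this paper. As a stand-in, it assumes the classical relaxation update $\mathbf{u}^{n}+\gamma(\mathbf{u}^{n+1}-\mathbf{u}^{n})$ for one step of Heun's method applied to the harmonic oscillator with the quadratic energy as entropy. In this toy setting the residual has the trivial root $\gamma=0$ in addition to the desired root close to $1$ , and a regula falsi iteration started from a bracket around $1$ cannot leave that bracket. The names `regula_falsi` and `residual`, the step size, and the bracket $[0.5,1.5]$ are illustrative assumptions.

```python
def regula_falsi(r, a, b, tol=1e-13, maxiter=200):
    """Find a root of r in [a, b]; r(a) and r(b) must not have the same sign."""
    ra, rb = r(a), r(b)
    if ra == 0.0:
        return a
    if rb == 0.0:
        return b
    if ra * rb > 0.0:
        raise ValueError("root is not bracketed")
    c = a
    for _ in range(maxiter):
        # Root of the secant line through (a, ra) and (b, rb).
        c = b - rb * (b - a) / (rb - ra)
        rc = r(c)
        if abs(rc) < tol:
            break
        # Keep the sub-interval on which the sign change persists.
        if ra * rc < 0.0:
            b, rb = c, rc
        else:
            a, ra = c, rc
    return c


# Toy problem (assumption): harmonic oscillator u1' = -u2, u2' = u1.
def f(u):
    return (-u[1], u[0])


def eta(u):
    return 0.5 * (u[0] ** 2 + u[1] ** 2)


h = 0.1
u_old = (1.0, 0.0)

# One step of Heun's method (explicit, second order).
k1 = f(u_old)
u_pred = (u_old[0] + h * k1[0], u_old[1] + h * k1[1])
k2 = f(u_pred)
u_new = (u_old[0] + 0.5 * h * (k1[0] + k2[0]),
         u_old[1] + 0.5 * h * (k1[1] + k2[1]))


def residual(gamma):
    """Entropy defect of the classical relaxed update for a given gamma."""
    u_gamma = tuple(uo + gamma * (un - uo) for uo, un in zip(u_old, u_new))
    return eta(u_gamma) - eta(u_old)


gamma = regula_falsi(residual, 0.5, 1.5)
print(gamma)
```

For this toy problem the nontrivial root is $1/(1+h^{2}/4)$ , while `residual(0.0)` vanishes identically, so an iteration without a bracket has a second, unwanted root available.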

4.3 Linear advection

Consider the linear advection equation

\partial_{t}u+\partial_{x}u=0

(47)

with periodic boundary conditions on $I=[0,2]$ and a positive initial condition $u^{0}>0$ . Then, the solution $u(t,x)=u^{0}(x-t)$ stays positive. Moreover, every functional of the form

\eta(u(t,I))=\int_{I}U\bigl(u(t,x)\bigr)\operatorname{d\!}x

(48)

for an entropy function $U$ is conserved with associated entropy flux $F(u)=U(u)$ . Following Tadmor [Tadmor03], the entropy variables are $w=U^{\prime}(u)$ and the flux potential is $\psi=wf-F=U^{\prime}(u)u-U(u)$ . The corresponding entropy-conservative numerical flux is

f^{\mathrm{num}}(u_{-},u_{+})=\frac{\psi(u_{+})-\psi(u_{-})}{w(u_{+})-w(u_{-})}=\frac{U^{\prime}(u_{+})u_{+}-U(u_{+})-U^{\prime}(u_{-})u_{-}+U(u_{-})}{U^{\prime}(u_{+})-U^{\prime}(u_{-})}.

(49)

If $\lvert U^{\prime}(u)\rvert\to\infty$ faster than $\lvert U(u)\rvert$ as $u\to 0$ , i.e., $U(u)/U^{\prime}(u)\to 0$ , then the numerical flux goes to zero if one of the states goes to zero. Therefore, the resulting finite volume method

\partial_{t}u_{i}+\frac{f^{\mathrm{num}}(u_{i},u_{i+1})-f^{\mathrm{num}}(u_{i-1},u_{i})}{\Delta x}=0

(50)

is positivity-preserving in this case. In particular, the numerical fluxes $f^{\mathrm{num}}(u_{-},u_{+})$ are always non-negative, resulting in a conservative production-destruction system. Next, we consider several examples.

•

The entropy

U(u)=u\log(u)-u

(51)

leads to the entropy-conservative numerical flux

f^{\mathrm{num}}(u_{-},u_{+})=\frac{u_{+}-u_{-}}{\log(u_{+})-\log(u_{-})}=:\{\!\{u\}\!\}_{\mathrm{log}}

(52)

using the logarithmic mean, see [ranocha2021preventing, Section 3.2].

•

Similarly, the entropy

U(u)=-\sqrt{u}

(53)

leads to the entropy variables $w=-1/(2\sqrt{u})$ , the flux potential $\psi=\sqrt{u}/2$ , and the entropy-conservative numerical flux

f^{\mathrm{num}}(u_{-},u_{+})=\frac{\sqrt{u_{+}}-\sqrt{u_{-}}}{-1/\sqrt{u_{+}}+1/\sqrt{u_{-}}}=\sqrt{u_{-}u_{+}}=:\{\!\{u\}\!\}_{\mathrm{geo}}

(54)

using the geometric mean.

•

Analogously, the entropy

U(u)=1/u

(55)

leads to the entropy variables $w=-1/u^{2}$ , the flux potential $\psi=-2/u$ , and the entropy-conservative numerical flux

f^{\mathrm{num}}(u_{-},u_{+})=\frac{-2/u_{+}+2/u_{-}}{-1/u_{+}^{2}+1/u_{-}^{2}}=\frac{2u_{-}u_{+}}{u_{+}+u_{-}}=:\{\!\{u\}\!\}_{\mathrm{harm}}

(56)

using the harmonic mean.
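The three examples above can be checked numerically. The following Python sketch evaluates the general formula (49) directly from $U$ and $U^{\prime}$ and compares the result with the closed-form logarithmic, geometric, and harmonic means. The function name `ec_flux`, the dictionary layout, and the sample states are illustrative assumptions.

```python
import math


def ec_flux(U, dU, um, up):
    """Entropy-conservative flux (49) for linear advection with f(u) = u.

    Requires um != up; for equal states the quotient is 0/0 and the
    closed-form mean should be used instead.
    """
    def psi(u):
        # Flux potential psi = U'(u) u - U(u).
        return dU(u) * u - U(u)

    return (psi(up) - psi(um)) / (dU(up) - dU(um))


# Pairs (U, U') for the entropies (51), (53), and (55).
entropies = {
    "log": (lambda u: u * math.log(u) - u, lambda u: math.log(u)),
    "geo": (lambda u: -math.sqrt(u), lambda u: -0.5 / math.sqrt(u)),
    "harm": (lambda u: 1.0 / u, lambda u: -1.0 / u ** 2),
}

# Closed-form means from (52), (54), and (56).
means = {
    "log": lambda a, b: (b - a) / (math.log(b) - math.log(a)),
    "geo": lambda a, b: math.sqrt(a * b),
    "harm": lambda a, b: 2.0 * a * b / (a + b),
}

um, up = 0.5, 2.0
results = {}
for name, (U, dU) in entropies.items():
    results[name] = (ec_flux(U, dU, um, up), means[name](um, up))
    print(name, results[name])
```

Evaluating the means with one state close to zero, for instance `means["harm"](1e-12, 1.0)`, shows the decay of the numerical flux that the positivity argument relies on.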

Please note that positivity preservation for an entropy-conservative method depends on the choice of the entropy function. For example, the standard $L^{2}$ entropy

U(u)=\frac{u^{2}}{2}

(57)

leads to the numerical flux

f^{\mathrm{num}}(u_{-},u_{+})=\frac{1}{2}\frac{u_{+}^{2}-u_{-}^{2}}{u_{+}-u_{-}}=\frac{u_{-}+u_{+}}{2},

(58)

i.e., the standard arithmetic mean. The resulting finite volume discretization

\partial_{t}u_{i}+\frac{u_{i+1}-u_{i-1}}{2\Delta x}=0

(59)

is the classical second-order central discretization, which is not positivity-preserving.
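The difference between the arithmetic mean and the means from the list above can be made visible with a small Python sketch of the periodic finite volume semidiscretization. The sign convention is chosen such that the arithmetic mean reproduces the central scheme (59). The function name `fv_rhs`, the grid spacing, and the test state with one nearly vanishing cell are illustrative assumptions.

```python
import math


def fv_rhs(u, dx, flux):
    """Right-hand side du_i/dt of a periodic finite volume method.

    flux(a, b) is the numerical flux at the interface between a left
    state a and a right state b.
    """
    n = len(u)
    # f[i] is the flux between cell i and cell i + 1 (periodic wrap-around).
    f = [flux(u[i], u[(i + 1) % n]) for i in range(n)]
    # f[i - 1] with i = 0 picks the last interface, i.e., the periodic one.
    return [-(f[i] - f[i - 1]) / dx for i in range(n)]


def arithmetic(a, b):
    return 0.5 * (a + b)


def geometric(a, b):
    return math.sqrt(a * b)


dx = 0.5
u = [1.0, 1e-14, 2.0, 1.5]  # cell 1 is almost empty

rhs_arith = fv_rhs(u, dx, arithmetic)
rhs_geo = fv_rhs(u, dx, geometric)
print(rhs_arith[1], rhs_geo[1])
```

With the arithmetic mean, the time derivative in the nearly empty cell is about $-1$ no matter how small $u_{i}$ is, so the state is pushed below zero. With the geometric mean, both adjacent fluxes scale with $\sqrt{u_{i}}$ , and the loss term vanishes as $u_{i}\to 0$ .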

We use $N_{x}=100$ cells and the initial condition

u(0,x)=1.9\sin(\pi x)+2,\quad x\in[0,2]

and apply different iterative methods for solving for $\gamma$ . The respective results are depicted in Figure 4.

4.4 Shallow water equations

The classical shallow water equations

\partial_{t}\underbrace{\begin{pmatrix}h\\ hv\end{pmatrix}}_{=u}+\partial_{x}\underbrace{\begin{pmatrix}hv\\ hv^{2}+\tfrac{1}{2}gh^{2}\end{pmatrix}}_{=f(u)}=0

(60)

have the total energy

U(u)=\tfrac{1}{2}hv^{2}+\tfrac{1}{2}gh^{2}

(61)

as entropy. The corresponding entropy variables are

w=\begin{pmatrix}gh-\tfrac{1}{2}v^{2}\\ v\end{pmatrix}

(62)

and the entropy flux potential is

\psi=\tfrac{1}{2}gh^{2}v.

(63)

For constant velocity $v$ , the condition for an entropy-conservative numerical flux is

0=[\![w]\!]\cdot f^{\mathrm{num}}-[\![\psi]\!]=g[\![h]\!]f^{\mathrm{num}}_{h}-\tfrac{1}{2}g[\![h^{2}]\!]v,

(64)

where $[\![w]\!]\coloneqq w_{i+1}-w_{i}$ . Thus, the numerical flux for the water height $h$ is

f^{\mathrm{num}}_{h}=\tfrac{1}{2}\frac{[\![h^{2}]\!]}{[\![h]\!]}v=\{\!\{h\}\!\}v

(65)

with the arithmetic mean

\{\!\{h\}\!\}=\tfrac{1}{2}(h_{-}+h_{+}).

(66)

Similarly to the linear advection equation above, the arithmetic mean does not lead to a positivity-preserving semidiscretization. This proves the following result.

Theorem 3.

An entropy-conservative semidiscretization of the shallow water equations is not positivity-preserving.

One can show a similar result for the polytropic Euler equations with pressure $p\propto\varrho^{\gamma}$ and $\gamma>1$ . However, the limiting case of the isothermal Euler equations is different and discussed in the next subsection.

4.5 Isothermal Euler equations

The 1D isothermal Euler equations are

\partial_{t}\underbrace{\begin{pmatrix}\varrho\\ \varrho v\end{pmatrix}}_{=\mathbf{u}}+\partial_{x}\underbrace{\begin{pmatrix}\varrho v\\ \varrho v^{2}+c^{2}\varrho\end{pmatrix}}_{=\mathbf{f}(\mathbf{u})}=0,

(67)

where $\varrho$ is the density, $v$ is the velocity, and $c$ is the speed of sound. We take the total energy

U(\mathbf{u})=\tfrac{1}{2}\varrho v^{2}+c^{2}\varrho\log(\varrho)

(68)

as (mathematical) entropy. An associated entropy-conservative numerical flux at the interface $i+\frac{1}{2}$ is given by [winters2020entropy]

f^{\mathrm{num}}_{\varrho}=\{\!\{\varrho\}\!\}_{\mathrm{log}}\{\!\{v\}\!\},\quad f^{\mathrm{num}}_{\varrho v}=\{\!\{\varrho\}\!\}_{\mathrm{log}}\{\!\{v\}\!\}^{2}+\{\!\{c^{2}\varrho\}\!\}.

(69)

Since the logarithmic mean goes to zero if one of the states goes to zero, the resulting entropy-conservative finite volume method is unconditionally positive. Even more generally, flux differencing methods [tadmor1987numerical, Tadmor03, lefloch2002fully, fisher2013discretely, ranocha2018comparison, chen2017entropy] based on diagonal-norm SBP operators are positivity-preserving; in particular, this holds for high-order discontinuous Galerkin spectral element methods (DGSEMs). For the time discretization, we apply the underlying explicit RK method together with the standard relaxation algorithm to the second conserved variable and MPRK22 to $\varrho$ , where we use the PDS

p_{i+1,i}=d_{i,i+1}=\max\left\{0,f^{\mathrm{num}}_{\varrho}\right\},\quad p_{i,i+1}=d_{i+1,i}=-\min\left\{0,f^{\mathrm{num}}_{\varrho}\right\}

for $i=1,\dotsc,N-1$ , and we take the periodic boundary conditions into account for the terms with $i=N$ . The study of flux-balanced MPRK schemes introduced in [IMST2026] is left for future work. In Figure 5 we solve the Riemann problem (RP)

\mathbf{u}_{L}=(0.8,10^{-3}),\quad\mathbf{u}_{R}=(1,10^{-2})

with periodic boundary conditions and final time $t_{\mathrm{end}}=1$ . We note that one should not use an entropy-conservative flux for an RP; however, this is a good example showing that our time integrator maintains the entropy properties of the space discretization.
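The splitting of the density flux into production and destruction terms described above can be sketched in Python as follows. The function name `density_pds`, the zero-based indexing, and the sample flux values are illustrative assumptions; any scaling by $1/\Delta x$ is left to the caller, and at least three cells are assumed so that the periodic interface does not coincide with an interior one.

```python
def density_pds(fnum):
    """Build the production matrix p from the interface mass fluxes.

    fnum[i] is the numerical flux between cell i and cell i + 1, where the
    last entry couples the last cell with cell 0 (periodic boundary).
    The destruction terms follow from d[i][j] = p[j][i].
    """
    n = len(fnum)
    p = [[0.0] * n for _ in range(n)]
    for i, f in enumerate(fnum):
        j = (i + 1) % n
        p[j][i] = max(0.0, f)    # flow from cell i into cell j
        p[i][j] = max(0.0, -f)   # equals -min{0, f}: flow from cell j into cell i
    return p


fnum = [0.3, -0.2, 0.5, -0.1]
p = density_pds(fnum)
n = len(fnum)

# Net rate of each cell: sum over j of (p_ij - d_ij) with d_ij = p_ji.
net = [sum(p[i][j] - p[j][i] for j in range(n)) for i in range(n)]
print(net)
```

By construction all entries of `p` are non-negative, and the net rate of cell $i$ equals the incoming minus the outgoing interface flux, so the total mass is unchanged.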

4.6 Porous Medium Equation

The porous medium equation

u_{t}=(u^{m})_{xx}=(a(u)u_{x})_{x},\quad a(u)=mu^{m-1}

with a free parameter $m>1$ , see for instance [Boscarino23], admits a non-negative weak solution

u^{(m)}(t,x)=t^{-k}\left[\max\left(1-\frac{k(m-1)}{2m}\frac{\lvert x\rvert^{2}}{t^{2k}},0\right)\right]^{\frac{1}{m-1}}

with $k=\frac{1}{m+1}$ , the so-called Barenblatt solution [Barenblatt52]. For every $t>0$ , the solution has a compact support $[-\alpha_{m}(t),\alpha_{m}(t)]$ where

\alpha_{m}(t)=\sqrt{\frac{2m}{k(m-1)}}t^{k}.

We follow [Boscarino23] using $u(0,x)=u^{(m)}(1,x)$ as an initial condition. We plot the numerical solution at time $t=2$ on the spatial domain $[-6,6]$ using the boundary conditions $u(t,\pm 6)=0$ for $t>1$ .
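For reference, the Barenblatt profile and the boundary of its support can be evaluated with a few lines of Python. This is a direct transcription of the two formulas above; the function names `barenblatt` and `support_radius` are illustrative assumptions.

```python
import math


def barenblatt(t, x, m):
    """Barenblatt solution u^(m)(t, x) of the porous medium equation, t > 0."""
    k = 1.0 / (m + 1.0)
    inner = 1.0 - k * (m - 1.0) / (2.0 * m) * abs(x) ** 2 / t ** (2.0 * k)
    return t ** (-k) * max(inner, 0.0) ** (1.0 / (m - 1.0))


def support_radius(t, m):
    """Half-width alpha_m(t) of the compact support."""
    k = 1.0 / (m + 1.0)
    return math.sqrt(2.0 * m / (k * (m - 1.0))) * t ** k


for m in (3, 5):
    # Initial profile at Barenblatt time 1, sampled on a coarse grid.
    xs = [-6.0 + 0.5 * i for i in range(25)]
    profile = [barenblatt(1.0, x, m) for x in xs]
    print(m, support_radius(1.0, m), max(profile))
```

For $m=3$ and $m=5$ , the support radius stays below $6$ for Barenblatt times up to $3$ , so the exact solution is compatible with the homogeneous boundary values at $x=\pm 6$ over the simulated time span.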

We use the second-order space discretization from [Mattsson12, ranocha2019mimetic] given by

	$\displaystyle f_{i}(\mathbf{u}(t))=$	$\displaystyle\frac{a(u_{i}(t))+a(u_{i+1}(t))}{2\Delta x^{2}}u_{i+1}(t)$
		$\displaystyle-\frac{a(u_{i-1}(t))+2a(u_{i}(t))+a(u_{i+1}(t))}{2\Delta x^{2}}u_{i}(t)$
		$\displaystyle+\frac{a(u_{i-1}(t))+a(u_{i}(t))}{2\Delta x^{2}}u_{i-1}(t)$

for $i=2,\dotsc,N$ and

f_{j}(\mathbf{u}(t))=\frac{a(u_{j}(t))}{2\Delta x^{2}}u_{j}(t),\quad\text{ for }\quad j\in\{1,N\}.

Next, we consider the convex entropy

\eta(\mathbf{u})=\frac{\Delta x^{2}}{2}\sum_{i=1}^{N_{x}}u_{i}^{2},

which satisfies

\displaystyle\frac{\mathrm{d}}{\mathrm{d}t}\eta(\mathbf{u}(t))\leq 0

for the boundary conditions mentioned above, see [ranocha2019mimetic, Theorem 4.1]. This system of ODEs may be rewritten as a conservative PDS by setting

	$\displaystyle p_{i,i+1}(\mathbf{u})$	$\displaystyle=\frac{a(u_{i})+a(u_{i+1})}{2\Delta x^{2}}u_{i+1},$	$\displaystyle\quad p_{i,i-1}(\mathbf{u})$	$\displaystyle=\frac{a(u_{i-1})+a(u_{i})}{2\Delta x^{2}}u_{i-1},$	$\displaystyle\quad i$	$\displaystyle=2,\dotsc,N,$
	$\displaystyle p_{1,2}(\mathbf{u})$	$\displaystyle=\frac{a(u_{2})}{2\Delta x^{2}}u_{2},$	$\displaystyle\quad p_{N,N-1}(\mathbf{u})$	$\displaystyle=\frac{a(u_{N-1})}{2\Delta x^{2}}u_{N-1},$	$\displaystyle\quad d_{i,j}$	$\displaystyle=p_{j,i}.$
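The following Python sketch checks, for interior nodes only, that the production and destruction terms above reproduce the right-hand side $f_{i}$ of the space discretization. The boundary rows are deliberately left out. The helper names `a`, `f_interior`, `p`, and `d`, the zero-based indexing, and the sample data are illustrative assumptions.

```python
import math


def a(u, m):
    """Diffusion coefficient a(u) = m u^(m-1)."""
    return m * u ** (m - 1)


def f_interior(u, i, dx, m):
    """Right-hand side f_i of the space discretization at an interior node i."""
    am, ai, ap = a(u[i - 1], m), a(u[i], m), a(u[i + 1], m)
    return ((ai + ap) * u[i + 1]
            - (am + 2.0 * ai + ap) * u[i]
            + (am + ai) * u[i - 1]) / (2.0 * dx ** 2)


def p(u, i, j, dx, m):
    """Production term p_{i,j} for neighbouring nodes, |i - j| = 1."""
    return (a(u[i], m) + a(u[j], m)) / (2.0 * dx ** 2) * u[j]


def d(u, i, j, dx, m):
    """Destruction term d_{i,j} = p_{j,i}."""
    return p(u, j, i, dx, m)


m, dx = 3, 0.1
u = [0.2, 1.0, 0.5, 0.7, 0.1]

checks = []
for i in range(1, len(u) - 1):
    net = sum(p(u, i, j, dx, m) - d(u, i, j, dx, m) for j in (i - 1, i + 1))
    checks.append((net, f_interior(u, i, dx, m)))
    print(i, net, f_interior(u, i, dx, m))
```

Since every $p_{i,j}$ is a product of non-negative factors for $\mathbf{u}\geq\mathbf{0}$ , the rewritten system has the sign structure required by MPRK schemes.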

According to [Boscarino23], the cases $m=3,5$ are particularly interesting, as the third-order IMEX method proposed in [Boscarino23, p. 10, eq. (30)] generates negative approximations, which cannot happen with MPRK schemes. Indeed, we observe in Figure 6 that we obtain positive approximations while the relaxation algorithm gives us an entropy estimate. Here, we do not plot $\gamma$ as it was constantly equal to $1$ .

5 Summary and conclusions

In this work, we investigated non-standard additive Runge–Kutta (NSARK) schemes, which include modified Patankar (MP) and Geometric Conservative (GeCo) methods, to name a few. Being particularly interested in positivity-preserving methods that are also capable of conserving at least one linear invariant, we answered the question of whether these schemes can be equipped with a relaxation technique that preserves these properties while ensuring entropy stability. We point out that positivity preservation is easy to accomplish for entropy-dissipative problems. For entropy-conservative problems, where no linear invariant needs to be preserved, one can equip an unconditionally positive method with the geometric mean to compute the relaxation update. If the conservative problem has a linear invariant, or if one is interested in keeping a conservative PDS part conservative within the relaxation procedure, we propose to use a linearly implicit formula for the relaxation update, which in turn results in a coupled linear-nonlinear system for the simultaneous computation of $\gamma$ and $\mathbf{u}^{n+\gamma}$ . All techniques can be used with any positivity-preserving method while maintaining its order; however, the latter relaxation technique involves a bootstrapping algorithm to achieve higher-order entropy-conservative methods that preserve a linear invariant.

We have tested our theoretical findings on multiple examples of ordinary and partial differential equations. Furthermore, interpreting a linear invariant as entropy, we were able to preserve both linear invariants of the stiff stratospheric reaction problem using MPRK. We have also examined several flux and entropy pairs for the linear advection equation, comparing different iterative solvers for the computation of $\gamma$ and $\mathbf{u}^{n+\gamma}$ . Moreover, we applied our technique to the isothermal Euler equations, guaranteeing the positivity of the density. Finally, we have tested MPRK and MPSSPRK schemes on the entropy-dissipative porous medium equation, where we are also able to avoid negative approximations.

Future research topics include the testing of further NSARK schemes, including the recently developed flux-balanced versions, and the efficiency of the related methods. As some of the NSARK schemes are already proven to be conditionally stable, a thorough stability analysis for these methods is also part of ongoing research.

Acknowledgements

Declarations

Funding

T. Izgin gratefully acknowledges the financial support by Fulbright Germany. H. Ranocha was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation, project number 513301895) and the Daimler und Benz Stiftung (Daimler and Benz foundation, project number 32-10/22). C.-W. Shu acknowledges partial support from NSF grant DMS-2309249.

Conflict of interest

Not applicable.

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors consent to submit for publication.

Data, Materials and Code availability

The source code used in this study is available at [IRS2026repository].

Author contribution

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Thomas Izgin. The first draft of the manuscript was written by Thomas Izgin, and all authors commented on and contributed to previous versions of the manuscript. All authors read and approved the final manuscript.
