1 Introduction

Telegram is a chat platform that in January 2021 reportedly had 500 M monthly users [60]. It provides a host of multimedia and chat features, such as one-on-one chats, public and private group chats for up to 200,000 users, as well as public channels with an unlimited number of subscribers. Prior works establish the popularity of Telegram among higher-risk users such as activists [25] and participants of protests [1]. In particular, it is reported in [1, 25] that these groups of users shun Signal in favour of Telegram, partly due to the absence of some key features, but mostly due to Signal’s reliance on phone numbers as contact handles.

This heavy usage contrasts with the scant attention paid to Telegram’s bespoke cryptographic design—MTProto—by the cryptographic community. To date, only four works treat Telegram. In [34], an attack against the IND-CCA security of MTProto 1.0 was reported, in response to which the protocol was updated. In [54], a replay attack based on improper validation in the Android client was reported. Similarly, [39] reports input validation bugs in Telegram’s Windows Phone client. Recently, in [46] MTProto 2.0 (the current version) was proven secure in a symbolic model, but assuming ideal building blocks and abstracting away all implementation/primitive details. In short, the security that Telegram offers is not well understood.

Telegram uses its MTProto “record layer”—offering protection based on symmetric cryptographic techniques—for two different types of chats. By default, messages are encrypted and authenticated between a client and a server, but not end-to-end encrypted: such chats are referred to as cloud chats. Here Telegram’s MTProto protocol plays the same role that TLS plays in, for example, Facebook Messenger. In addition, Telegram offers optional end-to-end encryption for one-on-one chats which are referred to as secret chats (these are tunnelled over cloud chats). So far, the focus in the cryptographic literature has been on secret chats [34, 39] as opposed to cloud chats. In contrast, in [1] it is established that the one-on-one chats played only a minor role for the protest participants interviewed in the study; significant activity was reportedly coordinated using group chats secured by the MTProto protocol between Telegram clients and the Telegram servers. For this reason, we focus here on cloud chats. Given the similarities between the cryptography used in secret and cloud chats, our positive results can be modified to apply to the case of secret chats (but we omit any detailed analysis).

For completeness, note that follow-up work [3] has in the meantime also studied the MTProto “handshake” in a suitable multi-stage key exchange model.

1.1 Contributions

We provide an in-depth study of how Telegram uses symmetric cryptography inside MTProto for cloud chats. We make four distinct contributions: our security model for secure channels, the formal specification of our variant of MTProto, our attacks on the original protocol, and our security proofs for the formal specification of MTProto.

1.1.1 Security model

Starting from the observation that MTProto entangles the keys of the two channel directions, in Sect. 3 we develop a bidirectional security model for two-party secure channels that allows an adversary full control over generating and delivering ciphertexts from/to either party (client or server). The model assumes that the two parties start with a shared key and use stateful algorithms. Our security definitions come in two flavours, one capturing confidentiality and the other integrity. We also consider a combined security notion and its relationship to the individual notions. Our formalisation is broad enough to consider a variety of different styles of secure channels—for example, allowing channels where messages can be delivered out of order within some bounds, or where messages can be dropped (neither of which we consider appropriate for secure messaging). This caters for situations where the secure channel operates over an unreliable transport protocol, but where the channel is designed to recover from accidental errors in message delivery as well as from certain permitted adversarial behaviours.

This is done technically by introducing the concept of support functions, inspired by the support predicates recently introduced by [28] but extending them to cater for a wider range of situations. Here the core idea is that a support function operates on the transcript of messages and ciphertexts sent and received (in both directions) and its output is used to decide whether an adversarial behaviour—say, reordering or dropping messages—counts as a “win” in the security games. It is also used to define a suitable correctness notion with respect to expected behaviours of the channel.

As a final feature, our secure channel definitions allow the adversary complete control over all randomness used by the two parties, since we can achieve security against such a strong adversary in the stateful setting. This decision reflects a concern about Telegram clients expressed by Telegram developers [61].

Fig. 1

Illustration of the symmetric cryptography used in MTProto 2.0. The input \( ak \) denotes the shared secret key resulting from the key exchange, while the input \(p\) represents the (encoded) plaintext payload

1.1.2 Formal specification of MTProto

In Sect. 4, we provide a detailed formal specification of Telegram’s symmetric encryption; Fig. 1 illustrates its main components. Our specification is computational and does not abstract away the building blocks used in Telegram. This in itself is a non-trivial task as no formal specification exists and behaviour can only be derived from official (but incomplete) documentation and from dynamic analysis of Telegram’s implementation; moreover, different clients do not have the same behaviour.

Formally, we define an MTProto-based bidirectional channel \(\textsf{MTP}\text {-}\textsf{CH} \) as a composition of multiple cryptographic primitives. This allows us to recover a variant of the real-world MTProto protocol by instantiating the primitives with specific constructions, and to study whether each of them satisfies the security notions required to achieve the desired security of \(\textsf{MTP}\text {-}\textsf{CH} \); this modularity significantly simplifies the analysis. However, we emphasise that our goal is to be descriptive, not prescriptive, i.e. we do not suggest alternative instantiations of \(\textsf{MTP}\text {-}\textsf{CH} \).

To arrive at our specification, we had to make several decisions on what behaviour to model and where to draw the line of abstraction. Notably, there are various behaviours exhibited by (official) Telegram implementations that lead to attacks.

In particular, we verified in practice that current implementations allow an attacker on the network to reorder messages from a client to the server, with the transcript on the client being updated later to reflect the attacker-altered server’s view. We stress, though, that this trivial yet practical attack is not inherent in MTProto and can be avoided by updating the processing of message metadata in Telegram’s servers. The consequences of such an attack can be quite severe, as we discuss further in Sect. 4.2.

Further, if a message is not acknowledged within a certain time in MTProto, it is resent using the same metadata and with fresh random padding. While this appears to be a useful feature and a mitigation against message drops, it would actually enable an attack in our formal model if such retransmissions were included in the specification. In particular, an adversary who also has control over the randomness can break stateful IND-CPA security with three encryption queries, while an attacker without that control could do so with about \(2^{64}\) encryption queries. We use these more theoretical attacks to motivate our decision not to allow re-encryption with fixed metadata in our formal specification of MTProto, i.e. we insist that the state is evolving.

1.1.3 Proof of security

We then prove in Sect. 5 that our slight variant of MTProto achieves channel confidentiality and integrity in our model, under certain assumptions on the components used in its construction. As described in Sect. 1.3, Telegram has implemented our proposed alterations, so that there can be some assurances about MTProto as currently deployed.

We use code-based game-hopping proofs in which the analysis is modularised into a sequence of small steps that can be individually verified. As well as providing all details of the proofs, we also give high-level intuitions. Significant complexity arises in the proofs from two sources: the entanglement of keys used in the two channel directions, and the detailed nature of the specification of MTProto that we use.

We eschew an asymptotic approach in favour of concrete security analysis. This results in security theorems that quantitatively relate the confidentiality and integrity of MTProto as a secure channel to the security of its underlying cryptographic components. Our main security results, Theorems 1 and 2 and Corollaries 1 and 2, provide confidentiality and integrity bounds containing terms equivalent to \(\approx q/2^{64}\) where \(q\) is the number of \(\textsc {Send}\) queries an attacker makes. We discuss this further in Sect. 5.

However, our security proofs rely on several assumptions about cryptographic primitives that, while plausible, have not previously been considered in the literature. In more detail, due to the way Telegram makes use of \(\textsf{SHA}-\textsf{256}\) as a MAC algorithm and as a KDF, we have to rely on the novel assumption that the block cipher \(\textsf{SHACAL}-\textsf{2}\) underlying the \(\textsf{SHA}-\textsf{256}\) compression function is a leakage-resilient PRF under related-key attacks, where “leakage-resilient” means that the adversary can choose a part of the key. Our proofs rely on two distinct variants of such an assumption. In Appendix F, we show that these assumptions hold in the ideal cipher model, but further cryptanalysis is needed to validate them for \(\textsf{SHACAL}-\textsf{2}\). For similar reasons, we also require a dual-PRF assumption on \(\textsf{SHACAL}-\textsf{2}\). We stress that such assumptions are likely necessary for our or any other computational security proof for MTProto, due to the specifics of how MTProto uses \(\textsf{SHA}-\textsf{256}\) and how it constructs keys and tags from public inputs and overlapping key bits of a master secret. Given the importance of Telegram, these assumptions provide new, significant cryptanalysis targets and motivate further research on related-key attacks.

Besides using \(\textsf{SHA}-\textsf{256}\) as a MAC algorithm and a KDF, MTProto also uses \(\textsf{SHA}-\textsf{1}\) to compute a key identifier. This does not lead to length-extension attacks because in each use case either the input is required to have a fixed length, or the output is truncated. The latter technique was previously studied as ChopMD [23] and employed to build AMAC [10]. But rather than applying these results to show that the design of the MAC algorithm prevents forgeries, our proofs rely on the observation that even if length-extension attacks were possible, they would still not break the security of the overall scheme. This is because the plaintext encoding format of MTProto mandates the presence of certain metadata in the first block of the encrypted payload.

1.1.4 Attacks

We present further implementation attacks against Telegram in Sections 6 and 7. These attacks highlight the limits of our formal modelling and the fragility of MTProto implementations. The first of these, a timing attack against Telegram’s use of IGE mode encryption, can be avoided by careful implementation, but we found multiple vulnerable clients. The attack takes inspiration from an attack on SSH [5]. It exploits the fact that Telegram encrypts a length field and checks the integrity of plaintexts rather than ciphertexts. If this process is implemented without taking care to avoid a timing side channel, it can be turned into an attack recovering up to 32 bits of plaintext. We give examples from the official Desktop, Android and iOS Telegram clients, each exhibiting a different timing side channel. However, we stress that the conditions of this attack are difficult to meet in practice. In particular, to recover bits from a plaintext message block \(m_{i}\) we assume knowledge of message block \(m_{i-1}\) (a relatively mild assumption) and, critically, of message block \(m_{1}\), which contains two 64-bit random values negotiated between the client and the server. Thus, confidentiality hinges on the secrecy of two random strings—a salt and an id. Notably, these fields were not designated for this purpose in the Telegram documentation.

In order to recover \(m_{1}\) and thereby enable our plaintext-recovery attack, in Section 7 we chain it with another attack, this one on the server-side implementation of Telegram’s key exchange protocol. This attack exploits how Telegram servers process RSA ciphertexts; while the exploited behaviour was confirmed by the Telegram developers, we did not verify it experimentally. The attack uses a combination of lattice reduction and Bleichenbacher-like techniques [19]. It in fact breaks server authentication—allowing a MiTM attack—assuming the attack can be completed before a session times out. But, more germanely, it also allows us to recover the id field. This essentially reduces the overall security of Telegram to guessing the 64-bit salt field. Details can be found in Section 7. We stress, though, that even if all the assumptions we make in Section 7 are met, our exploit chain (Sect. 6, Section 7)—while considerably cheaper than breaking the underlying \(\textsf{AES}-\textsf{256}\) encryption—is far from practical. Yet it demonstrates the fragility of MTProto, which could be avoided—along with the reliance on unstudied assumptions—by using standard authenticated encryption or, indeed, just TLS.

We conclude with a broader discussion of Telegram security and with our recommendations in Sect. 8.

1.2 Publication history

This is the full version of the paper published at IEEE S&P 2022 [4]. The proofs referred to in [4, Section V] are contained in full here and can be found in Appendices E and F and in Sect. 5 (in particular Sections 5.5 and 5.6). We have also expanded the content of several other sections as follows: Section 3 defining bidirectional channels, originally [4, Section III], was expanded with more context and illustrating examples. Section 6.1 on the timing attack, originally [4, Section VI], contains the code samples for all affected Telegram clients. Section 7 on the key exchange attack, originally [4, Appendix A], is significantly expanded and contains an overview of the key exchange protocol as well as the attack in detail. This work also contains several new appendices: Appendices A to C expand and help to position our new channels framework, while Appendices D and G give more details about the Telegram protocol and the implementation of our attacks.

1.3 Disclosure

We notified Telegram’s developers about the vulnerabilities we found in MTProto on 16 April 2021. They acknowledged receipt soon after, and confirmed the behaviours we describe on 8 June 2021. They awarded a bug bounty for the timing side channel and for the overall analysis. We were informed by the Telegram developers that they do not issue security or bugfix releases except for immediate post-release crash fixes. The development team also informed us that they did not wish to issue security advisories at the time of patching, nor to commit to release dates for specific fixes. Therefore, the fixes were rolled out as part of regular Telegram updates. The Telegram developers informed us that as of version 7.8.1 for Android, 7.8.3 for iOS and 2.8.8 for Telegram Desktop, all vulnerabilities reported here were addressed. When we write “the current version of MTProto” or “current implementations”, we refer to the versions prior to those version numbers, i.e. the versions we analysed.

2 Preliminaries

2.1 Notational conventions

2.1.1 Basic notation

Let \({{\mathbb {N}}}=\{1, 2, \ldots \}\). For \(i\in {{\mathbb {N}}}\) let [i] be the set \(\{1, \ldots , i\}\). We denote the empty string by \(\varepsilon \), the empty set by \(\emptyset \), and the empty list by \([]\). We let \(x_1\leftarrow x_2 \leftarrow v\) denote assigning the value v to both \(x_1\) and \(x_2\). Let \(x\in \{0,1\}^*\) be any string; then |x| denotes its bit length, x[i] denotes its i-th bit for \(0 \le i < \left| x\right| \), and \(x[a: b]=x[a]\ldots x[b-1]\) for \(0\le a < b \le |x|\). For any \(x\in \{0,1\}^*\) and \(\ell \in {{\mathbb {N}}}\) such that \(|x| \le {\ell }\), we write \(\langle x \rangle _{\ell }\) to denote the bit-string of length \(\ell \) that is built by padding x with leading zeros. For any two strings \(x, y \in \{0,1\}^*\), \(x~\Vert ~y\) denotes their concatenation. If \({X}\) is a finite set, we let \(x \leftarrow \$~{X}\) denote picking an element of \({X}\) uniformly at random and assigning it to x. If \(\textsf{T}\) is a table, \(\textsf{T}[i]\) denotes the element of the table that is indexed by i. If \(\textsf{tr}\) is a list, then \(\textsf{tr}[i]\) denotes the element of this list that is indexed by i, where the index is 0-based; further, \(\textsf{tr} ~\Vert ~x\) denotes appending the element x to \(\textsf{tr}\). We let \(\bot \not \in \{0,1\}^*\) be an error code that indicates rejection, and we may use a second, distinct error code where needed. Uninitialised integers are assumed to be set to 0, Booleans to \(\texttt {false}\), strings to \(\varepsilon \), sets to \(\emptyset \), and lists to \([]\). Each element of a table is assumed to be initialised to \(\bot \), indicating that it is empty. We use int64 as a shorthand for a 64-bit integer data type. We use 0x to prefix a hexadecimal string in big-endian order. All variables are represented in big-endian order unless specified otherwise.
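For concreteness, the string conventions above can be mirrored in a few lines of Python over bit-strings represented as strings of '0'/'1' characters (the helper names are ours, not the paper's):

```python
def pad_left(x: str, ell: int) -> str:
    """<x>_ell: pad bit-string x with leading zeros to length ell (requires |x| <= ell)."""
    assert set(x) <= {"0", "1"} and len(x) <= ell
    return "0" * (ell - len(x)) + x

def slice_bits(x: str, a: int, b: int) -> str:
    """x[a:b] = x[a] ... x[b-1], with 0-based bit indexing as in the paper."""
    assert 0 <= a < b <= len(x)
    return x[a:b]

assert pad_left("101", 8) == "00000101"
assert slice_bits("10110", 1, 4) == "011"
```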

2.1.2 Algorithms and adversaries

Algorithms may be randomised unless otherwise indicated. Running time is worst case. If A is an algorithm, \(y \leftarrow A(x_1,\ldots ;r)\) denotes running A with random coins r on inputs \(x_1,\ldots \) and assigning the output to y. We let \(y \leftarrow \$~A(x_1,\ldots )\) be the result of picking r at random and letting \(y \leftarrow A(x_1,\ldots ;r)\). We let \([A(x_1,\ldots )]\) denote the set of all possible outputs of A when invoked with inputs \(x_1,\ldots \). The instruction \({\textbf{abort}}(x_1,\dots )\) is used to immediately halt the algorithm with output \((x_1,\dots )\). Adversaries are algorithms. Besides using \(\bot \) as an error code, we also let oracles explicitly return \(\bot \) if they would otherwise have terminated with no output. We require that adversaries never pass \(\bot \) as input to their oracles. If any of the inputs taken by an adversary A is \(\bot \), then all of its outputs are \(\bot \).

2.1.3 Security games and reductions

We use the code-based game-playing framework of [17]. (See Fig. 3 for an example.) \(\Pr [\textrm{G}]\) denotes the probability that game \(\textrm{G}\) returns \(\texttt {true}\). Variables in each game are shared with its oracles. In the security reductions, we omit specifying the running times of the constructed adversaries when they are roughly the same as the running time of the initial adversary. Let \(\textrm{G}_{\mathcal {D}}\) be any security game defining a decision-based problem that requires an adversary \(\mathcal {D}\) to guess a challenge bit d; let \(d'\) denote the output of \(\mathcal {D}\), and let game \(\textrm{G}_{\mathcal {D}}\) return \(\texttt {true}\) iff \(d' = d\). Depending on the context, we interchangeably use the two equivalent advantage definitions for such games: \(\textsf{Adv}^{\textsf{}}_{}(\mathcal {D}) = 2 \cdot \Pr [\textrm{G}_{\mathcal {D}}] - 1\), and \(\textsf{Adv}^{\textsf{}}_{}(\mathcal {D}) = \Pr \left[ \,d' = 1\,|\,d = 1\, \right] - \Pr \left[ \,d' = 1\,|\,d = 0\, \right] \). As part of our reductions, the intermediary games (e.g. Fig. 39) use the following colour-coding: one colour for equivalent but expanded code, and another for the code added for the transitions between games; the adversaries constructed for the transitions (e.g. Fig. 36) use highlighting to mark the changes in the code of the simulated reduction games.

2.2 Standard definitions

2.2.1 Fundamental Lemma of Game Playing

In our game-hopping proofs, we frequently make use of the Fundamental Lemma of Game Playing [17]. Suppose that the games \(\textrm{G}_i\) and \(\textrm{G}_{i+1}\) are identical until the flag \(\textsf{bad}\) is set. Then, we have

$$\begin{aligned} \Pr [\textrm{G}_i] - \Pr [\textrm{G}_{i+1}] \le \Pr [\textsf{bad}^{\textrm{G}_i}] = \Pr [\textsf{bad}^{\textrm{G}_{i+1}}], \end{aligned}$$

where \(\Pr [\textsf{bad}^\textrm{G}]\) denotes the probability of setting the flag \(\textsf{bad}\) in game \(\textrm{G}\).

2.2.2 Collision-resistant functions

Let \(f:\mathcal {D}_{f}\rightarrow \mathcal {R}_{f}\) be a function. Consider game \(\textrm{G}^{\textsf{cr}}\) of Fig. 2, defined for \(f\) and an adversary \(\mathcal {F}\). The advantage of \(\mathcal {F}\) in breaking the \(\textrm{CR}\)-security of \(f\) is defined as \(\textsf{Adv}^{\textsf{cr}}_{f}(\mathcal {F}) = \Pr [\textrm{G}^{\textsf{cr}}_{f, \mathcal {F}}]\). To win the game, adversary \(\mathcal {F}\) has to find two distinct inputs \(x_0, x_1 \in \mathcal {D}_{f}\) such that \(f(x_0) = f(x_1)\). Note that \(f\) is unkeyed, so there exists a trivial adversary \(\mathcal {F}\) with \(\textsf{Adv}^{\textsf{cr}}_{f}(\mathcal {F}) = 1\) whenever \(f\) is not injective. We will use this notion in a constructive way, to build a specific collision-resistance adversary \(\mathcal {F}\) (for \(f= \textsf{SHA}-\textsf{256}\) with a truncated output) in a security reduction.

Fig. 2

Collision resistance of function \(f\)
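To illustrate how such a constructive CR adversary can operate, here is a toy birthday-search collision finder for SHA-256 with a truncated output. The 16-bit truncation is our choice so that the search finishes instantly; the function name `f` and the search strategy are illustrative, not the paper's reduction.

```python
import hashlib
from itertools import count

def f(x: bytes, out_bits: int = 16) -> bytes:
    """Unkeyed function f: SHA-256 truncated to out_bits bits (a toy parameter)."""
    return hashlib.sha256(x).digest()[: out_bits // 8]

def find_collision(out_bits: int = 16):
    """Birthday adversary F: ~2^(out_bits/2) evaluations in expectation.
    Returns distinct x0 != x1 with f(x0) == f(x1), so Adv^cr_f(F) = 1."""
    seen = {}
    for i in count():
        x = i.to_bytes(8, "big")
        y = f(x, out_bits)
        if y in seen:
            return seen[y], x  # distinct inputs, same truncated digest
        seen[y] = x

x0, x1 = find_collision()
assert x0 != x1 and f(x0) == f(x1)
```

Since \(f\) here is unkeyed and far from injective, such an adversary trivially exists; the point of the notion is its constructive use inside a reduction.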

2.2.3 Function families

A family of functions \(\textsf{F}\) specifies a deterministic algorithm \(\textsf{F}.\textsf{Ev}\), a key set \(\textsf{F}.\textsf{KS}\), an input set \(\textsf{F}.\textsf{IN}\) and an output length \(\textsf{F}.\textsf{ol}\in {{\mathbb {N}}}\). \(\textsf{F}.\textsf{Ev}\) takes a function key \(\textit{fk}\in \textsf{F}.\textsf{KS}\) and an input \(x\in \textsf{F}.\textsf{IN}\) to return an output \(y\in \{0,1\}^{\textsf{F}.\textsf{ol}}\). We write \(y \leftarrow \textsf{F}.\textsf{Ev}(\textit{fk}, x)\). The key length of \(\textsf{F}\) is \(\textsf{F}.\textsf{kl}\in {{\mathbb {N}}}\) if \(\textsf{F}.\textsf{KS} = \{0,1\}^{\textsf{F}.\textsf{kl}}\).

2.2.4 Block ciphers

Let \(\textsf{E}\) be a function family. We say that \(\textsf{E}\) is a block cipher if \(\textsf{E}.\textsf{IN} = \{0,1\}^{\textsf{E}.\textsf{ol}}\), and if \(\textsf{E}\) specifies (in addition to \(\textsf{E}.\textsf{Ev}\)) an inverse algorithm \(\textsf{E}.\textsf{Inv}:\{0,1\}^{\textsf{E}.\textsf{ol}} \rightarrow \textsf{E}.\textsf{IN}\) such that \(\textsf{E}.\textsf{Inv}(\textit{ek}, \textsf{E}.\textsf{Ev}(\textit{ek}, x)) = x\) for all \(\textit{ek}\in \textsf{E}.\textsf{KS}\) and all \(x\in \textsf{E}.\textsf{IN}\). We refer to \(\textsf{E}.\textsf{ol}\) as the block length of \(\textsf{E}\). Our pictures and attacks use \(E_K\) and \(E_{K}^{-1}\) as a shorthand for \(\textsf{E}.\textsf{Ev}(K, \cdot )\) and \(\textsf{E}.\textsf{Inv}(K, \cdot )\), respectively.

2.2.5 One-time PRF security of function family for multiple keys

Consider game \(\textrm{G}^\textsf{otprf}_{\textsf{F}, \mathcal {D}}\) of Fig. 3, defined for a function family \(\textsf{F}\) and an adversary \(\mathcal {D}\). The advantage of \(\mathcal {D}\) in breaking the \(\textrm{OTPRF}\)-security of \(\textsf{F}\) is defined as \(\textsf{Adv}^{\textsf{otprf}}_{\textsf{F}}(\mathcal {D}) = 2 \cdot \Pr [\textrm{G}^\textsf{otprf}_{\textsf{F}, \mathcal {D}}] - 1\). The game samples a uniformly random challenge bit b and runs adversary \(\mathcal {D}\), providing it with access to oracle \(\textsc {RoR}\). The oracle takes \(x\in \textsf{F}.\textsf{IN}\) as input, and the adversary is allowed to query the oracle arbitrarily many times. Each time \(\textsc {RoR}\) is queried on any x, it samples a uniformly random key \(\textit{fk}\) from \(\textsf{F}.\textsf{KS}\) and returns either \(\textsf{F}.\textsf{Ev}(\textit{fk}, x)\) (if \(b = 1\)) or a uniformly random element from \(\{0,1\}^{\textsf{F}.\textsf{ol}}\) (if \(b = 0\)). \(\mathcal {D}\) wins if it returns a bit \(b'\) that is equal to the challenge bit.

Fig. 3

One-time PRF security of function family \(\textsf{F}\) for multiple keys
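The game of Fig. 3 can be sketched in Python as follows. The instantiation of the family \(\textsf{F}\) with HMAC-SHA256 is our illustrative choice (not something the paper prescribes); the essential feature is that every \(\textsc {RoR}\) query draws a fresh uniform key.

```python
import hmac, hashlib, secrets

class OTPRFGame:
    """One-time PRF game for multiple keys: each RoR query samples a fresh key."""
    def __init__(self, ev, kl: int, ol: int):
        self.ev, self.kl, self.ol = ev, kl, ol
        self.b = secrets.randbelow(2)  # challenge bit
    def ror(self, x: bytes) -> bytes:
        fk = secrets.token_bytes(self.kl)            # fresh uniform key per query
        if self.b == 1:
            return self.ev(fk, x)                    # real: F.Ev(fk, x)
        return secrets.token_bytes(self.ol)          # random: uniform in {0,1}^F.ol
    def finalize(self, b_prime: int) -> bool:
        return b_prime == self.b                     # game returns true iff b' = b

# Example family F (our choice): F.Ev = HMAC-SHA256, kl = ol = 32 bytes.
ev = lambda fk, x: hmac.new(fk, x, hashlib.sha256).digest()
game = OTPRFGame(ev, kl=32, ol=32)
out = game.ror(b"hello")
assert len(out) == 32  # both branches return F.ol bits
```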

2.2.6 Symmetric encryption schemes

A symmetric encryption scheme \(\textsf{SE}\) specifies algorithms \(\mathsf {\textsf{SE}.Enc}\) and \(\mathsf {\textsf{SE}.Dec}\), where \(\mathsf {\textsf{SE}.Dec}\) is deterministic. Associated to \(\textsf{SE}\) is a key length \(\mathsf {\textsf{SE}.kl}\in {{\mathbb {N}}}\), a message space \(\mathsf {\textsf{SE}.MS}\subseteq \{0,1\}^* \setminus \{\varepsilon \}\), and a ciphertext length function \(\mathsf {\textsf{SE}.cl}:{{\mathbb {N}}}\rightarrow {{\mathbb {N}}}\). The encryption algorithm \(\mathsf {\textsf{SE}.Enc}\) takes a key \(k\in \{0,1\}^{\mathsf {\textsf{SE}.kl}}\) and a message \(m\in \mathsf {\textsf{SE}.MS}\) to return a ciphertext \(c\in \{0,1\}^{\mathsf {\textsf{SE}.cl}(\left| m\right| )}\). We write \(c \leftarrow \$~\mathsf {\textsf{SE}.Enc}(k, m)\). The decryption algorithm \(\mathsf {\textsf{SE}.Dec}\) takes kc to return message \(m \in \mathsf {\textsf{SE}.MS}\cup \{\bot \}\), where \(\bot \) denotes incorrect decryption. We write \(m \leftarrow \mathsf {\textsf{SE}.Dec}(k, c)\). Decryption correctness requires that \(\mathsf {\textsf{SE}.Dec}(k, c) = m\) for all \(k\in \{0,1\}^{\mathsf {\textsf{SE}.kl}}\), all \(m\in \mathsf {\textsf{SE}.MS}\), and all \(c\in [\mathsf {\textsf{SE}.Enc}(k, m)]\). We say that \(\textsf{SE}\) is deterministic if \(\mathsf {\textsf{SE}.Enc}\) is deterministic.

Fig. 4

One-time real-or-random indistinguishability of deterministic symmetric encryption scheme \(\textsf{SE}\)

2.2.7 One-time indistinguishability of SE

Consider game \(\textrm{G}^{\mathsf {otind\$}}\) of Fig. 4, defined for a deterministic symmetric encryption scheme \(\textsf{SE}\) and an adversary \(\mathcal {D}\). We define the advantage of \(\mathcal {D}\) in breaking the \(\mathrm {OTIND\$}\)-security of \(\textsf{SE}\) as \(\textsf{Adv}^{\mathsf {otind\$}}_{\textsf{SE}}(\mathcal {D}) = 2 \cdot \Pr [\textrm{G}^{\mathsf {otind\$}}_{\textsf{SE}, \mathcal {D}}] - 1\). The game proceeds analogously to the \(\textrm{OTPRF}\) game, with encryption under a fresh random key playing the role of \(\textsf{F}.\textsf{Ev}\).

2.2.8 CBC block cipher mode of operation

Let \(\textsf{E}\) be a block cipher. Define the Cipher Block Chaining (CBC) mode of operation as a deterministic symmetric encryption scheme \(\textsf{SE}= \textsf{CBC}[\textsf{E}]\) shown in Fig. 5, where the key length is \(\mathsf {\textsf{SE}.kl}= \textsf{E}.\textsf{kl} + \textsf{E}.\textsf{ol}\), the message space \(\mathsf {\textsf{SE}.MS}= \bigcup _{t\in {{\mathbb {N}}}} \{0,1\}^{\textsf{E}.\textsf{ol}\cdot t}\) consists of messages whose lengths are multiples of the block length, and the ciphertext length function \(\mathsf {\textsf{SE}.cl}\) is the identity function. Note that Fig. 5 gives a somewhat non-standard definition for CBC, as it includes the IV (\(c_0\)) as part of the key material. However, in this work, we are only interested in one-time security of \(\textsf{SE}\), so keys and IVs are generated together, and the IV is not included as part of the ciphertext.
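To make the mode's structure concrete, here is a minimal Python sketch of \(\textsf{CBC}[\textsf{E}]\) with the IV treated as key material, as above. The block cipher `E` is a toy stand-in of our own (XOR with a key-derived pad; a valid permutation per key but offering no security), and the 32-byte cipher key length is our choice.

```python
import hashlib, secrets

BL = 16  # block length in bytes (E.ol / 8)

def E(ek: bytes, x: bytes) -> bytes:
    """Toy block cipher: XOR with a key-derived pad. A permutation per key,
    but NOT secure; a stand-in only to exhibit the mode's structure."""
    pad = hashlib.sha256(ek).digest()[:BL]
    return bytes(a ^ b for a, b in zip(x, pad))

E_inv = E  # XOR with a fixed pad is its own inverse

def cbc_enc(key: bytes, m: bytes) -> bytes:
    """CBC[E]: key = ek || IV; the IV (c_0) is key material, not part of
    the ciphertext. |m| must be a multiple of the block length."""
    ek, c_prev = key[:-BL], key[-BL:]
    out = []
    for i in range(0, len(m), BL):
        c_prev = E(ek, bytes(a ^ b for a, b in zip(m[i:i + BL], c_prev)))
        out.append(c_prev)
    return b"".join(out)

def cbc_dec(key: bytes, c: bytes) -> bytes:
    ek, c_prev = key[:-BL], key[-BL:]
    out = []
    for i in range(0, len(c), BL):
        ci = c[i:i + BL]
        out.append(bytes(a ^ b for a, b in zip(E_inv(ek, ci), c_prev)))
        c_prev = ci
    return b"".join(out)

key = secrets.token_bytes(32 + BL)  # ek (32 bytes, our choice) || IV
m = secrets.token_bytes(4 * BL)
assert cbc_dec(key, cbc_enc(key, m)) == m
assert len(cbc_enc(key, m)) == len(m)  # SE.cl is the identity function
```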

Fig. 5

Constructions of deterministic symmetric encryption schemes \(\textsf{CBC}[\textsf{E}]\) and \(\textsf{IGE}[\textsf{E}]\) from block cipher \(\textsf{E}\). Consider \(t\) as the number of blocks of m (or c), i.e. \(m = m_1 ~\Vert ~\ldots ~\Vert ~m_t\)

2.2.9 IGE block cipher mode of operation

Let \(\textsf{E}\) be a block cipher. Define the Infinite Garble Extension (IGE) mode of operation as \(\textsf{SE}= \textsf{IGE}[\textsf{E}]\) as in Fig. 5, with parameters as in the CBC mode except for key length \(\mathsf {\textsf{SE}.kl}= \textsf{E}.\textsf{kl} + 2 \cdot \textsf{E}.\textsf{ol}\) (since IGE has two IV blocks which we again include as part of the key). We depict IGE decryption in Fig. 6 as we rely on this in Sect. 6.

IGE was first defined in [22], which claims it has infinite error propagation and thus can provide integrity. This claim was disproved in an attack on Free-MAC [36], which has the same specification as IGE. [36] shows that given a plaintext–ciphertext pair it is possible to construct another ciphertext that will correctly decrypt to a plaintext such that only two of its blocks differ from the original plaintext, i.e. the “errors” introduced in the ciphertext do not propagate forever. IGE also appears as a special case of the Accumulated Block Chaining (ABC) mode [38]. A chosen-plaintext attack on ABC that relied on IV reuse between encryptions was described in [11].

Fig. 6

IGE mode decryption, where \(c_0 = IV _c\) and \(m_0 = IV _m\) are the initial values so decryption can be expressed as \(m_{i} = E_{K}^{-1}(c_{i} \oplus m_{i-1}) \oplus c_{i-1}\)
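The IGE equations can likewise be sketched in Python. As with the CBC sketch, `E` is a toy self-inverse stand-in of our own (not a secure cipher), and the key layout `ek || IV_m || IV_c` is our illustrative choice for packing the two IV blocks into the key material.

```python
import hashlib, secrets

BL = 16  # block length in bytes

def E(ek: bytes, x: bytes) -> bytes:
    """Toy XOR block cipher (self-inverse); a stand-in, NOT secure."""
    pad = hashlib.sha256(ek).digest()[:BL]
    return bytes(a ^ b for a, b in zip(x, pad))

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def ige_enc(key: bytes, m: bytes) -> bytes:
    """IGE[E] with key = ek || IV_m || IV_c (two IV blocks as key material):
    c_i = E(ek, m_i xor c_{i-1}) xor m_{i-1}, with m_0 = IV_m, c_0 = IV_c."""
    ek, m_prev, c_prev = key[:-2 * BL], key[-2 * BL:-BL], key[-BL:]
    out = []
    for i in range(0, len(m), BL):
        mi = m[i:i + BL]
        ci = xor(E(ek, xor(mi, c_prev)), m_prev)
        out.append(ci)
        m_prev, c_prev = mi, ci
    return b"".join(out)

def ige_dec(key: bytes, c: bytes) -> bytes:
    """m_i = E^{-1}(c_i xor m_{i-1}) xor c_{i-1}, cf. Fig. 6."""
    ek, m_prev, c_prev = key[:-2 * BL], key[-2 * BL:-BL], key[-BL:]
    out = []
    for i in range(0, len(c), BL):
        ci = c[i:i + BL]
        mi = xor(E(ek, xor(ci, m_prev)), c_prev)  # E is self-inverse here
        out.append(mi)
        m_prev, c_prev = mi, ci
    return b"".join(out)

key = secrets.token_bytes(32 + 2 * BL)
m = secrets.token_bytes(4 * BL)
assert ige_dec(key, ige_enc(key, m)) == m
```

Note how, unlike CBC, both the previous plaintext and the previous ciphertext block feed into each step, which is what the "infinite garble extension" name refers to.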

2.2.10 MD transform

Figure 7 defines the Merkle–Damgård transform as a function family \(\textsf{MD}[h ]\) for a given compression function \(h :\{0,1\}^\ell \times \{0,1\}^{\ell '} \rightarrow \{0,1\}^\ell \), with \(\textsf{MD}.\textsf{IN} = \bigcup _{t\in {{\mathbb {N}}}} \{0,1\}^{{\ell '} \cdot t}\), \(\textsf{MD}.\textsf{KS} = \{0,1\}^\ell \) and \(\textsf{MD}.\textsf{ol} = \ell \).

Fig. 7

Left pane: Construction of MD transform \(\textsf{MD}= \textsf{MD}[h ]\) from compression function \(h :\{0,1\}^\ell \times \{0,1\}^{\ell '} \rightarrow \{0,1\}^\ell \). Right pane: \(\textsf{SHA}-\textsf{pad}\) pads \(\textsf{SHA}-\textsf{1}\) or \(\textsf{SHA}-\textsf{256}\) input x to a length that is a multiple of 512 bits
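A Python sketch of the two panes of Fig. 7 follows. `sha_pad` implements the standard padding for byte-aligned inputs (append a 1 bit, zeros, then the 64-bit big-endian bit length); since `hashlib` does not expose the SHA-256 compression function, a toy compression of our own stands in for \(h\) in the chaining loop.

```python
import hashlib, struct

def sha_pad(x: bytes) -> bytes:
    """SHA-pad for SHA-1/SHA-256 (byte-aligned inputs): pad x to a length
    that is a multiple of 512 bits (64 bytes)."""
    bit_len = 8 * len(x)
    pad = b"\x80" + b"\x00" * ((55 - len(x)) % 64) + struct.pack(">Q", bit_len)
    return x + pad

def md(h, iv: bytes, x: bytes) -> bytes:
    """MD[h].Ev(iv, x): iterate compression function h over 64-byte blocks."""
    assert len(x) % 64 == 0
    state = iv
    for i in range(0, len(x), 64):
        state = h(state, x[i:i + 64])
    return state

# Toy compression function (NOT the real h_256, which hashlib does not expose):
h_toy = lambda state, block: hashlib.sha256(state + block).digest()

padded = sha_pad(b"abc")
assert len(padded) % 64 == 0 and padded[3] == 0x80  # first pad byte sets the 1 bit
digest = md(h_toy, b"\x00" * 32, padded)
assert len(digest) == 32
```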

2.2.11 \(\textsf{SHA}-\textsf{1}\) and \(\textsf{SHA}-\textsf{256}\)

Let \(\textsf{SHA}-\textsf{1}: \{0,1\}^* \rightarrow \{0,1\}^{160}\) and \(\textsf{SHA}-\textsf{256}: \{0,1\}^* \rightarrow \{0,1\}^{256}\) be the hash functions as defined in [47]. We refer to their compression functions as \(h _{160}: \{0,1\}^{160} \times \{0,1\}^{512} \rightarrow \{0,1\}^{160}\) and \(h _{256}: \{0,1\}^{256} \times \{0,1\}^{512} \rightarrow \{0,1\}^{256}\), and to their initial states as \(\textsf{IV}_{160}\) and \(\textsf{IV}_{256}\). We can write

$$\begin{aligned} \begin{aligned} \textsf{SHA}-\textsf{1}(x)&= \textsf{MD}[h _{160}].\textsf{Ev}(\textsf{IV}_{160}, \textsf{SHA}-\textsf{pad}(x)), \text { and} \\ \textsf{SHA}-\textsf{256}(x)&= \textsf{MD}[h _{256}].\textsf{Ev}(\textsf{IV}_{256}, \textsf{SHA}-\textsf{pad}(x)) \end{aligned} \end{aligned}$$

where \(\textsf{SHA}-\textsf{pad}\) is defined in Fig. 7.

2.2.12 \(\textsf{SHACAL}-\textsf{1}\) and \(\textsf{SHACAL}-\textsf{2}\)

Let \(\;\hat{+}\;\) be an addition operator over 32-bit words, meaning for any \(x,y\in \bigcup _{t\in {{\mathbb {N}}}}\{0,1\}^{32\cdot t}\) with \(\left| x\right| =\left| y\right| \) the instruction \(z \leftarrow x \;\hat{+}\;y\) splits x and y into 32-bit words and independently adds together words at the same positions, each modulo \(2^{32}\); it then computes z by concatenating together the resulting 32-bit words. Let \(\textsf{SHACAL}-\textsf{1}\) [30] be the block cipher defined by \(\textsf{SHACAL}-\textsf{1}.\textsf{kl} = 512\), \(\textsf{SHACAL}-\textsf{1}.\textsf{ol} = 160\) such that \(h _{160}(k, x) = k \;\hat{+}\;\textsf{SHACAL}-\textsf{1}.\textsf{Ev}(x, k)\). Similarly, let \(\textsf{SHACAL}-\textsf{2}\) be the block cipher defined by \(\textsf{SHACAL}-\textsf{2}.\textsf{kl} = 512\), \(\textsf{SHACAL}-\textsf{2}.\textsf{ol} = 256\) such that \(h _{256}(k, x) = k \;\hat{+}\;\textsf{SHACAL}-\textsf{2}.\textsf{Ev}(x, k)\). See Fig. 8.
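The \(\;\hat{+}\;\) operator can be made concrete in a few lines of Python (big-endian words, matching the paper's conventions; the function name is ours):

```python
def hat_add(x: bytes, y: bytes) -> bytes:
    """Word-wise addition: split x and y into 32-bit big-endian words and add
    corresponding words modulo 2^32, with no carry between words."""
    assert len(x) == len(y) and len(x) % 4 == 0
    out = bytearray()
    for i in range(0, len(x), 4):
        w = (int.from_bytes(x[i:i + 4], "big") +
             int.from_bytes(y[i:i + 4], "big")) % 2**32
        out += w.to_bytes(4, "big")
    return bytes(out)

# 0xFFFFFFFF + 0x00000001 wraps to 0 within its word; the neighbouring
# word is added independently (no carry propagation across words).
a = bytes.fromhex("ffffffff" "00000002")
b = bytes.fromhex("00000001" "00000003")
assert hat_add(a, b) == bytes.fromhex("00000000" "00000005")
```

This feed-forward of the chaining value k via \(\;\hat{+}\;\) is what turns the \(\textsf{SHACAL}\) block ciphers into the SHA compression functions.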

Fig. 8

\(\textsf{SHA}-\textsf{256}\) compression function \(h _{256}{}\) and its underlying block cipher \(\textsf{SHACAL}-\textsf{2}{}\)

3 Bidirectional channels

3.1 Our formal model in the context of prior work

3.1.1 The choice of a cryptographic primitive

We model the symmetric part of Telegram’s MTProto protocol as a bidirectional cryptographic channel. A channel provides a method for two users to exchange messages, and it is called bidirectional [43] when each user can both send and receive messages. A unidirectional channel provides an interface between two users where only a single user can send messages, and only the opposite user can receive them. Two unidirectional channels can be composed to build a bidirectional channel, but some care needs to be taken to establish what level of security is inherited by the resulting channel [43]. A symmetric encryption scheme can be thought of as a special case of a unidirectional channel; once its encryption and decryption algorithms are modelled as being stateful, it can achieve security notions stronger than unforgeability [15, 16].

MTProto uses distinct but related secret keys to send messages in the two directions of the channel, so modelling it as a unidirectional channel would not be sufficient: such an analysis could miss bad interactions between the two directions.

3.1.2 The choice of a security model

Cryptographic security models normally require that channels provide in-order delivery of all messages. In the unidirectional setting, this means that the receiver should only accept messages in the same order as they were dispatched by the sender. In particular, the channel must prevent all attempts to forge, replay, reorder or drop messages.Footnote 5 In the bidirectional setting, in-order delivery is required to hold separately in each direction of communication.Footnote 6
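For intuition, in-order delivery in a single direction is typically enforced by comparing an authenticated sequence number against a local receive counter. The following is a minimal Python sketch of this idea (ours, not MTProto’s mechanism):

```python
def make_inorder_receiver():
    """Accept (seq, m) pairs only in strictly increasing order starting from 0,
    thereby rejecting replays, reorderings and deliveries after a drop."""
    expected = 0
    def recv(seq: int, m: str):
        nonlocal expected
        if seq != expected:
            return None      # replay (seq < expected) or reorder/drop (seq > expected)
        expected += 1
        return m
    return recv
```

In a real channel the sequence number would be bound to the ciphertext by the authentication mechanism, so that an attacker cannot simply rewrite it.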

The current version of MTProto 2.0 does not enforce in-order message delivery. It determines whether a successfully decrypted ciphertext should be accepted based on a complex set of rules; in particular, these rules happen to allow message reordering, as we describe in Sect. 4.2. We consider this a vulnerability, so in Sect. 4.4 we define a slight variant of MTProto 2.0 that enforces in-order delivery. Our security analysis in Sect. 5 is then provided with respect to the fixed version of the protocol. Nevertheless, we set out to choose a formal model for channels that could also potentially be used to analyse the current version of MTProto 2.0. In particular, we chose a model that can express both in-order delivery and the message delivery rules that are used in the current version.Footnote 7

No prior work on bidirectional channels defines correctness and security notions that could be used to capture message delivery rules of varied strengths. In the unidirectional setting, [20, 40] each define a hierarchy of multiple security notions where the weakest notion requires only unforgeability and the strongest requires in-order delivery. [28, 51] define abstract definitional frameworks for unidirectional channels with fully parametrisable security notions. In this work we extend the robust channel framework of [27, 28], lifting it to the bidirectional setting.

3.1.3 Extending the robust channel framework

The robust channel framework [28] defines unidirectional correctness and security notions with respect to an arbitrary support predicate. When a ciphertext is delivered to the receiver, the corresponding notion uses the support predicate to determine whether the channel is expected to accept this ciphertext or to reject it, i.e. whether this ciphertext is currently supported. For example, the notion of correctness in [28] requires that a channel accepts and correctly decrypts all supported ciphertexts, whereas their notion of integrity requires that a channel rejects all ciphertexts that are not supported. The correctness and security games in [28] maintain a sequence of ciphertexts that were sent by the sender, and a sequence of ciphertexts that were received and accepted by the receiver. A support predicate takes both sequences as input and it can use them to decide on whether an incoming ciphertext is supported. For completeness, we provide the core definitions of [28] in Appendix C.2.

We lift the robust channel framework [28] to the bidirectional setting, and we significantly extend it in other ways. Most importantly, our framework uses more information to determine whether an incoming ciphertext is supported. In particular, we define our correctness and security games to maintain a support transcript for each of the two users; this extends the idea of using sequences of sent and received ciphertexts in [28]. A user’s support transcript represents a sequence of events, each entry describing an attempt to send or to receive a message. More precisely, each entry can be thought of as describing one of the following events (stated in terms of some specific plaintext m and/or ciphertext c): “sent c that encrypts m”, “failed to send m”, “received c, accepted it, and decrypted it as m”, “received c and rejected it”. In our framework, the support transcripts are used by a support function; it extends the concept of the support predicate from [28]. Given the support transcripts of both users as input, a support function in our framework is meant to prescribe the exact behaviour of a channel when a new ciphertext is delivered to either user. Namely, a support function either determines that the incoming ciphertext must be rejected, or it determines that the incoming ciphertext must be accepted and a specific plaintext value must be obtained upon decrypting this ciphertext. For example, our notion of correctness is similar in spirit to that of [28], requiring that a channel accepts and correctly decrypts each ciphertext that is not rejected by the support function. The core difference between our correctness notion and that of [28] is in how these definitions determine whether a specific ciphertext was decrypted “correctly”. In our framework, the output of a support function prescribes that a specific plaintext value must be obtained, whereas in [28] the correctness game builds a lookup table to determine that value.

The above example provides an intuition that by defining our support transcripts to contain plaintext messages, we obtain simpler correctness and security definitions when compared to [28]. But one could also see this as a trade-off between different parts of the formalism, because some complexity that is removed from the correctness and security games might simply be relegated to the step of specifying and analysing a support function. In order to better understand how our framework relates to the robust channel framework, in Appendix C we provide a thorough comparison between the unidirectional variants of our definitions and those of [28].

3.1.4 Relation to secure messaging models

A recent line of work uses channels to study the best achievable security of instant messaging between two users. A limited, unidirectional case was first considered by [18]; follow-up work uses bidirectional channels [7, 21, 33, 35]. The focus is on achieving strong forward security and post-compromise security guarantees in the presence of an attacker that can compromise secret states of the users. With the exception of [7], all of this work models channels that are required to provide in-order message delivery. In contrast, the immediate decryption-aware channel of [7] effectively allows message drops but mandates that the dropped messages can later be delivered and retroactively assigned to their correct positions in the communication transcript. Any of these bidirectional models except [7] could be simplified (to not require advanced security properties) and used for a formal analysis of our MTProto-based channel from Sect. 4.4. None of these models would be able to capture the correctness and security properties of MTProto 2.0 as it is currently implemented.

3.2 Syntax of channels

We refer to the two users of a channel as \(\mathcal {I}\) and \(\mathcal {R}\). These will map to the client and the server in the setting of MTProto. We use \(\textit{u}\in \{\mathcal {I},\mathcal {R}\}\) as a variable to represent an arbitrary user and \(\overline{\textit{u}}\) to represent the other user, meaning \(\overline{\textit{u}}\) denotes the sole element of \(\{\mathcal {I},\mathcal {R}\}\setminus \{\textit{u}\}\). We use \(\textit{st}_\textit{u}\) to represent the internal state of user \(\textit{u}\). A channel uses an initialisation algorithm to abstract away the key agreement; this matches the main focus of our work—to study the symmetric encryption of MTProto.

Fig. 9

Syntax of the constituent algorithms of channel \(\textsf{CH}\)

Definition 1

A channel \(\textsf{CH}\) specifies algorithms \(\mathsf {\textsf{CH}.Init}\), \(\mathsf {\textsf{CH}.Send}\) and \(\mathsf {\textsf{CH}.Recv}\), where \(\mathsf {\textsf{CH}.Recv}\) is deterministic. The syntax used for the algorithms of \(\textsf{CH}\) is given in Fig. 9. Associated to \(\textsf{CH}\) is a plaintext space \(\mathsf {\textsf{CH}.MS}\subseteq \{0,1\}^* \setminus \{\varepsilon \}\) and a randomness space \(\mathsf {\textsf{CH}.SendRS}\) of \(\mathsf {\textsf{CH}.Send}\). The initialisation algorithm \(\mathsf {\textsf{CH}.Init}\) returns \(\mathcal {I}\)’s and \(\mathcal {R}\)’s initial states \(\textit{st}_{\mathcal {I}}\) and \(\textit{st}_{\mathcal {R}}\). The sending algorithm \(\mathsf {\textsf{CH}.Send}\) takes \(\textit{st}_{\textit{u}}\) for some \(\textit{u}\in \{\mathcal {I},\mathcal {R}\}\), a plaintext \(m\in \mathsf {\textsf{CH}.MS}\), and auxiliary information \(\textit{aux}\) to return the updated state \(\textit{st}_{\textit{u}}\) and a ciphertext c, where \(c = \bot \) may be used to indicate a failure to send. We may surface random coins \(r\in \mathsf {\textsf{CH}.SendRS}\) as an additional input to \(\mathsf {\textsf{CH}.Send}\). The receiving algorithm \(\mathsf {\textsf{CH}.Recv}\) takes \(\textit{st}_{\textit{u}}\), c and auxiliary information \(\textit{aux}\) to return the updated state \(\textit{st}_{\textit{u}}\) and a plaintext \(m\in \mathsf {\textsf{CH}.MS}\cup \{\bot \}\), where \(\bot \) indicates a failure to recover a plaintext.

Our channel definition reflects some unusual choices that are necessary to model the MTProto protocol. The abstract auxiliary information field \(\textit{aux}\) will be used to associate timestamps to each sent and received message.Footnote 8 In this work, we do not use the \(\textit{aux}\) field to model associated data that would need to be authenticated, but our definitions in principle allow it to be used that way. Also note that the sending algorithm \(\mathsf {\textsf{CH}.Send}\) is randomised, even though a stateful channel in general does not need randomness to achieve basic security notions. We use randomness only to faithfully model MTProto, which uses it to determine the length and contents of message padding. Our correctness and security notions will let an attacker choose arbitrary random coins, so we surface them as an optional input to the sending algorithm \(\mathsf {\textsf{CH}.Send}\).
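The syntax of Definition 1 can be transcribed as a small interface (a Python sketch with names of our choosing; states are opaque values, and `None` models the failure symbol \(\bot \)):

```python
from typing import Any, Optional, Tuple

class Channel:
    """Syntax of a bidirectional channel CH (Definition 1).

    init returns the initial states of users I and R. send and recv take a
    user's state and return the updated state together with a ciphertext
    (None modelling a failure to send) or a plaintext (None modelling a
    failure to recover a plaintext). recv must be deterministic; send may
    additionally take explicit random coins r.
    """

    def init(self) -> Tuple[Any, Any]:
        raise NotImplementedError              # -> (st_I, st_R)

    def send(self, st: Any, m: bytes, aux: bytes, r: bytes = b"") -> Tuple[Any, Optional[bytes]]:
        raise NotImplementedError              # -> (st_u, c), c possibly None

    def recv(self, st: Any, c: bytes, aux: bytes) -> Tuple[Any, Optional[bytes]]:
        raise NotImplementedError              # -> (st_u, m), m possibly None
```

Returning the updated state explicitly, rather than mutating it in place, mirrors how the games of Sect. 3.4 thread the users’ states through oracle calls.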

3.3 Support transcripts and functions

In this section, we extend the definitional framework for robust channels from [28]. In Sect. 3.1, we outlined the core differences between the two frameworks, and in Appendix C we provide a detailed comparison between them.

3.3.1 Support transcripts

We define a support transcript to represent the communication record of a single user. Its entries are ordered chronologically, and each describes an attempt to send or to receive a plaintext. A support transcript \(\textsf{tr}_{\textit{u}}\) of user \(\textit{u}\in \{\mathcal {I},\mathcal {R}\}\) contains two types of entries: \((\textsf{sent}, m, \textsf{label}, \textit{aux})\) and \((\textsf{recv}, m, \textsf{label}, \textit{aux})\) for an event of sending or receiving some plaintext m, respectively. In either case, \(\textsf{label}\) is a support label whose purpose is to distinguish between different network messages each encrypting or encoding a specific plaintext m, and \(\textit{aux}\) is auxiliary information such as the timestamp at the moment of sending or receiving the network message. Depending on the level of abstraction, our model uses ciphertexts or message encodings as support labels.Footnote 9

Definition 2

A support transcript \(\textsf{tr}_{\textit{u}}\) for user \(\textit{u}\in \{\mathcal {I},\mathcal {R}\}\) is a list of entries of the form \((\textsf{op}, m, \textsf{label}, \textit{aux})\), where \(\textsf{op}\in \{\textsf{sent}, \textsf{recv}\}\). An entry with \(\textsf{op}= \textsf{sent}\) indicates that user \(\textit{u}\) attempted to send a network message that encrypts or encodes plaintext m with auxiliary information \(\textit{aux}\). An entry with \(\textsf{op}= \textsf{recv}\) indicates that user \(\textit{u}\) received a network message with auxiliary information \(\textit{aux}\) and used it to recover plaintext m. In either case, the network message is identified by its support label \(\textsf{label}\).

A support transcript is not intended to surface the implementation details of the primitive that is used for communication. This is reflected in our abstract treatment of the support labels: an outside observer with no knowledge of the internal states of the two communicating users might not be able to interpret the (possibly encrypted) network messages that are being exchanged. So our framework treats each network message as a mere label that can be observed to be sent by a user in response to some plaintext input. One might subsequently observe the same label being taken as input by the opposite user, resulting in some plaintext output. If the scheme used for the two-user communication guarantees that all such labels are unique, an observer might be able to use the equality of exchanged labels across both support transcripts to determine whether a message replay, reordering or drop occurred. The MTProto-based scheme that we study in this paper produces distinct ciphertexts, and our framework uses ciphertexts as support labels when analysing a channel; this will allow us to rely on equality patterns that arise between them. In this work, we use no information about support labels beyond their equality patterns.

Support transcripts can include entries of the form \((\textsf{recv}, m, \textsf{label}, \textit{aux})\) with the plaintext \(m = \bot \) to indicate that the received network message was rejected. Support transcripts can also include entries of the form \((\textsf{sent}, m, \textsf{label}, \textit{aux})\) with the support label value \(\textsf{label}= \bot \), e.g. to indicate that a network message encrypting the plaintext m could not be sent over a terminated channel. Our support transcripts are therefore suitable for two-user communication primitives that implement a wide range of possible behaviours in the event of an error, from terminating after the first failure to full recovery.

Fig. 10

Construction of sample channel \(\textsf{CH}= \textsf{SAMPLE}\text {-}\textsf{CH}[\textsf{SE}]\) from symmetric encryption scheme \(\textsf{SE}\). This channel ignores the auxiliary information \(\textit{aux}\)

Fig. 11

Communication between users \(\mathcal {I}\) and \(\mathcal {R}\) over the sample channel \(\textsf{CH}\) defined in Fig. 10. The resulting communication records of \(\mathcal {I}\) and \(\mathcal {R}\) are represented by support transcripts \(\textsf{tr}_{\mathcal {I}}\) and \(\textsf{tr}_{\mathcal {R}}\), respectively. The transcripts contradict each other due to adversarial behaviour on the network

We now provide the construction of a sample channel, along with an example of how communication over this channel can be captured using support transcripts. We will use this channel and its support transcripts to showcase more examples throughout this section. Let \(\textsf{SE}\) be an arbitrary symmetric encryption scheme that provides integrity and confidentiality (i.e. it provides authenticated encryption). Consider a sample channel \(\textsf{CH}= \textsf{SAMPLE}\text {-}\textsf{CH}[\textsf{SE}]\) as defined in Fig. 10. In addition to the security assurances inherited from \(\textsf{SE}\), the channel \(\textsf{CH}\) is only designed to prevent forgeries that could occur by mirroring a ciphertext back to its sender. Figure 11 provides a step-by-step example of communication between users \(\mathcal {I}\) and \(\mathcal {R}\) over \(\textsf{CH}\). It shows \(\mathcal {I}\)’s and \(\mathcal {R}\)’s support transcripts at the end of the communication between them, where the channel’s ciphertexts are used as labels. Note that the ciphertext \(c_{\mathcal {I}, 2}\) was dropped and the ciphertext \(c_{\mathcal {I}, 0}\) was replayed in its place. As a result, each user’s transcript shows that the other user endorsed crimes.
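Figure 10’s pseudocode is not reproduced here, but the following Python sketch captures one natural reading of \(\textsf{SAMPLE}\text {-}\textsf{CH}[\textsf{SE}]\): both users share an \(\textsf{SE}\) key, \(\mathsf {\textsf{CH}.Send}\) prepends the sender’s identity to the plaintext, and \(\mathsf {\textsf{CH}.Recv}\) rejects any ciphertext whose decrypted identity tag matches the receiver’s own identity, which blocks ciphertexts mirrored back to their sender. The toy encrypt-then-MAC scheme standing in for \(\textsf{SE}\), and all names, are ours and for illustration only.

```python
import hmac
import os
from typing import Optional

def se_enc(key: bytes, m: bytes) -> bytes:
    """Toy authenticated encryption: HMAC-SHA256 keystream, then a MAC over
    the ciphertext. Stands in for an arbitrary AE-secure scheme SE."""
    nonce = os.urandom(16)
    stream = b"".join(hmac.digest(key, nonce + i.to_bytes(4, "big"), "sha256")
                      for i in range((len(m) + 31) // 32))
    ct = bytes(a ^ b for a, b in zip(m, stream))
    tag = hmac.digest(key, b"mac" + nonce + ct, "sha256")
    return nonce + ct + tag

def se_dec(key: bytes, c: bytes) -> Optional[bytes]:
    if len(c) < 48:
        return None
    nonce, ct, tag = c[:16], c[16:-32], c[-32:]
    if not hmac.compare_digest(tag, hmac.digest(key, b"mac" + nonce + ct, "sha256")):
        return None                              # reject forgeries
    stream = b"".join(hmac.digest(key, nonce + i.to_bytes(4, "big"), "sha256")
                      for i in range((len(ct) + 31) // 32))
    return bytes(a ^ b for a, b in zip(ct, stream))

class SampleChannel:
    """Sketch of SAMPLE-CH[SE]: one shared SE key for both directions; Send
    prepends the sender's identity so Recv can reject reflected ciphertexts.
    The aux field is ignored, as in Fig. 10."""

    def init(self):
        key = os.urandom(32)
        return ("I", key), ("R", key)            # st_I, st_R

    def send(self, st, m: bytes, aux: bytes = b""):
        user, key = st
        return st, se_enc(key, user.encode() + m)

    def recv(self, st, c: bytes, aux: bytes = b""):
        user, key = st
        pt = se_dec(key, c)
        if pt is None or pt[:1] == user.encode():
            return st, None                      # forgery or mirrored ciphertext
        return st, pt[1:]
```

As the text notes, this channel inherits its integrity and confidentiality from \(\textsf{SE}\) and otherwise prevents only reflection; in particular, nothing here stops the replay of \(c_{\mathcal {I}, 0}\) shown in Fig. 11.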

3.3.2 Support functions

We now define the notion of a support function. We use a support function to prescribe the exact input–output behaviour of a receiver at any point in a two-user communication process (i.e. we use it to specify the expected behaviour of a channel’s decryption algorithm or that of a message encoding scheme’s decoding algorithm, the latter primitive defined in Sect. 3.5). More specifically, a support function \(\textsf{supp}\) determines whether a user \(\textit{u}\in \{\mathcal {I},\mathcal {R}\}\) should accept an incoming network message—that is associated with a support label \(\textsf{label}\)—from the opposite user \(\overline{\textit{u}}\), based on the support transcripts \(\textsf{tr}_{\textit{u}}, \textsf{tr}_{\overline{\textit{u}}}\) of both users. If the network message should be accepted, then \(\textsf{supp}\) must return a plaintext \(m^*\) to indicate that \(\textit{u}\) is expected to recover \(m^*\) as a result of accepting it; otherwise, \(\textsf{supp}\) must return \(\bot \) to indicate that the network message must be rejected. We also let \(\textsf{supp}\) take the auxiliary information \(\textit{aux}\) as input so that timestamps can be captured in our definitions.

Definition 3

A support function \(\textsf{supp}\) is an efficiently computable deterministic function, written \(\textsf{supp}(\textit{u}, \textsf{tr}_{\textit{u}}, \textsf{tr}_{\overline{\textit{u}}}, \textsf{label}, \textit{aux}) \rightarrow m^*\), where \(\textit{u}\in \{\mathcal {I}, \mathcal {R}\}\), \(\textsf{tr}_{\textit{u}}\), \(\textsf{tr}_{\overline{\textit{u}}}\) are support transcripts for users \(\textit{u}\) and \(\overline{\textit{u}}\), respectively, \(\textsf{label}\) is any label that identifies the network message, and \(\textit{aux}\) is auxiliary information associated with the network message; \(m^*\) is then the plaintext message that should be recovered by \(\textit{u}\).

In Sect. 3.4, we define the notions of channel correctness, integrity, and indistinguishability. Our correctness and integrity notions jointly require that the channel’s receiving algorithm works exactly as prescribed by a specific support function. More precisely, both notions require that the channel’s receiving algorithm consistently returns the same output as that returned by the support function, but each notion is defined with respect to an adversary that has different capabilities. In the correctness game, the adversary gets the channel’s state as input and is only allowed to query the receiving algorithm on supported ciphertexts (i.e. those that are not rejected by the support function). In the integrity game, the adversary does not get any secrets as input and is allowed to query the receiving algorithm on all possible inputs, including attempted ciphertext forgeries or any gibberish inputs that aim to corrupt the channel’s state. This definitional approach is similar in spirit to how correctness and integrity are defined for basic cryptographic primitives. For example, for a symmetric encryption scheme one often considers the notions of decryption correctness and ciphertext integrity, where the former should hold even when the adversary knows the secret key, whereas the latter requires the adversary to produce ciphertext forgeries without knowing the key. In comparison, a channel is a stateful primitive so its correctness and integrity conditions can be significantly more complex, depending on how it should treat forgeries, replays, reordering, and drops. A support function allows us to capture these conditions in a modular way. Finally, the notion of indistinguishability that we define for a channel requires that the output of the channel’s sending algorithm leaks no information about the encrypted plaintext; this security notion makes no use of a support function.

Fig. 12

Sample support functions \(\textsf {SAMPLE\text {-}SUPP}_0\) and \(\textsf {SAMPLE\text {-}SUPP}_1\). Support function \(\textsf {SAMPLE\text {-}SUPP}_1\) includes the boxed code, and support function \(\textsf {SAMPLE\text {-}SUPP}_0\) does not include it. Both support functions allow arbitrary auxiliary information, never checking the values of \(\textit{aux}\) and \(\textit{aux}'\)

Consider a sample support function \(\textsf {SAMPLE\text {-}SUPP}_0\) in Fig. 12. It does not contain the boxed code. The support function prohibits forgeries by returning \(\bot \) if the opposite user’s support transcript \(\textsf{tr}_{\overline{\textit{u}}}\) does not contain an entry indicating that \(\overline{\textit{u}}\) previously sent a network message associated with the support label \(\textsf{label}\). If a forgery is not detected, then the support function finds and returns a plaintext m such that \((\textsf{sent}, m, \textsf{label}, \textit{aux}')\) belongs to \(\textsf{tr}_{\overline{\textit{u}}}\) with any \(\textit{aux}'\). For any symmetric encryption scheme \(\textsf{SE}\) that provides authenticated encryption, recall algorithms \(\mathsf {\textsf{CH}.Init}\) and \(\mathsf {\textsf{CH}.Send}\) of the sample channel \(\textsf{CH}= \textsf{SAMPLE}\text {-}\textsf{CH}[\textsf{SE}]\) defined in Fig. 10; let us treat ciphertexts produced by \(\mathsf {\textsf{CH}.Send}\) as support labels. Then, the algorithm \(\mathsf {\textsf{CH}.Recv}\) from Fig. 10 implements the functionality that is prescribed by \(\textsf {SAMPLE\text {-}SUPP}_0\): it rejects forgeries and otherwise recovers and returns the originally encrypted plaintext. Note that \(\textsf {SAMPLE\text {-}SUPP}_0\) takes the first plaintext m that it finds associated to \(\textsf{label}\) in \(\textsf{tr}_{\overline{\textit{u}}}\), without checking whether any other plaintext values are also associated to \(\textsf{label}\). This does not produce ambiguity when used with algorithms \(\mathsf {\textsf{CH}.Init}\) and \(\mathsf {\textsf{CH}.Send}\); implicit in our example is that \(\textsf{SE}\) provides decryption correctness, and therefore, two distinct plaintexts cannot be encrypted into the same ciphertext (and hence be mapped to the same support label). This illustrates that a support function may appear ambiguous in isolation, but when considered alongside a channel whose properties rule out such ambiguity, its behaviour may be well-defined.

Consider another sample support function \(\textsf {SAMPLE\text {-}SUPP}_1\) as defined in Fig. 12. In addition to the code from \(\textsf {SAMPLE\text {-}SUPP}_0\), this support function also contains the boxed code. The added code is designed to prevent replays by rejecting any network message associated with a support label \(\textsf{label}\) that is already present in one of the entries of the receiver’s support transcript \(\textsf{tr}_{\textit{u}}\). For example, consider the following intermediate support transcripts of users \(\mathcal {I}\) and \(\mathcal {R}\) that could have arisen at some point during the communication displayed in Fig. 11:

$$\begin{aligned} \begin{aligned} \textsf{tr}_{\mathcal {I}, 3} = \big [&(\textsf{sent}, \text {``I say yes to''}, c_{\mathcal {I}, 0}, \varepsilon ), (\textsf{sent}, \text {``all the pizza''}, c_{\mathcal {I}, 1}, \varepsilon ), \\ &(\textsf{sent}, \text {``I say no to''}, c_{\mathcal {I}, 2}, \varepsilon ) \big ] \\ \textsf{tr}_{\mathcal {R}, 2} = \big [&(\textsf{recv}, \text {``I say yes to''}, c_{\mathcal {I}, 0}, \varepsilon ), (\textsf{recv}, \text {``all the pizza''}, c_{\mathcal {I}, 1}, \varepsilon ) \big ] \end{aligned} \end{aligned}$$

These support transcripts represent the moment when \(\mathcal {I}\) has already sent 3 network messages, but so far \(\mathcal {R}\) has only received 2 of them. Following Fig. 11, let us assume that a replay attack happens next and \(\mathcal {R}\) receives a network message containing the ciphertext \(c_{\mathcal {I}, 0}\) with auxiliary information \(\textit{aux}= \varepsilon \). According to \(\textsf {SAMPLE\text {-}SUPP}_0\), this network message should be accepted (and should decrypt to \(m^*= \text {``I say yes to''}\)), but according to \(\textsf {SAMPLE\text {-}SUPP}_1\) this network message should be rejected:

$$\begin{aligned} \begin{aligned} \textsf {SAMPLE\text {-}SUPP}_0(\mathcal {R}, \textsf{tr}_{\mathcal {R}, 2}, \textsf{tr}_{\mathcal {I}, 3}, c_{\mathcal {I}, 0}, \varepsilon )&= \text {``I say yes to''} \\ \textsf {SAMPLE\text {-}SUPP}_1(\mathcal {R}, \textsf{tr}_{\mathcal {R}, 2}, \textsf{tr}_{\mathcal {I}, 3}, c_{\mathcal {I}, 0}, \varepsilon )&= \bot \end{aligned} \end{aligned}$$

Note that the algorithm \(\mathsf {\textsf{CH}.Recv}\) from Fig. 10 can be changed to simply reject duplicate ciphertexts in order to accommodate the specification of \(\textsf {SAMPLE\text {-}SUPP}_1\), without having to change algorithms \(\mathsf {\textsf{CH}.Init}\) and \(\mathsf {\textsf{CH}.Send}\). That would result in a contrived channel where the same plaintext can be encrypted and sent multiple times, but only the first of the resulting ciphertexts is allowed to be received. A more appropriate change would also concatenate a distinct counter to each plaintext processed by \(\mathsf {\textsf{CH}.Send}\), so that the same plaintext can be sent and received many times while still preventing replay attacks by a third party.
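The replay example above can be executed directly (a Python sketch; both support functions are transcriptions of the logic described for Fig. 12, with transcript entries represented as tuples \((\textsf{op}, m, \textsf{label}, \textit{aux})\) and \(\bot \) rendered as `None`):

```python
def sample_supp0(u, tr_u, tr_other, label, aux):
    """Reject forgeries: accept a label only if the opposite user's transcript
    records sending it, and then prescribe the first plaintext sent under it."""
    for (op, m, lbl, _aux) in tr_other:
        if op == "sent" and lbl == label:
            return m                 # aux values are never checked
    return None                      # bottom: the network message must be rejected

def sample_supp1(u, tr_u, tr_other, label, aux):
    """The boxed code: additionally reject replays, i.e. any label already
    present in the receiver's own transcript."""
    for (op, _m, lbl, _aux) in tr_u:
        if op == "recv" and lbl == label:
            return None
    return sample_supp0(u, tr_u, tr_other, label, aux)

# Intermediate transcripts from the text: I has sent three messages, R has
# received two of them, and the ciphertext c_I0 is now replayed to R.
tr_I3 = [("sent", "I say yes to", "c_I0", ""),
         ("sent", "all the pizza", "c_I1", ""),
         ("sent", "I say no to", "c_I2", "")]
tr_R2 = [("recv", "I say yes to", "c_I0", ""),
         ("recv", "all the pizza", "c_I1", "")]

assert sample_supp0("R", tr_R2, tr_I3, "c_I0", "") == "I say yes to"
assert sample_supp1("R", tr_R2, tr_I3, "c_I0", "") is None
```

The two assertions reproduce the displayed outputs: \(\textsf {SAMPLE\text {-}SUPP}_0\) prescribes accepting the replayed ciphertext, while \(\textsf {SAMPLE\text {-}SUPP}_1\) prescribes rejecting it.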

We now provide some observations about the power of support functions. This is irrelevant for the purpose of analysing MTProto, but is useful to highlight the strengths and limitations of our framework in general:

  • A support function does not take as input any information about the internal state of the primitive that is used for communication (i.e. that of a channel or a message encoding scheme). But a communication primitive might use its internal state to interpret incoming network messages in a non-trivial way. For example, in some channels the same ciphertext (in our framework associated with the same support label) could be repeatedly decrypted to a different plaintext depending on some shared secret that is being synchronously evolved by both users. A support function might not be able to capture a receiver’s behaviour in cases like this. Support functions are best suited for communication where the knowledge that “user \(\textit{u}\) created a network message \(\xi \) to send a plaintext m” uniquely determines that the opposite user \(\overline{\textit{u}}\) can only recover m from \(\xi \) (or otherwise produce the error symbol \(\bot \)).

  • Due to having access to user support transcripts, a support function can prescribe a receiver’s behaviour that is not achievable by any implementation. For example, if two channel ciphertexts \(c_0\), \(c_1\) were sent by the user \(\textit{u}\) prior to any of them being received by the user \(\overline{\textit{u}}\), then a support function can require \(\overline{\textit{u}}\) to recover both underlying plaintexts from the first ciphertext it receives. This is impossible if each ciphertext encrypted an independently sampled and uniformly random value.

  • A support function prescribes a receiver’s behaviour with respect to a pair of existing support transcripts. But our framework does not have a similar way to state complex requirements regarding a sender’s behaviour. For example, our framework can require a channel user’s receiving algorithm to perpetually return \(\bot \) once the channel is considered closed (e.g. due to repeated errors while processing incoming ciphertexts), but it cannot require the same user’s sending algorithm to subsequently return \(\bot \) in response to all attempts to send new plaintexts.

In Sect. 5.3, we define the support function \(\textsf{supp}\text {-}\textsf{ord}\) with respect to which we will analyse the security of MTProto 2.0. In Appendix A, we formalise two correctness-style properties of a support function, but we do not mandate that they must always be met. Both properties were also considered in [28]. The integrity of a support function requires that it always returns \(\bot \) if the queried support label \(\textsf{label}\) does not appear in the opposite user’s support transcript \(\textsf{tr}_{\overline{\textit{u}}}\). The order correctness of a support function requires that it enforces in-order delivery for each direction between the two users separately, assuming that each network message is associated with a distinct support label.

3.4 Correctness and security of channels

In Sect. 3.3, we provided a high-level intuition for how we define channel correctness and security notions; here we formalise them. In all of the notions, we allow the adversary to control the randomness used by the channel’s sending algorithm \(\mathsf {\textsf{CH}.Send}\). Channels are stateful, so they can achieve strong notions of security even when the adversary controls the randomness used for encryption.

Fig. 13

Correctness of channel \(\textsf{CH}\); integrity of channel \(\textsf{CH}\). Both notions are defined with respect to support function \(\textsf{supp}\)

3.4.1 Correctness

Consider the correctness game \(\textrm{G}^{\textsf{corr}}_{\textsf{CH}, \textsf{supp}, \mathcal {F}}\) in Fig. 13, defined for a channel \(\textsf{CH}\), a support function \(\textsf{supp}\) and an adversary \(\mathcal {F}\). The advantage of \(\mathcal {F}\) in breaking the correctness of \(\textsf{CH}\) with respect to \(\textsf{supp}\) is defined as \(\textsf{Adv}^{\textsf{corr}}_{\textsf{CH}, \textsf{supp}}(\mathcal {F}) = \Pr [\textrm{G}^{\textsf{corr}}_{\textsf{CH}, \textsf{supp}, \mathcal {F}}]\). The game starts by calling the algorithm \(\mathsf {\textsf{CH}.Init}\) to initialise users \(\mathcal {I}\) and \(\mathcal {R}\), and the adversary is given their initial states. The adversary \(\mathcal {F}\) gets access to a sending oracle \(\textsc {Send}\) and to a receiving oracle \(\textsc {Recv}\). Calling \(\textsc {Send}(\textit{u}, m, \textit{aux}, r)\) encrypts the plaintext m with auxiliary data \(\textit{aux}\) and randomness \(r\) from the user \(\textit{u}\) to the other user \(\overline{\textit{u}}\); the resulting tuple \((\textsf{sent},m,c,\textit{aux})\) is added to the sender’s transcript \(\textsf{tr}_{\textit{u}}\). Oracle \(\textsc {Recv}\) can only be called on ciphertexts that should not produce a decryption error according to the behaviour prescribed by the support function \(\textsf{supp}\) (when queried on the current support transcripts), meaning \(\textsc {Recv}\) immediately exits with \(\bot \) when \(\textsf{supp}\) returns \(m^*= \bot \). Calling \(\textsc {Recv}(\textit{u}, c, \textit{aux})\) thus recovers the plaintext \(m^*\) from the support function, decrypts the queried ciphertext c into plaintext m and adds \((\textsf{recv},m,c,\textit{aux})\) to the receiver’s transcript \(\textsf{tr}_{\textit{u}}\); the game verifies that the decrypted plaintext m is equal to \(m^*\). If the adversary can cause the channel to output a different m, then the adversary wins. 
This game captures the minimal requirement one would expect from a communication channel: that it succeeds in decrypting incoming ciphertexts in accordance with its specification, with only limited interference possible from an adversary. In particular, the adversary is not allowed to test whether the channel appropriately identifies and handles errors.

Note that the \(\textsc {Recv}\) oracle always returns \(\bot \), but \(\mathcal {F}\) can use the support function to compute the value m on its own as long as the condition \(m = m^*\) has not yet been false.Footnote 10 Based on the same condition, \(\mathcal {F}\) can also use the support function to distinguish whether \(\bot \) was returned because \(m^*= \bot \) or because the end of the code of \(\textsc {Recv}\) was reached (i.e. its last instruction “Return \(\bot \)” was evaluated).
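The structure of the correctness game can be mirrored in code. The following compact Python sketch is our own rendering (names and calling conventions are ours, not the paper's formal notation): CH is any channel object with Init/Send/Recv, supp maps the support transcripts and a queried (u, c, aux) to the expected plaintext \(m^*\), and None models \(\bot \).

```python
# Sketch of the correctness game G^corr (hypothetical interface).

def corr_game(CH, supp, adversary):
    st = dict(zip("IR", CH.Init()))   # per-user channel states
    tr = {"I": [], "R": []}           # per-user support transcripts
    win = False

    def Send(u, m, aux, r):
        st[u], c = CH.Send(st[u], m, aux, r)
        tr[u].append(("sent", m, c, aux))
        return c

    def Recv(u, c, aux):
        nonlocal win
        m_star = supp(tr, u, c, aux)
        if m_star is None:            # supp says ⊥: oracle exits immediately
            return None
        st[u], m = CH.Recv(st[u], c, aux)
        tr[u].append(("recv", m, c, aux))
        if m != m_star:               # decryption deviates from the spec
            win = True
        return None                   # Recv always returns ⊥

    adversary(st["I"], st["R"], Send, Recv)   # F gets the initial states
    return win
```

A channel that decrypts exactly as its support function prescribes never sets the win flag, matching \(\textsf{Adv}^{\textsf{corr}} = 0\) for the sample channel discussed next.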

Consider the sample channel \(\textsf{CH}= \textsf{SAMPLE}\text {-}\textsf{CH}[\textsf{SE}]\) from Fig. 10 for any symmetric encryption scheme \(\textsf{SE}\) that has decryption correctness. Then, \(\textsf{CH}\) provides correctness with respect to either sample support function \(\textsf{supp}\in \{\textsf {SAMPLE\text {-}SUPP}_0, \textsf {SAMPLE\text {-}SUPP}_1\}\) from Fig. 12. In particular, for all adversaries \(\mathcal {F}\) we have \(\textsf{Adv}^{\textsf{corr}}_{\textsf{CH}, \textsf{supp}}(\mathcal {F}) = 0\).

3.4.2 Integrity

Consider the integrity game \(\textrm{G}^{\textsf{int}}_{\textsf{CH}, \textsf{supp}, \mathcal {F}}\) in Fig. 13, defined for a channel \(\textsf{CH}\), a support function \(\textsf{supp}\) and an adversary \(\mathcal {F}\). The advantage of \(\mathcal {F}\) in breaking the \(\textrm{INT}\)-security of \(\textsf{CH}\) with respect to \(\textsf{supp}\) is defined as \(\textsf{Adv}^{\textsf{int}}_{\textsf{CH}, \textsf{supp}}(\mathcal {F}) = \Pr [\textrm{G}^{\textsf{int}}_{\textsf{CH}, \textsf{supp}, \mathcal {F}}]\). We define the integrity game in a very similar way to the correctness game above, but with two important distinctions. First, in the integrity game the adversary \(\mathcal {F}\) no longer gets the initial states of users \(\mathcal {I}\) and \(\mathcal {R}\) as input. Second, the receiving oracle \(\textsc {Recv}\) now allows all inputs from the adversary \(\mathcal {F}\), including those that are meant to be rejected according to the support function \(\textsf{supp}\). These changes reflect the intuition that the adversary \(\mathcal {F}\) is now also allowed to win by producing an input such that the channel’s receiving algorithm returns \(m \ne \bot \) while the support function returned \(m^*= \bot \), which is essentially a forgery. The adversary does not get the channel’s initial states as input because that could trivialise its goal of producing a forgery.

According to the examples discussed in Sect. 3.3, the sample channel \(\textsf{CH}= \textsf{SAMPLE}\text {-}\textsf{CH}[\textsf{SE}]\) from Fig. 10 provides integrity with respect to the sample support function \(\textsf {SAMPLE\text {-}SUPP}_0\) from Fig. 12 if \(\textsf{SE}\) provides authenticated encryption. Here it is in fact sufficient for \(\textsf{SE}\) to only provide ciphertext integrity, without any assurances about the confidentiality of encrypted data. In contrast, no properties of \(\textsf{SE}\) would be sufficient for \(\textsf{CH}\) to provide integrity with respect to the sample support function \(\textsf {SAMPLE\text {-}SUPP}_1\) from Fig. 12; the construction of \(\textsf{SAMPLE}\text {-}\textsf{CH}\) itself would need to be changed to prevent replay attacks like the one displayed in Fig. 11.

Prior work on symmetric encryption formalises the intuition that a decryption oracle is useless to an adversary if all of its decryption queries can be simulated based on the live transcript of its encryption queries. This is captured as PA1 in [8] (where “PA” stands for plaintext awareness) and as decryption simulatability in [24]. An important distinction is that our definition of integrity requires \(\mathsf {\textsf{CH}.Recv}\) to behave exactly as prescribed by a specific support function, whereas the goal of [8, 24] is to draw implications from the existence of any algorithm that can simulate \(\mathsf {\textsf{CH}.Recv}\).

Fig. 14

Indistinguishability of channel \(\textsf{CH}\)

3.4.3 Confidentiality

Consider the indistinguishability game \(\textrm{G}^{\textsf{ind}}_{\textsf{CH}, \mathcal {D}}\) in Fig. 14, defined for a channel \(\textsf{CH}\) and an adversary \(\mathcal {D}\). The advantage of \(\mathcal {D}\) in breaking the \(\textrm{IND}\)-security of \(\textsf{CH}\) is defined as \(\textsf{Adv}^{\textsf{ind}}_{\textsf{CH}}(\mathcal {D}) = 2 \cdot \Pr [\textrm{G}^{\textsf{ind}}_{\textsf{CH}, \mathcal {D}}] - 1\). The game samples a challenge bit b, and the adversary is required to guess it in order to win. The adversary \(\mathcal {D}\) is provided with access to a challenge oracle \(\textsc {Ch}\) and a receiving oracle \(\textsc {Recv}\). The adversary can query the challenge oracle \(\textsc {Ch}\) on inputs \(\textit{u}, m_0, m_1, \textit{aux}, r\) to obtain a ciphertext encrypting plaintext \(m_b\) with random coins \(r\) from user \(\textit{u}\) to user \(\overline{\textit{u}}\), with auxiliary information \(\textit{aux}\). Here the two plaintexts \(m_0\), \(m_1\) are required to have the same length. The adversary can query the receiving oracle \(\textsc {Recv}\) on inputs \(\textit{u}, c, \textit{aux}\) to make the user \(\textit{u}\) decrypt the incoming ciphertext c from the user \(\overline{\textit{u}}\) with auxiliary information \(\textit{aux}\). The goal of this query is to update the receiving user’s state \(\textit{st}_{\textit{u}}\); this is important because the updated state is then used to compute future outputs of queries to the challenge oracle \(\textsc {Ch}\) when user \(\textit{u}\) is the sender. The receiving oracle always discards the decrypted plaintext m and returns \(\bot \). Note that if the channel \(\textsf{CH}\) has integrity with respect to any support function \(\textsf{supp}\), then the indistinguishability adversary \(\mathcal {D}\) can itself use \(\textsf{supp}\) to compute all outputs of the receiving oracle \(\textsc {Recv}\) for either choice of the challenge bit b (i.e. at every step, \(\mathcal {D}\) knows that \(\textsf{supp}\) returns one of two possible plaintexts).
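The oracle interface of the indistinguishability game admits a similarly minimal sketch (names are ours): the challenge oracle rejects unequal-length pairs and encrypts \(m_b\), while the receiving oracle only updates state and discards the plaintext.

```python
# Sketch of the IND game (hypothetical interface).
import secrets

def ind_game(CH, adversary):
    b = secrets.randbits(1)                  # challenge bit
    st = dict(zip("IR", CH.Init()))

    def Ch(u, m0, m1, aux, r):
        if len(m0) != len(m1):               # equal-length restriction
            return None
        st[u], c = CH.Send(st[u], (m0, m1)[b], aux, r)
        return c

    def Recv(u, c, aux):
        st[u], _ = CH.Recv(st[u], c, aux)    # state update only
        return None                          # plaintext is discarded

    return adversary(Ch, Recv) == b          # adversary wins iff it guesses b
```

Running this sketch with a toy "identity encryption" channel shows why IND-CPA security of the underlying scheme is needed: an adversary that can read \(m_b\) off the ciphertext guesses b with probability 1.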

Consider the sample channel \(\textsf{CH}= \textsf{SAMPLE}\text {-}\textsf{CH}[\textsf{SE}]\) from Fig. 10 for any symmetric encryption scheme \(\textsf{SE}\) that is IND-CPA secure. Then \(\textsf{CH}\) provides indistinguishability.

3.4.4 Authenticated encryption

In Appendix B, we define the authenticated encryption security of a channel, which simultaneously captures the integrity and indistinguishability notions from above. We define the joint notion in the all-in-one style of [50, 53]. We prove that our two separate security notions together are equivalent to the authenticated encryption security. This serves as a sanity check for our definitional choices.

3.5 Message encoding schemes

We advocate for a modular approach when building cryptographic channels. At its core, a channel can be expected to have a mechanism that handles the process of encoding plaintexts into payloads and decoding payloads back into plaintexts. Such a mechanism might need to maintain counters that store the number of previously encoded and decoded messages. It might add padding to plaintexts, while possibly encoding their original lengths. It might also embed other metadata into the produced payloads. We formalise it as a separate primitive called a message encoding scheme. Then, a cryptographic channel can be built by composing a message encoding scheme with appropriate cryptographic primitives that would provide integrity and confidentiality for the encoded plaintexts.

We now formally define a message encoding scheme. The modular approach suggested above leads us to define syntax for message encoding that is similar to that of a cryptographic channel. In particular, a message encoding scheme needs to have stateful encoding and decoding algorithms. Auxiliary information can be used to relay and verify metadata such as timestamps. Note that our definition uses randomness in the encoding algorithm because it is necessary when modelling Telegram (i.e. because in MTProto 2.0 the length of padding used for payloads is randomised).

Fig. 15

Syntax of message encoding scheme \(\textsf{ME}\)

Definition 4

A message encoding scheme \(\textsf{ME}\) specifies algorithms \(\mathsf {\textsf{ME}.Init}\), \(\mathsf {\textsf{ME}.Encode}\) and \(\mathsf {\textsf{ME}.Decode}\), where \(\mathsf {\textsf{ME}.Decode}\) is deterministic. Associated to \(\textsf{ME}\) is a message space \(\mathsf {\textsf{ME}.MS}\subseteq \{0,1\}^*\setminus \{\varepsilon \}\), a payload space \(\mathsf {\textsf{ME}.Out}\), a randomness space \(\mathsf {\textsf{ME}.EncRS}\) of \(\mathsf {\textsf{ME}.Encode}\), and a payload length function \(\mathsf {\textsf{ME}.pl}:{{\mathbb {N}}}\times \mathsf {\textsf{ME}.EncRS}\rightarrow {{\mathbb {N}}}\). The initialisation algorithm \(\mathsf {\textsf{ME}.Init}\) returns \(\mathcal {I}\)’s and \(\mathcal {R}\)’s initial states \(\textit{st}_{\mathcal {I}}\) and \(\textit{st}_{\mathcal {R}}\). The encoding algorithm \(\mathsf {\textsf{ME}.Encode}\) takes \(\textit{st}_{\textit{u}}\) for \(u\in \{\mathcal {I},\mathcal {R}\}\), a message \(m\in \mathsf {\textsf{ME}.MS}\), and auxiliary information \(\textit{aux}\) to return the updated state \(\textit{st}_{\textit{u}}\) and a payload \(p\in \mathsf {\textsf{ME}.Out}\).Footnote 11 We may surface random coins \(\nu \in \mathsf {\textsf{ME}.EncRS}\) as an additional input to \(\mathsf {\textsf{ME}.Encode}\); then a message m should be encoded into a payload \(p\) of length \(\left| p\right| =\mathsf {\textsf{ME}.pl}(\left| m\right| , \nu )\). The decoding algorithm \(\mathsf {\textsf{ME}.Decode}\) takes \(\textit{st}_{\textit{u}}, p\), and auxiliary information \(\textit{aux}\) to return the updated state \(\textit{st}_{\textit{u}}\) and a message \(m\in \mathsf {\textsf{ME}.MS}\cup \{\bot \}\). The syntax used for the algorithms of \(\textsf{ME}\) is given in Fig. 15.
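Definition 4 can be mirrored by a toy scheme in Python. The instantiation below is our own, purely illustrative: it prefixes each message with an 8-byte counter, and it omits the randomness space and padding for brevity.

```python
# Toy message encoding scheme (our own illustration of the ME syntax):
# payload = 8-byte send counter || message; Decode is deterministic and
# returns None (modelling ⊥) on an out-of-order counter.

class CounterME:
    @staticmethod
    def Init():
        # Each user's state holds its send and receive counters.
        return {"snd": 0, "rcv": 0}, {"snd": 0, "rcv": 0}

    @staticmethod
    def Encode(st, m: bytes, aux: bytes):
        p = st["snd"].to_bytes(8, "big") + m
        st["snd"] += 1
        return st, p

    @staticmethod
    def Decode(st, p: bytes, aux: bytes):
        ctr, m = int.from_bytes(p[:8], "big"), p[8:]
        if ctr != st["rcv"]:        # replayed or reordered payload
            return st, None
        st["rcv"] += 1
        return st, m
```

Note how replay protection here comes purely from the encoding layer, matching the intuition that the encoding scheme supplies the properties a support function demands beyond raw integrity.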

We now define two properties of a message encoding scheme: encoding correctness and encoding integrity. We formalise each property with respect to a support function, in a similar way to how we formalised correctness and integrity for a channel in Sect. 3.4. The encoding correctness and integrity notions both roughly require that the decoding algorithm of a message encoding scheme always returns messages that are consistent with the support function. The two notions differ in that the encoding correctness only requires the outputs to be consistent until the first error occurs (i.e. until the support function returns \(\bot \)), whereas the encoding integrity also requires the decoding algorithm to recover from errors and keep returning consistent outputs throughout. We formalise both notions in the setting where the message encoding scheme is being run over an authenticated channel. This reflects the intuition that the message encoding scheme does not have to provide any cryptographic properties, but it is expected to be composed with a primitive that guarantees the integrity of communication. In contrast, the message encoding scheme itself is responsible for providing all properties that are required by a support function and are not implied by integrity. This may include preventing the replay, reordering and dropping of messages.

Fig. 16

Encoding correctness and encoding integrity of message encoding scheme \(\textsf{ME}\) with respect to support function \(\textsf{supp}\). Game \(\textrm{G}^{\textsf{ecorr}}_{\textsf{ME}, \textsf{supp}, \mathcal {F}}\) includes the boxed code and game \(\textrm{G}^{\textsf{eint}}_{\textsf{ME}, \textsf{supp}, \mathcal {F}}\) does not

We use the games in Fig. 16 to formalise the encoding correctness and integrity notions of a message encoding scheme \(\textsf{ME}\) with respect to a support function \(\textsf{supp}\). The advantage of an adversary \(\mathcal {F}\) in breaking the encoding correctness of \(\textsf{ME}\) with respect to \(\textsf{supp}\) is defined as \(\textsf{Adv}^{\textsf{ecorr}}_{\textsf{ME}, \textsf{supp}}(\mathcal {F}) = \Pr [\textrm{G}^{\textsf{ecorr}}_{\textsf{ME}, \textsf{supp}, \mathcal {F}}]\). The advantage of an adversary \(\mathcal {F}\) in breaking the encoding integrity (\(\textrm{EINT}\)-security) of \(\textsf{ME}\) with respect to \(\textsf{supp}\) is defined as \(\textsf{Adv}^{\textsf{eint}}_{\textsf{ME}, \textsf{supp}}(\mathcal {F}) = \Pr [\textrm{G}^{\textsf{eint}}_{\textsf{ME}, \textsf{supp}, \mathcal {F}}]\). The encoding correctness game \(\textrm{G}^{\textsf{ecorr}}_{\textsf{ME}, \textsf{supp}, \mathcal {F}}\) contains the boxed code, while the encoding integrity game \(\textrm{G}^{\textsf{eint}}_{\textsf{ME}, \textsf{supp}, \mathcal {F}}\) does not. The encoding correctness requires that \(\textsf{ME}\) manages to “correctly” decode all payloads that are deemed to be admissible by the support function \(\textsf{supp}\), while the inadmissible payloads are ignored by the game; here the support function itself is used to determine what constitutes a “correct” decoding. The encoding integrity requires that \(\textsf{ME}\) rejects inadmissible payloads while maintaining its baseline correctness; this in particular means that the processing of inadmissible payloads should not corrupt the state of \(\textsf{ME}\) in unexpected ways. As a result of processing inadmissible payloads, the receiver’s transcript will contain \((\textsf{recv}, \bot , p, \textit{aux})\)-type entries. The support function \(\textsf{supp}\) might process various conditions involving these entries (e.g. depending on the number of errors that occurred), and the encoding scheme \(\textsf{ME}\) still has to provide outputs that are consistent with \(\textsf{supp}\).

The two core differences from the corresponding channel notions in Sect. 3.4 are as follows. First, the message encoding scheme is meant to be run within an integrity-protected communication channel, so the \(\textsc {Recv}\) oracle in both games now starts by checking that the queried payload \(p\) was returned by a prior call to the opposite user’s \(\textsc {Send}\) oracle (in response to some message m and auxiliary information \(\textit{aux}\)). Second, the message encoding is not meant to serve any cryptographic purpose, so the initial states \(\textit{st}_{\textsf{ME}, \mathcal {I}}, \textit{st}_{\textsf{ME}, \mathcal {R}}\) should not contain any secret information and are given as inputs to adversary \(\mathcal {F}\) in both games. This means that the encoding integrity is a strictly stronger notion than the encoding correctness, and the latter has limited value.Footnote 12

In Sect. 5.3, we define three more properties of message encoding that will be necessary for our security analysis of MTProto 2.0. None of these properties are defined with respect to a support function. Our modular approach of building a channel from a message encoding scheme serves to limit the number of places where we need to consider the specifics of a support function: the integrity proof (in Sect. 5.6) of the channel that we study is reduced to the encoding integrity of the underlying message encoding scheme, and the latter is then proved in Appendix E.5.

4 MTProto 2.0 specification

In this section, we describe our modelling of the MTProto 2.0 record protocol as a bidirectional channel. First, in Sect. 4.1 we give an informal description of MTProto based on Telegram documentation and client implementations. Next, in Sect. 4.2 we outline attacks that motivate protocol changes required to achieve security. We list further modelling issues and points where we depart from Telegram documentation in Sect. 4.3. We conclude with Sect. 4.4 where we give our formal specification for a fixed version of the protocol.

4.1 Telegram description

We studied MTProto 2.0 as described in the online documentation [62] and as implemented in the official desktopFootnote 13 and AndroidFootnote 14 clients. We focus on cloud chats, i.e. chats that are only encrypted at the transport layer between the clients and Telegram servers. The end-to-end encrypted secret chats are implemented on top of this transport layer and only available for one-on-one chats. Figures 17 and 18 give a visual summary of the following description.

Fig. 17

Parsing \(\textsf {auth}\_\textsf {key} \) in MTProto 2.0. User \(\textit{u}\in \{\mathcal {I}, \mathcal {R}\}\) derives a \(\textsf{KDF}\) key \(\textit{kk}_\textit{u}= (\textit{kk}_{\textit{u}, 0}, \textit{kk}_{\textit{u}, 1})\) and a \(\textsf{MAC}\) key \(\textit{mk}_\textit{u}\)

Fig. 18

Overview of message processing in MTProto 2.0

4.1.1 Key exchange

A Telegram client must first establish a symmetric 2048-bit auth_key with the server via a version of the Diffie–Hellman key exchange. We defer the details of the key exchange to Sect. 7. In practice, this key exchange first results in a permanent auth_key for each of the Telegram data centres the client connects to. Thereafter, the client runs a new key exchange on a daily basis to establish a temporary auth_key that is used instead of the permanent one.

4.1.2 “Record protocol”

Messages are protected as follows.

  1.

    API calls are expressed as functions in the TL schema [57].

  2.

    The API requests and responses are serialised according to the type language (TL) [59] and embedded in the msg_data field of a payload \(p\), shown in Table 1. The first two 128-bit blocks of \(p\) have a fixed structure and contain various metadata. The maximum length of msg_data is \(2^{24}\) bytes.

  3.

    The payload is encrypted using \(\textsf{AES}-\textsf{256}\) in IGE mode. The ciphertext c is a part of an MTProto ciphertext \(\textsf {auth}\_\textsf {key}\_\textsf {id} ~\Vert ~\textsf {msg}\_\textsf {key} ~\Vert ~c\), where (recalling that z[a : b] denotes bits a to \(b-1\), inclusive, of string z):

    $$\begin{aligned} \textsf {auth}\_\textsf {key}\_\textsf {id}&:=\textsf{SHA}-\textsf{1}[\textsf {auth}\_\textsf {key} ][96:160]\\ \textsf {msg}\_\textsf {key}&:=\textsf{SHA}-\textsf{256}[\textsf {auth}\_\textsf {key} {[704+x:960+x]} ~\Vert ~p ][64:192]\\ c&:=\mathsf {IGE[AES}-\mathsf {256]}.\textsf{Enc}(\textsf{key} ~\Vert ~\textsf{iv}, p) \end{aligned}$$

    Here, the first two fields form an external header. The \(\mathsf {IGE[AES}-\mathsf {256]}\) keys and IVs are computed via:

    $$\begin{aligned} A&:=\textsf{SHA}-\textsf{256}[\textsf {msg}\_\textsf {key} ~\Vert ~\textsf {auth}\_\textsf {key} {[x:288 + x]}]\\ B&:=\textsf{SHA}-\textsf{256}[\textsf {auth}\_\textsf {key} {[320 + x:608 + x]} ~\Vert ~\textsf {msg}\_\textsf {key} ]\\ \textsf{key}&:=A[0:64] ~\Vert ~B[64:192] ~\Vert ~A[192:256]\\ \textsf{iv}&:=B[0:64] ~\Vert ~A[64:192] ~\Vert ~B[192:256] \end{aligned}$$

    In the above steps, \(x=0\) for messages from the client and \(x=64\) from the server. Telegram clients use the BoringSSL implementation [29] of IGE, which has 2-block IVs.

  4.

    MTProto ciphertexts are encapsulated in a “transport protocol”. The MTProto documentation defines multiple such protocols [55], but the default is the abridged format that prefixes the stream with a fixed value of 0xefefefef and afterwards wraps each MTProto ciphertext \(c_{\textsf{MTP}}\) in a transport packet as:

    • \(\textsf {length} ~\Vert ~c_{\textsf{MTP}}\) where 1-byte length contains the \(c_{\textsf{MTP}}\) length divided by 4, if the resulting packet length is \(< 127\), or

    • \(\texttt {0x7f} ~\Vert ~\textsf {length} ~\Vert ~c_{\textsf{MTP}}\) where length is encoded in 3 bytes.

  5.

    All the resulting packets are obfuscated by default using \(\textsf{AES}-\textsf{128}\) in CTR mode. The key and IV are transmitted at the beginning of the stream, so the obfuscation provides no cryptographic protection and we ignore it henceforth.Footnote 15

  6.

    Communication is over TCP (port 443) or HTTP. Clients attempt to choose the best available connection. There is support for TLS in the client code, but it does not seem to be used.
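The abridged transport framing of step 4 can be sketched as follows. This is a minimal sketch assuming the format described above; the helper name is ours.

```python
# Sketch of MTProto's abridged transport framing (helper name is ours).
# The MTProto ciphertext length is always a multiple of 4, and the
# 1-byte short form holds the length in 4-byte words.

def abridged_frame(c_mtp: bytes) -> bytes:
    assert len(c_mtp) % 4 == 0
    words = len(c_mtp) // 4
    if words < 0x7f:                          # short form: 1-byte length
        return bytes([words]) + c_mtp
    # long form: 0x7f marker then 3-byte little-endian word count
    return b"\x7f" + words.to_bytes(3, "little") + c_mtp
```

For example, a 8-byte ciphertext is framed as the length byte 0x02 followed by the ciphertext, while a 1024-byte ciphertext (256 words) takes the long form.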

In combination, these operations mean that MTProto 2.0 at its core uses a “stateful Encrypt & MAC” construction. Here the MAC tag \(\textsf {msg}\_\textsf {key} \) is computed using \(\textsf{SHA}-\textsf{256}\) with a prepended key derived from (certain bits of) auth_key . The key and IV for IGE mode are derived on a per-message basis using a KDF based on \(\textsf{SHA}-\textsf{256}\), using certain bits of auth_key as the KDF key and the \(\textsf {msg}\_\textsf {key} \) as a diversifier. Note that the bit ranges of auth_key used by the client and the server to derive keys in both operations overlap with one another. Any formal security analysis needs to take this into account.
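The bit ranges in step 3 are all byte-aligned, so the msg_key computation and the per-message KDF can be rendered directly in Python. This is our own illustrative sketch of the description above, not a vetted implementation; function names are ours.

```python
# Sketch of MTProto 2.0 msg_key and per-message key/IV derivation.
# Byte offsets are the bit ranges from the text divided by 8.
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def derive(auth_key: bytes, payload: bytes, from_server: bool):
    assert len(auth_key) == 256              # 2048-bit auth_key
    x = 8 if from_server else 0              # x = 64 bits = 8 bytes
    # msg_key = SHA-256(auth_key[704+x : 960+x bits] || p)[64:192 bits]
    msg_key = sha256(auth_key[88 + x:120 + x] + payload)[8:24]
    A = sha256(msg_key + auth_key[x:36 + x])
    B = sha256(auth_key[40 + x:76 + x] + msg_key)
    key = A[0:8] + B[8:24] + A[24:32]        # 256-bit AES key
    iv  = B[0:8] + A[8:24] + B[24:32]        # 2-block IGE IV
    return msg_key, key, iv
```

The \(x\) offset makes the client and server key schedules differ for the same auth_key and payload, while the overlapping auth_key ranges are exactly what the related-key analysis mentioned above must account for.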

Table 1 MTProto payload format

4.2 Attacks against MTProto metadata validation

We describe adversarial behaviours that are permitted in current Telegram implementations and that mostly depend on how clients and servers validate metadata information in the payload (especially the second 128-bit block containing msg_id , msg_seq_no and msg_length ).

We consider a network attacker that sits between the client and the Telegram servers, attempting to manipulate the conversation transcript. We distinguish between two cases: when the client is the sender of a message and when it is the receiver. By message, we mean any msg_data exchanged via MTProto, but we pay particular attention to when it contains a chat message.

4.2.1 Message reordering

By reordering we mean that an adversary can swap messages sent by one party so that they are processed in the wrong order by the receiving party. Preventing such attacks is a basic property that one would expect in a secure messaging protocol. The MTProto documentation mentions reordering attacks as something to protect against in secret chats but does not discuss it for cloud chats [65]. The implementation of cloud chats provides some protection, but not fully:

  • When the client is the receiver, the order of displayed chat messages is determined by the date and time values within the TL message object (which are set by the server), so adversarial reordering of packets has no effect on the order of chat messages as seen by the client. On mobile clients, messages are also delivered via push notification systems, which are typically secured with TLS. Note that service messages of MTProto typically do not have such a timestamp so reordering is theoretically possible, but it is unclear whether it would affect the client’s state since such messages tend to be responses to particular requests or notices of errors, which are not expected to arrive in a given order.

  • When the client is the sender, the order of chat messages can be manipulated because the server sets the date and time value for the Telegram user to whom the message was addressed based on when the server itself receives the message, and because the server will accept a message with a lower msg_id than that of a previous message as long as its msg_seq_no is also lower than that of a previous message. The server does not take the timestamp implicit within msg_id into account except to check whether it is at most 300 s in the past or 30 s in the future, so within this time interval reordering is possible. A message outside of this time interval is not ignored, but a request for time synchronisation is triggered, after receipt of which the client sends the message again with a fresh msg_id. So an attacker can also simply delay a chosen message to cause messages to be accepted out of order. In Telegram, the rotation of the server_salt every 30 to 60 min may be an obstacle to carrying out this attack in longer time intervals.
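The server-side freshness window described above admits a simple sketch. The helper name is ours; it relies on the documented fact that an MTProto msg_id carries a Unix timestamp in its upper 32 bits (msg_id is approximately unixtime times \(2^{32}\)).

```python
# Sketch of the server's msg_id timestamp check (helper name is ours):
# accept only msg_ids whose implicit timestamp is at most `past` seconds
# old or `future` seconds ahead.
import time

def msg_id_in_window(msg_id, now=None, past=300, future=30):
    if now is None:
        now = time.time()
    ts = msg_id >> 32                 # timestamp implicit in msg_id
    return now - past <= ts <= now + future
```

Within this 330-second window the check imposes no ordering, which is exactly the slack the reordering attack exploits.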

We verified that reordering between a sending client and a receiving server is possible in practice using unmodified Android clients (v6.2.0) and a malicious WiFi access point running a TCP proxy [42] with custom rules to suppress and later release certain packets. Suppose an attacker sits between Alice and a server, and Alice is in a chat with Bob. The attacker can reorder messages that Alice is sending, so the server receives them in the wrong order and forwards them in the wrong order to Bob. While Alice’s client will initially display her sent messages in the order she sent them, once it fetches history from the server it will update to display the modified order that will match that of Bob.

Note that such reordering attacks are not possible against e.g. Signal or MTProto’s closest “competitor” TLS. TLS-like protocols over UDP such as DTLS [48] or QUIC [32] either leave it to the application to handle packet reordering (reordering is possible against DTLS) or have built-in mechanisms to handle these (reordering is not possible against QUIC).

Other types of reordering. A stronger form of reordering resistance can also be required from a protocol if one considers the order in the transcript as a whole, so that the order of sent messages with respect to received messages has to be preserved. This is sometimes referred to as global transcript in the literature [74] and is generally considered to be more complex to achieve. In particular, the following is possible in both Telegram and e.g. Signal. Alice sends a message “Let’s commit all the crimes”. Then, simultaneously both Alice and Bob send a message. Alice: “Just kidding”; Bob: “Okay”. Depending on the order in which these messages arrive, the transcript on either side might be (Alice: “Let’s commit all the crimes”, Alice: “Just kidding”, Bob: “Okay”) or (Alice: “Let’s commit all the crimes”, Bob: “Okay”, Alice: “Just kidding”). That is, the transcript will have Bob acknowledging a joke or criminal activity. Note that in the context of group messaging, there is another related but weaker property: the notion of causality preservation [26]. However, when restricted to the two-party case, this property becomes equivalent to in-order delivery (as exhibited by the support function \(\textsf{supp}\text {-}\textsf{ord}\) defined in Fig. 32).

4.2.2 Message drops

MTProto makes it possible to silently drop a message both when the client is the senderFootnote 16 and when it is the receiver, but this is difficult to exploit in practice. Clients and the server attempt to resend messages for which they did not get acknowledgements. Such messages have the same msg_ids but are enclosed in a fresh ciphertext with random padding, so the attacker must be able to distinguish the repeated encryptions to continue dropping the same payload. This is possible, for example, with the desktop client as sender, since padding length is predictable based on the message length [69]. When the client is a receiver, other message delivery mechanisms such as batching of messages inside a container or API calls like messages.getHistory make it hard for an attacker to identify repeated encryptions. In the latter case, MTProto does not prevent message drops, but there is likely no practical attack.

4.2.3 Re-encryption

If a message is not acknowledged within a certain time in MTProto, it is re-encrypted using the same msg_id and with fresh random padding. While this appears to be a useful feature and a mitigation against message drops, it breaks the expected guarantees provided by a secure channel.

The issue can be illustrated by considering a local passive adversary that captures a transcript \((c_{\mathcal {I}, 0}, c_{\mathcal {R}}, c_{\mathcal {I}, 1})\) of messages exchanged between the client and the server, where \(c_{\mathcal {I}, 0}, c_{\mathcal {I}, 1}\) were sent by the client and \(c_{\mathcal {R}}\) was sent by the server. This adversary should not be able to find any distinguishing information about the plaintexts by studying the transcript; this is a very basic security guarantee of the channel, covered under the IND-CPA setting that we also formalise in Sect. 3.4. However, re-encryptions in MTProto are distinguishable: by examining the ciphertexts, the adversary can determine whether \(c_{\mathcal {R}}\) encrypts an automatically generated acknowledgement, or a new message from the server.Footnote 17

In more detail, re-encryption means the same partial state in the form of msg_id and msg_seq_no is used for two different encryptions. A reuse of a complete state would mean the ciphertexts \(c_{\mathcal {I}, 0}, c_{\mathcal {I}, 1}\) contain the same \(\textsf {msg}\_\textsf {key} \), and further that \(c_{\mathcal {I}, 0}^{(2)} = c_{\mathcal {I},1}^{(2)}\), i.e. that the 2nd blocks of the respective ciphertexts match. These conditions are easy to check for the adversary. In a model where the adversary controls the randomness in the protocol (as in Sect. 3.4), three encryption queries would be sufficient to perform the attack. However, in practice there is one part of the state that does change upon re-encryption and that is the padding, which is also part of the input used to compute \(\textsf {msg}\_\textsf {key} \). This means that to trigger the distinguishing condition, we must rely on collisions in \(\textsf {msg}\_\textsf {key} \). Since msg_key is computed via \(\textsf{SHA}-\textsf{256}\) truncated to 128 bits and the birthday bound applies, we expect a collision with constant probability after \(3 \cdot 2^{64}\) encryption queries. This makes the attack mainly of theoretical interest.
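To make the estimate explicit: modelling each re-encryption’s \(\textsf {msg}\_\textsf {key} \) as an independent uniform 128-bit string (an idealisation of truncated \(\textsf{SHA}-\textsf{256}\)), the birthday bound gives

$$\begin{aligned} \Pr [\exists \, i \ne j :\textsf {msg}\_\textsf {key} _i = \textsf {msg}\_\textsf {key} _j] \approx \genfrac(){0.0pt}1{q}{2} \cdot 2^{-128} = \frac{q(q-1)}{2^{129}} \end{aligned}$$

for q re-encryptions of the same message, which reaches constant probability around \(q \approx 2^{64}\), consistent with the \(3 \cdot 2^{64}\) encryption queries quoted above.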

To allow a security proof to go through, the cleanest solution is to remove the re-encryption capability from the specification of the channel, and leave the implementation of such a feature to the application layer. If a message resend facility is needed, it can be done transparently to and independently of the channel operation, i.e. each resending would take place using an updated, unique state of the channel.

4.3 Modelling differences

In general, we would like our formal specification of MTProto 2.0 to stay as close as possible to the real protocol, so that when we prove statements about the specification, we obtain meaningful assurances about the security of the real protocol. However, as the previous section demonstrates, the current protocol has flaws. These prevent meaningful security analysis and can be removed by making small changes to the protocol’s handling of metadata. Further, the protocol has certain features that make it less amenable to formal analysis. Here we describe the modelling decisions we took that depart from the current version of MTProto 2.0 and justify each change.

4.3.1 Under-specification and inconsistencies

There is no authoritative specification of the protocol. The Telegram documentation often differs from the implementations and leaves room for multiple interpretations; thus, the clients are not consistent with each other.Footnote 18 Where possible, we chose a sensible “default” choice from the observed set of possibilities, but we stress that it is in general impossible to create a formal specification of MTProto that would be valid for all current implementations. For instance, the documentation defines server_salt as “A (random) 64-bit number periodically (say, every 24 h) changed (separately for each session) at the request of the server” [63]. In practice, the clients receive salts that change every hour and whose validity periods overlap with each other.Footnote 19 For client differences, consider padding generation: on desktop [69], a given message length will always result in the same padding length, whereas on Android [67], the padding length is randomised.

4.3.2 Application layer

Similarly, there is no clear separation between the cryptographic protocol of MTProto and the application data processing (expressed using the TL schema). However, to reason succinctly about the protocol we require a certain level of abstraction. In concrete terms, this means that we consider the msg_data field as “the message”, without interpreting its contents and in particular without modelling TL constructors. However, this separation does not exist in implementations of MTProto—for instance, message encoding behaves differently for some constructors (e.g. container messages)—and so our specification does not capture these details.

4.3.3 Client/server roles

The client and the server are not treated as equals in MTProto. For instance, the server is trusted to timestamp TL messages for history purposes, while the clients are not; this is why our reordering attacks only work in the client-to-server direction. The client chooses the session_id , while the server generates the server_salt . The server accepts any session_id given in the first message and then expects that value, while the client checks the session_id but may accept any server_salt given. Clients do not check the msg_seq_no field. The protocol implements elaborate measures to synchronise “bad” client time with server time, including checks on the timestamp within msg_id as well as the salt, special service messages [56], and the resending of messages with regenerated headers. Since much of this behaviour is not critical for security, we model both parties of the protocol as equals. Expanding our specification with this behaviour should be possible without affecting most of the proofs.

4.3.4 Key exchange

We are concerned with the symmetric part of the protocol, and thus assume that the shared auth_key is a uniformly random string rather than of the form \(g^{ab} \bmod p\) resulting from the actual key exchange.

4.3.5 Bit mixing

MTProto uses specific bit ranges of auth_key as KDF and MAC inputs. These ranges do not overlap for different primitives (i.e. the KDF key inputs are wholly distinct from the MAC key inputs), and we model auth_key as a random value, so without loss of generality our specification generates the KDF and MAC key inputs as separate random values. The key input ranges for the client and the server do overlap for KDF and MAC separately, however, so we model this in the form of related-key-deriving functions.
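To picture the overlap, consider the byte ranges of auth_key that feed the KDF and the MAC. The offsets below follow our reading of the public protocol description, with a direction offset of x = 0 for one user and x = 8 for the other; treat them as illustrative:

```python
import secrets

AUTH_KEY = secrets.token_bytes(256)  # a uniformly random 2048-bit auth_key

def kdf_key_material(x: int) -> bytes:
    """KDF key input: bytes [x, x+36) and [40+x, 40+x+36) of auth_key
    (two 288-bit halves, matching MTP-KDF.KS)."""
    return AUTH_KEY[x:x + 36] + AUTH_KEY[40 + x:40 + x + 36]

def mac_key_material(x: int) -> bytes:
    """MAC key input: bytes [88+x, 88+x+32) of auth_key (256 bits,
    matching MTP-MAC.KS)."""
    return AUTH_KEY[88 + x:88 + x + 32]

client, server = 0, 8  # direction offsets x for the two users
# The two KDF ranges overlap (e.g. bytes [8, 36) are shared) but are not
# equal, which is the related-key relation captured by phi_KDF; the KDF
# ranges end before the MAC ranges begin, so KDF and MAC keys are disjoint.
```

This makes the two claims in the text visible: distinct primitives never share key bits, while the two directions of the same primitive do.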

Further, the KDF intermixes specific bit ranges of the outputs of two \(\textsf{SHA}-\textsf{256}\) calls to derive the encryption keys and IVs. We argue that this is unnecessary—the intermixed KDF output is indistinguishable from random (the usual security requirement of a key derivation function) if and only if the concatenation of the two \(\textsf{SHA}-\textsf{256}\) outputs is indistinguishable from random. Hence, in our specification the KDF just returns the concatenation.
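In code, the simplified KDF of our specification is just the following (with kk0, kk1 the two 288-bit halves of a user's KDF key, as in Definition 9 below; the deployed protocol additionally permutes byte ranges of the two digests into key and IV):

```python
import hashlib

def mtp_kdf(kk0: bytes, kk1: bytes, msg_key: bytes) -> bytes:
    """Simplified KDF: the concatenation of the two SHA-256 outputs,
    without the bit intermixing. The 512-bit result is later split
    into an AES-256 key and IV."""
    a = hashlib.sha256(msg_key + kk0).digest()
    b = hashlib.sha256(kk1 + msg_key).digest()
    return a + b
```

Since the intermixing is a fixed public permutation of these 512 bits, the concatenation is pseudorandom exactly when the intermixed output is, which is the argument made above.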

4.3.6 Order

Given that MTProto operates over reliable transport channels, it is not necessary to allow messages to arrive out of order. Our specification imposes stricter validation on metadata upon decryption: a single sequence number is checked by both sides, and only the next expected value is accepted. Enforcing strict ordering also automatically rules out message replay and drop attacks, which the implementation of MTProto as studied avoided in some cases only due to application-level processing.
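The strict-ordering rule amounts to the following check on the receiving side (a minimal sketch with a hypothetical helper class, not the paper's pseudocode):

```python
class InOrderReceiver:
    """Accept only the next expected sequence number; anything else
    (a replay, a reordered message, or a delivery that skips over a
    dropped message) is rejected as invalid."""

    def __init__(self) -> None:
        self.expected = 0

    def accept(self, seq_no: int) -> bool:
        if seq_no != self.expected:
            return False  # replayed, reordered or skipped
        self.expected += 1
        return True
```

Because a replayed sequence number is below the counter and a post-drop delivery is above it, this single comparison rules out forgery-free replays, reordering and silent drops at once.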

4.3.7 Re-encryption

Because of the attacks in Sect. 4.2, we insist in our formalisation that all sent messages include a fresh value in the header. This is achieved via a stateful secure channel definition in which either a client or server sequence number is incremented on each call to the \(\mathsf {\textsf{CH}.Send}\) oracle.

4.3.8 Message encoding

Some of the previous points outline changes to message encoding. We simplify the scheme, keeping to the format of Table 1 but not modelling diverging behaviours upon decoding. The implemented MTProto message encoding scheme behaves differently depending on whether the user is a client or a server, but each of them checks a 64-bit value in the first plaintext block, session_id and server_salt , respectively. To prove security of the channel, it is enough that there is a single such value that both parties check, and it does not need to be randomised, so we specify a constant \(\textsf {session}\_\textsf {id}\) and we leave the salt as an empty field. We also merge the msg_id and msg_seq_no fields into a single sequence number field of corresponding size, reflecting that a simple counter suffices in place of the original fields. Note that though we only prove security with respect to this particular message encoding scheme, our approach to specification is flexible and can accommodate more complex message encoding schemes.

4.4 MTProto-based channel

Our specification of the MTProto channel is given in Definition 5 and Fig. 19. The users \(\mathcal {I}\) and \(\mathcal {R}\) represent the client and the server. We abstract the individual keyed primitives into function families and instantiate each primitive or function later in this section.

Definition 5

Let \(\textsf{ME}\) be a message encoding scheme. Let \(\textsf{HASH}\) be a function family such that \(\{0,1\}^{992} \subseteq \textsf{HASH}.\textsf{IN}\). Let \(\textsf{MAC}\) be a function family such that \(\mathsf {\textsf{ME}.Out}\subseteq \textsf{MAC}.\textsf{IN}\). Let \(\textsf{KDF}\) be a function family such that \(\{0,1\}^{\textsf{MAC}.\textsf{ol}} \subseteq \textsf{KDF}.\textsf{IN}\). Let \(\phi _{\textsf{MAC}} : \{0,1\}^{320} \rightarrow \textsf{MAC}.\textsf{KS} \times \textsf{MAC}.\textsf{KS}\) and \(\phi _{\textsf{KDF}} :\{0,1\}^{672} \rightarrow \textsf{KDF}.\textsf{KS} \times \textsf{KDF}.\textsf{KS}\). Let \(\textsf{SE}\) be a deterministic symmetric encryption scheme with \(\mathsf {\textsf{SE}.kl}= \textsf{KDF}.\textsf{ol}\) and \(\mathsf {\textsf{SE}.MS}= \mathsf {\textsf{ME}.Out}\). Then, \(\textsf{CH}= \textsf{MTP}\text {-}\textsf{CH} [\textsf{ME}, \textsf{HASH}, \textsf{MAC}, \textsf{KDF}, \phi _{\textsf{MAC}}, \phi _{\textsf{KDF}}, \textsf{SE}]\) is the channel as defined in Fig. 19, with \(\mathsf {\textsf{CH}.MS}= \mathsf {\textsf{ME}.MS}\) and \(\mathsf {\textsf{CH}.SendRS}= \mathsf {\textsf{ME}.EncRS}\).

Fig. 19

Construction of MTProto-based channel \(\textsf{CH}= \textsf{MTP}\text {-}\textsf{CH} [\textsf{ME}, \textsf{HASH}, \textsf{MAC}, \textsf{KDF}, \phi _{\textsf{MAC}}, \phi _{\textsf{KDF}}, \textsf{SE}]\) from message encoding scheme \(\textsf{ME}\), function families \(\textsf{HASH}\), \(\textsf{MAC}\) and \(\textsf{KDF}\), related-key-deriving functions \(\phi _{\textsf{MAC}}\) and \(\phi _{\textsf{KDF}}\), and from deterministic symmetric encryption scheme \(\textsf{SE}\)

\(\mathsf {\textsf{CH}.Init}\) generates the keys for both users and initialises the message encoding scheme. Note that \(\textsf{auth}{\_}\textsf{key}\) as described in Sect. 4.1 does not appear in the code in Fig. 19, since each part of \(\textsf{auth}{\_}\textsf{key}\) that is used for keying the primitives can be generated independently. These parts are denoted by \(\textit{hk}\), \(\textit{kk}\) and \(\textit{mk}\). The function \(\phi _{\textsf{KDF}}\) (resp. \(\phi _{\textsf{MAC}}\)) is then used to derive the (related) keys for each user from \(\textit{kk}\) (resp. \(\textit{mk}\)).

\(\mathsf {\textsf{CH}.Send}\) proceeds by first using \(\textsf{ME}\) to encode a message m into a payload \(p\). The \(\textsf{MAC}\) is computed on this payload to produce a \(\textsf{msg}{\_}\textsf{key}\), and the \(\textsf{KDF}\) is called on the \(\textsf{msg}{\_}\textsf{key}\) to compute the key and IV for symmetric encryption \(\textsf{SE}\), here abstracted as k. The payload is encrypted with \(\textsf{SE}\) using this key material, and the resulting ciphertext is called \(c_{\textit{se}}\). The \(\textsf{CH}\) ciphertext c consists of \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}\), \(\textsf{msg}{\_}\textsf{key}\) and the symmetric ciphertext \(c_{\textit{se}}\).

\(\mathsf {\textsf{CH}.Recv}\) reverses these steps: it first computes k from the \(\textsf{msg}{\_}\textsf{key}\) parsed from c, then decrypts \(c_{\textit{se}}\) to the payload \(p\), and recomputes the \(\textsf{MAC}\) of \(p\) to check whether it equals \(\textsf{msg}{\_}\textsf{key}\). If not, it returns \(\bot \) (without changing the state) to signify failure. If the check passes, it uses \(\textsf{ME}\) to decode the payload into a message m. It is important that the \(\textsf{MAC}\) check is performed before \(\mathsf {\textsf{ME}.Decode}\) is called; otherwise the channel is open to attacks, as we show in Sect. 6.
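Abstracting the primitives as callables, the two algorithms can be sketched as follows; the parameter names are ours, and auth_key_id validation and state handling are elided (a structural sketch of Fig. 19, not its exact pseudocode):

```python
import hmac

def ch_send(mac, kdf, enc, encode, auth_key_id: bytes, m: bytes) -> bytes:
    p = encode(m)                    # ME.Encode: header + message + padding
    msg_key = mac(p)                 # 128-bit MAC over the full payload
    k = kdf(msg_key)                 # per-message key material for SE
    c_se = enc(k, p)                 # deterministic symmetric encryption
    return auth_key_id + msg_key + c_se

def ch_recv(mac, kdf, dec, decode, c: bytes):
    msg_key, c_se = c[8:24], c[24:]  # skip the 64-bit auth_key_id
    k = kdf(msg_key)
    p = dec(k, c_se)
    if not hmac.compare_digest(mac(p), msg_key):
        return None                  # reject before ME.Decode ever runs
    return decode(p)
```

Note the order in `ch_recv`: decryption is unavoidable before the MAC can be recomputed, but decoding only happens after the MAC check succeeds.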

The message encoding scheme \(\textsf{MTP}\text {-}\textsf{ME}\) is specified in Definition 6 and Fig. 20. It is a simplified scheme for in-order message delivery without replays (see Appendix D for the actual MTProto scheme that permits reordering as outlined in Sect. 4.2).

Definition 6

Let \(\textsf {session}\_\textsf {id}\in \{0,1\}^{64}\) and let \(\textsf {pb}, \textsf{bl}\in {{\mathbb {N}}}\). Denote by \(\textsf{ME}\) \(=\) \(\textsf{MTP}\text {-}\textsf{ME}[\textsf {session}\_\textsf {id}\), \(\textsf {pb} \), \(\textsf{bl}]\) the message encoding scheme given in Fig. 20, with \(\mathsf {\textsf{ME}.MS}= \bigcup _{i = 1}^{2^{24}} \{0,1\}^{8\cdot i}\), \(\mathsf {\textsf{ME}.Out}= \bigcup _{i \in {{\mathbb {N}}}} \{0,1\}^{\textsf{bl}\cdot i}\) and \(\mathsf {\textsf{ME}.pl}(\ell , \nu ) = 256 + \ell + \left| \textsf{GenPadding} (\ell ; \nu )\right| \).

Fig. 20

Construction of simplified message encoding scheme for in-order message delivery \(\textsf{ME}= \textsf{MTP}\text {-}\textsf{ME}[\textsf {session}\_\textsf {id}, \textsf {pb}, \textsf{bl}]\) for session identifier \(\textsf {session}\_\textsf {id}\), maximum padding length (in full blocks) \(\textsf {pb} \), and output block length \(\textsf{bl}\)

As justified in Sect. 4.3, \(\textsf{MTP}\text {-}\textsf{ME}\) follows the header format of Table 1, but it does not use the \(\textsf {server}\_\textsf {salt} \) field (we define \(\textsf {salt} \) as filled with zeros to preserve the field order) and we merge the 64-bit \(\textsf {msg}\_\textsf {id} \) and 32-bit \(\textsf {msg}\_\textsf {seq}\_\textsf {no} \) fields into a single \(96\)-bit \(\textsf {seq}\_\textsf {no} \) field. Note that the internal counters of \(\textsf{MTP}\text {-}\textsf{ME}\) wrap around when \(\textsf {seq}\_\textsf {no} \) “overflows” modulo \(2^{96}\), and an attacker can start replaying old payloads as soon as this happens. So when proving the encoding integrity of \(\textsf{MTP}\text {-}\textsf{ME}\) in Appendix E.5 with respect to a support function that prohibits replays, we will consider adversaries that make at most \(2^{96}\) message encoding queries.
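The resulting payload layout can be sketched as follows; the field widths are as above (the 256-bit header of Definition 6), while the byte order, the constant session_id value and the caller-supplied padding string are illustrative:

```python
def encode_payload(seq_no: int, msg: bytes, padding: bytes) -> bytes:
    """Simplified MTP-ME payload: 64-bit zeroed salt, 64-bit constant
    session_id, 96-bit seq_no (merged msg_id/msg_seq_no), 32-bit
    message length, then the message and its padding."""
    salt = b'\x00' * 8        # server_salt field kept but zeroed
    session_id = b'\x00' * 8  # a fixed constant, as justified above
    header = (salt + session_id
              + seq_no.to_bytes(12, 'big')
              + len(msg).to_bytes(4, 'big'))
    return header + msg + padding
```

The header is 64 + 64 + 96 + 32 = 256 bits, matching \(\mathsf {\textsf{ME}.pl}(\ell , \nu ) = 256 + \ell + \left| \textsf{GenPadding} (\ell ; \nu )\right| \).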

The following \(\textsf{SHA}-\textsf{1}\) and \(\textsf{SHA}-\textsf{256}\)-based function families capture the MTProto primitives that are used to derive \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}\), the message key \(\textsf{msg}{\_}\textsf{key}\) and the symmetric encryption key k.

Definition 7

\(\textsf{MTP}\text {-}\textsf{HASH}\) is the function family defined by \( \textsf{MTP}\text {-}\textsf{HASH}.\textsf{KS}\) \(=\) \(\{0,1\}^{1056}\), \(\textsf{MTP}\text {-}\textsf{HASH}.\textsf{IN}\) \(=\) \(\{0,1\}^{992}\), \(\textsf{MTP}\text {-}\textsf{HASH}.\textsf{ol} = 128\) and \(\textsf{MTP}\text {-}\textsf{HASH}.\textsf{Ev}\) given in Fig. 21.

Fig. 21

Construction of function family \(\textsf{MTP}\text {-}\textsf{HASH}\)

Definition 8

\(\textsf{MTP}\text {-}\textsf{MAC}\) is the function family defined by \(\textsf{MTP}\text {-}\textsf{MAC}.\textsf{KS}\) \(=\) \(\{0,1\}^{256}\), \(\textsf{MTP}\text {-}\textsf{MAC}.\textsf{IN}\) \(=\) \(\{0,1\}^*\), \(\textsf{MTP}\text {-}\textsf{MAC}.\textsf{ol}=128\) and \(\textsf{MTP}\text {-}\textsf{MAC}.\textsf{Ev}\) given in Fig. 22.

Fig. 22

Construction of function family \(\textsf{MTP}\text {-}\textsf{MAC}\)

Definition 9

\(\textsf{MTP}\text {-}\textsf{KDF}\) is the function family defined by \(\textsf{MTP}\text {-}\textsf{KDF}.\textsf{KS}\) \(=\) \(\{0,1\}^{288} \times \{0,1\}^{288}\), \(\textsf{MTP}\text {-}\textsf{KDF}.\textsf{IN}\) \(=\) \(\{0,1\}^{128}\), \(\textsf{MTP}\text {-}\textsf{KDF}.\textsf{ol}\) \(=\) \(2 \cdot \textsf{SHA}-\textsf{256}.\textsf{ol}\) and \(\textsf{MTP}\text {-}\textsf{KDF}.\textsf{Ev}\) given in Fig. 23.

Fig. 23

Construction of function family \(\textsf{MTP}\text {-}\textsf{KDF}\)

Since the keys for \(\textsf{KDF}\) and \(\textsf{MAC}\) in MTProto are not independent for the two users, we have to work in a related-key setting. We are inspired by the RKA framework of [14], but define our related-key-deriving function \(\phi _{\textsf{KDF}}\) (resp. \(\phi _{\textsf{MAC}}\)) to output both keys at once, as a function of \(\textit{kk}\) (resp. \(\textit{mk}\)). See Fig. 24 for precise details of \(\phi _{\textsf{KDF}}\) and \(\phi _{\textsf{MAC}}\).

Fig. 24

Related-key-deriving functions \(\phi _{\textsf{KDF}} :\{0,1\}^{672} \rightarrow \textsf{MTP}\text {-}\textsf{KDF}.\textsf{KS} \times \textsf{MTP}\text {-}\textsf{KDF}.\textsf{KS}\) and \(\phi _{\textsf{MAC}} :\{0,1\}^{320} \rightarrow \textsf{MTP}\text {-}\textsf{MAC}.\textsf{KS} \times \textsf{MTP}\text {-}\textsf{MAC}.\textsf{KS}\)

Finally, we define the deterministic symmetric encryption scheme.

Definition 10

Let \(\textsf{AES}-\textsf{256}\) be the standard AES block cipher with \(\textsf{AES}-\textsf{256}.\textsf{kl} = 256\) and \(\textsf{AES}-\textsf{256}.\textsf{ol}\) \(=\) 128, and let \(\textsf{IGE}\) be the block cipher mode in Fig. 5. Let \(\textsf{MTP}\text {-}\textsf{SE}= \textsf{IGE}[\textsf{AES}-\textsf{256}]\).
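IGE chains both the previous ciphertext block and the previous plaintext block: \(c_i = \textsf{E}_k(m_i \oplus c_{i-1}) \oplus m_{i-1}\), with the IV supplying \(c_0\) and \(m_0\). A generic sketch over an injected block-cipher callable follows (a toy byte-wise permutation stands in for \(\textsf{AES}-\textsf{256}\), which is not in the Python standard library):

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def ige_encrypt(enc_block, blocks, c0, m0):
    """IGE encryption: c_i = E(m_i xor c_{i-1}) xor m_{i-1}.
    `blocks` is a list of equal-size plaintext blocks."""
    out, c_prev, m_prev = [], c0, m0
    for m in blocks:
        c = xor(enc_block(xor(m, c_prev)), m_prev)
        out.append(c)
        c_prev, m_prev = c, m
    return out

def ige_decrypt(dec_block, blocks, c0, m0):
    """Inverse: m_i = D(c_i xor m_{i-1}) xor c_{i-1}."""
    out, c_prev, m_prev = [], c0, m0
    for c in blocks:
        m = xor(dec_block(xor(c, m_prev)), c_prev)
        out.append(m)
        c_prev, m_prev = c, m
    return out
```

Instantiating `enc_block` with AES-256 encryption under the derived key yields \(\textsf{MTP}\text {-}\textsf{SE}\) as defined above.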

5 Formal security analysis

In this section, we define the security notions that we require to hold for each of the underlying primitives of \(\textsf{MTP}\text {-}\textsf{CH} \) and then use these notions to justify its correctness and prove its security properties.

We start by defining the security notions we require from the standard primitives in Sect. 5.1 (i.e. from the MTProto-based instantiations of \(\textsf{HASH}\), \(\textsf{KDF}\), \(\textsf{MAC}\), \(\textsf{SE}\)); in Sect. 5.2, we then define two novel assumptions about \(\textsf{SHACAL}-\textsf{2}\) that will be used in Appendix E to justify some of the aforementioned security notions. In Sect. 5.3, we define the security notions that will be required from the MTProto-based message encoding scheme; these notions are likewise justified in Appendix E. We prove that channel \(\textsf{MTP}\text {-}\textsf{CH} \) satisfies correctness, indistinguishability and integrity in Sections 5.4, 5.5 and 5.6, respectively. We conclude by providing an interpretation of our formal results in Sect. 5.7.

Our proofs use games and hops between them. In our games, we annotate some lines with comments of the form “\(\textrm{G}_i\)\(\textrm{G}_j\)” to indicate that these lines belong only to games \(\textrm{G}_i\) through \(\textrm{G}_j\) (inclusive). The lines not annotated with such comments are shared by all of the games that are shown in the particular figure.

5.1 Security requirements on standard primitives

5.1.1 \(\textsf{MTP}\text {-}\textsf{HASH}\) is a one-time indistinguishable function family

We require that \(\textsf{MTP}\text {-}\textsf{HASH}\) meets the one-time weak indistinguishability notion (\(\textrm{OTWIND}\)) defined in Fig. 25. The security game \(\textrm{G}^\textsf{otwind}_{\textsf{HASH}, \mathcal {D}}\) in Fig. 25 evaluates the function family \(\textsf{HASH}\) on a challenge input \(x_b\) using a secret uniformly random function key \(\textit{hk}\). Adversary \(\mathcal {D}\) is given \(x_0, x_1\) and the output of \(\textsf{HASH}\); it is required to guess the challenge bit \(b\in \{0,1\}\). The game samples inputs \(x_0, x_1\) uniformly at random rather than allowing \(\mathcal {D}\) to choose them, so this security notion requires \(\textsf{HASH}\) to provide only a weak form of one-time indistinguishability. The advantage of \(\mathcal {D}\) in breaking the \(\textrm{OTWIND}\)-security of \(\textsf{HASH}\) is defined as \(\textsf{Adv}^{\textsf{otwind}}_{\textsf{HASH}}(\mathcal {D}) = 2 \cdot \Pr [\textrm{G}^\textsf{otwind}_{\textsf{HASH}, \mathcal {D}}] - 1\). Appendix E.1 provides a formal reduction from the \(\textrm{OTWIND}\)-security of \(\textsf{MTP}\text {-}\textsf{HASH}\) to the one-time PRF security of \(\textsf{SHACAL}-\textsf{1}\) (as defined in Sect. 2.2).

Fig. 25

One-time weak indistinguishability of function family \(\textsf{HASH}\)

5.1.2 \(\textsf{MTP}\text {-}\textsf{KDF}\) is a PRF under related-key attacks

We require that \(\textsf{MTP}\text {-}\textsf{KDF}\) behaves like a pseudorandom function in the RKA setting (\(\textrm{RKPRF}\)) as defined in Fig. 26. The security game \(\textrm{G}^\textsf{rkprf}_{\textsf{KDF}, \phi _{\textsf{KDF}}, \mathcal {D}}\) in Fig. 26 defines a variant of the standard PRF notion allowing the adversary \(\mathcal {D}\) to use its \(\textsc {RoR}\) oracle to evaluate the function family \(\textsf{KDF}\) on either of the two secret, related function keys \(\textit{kk}_\mathcal {I}, \textit{kk}_\mathcal {R}\) (both computed using related-key-deriving function \(\phi _{\textsf{KDF}}\)). The advantage of \(\mathcal {D}\) in breaking the \(\textrm{RKPRF}\)-security of \(\textsf{KDF}\) with respect to \(\phi _{\textsf{KDF}}\) is defined as \(\textsf{Adv}^{\textsf{rkprf}}_{\textsf{KDF}, \phi _{\textsf{KDF}}}(\mathcal {D}) = 2 \cdot \Pr [\textrm{G}^\textsf{rkprf}_{\textsf{KDF}, \phi _{\textsf{KDF}}, \mathcal {D}}] - 1\).

Fig. 26

Related-key PRF security of function family \(\textsf{KDF}\) with respect to related-key-deriving function \(\phi _{\textsf{KDF}}\)

In Sect. 5.2, we define a novel security notion for \(\textsf{SHACAL}-\textsf{2}\) that roughly requires it to be a leakage-resilient PRF under related-key attacks; in Appendix E.2, we provide a formal reduction from the \(\textrm{RKPRF}\)-security of \(\textsf{MTP}\text {-}\textsf{KDF}\) to the new security notion. In this context, “leakage resilience” means that the adversary can adaptively choose a part of the \(\textsf{SHACAL}-\textsf{2}\) key. However, we limit the adversary to being able to evaluate \(\textsf{SHACAL}-\textsf{2}\) only on a single known, constant input (which is \(\textsf{IV}_{256}\), the initial state of \(\textsf{SHA}-\textsf{256}\)). The new security notion is formalised as the \(\textrm{LRKPRF}\)-security of \(\textsf{SHACAL}-\textsf{2}\) with respect to a pair of related-key-deriving functions \(\phi _{\textsf{KDF}}\) and \(\phi _{\textsf{SHACAL}-\textsf{2}}\) (the latter is defined in Sect. 5.2).

5.1.3 \(\textsf{MTP}\text {-}\textsf{MAC}\) is collision-resistant under RKA

We require that collisions in the outputs of \(\textsf{MTP}\text {-}\textsf{MAC}\) under related keys are hard to find (\(\textrm{RKCR}\)), as defined in Fig. 27. The security game \(\textrm{G}^{\textsf{rkcr}}_{\textsf{MAC}, \phi _{\textsf{MAC}}, \mathcal {F}}\) in Fig. 27 gives the adversary \(\mathcal {F}\) two related function keys \(\textit{mk}_\mathcal {I}, \textit{mk}_\mathcal {R}\) (created by the related-key-deriving function \(\phi _{\textsf{MAC}}\)), and requires it to produce two payloads \(p_0, p_1\) (for either user \(\textit{u}\)) such that there is a collision in the corresponding outputs \(\textsf{msg}{\_}\textsf{key}_0, \textsf{msg}{\_}\textsf{key}_1\) of the function family \(\textsf{MAC}\). The advantage of \(\mathcal {F}\) in breaking the \(\textrm{RKCR}\)-security of \(\textsf{MAC}\) with respect to \(\phi _{\textsf{MAC}}\) is defined as \(\textsf{Adv}^{\textsf{rkcr}}_{\textsf{MAC}, \phi _{\textsf{MAC}}}(\mathcal {F}) = \Pr [\textrm{G}^{\textsf{rkcr}}_{\textsf{MAC}, \phi _{\textsf{MAC}}, \mathcal {F}}]\). It is clear by inspection that the \(\textrm{RKCR}\)-security of \(\textsf{MTP}\text {-}\textsf{MAC}.\textsf{Ev}(\textit{mk}_{\textit{u}}, p) = \textsf{SHA}-\textsf{256}(\textit{mk}_\textit{u}~\Vert ~p){[64:192]}\) (with respect to \(\phi _{\textsf{MAC}}\) from Fig. 24) reduces to the collision resistance of truncated output \(\textsf{SHA}-\textsf{256}\).
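Concretely, with Python's hashlib (bit range \([64:192]\) corresponds to digest bytes 8 through 23, the middle 128 bits):

```python
import hashlib

def mtp_mac(mk: bytes, p: bytes) -> bytes:
    """MTP-MAC: the middle 128 bits of SHA-256(mk || p)."""
    return hashlib.sha256(mk + p).digest()[8:24]
```

Any pair of payloads colliding under `mtp_mac` (for either related key) yields, by prepending the key, a collision in truncated-output \(\textsf{SHA}-\textsf{256}\), which is the reduction claimed above.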

Fig. 27

Related-key collision resistance of function family \(\textsf{MAC}\) with respect to related-key-deriving function \(\phi _{\textsf{MAC}}\)

5.1.4 \(\textsf{MTP}\text {-}\textsf{MAC}\) is a PRF under RKA for unique-prefix inputs

We require that \(\textsf{MTP}\text {-}\textsf{MAC}\) behaves like a pseudorandom function in the RKA setting when it is evaluated on a set of inputs that have unique 256-bit prefixes (\(\textrm{UPRKPRF}\)), as defined in Fig. 28. The security game \(\textrm{G}^\textsf{uprkprf}_{\textsf{MAC}, \phi _{\textsf{MAC}}, \mathcal {D}}\) in Fig. 28 extends the standard PRF notion to use two related \(\phi _{\textsf{MAC}}\)-derived function keys \(\textit{mk}_\mathcal {I}, \textit{mk}_\mathcal {R}\) for the function family \(\textsf{MAC}\) (similar to the \(\textrm{RKPRF}\)-security notion we defined above), but it also enforces that the adversary \(\mathcal {D}\) cannot query its oracle \(\textsc {RoR}\) on two inputs \((\textit{u}, p_0)\) and \((\textit{u}, p_1)\) for any \(\textit{u}\in \{\mathcal {I}, \mathcal {R}\}\) such that \(p_0, p_1\) share the same 256-bit prefix. The unique-prefix condition means that the game does not need to maintain a PRF table to achieve output consistency. Note that this security game only allows the oracle \(\textsc {RoR}\) to be called with inputs of length \(\left| p\right| \ge 256\); this is sufficient for our purposes, because in \(\textsf{MTP}\text {-}\textsf{CH} \) the function family \(\textsf{MTP}\text {-}\textsf{MAC}\) is only used with payloads that are longer than 256 bits. The advantage of \(\mathcal {D}\) in breaking the \(\textrm{UPRKPRF}\)-security of \(\textsf{MAC}\) with respect to \(\phi _{\textsf{MAC}}\) is defined as \(\textsf{Adv}^{\textsf{uprkprf}}_{\textsf{MAC}, \phi _{\textsf{MAC}}}(\mathcal {D}) = 2 \cdot \Pr [\textrm{G}^\textsf{uprkprf}_{\textsf{MAC}, \phi _{\textsf{MAC}}, \mathcal {D}}] - 1\).

Fig. 28

Related-key PRF security of function family \(\textsf{MAC}\) for inputs with unique 256-bit prefixes, with respect to key derivation function \(\phi _{\textsf{MAC}}\)

In Sect. 5.2, we define a novel security notion that requires \(\textsf{SHACAL}-\textsf{2}\) to be a leakage-resilient, related-key PRF when evaluated on a fixed input; in Appendix E.3, we show that the \(\textrm{UPRKPRF}\)-security of \(\textsf{MTP}\text {-}\textsf{MAC}\) reduces to this security notion and to the one-time PRF security (\(\textrm{OTPRF}\)) of the \(\textsf{SHA}-\textsf{256}\) compression function \(h _{256}\). The new security notion is similar to the notion discussed in Sect. 5.1 and defined in Sect. 5.2, in that it only allows the adversary to evaluate \(\textsf{SHACAL}-\textsf{2}\) on the fixed input \(\textsf{IV}_{256}\). However, the underlying security game derives the related \(\textsf{SHACAL}-\textsf{2}\) keys differently, partially based on the function \(\phi _{\textsf{MAC}}\) defined in Fig. 24 (as opposed to \(\phi _{\textsf{KDF}}\)). The new notion is formalised as the \(\textrm{HRKPRF}\)-security of \(\textsf{SHACAL}-\textsf{2}\) with respect to \(\phi _{\textsf{MAC}}\).

5.1.5 \(\textsf{MTP}\text {-}\textsf{SE}\) is a one-time indistinguishable SE scheme

For any block cipher \(\textsf{E}\), Appendix E.4 shows that \(\textsf{IGE}[\textsf{E}]\) as used in MTProto is \(\mathrm {OTIND\$}\)-secure (defined in Fig. 4) if \(\textsf{CBC}[\textsf{E}]\) is \(\mathrm {OTIND\$}\)-secure. This enables us to use standard results [13, 49] on \(\textsf{CBC}\) in our analysis of MTProto.

5.2 Novel assumptions about \(\textsf{SHACAL}-\textsf{2}\)

In this section, we define two novel assumptions about \(\textsf{SHACAL}-\textsf{2}\). Both assumptions require \(\textsf{SHACAL}-\textsf{2}\) to be a related-key PRF when evaluated on the fixed input \(\textsf{IV}_{256}\) (i.e. on the initial state of \(\textsf{SHA}-\textsf{256}\)), meaning that the adversary can obtain the values of \(\textsf{SHACAL}-\textsf{2}.\textsf{Ev}(\cdot , \textsf{IV}_{256})\) for a number of different but related keys. We formalise the two assumptions as security notions, called \(\textrm{LRKPRF}\) and \(\textrm{HRKPRF}\), each defined with respect to different related-key-deriving functions; this reflects the fact that these security notions allow the adversary to choose the keys in substantially different ways. The notion of \(\textrm{LRKPRF}\)-security derives the \(\textsf{SHACAL}-\textsf{2}\) keys partially based on the function \(\phi _{\textsf{KDF}}\), whereas the notion of \(\textrm{HRKPRF}\)-security derives \(\textsf{SHACAL}-\textsf{2}\) keys partially based on the function \(\phi _{\textsf{MAC}}\) (both functions are defined in Fig. 24). Both security notions also have different flavours of leakage resilience: (1) the security game defining \(\textrm{LRKPRF}\) allows the adversary to directly choose 128 bits of the 512-bit long \(\textsf{SHACAL}-\textsf{2}\) key, with another 96 bits of this key fixed and known (due to being chosen by the SHA padding function \(\textsf{SHA}-\textsf{pad}\)), and (2) the security game defining \(\textrm{HRKPRF}\) allows the adversary to directly choose 256 bits of the 512-bit long \(\textsf{SHACAL}-\textsf{2}\) key.

We use the notion of \(\textrm{LRKPRF}\)-security to justify the \(\textrm{RKPRF}\)-security of \(\textsf{MTP}\text {-}\textsf{KDF}\) with respect to \(\phi _{\textsf{KDF}}\) (as explained in Sect. 5.1, with the security reduction in Appendix E.2), which is needed in both the \(\textrm{IND}\)-security and the \(\textrm{INT}\)-security proofs of \(\textsf{MTP}\text {-}\textsf{CH} \). We use the notion of \(\textrm{HRKPRF}\)-security to justify the \(\textrm{UPRKPRF}\)-security of \(\textsf{MTP}\text {-}\textsf{MAC}\) with respect to \(\phi _{\textsf{MAC}}\) (as explained in Sect. 5.1, with the security reduction in Appendix E.3), which is needed in the \(\textrm{IND}\)-security proof of \(\textsf{MTP}\text {-}\textsf{CH} \).

We stress that we have to assume properties of \(\textsf{SHACAL}-\textsf{2}\) that have not been studied in the literature. Related-key attacks on reduced-round \(\textsf{SHACAL}-\textsf{2}\) have been considered [37, 41], but they ordinarily work with a known difference relation between unknown keys. In contrast, our \(\textrm{LRKPRF}\)-security notion uses keys that differ by random, unknown parts. Both of our security notions consider keys that are partially chosen or known by the adversary. In Appendix F, we show that both the \(\textrm{LRKPRF}\)-security and the \(\textrm{HRKPRF}\)-security of \(\textsf{SHACAL}-\textsf{2}\) hold in the ideal cipher model (i.e. when \(\textsf{SHACAL}-\textsf{2}\) is modelled as the ideal cipher); we provide concrete upper bounds for breaking each of them. However, we cannot rule out the possibility of attacks on \(\textsf{SHACAL}-\textsf{2}\) due to its internal structure in the setting of related-key attacks combined with key leakage. We leave this as an open question.

5.2.1 \(\textsf{SHACAL}-\textsf{2}\) is a PRF with \(\phi _{\textsf{KDF}}\)-based related keys

Our \(\textrm{LRKPRF}\)-security notion for \(\textsf{SHACAL}-\textsf{2}\) is defined with respect to related-key-deriving functions \(\phi _{\textsf{KDF}}\) (from Fig. 24) and \(\phi _{\textsf{SHACAL}-\textsf{2}}\) from Fig. 29. The latter mirrors the design of \(\textsf{MTP}\text {-}\textsf{KDF}\) that (in Definition 9) is defined to return \(\textsf{SHA}-\textsf{256}(\textsf{msg}{\_}\textsf{key} ~\Vert ~\textit{kk}_0) ~\Vert ~\) \(\textsf{SHA}-\textsf{256}(\textit{kk}_1 ~\Vert ~\textsf{msg}{\_}\textsf{key})\) for the target key \(\textit{kk}_\textit{u}= (\textit{kk}_0, \textit{kk}_1)\), except \(\phi _{\textsf{SHACAL}-\textsf{2}}\) only needs to produce the corresponding SHA-padded inputs. We note that \(\textrm{LRKPRF}\)-security of \(\textsf{SHACAL}-\textsf{2}\) could instead be defined with respect to a single related-key-deriving function that would merge \(\phi _{\textsf{KDF}}\) and \(\phi _{\textsf{SHACAL}-\textsf{2}}\), which could lead to a cleaner formalisation of \(\textrm{LRKPRF}\)-security; however, we chose to avoid introducing an additional abstraction level here.

Fig. 29

Related-key-deriving function \(\phi _{\textsf{SHACAL}-\textsf{2}}:(\textsf{MTP}\text {-}\textsf{KDF}.\textsf{KS} \times \textsf{MTP}\text {-}\textsf{KDF}.\textsf{KS}) \times \{0,1\}^{128} \rightarrow \{0,1\}^{512}\)

Consider the game \(\textrm{G}^\textsf{lrkprf}_{\textsf{SHACAL}-\textsf{2}, \phi _{\textsf{KDF}}, \phi _{\textsf{SHACAL}-\textsf{2}}, \mathcal {D}}\) in Fig. 30. Adversary \(\mathcal {D}\) is given access to the \(\textsc {RoR}\) oracle that takes \(\textit{u}, i, \textsf{msg}{\_}\textsf{key}\) as input; all inputs to the oracle serve as parameters for the \(\textsf{SHACAL}-\textsf{2}\) key derivation, used to determine the challenge key \(\textit{sk} _i\). The adversary gets back either the output of \(\textsf{SHACAL}-\textsf{2}.\textsf{Ev}(\textit{sk} _i\), \(\textsf{IV}_{256})\) (if \(b=1\)), or a uniformly random value (if \(b=0\)), and is required to guess the challenge bit. The PRF table \(\textsf{T}\) is used to ensure consistency, so that a single random value is sampled and remembered for each set of used key derivation parameters \(\textit{u}, i, \textsf{msg}{\_}\textsf{key}\). The advantage of \(\mathcal {D}\) in breaking the \(\textrm{LRKPRF}\)-security of \(\textsf{SHACAL}-\textsf{2}\) with respect to \(\phi _{\textsf{KDF}}\) and \(\phi _{\textsf{SHACAL}-\textsf{2}}\) is defined as \(\textsf{Adv}^{\textsf{lrkprf}}_{\textsf{SHACAL}-\textsf{2}, \phi _{\textsf{KDF}}, \phi _{\textsf{SHACAL}-\textsf{2}}}(\mathcal {D}) = 2 \cdot \Pr [\textrm{G}^\textsf{lrkprf}_{\textsf{SHACAL}-\textsf{2}, \phi _{\textsf{KDF}}, \phi _{\textsf{SHACAL}-\textsf{2}}, \mathcal {D}}] - 1\).

Fig. 30

Leakage-resilient, related-key PRF security of function family \(\textsf{SHACAL}-\textsf{2}\) on fixed input \(\textsf{IV}_{256}\) with respect to related-key-deriving functions \(\phi _{\textsf{KDF}}\) and \(\phi _{\textsf{SHACAL}-\textsf{2}}\)

5.2.2 \(\textsf{SHACAL}-\textsf{2}\) is a PRF with \(\phi _{\textsf{MAC}}\)-based related keys

Consider the game \(\textrm{G}^\textsf{hrkprf}_{\textsf{SHACAL}-\textsf{2}, \phi _{\textsf{MAC}}, \mathcal {D}}\) in Fig. 31. Adversary \(\mathcal {D}\) is given access to \(\textsc {RoR}\) oracle and is required to choose the 256-bit suffix \(p\) of each challenge key used for evaluating \(\textsf{SHACAL}-\textsf{2}.\textsf{Ev}(\cdot , \textsf{IV}_{256})\). The value of \(\textit{mk}_\textit{u}\) is then used to set the 256-bit prefix of the challenge key, where \(\textit{u}\) is also chosen by the adversary, but the \(\textit{mk}_\mathcal {I}, \textit{mk}_\mathcal {R}\) values themselves are related secrets that are not known to \(\mathcal {D}\). The advantage of \(\mathcal {D}\) in breaking the \(\textrm{HRKPRF}\)-security of \(\textsf{SHACAL}-\textsf{2}\) with respect to \(\phi _{\textsf{MAC}}\) is defined as \(\textsf{Adv}^{\textsf{hrkprf}}_{\textsf{SHACAL}-\textsf{2}, \phi _{\textsf{MAC}}}(\mathcal {D}) = 2 \cdot \Pr [\textrm{G}^\textsf{hrkprf}_{\textsf{SHACAL}-\textsf{2}, \phi _{\textsf{MAC}}, \mathcal {D}}] - 1\).

Fig. 31

Leakage-resilient, related-key PRF security of function family \(\textsf{SHACAL}-\textsf{2}\) on fixed input \(\textsf{IV}_{256}\) with respect to related-key-deriving function \(\phi _{\textsf{MAC}}\)

5.3 Security requirements on message encoding

In Sect. 3.5, we defined encoding integrity of a message encoding scheme \(\textsf{ME}\) with respect to any support function \(\textsf{supp}\). We now define the support function \(\textsf{supp}= \textsf{supp}\text {-}\textsf{ord}\) that will be used for our security proofs. We also define three ad hoc notions that must be met by the MTProto-based message encoding scheme \(\textsf{MTP}\text {-}\textsf{ME}\) in order to be compatible with our security proofs.

5.3.1 \(\textsf{MTP}\text {-}\textsf{ME}\) ensures in-order delivery

We require that \(\textsf{MTP}\text {-}\textsf{ME}\) is \(\textrm{EINT}\)-secure (Fig. 16) with respect to the support function \(\textsf{supp}\text {-}\textsf{ord}\) defined in Fig. 32. We define \(\textsf{supp}\text {-}\textsf{ord}\) to enforce in-order delivery for each user’s sent messages (i.e. independently in each direction), thus preventing message forgeries, replays, (unidirectional) reordering and drops.

Fig. 32

Support function \(\textsf{supp}\text {-}\textsf{ord}\) for in-order message delivery

The formalisation of the support function \(\textsf{supp}\text {-}\textsf{ord}\) uses a helper function \(\textsf{find}(\textsf{op}, \textsf{tr}_{}, \textsf{label})\) that searches a transcript \(\textsf{tr}_{}\) for an \(\textsf{op}\)-type entry (where \(\textsf{op}\in \{\textsf{sent}, \textsf{recv}\}\)) containing a target label \(\textsf{label}\). This code relies on the assumption that all support labels are unique, which holds for payloads of \(\textsf{MTP}\text {-}\textsf{ME}\) and for ciphertexts of \(\textsf{MTP}\text {-}\textsf{CH} \) as long as at most \(2^{96}\) plaintexts are sent. The function \(\textsf{find}\) also determines \(N_{\textsf{op}} \), the order number of the target entry among all valid entries (i.e. the number of valid entries in the transcript up to and including the target entry); if the entry is not found, then \(N_{\textsf{op}} \) is set to the number of all valid entries in the transcript. The support function \(\textsf{supp}\text {-}\textsf{ord}\) on inputs \(\textit{u}, \textsf{tr}_{\textit{u}}, \textsf{tr}_{\overline{\textit{u}}}, \textsf{label}\) requires that (i) there is no entry with label \(\textsf{label}\) and a non-\(\bot \) message in the receiver’s transcript \(\textsf{tr}_{\textit{u}}\), (ii) an entry with label \(\textsf{label}\) is found in the sender’s transcript \(\textsf{tr}_{\overline{\textit{u}}}\), and (iii) the number of valid entries in the receiver’s transcript is one fewer than the order number of the entry found in the sender’s transcript, i.e. \(N_{\textsf{sent}} = N_{\textsf{recv}} + 1\). Here condition (i) prevents message replays, condition (ii) prevents message forgeries, and condition (iii) prevents message reordering and drops. As outlined in Sect. 4.2, the message encoding scheme \(\textsf{ME}\) in the version of MTProto we studied (cf. Appendix D) allowed reordering, so it was not \(\textrm{EINT}\)-secure with respect to \(\textsf{supp}\text {-}\textsf{ord}\); instead, we use the simplified message encoding scheme \(\textsf{MTP}\text {-}\textsf{ME}\) (cf. Definition 6) for our formal analysis of MTProto. In Appendix E.5, we show that \(\textsf{Adv}^{\textsf{eint}}_{\textsf{MTP}\text {-}\textsf{ME}, \textsf{supp}\text {-}\textsf{ord}}(\mathcal {F}) = 0\) for any \(\mathcal {F}\) making at most \(2^{96}\) queries to \(\textsc {Send}\).
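The behaviour of \(\textsf{find}\) and \(\textsf{supp}\text {-}\textsf{ord}\) described above can be sketched in Python. This is a simplified rendering of Fig. 32 under our own transcript representation: entries are tuples \((\textsf{op}, \textsf{label}, m)\) with \(\bot \) encoded as None, which is an assumption rather than the paper's exact pseudocode.

```python
# Simplified sketch of the supp-ord support function (cf. Fig. 32).
# A transcript is a list of (op, label, m) entries; m is None for
# invalid entries (m = bot).

def find(op, tr, label):
    """Search tr for a valid op-type entry with the given label.
    Returns (found, n_op): n_op is the order number of the target entry
    among valid entries, or the count of all valid entries if the
    target entry is not found."""
    n = 0
    for (entry_op, entry_label, m) in tr:
        if entry_op == op and m is not None:
            n += 1
            if entry_label == label:
                return True, n
    return False, n

def supp_ord(u, tr_u, tr_ubar, label):
    """Accept delivery iff the label was (i) never validly received
    (no replay), (ii) actually sent (no forgery), and (iii) next in
    order: N_sent = N_recv + 1 (no reordering or drops)."""
    replayed, n_recv = find("recv", tr_u, label)
    sent, n_sent = find("sent", tr_ubar, label)
    return (not replayed) and sent and (n_sent == n_recv + 1)

# Sender has sent two messages; receiver has received none yet:
tr_sender = [("sent", "l1", "m1"), ("sent", "l2", "m2")]
assert supp_ord("u", [], tr_sender, "l1")       # in order: accepted
assert not supp_ord("u", [], tr_sender, "l2")   # skips l1: rejected
```

After the receiver logs `("recv", "l1", "m1")`, the same check accepts `l2` and rejects a replayed `l1`, matching conditions (i)–(iii) above.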

Fig. 33

Prefix uniqueness of message encoding scheme \(\textsf{ME}\)

5.3.2 Prefix uniqueness of \(\textsf{MTP}\text {-}\textsf{ME}\)

We require that payloads produced by \(\textsf{MTP}\text {-}\textsf{ME}\) have distinct prefixes of size 256 bits (independently for each user \(\textit{u}\in \{\mathcal {I}, \mathcal {R}\}\)), as defined by the security game in Fig. 33. The advantage of an adversary \(\mathcal {F}\) in breaking the \(\textrm{UPREF}\)-security of a message encoding scheme \(\textsf{ME}\) is defined as \(\textsf{Adv}^{\textsf{upref}}_{\textsf{ME}}(\mathcal {F}) = \Pr [\textrm{G}^{\textsf{upref}}_{\textsf{ME}, \mathcal {F}}]\). Given the fixed prefix size, this notion cannot be satisfied against unbounded adversaries. Our \(\textsf{MTP}\text {-}\textsf{ME}\) scheme ensures unique prefixes using the \(96\)-bit counter \(\textsf {seq}\_\textsf {no} \) that contains the number of messages sent by user \(\textit{u}\), so we have \(\textsf{Adv}^{\textsf{upref}}_{\textsf{MTP}\text {-}\textsf{ME}}(\mathcal {F}) = 0\) for any \(\mathcal {F}\) making at most \(2^{96}\) queries; otherwise, there exists an adversary \(\mathcal {F}\) such that \(\textsf{Adv}^{\textsf{upref}}_{\textsf{MTP}\text {-}\textsf{ME}}(\mathcal {F}) = 1\). Note that \(\textsf{MTP}\text {-}\textsf{ME}\) always produces payloads longer than 256 bits. The MTProto implementation of message encoding we analysed was not \(\textrm{UPREF}\)-secure, since it allowed repeated msg_id values (cf. Sect. 4.2).
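As a toy illustration of why a counter in the prefix yields \(\textrm{UPREF}\): the sketch below uses our own simplified payload layout (a 96-bit counter followed by fixed padding), not the exact \(\textsf{MTP}\text {-}\textsf{ME}\) format, to show that 256-bit prefixes stay collision-free until the counter wraps.

```python
# Toy encoder: a 96-bit seq_no embedded in the 256-bit payload prefix
# guarantees distinct prefixes until 2**96 messages have been sent.
# The field layout here is illustrative, not the real MTP-ME encoding.

class ToyEncoder:
    def __init__(self):
        self.seq_no = 0

    def encode(self, msg: bytes) -> bytes:
        # 256-bit prefix: 96-bit counter || 160 bits of fixed padding.
        prefix = self.seq_no.to_bytes(12, "big") + b"\x00" * 20
        self.seq_no = (self.seq_no + 1) % 2**96  # wraps after 2**96
        return prefix + msg

enc = ToyEncoder()
prefixes = {enc.encode(b"hello")[:32] for _ in range(1000)}
assert len(prefixes) == 1000  # all 256-bit prefixes are distinct
```

Once the counter wraps, the first payload's prefix repeats, which is exactly why the zero-advantage claim above is limited to \(2^{96}\) queries.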

Fig. 34

Encoding robustness of message encoding scheme \(\textsf{ME}\)

5.3.3 Encoding robustness of \(\textsf{MTP}\text {-}\textsf{ME}\)

We require that decoding in \(\textsf{MTP}\text {-}\textsf{ME}\) does not affect its state in a way that would be visible in future encoded payloads, as defined by the security game in Fig. 34. The advantage of an adversary \(\mathcal {D}\) in breaking the \(\textrm{ENCROB}\)-security of a message encoding scheme \(\textsf{ME}\) is defined as \(\textsf{Adv}^{\textsf{encrob}}_{\textsf{ME}}(\mathcal {D}) = 2\cdot \Pr [\textrm{G}^{\textsf{encrob}}_{\textsf{ME}, \mathcal {D}}]-1\). This advantage is trivially zero for both \(\textsf{MTP}\text {-}\textsf{ME}\) and the original MTProto message encoding scheme (cf. Appendix D). Note, however, that this property prevents a message encoding scheme from building payloads that include the number of previously received messages. It is thus incompatible with stronger notions of resistance against reordering attacks, such as the global transcript (cf. Sect. 4.2).
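The intuition behind \(\textrm{ENCROB}\) can be seen in a toy encoding scheme (our own illustration, not \(\textsf{MTP}\text {-}\textsf{ME}\) itself) whose send and receive state are disjoint, so interleaved decode calls leave the stream of encoded payloads unchanged.

```python
# Toy ENCROB illustration: the encoding state (send counter) and the
# decoding state (recv counter) are disjoint, so Decode calls cannot
# influence the payloads produced by future Encode calls.

class ToyME:
    def __init__(self):
        self.sent = 0   # touched only by encode
        self.recvd = 0  # touched only by decode

    def encode(self, m: bytes) -> bytes:
        p = self.sent.to_bytes(12, "big") + m
        self.sent += 1
        return p

    def decode(self, p: bytes) -> bytes:
        self.recvd += 1  # this state update is invisible to encode
        return p[12:]

a, b = ToyME(), ToyME()
out_a = a.encode(b"m")            # no decode calls beforehand
b.decode(b"\x00" * 12 + b"junk")  # adversarial decode before encoding
out_b = b.encode(b"m")
assert out_a == out_b  # identical payloads: ENCROB advantage is 0
```

A scheme whose payloads reported `recvd` would break this equality, which is precisely why \(\textrm{ENCROB}\) rules out global-transcript-style countermeasures as noted above.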

Fig. 35

Unpredictability of deterministic symmetric encryption scheme \(\textsf{SE}\) with respect to message encoding scheme \(\textsf{ME}\)

5.3.4 Combined security of \(\textsf{MTP}\text {-}\textsf{SE}\) and \(\textsf{MTP}\text {-}\textsf{ME}\)

We require that decryption in \(\textsf{MTP}\text {-}\textsf{SE}\) with uniformly random keys has unpredictable outputs with respect to \(\textsf{MTP}\text {-}\textsf{ME}\), as defined in Fig. 35. The security game \(\textrm{G}^{\textsf{unpred}}_{\textsf{SE}, \textsf{ME}, \mathcal {F}}\) in Fig. 35 gives adversary \(\mathcal {F}\) access to two oracles. For any user \(\textit{u}\in \{\mathcal {I},\mathcal {R}\}\) and message key \(\textsf{msg}{\_}\textsf{key}\), oracle \(\textsc {Ch}\) decrypts a given ciphertext \(c_{\textit{se}}\) of deterministic symmetric encryption scheme \(\textsf{SE}\) under a uniformly random key \(k\in \{0,1\}^{\mathsf {\textsf{SE}.kl}}\) and then decodes it using the given message encoding state \(\textit{st}_\textsf{ME}\) of message encoding scheme \(\textsf{ME}\), returning no output. The adversary is allowed to choose arbitrary values of \(c_{\textit{se}}\) and \(\textit{st}_\textsf{ME}\); it is allowed to repeatedly query oracle \(\textsc {Ch}\) on inputs that contain the same values for \(\textit{u}, \textsf{msg}{\_}\textsf{key}\) in order to reuse a fixed, secret \(\textsf{SE}\) key k with different choices of \(c_{\textit{se}}\). Oracle \(\textsc {Expose}\) lets \(\mathcal {F}\) learn the \(\textsf{SE}\) key corresponding to the given \(\textit{u}\) and \(\textsf{msg}{\_}\textsf{key}\); the table \(\textsf{S}\) is then used to disallow the adversary from querying \(\textsc {Ch}\) with this pair of \(\textit{u}\) and \(\textsf{msg}{\_}\textsf{key}\) values again. \(\mathcal {F}\) wins if it can cause \(\mathsf {\textsf{ME}.Decode}\) to output a valid \(m \not = \bot \). Note that \(\textsf{msg}{\_}\textsf{key}\) in this game merely serves as a label for the tables, so we allow it to be an arbitrary string \(\textsf{msg}{\_}\textsf{key}\in \{0,1\}^*\). 
The advantage of \(\mathcal {F}\) in breaking the \(\textrm{UNPRED}\)-security of \(\textsf{SE}\) with respect to \(\textsf{ME}\) is defined as \(\textsf{Adv}^{\textsf{unpred}}_{\textsf{SE}, \textsf{ME}}(\mathcal {F}) = \Pr [\textrm{G}^{\textsf{unpred}}_{\textsf{SE}, \textsf{ME}, \mathcal {F}}]\). In Appendix E.6, we show that \(\textsf{Adv}^{\textsf{unpred}}_{\textsf{MTP}\text {-}\textsf{SE}, \textsf{MTP}\text {-}\textsf{ME}}(\mathcal {F}) \le {q_{\textsc {Ch}}}/{2^{64}}\) for any \(\mathcal {F}\) making \(q_{\textsc {Ch}}\) queries.
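For intuition about the magnitude of this bound, a small illustrative calculation (the query counts below are our own example values, not from the analysis):

```python
# Illustrative evaluation of the UNPRED bound q_Ch / 2**64.
from fractions import Fraction

def unpred_bound(q_ch: int) -> Fraction:
    """Upper bound on the UNPRED advantage after q_ch Ch queries."""
    return Fraction(q_ch, 2**64)

# Even a very generous query budget leaves the bound tiny:
assert unpred_bound(2**40) == Fraction(1, 2**24)  # about 6e-8
assert float(unpred_bound(2**20)) < 1e-13
```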

5.4 Correctness of \(\textsf{MTP}\text {-}\textsf{CH}\)

We claim that our MTProto-based channel satisfies our correctness definition. Consider any adversary \(\mathcal {F}\) playing in the correctness game \(\textrm{G}^{\textsf{corr}}_{\textsf{CH}, \textsf{supp}, \mathcal {F}}\) (Fig. 13) for channel \(\textsf{CH}= \textsf{MTP}\text {-}\textsf{CH} \) (Fig. 19) and support function \(\textsf{supp}= \textsf{supp}\text {-}\textsf{ord}\) (Fig. 32). Due to the definition of \(\textsf{supp}\text {-}\textsf{ord}\), the \(\textsc {Recv}\) oracle in game \(\textrm{G}^{\textsf{corr}}_{\textsf{MTP}\text {-}\textsf{CH}, \textsf{supp}\text {-}\textsf{ord}, \mathcal {F}}\) rejects all \(\textsf{CH}\) ciphertexts that were not previously returned by the \(\textsc {Send}\) oracle. The encryption and decryption algorithms of channel \(\textsf{MTP}\text {-}\textsf{CH} \) rely in a modular way on the message encoding scheme \(\textsf{MTP}\text {-}\textsf{ME}\), deterministic function families \(\textsf{MTP}\text {-}\textsf{KDF}, \textsf{MTP}\text {-}\textsf{MAC}\), and deterministic symmetric encryption scheme \(\textsf{MTP}\text {-}\textsf{SE}\); the latter provides decryption correctness, so any valid ciphertext processed by oracle \(\textsc {Recv}\) correctly yields the originally encrypted payload \(p\). Thus, we need to show that \(\textsf{MTP}\text {-}\textsf{ME}\) always recovers the expected plaintext m from payload \(p\), meaning m matches the corresponding output of \(\textsf{supp}\text {-}\textsf{ord}\). In Sect. 3.5, we formalised this requirement as the encoding correctness of \(\textsf{MTP}\text {-}\textsf{ME}\) with respect to \(\textsf{supp}\text {-}\textsf{ord}\) and discussed that it is also implied by the encoding integrity of \(\textsf{MTP}\text {-}\textsf{ME}\) with respect to \(\textsf{supp}\text {-}\textsf{ord}\). We prove the latter in Appendix E.5 for adversaries that make at most \(2^{96}\) queries.

5.5 \(\textrm{IND}\)-security of \(\textsf{MTP}\text {-}\textsf{CH}\)

We begin our \(\textrm{IND}\)-security reduction by considering an arbitrary adversary \(\mathcal {D}_{\textrm{IND}}\) playing in the \(\textrm{IND}\)-security game against channel \(\textsf{CH}= \textsf{MTP}\text {-}\textsf{CH} \) (i.e. \(\textrm{G}^{\textsf{ind}}_{\textsf{CH}, \mathcal {D}_{\textrm{IND}}}\) in Fig. 14), and we gradually change this game until we can show that \(\mathcal {D}_{\textrm{IND}}\) can no longer win. To this end, we make three key observations:

(1) Recall that oracle \(\textsc {Recv}\) always returns \(\bot \), and the only functionality of this oracle is to update the state of the receiver’s channel by calling \(\mathsf {\textsf{CH}.Recv}\). We assume that calls to \(\mathsf {\textsf{CH}.Recv}\) never affect the ciphertexts that are returned by future calls to \(\mathsf {\textsf{CH}.Send}\) (more precisely, we use the \(\textrm{ENCROB}\) property of \(\textsf{ME}\), which reasons about payloads rather than ciphertexts). This allows us to completely disregard the \(\textsc {Recv}\) oracle, making it immediately return \(\bot \) without calling \(\mathsf {\textsf{CH}.Recv}\).

(2) We use the \(\textrm{UPRKPRF}\)-security of \(\textsf{MAC}\) to show that the ciphertexts returned by oracle \(\textsc {Ch}\) contain \(\textsf{msg}{\_}\textsf{key}\) values that look uniformly random and are independent of each other. Roughly, this security notion requires that \(\textsf{MAC}\) can only be evaluated on a set of inputs with unique prefixes. To ensure this, we assume that the payloads produced by \(\textsf{ME}\) meet this requirement (as formalised by the \(\textrm{UPREF}\) property of \(\textsf{ME}\)).

(3) In order to prove that oracle \(\textsc {Ch}\) does not leak the challenge bit, it remains to show that the ciphertexts returned by \(\textsc {Ch}\) contain \(c_{\textit{se}}\) values that look uniformly random and independent of each other. This follows from the \(\mathrm {OTIND\$}\)-security of \(\textsf{SE}\). We invoke the \(\textrm{OTWIND}\)-security of \(\textsf{HASH}\) to show that \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}\) does not leak any information about the \(\textsf{KDF}\) keys; we then use the \(\textrm{RKPRF}\)-security of \(\textsf{KDF}\) to show that the keys used for \(\textsf{SE}\) are uniformly random. Finally, we use the birthday bound to argue that the uniformly random values of \(\textsf{msg}{\_}\textsf{key}\) are unlikely to collide, and hence, the keys used for \(\textsf{SE}\) are also one-time.

Formally, we prove the following.

Theorem 1

Let \(\textsf{ME}\), \(\textsf{HASH}\), \(\textsf{MAC}\), \(\textsf{KDF}\), \(\phi _{\textsf{MAC}}\), \(\phi _{\textsf{KDF}}\), \(\textsf{SE}\) be any primitives that meet the requirements stated in Definition 5 of channel \(\textsf{MTP}\text {-}\textsf{CH} \). Let \(\textsf{CH}= \textsf{MTP}\text {-}\textsf{CH} [\textsf{ME}, \textsf{HASH}, \textsf{MAC}, \textsf{KDF}, \phi _{\textsf{MAC}}, \phi _{\textsf{KDF}}, \textsf{SE}]\). Let \(\mathcal {D}_{\textrm{IND}}\) be any adversary against the \(\textrm{IND}\)-security of \(\textsf{CH}\), making \(q_{\textsc {Ch}}\) queries to its \(\textsc {Ch}\) oracle. Then, we can build adversaries \(\mathcal {D}_{\textrm{OTWIND}}\), \(\mathcal {D}_{\textrm{RKPRF}}\), \(\mathcal {D}_{\textrm{ENCROB}}\), \(\mathcal {F}_{\textrm{UPREF}}\), \(\mathcal {D}_{\textrm{UPRKPRF}}\), \(\mathcal {D}_{\mathrm {OTIND\$}}\) such that

$$\begin{aligned} \textsf{Adv}^{\textsf{ind}}_{\textsf{CH}}(\mathcal {D}_{\textrm{IND}}) \le 2&\cdot \Big (\textsf{Adv}^{\textsf{otwind}}_{\textsf{HASH}}(\mathcal {D}_{\textrm{OTWIND}}) + \textsf{Adv}^{\textsf{rkprf}}_{\textsf{KDF}, \phi _{\textsf{KDF}}}(\mathcal {D}_{\textrm{RKPRF}}) \\&+ \textsf{Adv}^{\textsf{encrob}}_{\textsf{ME}}(\mathcal {D}_{\textrm{ENCROB}}) + \textsf{Adv}^{\textsf{upref}}_{\textsf{ME}}(\mathcal {F}_{\textrm{UPREF}}) \\&+ \textsf{Adv}^{\textsf{uprkprf}}_{\textsf{MAC}, \phi _{\textsf{MAC}}}(\mathcal {D}_{\textrm{UPRKPRF}}) + \frac{q_{\textsc {Ch}}\cdot (q_{\textsc {Ch}}- 1)}{2 \cdot 2^{\textsf{MAC}.\textsf{ol}}} \\&+ \textsf{Adv}^{\mathsf {otind\$}}_{\textsf{SE}}(\mathcal {D}_{\mathrm {OTIND\$}})\Big ). \\ \end{aligned}$$

Proof

This proof uses games \(\textrm{G}_0\)–\(\textrm{G}_3\) in Fig. 39 and \(\textrm{G}_4\)–\(\textrm{G}_8\) in Fig. 40, in which the code added for the transitions between games is highlighted. The adversaries for the transitions between games are referenced throughout the proof. Each constructed adversary simulates one or two subsequent games of the security reduction for adversary \(\mathcal {D}_{\textrm{IND}}\); the highlighted instructions mark the changes in the code of the simulated games.

\({\textbf{G}}_{0}\). Game \(\textrm{G}_0\) is equivalent to game \(\textrm{G}^{\textsf{ind}}_{\textsf{CH}, \mathcal {D}_{\textrm{IND}}}\). It expands the code of algorithms \(\mathsf {\textsf{CH}.Init}\), \(\mathsf {\textsf{CH}.Send}\) and \(\mathsf {\textsf{CH}.Recv}\); the expanded instructions are highlighted. It follows that

$$ \textsf{Adv}^{\textsf{ind}}_{\textsf{CH}}(\mathcal {D}_{\textrm{IND}}) = 2 \cdot \Pr [\textrm{G}_0] - 1. $$

\({{{\textbf{G}}}_{0}\rightarrow {{\textbf{G}}}_{1}}\). Note that the value of \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}\) depends on the raw \(\textsf{KDF}\) and \(\textsf{MAC}\) keys (i.e. \(\textit{kk}\) and \(\textit{mk}\)), and adversary \(\mathcal {D}_{\textrm{IND}}\) can learn it from any ciphertext returned by oracle \(\textsc {Ch}\). To invoke PRF-style security notions for either primitive in later steps, we appeal to the \(\textrm{OTWIND}\)-security of \(\textsf{HASH}\) (Fig. 25), which essentially guarantees that \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}\) leaks no information about the \(\textsf{KDF}\) and \(\textsf{MAC}\) keys. Game \(\textrm{G}_1\) is the same as game \(\textrm{G}_0\), except that \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}\leftarrow \textsf{HASH}.\textsf{Ev}(\textit{hk}, \cdot )\) is evaluated on a uniformly random string \(x\) rather than on \(\textit{kk}~\Vert ~\textit{mk}\). We claim that \(\mathcal {D}_{\textrm{IND}}\) cannot distinguish between these two games.

Fig. 36

Adversary \(\mathcal {D}_{\textrm{OTWIND}}\) against the \(\textrm{OTWIND}\)-security of \(\textsf{HASH}\) for the transition between games \(\textrm{G}_0\) and \(\textrm{G}_1\)

More formally, given \(\mathcal {D}_{\textrm{IND}}\), in Fig. 36 we define an adversary \(\mathcal {D}_{\textrm{OTWIND}}\) attacking the \(\textrm{OTWIND}\)-security of \(\textsf{HASH}\) as follows. According to the definition of game \(\textrm{G}^\textsf{otwind}_{\textsf{HASH}, \mathcal {D}_{\textrm{OTWIND}}}\), adversary \(\mathcal {D}_{\textrm{OTWIND}}\) takes \((x_0, x_1, \textsf{auth}{\_}\textsf{key}{\_}\textsf{id})\) as input. We define adversary \(\mathcal {D}_{\textrm{OTWIND}}\) to sample a challenge bit b, to parse \(\textit{kk}~\Vert ~\textit{mk}\leftarrow x_1\), and to subsequently use the obtained values of \(b, \textit{kk}, \textit{mk}, \textsf{auth}{\_}\textsf{key}{\_}\textsf{id}\) in order to simulate either of the games \(\textrm{G}_0\), \(\textrm{G}_1\) for adversary \(\mathcal {D}_{\textrm{IND}}\) (both games are equivalent from the moment these 4 values are chosen). If \(\mathcal {D}_{\textrm{IND}}\) guesses the challenge bit b, then we let adversary \(\mathcal {D}_{\textrm{OTWIND}}\) return 1; otherwise we let it return 0. Now let d be the challenge bit in game \(\textrm{G}^\textsf{otwind}_{\textsf{HASH}, \mathcal {D}_{\textrm{OTWIND}}}\), and let \(d'\) be the value returned by \(\mathcal {D}_{\textrm{OTWIND}}\). If \(d = 1\) then \(\mathcal {D}_{\textrm{OTWIND}}\) simulates game \(\textrm{G}_0\) for \(\mathcal {D}_{\textrm{IND}}\) (i.e. \(\textit{kk}\) and \(\textit{mk}\) are derived from the input to \(\textsf{HASH}.\textsf{Ev}(\textit{hk}, \cdot )\)), and otherwise it simulates game \(\textrm{G}_1\) (i.e. \(\textit{kk}\) and \(\textit{mk}\) are independent from the input to \(\textsf{HASH}.\textsf{Ev}(\textit{hk}, \cdot )\)). It follows that \(\Pr [\textrm{G}_0] = \Pr \left[ \,d' = 1\,|\,d = 1\, \right] \) and \(\Pr [\textrm{G}_1] = \Pr \left[ \,d' = 1\,|\,d = 0\, \right] \), and hence

$$ \Pr [\textrm{G}_0] - \Pr [\textrm{G}_1] = \textsf{Adv}^{\textsf{otwind}}_{\textsf{HASH}}(\mathcal {D}_{\textrm{OTWIND}}). $$

\({{{\textbf{G}}}_{1}\rightarrow {{\textbf{G}}}_{2}}\). In the transition between games \(\textrm{G}_1\) and \(\textrm{G}_2\) (Fig. 39), we use the \(\textrm{RKPRF}\)-security of \(\textsf{KDF}\) (Fig. 26) with respect to \(\phi _{\textsf{KDF}}\) in order to replace \(\textsf{KDF}.\textsf{Ev}(\textit{kk}_{\textit{u}}\), \(\textsf{msg}{\_}\textsf{key})\) with a uniformly random value from \(\{0,1\}^{\textsf{KDF}.\textsf{ol}}\) (and for consistency store the latter in \(\textsf{T}[\textit{u}, \textsf{msg}{\_}\textsf{key}]\)). Similarly to the above, in Fig. 37 we build an adversary \(\mathcal {D}_{\textrm{RKPRF}}\) attacking the \(\textrm{RKPRF}\)-security of \(\textsf{KDF}\) that simulates \(\textrm{G}_1\) or \(\textrm{G}_2\) for adversary \(\mathcal {D}_{\textrm{IND}}\), depending on the challenge bit in game \(\textrm{G}^\textsf{rkprf}_{\textsf{KDF}, \phi _{\textsf{KDF}}, \mathcal {D}_{\textrm{RKPRF}}}\). We have

$$ \Pr [\textrm{G}_1] - \Pr [\textrm{G}_2] = \textsf{Adv}^{\textsf{rkprf}}_{\textsf{KDF}, \phi _{\textsf{KDF}}}(\mathcal {D}_{\textrm{RKPRF}}). $$
Fig. 37

Adversary \(\mathcal {D}_{\textrm{RKPRF}}\) against the \(\textrm{RKPRF}\)-security of \(\textsf{KDF}\) for the transition between games \(\textrm{G}_1\) and \(\textrm{G}_2\)

Fig. 38

Adversary \(\mathcal {D}_{\textrm{ENCROB}}\) against the \(\textrm{ENCROB}\)-security of \(\textsf{ME}\) for the transition between games \(\textrm{G}_2\) and \(\textrm{G}_3\)

\({{{\textbf{G}}}_{2}\rightarrow {{\textbf{G}}}_{3}}\). We invoke the \(\textrm{ENCROB}\) property of \(\textsf{ME}\) (Fig. 34) to transition from \(\textrm{G}_2\) to \(\textrm{G}_3\) (Fig. 39). This property states that calls to \(\mathsf {\textsf{ME}.Decode}\) do not change \(\textsf{ME}\)’s state in a way that affects the payloads returned by any future calls to \(\mathsf {\textsf{ME}.Encode}\), allowing us to remove the \(\mathsf {\textsf{ME}.Decode}\) call from inside the oracle \(\textsc {Recv}\) in game \(\textrm{G}_3\). In Fig. 38 we build an adversary \(\mathcal {D}_{\textrm{ENCROB}}\) against \(\textrm{ENCROB}\) of \(\textsf{ME}\) that simulates either \(\textrm{G}_2\) or \(\textrm{G}_3\) for \(\mathcal {D}_{\textrm{IND}}\), depending on the challenge bit in game \(\textrm{G}^{\textsf{encrob}}_{\textsf{ME}, \mathcal {D}_{\textrm{ENCROB}}}\), such that

$$ \Pr [\textrm{G}_2] - \Pr [\textrm{G}_3] = \textsf{Adv}^{\textsf{encrob}}_{\textsf{ME}}(\mathcal {D}_{\textrm{ENCROB}}). $$
Fig. 39

Games \(\textrm{G}_0\)–\(\textrm{G}_3\) for the proof of Theorem 1. The code added by expanding the algorithms of \(\textsf{CH}\) in game \(\textrm{G}^{{\textrm{ind}}}_{\textsf{CH}, \mathcal {D}_{{\textrm{IND}}}}\) is highlighted

Fig. 40

Games \(\textrm{G}_4\)–\(\textrm{G}_8\) for the proof of Theorem 1. The highlighted code was rewritten in a way that is functionally equivalent to the corresponding code in \(\textrm{G}_3\)

\({{{\textbf{G}}}_{3}\rightarrow {{\textbf{G}}}_{4}}\). Game \(\textrm{G}_4\) (Fig. 40) differs from \(\textrm{G}_3\) (Fig. 39) in the following ways:

(1) The \(\textsf{KDF}\) keys \(\textit{kk}\), \(\textit{kk}_\mathcal {I}\), \(\textit{kk}_\mathcal {R}\) are no longer used in our reduction games starting from \(\textrm{G}_3\), so they are not included in game \(\textrm{G}_4\) and onwards.

(2) The calls to oracle \(\textsc {Recv}\) in game \(\textrm{G}_3\) no longer change the receiver’s channel state, so game \(\textrm{G}_4\) immediately returns \(\bot \) on every call to \(\textsc {Recv}\).

(3) Game \(\textrm{G}_4\) rewrites, in a functionally equivalent way, the initialisation and usage of values from the PRF table \(\textsf{T}\) inside oracle \(\textsc {Ch}\).

(4) Game \(\textrm{G}_4\) adds a set \(X_{\textit{u}}\), for each \(\textit{u}\in \{\mathcal {I}, \mathcal {R}\}\), that stores the 256-bit prefixes of payloads produced by calls to the specific user’s \(\textsc {Ch}\) oracle. Every time a new payload \(p\) is generated, the added code inside oracle \(\textsc {Ch}\) checks whether its prefix \(p[0:256]\) is already contained in \(X_{\textit{u}}\), which would mean that a previously seen payload had the same prefix. Then, regardless of whether this condition passes, the new prefix \(p[0:256]\) is added to \(X_{\textit{u}}\). We note that the output of oracle \(\textsc {Ch}\) in game \(\textrm{G}_4\) does not change depending on whether this condition passes or fails.

(5) Game \(\textrm{G}_4\) adds Boolean flags \(\textsf{bad}_0\) and \(\textsf{bad}_1\) that are set to \(\texttt {true}\) when the corresponding conditions inside oracle \(\textsc {Ch}\) are satisfied. These flags do not affect the functionality of the games and will only be used for the formal analysis that we provide below.

Both games are functionally equivalent, so

$$ \Pr [\textrm{G}_4] = \Pr [\textrm{G}_3]. $$

\({{{\textbf{G}}}_{4}\rightarrow {{\textbf{G}}}_{5}}\). The transition from game \(\textrm{G}_4\) to \(\textrm{G}_5\) replaces the value assigned to \(\textsf{msg}{\_}\textsf{key}\) when the newly added unique-prefixes condition is satisfied; the value of \(\textsf{msg}{\_}\textsf{key}\) changes from \(\textsf{MAC}.\textsf{Ev}(\textit{mk}_{\textit{u}}, p)\) to a uniformly random string from \(\{0,1\}^{\textsf{MAC}.\textsf{ol}}\). Games \(\textrm{G}_4\) and \(\textrm{G}_5\) are identical until \(\textsf{bad}_0\) is set. We have

$$ \Pr [\textrm{G}_4] - \Pr [\textrm{G}_5] \le \Pr [\textsf{bad}_0^{\textrm{G}_4}]. $$

The \(\textrm{UPREF}\) property of \(\textsf{ME}\) (Fig. 33) states that it is hard to find two payloads returned by \(\mathsf {\textsf{ME}.Encode}\) such that their 256-bit prefixes are the same; we use this property to upper-bound the probability of setting \(\textsf{bad}_0\) in game \(\textrm{G}_4\). In Fig. 41, we build an adversary \(\mathcal {F}_{\textrm{UPREF}}\) attacking the \(\textrm{UPREF}\) of \(\textsf{ME}\) that simulates game \(\textrm{G}_4\) for adversary \(\mathcal {D}_{\textrm{IND}}\). Every time \(\textsf{bad}_0\) is set in game \(\textrm{G}_4\), this corresponds to adversary \(\mathcal {F}_{\textrm{UPREF}}\) setting flag \(\textsf{win}\) to \(\texttt {true}\) in its own game \(\textrm{G}^{\textsf{upref}}_{\textsf{ME}, \mathcal {F}_{\textrm{UPREF}}}\). It follows that

$$\begin{aligned} \Pr [\textsf{bad}_0^{\textrm{G}_{4}}] \le \textsf{Adv}^{\textsf{upref}}_{\textsf{ME}}(\mathcal {F}_{\textrm{UPREF}}). \end{aligned}$$
Fig. 41

Adversary \(\mathcal {F}_{\textrm{UPREF}}\) against the \(\textrm{UPREF}\)-security of \(\textsf{ME}\) for the transition between games \(\textrm{G}_4\) and \(\textrm{G}_5\)

\({{{\textbf{G}}}_{5}\rightarrow {{\textbf{G}}}_{6}}\). We use the \(\textrm{UPRKPRF}\)-security of \(\textsf{MAC}\) (Fig. 28) with respect to \(\phi _{\textsf{MAC}}\) in order to replace the value of \(\textsf{msg}{\_}\textsf{key}\), changing it from \(\textsf{MAC}.\textsf{Ev}(\textit{mk}_{\textit{u}}, p)\) to a uniformly random value from \(\{0,1\}^{\textsf{MAC}.\textsf{ol}}\) in the transition from \(\textrm{G}_5\) to \(\textrm{G}_6\) (Fig. 40). Note that the notion of \(\textrm{UPRKPRF}\)-security only guarantees indistinguishability from random when \(\textsf{MAC}\) is evaluated on inputs with unique prefixes; games \(\textrm{G}_5, \textrm{G}_6\) ensure that this prerequisite is satisfied by only evaluating \(\textsf{MAC}\) if \(p[0:256] \not \in X_{\textit{u}}\). In Fig. 42, we build an adversary \(\mathcal {D}_{\textrm{UPRKPRF}}\) attacking the \(\textrm{UPRKPRF}\)-security of \(\textsf{MAC}\) that simulates \(\textrm{G}_5\) or \(\textrm{G}_6\) for adversary \(\mathcal {D}_{\textrm{IND}}\), depending on the challenge bit in game \(\textrm{G}^\textsf{uprkprf}_{\textsf{MAC}, \phi _{\textsf{MAC}}, \mathcal {D}_{\textrm{UPRKPRF}}}\). It follows that

$$ \Pr [\textrm{G}_5] - \Pr [\textrm{G}_6] = \textsf{Adv}^{\textsf{uprkprf}}_{\textsf{MAC}, \phi _{\textsf{MAC}}}(\mathcal {D}_{\textrm{UPRKPRF}}). $$
Fig. 42

Adversary \(\mathcal {D}_{\textrm{UPRKPRF}}\) against the \(\textrm{UPRKPRF}\)-security of \(\textsf{MAC}\) for the transition between games \(\textrm{G}_{5}\) and \(\textrm{G}_{6}\)

\({{{\textbf{G}}}_{6}\rightarrow {{\textbf{G}}}_{7}}\). Games \(\textrm{G}_6\) and \(\textrm{G}_7\) are identical until \(\textsf{bad}_1\) is set; as above, we have

$$ \Pr [\textrm{G}_{6}] - \Pr [\textrm{G}_{7}] \le \Pr [\textsf{bad}_1^{\textrm{G}_6}]. $$

The values of \(\textsf{msg}{\_}\textsf{key}\in \{0,1\}^{\textsf{MAC}.\textsf{ol}}\) in game \(\textrm{G}_6\) are sampled uniformly at random and independently across the \(q_{\textsc {Ch}}\) different calls to oracle \(\textsc {Send}\), so we can apply the birthday bound to claim the following:

$$\begin{aligned} \Pr [\textsf{bad}_1^{\textrm{G}_6}] \le \frac{q_{\textsc {Ch}}\cdot (q_{\textsc {Ch}}- 1)}{2 \cdot 2^{\textsf{MAC}.\textsf{ol}}}. \end{aligned}$$
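Since \(\textsf{msg}{\_}\textsf{key}\) in MTProto has \(\textsf{MAC}.\textsf{ol}= 128\) bits, this collision term is tiny for any realistic number of messages; an illustrative computation (the query counts are our own example values):

```python
# Birthday-bound term q*(q-1) / (2 * 2**MAC_ol) for uniformly random
# msg_key values; MAC.ol = 128 in MTProto.
from fractions import Fraction

def collision_bound(q: int, mac_ol: int = 128) -> Fraction:
    """Probability bound on a msg_key collision among q random values."""
    return Fraction(q * (q - 1), 2 * 2**mac_ol)

# Two messages collide with probability exactly 2**-128:
assert collision_bound(2) == Fraction(1, 2**128)
# Even 2**40 sent messages leave the term below 2**-49:
assert collision_bound(2**40) < Fraction(1, 2**49)
```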

\({{{\textbf{G}}}_{7}\rightarrow {{\textbf{G}}}_{8}}\). In the transition from \(\textrm{G}_7\) to \(\textrm{G}_8\) (Fig. 40), we replace the value of ciphertext \(c_{\textit{se}}\), changing it from \(\mathsf {\textsf{SE}.Enc}(k, p)\) to a uniformly random value from \(\{0,1\}^{\mathsf {\textsf{SE}.cl}(\mathsf {\textsf{ME}.pl}{(\left| m_b\right| , r)})}\), by appealing to the \(\mathrm {OTIND\$}\)-security of \(\textsf{SE}\) (Fig. 4). Recall that \(\mathsf {\textsf{ME}.pl}{(\left| m_b\right| , r)}\) is the length of the payload \(p\) produced by calling \(\mathsf {\textsf{ME}.Encode}\) on any message of length \(\left| m_b\right| \) with random coins \(r\), whereas \(\mathsf {\textsf{SE}.cl}(\cdot )\) maps the payload length to the resulting ciphertext length when encrypted with \(\textsf{SE}\). In Fig. 43, we build an adversary \(\mathcal {D}_{\mathrm {OTIND\$}}\) attacking the \(\mathrm {OTIND\$}\)-security of \(\textsf{SE}\) that simulates \(\textrm{G}_7\) or \(\textrm{G}_8\) for adversary \(\mathcal {D}_{\textrm{IND}}\), depending on the challenge bit in game \(\textrm{G}^{\mathsf {otind\$}}_{\textsf{SE}, \mathcal {D}_{\mathrm {OTIND\$}}}\). It follows that

$$ \Pr [\textrm{G}_7] - \Pr [\textrm{G}_8] = \textsf{Adv}^{\mathsf {otind\$}}_{\textsf{SE}}(\mathcal {D}_{\mathrm {OTIND\$}}). $$
Fig. 43

Adversary \(\mathcal {D}_{\mathrm {OTIND\$}}\) against the \(\mathrm {OTIND\$}\)-security of \(\textsf{SE}\) for the transition between games \(\textrm{G}_{7}\) and \(\textrm{G}_{8}\)

Finally, the output of oracle \(\textsc {Ch}\) in game \(\textrm{G}_8\) no longer depends on the challenge bit b, so we have

$$ \Pr [\textrm{G}_8] = \frac{1}{2}. $$

The theorem statement follows. \(\square \)

5.5.1 Proof alternatives

Our security reduction relies on the \(\textrm{RKPRF}\)-security of \(\textsf{KDF}\) with respect to \(\phi _{\textsf{KDF}}\). We note that it would suffice to instead define and use a related-key weak-PRF notion here. It could be used in the penultimate step of this security reduction: right before appealing to the \(\mathrm {OTIND\$}\)-security of \(\textsf{SE}\).

Further, in this security reduction we consider a generic function family \(\textsf{MAC}\) and rely on it being related-key PRF-secure with respect to unique-prefix inputs. Recall that MTProto uses \(\textsf{MAC}= \textsf{MTP}\text {-}\textsf{MAC}\) such that \(\textsf{MTP}\text {-}\textsf{MAC}.\textsf{Ev}(\textit{mk}_\textit{u}\), \(p)\) \(=\) \(\textsf{SHA}-\textsf{256}(\textit{mk}_\textit{u}~\Vert ~p){[64:192]}\). It discards half of the \(\textsf{SHA}-\textsf{256}\) output bits, so we could alternatively model it as an instance of Augmented MAC (AMAC) and prove it to be related-key PRF-secure based on [10]. However, using the results from [10] would have required us to show that the \(\textsf{SHA}-\textsf{256}\) compression function is a secure PRF when half of its key is leaked to the adversary. We achieve a simpler and tighter security reduction by relying on the unique-prefix property of \(\textsf{ME}\) that is already guaranteed in MTProto.
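The \(\textsf{MTP}\text {-}\textsf{MAC}\) computation described above, taking the middle 128 bits of the \(\textsf{SHA}-\textsf{256}\) digest, can be sketched with the standard library as follows (the key and payload values are placeholders of our own choosing):

```python
# Sketch of MTP-MAC: msg_key = SHA-256(mk_u || p)[64:192], i.e. the
# middle 128 bits (bytes 8..24) of the 256-bit digest.
import hashlib

def mtp_mac(mk: bytes, payload: bytes) -> bytes:
    digest = hashlib.sha256(mk + payload).digest()
    return digest[8:24]  # bits 64..192 of the output

mk = b"\x01" * 32        # placeholder 256-bit MAC key
tag = mtp_mac(mk, b"payload")
assert len(tag) == 16    # 128-bit msg_key
```

The slice makes explicit that half of the digest bits are discarded, which is what motivates the AMAC-style modelling discussed above.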

5.6 \(\textrm{INT}\)-security of \(\textsf{MTP}\text {-}\textsf{CH}\)

The first half of our integrity proof shows that it is hard to forge ciphertexts; to justify this, we rely on the security properties of the cryptographic primitives used to build the channel \(\textsf{MTP}\text {-}\textsf{CH} \) (i.e. \(\textsf{HASH}\), \(\textsf{KDF}\), \(\textsf{SE}\), and \(\textsf{MAC}\)). Once ciphertext forgery is ruled out, we are guaranteed that \(\textsf{MTP}\text {-}\textsf{CH} \) broadly matches the intuition of an authenticated channel: it prevents an attacker from modifying or creating its own ciphertexts, but still allows the attacker to intercept and subsequently replay, reorder or drop honestly produced ciphertexts. So in the second part of the proof we show that the message encoding scheme \(\textsf{ME}\) appropriately resolves all of the possible adversarial interaction with an authenticated channel; formally, we require that it behaves according to the requirements specified by the support function \(\textsf{supp}= \textsf{supp}\text {-}\textsf{ord}\). Our main result is then:

Theorem 2

Let \(\textsf {session}\_\textsf {id}\in \{0,1\}^{64}\), \(\textsf {pb} \in {{\mathbb {N}}}\), and \(\textsf{bl}= 128\). Let \(\textsf{ME}= \textsf{MTP}\text {-}\textsf{ME}[\textsf {session}\_\textsf {id}, \textsf {pb}, \textsf{bl}]\) be the message encoding scheme as defined in Definition 6. Let \(\textsf{SE}= \textsf{MTP}\text {-}\textsf{SE}\) be the deterministic symmetric encryption scheme as defined in Definition 10. Let \(\textsf{HASH}\), \(\textsf{MAC}\), \(\textsf{KDF}\), \(\phi _{\textsf{MAC}}\), \(\phi _{\textsf{KDF}}\) be any primitives that, together with \(\textsf{ME}\) and \(\textsf{SE}\), meet the requirements stated in Definition 5 of channel \(\textsf{MTP}\text {-}\textsf{CH} \). Let \(\textsf{CH}= \textsf{MTP}\text {-}\textsf{CH} [\textsf{ME}, \textsf{HASH}, \textsf{MAC}, \textsf{KDF}, \phi _{\textsf{MAC}}, \phi _{\textsf{KDF}}, \textsf{SE}]\). Let \(\textsf{supp}= \textsf{supp}\text {-}\textsf{ord}\) be the support function as defined in Fig. 32. Let \(\mathcal {F}_{\textrm{INT}}\) be any adversary against the \(\textrm{INT}\)-security of \(\textsf{CH}\) with respect to \(\textsf{supp}\). Then, we can build adversaries \(\mathcal {D}_{\textrm{OTWIND}}\), \(\mathcal {D}_{\textrm{RKPRF}}\), \(\mathcal {F}_{\textrm{UNPRED}}\), \(\mathcal {F}_{\textrm{RKCR}}\), \(\mathcal {F}_{\textrm{EINT}}\) such that

$$\begin{aligned} \textsf{Adv}^{\textsf{int}}_{\textsf{CH}, \textsf{supp}}(\mathcal {F}_{\textrm{INT}})&\le \textsf{Adv}^{\textsf{otwind}}_{\textsf{HASH}}(\mathcal {D}_{\textrm{OTWIND}}) + \textsf{Adv}^{\textsf{rkprf}}_{\textsf{KDF}, \phi _{\textsf{KDF}}}(\mathcal {D}_{\textrm{RKPRF}})\\&\quad + \textsf{Adv}^{\textsf{unpred}}_{\textsf{SE}, \textsf{ME}}(\mathcal {F}_{\textrm{UNPRED}}) + \textsf{Adv}^{\textsf{rkcr}}_{\textsf{MAC}, \phi _{\textsf{MAC}}}(\mathcal {F}_{\textrm{RKCR}}) \\&\quad + \textsf{Adv}^{\textsf{eint}}_{\textsf{ME}, \textsf{supp}}(\mathcal {F}_{\textrm{EINT}}). \end{aligned}$$

Before giving the detailed proof, we provide some discussion of our approach and a high-level overview of the different parts of the proof.

5.6.1 Invisible terms based on correctness of \(\textsf{ME}\), \(\textsf{SE}\), \(\textsf{supp}\)

We state and prove our \(\textrm{INT}\)-security claim for channel \(\textsf{MTP}\text {-}\textsf{CH} \) with respect to fixed choices of MTProto-based constructions \(\textsf{ME}= \textsf{MTP}\text {-}\textsf{ME}\) (Definition 6) and \(\textsf{SE}= \textsf{MTP}\text {-}\textsf{SE}\) (Definition 10), and with respect to the support function \(\textsf{supp}= \textsf{supp}\text {-}\textsf{ord}\) that is defined in Fig. 32. Our security reduction relies on six correctness-style properties of these primitives: one for \(\textsf{ME}\), two for \(\textsf{SE}\), three for \(\textsf{supp}\). Each of them can be observed to be always true for the corresponding scheme and hence does not contribute an additional term to the advantage statement in Theorem 2. These properties are also simple enough that we chose not to define them in a game-based style (the one we require from \(\textsf{ME}\) is distinct from, and simpler than, the encoding correctness notion that we defined in Sect. 3.5). Our security reduction nonetheless introduces and justifies a game hop for each of these properties. This necessitates the use of 14 security reduction games to prove Theorem 2, including some that are meant to be equivalent by observation (i.e. the corresponding game transitions do not rely on any correctness or security properties). However, some of the reduction steps require a detailed analysis.

Theorem 2 could be stated in a more general way, fully formalising the aforementioned correctness notions and phrasing our claims with respect to any \(\textsf{SE}\), \(\textsf{ME}\), \(\textsf{supp}\). We lose this generality by instantiating these primitives. Our motivation is twofold. On the one hand, we state our claims in a way that highlights the parts of MTProto (as captured by our specification) that are critical for its security analysis, and avoid spending too much attention on parts of the reduction that can be “taken for granted”. On the other hand, our work studies MTProto, and the abstractions that we use are meant to simplify and aid this analysis. We discourage the reader from treating \(\textsf{MTP}\text {-}\textsf{CH} \) in a prescriptive way, e.g. from trying to instantiate it with different primitives to build a secure channel, since standard, well-studied cryptographic protocols such as TLS already exist.

5.6.2 Proof phase I: Forging a ciphertext is hard

Let \(\mathcal {F}_{\textrm{INT}}\) be an adversary playing in the \(\textrm{INT}\)-security game against channel \(\textsf{MTP}\text {-}\textsf{CH} \). Consider an arbitrary call made by \(\mathcal {F}_{\textrm{INT}}\) to its oracle \(\textsc {Recv}\) on inputs \(\textit{u}, c, \textit{aux}\) such that \(c = (\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}', \textsf{msg}{\_}\textsf{key}, c_{\textit{se}})\). The oracle evaluates \(\mathsf {\textsf{MTP}\text {-}\textsf{CH}.Recv}(\textit{st}_{\textit{u}}, c, \textit{aux})\). Recall that \(\mathsf {\textsf{MTP}\text {-}\textsf{CH}.Recv}\) attempts to verify \(\textsf{msg}{\_}\textsf{key}\) by checking whether \(\textsf{msg}{\_}\textsf{key} = \textsf{MAC}.\textsf{Ev}(\textit{mk}_{\overline{\textit{u}}}, p)\) for an appropriately recovered payload \(p\) (i.e. \(k \leftarrow \textsf{KDF}.\textsf{Ev}(\textit{kk}_{\overline{\textit{u}}}, \textsf{msg}{\_}\textsf{key})\) and \(p\leftarrow \mathsf {\textsf{SE}.Dec}(k, c_{\textit{se}})\)). If this \(\textsf{msg}{\_}\textsf{key}\) verification passes (and if \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}' = \textsf{auth}{\_}\textsf{key}{\_}\textsf{id}\)), then \(\mathsf {\textsf{MTP}\text {-}\textsf{CH}.Recv}\) attempts to decode the payload by computing \((\textit{st}_{\textsf{ME}, \textit{u}}, m) \leftarrow \mathsf {\textsf{ME}.Decode}(\textit{st}_{\textsf{ME}, \textit{u}}, p, \textit{aux})\).
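To make the control flow of \(\mathsf {\textsf{MTP}\text {-}\textsf{CH}.Recv}\) concrete, the following Python sketch mirrors the verification order described above. The primitives here (a truncated-hash MAC, a hash-based KDF, an XOR-stream SE) are hypothetical toy stand-ins, not the actual \(\textsf{MTP}\text {-}\textsf{MAC}\), \(\textsf{MTP}\text {-}\textsf{KDF}\) or \(\textsf{MTP}\text {-}\textsf{SE}\) of Definitions 5, 6 and 10:

```python
import hashlib

# Toy stand-ins for MAC, KDF and SE; only the control flow of Recv is mirrored.
def mac_ev(mk: bytes, p: bytes) -> bytes:
    return hashlib.sha256(mk + p).digest()[8:24]          # a 128-bit msg_key

def kdf_ev(kk: bytes, msg_key: bytes) -> bytes:
    return hashlib.sha256(kk + msg_key).digest()          # a 256-bit SE key

def se_enc(k: bytes, p: bytes) -> bytes:                  # toy XOR-stream SE
    stream = hashlib.sha256(k).digest() * (len(p) // 32 + 1)
    return bytes(a ^ b for a, b in zip(p, stream))

se_dec = se_enc  # XOR stream: decryption is the same operation

def recv(mk: bytes, kk: bytes, auth_key_id: bytes, c):
    """Mirror MTP-CH.Recv: derive k, recover p, verify msg_key, then decode."""
    auth_key_id_prime, msg_key, c_se = c
    k = kdf_ev(kk, msg_key)
    p = se_dec(k, c_se)
    if auth_key_id_prime != auth_key_id or mac_ev(mk, p) != msg_key:
        return None        # verification failed; nothing is released
    return p               # in MTP-CH, p would now be passed to ME.Decode

mk, kk, akid = b"m" * 32, b"k" * 32, b"id-" * 4           # hypothetical keys
p = b"payload-block-0000"
msg_key = mac_ev(mk, p)
c = (akid, msg_key, se_enc(kdf_ev(kk, msg_key), p))
assert recv(mk, kk, akid, c) == p                          # honest ciphertext
assert recv(mk, kk, akid, (akid, bytes(16), c[2])) is None  # bad msg_key
```

The point of the sketch is that \(\textsf{msg}{\_}\textsf{key}\) binds the \(\textsf{SE}\) ciphertext to the payload it decrypts to; Cases A and B below analyse exactly this check.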

We consider two cases, and claim the following. (A) If \(\textsf{msg}{\_}\textsf{key}\) was not previously returned by oracle \(\textsc {Send}\) as a part of any ciphertext sent by user \(\overline{\textit{u}}\), then with high probability an evaluation of \(\mathsf {\textsf{ME}.Decode}(\textit{st}_{\textsf{ME}, \textit{u}}, p, \textit{aux})\) would return \(m = \bot \) regardless of whether the \(\textsf{msg}{\_}\textsf{key}\) verification passed or failed; so in this case we are not concerned with assessing the likelihood that the \(\textsf{msg}{\_}\textsf{key}\) verification passes. (B) If \(\textsf{msg}{\_}\textsf{key}\) was previously returned by oracle \(\textsc {Send}\) as a part of some ciphertext \(c' = (\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}, \textsf{msg}{\_}\textsf{key}, c_{\textit{se}}')\) sent by user \(\overline{\textit{u}}\), and if \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}= \textsf{auth}{\_}\textsf{key}{\_}\textsf{id}'\), then with high probability \(c_{\textit{se}}= c_{\textit{se}}'\) (and hence \(c = c'\)) whenever the \(\textsf{msg}{\_}\textsf{key}\) verification passes. We now justify both claims.

5.6.3 Case A. Assume \(\textsf{msg}{\_}\textsf{key}\) is fresh

Our analysis of this case will rely on a property of the symmetric encryption scheme \(\textsf{SE}\) and will require that its key k is chosen uniformly at random. Thus, we begin by invoking the \(\textrm{OTWIND}\)-security of \(\textsf{HASH}\) and the \(\textrm{RKPRF}\)-security of \(\textsf{KDF}\) in order to claim that the output of \(\textsf{KDF}.\textsf{Ev}(\textit{kk}_{\overline{\textit{u}}}, \textsf{msg}{\_}\textsf{key})\) is indistinguishable from random; this mirrors the first two steps of the \(\textrm{IND}\)-security reduction of \(\textsf{MTP}\text {-}\textsf{CH} \). We formalise this by requiring that \(\textsf{KDF}.\textsf{Ev}(\textit{kk}_{\overline{\textit{u}}}, \textsf{msg}{\_}\textsf{key})\) is indistinguishable from a uniformly random value stored in the PRF table’s entry \(\textsf{T}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}]\).

Our analysis of Case A now reduces roughly to the following: we need to show that it is hard to find any \(\textsf{SE}\) ciphertext \(c_{\textit{se}}\) such that its decryption \(p\) under a uniformly random key k has a non-negligible chance of being successfully decoded by \(\mathsf {\textsf{ME}.Decode}\) (i.e. returning \(m\ne \bot \)). As a part of this experiment, the adversary is allowed to query many different values of \(\textsf{msg}{\_}\textsf{key}\) and \(c_{\textit{se}}\) (recall that an \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertext contains both). At this point, the \(\textsf{msg}{\_}\textsf{key}\) is only used to select a uniformly random \(\textsf{SE}\) key k from \(\textsf{T}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}]\), but the adversary can reuse the same key k in combination with many different choices of \(c_{\textit{se}}\). The Case A assumption that \(\textsf{msg}{\_}\textsf{key}\) is “fresh” means that the \(\textsf{msg}{\_}\textsf{key}\) was not seen during previous calls to the \(\textsc {Send}\) oracle, so the adversary has no additional leakage on key k. All of the above is captured by the notion of \(\textsf{SE}\)’s unpredictability (\(\textrm{UNPRED}\)) with respect to \(\textsf{ME}\) (Sect. 5.3).

The \(\textrm{UNPRED}\)-security of \(\textsf{SE}, \textsf{ME}\) can be trivially broken if \(\mathsf {\textsf{ME}.Decode}\) is defined in a way that it successfully decodes every possible payload \(p\in \mathsf {\textsf{ME}.Out}\). It can also be trivially broken for contrived examples of \(\textsf{SE}\) like the one defining \(\forall k\in \{0,1\}^{\mathsf {\textsf{SE}.kl}}, \forall x\in \mathsf {\textsf{SE}.MS}:(\mathsf {\textsf{SE}.Enc}(k, x) = x) \wedge (\mathsf {\textsf{SE}.Dec}(k, x) = x)\), assuming that \(\mathsf {\textsf{ME}.Decode}\) can successfully decode even a single payload \(p\) from \(\mathsf {\textsf{SE}.MS}\). But the more structure \(\mathsf {\textsf{ME}.Decode}\) requires from its input \(p\), and the more “unpredictable” the decryption algorithm \(\mathsf {\textsf{SE}.Dec}(k, \cdot )\) is for a uniformly random k, the harder it is to break the \(\textrm{UNPRED}\)-security of \(\textsf{SE}, \textsf{ME}\). We note that \(\textsf{MTP}\text {-}\textsf{ME}\) requires every \(p\) to contain a constant \(\textsf {session}\_\textsf {id}\in \{0,1\}^{64}\) in the second half of its first 128-bit block, whereas \(\textsf{MTP}\text {-}\textsf{SE}\) implements the IGE block cipher mode of operation. In Appendix E.6, we show that the output \(p\) of \(\mathsf {\textsf{MTP}\text {-}\textsf{SE}.Dec}\) is highly unlikely to contain \(\textsf {session}\_\textsf {id}\) at the necessary position, i.e. if \(\mathcal {F}_{\textrm{INT}}\) makes \(q_{\textsc {Send}}\) queries to its \(\textsc {Send}\) oracle then it can find such \(p\) with probability at most \(q_{\textsc {Send}}/2^{64}\). In Appendix E.6, we also discuss the possibility of improving this bound.
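As a small illustration of why the \(\textsf {session}\_\textsf {id}\) check thwarts Case A, consider the following sketch; the payload layout is simplified to just an 8-byte prefix followed by \(\textsf {session}\_\textsf {id}\) and the body, whereas the real \(\textsf{MTP}\text {-}\textsf{ME}\) header contains further fields and checks:

```python
SESSION_ID = b"\x11" * 8   # a hypothetical fixed 64-bit session_id

def header_check(p: bytes) -> bool:
    """Sketch of the MTP-ME.Decode sanity check: session_id must occupy the
    second half of the payload's first 128-bit block (bytes 8..15)."""
    return len(p) >= 16 and p[8:16] == SESSION_ID

honest = b"\x00" * 8 + SESSION_ID + b"message body"    # prefix || sid || body
forged = b"\x00" * 8 + b"\x22" * 8 + b"message body"
assert header_check(honest)
assert not header_check(forged)

# A payload obtained by decrypting under a fresh uniformly random key behaves
# like a random string, so its 8-byte slot matches SESSION_ID with probability
# 2^-64; over q_Send tries this matches the q_Send / 2^64 bound claimed above.
```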

5.6.4 Case B. Assume \(\textsf{msg}{\_}\textsf{key}\) is reused

In this case, we know that adversary \(\mathcal {F}_{\textrm{INT}}\) previously called its \(\textsc {Send}\) oracle on inputs \(\overline{\textit{u}}, m', \textit{aux}', r'\) for some \(m', \textit{aux}', r'\), and received back a ciphertext \(c' = (\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}, \textsf{msg}{\_}\textsf{key}', c_{\textit{se}}')\) such that \(\textsf{msg}{\_}\textsf{key}' = \textsf{msg}{\_}\textsf{key}\). Let \(p'\) be the payload that was built and used inside this oracle call. Recall that we are currently considering \(\mathcal {F}_{\textrm{INT}}\)’s ongoing call to its oracle \(\textsc {Recv}\) on inputs \(\textit{u}, c, \textit{aux}\) such that \(c = (\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}', \textsf{msg}{\_}\textsf{key}, c_{\textit{se}})\); we are only interested in the event that the \(\textsf{msg}{\_}\textsf{key}\) verification passed (and that \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}= \textsf{auth}{\_}\textsf{key}{\_}\textsf{id}'\)), meaning that \(\textsf{msg}{\_}\textsf{key}= \textsf{MAC}.\textsf{Ev}(\textit{mk}_{\overline{\textit{u}}}, p)\) holds for an appropriately recovered \(p\).

It follows that \(\textsf{MAC}.\textsf{Ev}(\textit{mk}_{\overline{\textit{u}}}, p') = \textsf{MAC}.\textsf{Ev}(\textit{mk}_{\overline{\textit{u}}}, p)\). If \(p' \ne p\), then this breaks the \(\textrm{RKCR}\)-security of \(\textsf{MAC}\). Recall that MTProto instantiates \(\textsf{MAC}\) with \(\textsf{MTP}\text {-}\textsf{MAC}\) where \(\textsf{MTP}\text {-}\textsf{MAC}.\textsf{Ev}(\textit{mk}_\textit{u}, p) = \textsf{SHA}-\textsf{256}(\textit{mk}_\textit{u}~\Vert ~p){[64:192]}\). So this attack against \(\textsf{MAC}\) reduces to breaking some variant of \(\textsf{SHA}-\textsf{256}\)’s collision resistance that restricts the set of allowed inputs but only requires finding a collision in a 128-bit fragment of the output.
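The truncation in \(\textsf{MTP}\text {-}\textsf{MAC}\) can be written out directly: bit range \([64:192]\) of the 256-bit digest corresponds to bytes 8 through 23. A minimal sketch (the key and message values are arbitrary):

```python
import hashlib

def mtp_mac(mk: bytes, p: bytes) -> bytes:
    # msg_key = SHA-256(mk || p)[64:192]: the middle 128 bits of the digest,
    # i.e. bytes 8..23 of the 32-byte SHA-256 output.
    return hashlib.sha256(mk + p).digest()[8:24]

mk, p = b"\xaa" * 32, b"payload"
tag = mtp_mac(mk, p)
assert len(tag) == 16                        # a 128-bit msg_key
assert tag == hashlib.sha256(mk + p).digest()[8:24]
# An RKCR break is p != p' with mtp_mac(mk, p) == mtp_mac(mk, p'): a collision
# in this truncated, prefix-keyed variant of SHA-256.
```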

Based on the above, we obtain \((\textsf{msg}{\_}\textsf{key}', p') = (\textsf{msg}{\_}\textsf{key}, p)\). Let \(k = \textsf{KDF}.\textsf{Ev}(\textit{kk}_{\overline{\textit{u}}}, \textsf{msg}{\_}\textsf{key})\). Note that \(c_{\textit{se}}' \leftarrow \mathsf {\textsf{SE}.Enc}(k, p')\) was computed during the \(\textsc {Send}\) call, and \(p\leftarrow \mathsf {\textsf{SE}.Dec}(k, c_{\textit{se}})\) was computed during the ongoing \(\textsc {Recv}\) call. The equality \(p' = p\) implies \(c_{\textit{se}}' = c_{\textit{se}}\) if \(\textsf{SE}\) guarantees that for any key k, the algorithms of \(\textsf{SE}\) match every message \(p\in \mathsf {\textsf{SE}.MS}\) with a unique ciphertext \(c_{\textit{se}}\). When this condition holds, we say that \(\textsf{SE}\) has unique ciphertexts. We note that \(\textsf{MTP}\text {-}\textsf{SE}\) satisfies this property; it follows that \(c_{\textit{se}}' = c_{\textit{se}}\) and therefore the \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertext c that was queried to \(\textsc {Recv}\) (for user \(\textit{u}\)) is equal to the ciphertext \(c'\) that was previously returned by \(\textsc {Send}\) (by user \(\overline{\textit{u}}\)). Implicit in this argument is an assumption that \(\textsf{SE}\) has the decryption correctness property; \(\textsf{MTP}\text {-}\textsf{SE}\) satisfies this property as well.
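The unique-ciphertexts property is easy to see on a toy deterministic scheme. The scheme below is a hypothetical one-byte example (not the IGE mode of \(\textsf{MTP}\text {-}\textsf{SE}\)), chosen only because injectivity can be checked exhaustively:

```python
def toy_enc(k: int, p: int) -> int:      # a keyed permutation of {0, ..., 255}
    return (p + k) % 256

def toy_dec(k: int, c: int) -> int:
    return (c - k) % 256

k = 42
# Decryption correctness: Dec(k, Enc(k, p)) = p for every message p.
assert all(toy_dec(k, toy_enc(k, p)) == p for p in range(256))
# Unique ciphertexts: Enc(k, .) is injective, so each message corresponds to
# exactly one ciphertext; hence p = p' forces the ciphertexts to coincide,
# which is the step used in Case B.
assert len({toy_enc(k, p) for p in range(256)}) == 256
```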

5.6.5 Proof phase II: \(\textsf{MTP}\text {-}\textsf{CH} \) acts as an authenticated channel

We can rewrite the claims we stated and justified in the first phase of the proof as follows. When adversary \(\mathcal {F}_{\textrm{INT}}\) queries its oracle \(\textsc {Recv}\) on inputs \(\textit{u}, c, \textit{aux}\), the channel decrypts c to \(m = \bot \) with high probability, unless c was honestly returned in response to \(\mathcal {F}_{\textrm{INT}}\)’s prior call to \(\textsc {Send}(\overline{\textit{u}}, \ldots )\), meaning \(\exists m', \textit{aux}' :(\textsf{sent}, m', c, \textit{aux}') \in \textsf{tr}_{\overline{\textit{u}}}\). Furthermore, we claim that the channel’s state \(\textit{st}_{\textit{u}}\) of user \(\textit{u}\) does not change when \(\mathcal {F}_{\textrm{INT}}\) queries its oracle \(\textsc {Recv}\) on inputs \(\textit{u}, c, \textit{aux}\) that get decrypted to \(m = \bot \). This could only happen in Case A above, assuming that the \(\textsf{msg}{\_}\textsf{key}\) verification succeeds but then the \(\mathsf {\textsf{ME}.Decode}\) call returns \(m = \bot \) and changes the message encoding scheme’s state \(\textit{st}_{\textsf{ME}, \textit{u}}\) of user \(\textit{u}\). We note that \(\textsf{MTP}\text {-}\textsf{ME}\) never updates \(\textit{st}_{\textsf{ME}, \textit{u}}\) when decoding fails, and hence, it satisfies this requirement.
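The state-handling requirement on \(\mathsf {\textsf{ME}.Decode}\) can be phrased as a simple contract. The sketch below uses a hypothetical one-counter state; the actual state of \(\textsf{MTP}\text {-}\textsf{ME}\) is richer, but it obeys the same rule:

```python
def decode(st: int, p, expected):
    """Contract from the proof: on failure, return m = None (i.e. bot) and
    leave the decoding state unchanged, as MTP-ME does."""
    if p != expected:                # stand-in for MTP-ME's payload checks
        return st, None
    return st + 1, p                 # success: the state may advance

st = 7
assert decode(st, "bad", "good") == (7, None)     # failed decode: state kept
assert decode(st, "good", "good") == (8, "good")  # success: state advances
```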

We now know that oracle \(\textsc {Recv}\) accepts only honestly forwarded ciphertexts from the opposite user and that it never changes the channel’s state otherwise. This allows us to rewrite the \(\textrm{INT}\)-security game to ignore all cryptographic algorithms in the \(\textsc {Recv}\) oracle. More specifically, oracle \(\textsc {Recv}\) can use the opposite user’s transcript to check which ciphertexts were produced honestly, and simply reject the ones that are not on this transcript. For each ciphertext c that is on the transcript, the game can maintain a table that maps it to the payload \(p\) that was used to generate it; oracle \(\textsc {Recv}\) can fetch this payload and immediately call \(\mathsf {\textsf{ME}.Decode}\) to decode it.
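The rewritten \(\textsc {Recv}\) oracle thus reduces to a transcript lookup. A sketch of this bookkeeping (the table and variable names are ours, not the game pseudocode's):

```python
sent_payload = {}   # maps each honestly produced ciphertext c to its payload p
transcript = []     # the opposite user's transcript of honest Send calls

def send(c, p):
    sent_payload[c] = p
    transcript.append(("sent", p, c))

def recv(c, decode):
    if c not in sent_payload:        # not honestly produced: reject outright
        return None
    return decode(sent_payload[c])   # fetch p and call ME.Decode directly

send("c1", "p1")
assert recv("c1", str.upper) == "P1"          # honest forward is decoded
assert recv("forged", str.upper) is None      # anything else is rejected
```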

5.6.6 Proof phase III: Interaction between \(\textsf{ME}\) and \(\textsf{supp}\)

By now, we have transformed our \(\textrm{INT}\)-security game to an extent that it roughly captures the requirement that the behaviour of \(\textsf{ME}\) should match that of \(\textsf{supp}\) (i.e. adversary \(\mathcal {F}_{\textrm{INT}}\) wins the game iff the message m recovered by \(\mathsf {\textsf{ME}.Decode}\) inside oracle \(\textsc {Recv}\) is not equal to the corresponding output \(m^*\) of \(\textsf{supp}\)). However, the support function \(\textsf{supp}\) uses the \(\textsf{MTP}\text {-}\textsf{CH} \) encryption c of payload \(p\) as its label, and it is not necessarily clear what information about c can or should be used to define the behaviour of \(\textsf{supp}\). In order to simplify the security game we have arrived at, we rely on three correctness-style notions as follows:

  1. Integrity of a support function requires that the support function returns \(m^*= \bot \) when it is called on a ciphertext that cannot be found in the opposite user’s transcript \(\textsf{tr}_{\overline{\textit{u}}}\).

  2. Robustness of a support function requires that adding failed decryption events (i.e. \(m = \bot \)) to a transcript does not affect the future outputs of \(\textsf{supp}\) on any inputs.

  3. We also rely on a property requiring that a support function uses no information about its labels beyond their equality pattern, separately for either direction of communication (i.e. \(\textit{u}\rightarrow \overline{\textit{u}}\) and \(\overline{\textit{u}}\rightarrow \textit{u}\)).

For the last property, we observe that in our game \(p_0 = p_1\) iff the corresponding \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertexts are also equal. This allows us to switch from using ciphertexts to using payloads as the labels for \(\textsf{supp}\), and simultaneously change the transcripts to also store payloads instead of ciphertexts. Our theorem is stated with respect to \(\textsf{supp}= \textsf{supp}\text {-}\textsf{ord}\), which satisfies all three of the above properties.

The introduced properties of a support function allow us to further simplify the \(\textrm{INT}\)-security game. In particular, they let us remove the corner case that deals with \(\textsc {Recv}\) being queried on an invalid ciphertext (i.e. one that was not honestly forwarded). Finally, this lets us reduce our latest version of the \(\textrm{INT}\)-security game for \(\textsf{MTP}\text {-}\textsf{CH} \) to the encoding integrity (\(\textrm{EINT}\)) property of \(\textsf{ME}, \textsf{supp}\) (see Fig. 16), which is defined to match \(\textsf{ME}\) against \(\textsf{supp}\) in the presence of adversarial behaviour on an authenticated channel that exchanges \(\textsf{ME}\) payloads between two users. In Appendix E.5, we show that this property holds for \(\textsf{MTP}\text {-}\textsf{ME}\) with respect to \(\textsf{supp}\text {-}\textsf{ord}\).

Proof of Theorem 2

This proof uses games \(\textrm{G}_0\)–\(\textrm{G}_2\) in Fig. 44, games \(\textrm{G}_3\)–\(\textrm{G}_{8}\) in Fig. 46 and games \(\textrm{G}_9\)–\(\textrm{G}_{13}\) in Fig. 49. The code added for the transitions between games is highlighted in the figures. The adversaries for transitions between games are provided throughout the proof. The highlighted instructions inside the adversaries mark the changes in the code of the simulated security reduction games.

Fig. 44

Games \(\textrm{G}_0\)–\(\textrm{G}_2\) for the proof of Theorem 2. The code added by expanding the algorithms of \(\textsf{CH}\) in game \(\textrm{G}^{\textsf{int}}_{\textsf{CH}, \textsf{supp}, \mathcal {F}_{\textrm{INT}}}\) is highlighted

Games \(\textrm{G}_0\)–\(\textrm{G}_2\) and the transitions between them (\(\textrm{G}_{0}\rightarrow \textrm{G}_{1}\) based on the \(\textrm{OTWIND}\)-security of \(\textsf{HASH}\), and \(\textrm{G}_{1}\rightarrow \textrm{G}_{2}\) based on the \(\textrm{RKPRF}\)-security of \(\textsf{KDF}\)) are very similar to the corresponding games and transitions in our \(\textrm{IND}\)-security reduction. We refer to the proof of Theorem 1 for a detailed explanation of both transitions.

Fig. 45

The adversaries for games \(\textrm{G}_0\)–\(\textrm{G}_2\) of the proof of Theorem 2. Each constructed adversary simulates one or two subsequent games of the security reduction for adversary \(\mathcal {F}_{\textrm{INT}}\)

\({\textbf{G}}_{0}\). Game \(\textrm{G}_0\) is equivalent to game \(\textrm{G}^{\textsf{int}}_{\textsf{CH}, \textsf{supp}, \mathcal {F}_{\textrm{INT}}}\). It expands the code of algorithms \(\mathsf {\textsf{CH}.Init}\), \(\mathsf {\textsf{CH}.Send}\) and \(\mathsf {\textsf{CH}.Recv}\). The expanded instructions are highlighted. It follows that

$$ \textsf{Adv}^{\textsf{int}}_{\textsf{CH}, \textsf{supp}}(\mathcal {F}_{\textrm{INT}}) = \Pr [\textrm{G}_0]. $$

\({{{\textbf{G}}}_{0}\rightarrow {{\textbf{G}}}_{1}}\). The value of \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}\) in game \(\textrm{G}_0\) depends on the initial \(\textsf{KDF}\) key \(\textit{kk}\) and \(\textsf{MAC}\) key \(\textit{mk}\). In contrast, game \(\textrm{G}_1\) computes \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}\) by evaluating \(\textsf{HASH}\) on a uniformly random input \(x\) that is independent of \(\textit{kk}\) and \(\textit{mk}\). We invoke the \(\textrm{OTWIND}\)-security of \(\textsf{HASH}\) (Fig. 25) in order to claim that adversary \(\mathcal {F}_{\textrm{INT}}\) cannot distinguish between playing in \(\textrm{G}_0\) and \(\textrm{G}_1\). In Fig. 45a, we build an adversary \(\mathcal {D}_{\textrm{OTWIND}}\) against the \(\textrm{OTWIND}\)-security of \(\textsf{HASH}\). When adversary \(\mathcal {D}_{\textrm{OTWIND}}\) plays in game \(\textrm{G}^\textsf{otwind}_{\textsf{HASH}, \mathcal {D}_{\textrm{OTWIND}}}\) with challenge bit \(d\in \{0,1\}\), it simulates game \(\textrm{G}_0\) (when \(d=1\)) or game \(\textrm{G}_1\) (when \(d=0\)) for adversary \(\mathcal {F}_{\textrm{INT}}\). Adversary \(\mathcal {D}_{\textrm{OTWIND}}\) returns \(d'=1\) iff \(\mathcal {F}_{\textrm{INT}}\) sets \(\textsf{win}\), so we have

$$ \Pr [\textrm{G}_0] - \Pr [\textrm{G}_1] = \textsf{Adv}^{\textsf{otwind}}_{\textsf{HASH}}(\mathcal {D}_{\textrm{OTWIND}}). $$

\({{{\textbf{G}}}_{1}\rightarrow {{\textbf{G}}}_{2}}\). Going from \(\textrm{G}_1\) to \(\textrm{G}_2\), we switch the outputs of \(\textsf{KDF}.\textsf{Ev}\) to uniformly random values. Since the adversary can call \(k \leftarrow \textsf{KDF}.\textsf{Ev}(\textit{kk}_{\textit{u}}, \textsf{msg}{\_}\textsf{key})\) on the same inputs multiple times, we use a PRF table \(\textsf{T}\) to enforce consistency between calls; the output of \(\textsf{KDF}.\textsf{Ev}(\textit{kk}_{\textit{u}}, \textsf{msg}{\_}\textsf{key})\) in \(\textrm{G}_2\) corresponds to a uniformly random value that is sampled and stored in the table entry \(\textsf{T}[\textit{u}, \textsf{msg}{\_}\textsf{key}]\). In Fig. 45b, we build an adversary \(\mathcal {D}_{\textrm{RKPRF}}\) against the \(\textrm{RKPRF}\)-security of \(\textsf{KDF}\) (Fig. 26) with respect to \(\phi _{\textsf{KDF}}\). When adversary \(\mathcal {D}_{\textrm{RKPRF}}\) plays in game \(\textrm{G}^\textsf{rkprf}_{\textsf{KDF}, \phi _{\textsf{KDF}}, \mathcal {D}_{\textrm{RKPRF}}}\) with challenge bit \(d\in \{0,1\}\), it simulates game \(\textrm{G}_1\) (when \(d=1\)) or game \(\textrm{G}_2\) (when \(d=0\)) for adversary \(\mathcal {F}_{\textrm{INT}}\). Adversary \(\mathcal {D}_{\textrm{RKPRF}}\) returns \(d'=1\) iff \(\mathcal {F}_{\textrm{INT}}\) sets \(\textsf{win}\), so we have

$$ \Pr [\textrm{G}_1] - \Pr [\textrm{G}_2] = \textsf{Adv}^{\textsf{rkprf}}_{\textsf{KDF}, \phi _{\textsf{KDF}}}(\mathcal {D}_{\textrm{RKPRF}}). $$
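The PRF table used in this hop implements lazy sampling: a fresh uniformly random key is drawn the first time a pair \((\textit{u}, \textsf{msg}{\_}\textsf{key})\) is queried and reused thereafter. A sketch of the mechanism:

```python
import os

T = {}   # the PRF table T[u, msg_key] from games G_1/G_2

def kdf_oracle(u: str, msg_key: bytes) -> bytes:
    """In G_2, KDF.Ev(kk_u, msg_key) is replaced by lazy uniform sampling."""
    if (u, msg_key) not in T:
        T[(u, msg_key)] = os.urandom(32)   # a fresh uniformly random SE key
    return T[(u, msg_key)]

k1 = kdf_oracle("I", b"\x01" * 16)
assert kdf_oracle("I", b"\x01" * 16) == k1   # repeated queries are consistent
assert kdf_oracle("R", b"\x01" * 16) != k1   # other entries are independent
```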
Fig. 46

Games \(\textrm{G}_3\)–\(\textrm{G}_{8}\) for the proof of Theorem 2

\({{{\textbf{G}}}_{2}\rightarrow {{\textbf{G}}}_{3}}\). Game \(\textrm{G}_3\) (Fig. 46) differs from \(\textrm{G}_2\) (Fig. 44) in the following ways:

  1. The \(\textsf{KDF}\) keys \(\textit{kk}\), \(\textit{kk}_\mathcal {I}\), \(\textit{kk}_\mathcal {R}\) are no longer used in our reduction games starting from \(\textrm{G}_2\), so they are not included in game \(\textrm{G}_3\) and onwards.

  2. Game \(\textrm{G}_3\) adds a table \(\textsf{S}\) that is updated during each call to oracle \(\textsc {Send}\). We set \(\textsf{S}[\textit{u}, \textsf{msg}{\_}\textsf{key}] \leftarrow (p, c_{\textit{se}})\) to remember that user \(\textit{u}\) produced \(\textsf{msg}{\_}\textsf{key}\) when sending (to user \(\overline{\textit{u}}\)) an \(\textsf{SE}\) ciphertext \(c_{\textit{se}}\) that encrypts payload \(p\).

  3. Oracle \(\textsc {Recv}\) in game \(\textrm{G}_3\), prior to calling \(\mathsf {\textsf{ME}.Decode}\), now saves a backup copy of \(\textit{st}_{\textsf{ME}, \textit{u}}\) in variable \(\textit{st}_{\textsf{ME}, \textit{u}}^*\). It also adds four new conditional statements that serve no purpose in game \(\textrm{G}_3\) itself. Four of the future game transitions in our security reduction (\(\textrm{G}_{3}\rightarrow \textrm{G}_{4}\), \(\textrm{G}_{4}\rightarrow \textrm{G}_{5}\), \(\textrm{G}_{5}\rightarrow \textrm{G}_{6}\), \(\textrm{G}_{7}\rightarrow \textrm{G}_{8}\)) will each add an instruction, inside the corresponding conditional statement, that reverts the pair of variables \((\textit{st}_{\textsf{ME}, \textit{u}}, m)\) to the initial values \((\textit{st}_{\textsf{ME}, \textit{u}}^*, \bot )\) that they had at the beginning of the ongoing \(\textsc {Recv}\) oracle call. Each of the new conditional statements also contains its own \(\textsf{bad}\) flag; these flags are only used for the formal analysis that we provide below.

  4. Similarly, game \(\textrm{G}_3\) adds two conditional statements to the \(\textsc {Send}\) oracle, and both serve no purpose in game \(\textrm{G}_3\). In future games, they will be used to roll back the message encoding scheme’s state \(\textit{st}_{\textsf{ME}, \textit{u}}\) to the initial value that it had at the beginning of the ongoing \(\textsc {Send}\) oracle call, followed by exiting this oracle call with \(\bot \) as output.

Games \(\textrm{G}_3\) and \(\textrm{G}_2\) are functionally equivalent, so

$$ \Pr [\textrm{G}_3] = \Pr [\textrm{G}_2]. $$

\({{{\textbf{G}}}_{3}\rightarrow {{\textbf{G}}}_{4}}\). Games \(\textrm{G}_3\) and \(\textrm{G}_4\) (Fig. 46) are identical until \(\textsf{bad}_0\) is set. We have

$$ \Pr [\textrm{G}_3] - \Pr [\textrm{G}_4] \le \Pr [\textsf{bad}_0^{\textrm{G}_3}]. $$

The \(\textsf{bad}_0\) flag can be set in \(\textrm{G}_3\) only when the instruction that sets \((\textit{st}_{\textsf{ME}, \textit{u}}, m) \leftarrow \mathsf {\textsf{ME}.Decode}(\textit{st}_{\textsf{ME}, \textit{u}}, p, \textit{aux})\) simultaneously changes the value of \(\textit{st}_{\textsf{ME}, \textit{u}}\) and returns \(m = \bot \). Recall that the statement of Theorem 2 restricts \(\textsf{ME}\) to an instantiation of \(\textsf{MTP}\text {-}\textsf{ME}\). But the latter never modifies its state \(\textit{st}_{\textsf{ME}, \textit{u}}\) when the decoding fails (i.e. \(m = \bot \)), so

$$ \Pr [\textsf{bad}_0^{\textrm{G}_3}] = 0. $$
Fig. 47

Adversary \(\mathcal {F}_{\textrm{UNPRED}}\) against the \(\textrm{UNPRED}\)-security of \(\textsf{SE}, \textsf{ME}\) for the transition between games \(\textrm{G}_4\) and \(\textrm{G}_5\)

\({{{\textbf{G}}}_{4}\rightarrow {{\textbf{G}}}_{5}}\). Games \(\textrm{G}_4\) and \(\textrm{G}_5\) (Fig. 46) are identical until \(\textsf{bad}_1\) is set. We have

$$ \Pr [\textrm{G}_4] - \Pr [\textrm{G}_5] \le \Pr [\textsf{bad}_1^{\textrm{G}_5}]. $$

When the \(\textsf{bad}_1\) flag is set in \(\textrm{G}_5\), we know that the \(\textsf{SE}\) key \(k = \textsf{T}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}]\) was sampled uniformly at random and never used inside the \(\textsc {Send}\) oracle before (because \(\textsf{S}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}] = \bot \)). Yet the adversary \(\mathcal {F}_{\textrm{INT}}\) found an \(\textsf{SE}\) ciphertext \(c_{\textit{se}}\) such that the payload \(p\leftarrow \mathsf {\textsf{SE}.Dec}(k, c_{\textit{se}})\) was successfully decoded by \(\mathsf {\textsf{ME}.Decode}\) (i.e. \(m\ne \bot \)). We note that \(\mathcal {F}_{\textrm{INT}}\) is allowed to query its \(\textsc {Recv}\) oracle on arbitrarily many ciphertexts \(c_{\textit{se}}\) with respect to the same \(\textsf{SE}\) key k, by repeatedly using the same pair of values for \((\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key})\). But it might nonetheless be hard for \(\mathcal {F}_{\textrm{INT}}\) to obtain a decodable payload p if (1) the outputs of function \(\mathsf {\textsf{SE}.Dec}(k, \cdot )\) are sufficiently “unpredictable” for an unknown uniformly random k, and (2) the \(\mathsf {\textsf{ME}.Decode}\) algorithm is sufficiently “restrictive” (e.g. designed to run some sanity checks on its payloads, hence rejecting a fraction of them). We use the unpredictability notion of \(\textsf{SE}\) with respect to \(\textsf{ME}\), which captures this intuition. In Fig. 47, we build an adversary \(\mathcal {F}_{\textrm{UNPRED}}\) against the \(\textrm{UNPRED}\)-security of \(\textsf{SE},\textsf{ME}\) (Fig. 35) as follows. When adversary \(\mathcal {F}_{\textrm{UNPRED}}\) plays in game \(\textrm{G}^{\textsf{unpred}}_{\textsf{SE}, \textsf{ME}, \mathcal {F}_{\textrm{UNPRED}}}\), it simulates game \(\textrm{G}_5\) for adversary \(\mathcal {F}_{\textrm{INT}}\). 
Adversary \(\mathcal {F}_{\textrm{UNPRED}}\) wins in its own game whenever \(\mathcal {F}_{\textrm{INT}}\) sets \(\textsf{bad}_1\), so we have

$$ \Pr [\textsf{bad}_1^{\textrm{G}_5}] \le \textsf{Adv}^{\textsf{unpred}}_{\textsf{SE}, \textsf{ME}}(\mathcal {F}_{\textrm{UNPRED}}). $$

We now explain the ideas behind the construction of \(\mathcal {F}_{\textrm{UNPRED}}\). Adversary \(\mathcal {F}_{\textrm{UNPRED}}\) does not maintain its own transcripts \(\textsf{tr}_{\textit{u}}, \textsf{tr}_{\overline{\textit{u}}}\) and hence does not evaluate the support function \(\textsf{supp}\) at the end of the simulated \(\textsc {Recv}\) oracle. This is because \(\textsf{supp}\)’s outputs do not affect the input–output behaviour of the simulated oracles \(\textsc {Send}\) and \(\textsc {Recv}\), and because this reduction step does not rely on whether adversary \(\mathcal {F}_{\textrm{INT}}\) manages to win in the simulated game (but rather only whether it sets \(\textsf{bad}_1\)). Some of the adversaries we construct for the next reduction steps will likewise not maintain the transcripts.

Adversary \(\mathcal {F}_{\textrm{UNPRED}}\) splits the simulation of game \(\textrm{G}_5\)’s \(\textsc {Recv}\) oracle into two cases:

  1. If \(\textsf{S}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}] = \bot \), then \(\mathcal {F}_{\textrm{UNPRED}}\) does not modify \(\textit{st}_{\textsf{ME}, \textit{u}}\); this is consistent with the behaviour of oracle \(\textsc {Recv}\) in game \(\textrm{G}_5\). In addition, adversary \(\mathcal {F}_{\textrm{UNPRED}}\) also makes a call to its oracle \(\textsc {Ch}\). The \(\textsc {Ch}\) oracle simulates all instructions that would have been evaluated by \(\textsc {Recv}\) when \(\textsf{S}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}] = \bot \), except it omits the condition checking \((\textsf{msg}{\_}\textsf{key}' = \textsf{msg}{\_}\textsf{key}) \wedge (\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}= \textsf{auth}{\_}\textsf{key}{\_}\textsf{id}')\). The omitted condition is a prerequisite to setting flag \(\textsf{bad}_1\) in game \(\textrm{G}_5\); this change is fine because adversary \(\mathcal {F}_{\textrm{UNPRED}}\) will nonetheless set the \(\textsf{win}\) flag in its game \(\textrm{G}^{\textsf{unpred}}_{\textsf{SE}, \textsf{ME}, \mathcal {F}_{\textrm{UNPRED}}}\) whenever the simulated adversary \(\mathcal {F}_{\textrm{INT}}\) would have set the \(\textsf{bad}_1\) flag in \(\textrm{G}_5\).

  2. If \(\textsf{S}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}] \ne \bot \), then \(\mathcal {F}_{\textrm{UNPRED}}\) honestly simulates all instructions that would have been evaluated by \(\textsc {Recv}\).

Finally, adversary \(\mathcal {F}_{\textrm{UNPRED}}\) uses its \(\textsc {Expose}\) oracle to learn the values from the PRF table that is maintained by the \(\textrm{UNPRED}\)-security game, and synchronises them with its own PRF table \(\textsf{T}\) inside the simulated oracle \(\textsc {Send}\) (intuitively, this appears unnecessary, but it helps us avoid further analysis to show that \(\mathcal {F}_{\textrm{UNPRED}}\) perfectly simulates game \(\textrm{G}_5\)).

\({{{\textbf{G}}}_{5}\rightarrow {{\textbf{G}}}_{6}}\). Games \(\textrm{G}_5\) and \(\textrm{G}_6\) (Fig. 46) are identical until \(\textsf{bad}_2\) is set. We have

$$ \Pr [\textrm{G}_5] - \Pr [\textrm{G}_6] \le \Pr [\textsf{bad}_2^{\textrm{G}_5}]. $$

Game \(\textrm{G}_5\) sets the \(\textsf{bad}_2\) flag in two different places: one inside oracle \(\textsc {Send}\), and one inside oracle \(\textsc {Recv}\). In either case, this happens when the table entry \(\textsf{S}[w, \textsf{msg}{\_}\textsf{key}] = (p', c_{\textit{se}}')\), for some \(w\in \{\mathcal {I},\mathcal {R}\}\), indicates that a prior call to oracle \(\textsc {Send}\) obtained \(\textsf{msg}{\_}\textsf{key}\leftarrow \textsf{MAC}.\textsf{Ev}(\textit{mk}_{w}, p')\), and now we found \(p\) such that \(p\ne p'\) and \(\textsf{msg}{\_}\textsf{key}= \textsf{MAC}.\textsf{Ev}(\textit{mk}_{w}, p)\). This results in a collision for \(\textsf{MAC}\) under related keys and hence breaks its \(\textrm{RKCR}\)-security (Fig. 27) with respect to \(\phi _{\textsf{MAC}}\). In Fig. 48, we build an adversary \(\mathcal {F}_{\textrm{RKCR}}\) against the \(\textrm{RKCR}\)-security of \(\textsf{MAC}\) with respect to \(\phi _{\textsf{MAC}}\) as follows. When adversary \(\mathcal {F}_{\textrm{RKCR}}\) plays in game \(\textrm{G}^{\textsf{rkcr}}_{\textsf{MAC}, \phi _{\textsf{MAC}}, \mathcal {F}_{\textrm{RKCR}}}\), it simulates game \(\textrm{G}_5\) for adversary \(\mathcal {F}_{\textrm{INT}}\). Adversary \(\mathcal {F}_{\textrm{RKCR}}\) wins in its own game whenever \(\mathcal {F}_{\textrm{INT}}\) sets \(\textsf{bad}_2\), so we have

$$ \Pr [\textsf{bad}_2^{\textrm{G}_5}] \le \textsf{Adv}^{\textsf{rkcr}}_{\textsf{MAC}, \phi _{\textsf{MAC}}}(\mathcal {F}_{\textrm{RKCR}}). $$
Fig. 48

Adversary \(\mathcal {F}_{\textrm{RKCR}}\) against the \(\textrm{RKCR}\)-security of \(\textsf{MAC}\) for the transition between games \(\textrm{G}_5\)–\(\textrm{G}_6\)

\({{{\textbf{G}}}_{6}\rightarrow {{\textbf{G}}}_{7}}\). Games \(\textrm{G}_6\) and \(\textrm{G}_7\) (Fig. 46) are identical until \(\textsf{bad}_3\) is set. We have

$$ \Pr [\textrm{G}_6] - \Pr [\textrm{G}_7] \le \Pr [\textsf{bad}_3^{\textrm{G}_6}]. $$

If \(\textsf{bad}_3\) is set in \(\textrm{G}_6\), it means that adversary \(\mathcal {F}_{\textrm{INT}}\) found a payload \(p\) and an \(\textsf{SE}\) key \(k \in \{0,1\}^{\mathsf {\textsf{SE}.kl}}\) such that \(\mathsf {\textsf{SE}.Dec}(k, \mathsf {\textsf{SE}.Enc}(k, p)) \ne p\). This violates the decryption correctness of \(\textsf{SE}\). Recall that the statement of Theorem 2 considers \(\textsf{SE}= \textsf{MTP}\text {-}\textsf{SE}\). The \(\textsf{MTP}\text {-}\textsf{SE}\) scheme satisfies decryption correctness, so

$$ \Pr [\textsf{bad}_3^{\textrm{G}_6}] = 0. $$

\({{{\textbf{G}}}_{7}\rightarrow {{\textbf{G}}}_{8}}\). Games \(\textrm{G}_7\) and \(\textrm{G}_8\) (Fig. 46) are identical until \(\textsf{bad}_4\) is set. We have

$$ \Pr [\textrm{G}_7] - \Pr [\textrm{G}_8] \le \Pr [\textsf{bad}_4^{\textrm{G}_7}]. $$

Whenever \(\textsf{bad}_4\) is set in game \(\textrm{G}_7\), we know that (1) \(p\leftarrow \mathsf {\textsf{SE}.Dec}(k, c_{\textit{se}})\) was computed during the ongoing \(\textsc {Recv}\) call, and (2) \(c_{\textit{se}}' \leftarrow \mathsf {\textsf{SE}.Enc}(k, p)\) was computed during an earlier call to \(\textsc {Send}\), which also verified that \(\mathsf {\textsf{SE}.Dec}(k, c_{\textit{se}}') = p\). Importantly, we also know that \(c_{\textit{se}}\ne c_{\textit{se}}'\). The statement of Theorem 2 considers \(\textsf{SE}= \textsf{MTP}\text {-}\textsf{SE}\). The latter is a deterministic symmetric encryption scheme that is based on the IGE block cipher mode of operation. For each key \(k \in \{0,1\}^{\mathsf {\textsf{SE}.kl}}\) and each length \(\ell \in {{\mathbb {N}}}\) such that \(\{0,1\}^{\ell } \subseteq \mathsf {\textsf{SE}.MS}\), this scheme specifies a permutation between all plaintexts from \(\{0,1\}^\ell \) and all ciphertexts from \(\{0,1\}^\ell \). In particular, \(\textsf{MTP}\text {-}\textsf{SE}\) has unique ciphertexts: it is impossible to find \(c_{\textit{se}}\ne c_{\textit{se}}'\) that, under any fixed choice of key k, decrypt to the same payload \(p\). It follows that \(\textsf{bad}_4\) can never be set when \(\textsf{SE}= \textsf{MTP}\text {-}\textsf{SE}\), so we have

$$ \Pr [\textsf{bad}_4^{\textrm{G}_7}] = 0. $$

\({{{\textbf{G}}}_{8}\rightarrow {{\textbf{G}}}_{9}}\). While discussing this and subsequent transitions, we say that a ciphertext c belongs to (or appears in) a support transcript \(\textsf{tr}_{}\) if and only if \(\exists m', \textit{aux}' :(\textsf{sent}, m', c, \textit{aux}') \in \textsf{tr}_{}\).

Consider oracle \(\textsc {Recv}\) in game \(\textrm{G}_8\) (Fig. 46). Let \(\textit{st}_{\textsf{ME}, \textit{u}}^*\) contain the value of variable \(\textit{st}_{\textsf{ME}, \textit{u}}\) at the start of the ongoing call to \(\textsc {Recv}\) on inputs \((\textit{u}, c, \textit{aux})\). We start by showing that \(\textsc {Recv}\) evaluates \((\textit{st}_{\textsf{ME}, \textit{u}}, m) \leftarrow \mathsf {\textsf{ME}.Decode}(\textit{st}_{\textsf{ME}, \textit{u}}, p, \textit{aux})\) and does not subsequently roll back the values of \((\textit{st}_{\textsf{ME}, \textit{u}}, m)\) to \((\textit{st}_{\textsf{ME}, \textit{u}}^*, \bot )\) iff c belongs to \(\textsf{tr}_{\overline{\textit{u}}}\):

  (1) If oracle \(\textsc {Recv}\) evaluates \((\textit{st}_{\textsf{ME}, \textit{u}}, m) \leftarrow \mathsf {\textsf{ME}.Decode}(\textit{st}_{\textsf{ME}, \textit{u}}, p, \textit{aux})\) and does not restore the values of \((\textit{st}_{\textsf{ME}, \textit{u}}, m)\), then \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}= \textsf{auth}{\_}\textsf{key}{\_}\textsf{id}'\) and \(\textsf{S}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}] = (p, c_{\textit{se}})\) (the latter implies \(\textsf{msg}{\_}\textsf{key}' = \textsf{msg}{\_}\textsf{key}\)). According to the construction of oracle \(\textsc {Send}\), this means that the ciphertext \(c = (\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}', \textsf{msg}{\_}\textsf{key}, c_{\textit{se}})\) appears in transcript \(\textsf{tr}_{\overline{\textit{u}}}\).

  (2) Let \(c = (\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}', \textsf{msg}{\_}\textsf{key}, c_{\textit{se}})\) be any \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertext, and let \(\overline{\textit{u}}\in \{\mathcal {I},\mathcal {R}\}\). If c belongs to \(\textsf{tr}_{\overline{\textit{u}}}\), then by construction of oracle \(\textsc {Send}\) we know that \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}= \textsf{auth}{\_}\textsf{key}{\_}\textsf{id}'\) and \(\textsf{S}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}] = (p, c_{\textit{se}})\) for the payload \(p\) such that \(k = \textsf{T}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}]\), and \(c_{\textit{se}}= \mathsf {\textsf{SE}.Enc}(k, p)\), and \(p= \mathsf {\textsf{SE}.Dec}(k, c_{\textit{se}})\). The latter equality is guaranteed by the decryption correctness of \(\textsf{SE}= \textsf{MTP}\text {-}\textsf{SE}\) that we used for transition \(\textrm{G}_{6}\rightarrow \textrm{G}_{7}\). The \(\textrm{RKCR}\)-security of \(\textsf{MAC}\) guarantees that once \(\textsf{S}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}]\) is populated, a future call to oracle \(\textsc {Send}\) cannot overwrite \(\textsf{S}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}]\) with a different pair of values. All of the above implies that if c belongs to \(\textsf{tr}_{\overline{\textit{u}}}\) at the beginning of a call to oracle \(\textsc {Recv}\), then this oracle will successfully verify that \(\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}= \textsf{auth}{\_}\textsf{key}{\_}\textsf{id}'\) and \(\textsf{S}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}] = (p, c_{\textit{se}})\) for \(p\leftarrow \mathsf {\textsf{SE}.Dec}(k, c_{\textit{se}})\) (whereas \(\textsf{msg}{\_}\textsf{key}' = \textsf{msg}{\_}\textsf{key}\) follows from \(\textsf{S}[\overline{\textit{u}}, \textsf{msg}{\_}\textsf{key}]\) containing the payload \(p\)). This means that the instruction \((\textit{st}_{\textsf{ME}, \textit{u}}, m) \leftarrow \mathsf {\textsf{ME}.Decode}(\textit{st}_{\textsf{ME}, \textit{u}}, p, \textit{aux})\) will be evaluated, and the variables \((\textit{st}_{\textsf{ME}, \textit{u}}, m)\) will not subsequently be rolled back to \((\textit{st}_{\textsf{ME}, \textit{u}}^*, \bot )\).

Fig. 49

Games \(\textrm{G}_9\)–\(\textrm{G}_{13}\) for the proof of Theorem 2. The highlighted code is functionally equivalent to the corresponding code in \(\textrm{G}_8\)

Game \(\textrm{G}_9\) (Fig. 49) differs from game \(\textrm{G}_8\) (Fig. 46) in the following ways:

  (1) Game \(\textrm{G}_9\) adds a payload table \(\textsf{P}\) that is updated during each call to oracle \(\textsc {Send}\). We set \(\textsf{P}[\textit{u}, c] \leftarrow p\) to indicate that the \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertext c, which was sent from user \(\textit{u}\) to user \(\overline{\textit{u}}\), encrypts the payload \(p\). Observe that any pair \((\textit{u}, c)\) with \(c = (\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}, \textsf{msg}{\_}\textsf{key}, c_{\textit{se}})\) corresponds to a unique payload that can be recovered as \(p\leftarrow \mathsf {\textsf{SE}.Dec}(\textsf{T}[\textit{u}, \textsf{msg}{\_}\textsf{key}], c_{\textit{se}})\). This relies on decryption correctness of \(\textsf{SE}\), which is guaranteed to hold for ciphertexts inside table \(\textsf{P}\) due to the changes that we introduced in the transition \(\textrm{G}_{6}\rightarrow \textrm{G}_{7}\).

  (2) Game \(\textrm{G}_9\) rewrites the code of game \(\textrm{G}_8\)’s oracle \(\textsc {Recv}\) to run \(\mathsf {\textsf{ME}.Decode}\) iff the ciphertext c belongs to the transcript \(\textsf{tr}_{\overline{\textit{u}}}\); otherwise, the \(\textsc {Recv}\) oracle does not change \(\textit{st}_{\textsf{ME}, \textit{u}}\) and simply sets \(m \leftarrow \bot \). This follows from the analysis of \(\textrm{G}_8\) that we provided above. We note that checking whether c belongs to \(\textsf{tr}_{\overline{\textit{u}}}\) is equivalent to checking \(\textsf{P}[\overline{\textit{u}}, c] \ne \bot \). For simplicity, we do the latter; and if the condition is satisfied, then we set \(p\leftarrow \textsf{P}[\overline{\textit{u}}, c]\) and run \(\mathsf {\textsf{ME}.Decode}\) with this payload as input. As discussed above, the \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertext c that is issued by user \(\overline{\textit{u}}\) always encrypts a unique payload \(p\), and hence, we can rely on the fact that the table entry \(\textsf{P}[\overline{\textit{u}}, c]\) stores this unique payload value.

  (3) Game \(\textrm{G}_9\) also rewrites one condition inside oracle \(\textsc {Send}\), in a more compact but equivalent way (here we rely on the fact that values \(\textit{u}, \textsf{msg}{\_}\textsf{key}, p\) uniquely determine \(c_{\textit{se}}\)). It also adds one new conditional statement to oracle \(\textsc {Recv}\) (checking \(m^*\ne \bot \)), but it serves no purpose in \(\textrm{G}_9\).

Games \(\textrm{G}_9\) and \(\textrm{G}_8\) are functionally equivalent, so

$$ \Pr [\textrm{G}_9] = \Pr [\textrm{G}_8]. $$

\({{{\textbf{G}}}_{9}\rightarrow {{\textbf{G}}}_{10}}\). Game \(\textrm{G}_{10}\) (Fig. 49) enforces that \(m^*= \bot \) whenever oracle \(\textsc {Recv}\) is called on a ciphertext that cannot be found in the appropriate user’s transcript. Games \(\textrm{G}_{9}\) and \(\textrm{G}_{10}\) are identical until \(\textsf{bad}_5\) is set. We have

$$ \Pr [\textrm{G}_{9}] - \Pr [\textrm{G}_{10}] \le \Pr [\textsf{bad}_5^{\textrm{G}_{9}}]. $$

If \(\textsf{bad}_5\) is set in game \(\textrm{G}_9\), then the support function \(\textsf{supp}\) returned \(m^*\ne \bot \) in response to an \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertext c that does not belong to the opposite user’s transcript \(\textsf{tr}_{\overline{\textit{u}}}\). The statement of Theorem 2 considers \(\textsf{supp}= \textsf{supp}\text {-}\textsf{ord}\). The latter is defined to always return \(m^*= \bot \) when its input label does not appear in \(\textsf{tr}_{\overline{\textit{u}}}\), so

$$ \Pr [\textsf{bad}_5^{\textrm{G}_{9}}] = 0. $$

We refer to this property as the integrity of support function \(\textsf{supp}\). We formalise it in Appendix A.

\({{{\textbf{G}}}_{10}\rightarrow {{\textbf{G}}}_{11}}\). Game \(\textrm{G}_{11}\) (Fig. 49) stops adding entries of the form \((\textsf{recv}, \bot , c, \textit{aux})\) to the transcripts of both users. Once this is done, it becomes pointless for adversary \(\mathcal {F}_{\textrm{INT}}\) to call its \(\textsc {Recv}\) oracle on any ciphertext that does not appear in the appropriate user’s transcript. This is because such a call will never set the \(\textsf{win}\) flag (due to the change introduced in transition \(\textrm{G}_{9}\rightarrow \textrm{G}_{10}\)) and will never affect the transcript of either user (due to the change introduced in this transition). The statement of Theorem 2 considers \(\textsf{supp}= \textsf{supp}\text {-}\textsf{ord}\). The latter is defined to ignore all transcript entries of the form \((\textsf{recv}, \bot , c, \textit{aux})\), so removing the instruction \(\textsf{tr}_{\textit{u}} \leftarrow \textsf{tr}_{\textit{u}} ~\Vert ~(\textsf{recv}, m, c, \textit{aux})\) for \(m = \bot \) will not affect the outputs of any future calls to this support function. We have

$$ \Pr [\textrm{G}_{11}] = \Pr [\textrm{G}_{10}]. $$

Earlier in this section we referred to this property as the robustness of support function \(\textsf{supp}\).

\({{{\textbf{G}}}_{11}\rightarrow {{\textbf{G}}}_{12}}\). When discussing the differences between games \(\textrm{G}_8\) and \(\textrm{G}_9\), we showed that for each pair of sender \(\textit{u}\in \{\mathcal {I}, \mathcal {R}\}\) and \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertext c, the encrypted payload \(p\) is unique. It is also true that for each pair of \(\textit{u}\in \{\mathcal {I}, \mathcal {R}\}\) and payload \(p\), there is a unique \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertext c that encrypts \(p\) in the direction from \(\textit{u}\) to \(\overline{\textit{u}}\). It follows that in games \(\textrm{G}_{11}\) and \(\textrm{G}_{12}\) (Fig. 49) for any fixed user \(\textit{u}\in \{\mathcal {I},\mathcal {R}\}\) there is a 1-to-1 correspondence between payloads and \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertexts that could be successfully sent from \(\textit{u}\) to \(\overline{\textit{u}}\) (note that this property does not hold if \(\textsf{SE}\) does not have decryption correctness, but the code added for the transition \(\textrm{G}_{6}\rightarrow \textrm{G}_{7}\) already identifies and discards the corresponding ciphertexts). The statement of Theorem 2 considers \(\textsf{supp}= \textsf{supp}\text {-}\textsf{ord}\). Observe that for any label \(z\) sent from \(\textit{u}\) to \(\overline{\textit{u}}\), the support function \(\textsf{supp}\text {-}\textsf{ord}\) checks only its equality with every \(z^*\) such that \((\textsf{sent}, m, z^*, \textit{aux}) \in \textsf{tr}_{\textit{u}}\) or \((\textsf{recv}, m, z^*, \textit{aux}) \in \textsf{tr}_{\overline{\textit{u}}}\) across all values of \(m, \textit{aux}\). In other words, this support function only looks at the equality pattern of the labels, and it does this independently in each of the two directions between the users. 
The 1-to-1 correspondence between c and \(p\), with respect to any fixed user \(\textit{u}\), means that we can replace the labels used in support transcripts, substituting \(p\) for c, and replace the label inputs to the support function \(\textsf{supp}\text {-}\textsf{ord}\) in the same way; this does not change the outputs of the support function. We have

$$ \Pr [\textrm{G}_{12}] = \Pr [\textrm{G}_{11}]. $$
Fig. 50

Adversary \(\mathcal {F}_{\textrm{EINT}}\) against the \(\textrm{EINT}\)-security of \(\textsf{ME}, \textsf{supp}\) for the transition between games \(\textrm{G}_{12}\)–\(\textrm{G}_{13}\) in the proof of Theorem 2

\({{{\textbf{G}}}_{12}\rightarrow {{\textbf{G}}}_{13}}\). Games \(\textrm{G}_{12}\) and \(\textrm{G}_{13}\) are identical until \(\textsf{bad}_6\) is set. We have

$$ \Pr [\textrm{G}_{12}] - \Pr [\textrm{G}_{13}] \le \Pr [\textsf{bad}_6^{\textrm{G}_{13}}]. $$

Games \(\textrm{G}_{12}\) and \(\textrm{G}_{13}\) (Fig. 49) can be thought of as simulating a bidirectional authenticated channel that allows the two users to exchange \(\textsf{ME}\) payloads. The adversary \(\mathcal {F}_{\textrm{INT}}\) is allowed to forward, replay, reorder and drop the payloads; but it is not allowed to forge them. This description roughly corresponds to the definition of \(\textrm{EINT}\)-security of \(\textsf{ME}\) with respect to \(\textsf{supp}\) (Fig. 16). In games \(\textrm{G}_{12}\)\(\textrm{G}_{13}\), the oracle \(\textsc {Send}\) still runs cryptographic algorithms in order to generate and return \(\textsf{MTP}\text {-}\textsf{CH} \) ciphertexts, but we will build an \(\textrm{EINT}\)-security adversary that simulates these instructions for \(\mathcal {F}_{\textrm{INT}}\). In Fig. 50, we build an adversary \(\mathcal {F}_{\textrm{EINT}}\) against the \(\textrm{EINT}\)-security of \(\textsf{ME}, \textsf{supp}\) as follows. When adversary \(\mathcal {F}_{\textrm{EINT}}\) plays in game \(\textrm{G}^{\textsf{eint}}_{\textsf{ME}, \textsf{supp}, \mathcal {F}_{\textrm{EINT}}}\), it simulates game \(\textrm{G}_{13}\) for adversary \(\mathcal {F}_{\textrm{INT}}\). Adversary \(\mathcal {F}_{\textrm{EINT}}\) wins in its own game whenever \(\mathcal {F}_{\textrm{INT}}\) sets \(\textsf{bad}_6\), so we have

$$ \Pr [\textsf{bad}_6^{\textrm{G}_{13}}] \le \textsf{Adv}^{\textsf{eint}}_{\textsf{ME}, \textsf{supp}}(\mathcal {F}_{\textrm{EINT}}). $$

Observe that \(\mathcal {F}_{\textrm{EINT}}\) takes \(\mathcal {I}\)’s and \(\mathcal {R}\)’s initial \(\textsf{ME}\) states as input, and repeatedly calls the \(\textsf{ME}\) algorithms to manually update these states (as opposed to relying on its \(\textsc {Send}\) and \(\textsc {Recv}\) oracles). This allows \(\mathcal {F}_{\textrm{EINT}}\) to correctly identify the two conditional statements inside the simulated oracle \(\textsc {SendSim}\) that require rolling back the most recent update to \(\textit{st}_{\textsf{ME}, \textit{u}}\) and exiting the oracle with \(\bot \) as output.

Adversary \(\mathcal {F}_{\textrm{INT}}\) can no longer win in game \(\textrm{G}_{13}\), because the only instruction that sets the \(\textsf{win}\) flag in games \(\textrm{G}_{0}\)–\(\textrm{G}_{12}\) was removed in the transition to game \(\textrm{G}_{13}\). It follows that

$$ \Pr [\textrm{G}_{13}] = 0. $$

The theorem statement follows. \(\square \)

5.6.7 Proof alternatives

In the earlier analysis of Case A, we relied on a certain property of the message encoding scheme \(\textsf{ME}\). Roughly speaking, we reasoned that the algorithm \(\mathsf {\textsf{ME}.Decode}\) should not be able to successfully decode random-looking strings, meaning it should require that decodable payloads are structured in a certain way. We now briefly outline a proof strategy that does not rely on such a property of \(\textsf{ME}\).

In Case A, adversary \(\mathcal {F}_{\textrm{INT}}\) calls its oracle \(\textsc {Recv}(\textit{u}, c, \textit{aux})\) on \(c = (\textsf{auth}{\_}\textsf{key}{\_}\textsf{id}', \textsf{msg}{\_}\textsf{key}, c_{\textit{se}})\) with a \(\textsf{msg}{\_}\textsf{key}\) value that was never previously returned by oracle \(\textsc {Send}\) as part of a ciphertext produced by user \(\overline{\textit{u}}\). Let us modify our initial goal for Case A as follows: we want to show that evaluating \(k \leftarrow \textsf{KDF}.\textsf{Ev}(\textit{kk}_{\overline{\textit{u}}}, \textsf{msg}{\_}\textsf{key})\), \(p\leftarrow \mathsf {\textsf{SE}.Dec}(k, c_{\textit{se}})\) and \(\textsf{msg}{\_}\textsf{key}' \leftarrow \textsf{MAC}.\textsf{Ev}(\textit{mk}_{\overline{\textit{u}}}, p)\) is very unlikely to result in \(\textsf{msg}{\_}\textsf{key}' = \textsf{msg}{\_}\textsf{key}\). In fact, it is sufficient to focus on the last instruction here: we require that it is hard to forge any input–output pair \((p, \textsf{msg}{\_}\textsf{key})\) such that \(\textsf{msg}{\_}\textsf{key}= \textsf{MAC}.\textsf{Ev}(\textit{mk}_{\overline{\textit{u}}}, p)\). This property is guaranteed if \(\textsf{MAC}\) is related-key PRF-secure.

Theorem 2 is currently stated for a generic function family \(\textsf{MAC}\), but it could be narrowed down to use \(\textsf{MAC}= \textsf{MTP}\text {-}\textsf{MAC}\) where \(\textsf{MTP}\text {-}\textsf{MAC}.\textsf{Ev}(\textit{mk}_\textit{u}, p) = \textsf{SHA}-\textsf{256}(\textit{mk}_\textit{u}~\Vert ~p){[64:192]}\). Crucially, the algorithm \(\textsf{MTP}\text {-}\textsf{MAC}.\textsf{Ev}\) is defined to drop half of the output bits of \(\textsf{SHA}-\textsf{256}\); this prevents length-extension attacks. We could model \(\textsf{MTP}\text {-}\textsf{MAC}\) as the Augmented MAC (AMAC) and use the results from [10] to show that it is related-key PRF-secure. Technically, this would require proving three claims as follows:

  (1) The output of the first compression function within \(\textsf{SHA}-\textsf{256}(\textit{mk}_\textit{u}~\Vert ~p){[64:192]}\) looks uniformly random when used with related keys; we already formalise and analyse this property in Sect. 5.2, phrased as the \(\textrm{HRKPRF}\)-security of \(\textsf{SHACAL}-\textsf{2}\) with respect to \(\phi _{\textsf{MAC}}\).

  (2) The \(\textsf{SHA}-\textsf{256}\) compression function \(h _{256}\) is \(\textrm{OTPRF}\)-secure.

  (3) The \(\textsf{SHA}-\textsf{256}\) compression function is (roughly) PRF-secure even in the presence of some leakage on its key, i.e. an attacker receives k[64 : 192] when trying to break the PRF security of \(h _{256}(k, \cdot )\); we do not formalise or analyse this property in our work.

Here (1) and (2) could be chained together to show that \(\textsf{MTP}\text {-}\textsf{MAC}\) is a secure PRF even for variable-length inputs; then, (3) would suffice to show that \(\textsf{MTP}\text {-}\textsf{MAC}\) is resistant to length-extension attacks.
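The truncation step is easy to state concretely. The following Python sketch (our own illustration of the \(\textsf{MTP}\text {-}\textsf{MAC}\) formula above, not Telegram source code) computes \(\textsf{SHA}-\textsf{256}(\textit{mk}~\Vert ~p)\) and keeps bits 64 to 192, i.e. the middle 16 bytes of the 32-byte digest:

```python
import hashlib

def mtp_mac(mk: bytes, p: bytes) -> bytes:
    """Illustrative MTP-MAC: msg_key is SHA-256(mk || p)[64:192],
    i.e. bytes 8..24 of the 32-byte digest. Discarding the outer
    bits of the hash output is what rules out the classic
    SHA-256 length-extension attack on this construction."""
    return hashlib.sha256(mk + p).digest()[8:24]

msg_key = mtp_mac(b"\x00" * 32, b"example payload")
assert len(msg_key) == 16  # 128-bit msg_key
```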

Adopting the above proof strategy would have allowed us to omit the following two steps from the current security reduction. The \(\textrm{UNPRED}\)-security of \(\textsf{SE}, \textsf{ME}\) would be replaced directly with a new related-key PRF-security assumption for \(\textsf{MAC}= \textsf{MTP}\text {-}\textsf{MAC}\), following the results for AMAC from [10]. The \(\textrm{RKPRF}\)-security of \(\textsf{KDF}\) (with respect to \(\phi _{\textsf{KDF}}\)) would no longer be needed, because currently its only use is to transform the security game prior to appealing to the \(\textrm{UNPRED}\)-security of \(\textsf{SE}, \textsf{ME}\).

5.7 Instantiation and interpretation

We are now ready to combine the theorems from the previous two sections with the notions defined in Sects. 5.1 and 5.3 and the proofs in Appendix E. This is meant to allow interpretation of our main results: qualitatively (what security assumptions are made) and quantitatively (what security level is achieved). Note that in both of the following corollaries, the adversary is limited to making \(2^{96}\) queries. This is due to the wrapping of counters in \(\textsf{MTP}\text {-}\textsf{ME}\), since beyond this limit the advantage in breaking \(\textrm{UPREF}\)-security and \(\textrm{EINT}\)-security of \(\textsf{MTP}\text {-}\textsf{ME}\) becomes 1.

Corollary 1

Let \(\textsf {session}\_\textsf {id}\in \{0,1\}^{64}\), \(\textsf {pb} \in {{\mathbb {N}}}\) and \(\textsf{bl}= 128\). Let \(\textsf{ME}= \textsf{MTP}\text {-}\textsf{ME}[\textsf {session}\_\textsf {id}, \textsf {pb}, \textsf{bl}]\), \(\textsf{MTP}\text {-}\textsf{HASH}\), \(\textsf{MTP}\text {-}\textsf{MAC}\), \(\textsf{MTP}\text {-}\textsf{KDF}\), \(\phi _{\textsf{MAC}}\), \(\phi _{\textsf{KDF}}\), \(\textsf{MTP}\text {-}\textsf{SE}\) be the primitives defined in Sect. 4.4. Let \(\textsf{CH}= \textsf{MTP}\text {-}\textsf{CH} [\textsf{ME}, \textsf{MTP}\text {-}\textsf{HASH}, \textsf{MTP}\text {-}\textsf{MAC}, \textsf{MTP}\text {-}\textsf{KDF}, \phi _{\textsf{MAC}}, \phi _{\textsf{KDF}}, \textsf{MTP}\text {-}\textsf{SE}]\). Let \(\phi _{\textsf{SHACAL}-\textsf{2}}\) be the related-key-deriving function defined in Fig. 29. Let \(h _{256}\) be the \(\textsf{SHA}-\textsf{256}\) compression function, and let \(\textsf{H} \) be the corresponding function family with \(\textsf{H}.\textsf{Ev} = h _{256}\), \(\textsf{H}.\textsf{kl} = \textsf{H}.\textsf{ol} = 256\) and \(\textsf{H}.\textsf{IN} = \{0,1\}^{512}\). Let \(\ell \in {{\mathbb {N}}}\). Let \(\mathcal {D}_{\textrm{IND}}\) be any adversary against the \(\textrm{IND}\)-security of \(\textsf{CH}\), making \(q_{\textsc {Ch}}\le 2^{96}\) queries to its \(\textsc {Ch}\) oracle, each query made for a message of length at most \(\ell \le 2^{27}\) bits. Then, we can build adversaries \(\mathcal {D}_{\textrm{OTPRF}}^{\textsf{shacal}}\), \(\mathcal {D}_{\textrm{LRKPRF}}\), \(\mathcal {D}_{\textrm{HRKPRF}}\), \(\mathcal {D}_{\textrm{OTPRF}}^{\textsf{compr}}\), \(\mathcal {D}_{\mathrm {OTIND\$}}\) such that

$$\begin{aligned} \textsf{Adv}^{\textsf{ind}}_{\textsf{CH}}(\mathcal {D}_{\textrm{IND}})&\le 4 \cdot \Big (\textsf{Adv}^{\textsf{otprf}}_{\textsf{SHACAL}-\textsf{1}}(\mathcal {D}_{\textrm{OTPRF}}^{\textsf{shacal}}) \\&\quad + \textsf{Adv}^{\textsf{lrkprf}}_{\textsf{SHACAL}-\textsf{2}, \phi _{\textsf{KDF}}, \phi _{\textsf{SHACAL}-\textsf{2}}}(\mathcal {D}_{\textrm{LRKPRF}}) \\&\quad + \textsf{Adv}^{\textsf{hrkprf}}_{\textsf{SHACAL}-\textsf{2},\phi _{\textsf{MAC}}}(\mathcal {D}_{\textrm{HRKPRF}}) \\&\quad + \left\lfloor \frac{\ell + 256}{512} + \frac{\textsf {pb} + 1}{4}\right\rfloor \cdot \textsf{Adv}^{\textsf{otprf}}_{\textsf{H}}(\mathcal {D}_{\textrm{OTPRF}}^{\textsf{compr}})\Big ) \\&\quad + \; \frac{q_{\textsc {Ch}}\cdot (q_{\textsc {Ch}}- 1)}{2^{128}} \\&\quad + \; 2 \cdot \textsf{Adv}^{\mathsf {otind\$}}_{\textsf{CBC}[\textsf{AES}-\textsf{256}]}(\mathcal {D}_{\mathrm {OTIND\$}}). \end{aligned}$$

Corollary 1 follows from Theorem 1 together with Proposition 5, Proposition 6, Proposition 7 with Lemma 1 and Proposition 8. The two terms in Theorem 1 related to \(\textsf{ME}\) are zero for \(\textsf{ME}= \textsf{MTP}\text {-}\textsf{ME}\) when an adversary is restricted to making \(q_{\textsc {Ch}}\le 2^{96}\) queries. Qualitatively, Corollary 1 shows that the confidentiality of the MTProto-based channel depends on whether \(\textsf{SHACAL}-\textsf{1}\) and \(\textsf{SHACAL}-\textsf{2}\) can be considered pseudorandom functions in a variety of modes: with keys used only once, with related keys, with partially chosen keys when evaluated on fixed inputs, and when the key and input switch positions. The related-key assumptions in particular (\(\textrm{LRKPRF}\) and \(\textrm{HRKPRF}\), given in Sect. 5.2) are highly unusual; in Appendix F, we show that both assumptions hold in the ideal cipher model, but both of them require further study in the standard model. Quantitatively, the limiting term in the advantage, which implies security only if \(q_{\textsc {Ch}}< 2^{64}\), results from the birthday bound on the MAC output, though we note that we do not have a corresponding attack in this setting and thus the bound may not be tight.
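To make the quantitative statement concrete, the birthday term \(q_{\textsc {Ch}}\cdot (q_{\textsc {Ch}}- 1)/2^{128}\) from the bound above can be evaluated directly. The following back-of-the-envelope Python check (our own illustration, using exact integer arithmetic) shows that the term is negligible at \(2^{32}\) queries but vacuous at \(2^{64}\):

```python
from fractions import Fraction

def birthday_term(q: int) -> Fraction:
    # The q(q-1)/2^128 term of Corollary 1: a collision bound
    # over q queries for the 128-bit msg_key.
    return Fraction(q * (q - 1), 2 ** 128)

assert birthday_term(2 ** 32) < Fraction(1, 2 ** 63)  # about 2^-64: negligible
assert birthday_term(2 ** 64) > Fraction(99, 100)     # about 1: bound is vacuous
```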

Corollary 2

Let \(\textsf {session}\_\textsf {id}\in \{0,1\}^{64}\), \(\textsf {pb} \in {{\mathbb {N}}}\) and \(\textsf{bl}= 128\). Let \(\textsf{ME}= \textsf{MTP}\text {-}\textsf{ME}[\textsf {session}\_\textsf {id}, \textsf {pb}, \textsf{bl}]\), \(\textsf{MTP}\text {-}\textsf{HASH}\), \(\textsf{MTP}\text {-}\textsf{MAC}\), \(\textsf{MTP}\text {-}\textsf{KDF}\), \(\phi _{\textsf{MAC}}\), \(\phi _{\textsf{KDF}}\), \(\textsf{MTP}\text {-}\textsf{SE}\) be the primitives defined in Sect. 4.4. Let \(\textsf{CH}= \textsf{MTP}\text {-}\textsf{CH} [\textsf{ME}, \textsf{MTP}\text {-}\textsf{HASH}, \textsf{MTP}\text {-}\textsf{MAC}, \textsf{MTP}\text {-}\textsf{KDF}, \phi _{\textsf{MAC}}, \phi _{\textsf{KDF}}, \textsf{MTP}\text {-}\textsf{SE}]\). Let \(\phi _{\textsf{SHACAL}-\textsf{2}}\) be the related-key-deriving function defined in Fig. 29. Let \(\textsf{SHA}-\textsf{256}'\) be \(\textsf{SHA}-\textsf{256}\) with its output truncated to the middle 128 bits. Let \(\textsf{supp}=\textsf{supp}\text {-}\textsf{ord}\) be the support function as defined in Fig. 32. Let \(\mathcal {F}_{\textrm{INT}}\) be any adversary against the \(\textrm{INT}\)-security of \(\textsf{CH}\) with respect to \(\textsf{supp}\), making \(q_{\textsc {Send}}\le 2^{96}\) queries to its \(\textsc {Send}\) oracle. Then, we can build adversaries \(\mathcal {D}_{\textrm{OTPRF}}\), \(\mathcal {D}_{\textrm{LRKPRF}}\), \(\mathcal {F}_{\textrm{CR}}\) such that

$$\begin{aligned} \textsf{Adv}^{\textsf{int}}_{\textsf{CH}, \textsf{supp}}(\mathcal {F}_{\textrm{INT}})&\le 2 \cdot \Big (\textsf{Adv}^{\textsf{otprf}}_{\textsf{SHACAL}-\textsf{1}}(\mathcal {D}_{\textrm{OTPRF}}) \\&\quad + \textsf{Adv}^{\textsf{lrkprf}}_{\textsf{SHACAL}-\textsf{2}, \phi _{\textsf{KDF}}, \phi _{\textsf{SHACAL}-\textsf{2}}}(\mathcal {D}_{\textrm{LRKPRF}})\Big ) \\&\quad + \; \frac{q_{\textsc {Send}}}{2^{64}} + \textsf{Adv}^{\textsf{cr}}_{\textsf{SHA}-\textsf{256}'}(\mathcal {F}_{\textrm{CR}}). \end{aligned}$$

Corollary 2 follows from Theorem 2 together with Proposition 5, Proposition 6 and Proposition 11. The term \(\textsf{Adv}^{\textsf{eint}}_{\textsf{MTP}\text {-}\textsf{ME}, \textsf{supp}\text {-}\textsf{ord}}(\mathcal {F}_{\textrm{EINT}})\) from Theorem 2 resolves to 0 for adversaries making \(q_{\textsc {Send}}\le 2^{96}\) queries, according to Proposition 9. Qualitatively, Corollary 2 shows that the integrity of the MTProto-based channel likewise depends on \(\textsf{SHACAL}-\textsf{1}\) and \(\textsf{SHACAL}-\textsf{2}\) behaving as PRFs. Due to the way \(\textsf{MTP}\text {-}\textsf{MAC}\) is constructed, the result also depends on the collision resistance of truncated-output \(\textsf{SHA}-\textsf{256}\) (as discussed in Sect. 5.1). Quantitatively, the advantage again implies security only if \({q_{\textsc {Send}}} < {2^{64}}\). This bound follows from the fact that the first block of the payload contains a 64-bit constant \(\textsf {session}\_\textsf {id}\), which has to match upon decoding. If the MTProto message encoding scheme consistently checked more fields during decoding (especially in the first block), the bound could be improved.

6 Timing side-channel attack

Formal models and proofs such as the ones in the previous sections cannot by their nature capture all possible security guarantees of a real system, as we illustrate in this section. Going beyond the model, we present a timing side-channel attack against implementations of MTProto. The attack arises from MTProto’s reliance on an Encrypt & MAC construction, the malleability of IGE mode, and specific weaknesses in implementations. The attack proceeds in the spirit of [5]: move a target ciphertext block to a position where the underlying plaintext will be interpreted as a length field and use the resulting behaviour to learn some information. The attack is complicated by Telegram using IGE mode instead of the CBC mode analysed in [5]. We begin by describing a generic way to overcome this obstacle in Sect. 6.1. We describe the side channels found in the implementations of several Telegram clients in Sect. 6.2 and experimentally demonstrate the existence of a timing side channel in the desktop client in Sect. 6.3.

6.1 Manipulating IGE

Recall that in IGE mode, we have \(c_i = E_K(m_i \oplus c_{i-1}) \oplus m_{i-1}\) for \(i = 1, 2, \dots , t\) (see Sect. 2). Suppose we intercept an IGE ciphertext \(c\) consisting of t blocks (for any block cipher E): \(c_1~|~c_2~|~\dots ~|~c_t\) where \(|\) denotes a block boundary. Further, suppose we have a side channel that enables us to learn some bits of the second plaintext block during decryption. Fix a target block number \(i\) for which we are interested in learning a portion of \(m_i\) that is encrypted in \(c_i\). Additionally, assume we know the plaintext blocks \(m_1\) and \(m_{i-1}\).

We construct a ciphertext \(c_{1}~|~c^{\star }\) where \(c^{\star } :=c_i \oplus m_{i-1} \oplus m_1\). This is decrypted in IGE mode as follows:

$$\begin{aligned} m_1&= E_{K}^{-1}(c_1 \oplus IV _{m}) \oplus IV _c\\ m^{\star }&= E_{K}^{-1}(c^{\star } \oplus m_1) \oplus c_1 = E_{K}^{-1}(c_i \oplus m_{i-1}) \oplus c_1 \\&= m_i \oplus c_{i-1} \oplus c_1 \end{aligned}$$

Since we know \(c_{1}\) and \(c_{i-1}\), we can recover some bits of \(m_i\) if we can obtain the corresponding bits of the second plaintext block \(m^{\star }\) through the side-channel leak.
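The block manipulation above can be reproduced with any block cipher. The sketch below substitutes a small Feistel network for AES (the cipher is a placeholder for illustration, not Telegram’s primitive) and verifies that the second block of the mauled ciphertext decrypts to \(m_i \oplus c_{i-1} \oplus c_1\):

```python
import hashlib

BLOCK = 16

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key, half):
    return hashlib.sha256(key + half).digest()[:BLOCK // 2]

def enc_block(key, m):
    # 4-round Feistel network: a toy stand-in permutation, NOT AES
    l, r = m[:BLOCK // 2], m[BLOCK // 2:]
    for i in range(4):
        l, r = r, xor(l, _round(key + bytes([i]), r))
    return l + r

def dec_block(key, c):
    l, r = c[:BLOCK // 2], c[BLOCK // 2:]
    for i in reversed(range(4)):
        l, r = xor(r, _round(key + bytes([i]), l)), l
    return l + r

def ige_encrypt(key, iv_m, iv_c, blocks):
    # c_i = E_K(m_i XOR c_{i-1}) XOR m_{i-1}, with m_0 = IV_m, c_0 = IV_c
    out, prev_m, prev_c = [], iv_m, iv_c
    for m in blocks:
        c = xor(enc_block(key, xor(m, prev_c)), prev_m)
        out.append(c)
        prev_m, prev_c = m, c
    return out

def ige_decrypt(key, iv_m, iv_c, blocks):
    # m_i = E_K^{-1}(c_i XOR m_{i-1}) XOR c_{i-1}
    out, prev_m, prev_c = [], iv_m, iv_c
    for c in blocks:
        m = xor(dec_block(key, xor(c, prev_m)), prev_c)
        out.append(m)
        prev_m, prev_c = m, c
    return out

# attacker: knows m_1 and m_{i-1}, wants bits of the target block m_i
key, iv_m, iv_c = b"k" * BLOCK, b"\x01" * BLOCK, b"\x02" * BLOCK
msgs = [bytes([j]) * BLOCK for j in range(5)]
ctxt = ige_encrypt(key, iv_m, iv_c, msgs)
i = 3  # 1-based target block index, as in the text
c_star = xor(xor(ctxt[i - 1], msgs[i - 2]), msgs[0])       # c_i ^ m_{i-1} ^ m_1
m_star = ige_decrypt(key, iv_m, iv_c, [ctxt[0], c_star])[1]
# the second block of the mauled ciphertext is m_i ^ c_{i-1} ^ c_1
assert m_star == xor(xor(msgs[i - 1], ctxt[i - 2]), ctxt[0])
```

Since \(c_1\) and \(c_{i-1}\) are public, any bits of \(m^{\star}\) leaked during decryption directly translate into bits of \(m_i\).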

To motivate our known plaintext assumption, consider a message where \(m_{i-1} =\) “Today’s password” and \(m_{i} = \) “is SECRET”. Here \(m_{i-1}\) is known, while learning bytes of \(m_{i}\) is valuable. On the other hand, the requirement of knowing \(m_{1}\) may not be easy to fulfil in MTProto. The first plaintext block of an MTProto payload always contains \(\textsf {server}\_\textsf {salt} ~\Vert ~\textsf {session}\_\textsf {id} \), both of which are random values. It is unclear whether they were intended to be secret, but in effect they are, limiting the applicability of this attack. Section 7 gives an attack to recover these values. Note that these values are the same for all ciphertexts within a single session, so if they were recovered, then we could carry out the attack on each of the ciphertexts in turn. This allows the basic attack above to be iterated when the target \(m_{i}\) is fixed across all the ciphertexts, e.g. in order to amplify the total information learned about \(m_i\) when a single ciphertext allows only partial or noisy information about it to be inferred (cf. [5]).

6.2 Leaky length field

The preceding attack assumes we have a side channel that enables us to learn a part of the second plaintext block during decryption. We now show how such side channels arise in implementations.

The msg_length field occupies the last four bytes of the second block of every MTProto cloud message plaintext (see Sect. 4.1). After decryption, the field is checked for validity in Telegram clients. Crucially, in several implementations this check is performed before the MAC check, i.e. before msg_key is recomputed from the decrypted plaintext. If either of those checks fails, the client closes the connection without outputting a specific error message. However, if an implementation is not constant time, an attacker who submits modified ciphertexts of the form described above may be able to distinguish between an error arising from validity checking of msg_length and a MAC error, and thus learn something about the bits of plaintext in the position of the msg_length field.
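The problematic ordering can be summarised in a short sketch. Everything below is illustrative, not client code: the field offset (last four bytes of the second block), the little-endian byte order, and the msg_key derivation (SHA-256 over a 32-byte key fragment and the plaintext, truncated to its middle 16 bytes) are assumptions modelled on MTProto’s general shape. The point is only that a failed length check skips the hashing entirely, so the two error paths take measurably different time.

```python
import hashlib

MAX_LEN = 2 ** 24  # mimics the desktop client's kMaxMessageLength

def unsafe_handle(decrypted: bytes, msg_key: bytes, key_frag: bytes) -> str:
    # ANTI-PATTERN: the length field is validated BEFORE the MAC, and a
    # failed length check returns without hashing anything
    msg_length = int.from_bytes(decrypted[28:32], "little")
    if msg_length > MAX_LEN:
        return "length-error"                      # fast path: no hash
    digest = hashlib.sha256(key_frag + decrypted).digest()[8:24]
    if digest != msg_key:
        return "mac-error"                         # slow path: full hash
    return "ok"

# demo: a 64-byte "plaintext" with the length field at offset 28
pt = bytes(28) + (100).to_bytes(4, "little") + bytes(32)
frag = b"f" * 32
good_key = hashlib.sha256(frag + pt).digest()[8:24]
```

In a safe implementation both error paths would perform the MAC recomputation (and return an indistinguishable error), as Telegram’s own guidelines require (see Sect. 6.2.4).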

Since different Telegram clients implement different checks on the msg_length field, we now proceed to a case-by-case analysis, showing relevant code excerpts in each case.

6.2.1 Android

The field msg_length is referred to as messageLength here. The check is performed in decryptServerResponse of Datacenter.cpp [68], which compares messageLength with another length field (see code below). If the messageLength check fails, the MAC check is still performed. The timing difference thus consists only of two conditional jumps, which would be small in practice. The length field is taken from the first four bytes of the transport protocol format and is not checked against the actual packet size, so an attacker can substitute arbitrary values. Using multiple queries with different length values could thus enable extraction of up to 32 bits of plaintext from the messageLength field.

figure n

6.2.2 Desktop

The method handleReceived of session_private.cpp [71] performs the length check, comparing the messageLength field with a fixed value of kMaxMessageLength \(=2^{24}\). When this check fails, the connection is closed and no MAC check is performed, providing a potentially large timing difference. Because of the fixed value \(2^{24}\), this check leaks whether the eight most significant bits of the 32-bit length field (and hence the corresponding bits of the target block \(m_i\)) are zero, an event that occurs with probability \(2^{-8}\), allowing those bits to be recovered after about \(2^8\) attempts on average.Footnote 31

figure o

6.2.3 iOS

The field msg_length is referred to as messageDataLength here. The check is performed in _decryptIncomingTransportData of MTProto.m [72], which compares messageDataLength with the length of the decrypted data first in a padding length check and then directly, see code below. If either check fails, it hashes the complete decrypted payload. A timing side channel arises because sometimes this countermeasure hashes fewer bytes than a genuine MAC check (the latter also hashes 32 bytes of auth_key, here effectiveAuthKey.authKey; hence one more 512-bit block will be hashed unless the length of the decrypted payload in bits modulo 512 is 184 or less,Footnote 32 this condition being due to padding). If an attacker can change the value of decryptedData.length directly or by attaching additional ciphertext blocks, this could leak up to 32 bits of plaintext as in the Android client.
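The condition in Footnote 32 can be checked by counting compression-function calls. The arithmetic below is our own illustration (not client code) and assumes AES-block-aligned payload lengths, as produced by decryption:

```python
def sha256_compression_calls(nbytes: int) -> int:
    # SHA-256 padding appends one 0x80 byte and an 8-byte length field,
    # then rounds up to a multiple of the 64-byte block size
    return (nbytes + 9 + 63) // 64

# the countermeasure hashes only the payload; a genuine msg_key-style
# check also hashes a 32-byte auth_key fragment in front of it
for n in range(0, 256, 16):          # AES-block-aligned payload lengths
    extra = sha256_compression_calls(32 + n) > sha256_compression_calls(n)
    # no extra block exactly when the payload length in bits mod 512 is
    # 184 or less (i.e. n mod 64 is at most 23 bytes)
    assert extra == (n % 64 * 8 > 184)
```

Whenever the extra block is hashed, the countermeasure runs measurably faster than a genuine MAC check, which is what creates the side channel.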

figure p

6.2.4 Discussion

Note that all three of the above implementations were in violation of Telegram’s own security guidelines [64] which state: “If an error is encountered before this check could be performed, the client must perform the msg_key check anyway before returning any result. Note that the response to any error encountered before the msg_key check must be the same as the response to a failed msg_key check.” In contrast, TDLib [66], the cross-platform library for building Telegram clients, avoids timing leaks by running the MAC check first.

Remark 1

Recall that in Sect. 4.4, we define a simplified message encoding scheme which uses a constant in place of session_id and server_salt . This change would make the above attack more practical. However, the attack is enabled by a misplaced msg_key check and the mitigation offered by those values being secret in the implementations is accidental. Put differently, the attacks described in this section do not justify their secrecy; our proofs of security do not rely on them being secret.

Fig. 51

Processing time of SessionPrivate::handleReceived in microseconds

6.3 Practical experiments

We ran experiments to verify whether the side channel present in the desktop client code is exploitable. We measured the time difference between processing a message with a wrong msg_length and processing a message with a correct msg_length but a wrong MAC. This was done using the Linux desktop client, modified to process messages generated on the client side without engaging the network. The code can be found in Appendix G.1. We collected data for \(10^8\) trials for each case under ideal conditions, i.e. with hyper-threading, Turbo Boost etc. disabled. After removing outliers, the difference in means was about 3 microseconds, see Fig. 51. This should be sufficiently large for a remote attacker to detect, even with network and other noise sources (cf. [6], where sub-microsecond timing differences were successfully resolved over a LAN).

7 Attacking the key exchange

Recall that our attack in Sect. 6 relies on knowledge of \(m_{1}\) which in MTProto contains a 64-bit salt and a 64-bit session ID. In Sect. 7.1, we present a strategy for recovering the 64-bit salt. We then use it in a simple guess and confirm approach to recover the session ID in Sect. 7.2.

We stress, however, that the attack in Sect. 7.1 only applies in a short period after a key exchange between a client and a server.Footnote 33 Furthermore, the attack critically relies on observing small timing differences, which is unrealistic in practice, especially over a wide area network. That is, our attack relies on a timing side channel when Telegram’s servers decrypt RSA ciphertexts and verify their integrity. While—in response to our disclosure—the Telegram developers confirmed the presence of non-constant-time code in that part of their implementation and hence confirmed our attack, they did not share source code or other details with us. Since Telegram, in contrast to its clients, does not publish source code for its servers, the only option to verify the precise server behaviour is to test it. This would entail sending millions if not billions of requests to Telegram’s servers, from a host that is geographically and topologically close to one of Telegram’s data centres, and observing the response times. Such an experiment would have been at the edge of our capabilities but is clearly feasible for a dedicated, well-resourced attacker.

In Sect. 7.3, we then discuss how the attack in Sect. 7.1 can be used to break server authentication, enabling an attacker-in-the-middle (MitM) attack on the Diffie-Hellman key exchange.

7.1 Recovering the salt

At a high level, our strategy exploits the fact that during the initial key exchange, Telegram integrity-protects RSA ciphertexts by including in the encrypted payload a hash of the underlying message contents, excluding the random padding. Verifying this hash necessitates parsing the data, which in turn establishes the potential for a timing side channel.Footnote 34 In what follows, we assume the presence of such a side channel and show how it enables the recovery of the encrypted message by solving noisy linear equations via lattice reduction. We refer the reader to [2, 45] for an introduction to the application of lattice reduction in side-channel attacks and for the state of the art, respectively.

In Fig. 52, we show Telegram’s instantiation of the Diffie-Hellman key exchange [73] at the level of detail required for our attack, omitting TL schema encoding. In Fig. 52, we let \(n :=\textsf {nonce} \), \(s :=\textsf {server}\_\textsf {nonce} \), \(n' :=\textsf {new}\_\textsf {nonce} \) be nonces; \(\mathcal {S}\) be the set of public server fingerprints, \(F \in \mathcal {S}\) be the fingerprint of the key selected by the client, \(t_{s} :=\textsf {server}\_\textsf {time} \) be a timestamp for the server; let \(\mathcal {F}(\cdot , \cdot )\) be some function used to derive keysFootnote 35; let \(p_{r}, p_{s}, p_{c}\) be random padding of appropriate length; and \(ak :=\textsf {auth}\_\textsf {key} \) be the final key. The value \(N = p \cdot q\) is a product of two 32-bit primes pq selected by the server and sent to the client as a rate-limiting challenge; the client can only proceed with the key exchange after factoring N. The initial salt used by Telegram is then computed as \(\textsf {server}\_\textsf {salt} :=n'[0:64] \oplus s[0:64]\). Since \(s\) is sent in the clear during the key exchange protocol, recovering the salt is equivalent to recovering \(n'[0:64]\). We let \(N',e\) denote the public RSA key (modulus and exponent) used to perform textbook RSA encryption by the client in the key exchange, and we let \(d\) denote the private RSA exponent used by the server to perform RSA decryption.Footnote 36 We assume \(N'\) has exactly 2048 bits which holds for the values used by Telegram.

Fig. 52
Fig. 52

Illustration of the MTProto 2.0 key exchange, where \(\textsf{IGE}= \mathsf {IGE[AES}-\mathsf {256]}\) and \(\textsf{RSA}\) is textbook RSA encryption. For clarity, we do not show the plaintext encoding and merely list the individual components of each plaintext input

Further, we have

$$\begin{aligned} h_{n'} :=\textsf{SHA}-\textsf{1}[{ n' \Vert \texttt{0x0}i \Vert \textsf{SHA}-\textsf{1}[ ak ][0:64]}][32:160] \end{aligned}$$

in Fig. 52 where \(i = 1\), 2 or 3 depending on whether the key exchange terminated successfullyFootnote 37 and \(h_{r}, h_{s}, h_{c}\) are \(\textsf{SHA}-\textsf{1}\) hashes over the corresponding payloads except for the padding \(p_{r}, p_{s}, p_{c}\). In particular, we have

$$ h_{r} :=\textsf{SHA}-\textsf{1}[{N, p, q, n, s, n'}]. $$

The critical observation in this section is that while \(n\), \(s\) and \(n'\) have fixed lengths of 128, 128 and 256 bits, respectively, the same is not true for \(N\), \(p\) and \(q\). This implies that the content to be fed to \(\textsf{SHA}-\textsf{1}\) after RSA decryption and during verification must first be parsed by the server. This opens up the possibility of a timing side channel. In particular, at a byte level \(\textsf{SHA}-\textsf{1}\) is called on

$$\begin{aligned} hd\ \Vert \ \mathcal {L}(N) \Vert N \Vert \mathcal {P}(N)\ \Vert \ \mathcal {L}(p) \Vert p \Vert \mathcal {P}(p)\ \Vert \ \mathcal {L}(q) \Vert q \Vert \mathcal {P}(q) \ \Vert \ n \Vert s \Vert n' \end{aligned}$$

where \(\mathcal {L}(x)\) encodes the length of \(x\) in one byte,Footnote 38 \(x\) is stored in big-endian byte order, and \(\mathcal {P}(x)\) is up to three zero bytes so that the length of \(\mathcal {L}(x)\Vert x\Vert \mathcal {P}(x)\) is divisible by 4; \(hd=\texttt{0xec5ac983}\).
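For concreteness, the \(\mathcal {L}(x)\Vert x\Vert \mathcal {P}(x)\) layout can be sketched as follows. This is our own illustrative encoder, following the byte-level description above rather than Telegram’s TL implementation; the toy values in the demo are arbitrary:

```python
def encode(x: int) -> bytes:
    # L(x): one length byte; x in big-endian byte order; P(x): up to
    # three zero bytes so that len(L(x) || x || P(x)) is divisible by 4
    body = x.to_bytes(max(1, (x.bit_length() + 7) // 8), "big")
    out = bytes([len(body)]) + body
    return out + b"\x00" * (-len(out) % 4)

HD = bytes.fromhex("ec5ac983")

def hashed_payload(N, p, q, n, s, n_prime):
    # the byte string fed to SHA-1: hd || enc(N) || enc(p) || enc(q) || n || s || n'
    return HD + encode(N) + encode(p) + encode(q) + n + s + n_prime

# demo with arbitrary toy values: 8-byte N, 4-byte p < q, fixed-length nonces
payload = hashed_payload(0x1234567890ABCDEF, 0xC0FFEE01, 0xDEADBEEF,
                         b"n" * 16, b"s" * 16, b"N" * 32)
```

Note that the variable-length \(\mathcal {L}(\cdot )\Vert \cdot \Vert \mathcal {P}(\cdot )\) fields are exactly what forces the server to parse the payload before hashing it.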

We verified the following behaviour of the Telegram server, where “checking” means the key exchange aborts if the payload deviates from the expectation.

  • The header \(hd = \texttt{0xec5ac983}\) is checked;

  • the server checks that \(1 \le \mathcal {L}(N) \le 16\) and \(\mathcal {L}(p) = \mathcal {L}(q) = 4\) (different valid encodings of valid values, e.g. obtained by prefixing zeroes, are not accepted);

  • the value of \(N\) is not checked, \(p,q\) are checked against the value of \(N\) stored on the server and the server checks that \(p<q\);

  • the contents of \(\mathcal {P}(\cdot )\) are not checked;

  • both \(n,s\) are checked.

While we do not know in what order the Telegram server performs these checks, we recall that the payload must be parsed before being integrity checked and that the number of bytes being fed to \(\textsf{SHA}-\textsf{1}\) depends on this parsing. This is because the random padding must be removed from the payload before calling \(\textsf{SHA}-\textsf{1}\).

Recall that the Telegram developers acknowledged the attack presented here but did not provide further details on their implementation. Therefore, below we will assume that the Telegram server code follows a similar pattern to Telegram’s flagship \(\textsf{TDLib}\) library, which is used e.g. to implement the Telegram Bot API [58]. While \(\textsf{TDLib}\) does not implement RSA decryption, it does implement message parsing during the handshake. In particular, the library returns early when the header does not match its expected value. In our case the header is \(\texttt{0xec5ac983}\), but we stress that this behaviour does not seem to be problematic in \(\textsf{TDLib}\) and we do not know whether the Telegram servers follow the same pattern for RSA decryption. We will discuss other leakage patterns below, but for now we will assume the Telegram servers return early whenever there is a header mismatch, skipping the \(\textsf{SHA}-\textsf{1}\) call in this case. This produces a timing side channel.

Thus, we consider a textbook RSA ciphertext \(c = m^{e} \bmod N'\) with

$$\begin{aligned} m = h_{r} \Vert hd\Vert \mathcal {L}(N) \Vert N \Vert \mathcal {P}(N)\Vert \mathcal {L}(p) \Vert p \Vert \mathcal {P}(p) \Vert \mathcal {L}(q) \Vert q \Vert \mathcal {P}(q) \Vert n \Vert s \Vert n' \Vert p_{r} \end{aligned}$$

of length \(255\) bytes. First, observe that an attacker knows all contents of the payload (including their encodings) except for \(h_{r}\), \(n'\) and \(p_{r}\) and we can write:

$$\begin{aligned} x&= 2^{\ell (p_{r})} \cdot n' + p_{r} < 2^{256 + \ell (p_{r})}\\ m&= (2^{1880} \cdot h_{r} + 2^{256 + \ell (p_{r})} \cdot \gamma + x) \end{aligned}$$

where \(\gamma \) is a known constant derived from \(n,s,p,q,N\) and where \(\ell (p_{r})\) is the known length of \(p_{r}\). This relies on knowing that \(\left| n'\right| =256\) and \(\left| m\right| - \left| h_{r}\right| = 1880\).

Under our assumption on header checking, we can detect whether the bits in positions \(1848 = 8\cdot 255-160-32\) to \(1879 = 8\cdot 255 -160-1\) (big endian, \(\textsf{SHA}-\textsf{1}\) returns 160 bits) of \(m' :={(c')}^{d}\) match \(\texttt{0xec5ac983}\) for any \(c'\) we submit to the Telegram servers. Thus, inspired by [19], we submit \(s_{i}^{e} \cdot c\), for several chosen \(s_{i}\) to the server and receive back an answer whether the bits \({1848}\) to \({1879}\) of \(s_{i} \cdot m\) match the expected header. If the \(s_{i}\) are chosen sufficiently randomly, this event will have probability \(\approx 2^{-32}\). Writing \(\zeta = \texttt{0xec5ac983}\), we consider

$$\begin{aligned} e_{i}&= \left( \left( {s_{i} \cdot m \bmod N'} \right) - \zeta \cdot 2^{1848}\right) \bmod 2^{1880}\\&= \left( \left( s_{i} \cdot \left( {2^{1880} \cdot h_{r} + 2^{256 + \ell (p_{r})} \cdot \gamma + x}\right) \bmod N'\right) - \zeta \cdot 2^{1848}\right) \bmod 2^{1880}\\&= \left( \left( \left( s_{i} \cdot 2^{1880} \cdot h_{r} + s_{i} \cdot 2^{256 + \ell (p_{r})} \cdot \gamma + s_{i} \cdot x\right) \bmod N' \right) - \zeta \cdot 2^{1848}\right) \\&\bmod 2^{1880}. \end{aligned}$$

That is, we pick random \(s_{i}\) (we will discuss how to pick those below) and submit \(s_{i}^{e} \cdot c\) to the Telegram servers. Using the timing side channel, we then detect when the bits in the header position match \(\zeta \). When this happens, we store \(s_{i}\). Overall, we find \(\mu \) such \(s_{i}\) (we discuss below how to pick \(\mu \)) and suppose the event happens for some set of \(s_i\), with \(i=0,\ldots ,\mu -1\).
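The collection phase can be simulated end to end with toy parameters. Everything below is scaled down and assumed (a small modulus standing in for the 2048-bit \(N'\), an 8-bit header instead of the 32-bit \(\texttt{0xec5ac983}\), and an idealised early-return oracle), but the multiplicative relation \(\left(s_i^{e} \cdot c\right)^{d} \equiv s_i \cdot m \pmod {N'}\) driving the attack is the same:

```python
import random

# toy stand-ins: textbook RSA with small primes, 8-bit header
P, Q, E = 1000003, 1000033, 65537
N1 = P * Q                                   # stands in for N'
D = pow(E, -1, (P - 1) * (Q - 1))
HDR, HDR_POS = 0xEC, 24                      # header value and bit offset

def header_oracle(c: int) -> bool:
    # models the assumed server behaviour: decrypt, then return early
    # (fast) unless the header bits match, leaking one bit per query
    m = pow(c, D, N1)
    return (m >> HDR_POS) & 0xFF == HDR

secret_m = random.randrange(1, N1)           # the RSA plaintext m
c = pow(secret_m, E, N1)

# attacker: blind c with random s and keep every s that triggers a match
hits = [s for s in (random.randrange(2, N1) for _ in range(5000))
        if header_oracle(pow(s, E, N1) * c % N1)]

# each hit pins the header bits of s * m mod N'; with the real 32-bit
# header, six such hits feed the lattice step that recovers h_r
for s in hits:
    assert (s * secret_m % N1) >> HDR_POS & 0xFF == HDR
```

With the toy 8-bit header each query succeeds with probability \(2^{-8}\), so the loop above collects roughly 20 usable \(s_i\); with the real 32-bit header the same collection costs \(\approx 2^{32}\) queries per hit, matching the query counts in the text.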

Recovering \(h_r\). Note that \(e_{i} < 2^{1880-32}\) by construction and \(x < 2^{256 + \ell (p_{r})} \ll 2^{1848}\). Thus, picking sufficiently small \(s_{i}\) an attacker can make \(e'_{i} :=(e_{i} - s_{i} \cdot x) \bmod 2^{1880} < 2^{1848}\), i.e.

$$\begin{aligned} e'_{i}&= \left( \left( \left( s_{i} \cdot 2^{1880} \cdot h_{r} + s_{i} \cdot 2^{256 + \ell (p_{r})} \cdot \gamma \right) \bmod N' \right) - \zeta \cdot 2^{1848}\right) \\&\bmod 2^{1880} < 2^{1848}. \end{aligned}$$

We rewrite \(e'_{i}\) as

$$\begin{aligned} e'_{i}&= \left( s_{i} \cdot 2^{1880} \cdot h_{r} + s_{i} \cdot 2^{256 + \ell (p_{r})} \cdot \gamma - \zeta \cdot 2^{1848} - \sigma _{i}\cdot 2^{1880}\right) \bmod N' \end{aligned}$$

for \(\sigma _{i} < 2^{160}\) and use lattice reduction to recover \(h_{r}\). Writing

$$ t_{i} = \left( s_{i} \cdot 2^{256 + \ell (p_{r})} \cdot \gamma - \zeta \cdot 2^{1848}\right) \bmod N', $$

we consider the lattice spanned by the rows of \(L_{1}\) with

$$\begin{aligned} L_{1} :=\begin{pmatrix} 2^{1688} & 0 & 0 & 0 & 2^{1880} \cdot s_{0} & \cdots & 2^{1880} \cdot s_{\mu -1} & 0\\ 0 & 2^{1688}& 0 & 0 & 2^{1880} & \cdots & 0 & 0\\ 0 & 0 & \ddots & 0 & 0 & \ddots & 0 & 0\\ 0 & 0 & 0 & 2^{1688}& 0 & \cdots & 2^{1880} & 0\\ 0 & 0 & 0 & 0 & N' & \cdots & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & \ddots & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & \cdots & N' & 0\\ 0 & 0 & 0 & 0 & t_{0} & \cdots & t_{\mu -1} & 2^{1848}\\ \end{pmatrix}. \end{aligned}$$

Multiplying \(L_{1}\) from the left by

$$ (h_{r},\ -\sigma _{0},\ \ldots ,\ -\sigma _{\mu -1},\ *, \ldots , *, 1) $$

where \(*\) stands for modular reduction by \(N'\), shows that this lattice contains a vector

$$\begin{aligned} (2^{1688}\cdot h_{r},\ -2^{1688}\, \sigma _{0},\ \ldots ,\ -2^{1688}\, \sigma _{\mu -1},\ e'_{0}, \ldots ,\ e'_{\mu -1},\ 2^{1848}) \end{aligned}$$
(1)

where all entries are bounded by \(2^{1848} = 2^{1688 + 160}\). Thus that vector has Euclidean norm \(\le \sqrt{2\,\mu +2} \cdot 2^{1848}\).Footnote 39 On the other hand, the Gaussian heuristic predicts the shortest vector in the lattice to have norm

$$\begin{aligned} \approx \sqrt{\frac{2\,\mu +2}{2\pi \, e}} \cdot {\left( {2^{1688 \cdot (\mu +1)} \cdot {(N')}^{\mu } \cdot 2^{1848}}\right) }^{1/(2\,\mu +2)}. \end{aligned}$$
(2)

Finding a shortest vector in the lattice spanned by the rows of \(L_{1}\) is expected to recover our target vector and thus \(h_{r}\) when the norm of expression (1) is smaller than expression (2), which is satisfied for \(\mu =6\).

We experimentally verified that LLL on a \((2\cdot 6 + 2)\)-dimensional lattice constructed as \(L_{1}\) indeed succeeds (cf. Appendix G.2). Thus, under our assumptions, recovering \(h_{r}\) requires about \(6 \cdot 2^{32}\) queries to Telegram’s servers and a trivial amount of computation.
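The norm comparison itself can be sanity-checked numerically. The sketch below works with base-2 logarithms (taking \(\log _2 N' \approx 2048\)) to avoid 1800-bit integers:

```python
import math

def target_log2(mu: int) -> float:
    # log2 of sqrt(2*mu + 2) * 2^1848, the bound on the target vector (1)
    return 0.5 * math.log2(2 * mu + 2) + 1848

def gaussian_log2(mu: int, logN: float = 2048) -> float:
    # log2 of expression (2): sqrt(d / (2*pi*e)) * det(L_1)^(1/d), where
    # d = 2*mu + 2 and det(L_1) = 2^(1688*(mu+1)) * N'^mu * 2^1848
    d = 2 * mu + 2
    det_log = 1688 * (mu + 1) + logN * mu + 1848
    return 0.5 * math.log2(d / (2 * math.pi * math.e)) + det_log / d

# for mu = 6 the target vector is expected to be the shortest vector
assert target_log2(6) < gaussian_log2(6)
```

For \(\mu = 6\) the target norm is around \(2^{1850}\) while the Gaussian heuristic predicts shortest vectors of norm around \(2^{1854}\), so lattice reduction is expected to surface the target.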

Recovering \(n'\). Once we have recovered \(h_{r}\), we can target \(n'\). Writing \(\gamma ' = 2^{1880-256-\ell (p_{r})} \cdot h_{r} + \gamma \), we obtain

$$\begin{aligned} d_{i} =&\left( \left( {s'_{i} \cdot m \bmod N'} \right) - \zeta \cdot 2^{1848}\right) \bmod 2^{1880}\\ =&\left( \left( s'_{i} \cdot \left( {2^{256 + \ell (p_{r})} \cdot \gamma ' + x}\right) \bmod N'\right) - \zeta \cdot 2^{1848}\right) \bmod 2^{1880}\\ =&\left( \left( \left( s'_{i} \cdot 2^{256 + \ell (p_{r})} \cdot \gamma ' + s'_{i} \cdot x \right) \bmod N'\right) - \zeta \cdot 2^{1848}\right) \bmod 2^{1880}\\ =&\left( \left( \left( s'_{i} \cdot 2^{256 + \ell (p_{r})} \cdot \gamma ' + s'_{i} \cdot (2^{\ell (p_{r})} \cdot n' + p_{r})\right) \bmod N'\right) - \zeta \cdot 2^{1848}\right) \bmod 2^{1880} \end{aligned}$$

where the \(s_{i}'\) are again chosen randomly and we collect \(s_{i}'\) for \(i = 0,\ldots , \mu '-1\) where the bits in the header position match \(\zeta \). We discuss how to choose \(s_{i}'\) and \(\mu '\) below. Thus, we assume that \(d_{i} < 2^{1848}\) for \(s_{i}'\). Information theoretically, each such inequality leaks 32 bits. Considering that \(x = 2^{\ell (p_{r})} n' + p_{r}\) has \(256 + \ell (p_{r})\) bits, we thus require at least \((256 + \ell (p_{r}))/32\) such inequalities to recover \(x\).Footnote 40 Yet, \(\ell (p_{r}) \gg 256\) and the content of \(p_{r}\) is of no interest to us, i.e. we seek to recover \(n'\) without “wasting entropy” on \(p_{r}\).Footnote 41 In other words, we wish to pick \(s'_{i}\) sufficiently large so that all bits of \(s'_{i} \cdot 2^{\ell (p_{r})} \cdot n'\) affect the 32 bits starting at \(2^{1848}\) but sufficiently small to still allow us to consider “most of” \(s'_{i} \cdot p_{r}\) as part of the lower-order bit noise. Thus, we pick random \(s'_{i} \approx 2^{1848-\ell (p_{r})}\) and consider \(d'_{i} :=d_{i} - s'_{i} \cdot p_{r}\) with

$$\begin{aligned} d'_{i}&= \left( \left( \left( s'_{i} \cdot 2^{256 + \ell (p_{r})} \cdot \gamma ' + s'_{i} \cdot 2^{\ell (p_{r})} \cdot n'\right) \bmod N'\right) - \zeta \cdot 2^{1848}\right) \\&\bmod 2^{1880}\\&= \left( s'_{i} \cdot 2^{256 + \ell (p_{r})} \cdot \gamma ' + s'_{i} \cdot 2^{\ell (p_{r})} \cdot n' - \zeta \cdot 2^{1848} - \sigma '_{i} \cdot 2^{1880}\right) \bmod N'. \end{aligned}$$

Writing

$$ t'_{i} = \left( s'_{i} \cdot 2^{256 + \ell (p_{r})} \cdot \gamma ' - \zeta \cdot 2^{1848}\right) \bmod N', $$

we consider the lattice spanned by the rows of \(L_{2}\) with

$$\begin{aligned} L_{2} :=\begin{pmatrix} 2^{1592} & 0 & 0 & 0 & 2^{\ell (p_{r})} \cdot s'_{0} & \cdots & 2^{\ell (p_{r})} \cdot s'_{\mu '-1} & 0\\ 0 & 2^{1688}& 0 & 0 & 2^{1880} & \cdots & 0 & 0\\ 0 & 0 & \ddots & 0 & 0 & \ddots & 0 & 0\\ 0 & 0 & 0 & 2^{1688}& 0 & \cdots & 2^{1880} & 0\\ 0 & 0 & 0 & 0 & N' & \cdots & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & \ddots & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & \cdots & N' & 0\\ 0 & 0 & 0 & 0 & t'_{0} & \cdots & t'_{\mu '-1} & 2^{1848}\\ \end{pmatrix}. \end{aligned}$$

As before, multiplying \(L_{2}\) from the left by

$$ (n',\ -\sigma '_{0},\ \ldots ,\ -\sigma '_{\mu '-1},\ *, \ldots , *, 1) $$

shows that this lattice contains a vector

$$\begin{aligned} (2^{1592}\cdot n',\ -2^{1688}\, \sigma '_{0},\ \ldots ,\ -2^{1688}\, \sigma '_{\mu '-1},\ d'_{0}, \ldots ,\ d'_{\mu '-1},\ 2^{1848}) \end{aligned}$$

where all entries are \(\approx 2^{1848}\); the vector thus has Euclidean norm \(\approx \sqrt{2\,\mu '+2} \cdot 2^{1848}\). We write “\(\approx \)” instead of “\(\le \)” because \(s'_{i} \cdot p_{r}\) may overflow \(2^{1848}\). Picking \(\mu ' = 256/32 + 1 = 9\) gives an instance where the target vector is expected to be shorter than the Gaussian heuristic predicts. However, due to our choice of \(s'_{i}\), finding a shortest vector might not recover \(n'\) exactly but only the top \(256-\varepsilon \) bits for some small \(\varepsilon \). We verified this behaviour with our proof-of-concept implementation, which consistently recovers all but \(\varepsilon \approx 4\) bits. To recover the remaining bits, we simply perform exhaustive search by computing \(\textsf{SHA}-\textsf{1}(N,p,q,n,s,n' + \varDelta n')\) for all candidates for \(\varDelta n'\) and comparing against \(h_{r}\). Overall, under our assumptions, using \(\approx (6+9) \cdot 2^{32}\) noise-free queries and a trivial amount of computation we can recover \(n'\) from Telegram’s key exchange. This in turn allows the initial salt to be computed. Of course, timing side channels are noisy, suggesting that a significantly larger number of queries would be needed to recover sufficiently clean signals for the lattice reduction stage.

Extension to other leakage patterns. Our approach can be adapted to check other leakage patterns, e.g. targeting the values in the \(\mathcal {L}(\cdot )\) fields. For example, recall that the Telegram servers require \(1 \le \mathcal {L}(N) \le 16\). We do not know what the servers do when this condition is violated, but discuss possible behaviours:

  • Assume the code terminates early, skipping the \(\textsf{SHA}-\textsf{1}\) call. This would result in a timing side channel leaking that the three most significant bits of \(\mathcal {L}(N)\) are zero when the \(\textsf{SHA}-\textsf{1}\) call is triggered.

  • Assume the code does not terminate early but the Telegram servers feed between 88 and 104 bytes to \(\textsf{SHA}-\textsf{1}\). This would not produce a timing leak. That is, \(\textsf{SHA}-\textsf{1}\) hashes data in blocks, with its running time depending on the number of blocks processed. It has a block size of 64 bytes, and its padding algorithm (see algorithm \(\textsf{SHA}-\textsf{pad}\) in Sect. 2.2) insists on adding at least 8 bytes of length and 1 byte of padding. Thus payloads of up to 55 bytes are hashed as one block, up to 119 bytes as two, up to 183 as three, and up to 247 as four; cf. [6, 44] for works exploiting this. Telegram’s format checking restricts accepted lengths to between 88 and 104 bytes, i.e. all valid payloads lead to calls to the \(\textsf{SHA}-\textsf{1}\) compression function on two blocks.

  • Assume the code performs a dummy \(\textsf{SHA}-\textsf{1}\) call on all data received, say, minus the received digest. This would lead to calls to the \(\textsf{SHA}-\textsf{1}\) compression function on three blocks and a timing side channel leaking the three most significant bits of \(\mathcal {L}(N)\), by distinguishing between \(\mathcal {L}(N) > 16\) and \(\mathcal {L}(N) \le 16\).

Now, suppose Telegram’s servers do leak whether the three most significant bits of \(\mathcal {L}(N)\) are zero without first checking the header. On the one hand, this would reduce the query complexity because the target event is now expected to happen with probability \(2^{-3}\). On the other hand, this increases the cost of lattice reduction, as we now need to find shortest vectors in lattices of larger dimension. Information theoretically, we need at least \(\lceil 160/3 \rceil = 54\) samples to recover \(h_{r}\) and thus need to consider finding shortest vectors in a lattice of dimension 110, which is feasible [2]. For \(n'\), we can use the same tactic as above for “slicing up” \(x\) into \(n'\) and \(p_{r}\) to slice up \(n'\) into sufficiently small chunks. Alternatively, noting that we only need to recover 64 bits of \(n'\), we can simply consider a lattice of dimension \(\approx 45\), where finding shortest vectors is easy.

7.2 Recovering the session ID

Given the salt, we can recover the session ID using a simple guess and verify approach exploiting the same timing side channel as in Sect. 6. Here, we simply run our attack from Sect. 6 but this time we use a known plaintext block \(m_i\) in order to validate our guesses about the value of \(m_1\) (which is now partially unknown). That is, for all \(2^{64}\) choices of the session ID, and given the recovered salt value, we can construct a candidate for \(m_1\). Then for known \(m_{i-1}, m_{i}\), we construct \(c_{1}~|~c^{\star }\) as before, with \(c^{\star } = m_{i-1} \oplus c_i \oplus m_1\). If our guess for the session ID was correct, then decrypting \(c_{1}~|~c^{\star }\) results in a plaintext having a second block of the form:

$$ m^{\star } = E_{K}^{-1}(c^{\star } \oplus m_1) \oplus c_1 = E_{K}^{-1}(m_{i-1} \oplus c_i) \oplus c_1 = m_i \oplus c_{i-1} \oplus c_1. $$

We can then check if the observed behaviour on processing the ciphertext is consistent with the known value \(m_i \oplus c_{i-1} \oplus c_1\). If our choice of the session ID (and therefore \(m_{1}\)) is correct, this will always be the case. If our guess is incorrect then \(m^{\star }\) can be assumed to be uniformly random.

In more detail, assume our timing side channel leaks 32 bits of plaintext from the length field check. Let \(m_{i}^{{(j)}}\) and \(c_{i}^{(j)}\) be the \(i\)-th block in the \(j\)-th plaintext and ciphertext, respectively. Collect three plaintext-ciphertext pairs such that

$$ m_i^{{(j)}} \oplus c_{i-1}^{(j)} \oplus c_1^{(j)},\ (0 \le j < 3) $$

passes the length check.Footnote 42 For each guess of the session ID submit three ciphertexts containing \(c^{\star ,(j)} = m_{i-1}^{(j)} \oplus c_i^{(j)} \oplus m_1^{(j)}\) as the second block. If our guess for \(m_{1}\) was correct then all three will pass the length check which is leaked to us by the timing side channel. If our guess for \(m_1\) was incorrect then \(E_{K}^{-1}(c^{\star ,(j)} \oplus m_1) \) will output a random block, i.e. such that \(E_{K}^{-1}(c^{\star ,(j)} \oplus m_1) \oplus c_1\) passes the length check with probability \(2^{-32}\). Thus, all three length checks will pass with probability \(2^{-96}\). In other words, the probability of a false positive is upper-bounded by \(2^{64} \cdot 2^{-96} = 2^{-32}\) (i.e. in the worst case we will check and discard \(2^{64} - 1\) possible values of session ID before finding the correct one).
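The false-positive bound can be confirmed with exact arithmetic. The sketch below (our own back-of-the-envelope computation, using a union bound over all guesses) also makes explicit that fewer than three checks would not suffice:

```python
from fractions import Fraction

LEAK_BITS = 32          # bits leaked per length check
GUESSES = 2 ** 64       # candidate session IDs

def false_positive(checks: int) -> Fraction:
    # a wrong guess passes each 32-bit length check with probability
    # 2^-32; union bound over all 2^64 candidate session IDs
    return GUESSES * Fraction(1, 2 ** (LEAK_BITS * checks))

assert false_positive(3) == Fraction(1, 2 ** 32)   # three checks suffice
assert false_positive(2) == 1                      # two give a vacuous bound
```

With three checks the probability that any wrong session ID survives is at most \(2^{-32}\), matching the bound in the text.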

7.3 Breaking server authentication

Recall from Fig. 52 that the \(\textsf{key}, \textsf{iv} \) pair used to encrypt \(g^{a}\) and \(g^{b}\) is derived from \(s\) (sent in the clear) and \(n'\). Since the attack in Sect. 7.1 recovers \(n'\), it can be immediately extended into an attacker-in-the-middle (MitM) attack on the Diffie–Hellman key exchange. That is, knowing \(n'\) the attacker can compose the appropriate IGE ciphertext containing some \(g^{a'}\) of its choice where it knows \(a'\) (and similarly replace \(g^{b}\) coming from the client with \(g^{b'}\) for some \(b'\) it knows). Both client and server will thus complete their respective key exchanges with the adversary rather than each other, allowing the adversary to break confidentiality and integrity of their communication. However, even in the presence of the side channel that enabled the attack in Sect. 7.1, the MitM attack is more complicated due to the need to complete it before the session between client and server times out. This may be feasible under some of the alternative leakage patterns discussed earlier but is unlikely to be realistic when \(> 2^{32}\) requests are required to recover \(n'\).

8 Discussion

The central result of this work is a proof that the use of symmetric encryption in Telegram’s MTProto 2.0 can provide the basic security expected from a bidirectional channel if small modifications are made. The Telegram developers have indicated that they implemented most of these changes. Thus, our work can give some assurance to those reliant on Telegram providing confidential and integrity-protected cloud chats—at a comparable level to chat protocols that run over TLS’s record protocol. However, our work comes with a host of caveats.

Attacks. Our work also presents attacks against the symmetric encryption in Telegram. These highlight the gap between the variant of MTProto 2.0 that we specify and Telegram’s implementations. While the reordering attack in Sect. 4.2 and the attack on re-encryption in Sect. 4.2 were possible against implementations that we studied, they can easily be avoided without making changes to the on-the-wire format of MTProto, i.e. by only changing processing in clients and servers. After disclosing our findings, Telegram informed us that they have changed this processing accordingly.

Our attacks in Sect. 6 are attacks on the implementation. As such, they can be considered outside the model: our model only shows that there can be secure instantiations of MTProto but does not cover the actual implementations; in particular, we do not model timing differences. That said, protocol design has a significant impact on the ease with which secure implementations can be achieved. Here, the decision in MTProto to adopt Encrypt & MAC results in the potential for a leak that we can exploit in specific implementations. This “brittleness” of MTProto is of particular relevance due to the surfeit of implementations of the protocol, and the fact that security advice may not be heeded by all authors, as we showed with our \(\textsf{msg\_length}\) attack in Sect. 6. Here Telegram’s apparent ambition to provide TDLib as a one-stop solution for clients across platforms will allow security researchers to focus their efforts. We thus recommend that Telegram replaces the low-level cryptographic processing in all official clients with a carefully vetted library.
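The structural difference that makes Encrypt & MAC brittle can be illustrated with a toy construction. The sketch below is hypothetical and deliberately simplified (a SHA-256-based toy stream cipher with HMAC tags, not MTProto's IGE/\(\textsf{msg\_key}\) construction): it shows that Encrypt-then-MAC can reject a forgery before any decryption, whereas a MAC computed over the plaintext forces the receiver to decrypt first, exposing plaintext-dependent processing such as length checks to an attacker.

```python
# Hedged, simplified illustration of Encrypt-then-MAC vs Encrypt & MAC.
# Toy XOR "cipher" and HMAC-SHA256; hypothetical, not MTProto's scheme.
import hmac
import hashlib

def xor_stream(key: bytes, data: bytes) -> bytes:
    # Keystream from SHA-256 in counter mode; illustration only.
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + (i // 32).to_bytes(8, "big")).digest()
        out += bytes(x ^ y for x, y in zip(data[i:i + 32], block))
    return bytes(out)

def etm_decrypt(enc_key, mac_key, ct, tag):
    # Encrypt-then-MAC: authenticate the ciphertext FIRST.
    expected = hmac.new(mac_key, ct, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None  # forgery rejected without touching any plaintext
    return xor_stream(enc_key, ct)

def eam_decrypt(enc_key, mac_key, ct, tag):
    # Encrypt & MAC over the plaintext: must decrypt BEFORE verifying,
    # so any plaintext-dependent behaviour here (parsing, length
    # checks, early aborts) can leak through timing.
    pt = xor_stream(enc_key, ct)
    expected = hmac.new(mac_key, pt, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None
    return pt

ek, mk, msg = b"k" * 32, b"m" * 32, b"hello telegram"
ct = xor_stream(ek, msg)
assert etm_decrypt(ek, mk, ct, hmac.new(mk, ct, hashlib.sha256).digest()) == msg
assert eam_decrypt(ek, mk, ct, hmac.new(mk, msg, hashlib.sha256).digest()) == msg
```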

Note that the security of the Telegram ecosystem does not stop with official clients. As the recent work of [9] shows, many third-party client implementations are also vulnerable to attacks.

Tightness. On the other hand, our proofs are not necessarily tight. That is, our theorem statements contain terms bounding the advantage by \(\approx q/2^{64}\) where \(q\) is the number of queries sent by the adversary. Yet, we have no attacks matching these bounds (our attacks with complexity \(2^{64}\) are outside the model). Thus, it is possible that a refined analysis would yield tighter bounds.

Future work. Our attack in Sect. 7 is against the implementation of Telegram’s key exchange and is thus outside of our model for two reasons: as before, we do not consider timing side channels in our model and, critically, we only specify the symmetric part of MTProto. This highlights a second significant caveat for our results: large parts of Telegram’s design remain unstudied, including multi-user security, the key exchange, the higher-level message processing, secret chats, forward secrecy, control messages, bot APIs, CDNs, cloud storage and the Passport feature, to name but a few. These are pressing topics for future work.

Assumptions. In our proofs we are forced to rely on unstudied assumptions about the underlying primitives used in MTProto. In particular, we have to make related-key assumptions about the compression function of \(\textsf{SHA-256}\) which could be easily avoided by tweaking the use of these primitives in MTProto. In the meantime, these assumptions represent interesting targets for symmetric cryptography research. Similarly, the complexity of our proofs and assumptions largely derives from MTProto deploying hash functions in place of (domain-separated) PRFs such as HMAC. We recommend that Telegram either adopts well-studied primitives for future versions of MTProto to ease analysis and thus to increase confidence in the design, or adopts TLS.
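To make the recommendation concrete, domain-separated key derivation from a well-studied PRF could look as follows. This is a hedged sketch only: the labels, context values and layout are hypothetical and are not MTProto's actual derivation (which applies plain SHA-256 to key material); the counter-mode structure follows NIST SP 800-108.

```python
# Hedged sketch of domain-separated key derivation using HMAC-SHA256
# in counter mode (cf. NIST SP 800-108). Labels and context are
# hypothetical, not MTProto's actual derivation.
import hmac
import hashlib

def prf(key: bytes, label: bytes, context: bytes, length: int) -> bytes:
    # The label provides domain separation: different uses of the same
    # key yield independent-looking outputs under the PRF assumption
    # on HMAC, avoiding related-key assumptions on the hash itself.
    out = b""
    counter = 1
    while len(out) < length:
        data = counter.to_bytes(4, "big") + label + b"\x00" + context
        out += hmac.new(key, data, hashlib.sha256).digest()
        counter += 1
    return out[:length]

auth_key = b"\x01" * 32  # hypothetical long-term key material
msg_id = b"\x00" * 16    # hypothetical per-message context
enc_key = prf(auth_key, b"mtproto enc", msg_id, 32)
mac_key = prf(auth_key, b"mtproto mac", msg_id, 32)
assert enc_key != mac_key  # distinct labels give independent keys
```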

Telegram. While we prove security of the symmetric part of MTProto at a protocol level, we recall that by default users communicating via Telegram must trust the Telegram servers, i.e. end-to-end encryption is optional and not available for group chats. We thus, on the one hand, (a) recommend that Telegram open-sources the cryptographic processing on their servers and (b) recommend avoiding references to Telegram as an “encrypted messenger”, which—post-Snowden—has come to mean end-to-end encryption. On the other hand, discussions about end-to-end encryption aside, echoing [1, 25] we note that many higher-risk users do rely on MTProto and Telegram and shun Signal. This emphasises the need to study these technologies and how they serve those who rely on them.