multivariate -quantiles: a spatial approach

Multivariate ρ-Quantiles: a Spatial Approach∗

DIMITRI KONEN† and DAVY PAINDAVEINE‡

†,‡Universite libre de Bruxelles

ECARES and Departement de Mathematique

Avenue F.D. Roosevelt, 50

ECARES, CP114/04

B-1050, Brussels

Belgium

‡Universite Toulouse Capitole

Toulouse School of Economics

1, Esplanade de l’Universite

31080 Toulouse Cedex 06

France

By substituting an Lp loss function for the L1 loss function in the optimization problem

defining quantiles, one obtains Lp-quantiles that, as shown recently, dominate their classical

L1-counterparts in financial risk assessment. In this work, we introduce a concept of multivari-

ate Lp-quantiles generalizing the spatial (L1-)quantiles from Chaudhuri (1996). Rather than

restricting to power loss functions, we actually allow for a large class of convex loss functions ρ.

We carefully study existence and uniqueness of the resulting ρ-quantiles, both for a general prob-

ability measure over Rd and for a spherically symmetric one. Interestingly, the results crucially

depend on ρ and on the nature of the underlying probability measure. Building on an investiga-

tion of the differentiability properties of the objective function defining ρ-quantiles, we introduce

a companion concept of spatial ρ-depth, that generalizes the spatial depth from Vardi and Zhang

(2000). We study extreme ρ-quantiles and show in particular that extreme Lp-quantiles behave

in fundamentally different ways for p ≤ 2 and p > 2. Finally, we establish Bahadur represen-

tation results for sample ρ-quantiles and derive their asymptotic distributions. Throughout, we

impose only very mild assumptions on the underlying probability measure, and in particular we

never assume absolute continuity with respect to the Lebesgue measure.

MSC 2010 subject classifications: Primary 62G99, 62H99; secondary 62E20.

Keywords: Bahadur representation results, Convex objective functions, M-estimation, Multivari-

ate quantiles, Spatial depth, Spatial quantiles.

∗Research is supported by the Program of Concerted Research Actions (ARC) of the Universite

libre de Bruxelles and by an Aspirant fellowship from the FNRS (Fonds National pour la Recherche

Scientifique), Communaute Francaise de Belgique.

1

2 D. Konen and D. Paindaveine

1. Introduction

The concept of quantile, which is of paramount importance in statistics, has long been

limited to probability measures over R. Defining a suitable quantile concept in Rd is a

problem that is intrinsically difficult due to the lack of a canonical ordering in Rd, d > 1.

This has been an active research topic in the last decades; see, among many others, Hallin

et al. (2021), Hallin, Paindaveine and Siman (2010), Serfling (2002), and the references

therein. One of the most successful multivariate quantile concepts is the concept of spatial

(or geometric) quantiles from Chaudhuri (1996); for a probability measure P over Rd,the spatial quantile of order α in direction u is defined as the minimizer over Rd of the

map

µ 7→Mα,u(µ) =

∫Rd

{‖z − µ‖ − ‖z‖ − αu′µ

}dP (z), (1)

with α ∈ [0, 1) and u ∈ Sd−1 := {z ∈ Rd : ‖z‖2 := z′z = 1}. The success of spa-

tial quantiles is explained by several key distinctive properties, among which: spatial

quantiles are easy to compute, even for large d (Mukhopadhyaya and Chatterjee (2011),

Vardi and Zhang (2000)). The asymptotic behavior of their sample version is rather

standard (Chaudhuri (1996), Zhou and Serfling (2008)). They can easily be extended

into regression quantiles (Chakraborty (2003), Cheng and De Gooijer (2007)) or turned

into quantiles for functional data (Chakraborty and Chaudhuri (2014), Chowdhury and

Chaudhuri (2019), Serfling and Wijesuriya (2017)).

For d = 1, spatial quantiles, that minimize the L1-objective function in (1), reduce

to the usual univariate quantiles. In particular, the collection of intervals whose end-

points are the spatial quantiles of order α in direction u = −1 and u = 1 is a nested

family of interquantile intervals, that all contain the univariate median (which is ob-

tained with α = 0, irrespective of the direction u). Similarly, expectiles, an L2-analog of

quantiles introduced in Newey and Powell (1987), provide a nested family of centrality

intervals that all contain the mean of the distribution. Expectiles have met a big success,

particularly so in financial risk assessment, where they provide coherent risk measures;

see, e.g., Daouia, Girard and Stupfler (2018), Kuan, Yeh and Hsu (2009), Taylor (2008).

Quantiles and expectiles belong to the class of Lp-quantiles (associated with Lp loss func-

tions, with p ≥ 1), or, more generally, of M-quantiles (associated with general convex

loss functions); see Breckling and Chambers (1988), Chen (1996). Recently, there has

been a growing interest in such generalized quantiles, still with risk assessment as one of

the main applications; see, e.g., Daouia, Girard and Stupfler (2019), Gardes, Girard and

Stupfler (2020), or Usseglio-Carleve (2018).

The success of spatial quantiles in Rd and the growing interest in Lp-quantiles in the

univariate case d = 1 suggest to define a spatial concept of Lp-quantiles. For p = 2,

this has actually recently been done in Herrmann, Hofert and Mailhot (2018), but the

resulting spatial expectiles remain less well understood than their spatial L1-counterparts

from Chaudhuri (1996). To the best of our knowledge, a spatial Lp-quantile concept, or,

Multivariate ρ-Quantiles 3

more generally, a spatial M-quantile concept has not been investigated in the literature.

In this work, we define such a general concept and we thoroughly study its properties.

We adopt the following definition.

Definition 1. Let ρ : R+ → R+ be a convex function and P be a probability measure

over Rd. Fix α ∈ [0, 1) and u ∈ Sd−1. We say that µρα,u = µρα,u(P ) is a spatial ρ-quantile

of order α in direction u for P if and only if it minimizes the objective function

µ 7→Mρα,u(µ) :=

∫Rd

{Hρα,u(z − µ)−Hρ

α,u(z)}dP (z) (2)

over Rd, where we let

Hρα,u(z) := ρ(‖z‖)

(1 + α

u′z

‖z‖

)ξz,0, (3)

with ξz1,z2 := I[z1 6= z2] (throughout, I[A] is the indicator function of A).

It might have been natural to write µρv, with v = αu rather than µρα,u, to emphasize

the indexation of spatial ρ-quantiles on the unit ball, but we favour the notation µρα,uthat stresses the heterogenous roles α and u will play in the sequel. The multivariate Lp-

quantiles we consider in this work are obtained with ρ(t) = tp, p ≥ 1. Clearly, these

reduce for p = 1 to the minimizers of (1), that is, to the spatial quantiles from Chaudhuri

(1996). If P has finite second-order moments, then our Lp-quantiles reduce for p = 2 to

the expectiles introduced in Herrmann, Hofert and Mailhot (2018); as we will show,

however, our formulation above only requires that P has finite first-order moments. Note

that for α = 0, the ρ-quantile µρα,u, irrespective of u (the direction u does not play any

role for α = 0), is an M-functional of location, that, for p = 1 and p = 2, provides

the celebrated spatial median (Brown (1983)) and the mean vector of P , respectively.

The ρ-quantiles from Definition 1 extend this M-functional of location in the same way

the spatial quantiles from Chaudhuri (1996) extend the spatial median: they are thus

M-quantiles, in the sense of Breckling and Chambers (1988) or Koltchinski (1997) (it

should be noted that the only intersection between the M-quantiles in Definition 1 and

those from Koltchinski (1997) are the spatial quantiles from Chaudhuri (1996)).

Above, the motivation to consider Lp-quantiles was linked to their relevance for risk

assessment, and there is no doubt that spatial Lp-quantiles are natural tools to define

suitable risk measures in situations where multidimensional portfolios are considered.

The main focus in this work, however, is on a careful study of the probabilistic prop-

erties of these Lp-quantiles and, more generally, of the corresponding ρ-quantiles. Quite

remarkably, many of these properties crucially depend on the loss function ρ. We pro-

vide two examples. (i) Convexity of the objective function in (2) for any order α and

direction u is a key property for the study of ρ-quantiles and for their evaluation at

empirical distributions, and it may be expected that this convexity is inherited from the

convexity of ρ. Our results, however, will show that, in the class of Lp-quantiles, this is


the case if and only if p ≤ 2. For p > 2, we will show that convexity holds for α ≤ αponly, where, quite remarkably, αp is very close to one for any p but does not depend

on p monotonically. (ii) The spatial quantiles from Chaudhuri (1996) have recently been

criticised because they exit any compact set as α→ 1 even for a compactly supported

probability measure P ; see Girard and Stupfler (2017). As our results will show, this

behavior of extreme spatial quantiles is shared by Lp-quantiles with p ≤ 2, but not by

those with p > 2. While we discussed these results here for Lp-quantiles only, we will

throughout study properties of ρ-quantiles for a virtually arbitrary convex loss function ρ,

which will allow us to consider, e.g., exponential loss functions or the celebrated Huber

loss functions.

The outline of the paper is as follows. In Section 2, we provide the assumptions under

which the objective function Mρα,u(µ) in (2) is well-defined for any µ, and we discuss

existence of ρ-quantiles. In Section 3, we obtain a necessary and sufficient condition for

convexity of Mρα,u(µ), we characterize the orders α for which convexity fails when this

condition is not satisfied, and we exploit this to derive uniqueness results for ρ-quantiles.

In Section 4, we refine these convexity and uniqueness results in the particular case

for which the underlying probability measure is spherically symmetric. In Section 5, we

study first- and second-order differentiability of the objective function Mρα,u(µ), which

will play a key role in the subsequent sections. In Section 6, we exploit Robert Serfling’s

DOQR paradigm to define ρ-depth functions, ρ-outlyingness functions and ρ-rank func-

tions associated with our ρ-quantile functions. We also identify conditions under which

ρ-quantile functions are homeomorphisms from the open unit ball (quantiles are indexed

by (α, u) ∈ [0, 1)×Sd−1 or, equivalently, by αu in the open unit ball of Rd) to the whole

Euclidean space Rd. This will play a major role when studying in Section 7 the behavior

of extreme ρ-quantiles. In Section 8, we derive Bahadur representation results for sample

ρ-quantiles and deduce their asymptotic distribution. Finally, we briefly discuss some

perspectives for future research in Section 9. Proofs are collected in the Supplementary

Material (Konen and Paindaveine (2021)).

2. Existence

Throughout, we assume that the loss function ρ belongs to the class C of functions

from R+ to R+ that are convex, piecewise twice continuously differentiable on (0,∞),

and satisfy ρ(t) = 0 only for t = 0. Here, ρ is piecewise twice continuously differentiable

on (0,∞) means that either (i) there exist a nonnegative integer K and (0 =: t0 <)t1 <

t2 < . . . < tK < tK+1 := ∞ such that ρ is twice continuously differentiable on each

open interval (tk, tk+1), k = 0, . . . ,K, or (ii) there exists a monotone strictly increasing

sequence (t0 := 0, t1, t2, . . .) in R+ diverging to infinity such that ρ is twice continuously

differentiable on each open interval (tk, tk+1), k ∈ N. We let Dρ = (0,∞)\{t1, t2, . . . , tK}in case (i) and Dρ = (0,∞)\{t1, t2, . . .} in case (ii). Examples of loss functions in C are the


power loss functions ρ(t) = tp, with p ≥ 1, the exponential functions ρ(t) = exp(ct)− 1,

with c > 0, and the Huber loss functions ρ(t) = (t2/2)I[0 < t < c] + c(t− (c/2))I[t ≥ c],

with c > 0. For the Huber loss functions, Dρ = (0,∞) \ {c}, whereas Dρ = (0,∞) for

power and exponential loss functions.

For any ρ ∈ C, we denote as Pρd the class of probability measures P over Rd such that

for any µ ∈ Rd, there exists δ > 0 for which∫Rdψ−(‖z − µ‖+ δ) dP (z) <∞; (4)

throughout, ψ− and ψ+ will denote the left- and right-derivative of ρ, respectively (con-

vexity of ρ ensures existence of these one-sided derivatives). For the power loss func-

tion ρ(t) = tp, with p ≥ 1, P ∈ Pρd if and only if P has finite moments of order p−1 (that

is, E[‖Z‖p−1] < ∞, where Z is a random d-vector with distribution P ). For ρ(t) = t,

Pρd thus collects all probability measures on Rd, which is also the case for Huber loss

functions.

We then have the following existence result (see Section S.2 in Konen and Paindaveine

(2021) for a proof).

Theorem 2.1. Let ρ ∈ C and P ∈ Pρd . Fix α ∈ [0, 1] and u ∈ Sd−1. Then,

(i) Mρα,u(µ) is well-defined for any µ ∈ Rd; (ii) if α < 1, then P admits at least one

ρ-quantile of order α in direction u.

The existence result in Theorem 2.1(ii) is obtained by establishing that, for any α ∈[0, 1) and u ∈ Sd−1, the map µ 7→ Mρ

α,u(µ) is both coercive (in the sense that Mρα,u(µ)

diverges to infinity as ‖µ‖ does) and continuous on Rd. We require that ρ ∈ C in Theo-

rem 2.1 to avoid introducing many different collections of loss functions in the sequel, but

inspection of the proof reveals that the result actually holds without any differentiability

assumption on ρ.

Theorem 2.1(ii) shows that, for ρ(t) = tp, with p ≥ 1, ρ-quantiles exist for any α ∈ [0, 1)

and u ∈ Sd−1 as soon as P has finite moments of order p−1 (while subtracting Hρα,u(z) in

the integrand of (2) in principle has no impact on the corresponding quantile minimizers,

not doing so would guarantee existence of a minimizer only under the stronger condition

of finite moments of order p). In particular, taking p = 1 and p = 2, this shows that the

quantiles from Chaudhuri (1996) always exist, whereas their expectile analogs from Her-

rmann, Hofert and Mailhot (2018) only require that P has finite first-order moments (as

already mentionned, finite second-order moments are imposed in Herrmann, Hofert and

Mailhot (2018)). The quantiles associated with Huber loss functions also always exist for

any α ∈ [0, 1) and u ∈ Sd−1.

In Section 7 below, we will study extreme ρ-quantiles, that is, the ρ-quantiles indexed

by an order α that is arbitrarily close to one. As we will see, the behavior of such

quantiles crucially depends on the existence of the boundary ρ-quantiles indexed by an


order α = 1 (the term “boundary” results from the fact that Definition 1 imposes that α ∈[0, 1)). Our interest in such boundary quantiles explains why we will be investigating the

properties of the map Mρα,u(µ) also for α = 1, as we already did in Theorem 2.1(ii).

At this stage, we stress that Theorem 2.1(ii) remains silent about the existence of such

boundary ρ-quantiles, the reason being that coercivity of µ 7→Mρα,u(µ) may fail for α =

1. For ρ(t) = t and d ≥ 2, for instance, no such boundary quantiles exist when P

is non-atomic and is not supported on a line of Rd; see Girard and Stupfler (2017),

Proposition 2.1.

We conclude this section with the following orthogonal- and translation-equivariance

result, that will be particularly relevant in Section 4 when considering the particular case

for which P is spherically symmetric (the proof readily follows from Definition 1).

Proposition 2.1. Let ρ ∈ C and P ∈ Pdρ . Fix α ∈ [0, 1) and u ∈ Sd−1. Let O be a d×dorthogonal matrix, b be a d-vector, and denote as PO,b the distribution of OZ+ b when Z

has distribution P . Then, if µ is a ρ-quantile of P of order α in direction u, then Oµ+ b

is a ρ-quantile of PO,b of order α in direction Ou.

Spatial ρ-quantiles are not affine-equivariant, that is, they fail to be equivariant under

general affine transformations. However, they can be made affine-equivariant through a

transformation-retransformation approach; see, e.g., Serfling (2010) in the case of the

Chaudhuri (1996) spatial quantiles.

3. Convexity and uniqueness

The quantiles studied in this work are defined as minimizers of the map µ 7→ Mρα,u(µ)

in (2). Convexity of this map is a most desirable property, that is expected to play a

key role when investigating uniqueness of these quantiles and when evaluating them for

empirical probability measures. In this section, we therefore study under which conditions

on the loss function ρ ∈ C the map µ 7→Mρα,u(µ) is convex for any P ∈ Pρd .

First note that convexity of Hρα,u trivially implies convexity of Mρ

α,u, and that, if Hρα,u

is not convex, then there exists P ∈ Pρd for which Mρα,u fails to be convex (simply consider

a Dirac probability measure). Therefore, we may focus on studying convexity of Hρα,u.

Since any ρ ∈ C clearly makes Hρα,u convex for d = 1, we tacitly restrict throughout this

section to the case d ≥ 2. We start with the following preliminary result showing that

the larger α, the fewer the functions ρ making Hρα,u convex for any u ∈ Sd−1.

Lemma 3.1. For any α ∈ [0, 1], denote as Cα the collection of functions ρ ∈ C such

that Hρα,u is convex for any u ∈ Sd−1. Then, we have the following:

(i) C0 = C; (ii) if α1, α2 ∈ [0, 1] satisfy α1 < α2, then Cα2⊆ Cα1

.


This result suggests considering αρ := max{α ∈ [0, 1] : ρ ∈ Cα}, the largest value of α

for which ρ makes Hρα,u convex for any u ∈ Sd−1 (it is trivial to prove that the maximum

exists for any ρ ∈ C). Ideally, we would like to have that αρ = 1, as this would ensure

that Hρα,u is convex for any α ∈ [0, 1] and u ∈ Sd−1. The following result provides a

necessary and sufficient condition for αρ = 1.

Theorem 3.1. Let ρ ∈ C. Then, irrespective of d ≥ 2, αρ = 1 if and only if the map

t 7→ t2/ρ(t) is concave on (0,∞).

As a corollary, the power loss function ρ(t) = tp makes Hρα,u convex for any α ∈ [0, 1)

and any u ∈ Sd−1 if and only if p ∈ [1, 2]. For p > 2, it is then of interest to determine

the corresponding value of αρ(< 1). More generally, the following result allows one to

determine αρ for any loss function ρ that does not satisfy the necessary and sufficient

condition in Theorem 3.1.

Theorem 3.2. Let ρ ∈ C be such that the map t 7→ t2/ρ(t) is not concave on (0,∞).

Then, irrespective of d ≥ 2,

αρ = inft∈Dcv

ρ

√qt(4p2t − 4pt − qt)4(pt − 1)2(qt + 1)

< 1,

with

pt :=tψ−(t)

ρ(t)and qt :=

t2ψ′−(t)

tψ−(t)− ρ(t),

where we let Dcvρ := {t ∈ Dρ : (t2/ρ(t))′′ > 0} and where ψ′− is the left-derivative of ψ−

(in this result, ψ−(t) and ψ′−(t) are used only for t ∈ Dρ, so that we could write ψ−(t) =

ρ′(t) and ψ′−(t) = ρ′′(t) above).

For ρ(t) = tp with p > 2, one readily checks that Dcvρ = (0,∞) and

αρ :=

√p2(4p− 5)

4(p− 1)2(p+ 1)· (5)

Remarkably, αρ in (5) exhibits a non-monotonic pattern in p: for p ∈ [2, 5], it decreases

monotonically from one to its minimal value√

125/128 (slightly above .9882), then

increases monotonically to one again for p ∈ [5,∞); see the left panel of Figure 1.

For p > 2, it is thus only for most extreme quantile orders α that convexity fails.

This is even more the case for the exponential loss function ρ(t) = exp(ct) − 1, for

which Dcvρ = (3.0861/c,∞) and αρ = .9939. It can be shown that, if the loss function ρ

is such that t 7→ ρ(t)/t is convex, then αρ ≥√

2/3 ≈ .8165 (for the sake of completeness,

we prove this in Konen and Paindaveine (2021); see Corollary S.3.1). Like the power loss


2 4 6 8 10 12 14

0.985

0.990

0.995

1.000

p

α5=.9882

α2.429sph =.9987

αpsph

αp

5 10 15 20 25

0.993

0.994

0.995

0.996

0.997

0.998

0.999

1.000

t

αρsph=1

αρ=.9939

αρsph

αρ

αρ(t)

Figure 1. (Left:) For power loss functions ρ(t) = tp, plot of αρ (see (5)) and αsphρ (see (6)) (note that (5)

shows that αρ → 1 as p→∞). (Right:) For the loss function ρ(t) = exp(t)−1, plot of the quantity, αρ(t)say, of which the infimum is taken in Theorem 3.2; the blue line marks the resulting infimum αρ = .9939,

whereas the orange line stresses that αsphρ = 1 (see Section 4).

functions ρ(t) = tp, p ∈ (1, 2), the Huber loss functions provide a compromise between

the L1 and L2 loss functions, but since Theorem 3.2 entails that αρ = 0 for the Huber

loss functions, power loss functions clearly should be favoured in terms of convexity.

An important corollary of convexity is the following uniqueness result.

Theorem 3.3. Let ρ ∈ C and P ∈ Pdρ . Assume that there is no open interval in (0,∞)

on which ψ− is constant or that P is not concentrated on a line. Then, for any α ∈[0, αρ)∪{0} and u ∈ Sd−1 (union with {0} is needed when αρ = 0), the map µ 7→Mρ

α,u(µ)

is strictly convex on Rd, so that the ρ-quantile µρα,u is unique.

This covers the well-known result stating that, for ρ(t) = t, all ρ-quantiles are unique

provided that P is not supported on a line. Remarkably, Theorem 3.3 shows that this

structural constraint on P is not needed for the ρ-quantiles associated with other power

loss functions: under the corresponding moment assumptions, all ρ-quantiles are unique

for ρ(t) = tp, with p ∈ (1, 2], whereas, for p > 2, all ρ-quantiles with an order α that is

below (5) are unique. Similarly, for exponential loss functions, ρ-quantiles are unique for

any order α < .9939.


4. The spherical case

In this section, we consider the special case for which P (∈ Pρd ) is spherically symmetric

about some location µ0(∈ Rd), in the sense that, for any d-Borel set B and any d× d or-

thogonal matrix O, the P -probability of µ0+OB does not depend on O. Since ρ-quantiles

are translation-equivariant, we will actually restrict, without any loss of generality, to the

case µ0 = 0 (translation-equivariance here means that if µ is a ρ-quantile of P of order α

in direction u, then, for any h ∈ Rd, µ+ h is a ρ-quantile of Ph of order α in direction u,

where Ph is the distribution of Z + h when Z has distribution P ).

Note that it follows from Proposition 2.1 that, if P is spherically symmetric about

the origin of Rd and satisfies P [{0}] < 1, with d ≥ 2 say, then any quantile contour

{µα,u : u ∈ Sd−1}, with α ∈ [0, αρ)∪{0}, is a hypersphere (uniqueness of these quantiles

follows from Theorem 3.3 since P is then not supported on a line). Proposition 2.1 then

also implies that, for an arbitrary order α ∈ [0, 1) and any direction u ∈ Sd−1, the ρ-

quantiles of P of order α in direction u form a set that is invariant under all rotations

fixing u. In particular, if µρα,u is unique, then it belongs to the line spanned by u, which

is most natural. For α ≥ αρ, however, uniqueness is not guaranteed, so that it is unclear

whether or not quantiles meet this natural property in the spherical case. This motivates

the following result.

Theorem 4.1. Let ρ ∈ C and P ∈ Pρd . Assume that P is spherically symmetric about

the origin of Rd. Then, (i) for α = 0 and any u ∈ Sd−1, the unique ρ-quantile µρα,u is

the origin of Rd; (ii) for α ∈ (0, 1) and u ∈ Sd−1, any ρ-quantile µρα,u belongs to the

halfline {λu : λ ≥ 0}.

In case (ii), the origin of Rd may be a ρ-quantile of order α > 0 in direction u. Actually,

it can be shown that (a) for ρ(t) = t, the origin is a ρ-quantile of order α > 0 in direction u

if and only α ≤ P [{0}]. Moreover, (b) provided that ψ+(0)P [{0}]+P [‖Z‖ ∈ (0,∞)\Dρ] =

0 where Z has distribution P (a condition that always holds for ρ(t) = tp with p > 1),

the origin cannot be a ρ-quantile of order α > 0 in direction u, so that all these quantiles

then belong to {λu : λ > 0} (for the sake of completeness, we prove (a)–(b) in Konen

and Paindaveine (2021); see Proposition S.4.1).

If P is spherically symmetric about the origin and satisfies P [{0}] < 1, Theorem 3.3

shows that ρ-quantiles are unique for any α < αρ (as mentioned above) but it remains

silent on the case α ≥ αρ. Interestingly, we will be able to say more under sphericity,

thanks to the fact that Theorem 4.1 entails that uniqueness will hold if t 7→Mρα,u(tu) is

strictly convex over [0,∞) for all u ∈ Sd−1, which in turn will hold if t 7→ Hρα,u(z − tu)

is convex for any z ∈ Rd and any u ∈ Sd−1. Accordingly, for any α ∈ [0, 1], let Csphα be

the collection of functions ρ ∈ C such that t 7→ Hρα,u(z − tu) is convex for any z ∈ Rd

and u ∈ Sd−1. Since Csph0 = C and Csphα2⊆ Csphα1

for any α1 < α2 (see the proof of


Theorem 4.2 below), we let αsphρ := max{α ∈ [0, 1] : ρ ∈ Csphα }, parallel to what we did

for αρ in Section 3. We have the following result.

Theorem 4.2. Let ρ ∈ C and P ∈ Pρd . Assume that P is spherically symmetric about

the origin of Rd. Then, for any α ∈ [0, αsphρ ) ∪ {0} and u ∈ Sd−1 (again, union with {0}

is needed when αsphρ = 0), the map t 7→ Mρ

α,u(tu) is strictly convex on [0,∞), and the

ρ-quantile µρα,u is unique.

Note that, in the present spherical setup, this uniqueness result may only strengthen

the one in Theorem 3.3, since the fact that Cα ⊆ Csphα for any α ∈ [0, 1] implies that αsphρ ≥

αρ. Of course, it is natural to wonder under which conditions on ρ all ρ-quantiles are

unique (αsphρ = 1) and, when these conditions are not met, what are the orders α for

which uniqueness is guaranteed (that is, what is then the value of αsphρ < 1). The following

result provides a complete answer to these questions.

Theorem 4.3. Let ρ ∈ C. Then, (i) αsphρ = 1 if and only if 4pt + qt − ptqt ≤ 6 for

any t ∈ Dρ, where pt and qt are as in Theorem 3.2; (ii) if αsphρ < 1, then, letting Dsph

ρ :=

{t ∈ Dρ : 4pt + qt − ptqt > 6}(⊆ Dcvρ ),

αsphρ = inf

t∈Dsphρ

√βpt,qt ,

where

βp,q :=2(pq − p− q)3(

√cp,q − q(2p− 3)/3)2

3(p− 1)2(3− p)(√cp,q − (2p− q))(√cp,q − q(2p− 3))2

involves cp,q := q3 (3−2p)(2pq−8p+q) (if q makes βp,q undefined in the expression above,

then we let βp,q := limr→q βp,r).

Parallel to αρ in Theorems 3.1–3.2, αsphρ does not depend on d(≥ 2). For the power

loss functions ρ(t) = tp with p ≥ 1, it follows from Theorem 4.3 that

αsphρ =

√

p2(p−2)3(bp−(p− 32 )

2)

(p−1)2(3−p)(bp− 32 )(bp−3(p−

32 )

2)(<1) if p ∈ (2, 3)

1 otherwise,

(6)

with bp :=(3(p− 3

2 )( 72 − p)

)1/2. For p ∈ [1, 2], the result is just a corollary of Theorem 3.1

since we then have αsphρ ≥ αρ = 1. For p > 3, (6) implies that all ρ-quantiles are uniquely

defined under sphericity, while there is no guarantee that this is the case in general (since

αρ < 1 for such values of p). As shown in Figure 1, the values of αsphp for p ∈ (2, 3) are

remarkably close to one (the minimal value, achieved at p ≈ 2.429, is about .9987),

which implies that, also for p ∈ (2, 3), essentially all ρ-quantiles are uniquely defined

under sphericity. For the exponential loss functions ρ(t) = exp(ct)−1, all ρ-quantiles are

also unique under sphericity (αsphρ = 1), while “only” quantiles of order α < αρ = .9939

are guaranteed to be unique in general.


5. Differentiability of the objective function

For any α < αρ, the ρ-quantiles µρα,u are minimizers of the convex objective function µ 7→Mρα,u(µ). If this objective function is smooth, then ρ-quantiles are characterized by the

first-order condition ∇Mρα,u(µρα,u) = 0. Such a gradient condition will actually play a key

role when deriving further properties of ρ-quantiles in the next sections. This provides

a strong motivation to study smoothness of the map µ 7→ Mρα,u(µ). We start with the

following result.

Proposition 5.1. Let ρ ∈ C and P ∈ Pρd . Fix α ∈ [0, 1] and u ∈ Sd−1. Let Z be a

random d-vector with distribution P and write Zµ := Z − µ, for any µ ∈ Rd. Then, for

any µ ∈ Rd and v ∈ Rd \ {0}, the directional derivative

∂Mρα,u

∂v(µ) = lim

t>→0

Mρα,u(µ+ tv)−Mρ

α,u(µ)

t

exists and is given by

∂Mρα,u

∂v(µ) = ψ+(0)(‖v‖ − αu′v)P [{µ}]− αv′E

[ρ(‖Zµ‖)‖Zµ‖

(Id −

ZµZ′µ

‖Zµ‖2

)ξZ,µ

]u

−v′E[{ψ−(‖Zµ‖)I[v′Zµ > 0] + ψ+(‖Zµ‖)I[v′Zµ < 0]}

(1 + α

u′Zµ‖Zµ‖

)Zµ‖Zµ‖

],

where Id is the d× d identity matrix and ξz1,z2 is as in Definition 1.

The objective function thus admits directional derivatives in all directions (hence, is

continuous over Rd), but it is not necessarily differentiable. For instance, the classical

spatial quantiles obtained with ρ(t) = t provide

∂Mρα,u

∂v(µ) = (‖v‖ − αu′v)P [{µ}] + v′E

[(µ− Z‖µ− Z‖

− αu)ξZ,µ

],

so that Mρα,u fails to be differentiable at atoms of P . Clearly, it follows from Theo-

rem 5.1 that a necessary condition for this objective function to be differentiable at µ

is ψ+(0)P [{µ}] = 0. The next result provides a necessary and sufficient condition and

gives an expression for the corresponding gradient.

Theorem 5.1. Let ρ ∈ C and P ∈ Pρd . Fix α ∈ [0, 1] and u ∈ Sd−1. Then,

(i) µ 7→Mρα,u(µ) is differentiable at µ0(∈ Rd) if and only if ψ+(0)P [{µ0}]+P [‖Z−µ0‖ ∈

(0,∞) \ Dρ] = 0, in which case the corresponding gradient is

∇Mρα,u(µ0) = v(µ0)− αT (µ0)u, with v(µ) := −E

[ψ−(‖Zµ‖)

Zµ‖Zµ‖

ξZ,µ

]


and

T (µ) := E

[{ρ(‖Zµ‖)‖Zµ‖

(Id −

ZµZ′µ

‖Zµ‖2

)+ ψ−(‖Zµ‖)

ZµZ′µ

‖Zµ‖2

}ξZ,µ

],

where Zµ := Z − µ is based on a random d-vector Z with distribution P .

(ii) If ψ+(0)P [{µ}] + P [‖Z − µ‖ ∈ (0,∞) \ Dρ] = 0 for any µ in an open set N ,

then µ 7→Mρα,u(µ) is continuously differentiable on N .

It follows from this result that, in contrast with ρ(t) = t, the power loss functions ρ(t) =

tp with p > 1 make the objective function Mρα,u (continuously) differentiable even in the

atomic case. The corresponding quantiles µρα,u are thus the solutions of the first-order

equations ∇Mρα,u(µ) = 0, which rewrite

−pE[‖Zµ‖p−1

Zµ‖Zµ‖

ξZ,µ

]= αE

[‖Zµ‖p−1

(Id + (p− 1)

ZµZ′µ

‖Zµ‖2

)ξZ,µ

]u.

In particular, spatial expectiles (p = 2) of order α in direction u are the unique (Theo-

rem 3.3) solutions of

2(µ− E[Z]) = αE

[‖Z − µ‖

(Id +

(Z − µ)(Z − µ)′

‖Z − µ‖2ξZ,µ

)]u.

This is compatible with the fact that the corresponding “median” (that is, the quantile

of order α = 0, in an arbitrary direction u) is the mean vector E[Z].

We turn to second-order differentiability, which will be relevant when studying the

asymptotic behavior of sample ρ-quantiles in Section 8.

Theorem 5.2. Let ρ ∈ C and P ∈ Pρd . Fix α ∈ [0, 1], u ∈ Sd−1, and µ0 ∈ Rd. Assume

that P [‖Z − µ‖ ∈ [0,∞) \ Dρ] = 0 for any µ in an open neighbourhood of µ0 (hence, in

particular, that P is non-atomic in this neighbourhood). Let further one of the following

assumptions hold:

(A) ψ− is concave on (0,∞) and∫Rd\{µ0}

ψ−(‖z − µ0‖)‖z − µ0‖

dP (z) <∞;

(A′) ψ− is convex on (0,∞), ψ+(0) = 0, and there exists r > 0 such that∫Rdψ′−(‖z − µ0‖+ r) dP (z) <∞

(recall that ψ′− is the left-derivative of ψ−).


Then, for any v ∈ Rd \ {0},

limt>→0

∇Mρα,u(µ0 + tv)−∇Mρ

α,u(µ0)

t= ∇2Mρ

α,u(µ0)v,

where the Hessian matrix ∇2Mρα,u(µ) is given by

∇2Mρα,u(µ) =

(∂i∂jM

ρα,u(µ)

)i,j=1,...,d

= E

[(ψ′−(‖Zµ‖)−

2ψ−(‖Zµ‖)‖Zµ‖

+2ρ(‖Zµ‖)‖Zµ‖2

)(1 + α

u′Zµ‖Zµ‖

)ZµZ

′µ

‖Zµ‖2ξZ,µ

+ρ(‖Zµ‖)‖Zµ‖2

(Id −

ZµZ′µ

‖Zµ‖2

)ξZ,µ +

‖Zµ‖ψ−(‖Zµ‖)− ρ(‖Zµ‖)‖Zµ‖2

×{(

1 + αu′Zµ‖Zµ‖

)(Id −

ZµZ′µ

‖Zµ‖2

)+ 2

ZµZ′µ

‖Zµ‖2+ α

Zµu′ + Z ′µu

‖Zµ‖

}ξZ,µ

](as in the previous results, Zµ := Z − µ, where Z is a random d-vector with distribu-

tion P ).

While they may seem complex at first, the assumptions of Theorem 5.2 turn out

to be simple (and very weak) when considering specific loss functions ρ. For instance,

for ρ(t) = tp with p ≥ 1, they only require that P ∈ Pρd is non-atomic in a neighborhood

of µ0 and is such that E[‖Z − µ0‖p−2] < ∞ when Z has distribution P . Note that this

last assumption, that cannot be avoided since this expectation is involved in the Hessian

matrix ∇2Mρα,u(µ), is superfluous for p ≥ 2. Under the assumptions of Theorems 3.3

and 5.2, this Hessian matrix is positive definite for any α ∈ [0, αρ)∪{0} and any u ∈ Sd−1;

since this will be needed in the sequel, we prove it in Konen and Paindaveine (2021) (see

Lemma S.8.2).

6. A ρ-version of Robert Serfling’s DOQR paradigm

In a series of papers, Robert Serfling introduced the DOQR paradigm, that presents

Depth, Outlyingness, Quantile and Rank functions as interrelated, yet distinct, objects

of interest for multivariate nonparametric statistics; see, e.g., Serfling (2010), Serfling

(2019), Serfling and Zuo (2010) and the references therein. While this paradigm in prin-

ciple applies to any multivariate quantile concept, the primary focus when considering

this paradigm in the aforementioned papers was on spatial quantiles. This makes it nat-

ural to study the paradigm for the generalized spatial quantiles considered in this work,

which leads to introducing ρ-depth, ρ-outlyingness, ρ-quantile and ρ-rank functions. As

we will see later, some of these functions play a key role to understand the nature of

extreme ρ-quantiles.


We start by formally defining ρ-quantile functions. Restricting to the interesting case

for which αρ > 0, Theorems 2.1 and 3.3 imply that ρ-quantiles exist and are unique for

any α ∈ [0, αρ) and u ∈ Sd−1, which allows us to adopt the following definition.

Definition 2. Let ρ ∈ C and P ∈ Pdρ . Assume that there is no open interval in (0,∞)

on which ψ− is constant or that P is not concentrated on a line. Write Bdr = {z ∈ Rd :

‖z‖ < r}. Then, the ρ-quantile function of P is the map Q = QρP : Bdαρ → Rd that is

defined through Q(αu) = µρα,u.

In dimension d = 1 and ρ(t) = t, this provides the (centered-outward version of

the) usual quantile function. This standard quantile function, that is defined on B11 =

(−1, 1), may of course fail to be continuous (it is discontinuous for empirical probability

measures). The multivariate case d ≥ 2 is different.

Proposition 6.1. Let ρ ∈ C and P ∈ Pdρ , with d ≥ 2. Assume that there is no open

interval in (0,∞) on which ψ− is constant or that P is not concentrated on a line. Then,

the quantile function Q = QρP : Bdαρ → Rd is continuous.

Following Serfling (2010), we associate with the ρ-quantile function Q corresponding

concepts of rank function R, depth function D and outlyingness function O. We start

with the rank function.

Definition 3. Let ρ ∈ C and assume that P ∈ Pdρ is not a Dirac probability measure.

Then, the rank function of P is the map R = RρP : Rd → Rd defined through R(µ) =

(T (µ))−1v(µ), where the d × d matrix T (µ) and the d-vector v(µ) were introduced in

Theorem 5.1.

In the setup of this definition, T (µ) is positive definite, hence invertible, for any µ ∈Rd (for the sake of completeness, we prove this in Konen and Paindaveine (2021); see

Lemma S.6.1). The natural assumptions under which to study the rank function are those

of Theorem 5.1 complemented by conditions ensuring uniqueness of ρ-quantiles (which

provides the assumptions in Theorem 6.1 below). Under these assumptions, µ 7→Mρα,u(µ)

is continuously differentiable on Rd, with gradient

∇Mρα,u(µ) = T (µ)(R(µ)− αu),

so that µ = µα,u = Q(αu) (for α < αρ) if and only if R(µ) = αu (recall that, under the

assumptions considered, quantiles of order α ∈ [0, αρ) in direction u ∈ Sd−1 are indeed

uniquely determined by the gradient condition ∇Mρα,u(µ) = 0). This provides a clear

interpretation of the rank function as the inverse map of the quantile function. We have

the following result.


Theorem 6.1. Let ρ ∈ C and P ∈ Pρd . Assume that there is no open interval in (0,∞)

on which ψ− is constant or that P is not concentrated on a line. Assume further that

ψ+(0)P [{µ}] + P [‖Z − µ‖ ∈ (0,∞) \ Dρ] = 0 for any µ ∈ Rd. Write Zρ = QρP (Bdαρ).Then, Q = QρP : Bdαρ → Zρ is a homeomorphism, with inverse RρP |Zρ : Zρ → Bdαρ (therestriction of RρP to Zρ).

If t 7→ t2/ρ(t) is concave on (0,∞), then we are in the important particular case αρ = 1

(Theorem 3.1), for which the quantile function Q is defined on the open unit ball Bd = Bd1 .

We then have the following result.

Theorem 6.2. Let ρ ∈ C be such that t 7→ t2/ρ(t) is concave on (0,∞). Assume that P

is not concentrated on a line and that ψ+(0)P [{µ}] + P [‖Z − µ‖ ∈ (0,∞) \ Dρ] = 0 for

any µ ∈ Rd. Then, Zρ = QρP (Bd) = Rd, so that Q = QρP : Bd → Rd is a homeomorphism,

with inverse R = RρP : Rd → Bd.

This result shows in particular that for any power loss function ρ(t) = tp with p ∈[1, 2], any non-atomic probability measure that is not concentrated on a line provides ρ-

quantiles that span the whole Euclidean space Rd (the non-atomicity condition is actually

needed for p = 1 only), whereas the result remains silent for the case p > 2. This will

have important implications when studying extreme quantiles in Section 7.

Let us turn to depth and outlyingness functions. Clearly, central or “deep” quantiles

are indexed by a small order α ∈ [0, 1), whereas exterior or “outlying” ones are rather

indexed by a large order α. A natural outlyingness measure for µ ∈ Rd is then the order α

of the quantile µα,u for which µ = µα,u, that is, the oultlyingness of µ is ‖R(µ)‖. Any

decreasing function of this outlyingness measure is then a natural depth measure. We

adopt the following definition.

Definition 4. Let ρ ∈ C and P ∈ Pdρ . Then, (i) the outlyingness function of P is the

map O = OρP from Rd to [0, 1] defined through O(µ) = min(‖R(µ)‖, 1), where R = RρPis the rank function of P . (ii) The depth function of P is the map D = Dρ

P from Rdto [0, 1] defined through D(µ) = 1−O(µ).

The deepest location, the only one that receives the maximal depth value one, is the

ρ-median µρ0,u of P (the direction u plays no role for α = 0). For any direction u ∈ Sd−1,

depth decreases along the quantile curves {µρα,u : α ∈ [0, αρ)} originating from the

ρ-median. For ρ(t) = t, this depth reduces to the celebrated spatial depth; see, e.g.,

Vardi and Zhang (2000). The depth that are associated with our ρ-quantiles extend this

classical depth; in particular, an “expectile spatial depth”, whose deepest point is the

mean vector of P , is obtained for ρ(t) = t2. For any depth function, the depth regions

collecting locations with depth exceeding a given threshold are of interest. The depth

regions

Rρα = RρP,α := {µ ∈ Rd : DρP (µ) ≥ α}


are nested “centrality regions”; see, e.g., Mosler (2012) and the references therein. The

corresponding depth contours, i.e. the boundaries ∂Rρα of these depth regions, collect the

ρ-quantiles associated with a fixed order α.

For each combination of α ∈ {.25, .50, .75} and p ∈ {1, 1.5, 2, 4}, we plot in Figure 2 the

depth contours of order α, based on ρ(t) = tp, for the empirical probability measure Pnof six random samples of size n = 200 (these were obtained from a uniform grid of 50

directions on the unit circle Sd−1, and each quantile was evaluated through the descent

method involving the backtracking line search in Section 9.2 of Boyd and Vandenberghe,

2004; R code is available on request). These samples were generated from (i) the bi-

variate standard normal distribution, (ii)–(iii) the standard t-distributions with ν = 4

and ν = 1 degrees of freedom, (iv) the centered bivariate normal distribution with covari-

ance matrix Σ =(2 11 1

), (v) the bivariate distribution whose marginals are independent

exponential distributions with mean one, (vi) the standard skew-t distribution with 4

degrees of freedom and slant vector α = (10, 10); see Azzalini and Capitanio (2014). Fig-

ure 2 shows that larger values of p provide contours that are more concentrated about

the corresponding median; the only exception is the Cauchy distribution, for which these

large-p contours are the most spread ones due to their lack of robustness with respect

to extreme observations. As expected, the various medians differ when the underlying

distribution is skewed, as it is the case in (v)–(vi).

7. Extreme quantiles

Recently, Girard and Stupfler (2015, 2017) studied the spatial quantiles from Chaudhuri

(1996) with a focus on extreme quantiles, that is, those associated with an order α

that is close to one. In particular, Girard and Stupfler (2017) derived striking results on

extreme quantiles showing that (i) spatial quantiles exit any compact set as α → 1 and

that (ii) they do so in a direction that eventually coincides with the direction u in which

quantiles are computed. Surprisingly, this typically also happens when the underlying

distribution P is compactly supported. As shown in Paindaveine and Virta (2021), the

result even holds under atomic probability measures P , so that this unexpected behavior

also shows in the sample case (provided that not all observations lie on a line of Rd).Of course, it is natural to ask whether or not this behavior of extreme quantiles shows

for other ρ-quantiles. We tackle this question in the present section. Our first result is

the following.

Theorem 7.1. Let ρ ∈ C be such that t 7→ t2/ρ(t) is concave on (0,∞). Assume that P

is not concentrated on a line and that ψ+(0)P [{µ}] + P [‖Z − µ‖ ∈ (0,∞) \ Dρ] = 0 for

any µ ∈ Rd. Let (αn) be a sequence in [0, 1) that converges to one and (un) be a sequence

in Sd−1. Then, (i) ‖µραn,un‖ → ∞; (ii) if un → u, then µραn,un/‖µραn,un‖ → u.


-3 -2 -1 0 1 2 3

-3-2

-10

12

(i) Spherical Gaussian

-4 -2 0 2 4-4

-20

24

(iv) Elliptical Gaussian

p=1p=1.5p=2p=4

-6 -4 -2 0 2 4 6

-6-4

-20

24

6

(ii) Spherical t4

0 1 2 3 4 5 6

01

23

45

(v) Independent exponentials

-600 -500 -400 -300 -200 -100 0

-400

-300

-200

-100

0100

(iii) Spherical Cauchy

-2 -1 0 1 2 3 4

-10

12

34

(vi) Skew-t4

Figure 2. For ρ(t) = tp with p = 1, 1.5, 2, 4, ρ-depth contours of order α = .25, .50, .75 for randomsamples of size n = 200 drawn from six bivariate distributions; see Section 6 for details.


This result shows that all ρ-quantiles for which αρ = 1, hence in particular those

associated with ρ(t) = tp for p ∈ (1, 2], will show the behavior of the extreme quantiles

from Chaudhuri (1996) described above. Note that for p ∈ (1, 2], we have ψ+(0) = 0

and Dρ = (0,∞), so that Theorem 7.1 does not require that P is non-atomic, hence

also allows for empirical distributions. We illustrate this in Figure 3 for P = Pn, the

empirical distribution of a random sample of size n = 10 drawn from the bivariate

standard normal distribution. For ρ(t) = tp, with p ∈ {1, 1.5, 2, 2.25, 3, 4}, the figure

shows the ρ-quantiles µρα,u, for α ∈ [0, 1) and u = (cos(π`/6), sin(π`/6)), with ` =

0, 1, 2, 3. Clearly, for the values of p that are covered by Theorem 7.1, namely p = 1, 1.5, 2,

quantiles exit any compact set and do so eventually in the corresponding direction u. In

contrast, the figure suggests that, for p > 2, the Euclidean norm of extreme ρ-quantiles

remains bounded. This is indeed the case, as the following result shows.

Theorem 7.2. Let ρ ∈ C such that ρ(t)/t2 → ∞ as t → ∞. Assume that P ∈ Pρd (a)

is not concentrated on a line of Rd and (b) satisfies∫Rd ρ(‖z‖) dP (z) < ∞ (if ρ(t)/t3 is

bounded away from 0 as t→∞, then Condition (b) is superfluous). Then, there exists a

bounded set S ⊂ Rd such that, for any α ∈ [0, 1) and u ∈ Sd−1, all ρ-quantiles of order α

in direction u belong to S (moreover, D(µ) = 0 for any µ ∈ Rd \ S).

Under the condition of this result, all ρ-quantiles of order α in direction u may fail

to be unique for α ∈ (αρ, 1), which is the reason why Theorem 7.2 states that all ρ-

quantiles of order α in direction u belong to S. The result implies that for ρ(t) = tp

with p ≥ 3, extreme ρ-quantiles are bounded as soon as P ∈ Pρd is not concentrated on

a line and that, for ρ(t) = tp with p ∈ (2, 3), the same holds provided that P further has

finite moments of order p rather than finite moments of order p− 1 only (we conjecture

that this stronger moment assumption for p ∈ (2, 3) is actually superfluous, but we were

not able to avoid this assumption when proving Theorem 7.2). Note that Theorem 7.2

confirms in particular that, in Figure 3, the ρ-quantiles associated with p > 2 form a

bounded set.

As mentioned in Section 2, ρ-quantiles in principle are not defined for α = 1, but of

course Definition 1 may still be adopted to define possible quantiles for order α = 1. We

have the following existence result.

Proposition 7.1. Let the assumptions of Theorem 7.2 hold. Then, for any u ∈ Sd−1,

there exists a quantile µρ1,u.

In contrast, it directly follows from Theorem 6.2 that, under the assumptions of Theo-

rem 7.1, there is no u ∈ Sd−1 for which a quantile µρ1,u exist (this result is already known

for ρ(t) = t; see Proposition 2.1 in Girard and Stupfler, 2017).

The following corollary of Theorem 7.2 extends in some sense the continuity of the

quantile function (Proposition 6.1) to the framework where quantiles of order α = 1

exist.


0 2 4 6 8

02

46

8p=1

0 2 4 6 80

24

68

p=1.5

0 2 4 6 8

02

46

8

p=2

0 2 4 6 8

02

46

8

p=2.25

0 2 4 6 8

02

46

8

p=3

0 2 4 6 8

02

46

8

p=4

Figure 3. For the loss functions ρ(t) = tp with p = 1, 1.5, 2, 2.25, 3, 4, the plots show the ρ-quantiles µρα,ufor α ∈ [0, 1) and u = (cos(π`/6), sin(π`/6)), with ` = 0, 1, 2, 3; the underlying probability measure Pis the empirical distribution Pn associated with a random sample of size n = 10 from the bivariatestandard normal distribution. Dashed lines are showing the halflines with the corresponding directions uoriginating from the median µρ0,u.


Corollary 7.1. Let the assumptions of Theorem 7.2 hold. Let (αn) be a sequence

in [0, 1) that converges to α ∈ [0, 1] and (un) be a sequence in Sd−1 that converges

to u(∈ Sd−1). Fix an arbitrary sequence (µραn,un) of ρ-quantiles. Then, (i) any converg-

ing subsequence of (µραn,un) converges to a ρ-quantile µρα,u; (ii) if µρα,u is unique, then

µραn,un → µρα,u.

This result further confirms that the quantile functions—hence also the rank, depth

and outlyingness functions—associated with the loss functions ρ covered by Theorem 7.1

and Theorem 7.2 are very different in nature. In particular, in the framework of The-

orem 7.2, the depth of µ will be exactly zero if ‖µ‖ is large enough. Some recent re-

search efforts in the statistical depth literature aimed at defining depth functions—or

at modifying existing depth functions—that do not show this “vanishing property”; see,

e.g., Francisci, Nieto-Reyes and Agostinelli (2019) and the many references therein. This

vanishing property is indeed undesirable in some inferential applications, such as, e.g.,

supervised classification based on the max-depth approach; see Francisci, Nieto-Reyes

and Agostinelli (2019), Ghosh and Chaudhuri (2005), and Li, Cuesta-Albertos and Liu

(2012). Quite nicely, the ρ-depths associated with loss functions ρ compatible with The-

orem 7.1 will not exhibit this vanishing property. Yet, as in Girard and Stupfler (2017),

some might find it shocking that the corresponding ρ-quantiles span the whole Euclidean

space even when P is compactly supported. This can be avoided by adopting a loss func-

tion ρ meeting the conditions of Theorem 7.2. As a conclusion, while Theorems 7.1–7.2

discriminate between two fundamentally different classes of DOQR functions, none of

these two worlds is “the good one” and the choice of ρ, hence the choice among both

worlds, should be performed based on the inferential problem at hand.

8. Asymptotics for point estimation

We now consider estimation of the ρ-quantiles µρα,u = µρα,u(P ) based on a random sam-

ple Z1, . . . , Zn from P . As usual, the natural estimator is obtained by replacing P with

the corresponding empirical probability measure. In this section, we study the asymptotic

properties of the resulting sample ρ-quantiles. We start with the following consistency

result.

Theorem 8.1. Fix ρ ∈ C and assume that there is no open interval in (0,∞) on

which ψ− is constant or that P ∈ Pdρ is not concentrated on a line. Denote as Pn the

empirical probability measure associated with a random sample of size n from P . Fix α ∈[0, αρ) ∪ {0}, u ∈ Sd−1, and write µρα,u = µρα,u(Pn). Then,

µρα,u → µρα,u

almost surely as n→∞.


The sample spatial median, that is, the median obtained with the loss function ρ(t) = t,

satisfies a classical asymptotic normality result (see, e.g., Mottonen et al. (2010)), which,

as usual, allows one to perform hypothesis testing or to build confidence zones for the

population spatial median. This is an important advantage over competing multivariate

medians, that exhibit so complicated asymptotic distributions that it is not possible to

base inference on them (this is in particular the case for the celebrated Tukey median; see

Masse (2002)). Quite nicely, all sample ρ-quantiles enjoy a standard asymptotic normality

result, relying on a neat Bahadur representation result (that typically may itself have

further applications, such as the derivation of LIL results). We have the following result.

Theorem 8.2. Let ρ ∈ C and P ∈ Pρd . Assume that there is no open interval in (0,∞)

on which ψ− is constant or that P is not concentrated on a line. Fix α ∈ [0, αρ) ∪ {0}and u ∈ Sd−1. Assume that ∫

Rdψ2−(‖z − µρα,u‖) dP (z) <∞

and that P [‖Z−µ‖ ∈ [0,∞) \Dρ] = 0 for any µ in an open neighborhood of µρα,u (hence,

in particular, that P is non-atomic in this neighborhood). Let further one of the following

assumptions hold:

(A) ψ− is concave on (0,∞) and∫Rd\{µρα,u}

ψ−(‖z − µρα,u‖)‖z − µρα,u‖

dP (z) <∞;

(A′) ψ− is convex on (0,∞), ψ+(0) = 0, and there exists r > 0 such that∫Rdψ′−(‖z − µρα,u‖+ r) dP (z) <∞

(recall that ψ′− is the left-derivative of ψ−).

Let µρα,u = µρα,u(Pn), where Pn is the empirical probability measure associated with a

random sample Z1, . . . , Zn of size n from P . Then,

√n(µρα,u − µρα,u)

=1√nA−1

n∑i=1

∇Hρα,u(Zi − µρα,u)I[‖Zi − µρα,u‖ ∈ Dρ] + oP(1)

D→ Nd(0, V )

as n→∞, where V := A−1BA−1 involves A := ∇2Mρα,u(µρα,u) and B := E[(∇Hρ

α,u(Z1−µρα,u))(∇Hρ

α,u(Z1 − µρα,u))′I[‖Z1 − µρα,u‖ ∈ Dρ]].


We stress that this result requires very mild assumptions only. In particular, for the

power loss functions ρ(t) = tp with p ≥ 2, it only requires that P is non-atomic in

a neighborhood of µρα,u and admits finite moments of order 2(p − 1) (for the median

obtained with p = 2, namely the mean, this is the usual finite second-order moment

assumption, and the result only restate the usual multivariate central limit theorem, but

for the mild local non-atomicity assumption). For p ∈ [1, 2), Theorem 8.2 further requires

that E[‖Z − µρ0,u‖p−2] exists and is finite (note that, for the spatial median (p = 1),

Mottonen et al. (2010) derives the result under assumptions that are more stringent,

since it is imposed there that E[‖Z − µρ0,u‖−r] exists and is finite for any r ∈ [0, 2).

Invertibility of A is always guaranteed; see Lemma S.8.2 in Konen and Paindaveine

(2021).

To illustrate the result, we focus on ρ-medians (α = 0) under sphericity. If P is

spherically symmetric about the origin of Rd, then all ρ-medians µρ0,u are equal to each

other (they coincide with the origin of Rd; see Theorem 4.1), which makes it valid to

compare the asymptotic variances of sample ρ-medians. We consider the power loss func-

tions ρ(t) = tp with p ≥ 1, for which

∇Hρα,u(x)(∇Hρ

α,u(x))′I[x ∈ Dρ] = p2‖x‖2(p−1) xx′

‖x‖2ξx,0

and

∇2Hρα,u(x)I[x ∈ Dρ] = ‖x‖p−2

{pId + p(p− 2)

xx′

‖x‖2

}ξx,0;

see Lemma S.5.1 in Konen and Paindaveine (2021). If P is spherically symmetric about

the origin of Rd, then ‖Z‖ and Z/‖Z‖ are mutually independent, with Z/‖Z‖ uniformly

distributed over Sd−1, which yields

B =p2

dE[‖Z‖2(p−1)]Id and A =

p(d+ p− 2)

dE[‖Z‖p−2ξZ,0]Id.

Thus, the asymptotic covariance matrix V is given by

V = A−1BA−1 =dE[‖Z‖2(p−1)]

(d+ p− 2)2(E[‖Z‖p−2])2Id =: vp(P )Id. (7)

For p = 1, this reduces to the asymptotic covariance matrix of the spatial median

(see Mottonen et al. (2010)), whereas, for p = 2, this provides the asymptotic covariance

matrix V = E[ZZ ′] of the sample mean. Let us consider various spherical distributions.

If P = P tν is the d-variate t-distribution with ν degrees of freedom, then ‖Z‖2/d is

Fisher–Snedecor with d and ν degrees of freedom, which yields

vp(Ptν) =

Γ(d+22 )Γ(d+2p−2

2 )Γ(ν+22 )Γ(ν−2p+2

2 )

Γ2(d+p2 )Γ2(ν−p+22 )

(8)


for ν > 2(p− 1), whereas if P = P eη is the d-variate power-exponential distribution with

tail parameter η(> 0), then

vp(Peη ) =

2(1−η)/ηΓ(d+2η2η )Γ(d+2p−2

2η )

ηΓ2(d+p+2η−22η )

; (9)

the power-exponential distribution with tail parameter η refers to the distribution admit-

ting the density z 7→ feη (z) := cd,η exp(−‖z‖2η/2) with respect to the Lebesgue measure

over Rd (cd,η is a normalizing constant). The asymptotic variance at the standard d-

variate normal distribution is obtained by taking ν → ∞ in (8) or, alternatively, by

taking η = 1 in (9).

The factors vp(Ptν) and vp(P

eη ), that completely characterize the asymptotic covari-

ance matrix of the sample ρ-median associated with ρ(t) = tp under the corresponding

distributions, are plotted in Figure 4. For heavy tails, the medians associated with a

small value of p dominate their competitors, whereas the opposite happens for light tails

(lighter-than-normal tails are obtained for η > 1 in the power-exponential case). Note

that the sample ρ-median associated with ρ(t) = tp is the maximum likelihood esti-

mator of the symmetry center in the location family generated by power-exponential

distributions with parameter η = p/2, which explains that large values of p (p > 2) will

behave well under lighter-than-normal tails. All in all, the median associated with p = 1.5

seems to provide a nice balance between the spatial median and sample mean associated

with p = 1 and p = 2, respectively. While these considerations are specific to the spherical

case, the efficiency of ρ-medians in the elliptical case could be studied following the anal-

ysis in Magyar and Tyler (2011), where the focus was exclusively on the spatial median

(p = 1).

To check correctness of Theorem 8.2, we performed a Monte-Carlo study involving the

bivariate (d = 2) t-distributions with ν degrees of freedom with ν ∈ {3, 5, 7, . . . , 21}, and

the bivariate power-exponential distributions with parameter η ∈ {.8, 1.2, 1.6, . . . , 4}. For

each of these distributions, we generated M = 10,000 random samples of size n = 200 and

evaluated the ρ-medians µρ0,u = µρ0,u(m) associated with ρ(t) = tp for p ∈ {1, 1.5, 2, 4} in

each sample m = 1, . . . ,M . In Figure 4, we report the quantities

vp := n

(1

M

M∑m=1

(µρ0,u(m)− µρ0,u

)(µρ0,u(m)− µρ0,u

)′)11

,

with

µρ0,u :=1

M

M∑m=1

µρ0,u(m).

These quantities estimate the upper-left entry in the corresponding asymptotic covariance

matrix V , namely the corresponding factor vp(P ) in (7). Clearly, the results are in perfect

agreement with Theorem 8.2. It is only for p = 4 and t-distributions that some deviation


5 10 15 20

1.0

1.5

2.0

2.5

3.0

ν

v p(P

νt )

1.0 1.5 2.0 2.5 3.0 3.5 4.0

01

23

ηv p(P

ηe )

p=4p=2p=1.5p=1

Figure 4. (Left:) For p ∈ {1, 1.5, 2, 4}, plots of ν 7→ vp(P tν) for ν = 3, 5, 7, . . . , 21, where vp(P tν) (see (8))is the factor characterizing the asymptotic covariance matrix, at the bivariate t-distribution with νdegrees of freedom, of the sample ρ-median based on ρ(t) = tp. Dotted lines are estimates of vp(P tν)computed from M = 10,000 random samples of size n = 200. For p = 4, the dashed line provides thesame result for random samples of size n = 1,000. (Right:) Still for p ∈ {1, 1.5, 2, 4}, plots of η 7→ vp(P eη )for η = .8, 1.2, 1.6, . . . , 4, where vp(P eη ) (see (9)) is the factor characterizing the asymptotic covariancematrix, at the bivariate power-exponential distribution with tail parameter η, of the sample ρ-medianbased on ρ(t) = tp. Dotted lines are estimates of vp(P eη ) computed from M = 10,000 random samples ofsize n = 200; see Section 8 for details.

from the asymptotic theory is seen, but this deviation vanishes for larger sample sizes

(for p = 4 and t-distributions, Figure 4 also provides the results for sample size n =

1,000).

Obviously, using Theorem 8.2 to conduct inference based on ρ-quantiles (i.e., per-

forming hypothesis testing or building confidence zones) requires estimating consistently

the corresponding asymptotic covariance matrix V . A natural estimator is of course

Vn = A−1n BnA−1n , with

An :=1

n

n∑i=1

∇2Hρα,u(Zi − µρα,u)I[‖Zi − µρα,u‖ ∈ Dρ]

and

Bn :=1

n

n∑i=1

(∇Hρα,u(Zi − µρα,u))(∇Hρ

α,u(Zi − µρα,u))′I[‖Zi − µρα,u‖ ∈ Dρ].

One may proceed as in Haberman (1989) to establish that Vn converges in probability

to V as the sample size n diverges to infinity.


9. Perspectives for future research

In this paper, we investigated the properties of the spatial ρ-quantiles in Definition 1.

While this arguably settles the probabilistic study of these quantiles in the setup consid-

ered, our work naturally calls for an extension to more general setups and for applications

of these quantiles. As mentioned in the introduction, the spatial quantiles from Chaud-

huri (1996) are flexible objects that can cope with more exotic types of data, such as

functional data. This is associated with the fact that these quantiles are defined as min-

imizers of an objective function (see (1)) that involves norms and inner products only,

hence that also makes sense in Hilbert spaces. This, however, is also the case for the

objective function defining ρ-quantiles in (2), so that it would be natural to investigate

the properties of ρ-quantiles for random variables taking values in Hilbert spaces and to

compare their properties with those of the classical spatial quantiles; we refer to Cardot,

Cenac and Godichon-Baggioni (2017), Cardot, Cenac and Zitt (2013) and Chakraborty

and Chaudhuri (2014) for results on the spatial median and spatial quantiles in infinite-

dimensional spaces.

Another direction for future research is related to inferential applications. As al-

ready mentioned in the introduction, the spatial quantiles from Chaudhuri (1996) and

the companion spatial depth have been much used in a quantile regression framework

(Chakraborty (2003), Cheng and De Gooijer (2007), Chowdhury and Chaudhuri (2019)),

and it would be of interest to consider ρ-quantiles in this setup. In particular, this would

provide a spatial concept of multiple-output expectile regression, which would be quite

natural since expectiles were originally introduced, in Newey and Powell (1987), as an

L2-alternative to the traditional L1-concept of quantile regression (Koenker and Bassett

(1978)). Another natural venue for application of ρ-quantiles and ρ-depth is supervised

classification. In the last decade, supervised classification based on depth, where a new

observation is classified into the population with respect to which it is deepest, has met

much success in the literature; see, e.g., Li, Cuesta-Albertos and Liu (2012), Pokotylo,

Mozharovskyi and Dyckerhoff (2019), and the references therein. In this framework,

Lp-depths provide natural tools to implement this max-depth approach where p might

be chosen through cross-validation. Such applications, or the application of ρ-quantiles

in risk assessment, deserve a full-fledged paper, hence are left for future work.

Acknowledgement

The authors would like to thank the Editor-In-Chief, Mark Podolskij, the Associate

Editor, and two anonymous referees for their insightful comments and suggestions. This

research work was supported by a research fellowship from the Francqui Foundation

and by the Program of Concerted Research Actions (ARC) of the Universite libre de

Bruxelles.


Supplementary Material

Technical proofs

(). This online supplement contains the proofs of all results stated in this paper.

References

Azzalini, A. and Capitanio, A. (2014). The Skew-Normal and Related Families. IMS

Monograph series. Cambridge University Press.

Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge University

Press, Cambridge.

Breckling, J. and Chambers, R. (1988). M-Quantiles. Biometrika 75 761–771.

Brown, B. M. (1983). Statistical uses of the spatial median. J. R. Statist. Soc. Ser. B

45 25–30.

Cardot, H., Cenac, P. and Zitt, P.-A. (2013). Efficient and fast estimation of the

geometric median in Hilbert spaces with an averaged stochastic gradient algorithm.

Bernoulli 19 18–43.

Cardot, H., Cenac, P. and Godichon-Baggioni, A. (2017). Online estimation of

the geometric median in Hilbert spaces: Nonasymptotic confidence balls. Ann. Statist.

45 591–614.

Chakraborty, B. (2003). On multivariate quantile regression. J. Statist. Plann. Infer-

ence 110 109–132.

Chakraborty, A. and Chaudhuri, P. (2014). The spatial distribution in infinite

dimensional spaces and related quantiles and depths. Ann. Statist. 42 1203–1231.

Chaudhuri, P. (1996). On a geometric notion of quantiles for multivariate data. J.

Amer. Statist. Assoc. 91 862–872.

Chen, Z. (1996). Conditional Lp-quantiles and their application to the testing of sym-

metry in non-parametric regression. Statist. Probab. Lett. 29 107–115.

Cheng, Y. and De Gooijer, J. G. (2007). On the uth geometric conditional quantile.

J. Statist. Plann. Inference 137 1914–1930.

Chowdhury, J. and Chaudhuri, P. (2019). Nonparametric depth and quantile regres-

sion for functional data. Bernoulli 25 395–423.

Daouia, A., Girard, S. and Stupfler, G. (2018). Estimation of tail risk based on

extreme expectiles. J. Roy. Statist. Soc. Ser. B 80 263–292.

Daouia, A., Girard, S. and Stupfler, G. (2019). Extreme M-quantiles as risk mea-

sures: from L1 to Lp optimization. Bernoulli 25 264–309.

Francisci, G., Nieto-Reyes, A. and Agostinelli, C. (2019). Generalization of the

simplicial depth: no vanishment outside the convex hull of the distribution support.

arXiv preprint arXiv:1909.02739.


Gardes, L., Girard, S. and Stupfler, G. (2020). Beyond tail median and conditional

tail expectation: extreme risk estimation using tail Lp-optimization. Scand. J. Statist.

47 922–949.

Ghosh, A. K. and Chaudhuri, P. (2005). On maximum depth and related classifiers.

Scand. J. Stat. 32 327–350.

Girard, S. and Stupfler, G. (2015). Extreme geometric quantiles in a multivariate

regular variation framework. Extremes 18 629–663.

Girard, S. and Stupfler, G. (2017). Intriguing properties of extreme geometric quan-

tiles. REVSTAT 15 107–139.

Haberman, S. J. (1989). Concavity and estimation. Ann. Statist. 17 1631–1661.

Hallin, M., Paindaveine, D. and Siman, M. (2010). Multivariate quantiles and

multiple-output regression quantiles: From L1 optimization to halfspace depth (with

discussion). Ann. Statist. 38 635–703.

Hallin, M., del Barrio, E., Albertos, J. C. and Matran, C. (2021). Center-

outward distribution/quantile functions, ranks, and signs in Rd: a measure transporta-

tion approach. Ann. Statist. 49(2) 1139–1165.

Herrmann, K., Hofert, M. and Mailhot, M. (2018). Multivariate geometric expec-

tiles. Scand. Actuar. J. 2018 629–659.

Koenker, R. and Bassett, G. Jr. (1978). Regression quantiles. Econometrica 46

33–50.

Koltchinski, V. I. (1997). M-estimation, convexity and quantiles. Ann. Statist. 25

435–477.

Konen, D. and Paindaveine, D. (2021). Supplement to “Multivariate ρ-quantiles: a

spatial Approach”.

Kuan, C. M., Yeh, J. H. and Hsu, Y. C. (2009). Assessing value at risk with CARE,

the Conditional Autoregressive Expectile models. J. Econometrics 150 261–270.

Li, J., Cuesta-Albertos, J. A. and Liu, R. Y. (2012). DD-classifier: Nonparametric

classification procedure based on DD-plot. J. Amer. Statist. Assoc 107 737–753.

Magyar, A. and Tyler, D. E. (2011). The asymptotic efficiency of the spatial median

for elliptically symmetric distributions. Sankhya 73 165–192.

Masse, J.-C. (2002). Asymptotics for the Tukey median. J. Multivariate Anal. 81 286–

300.

Mosler, K. (2012). Multivariate Dispersion, Central Regions, and Depth: The Lift

Zonoid Approach 165. Springer Science & Business Media.

Mottonen, J., Nordhausen, K., Oja, H. et al. (2010). Asymptotic theory of the

spatial median. In Nonparametrics and Robustness in Modern Statistical Inference

and Time Series Analysis: A Festschrift in honor of Professor Jana Jureckova 182–

193. Institute of Mathematical Statistics.

Mukhopadhyaya, N. D. and Chatterjee, S. (2011). High dimensional data analysis

using multivariate generalized spatial quantiles. J. Multivariate Anal. 102 768–780.


Newey, W. K. and Powell, J. L. (1987). Asymmetric least squares estimation and

testing. Econometrica 55 819–847.

Paindaveine, D. and Virta, J. (2021). On the behavior of extreme spatial quantiles

under minimal assumptions. In Advances in Contemporary Statistics and Econometrics

(A. Daouia and A. Ruiz-Gazen, eds.) 243–259. Springer, Cham.

Pokotylo, O., Mozharovskyi, P. and Dyckerhoff, R. (2019). Depth and depth-

based classification with R-package ddalpha. J. Statist. Softw. 91 1–46.

Serfling, R. (2002). Quantile functions for multivariate analysis: approaches and ap-

plications. Stat. Neerl. 56 214–232.

Serfling, R. (2010). Equivariance and invariance properties of multivariate quantile and

related functions, and the role of standardisation. J. Nonparametr. Stat. 22 915–936.

Serfling, R. (2019). Depth functions on general data spaces, I. Perspectives, with

consideration of “density” and “local” depths. Submitted.

Serfling, R. and Wijesuriya, U. (2017). Depth-based nonparametric description of

functional data, with emphasis on use of spatial depth. Comput. Statist. Data Anal.

105 24–45.

Serfling, R. and Zuo, Y. (2010). Discussion of “Multivariate quantiles and multiple-

output regression quantiles: From L1 optimization to halfspace depth”, by M. Hallin,

D. Paindaveine, and M. Siman. Ann. Statist. 38 676–684.

Taylor, J. (2008). Estimating value at risk and expected shortfall using expectiles. J.

Financ. Econometrics 6 231–252.

Usseglio-Carleve, A. (2018). Estimation of conditional extreme risk measures from

heavy-tailed elliptical random vectors. Electron. J. Stat. 12 4057–4093.

Vardi, Y. and Zhang, C.-H. (2000). The multivariate L1-median and associated data

depth. Proc. Natl. Acad. Sci. USA 97 1423–1426.

Zhou, W. and Serfling, R. (2008). Multivariate spatial U-quantiles: a Bahadur-Kiefer

representation, a Theil–Sen estimator for multiple regression, and a robust dispersion

estimator. J. Statist. Plann. Inference 138 1660–1678.

multivariate -quantiles: a spatial approach

Documents