Department of Mathematics

Optimal decomposition for infimal convolution on Banach Couples

Japhet Niyobuhungiro

LiTH-MAT-R–2013/07–SE

Department of Mathematics
Linköping University
SE-581 83 Linköping, Sweden.
Optimal decomposition for infimal convolution on Banach Couples

Japhet Niyobuhungiro∗†

Abstract
Infimal convolution of functions defined on a regular Banach couple is considered. By using a theorem due to Attouch and Brezis, we establish sufficient conditions for this infimal convolution to be subdifferentiable. We also provide a result which gives a dual characterization of optimal decomposition for an infimal convolution in general. We plan to use these results as tools to study the mathematical properties of exact minimizers for the K–, L– and E– functionals of the theory of real interpolation; this will be done in a separate study. Here, we apply our approach to two well-known optimization problems, namely convex and linear programming, and provide proofs for duality theorems based on infimal convolution.
1 Introduction
The K–, L– and E– functionals of the theory of interpolation can be considered as particular cases of infimal convolution, well known in convex analysis. Let (X0, X1) denote a regular Banach couple. Then for each of the K–, L– and E– functionals there exist two specific convex, lower semicontinuous and proper functions ϕ0 : X0 −→ R ∪ {+∞} and ϕ1 : X1 −→ R ∪ {+∞} such that the functional can be written as a function F : X0 + X1 −→ R ∪ {+∞} defined by

F(x) = (ϕ0 ⊕ ϕ1)(x) = inf_{x = x0 + x1} (ϕ0(x0) + ϕ1(x1)),    (1)

where the infimum extends over all representations x = x0 + x1 of x with x0 and x1 in X0 + X1, and where ϕ0 : X0 + X1 −→ R ∪ {+∞} and ϕ1 : X0 + X1 −→ R ∪ {+∞} are the respective extensions of ϕ0 and ϕ1 to X0 + X1, defined in the following way:

ϕ0(u) = { ϕ0(u) if u ∈ X0;  +∞ if u ∈ (X0 + X1) \ X0 }    (2)
∗Department of Mathematics, Linköping University, SE–581 83 Linköping, Sweden. E-mail:
japhet.niyobuhungiro@liu.se
†Department of Applied Mathematics, National University of Rwanda, P.O. Box 56, Butare, Rwanda.
and

ϕ1(u) = { ϕ1(u) if u ∈ X1;  +∞ if u ∈ (X0 + X1) \ X1 }.    (3)
For example, the L–functional

L_{p0,p1}(t, x; X0, X1) = inf_{x = x0 + x1} ( (1/p0) ‖x0‖_{X0}^{p0} + (t/p1) ‖x1‖_{X1}^{p1} ),    (4)

where 1 ≤ p0, p1 < ∞, can be written as the infimal convolution

L_{p0,p1}(t, x; X0, X1) = (ϕ0 ⊕ ϕ1)(x),    (5)

where the functions ϕ0 and ϕ1 are both defined on the sum X0 + X1 as follows:

ϕ0(u) = { (1/p0) ‖u‖_{X0}^{p0} if u ∈ X0;  +∞ if u ∈ (X0 + X1) \ X0 }    (6)

and

ϕ1(u) = { (t/p1) ‖u‖_{X1}^{p1} if u ∈ X1;  +∞ if u ∈ (X0 + X1) \ X1 }.    (7)

In this case the functions ϕ0 : X0 −→ R ∪ {+∞} and ϕ1 : X1 −→ R ∪ {+∞} are defined by

ϕ0(u) = (1/p0) ‖u‖_{X0}^{p0}  and  ϕ1(u) = (t/p1) ‖u‖_{X1}^{p1}.    (8)
However, it is important to notice that the extended functions ϕ0 and ϕ1 could cease to be lower semicontinuous even if ϕ0 and ϕ1 are. For example, let the function ϕ0 be defined by

ϕ0(u) = { 0 if ‖u‖_{X0} ≤ 1;  +∞ if u ∈ (X0 + X1) \ B_{X0}(0; 1) }.

Then, if the unit ball B_{X0}(0; 1) = {u ∈ X0 + X1 : ‖u‖_{X0} ≤ 1} of X0 is not closed in X0 + X1, the function ϕ0 is not lower semicontinuous.
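As a concrete finite-dimensional illustration (ours, not from the interpolation setting), take X0 = X1 = R, ϕ0(u) = u²/2 and ϕ1(u) = t|u|, i.e. the L-functional with p0 = 2, p1 = 1 on the couple (R, R). The infimal convolution then has the well-known Huber function as its closed form, and a brute-force minimization over decompositions x = x0 + (x − x0) matches it:

```python
import numpy as np

def inf_conv(phi0, phi1, x, grid):
    # brute-force (phi0 ⊕ phi1)(x): minimize over decompositions x = x0 + (x - x0)
    return min(phi0(x0) + phi1(x - x0) for x0 in grid)

t = 0.7
phi0 = lambda u: 0.5 * u**2      # (1/p0)|u|^{p0} with p0 = 2
phi1 = lambda u: t * abs(u)      # (t/p1)|u|^{p1} with p1 = 1

def huber(x):
    # known closed form of this infimal convolution (the Huber function)
    return 0.5 * x**2 if abs(x) <= t else t * abs(x) - 0.5 * t**2

grid = np.linspace(-5.0, 5.0, 200001)
for x in (-2.0, -0.3, 0.0, 0.5, 3.0):
    assert abs(inf_conv(phi0, phi1, x, grid) - huber(x)) < 1e-4
```

Here both ϕ0 and ϕ1 are finite everywhere, so no lower-semicontinuity issue arises; the example only illustrates formula (1) numerically.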
It also turns out that the strongest condition for a decomposition corresponding to x to be optimal is subdifferentiability of the function F = ϕ0 ⊕ ϕ1 at x. However, since two different Banach spaces are involved, some technical difficulties appear when one would like to apply known results in convex analysis. In this regard, we reconsider the infimal convolution F(x) = (ϕ0 ⊕ ϕ1)(x) as a minimization problem on the intersection X0 ∩ X1, as follows. Given x ∈ X0 + X1 such that (ϕ0 ⊕ ϕ1)(x) < +∞, we fix a0 ∈ X0 and a1 ∈ X1 such that a0 + a1 = x. It follows that any other decomposition x0 + x1 = x of x, with x0 and x1 in X0 + X1, is of the form x0 = a0 − y and x1 = a1 + y for some y ∈ X0 ∩ X1. Then (ϕ0 ⊕ ϕ1)(x) can be written as the minimization of a sum of two convex functions, both defined on the intersection X0 ∩ X1, as follows:
(ϕ0 ⊕ ϕ1)(x) = inf_{y ∈ X0 ∩ X1} (S(y) + R(y)),    (9)

for functions S and R defined on X0 ∩ X1 with values in R ∪ {+∞} by

S(y) = ϕ0(a0 − y)  and  R(y) = ϕ1(a1 + y).    (10)
In Section 2 we recall some classical definitions, notions and notations. In Section 3 we formulate and prove a theorem establishing sufficient conditions for the infimal convolution (ϕ0 ⊕ ϕ1) to be subdifferentiable. In Section 4 we prove a lemma which establishes a useful characterization of optimal decomposition for infimal convolution. In Sections 5 and 6, we apply our approach to two well-known optimization problems, namely convex and linear programming, and finally the last section contains some remarks and discussion.
2 Preliminaries
Given a Banach space (E, ‖·‖_E), its dual space E∗ is the space of all bounded linear functionals y : E −→ R. Given y ∈ E∗, for y(x) we will write ⟨y, x⟩, with ⟨· , ·⟩ denoting the dual product between E∗ and E. We note that E∗ is a Banach space equipped with the norm

‖y‖_{E∗} = sup {|⟨y, x⟩| : x ∈ E, ‖x‖_E ≤ 1}.    (11)
Definition 2.1. The effective domain, or simply domain, of a function F from E into R ∪ {+∞} is denoted dom F and defined by

dom F = {x ∈ E : F(x) < +∞}.

We will call F proper if dom F ≠ ∅.
The effective domain of a convex function is convex (see the proof in (Ekeland and Témam, 1999)).
Definition 2.2. Let F be a function from E into R ∪ {+∞}. We say that F is lower semicontinuous (l.s.c.) at x ∈ E if for every ε > 0 there exists a neighborhood O of x such that F(y) ≥ F(x) − ε for all y ∈ O. Equivalently, this can be expressed as

F(x) ≤ lim inf_{y→x} F(y).

The function F is called lower semicontinuous if it is lower semicontinuous at every point of its domain.
Definition 2.3. Let F be a function from E into R ∪ {+∞}. The conjugate function of F is the function denoted F∗ and defined from E∗ into R ∪ {+∞} by

F∗(y) = sup_{x ∈ dom F} {⟨y, x⟩ − F(x)},  for y ∈ E∗.
Definition 2.4. The subdifferential of a convex function F : E −→ R ∪ {+∞} at x is the set denoted by ∂F(x) and defined by

∂F(x) = {y ∈ E∗ : F(z) ≥ F(x) + ⟨y, z − x⟩, ∀z ∈ E}.
Proposition 2.1. Any y ∈ E∗ and x ∈ E satisfy the well-known Young–Fenchel inequality

F(x) + F∗(y) ≥ ⟨y, x⟩.

Equality holds if and only if y ∈ ∂F(x):

y ∈ ∂F(x) ⇐⇒ F(x) + F∗(y) = ⟨y, x⟩.
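Proposition 2.1 is easy to check numerically in one dimension. In this sketch (our illustration, with E = R), F(x) = eˣ, whose conjugate F∗(y) = y ln y − y for y > 0 is a standard fact:

```python
import math

F = math.exp                              # F(x) = e^x on E = R
Fstar = lambda y: y * math.log(y) - y     # conjugate of exp for y > 0 (standard fact)

x = 0.3
for y in (0.5, 1.0, 2.0):
    # Young-Fenchel inequality: F(x) + F*(y) >= <y, x>
    assert F(x) + Fstar(y) >= y * x - 1e-12

# equality holds exactly on the subdifferential; F is differentiable,
# so the subdifferential at x is the singleton {F'(x)} = {e^x}
y = math.exp(x)
assert abs(F(x) + Fstar(y) - y * x) < 1e-12
```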
Definition 2.5. If S is a subset of E, the indicator function δS of S is defined as

δS(x) = { 0 if x ∈ S;  +∞ if x ∈ E \ S }.

Clearly S is a convex subset of E if and only if δS is a convex function.

Definition 2.6. Let F0 and F1 be two functions from E into R ∪ {+∞}. The infimal convolution (F0 ⊕ F1)(x) of F0 and F1 at a point x ∈ E is defined as

(F0 ⊕ F1)(x) = inf {F0(x0) + F1(x − x0) : x0 ∈ E},  x ∈ E.    (12)

We say that the infimal convolution is exact if the infimum is attained when (F0 ⊕ F1)(x) is finite, i.e., there exists x̃0 ∈ E such that

(F0 ⊕ F1)(x) = F0(x̃0) + F1(x − x̃0).

In this case, we call x = x̃0 + (x − x̃0) an optimal decomposition of (F0 ⊕ F1) corresponding to x.
It is known in convex analysis that the operations of addition and infimal convolution of convex functions are dual to each other (see for example (Rockafellar, 1972)). The conjugate of an infimal convolution is the sum of the conjugates, and this holds without any additional requirement on the convex functions.
Theorem 2.1 (Conjugate of infimal convolution). Let F0 and F1 be convex functions from E into R ∪ {+∞}. Then

(F0 ⊕ F1)∗ = F0∗ + F1∗.
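Theorem 2.1 can be checked numerically on E = R by discretizing the Legendre transform. The sketch below (our illustration) takes F0(x) = x²/2 and F1(x) = |x|, for which F0∗(y) = y²/2 and F1∗ is the indicator of [−1, 1], and compares (F0 ⊕ F1)∗ with F0∗ + F1∗ on dual points |y| < 1:

```python
import numpy as np

xs = np.linspace(-6.0, 6.0, 1201)    # primal grid
ys = np.linspace(-0.9, 0.9, 19)      # dual grid, inside dom F1* = [-1, 1]

# (F0 ⊕ F1)(x) by brute force over decompositions x = x0 + (x - x0)
X, X0 = np.meshgrid(xs, xs, indexing="ij")
ic = np.min(0.5 * X0**2 + np.abs(X - X0), axis=1)

# discrete conjugate of the infimal convolution: sup_x (y x - ic(x))
conj_ic = np.array([np.max(y * xs - ic) for y in ys])

# Theorem 2.1: (F0 ⊕ F1)* = F0* + F1*; here F0*(y) = y^2/2, F1*(y) = 0 on [-1, 1]
assert np.max(np.abs(conj_ic - ys**2 / 2)) < 1e-2
```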
However, the conjugate of the sum is the exact infimal convolution of the conjugates only under some mild qualification conditions. More precisely, it is very important in convex analysis (see for example (Azé, 1994)) to find sufficient conditions ensuring that

(F0 + F1)∗(y) = min_{y = y0 + y1} (F0∗(y0) + F1∗(y1)).    (13)

Different qualification conditions exist; we can mention for example (Moreau, 1966), (Rockafellar, 1966) and (Ernst and Théra, 2009). In this paper, we will use a condition provided by (Attouch and Brezis, 1986). The theorem will be stated in the next section.
3 Subdifferentiability of infimal convolution on a Banach couple
The following theorem establishes sufficient conditions for the infimal convolution (ϕ0 ⊕ ϕ1) to be subdifferentiable.
Theorem 3.1 (Subdifferentiability of infimal convolution). Let the functions S and R be defined as in (10) and be convex, lower semicontinuous and proper. Let ϕ0∗ and ϕ1∗ be the respective conjugate functions of ϕ0 and ϕ1. Suppose that

(1) the sets dom S and dom R satisfy

∪_{λ≥0} λ (dom S − dom R) = X0 ∩ X1;    (14)

(2) the conjugate function S∗ of S is given by

S∗(z) = { ϕ0∗(−z) + ⟨z, a0⟩ if z ∈ X0∗;  +∞ if z ∈ (X0∗ + X1∗) \ X0∗ };    (15)

(3) the conjugate function R∗ of R is given by

R∗(z) = { ϕ1∗(z) + ⟨−z, a1⟩ if z ∈ X1∗;  +∞ if z ∈ (X0∗ + X1∗) \ X1∗ }.    (16)

Then the function (ϕ0 ⊕ ϕ1) is subdifferentiable on its domain in X0 + X1.
Remark 1. In the case X0 = X1, the situation simplifies considerably. Indeed, the case of +∞ disappears in the formulas (15) and (16) above.
The main tool we use here to prove Theorem 3.1 is a result of (Attouch and Brezis, 1986), who provided a sufficient condition for the conjugate of the sum of two convex, lower semicontinuous and proper functions to be equal to the exact infimal convolution of their conjugates. Below we give its statement. Let E be a Banach space.
Theorem 3.2 (Attouch–Brezis). Let ϕ, ψ : E −→ R ∪ {+∞} be convex, lower semicontinuous and proper functions such that

∪_{λ≥0} λ (dom ϕ − dom ψ) is a closed vector space.    (17)

Then

(ϕ + ψ)∗ = ϕ∗ ⊕ ψ∗ on E∗,    (18)

with exact infimal convolution.
The following corollary follows immediately; it provides a connection between the minimization problem (primal problem) on E and the maximization problem (dual problem) on E∗. Moreover, the maximum is always attained.
Corollary 1. Let ϕ, ψ : E −→ R ∪ {+∞} be functions satisfying the Attouch–Brezis conditions. Then

inf_{x∈E} (ϕ(x) + ψ(x)) = max_{f∈E∗} (−ϕ∗(−f) − ψ∗(f)) = −(ϕ∗ ⊕ ψ∗)(0),

with exact infimal convolution.
Proof. From Theorem 3.2 we have that

(ϕ + ψ)∗(f̃) = (ϕ∗ ⊕ ψ∗)(f̃),  ∀ f̃ ∈ E∗,    (19)

where (ϕ∗ ⊕ ψ∗)(f̃) is exact. By the definitions of convex conjugate and infimal convolution, we can write (19) as follows:

sup_{x∈E} (⟨f̃, x⟩ − ϕ(x) − ψ(x)) = inf_{f̃ = f̃1 + f̃2 ∈ E∗} (ϕ∗(f̃1) + ψ∗(f̃2)).

Since the infimal convolution (ϕ∗ ⊕ ψ∗)(f̃) is exact, we can write

sup_{x∈E} (⟨f̃, x⟩ − ϕ(x) − ψ(x)) = min_{f̃2 ∈ E∗} (ϕ∗(f̃ − f̃2) + ψ∗(f̃2)).

Setting f̃ = 0, we obtain that

sup_{x∈E} (−ϕ(x) − ψ(x)) = min_{f∈E∗} (ϕ∗(−f) + ψ∗(f)).    (20)

Note that

sup_{u∈E} [−F(u)] = − inf_{u∈E} F(u).    (21)

Then the expression (20) can be written as

inf_{x∈E} (ϕ(x) + ψ(x)) = − min_{f∈E∗} (ϕ∗(−f) + ψ∗(f)).    (22)

Equivalently,

inf_{x∈E} (ϕ(x) + ψ(x)) = max_{f∈E∗} (−ϕ∗(−f) − ψ∗(f)).

By the definition of infimal convolution, we can rewrite (22) as

inf_{x∈E} (ϕ(x) + ψ(x)) = −(ϕ∗ ⊕ ψ∗)(0),

with exact infimal convolution.
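Corollary 1 can be checked numerically in one dimension. In this sketch (our example, E = R), ϕ(x) = x²/2 and ψ(x) = |x − 1|, for which ϕ∗(f) = f²/2 and ψ∗(f) = f on [−1, 1] (+∞ outside); both sides of the duality equal 1/2, and the maximum is attained at f = −1:

```python
import numpy as np

phi = lambda x: 0.5 * x**2
psi = lambda x: abs(x - 1.0)
phistar = lambda f: 0.5 * f**2                        # conjugate of phi
psistar = lambda f: f if abs(f) <= 1.0 else np.inf    # conjugate of psi (standard)

xs = np.linspace(-5.0, 5.0, 100001)
fs = np.linspace(-1.0, 1.0, 20001)

primal = min(phi(x) + psi(x) for x in xs)             # inf_x (phi(x) + psi(x))
dual = max(-phistar(-f) - psistar(f) for f in fs)     # max_f (-phi*(-f) - psi*(f))

assert abs(primal - 0.5) < 1e-6
assert abs(dual - 0.5) < 1e-6    # attained (here at f = -1), as the corollary promises
```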
Proof of Theorem 3.1. For any given x ∈ X0 + X1 such that x = a0 + a1, we have

(ϕ0 ⊕ ϕ1)(x) = inf_{y ∈ X0 ∩ X1} (S(y) + R(y)).
By Corollary 1, we have that

(ϕ0 ⊕ ϕ1)(x) = −(S∗ ⊕ R∗)(0),

with exact infimal convolution. This means that there exists an element y∗ ∈ dom S∗ ∩ dom R∗ such that

(ϕ0 ⊕ ϕ1)(x) = −S∗(−y∗) − R∗(y∗).

By the formulas for S∗ and R∗, this is equivalent to

(ϕ0 ⊕ ϕ1)(x) = −ϕ0∗(y∗) + ⟨y∗, a0⟩ − ϕ1∗(y∗) + ⟨y∗, a1⟩ = ⟨y∗, x⟩ − ϕ0∗(y∗) − ϕ1∗(y∗).

By Theorem 2.1, we have that

(ϕ0 ⊕ ϕ1)(x) = ⟨y∗, x⟩ − (ϕ0 ⊕ ϕ1)∗(y∗).

By Proposition 2.1, we conclude that y∗ ∈ ∂(ϕ0 ⊕ ϕ1)(x). Therefore the function (ϕ0 ⊕ ϕ1) is subdifferentiable on its domain in X0 + X1.
4 Characterization of optimal decomposition for infimal convolution
In this section we give a proof of a lemma which establishes a useful characterization of optimal decomposition for infimal convolution.
Lemma 1 (Key Lemma). Let F0 and F1 be convex and proper functions from a Banach space E with values in R ∪ {+∞} such that their infimal convolution F = (F0 ⊕ F1), defined by

F(x) = (F0 ⊕ F1)(x) = inf_{x = x0 + x1} {F0(x0) + F1(x1)},    (23)

is subdifferentiable at the point x ∈ dom (F0 ⊕ F1). Then the decomposition x = x0,opt + x1,opt is optimal for (23) if and only if there exists y∗ ∈ E∗ that is dual to both x0,opt and x1,opt with respect to F0 and F1 respectively, i.e.,

F0(x0,opt) = ⟨y∗, x0,opt⟩ − F0∗(y∗);
F1(x1,opt) = ⟨y∗, x1,opt⟩ − F1∗(y∗).    (24)
Proof. Let us assume that the decomposition x = x0,opt + x1,opt is optimal for (23). This means that

(F0 ⊕ F1)(x) = inf_{x = x0 + x1} {F0(x0) + F1(x1)} = F0(x0,opt) + F1(x1,opt).    (25)

Since (F0 ⊕ F1) is subdifferentiable at x, there exists y∗ ∈ E∗ such that

y∗ ∈ ∂(F0 ⊕ F1)(x).

Then it follows from Proposition 2.1 that

(F0 ⊕ F1)(x) = ⟨y∗, x⟩ − (F0 ⊕ F1)∗(y∗).    (26)

From (25) and by the formula for the conjugate of infimal convolution (see Theorem 2.1), we have that (26) is equivalent to

F0(x0,opt) + F1(x1,opt) = ⟨y∗, x⟩ − F0∗(y∗) − F1∗(y∗),

or equivalently,

(F0(x0,opt) + F0∗(y∗) − ⟨y∗, x0,opt⟩) + (F1(x1,opt) + F1∗(y∗) − ⟨y∗, x1,opt⟩) = 0.    (27)

From the Young–Fenchel inequality (see Proposition 2.1), we have that

F0(x0,opt) + F0∗(y∗) − ⟨y∗, x0,opt⟩ ≥ 0,    (28)

and

F1(x1,opt) + F1∗(y∗) − ⟨y∗, x1,opt⟩ ≥ 0.    (29)
Considering (27), (28) and (29), we conclude that (28) and (29) are in fact equalities, and we get

F0(x0,opt) = ⟨y∗, x0,opt⟩ − F0∗(y∗);
F1(x1,opt) = ⟨y∗, x1,opt⟩ − F1∗(y∗).

So we obtain (24).
Conversely, let us assume the existence of y∗ ∈ E∗ and a decomposition x = x̃0 + x̃1 such that

F0(x̃0) = ⟨y∗, x̃0⟩ − F0∗(y∗);
F1(x̃1) = ⟨y∗, x̃1⟩ − F1∗(y∗).

It follows that

F0(x̃0) + F1(x̃1) = ⟨y∗, x̃0 + x̃1⟩ − (F0∗ + F1∗)(y∗).

Since the conjugate of infimal convolution is the sum of conjugates (see Theorem 2.1), then

F0(x̃0) + F1(x̃1) = ⟨y∗, x⟩ − (F0 ⊕ F1)∗(y∗).    (30)
By the definition of infimal convolution, it follows in particular that

(F0 ⊕ F1)(x) ≤ F0(x̃0) + F1(x̃1).    (31)

Then it follows from (30) that

(F0 ⊕ F1)(x) ≤ ⟨y∗, x⟩ − (F0 ⊕ F1)∗(y∗).    (32)

From the Young–Fenchel inequality we have that

(F0 ⊕ F1)(x) ≥ ⟨y∗, x⟩ − (F0 ⊕ F1)∗(y∗).    (33)

Combining (32) and (33), we obtain that

(F0 ⊕ F1)(x) = ⟨y∗, x⟩ − (F0 ⊕ F1)∗(y∗).    (34)

From (30) and (34) we conclude that

(F0 ⊕ F1)(x) = F0(x̃0) + F1(x̃1).

Therefore the decomposition x = x̃0 + x̃1 is optimal for (23).
In the next two sections, we apply our approach to two well-known optimization problems, namely convex and linear programming. The approach is as follows: first we reformulate the optimization problem at hand as an infimal convolution of two well-defined functions; secondly, we check the subdifferentiability of the infimal convolution by Theorem 3.1; and finally we use Lemma 1. The aim is to provide proofs, based on our infimal convolution approach, for duality theorems which are central to these problems.
5 Application to convex programming

5.1 Formulation of the problem
We consider the general nonlinear optimization problem with inequality constraints

(P)  inf f0(x)  subject to fi(x) ≤ 0, i = 1, . . . , m,    (35)

where f0 and fi, i = 1, . . . , m, are real valued functions on Rn. We assume that the functions f0, f1, f2, . . . , fm are convex and continuously differentiable and that

∃ x0 ∈ Rn such that fi(x0) < 0, i = 1, . . . , m.    (36)
We want to derive necessary and sufficient optimality conditions, known as the KKT conditions for convex problems. Let us formulate this in the following theorem.
Theorem 5.1. Let f0 and fi, i = 1, . . . , m, be convex and continuously differentiable real valued functions defined on Rn such that

∃ x0 ∈ Rn such that fi(x0) < 0, i = 1, . . . , m.    (37)

Then the vector x∗ is a global optimal solution for the problem (P) if and only if there exists a vector λ∗ ∈ Rm such that

(1) ∇f0(x∗) = −∑_{i=1}^m λi∗ ∇fi(x∗);

(2) fi(x∗) ≤ 0, i = 1, 2, . . . , m;

(3) λi∗ ≥ 0, i = 1, 2, . . . , m;

(4) λi∗ fi(x∗) = 0, i = 1, 2, . . . , m.
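As a numerical illustration of Theorem 5.1 (ours, not from the text), minimize f0(x) = ‖x − a‖² over the unit disk, i.e. with the single constraint f1(x) = ‖x‖² − 1 ≤ 0 and a = (2, 1). The optimum is the radial projection x∗ = a/‖a‖ with multiplier λ∗ = ‖a‖ − 1, and all four conditions can be checked directly:

```python
import numpy as np

a = np.array([2.0, 1.0])
xstar = a / np.linalg.norm(a)     # candidate optimum on the boundary of the disk
lam = np.linalg.norm(a) - 1.0     # candidate multiplier

grad_f0 = 2.0 * (xstar - a)       # gradient of f0(x) = ||x - a||^2
grad_f1 = 2.0 * xstar             # gradient of f1(x) = ||x||^2 - 1

assert np.allclose(grad_f0, -lam * grad_f1)        # (1) stationarity
assert xstar @ xstar - 1.0 <= 1e-12                # (2) primal feasibility
assert lam >= 0.0                                  # (3) dual feasibility
assert abs(lam * (xstar @ xstar - 1.0)) < 1e-12    # (4) complementary slackness

# sanity check of global optimality over random feasible points
rng = np.random.default_rng(0)
pts = rng.normal(size=(1000, 2))
pts /= np.maximum(1.0, np.linalg.norm(pts, axis=1))[:, None]  # project into the disk
vals = np.sum((pts - a) ** 2, axis=1)
assert np.sum((xstar - a) ** 2) <= vals.min() + 1e-9
```

Note that Slater's condition (37) holds here with x0 = 0, since f1(0) = −1 < 0.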
We will denote by S the feasible set, i.e., x ∈ S if x satisfies all the constraints of the problem (P):

S = {x ∈ Rn : fi(x) ≤ 0, i = 1, . . . , m}.

The problem (P) can then be rewritten as follows:

(P)  inf_{x∈S} f0(x).
Definition 5.1 (Active set). For x ∈ S we let Ia(x) denote the index set of active constraints at the point x, i.e.,

Ia(x) = {i ∈ {1, 2, . . . , m} : fi(x) = 0}.
Let us define two functions F0 and F1 as follows:

F0(x) = f0(x)  and  F1(x) = { 0 if x ∈ −S;  +∞ if x ∈ Rn \ (−S) }.    (38)

Lemma 2. Let the functions F0 and F1 be defined by (38). Then

(1) inf_{x∈S} f0(x) = (F0 ⊕ F1)(0);

(2) the vector x∗ is an optimal solution for the problem (P) if and only if the decomposition 0 = x∗ + (−x∗) is optimal for (F0 ⊕ F1)(0);

(3) F0 and F1 are convex, proper and lower semicontinuous. Moreover, F0 ⊕ F1 is subdifferentiable at 0.
Proof. (1) We have, by definition, that

(F0 ⊕ F1)(0) = inf_{x0 + x1 = 0} (F0(x0) + F1(x1)) = inf_{x ∈ Rn} (F0(x) + F1(−x)).

From the definition of the function F1, it follows that

(F0 ⊕ F1)(0) = inf_{−x ∈ −S} F0(x).

We conclude that

inf_{x∈S} f0(x) = (F0 ⊕ F1)(0).

(2) It is clear that 0 = x∗ + (−x∗) is an optimal decomposition for (F0 ⊕ F1)(0), i.e.,

(F0 ⊕ F1)(0) = F0(x∗) + F1(−x∗),

if and only if

inf_{x∈S} f0(x) = F0(x∗) + F1(−x∗) = F0(x∗), with −x∗ ∈ −S,

or equivalently,

inf_{x∈S} f0(x) = f0(x∗), with x∗ ∈ S,

i.e., the vector x∗ is an optimal solution for the problem (P).

(3) Recall that the set S is convex, closed and nonempty by assumption. Hence the functions F0 and F1 are convex, proper and lower semicontinuous. Since dom F0 = Rn and dom F1 = −S, we have that

∪_{λ≥0} λ (dom F0 − dom F1) = Rn.

So all conditions in Theorem 3.1 are satisfied, and we conclude that (F0 ⊕ F1) is subdifferentiable on Rn, and at zero in particular.
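Part (1) of Lemma 2 is easy to verify numerically on a one-dimensional toy problem (our illustration): minimizing f0(x) = (x − 2)² over S = [0.5, 1.5] gives the same value as the infimal convolution (F0 ⊕ F1)(0), with F1 the indicator of −S:

```python
import numpy as np

f0 = lambda x: (x - 2.0) ** 2
S_lo, S_hi = 0.5, 1.5                                    # S = [0.5, 1.5]
F1 = lambda x: 0.0 if -S_hi <= x <= -S_lo else np.inf    # indicator of -S

grid = np.linspace(-3.0, 3.0, 60001)
constrained = min(f0(x) for x in grid if S_lo <= x <= S_hi)   # inf_{x in S} f0(x)
infconv_at_0 = min(f0(x0) + F1(-x0) for x0 in grid)           # (F0 ⊕ F1)(0)

# both equal (1.5 - 2)^2 = 0.25, attained at x* = 1.5
assert abs(constrained - 0.25) < 1e-6
assert abs(infconv_at_0 - 0.25) < 1e-6
```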
So it is possible to apply the Key Lemma. Let us define the sets Si, i = 1, . . . , m, as follows:

Si = {x ∈ Rn : fi(x) ≤ 0}, i = 1, . . . , m,

and the functions

hi(x) = { 0 if x ∈ −Si;  +∞ if x ∈ Rn \ (−Si) },  i = 1, . . . , m.

Then the feasible set S and the function F1 are expressed by

S = ∩_{i=1}^m Si  and  F1(x) = ∑_{i=1}^m hi(x).

Lemma 3. Let z ≠ 0 and hi∗(z) = ⟨−z, x∗⟩ for i ∈ Ia(x∗), where x∗ is an optimal solution for the problem (P). Then there exists λi > 0 such that

z = −λi ∇fi(x∗).

Proof. By definition we have that

hi∗(z) = sup_{x∈Rn} (⟨z, x⟩ − hi(x)) = sup_{x∈−Si} ⟨z, x⟩ = − inf_{x∈Si} ⟨z, x⟩.

Therefore

hi∗(z) = − inf_{x∈Si} ⟨z, x⟩ = ⟨−z, x∗⟩.

This means that x∗ minimizes the linear functional ⟨z, ·⟩ over Si, so that the hyperplane {x : ⟨z, x⟩ = ⟨z, x∗⟩} supports Si at x∗. Since i ∈ Ia(x∗), the point x∗ lies on the boundary {x : fi(x) = 0} of Si, and therefore there exists λi > 0 such that

z = −λi ∇fi(x∗).
5.2 Proof of Theorem 5.1
Proof. Let x∗ be a global optimal solution for the problem (P). Then from Lemma 2 we have that x∗ + (−x∗) = 0 is an optimal decomposition for (F0 ⊕ F1)(0). From the Key Lemma, there exists y∗ ∈ ∂(F0 ⊕ F1)(0) such that

F0(x∗) = ⟨y∗, x∗⟩ − F0∗(y∗)    (39)

and

F1(−x∗) = ⟨y∗, −x∗⟩ − F1∗(y∗).    (40)

From Proposition 2.1, the expression (39) is equivalent to y∗ = ∇f0(x∗). Therefore (40) becomes

F1(−x∗) = ⟨∇f0(x∗), −x∗⟩ − F1∗(∇f0(x∗)).    (41)

By the definition of F1, this is equivalent to

F1∗(∇f0(x∗)) = ⟨−∇f0(x∗), x∗⟩, with x∗ ∈ S.

Applying the Attouch–Brezis Theorem to F1 = ∑_{i=1}^m hi, (41) can be rewritten as

h1(−x∗) + h2(−x∗) + . . . + hm(−x∗) = ⟨∇f0(x∗), −x∗⟩ − (h1∗ ⊕ h2∗ ⊕ . . . ⊕ hm∗)(∇f0(x∗)),

where the infimal convolution is exact, i.e., there exist z1, z2, . . . , zm with

∇f0(x∗) = ∑_{i=1}^m zi,    (42)

and such that

h1(−x∗) + h2(−x∗) + . . . + hm(−x∗) = ⟨∇f0(x∗), −x∗⟩ − h1∗(z1) − h2∗(z2) − . . . − hm∗(zm)
= ⟨z1, −x∗⟩ − h1∗(z1) + ⟨z2, −x∗⟩ − h2∗(z2) + . . . + ⟨zm, −x∗⟩ − hm∗(zm).

From Fenchel's inequality,

hi(−x∗) ≥ ⟨zi, −x∗⟩ − hi∗(zi), i = 1, . . . , m.

Therefore we conclude that

hi(−x∗) = ⟨zi, −x∗⟩ − hi∗(zi), ∀ i = 1, 2, . . . , m.

By the definition of hi, this is equivalent to

hi∗(zi) = ⟨−zi, x∗⟩, with x∗ ∈ Si, ∀ i = 1, 2, . . . , m.

Therefore, from Lemma 3, there exists λi∗ ≥ 0 such that

zi = −λi∗ ∇fi(x∗).    (43)

Putting (43) into (42), we obtain requirement (1). Requirements (2) and (3) are obvious. Taking λi∗ = 0 for i ∉ Ia(x∗) and λi∗ ≥ 0 for i ∈ Ia(x∗), we obtain requirement (4).
Conversely, let (x∗, λ∗) ∈ Rn × Rm satisfy requirements (1)–(4), and assume that x ∈ Rn is feasible for (P). Then from condition (1), we have that

f0(x∗) + ∑_{i=1}^m λi∗ fi(x∗) ≤ f0(x) + ∑_{i=1}^m λi∗ fi(x).

From condition (4) and the feasibility of x, this implies that f0(x∗) ≤ f0(x). Together with the feasibility of x∗, we conclude that x∗ is a global optimal solution for (P).
6 Application to linear programming
The notion of duality is one of the most important concepts in linear programming. Basically, associated with each linear programming problem, called the primal problem, which we will define by the constraint matrix A, the right-hand side vector b and the cost vector c, is a linear programming problem called the dual problem, which is constructed from the same data A, b and c. We will show how both problems can be formulated in the context of infimal convolution. We discuss the connection of the Key Lemma with the duality theorem of linear programming. We will show that the vector y∗ in the Key Lemma can be written as y∗ = ATλ∗, where λ∗ is an optimal solution for the dual problem of linear programming.
6.1 Linear programming and infimal convolution

Linear programming concerns the problem of maximization or minimization of a linear objective function in terms of decision variables, subject to a finite number of linear inequality and/or equality and sign constraints imposed on the decision variables. The minimization case can be written in the following generic form:

(P)  inf ⟨c, x⟩  subject to Ax ≥ b, x ≥ 0,    (44)

where c ∈ Rn, b ∈ Rm, A ∈ Rm×n.
Inequalities/equalities are to be understood componentwise. Let us denote by S the following set:

S = {x ∈ Rn : x ≥ 0 and Ax ≥ b}.

S is called the feasible set for the problem (P): each x ∈ S satisfies all the constraints of the problem (P) and is hence called a feasible point.
We are interested in an optimal solution, i.e., a feasible point, denoted by x∗, at which the infimum is achieved. For simplicity, we will assume that S contains the origin, is bounded, and contains some interior point x0. In this case it is clear that an optimal solution x∗ to the problem (P) exists.
In order to reformulate the problem (P) in terms of infimal convolution, we define two functions F0 and F1 in the following way:

F0(x) = { ⟨c, x⟩ if x ≥ 0;  +∞ otherwise }  and  F1(x) = { 0 if Ax ≤ −b;  +∞ otherwise }.    (45)
Lemma 4. Let the functions F0 and F1 be defined as in (45). Then

(1) inf_{x∈S} ⟨c, x⟩ = (F0 ⊕ F1)(0);

(2) the vector x∗ is an optimal solution for (P) if and only if the decomposition 0 = x∗ + (−x∗) is optimal for (F0 ⊕ F1)(0);

(3) F0 and F1 are convex, proper and lower semicontinuous. Moreover, F0 ⊕ F1 is subdifferentiable at 0.

Proof. For (1), we have by definition that

(F0 ⊕ F1)(x) = inf_{x = x0 + x1} (F0(x0) + F1(x1)).

Therefore

(F0 ⊕ F1)(0) = inf_{0 = x0 + x1} (F0(x0) + F1(x1)) = inf_{x0} (F0(x0) + F1(−x0)).

From the definition of F0 and F1 (equation (45)), it follows that

(F0 ⊕ F1)(0) = inf_{x0 ≥ 0, A(−x0) ≤ −b} (⟨c, x0⟩ + 0).

This is equivalent to

(F0 ⊕ F1)(0) = inf {⟨c, x0⟩ : Ax0 ≥ b, x0 ≥ 0} = inf_{x0∈S} ⟨c, x0⟩.

To prove (2), suppose that 0 = x∗ + (−x∗) is an optimal decomposition for (F0 ⊕ F1)(0). This is equivalent to

inf {⟨c, x0⟩ : Ax0 ≥ b, x0 ≥ 0} = F0(x∗) + F1(−x∗) = ⟨c, x∗⟩, with x∗ ≥ 0, Ax∗ ≥ b,

i.e., x∗ is an optimal solution for (P). Part (3) follows from Lemma 5 below.
Lemma 5. Let the functions F0 and F1 be defined as in (45) and let b < 0. Then the function F0 ⊕ F1 is subdifferentiable at 0.

Proof. It is clear that the function F0 is convex and lower semicontinuous. It is also proper, because the set dom F0 contains the point x = 0 and F0(0) = 0. The function F1 is convex and lower semicontinuous as the indicator function of the convex and closed set S_{F1} = {x : Ax ≤ −b}. It is proper because dom F1 is nonempty. Since b < 0, there exists an ε > 0 such that the set dom F1 contains the ball B(0, ε) of radius ε centered at zero. Therefore

∪_{λ≥0} λ (dom F0 − dom F1) = Rn.

We conclude from Theorem 3.1 that (F0 ⊕ F1) is subdifferentiable on Rn, and at zero in particular.
In order to apply the Key Lemma, we need to calculate the conjugate functions F0∗ and F1∗ of F0 and F1 respectively.

Lemma 6. The conjugate functions F0∗ and F1∗ of F0 and F1 are given by

F0∗(z) = { 0 if z ≤ c;  +∞ otherwise }  and  F1∗(z) = { − sup_{λ∈Sz} ⟨λ, b⟩ if Sz ≠ ∅;  +∞ otherwise },

where Sz is the set

Sz = {λ ∈ Rm, λ ≥ 0 : ATλ = z}.    (46)

Proof. By the definition of the conjugate function, we have that

F0∗(z) = sup_{x∈Rn} {⟨z, x⟩ − F0(x)} = sup_{x≥0} {⟨z − c, x⟩}.

Therefore

F0∗(z) = { 0 if z ≤ c;  +∞ otherwise }.

To calculate F1∗, we write F1(x) as a sum of simpler functions as follows:

F1(x) = ∑_{i=1}^m F̃i(x),  where  F̃i(x) = { 0 if ∑_{j=1}^n aij xj ≤ −bi;  +∞ otherwise }.

By the Attouch–Brezis Theorem, the conjugate function of F1 is obtained as an infimal convolution of the conjugate functions:

F1∗(z) = (F̃1∗ ⊕ F̃2∗ ⊕ . . . ⊕ F̃m∗)(z).    (47)

Let Ai• = [ai1, ai2, . . . , ain]T. Then it is clear that the conjugate of F̃i, i = 1, 2, . . . , m, is equal to

F̃i∗(z) = sup_{x∈Rn} (⟨z, x⟩ − F̃i(x)) = sup_{∑_{j=1}^n aij xj ≤ −bi} ⟨z, x⟩ = { −λi bi if ∃ λi ≥ 0 : z = λi Ai•;  +∞ otherwise }.    (48)

That is, F̃i∗(z) = −λi bi if z is proportional, with positive coefficient, to the vector Ai•, normal to the hyperplane ai1 x1 + ai2 x2 + . . . + ain xn = −bi defining the i-th constraint. Otherwise F̃i∗(z) = +∞. Putting (48) in (47) yields

F1∗(z) = inf_{z = z̃1 + z̃2 + . . . + z̃m} (F̃1∗(z̃1) + F̃2∗(z̃2) + . . . + F̃m∗(z̃m)) = { −⟨λ, b⟩ if ∃ λi ≥ 0, i = 1, . . . , m : z̃i = λi Ai•;  +∞ otherwise }.

Therefore, if the set Sz is defined as in (46), then

F1∗(z) = { − sup_{λ∈Sz} ⟨λ, b⟩ if Sz ≠ ∅;  +∞ otherwise }.
6.2 Key Lemma and duality theorem for linear programming

In this section we will prove that the duality theorem for linear programming can be derived from the Key Lemma. The dual problem can be formulated as follows:

(D)  sup ⟨b, λ⟩  subject to ATλ ≤ c, λ ≥ 0,    (49)

where λ ∈ Rm. The feasible set in this case is the following set:

S∗ = {λ ∈ Rm : λ ≥ 0, ATλ ≤ c}.

Each point λ ∈ S∗ satisfies all the constraints of the problem (D) and is hence called a dual feasible point, or simply a feasible point. The goal is to find a point λ∗ which is feasible and at which the supremum is achieved. This point is called a dual optimal solution.

The following lemma will not be used in the proof of the duality theorem. It only notes that the problem (D) can be reformulated in terms of F0∗ and F1∗ as follows.

Lemma 7. Let the functions F0∗ and F1∗ be defined as in Lemma 6. Then

sup_{λ∈S∗} ⟨b, λ⟩ = (F0∗ + F1∗)∗(0).

Proof. By definition,

(F0∗ + F1∗)∗(0) = sup_z (−F0∗(z) − F1∗(z)).

From the formulas for F0∗ and F1∗ in Lemma 6, we have that

(F0∗ + F1∗)∗(0) = sup {⟨b, λ⟩ : λ ∈ Rm, λ ≥ 0, ATλ = z, z ≤ c} = sup {⟨b, λ⟩ : ATλ ≤ c, λ ≥ 0}.

Therefore

sup_{λ∈S∗} ⟨b, λ⟩ = (F0∗ + F1∗)∗(0).
The following theorem, the so-called weak duality theorem, is well known. Roughly speaking, it says that the primal optimal value is always bounded below by the dual optimal value.
Theorem 6.1 (The Weak Duality Theorem). If x ∈ Rn is feasible for (P) and λ ∈ Rm is feasible for (D), then

⟨b, λ⟩ ≤ ⟨x, ATλ⟩ ≤ ⟨c, x⟩.    (50)

Moreover, if

⟨b, λ̃⟩ = ⟨c, x̃⟩    (51)

with λ̃ feasible for (D) and x̃ feasible for (P), then λ̃ must solve (D) and x̃ must solve (P).
Proof. Let λ ∈ Rm be feasible for (D) and x ∈ Rn be feasible for (P), i.e.,

ATλ ≤ c, λ ≥ 0  and  Ax ≥ b, x ≥ 0.

Then

⟨b, λ⟩ ≤ ⟨Ax, λ⟩ = ⟨x, ATλ⟩ ≤ ⟨c, x⟩.    (52)

To prove that ⟨b, λ̃⟩ = ⟨c, x̃⟩ together with D–P feasibility implies optimality, we note that for any other D–P feasible pair (y, x) we have from (52) that

⟨b, y⟩ ≤ ⟨c, x̃⟩ = ⟨b, λ̃⟩ ≤ ⟨c, x⟩.

Therefore

⟨b, λ̃⟩ ≥ ⟨b, y⟩,

which means that λ̃ is an optimal solution for (D), and

⟨c, x̃⟩ ≤ ⟨c, x⟩,

which means that x̃ is an optimal solution for (P).
Remark 2. It follows from the weak duality theorem that if either problem (P) or (D) is unbounded, then the other problem is infeasible.
The following theorem, the so-called strong duality theorem, tells us that optimality is equivalent to equality in the weak duality theorem. That is, λ solves (D) and x solves (P) if and only if (λ, x) is a D–P feasible pair and

⟨b, λ⟩ = ⟨x, ATλ⟩ = ⟨c, x⟩.
Theorem 6.2 (The Strong Duality Theorem). Let the problems (P) and (D) be defined as in (44) and (49) respectively, and let the vector b < 0. Assume that the primal problem (P) has an optimal solution x∗. Then the dual problem (D) has an optimal solution λ∗ and

⟨b, λ∗⟩ = ⟨x∗, ATλ∗⟩ = ⟨c, x∗⟩.    (53)

Proof. Let the problems (P) and (D) be defined as in (44) and (49) respectively, and let the functions F0, F1 be defined by (45). From the assumptions, it follows that there exists x∗ ∈ S such that

inf_{x∈S} ⟨c, x⟩ = ⟨c, x∗⟩.

Then from Lemma 4 we have that 0 = x∗ + (−x∗) is an optimal decomposition for (F0 ⊕ F1)(0). Since by Lemma 5 F0 ⊕ F1 is subdifferentiable at 0, it follows from the Key Lemma that there exists y∗ ∈ Rn such that

F0(x∗) = ⟨y∗, x∗⟩ − F0∗(y∗)

and

F1(−x∗) = ⟨y∗, −x∗⟩ − F1∗(y∗),

or equivalently, by the formulas of Lemma 6,

sup {⟨b, λ⟩ : λ ∈ Rm, λ ≥ 0, ATλ = y∗} = ⟨y∗, x∗⟩, with Ax∗ ≥ b.    (54)

From our assumptions it follows that the set {λ ∈ Rm : λ ≥ 0, ATλ = y∗} is nonempty and bounded. Therefore there exists λ∗ ∈ Rm, λ∗ ≥ 0, with ATλ∗ = y∗, at which the supremum in (54) is attained. Clearly such a λ∗ is an optimal solution for (D). Moreover, the optimal values for (P) and (D) coincide. In fact,

⟨b, λ∗⟩ = ⟨y∗, x∗⟩ = ⟨c, x∗⟩.

We conclude that

⟨b, λ∗⟩ = ⟨x∗, ATλ∗⟩ = ⟨c, x∗⟩.
We conclude this section with a proof of the complementary slackness theorem and a statement of its corollary, which can be used to develop a test of optimality for a solution to (D) or (P).
Theorem 6.3 (The Complementary Slackness Theorem). The vector λ ∈ Rm solves (D) and the vector x ∈ Rn solves (P) if and only if λ is feasible for (D), x is feasible for (P), and

(i) either λi = 0 or ∑_{j=1}^n aij xj − bi = 0, or both, for i = 1, . . . , m;

(ii) either xj = 0 or ∑_{i=1}^m aij λi − cj = 0, or both, for j = 1, . . . , n.

Proof. Suppose that the vector λ ∈ Rm solves (D) and the vector x ∈ Rn solves (P). Then from the strong duality theorem we have equality in the weak duality theorem:

⟨b, λ⟩ = ⟨x, ATλ⟩ = ⟨c, x⟩.

The equation ⟨b, λ⟩ = ⟨x, ATλ⟩ implies that

⟨λ, Ax − b⟩ = ∑_{i=1}^m λi (∑_{j=1}^n aij xj − bi) = 0.    (55)

Feasibility of λ ∈ Rm for (D) and of x ∈ Rn for (P) implies that

λi ≥ 0 and ∑_{j=1}^n aij xj − bi ≥ 0 for i = 1, . . . , m.

Therefore

λi (∑_{j=1}^n aij xj − bi) ≥ 0 for i = 1, . . . , m.    (56)

From (55) and (56), we conclude that the only way (55) can hold is if

λi (∑_{j=1}^n aij xj − bi) = 0 for i = 1, . . . , m,

or equivalently,

λi = 0 or ∑_{j=1}^n aij xj − bi = 0, or both, for i = 1, . . . , m.

Hence we obtain (i). Expression (ii) can be shown in exactly the same way. Suppose now that (i) and (ii) are satisfied. Then we must get equality in the weak duality theorem, and we conclude by Theorem 6.1 that the vector λ ∈ Rm solves (D) and the vector x ∈ Rn solves (P).
Corollary 2. The vector λ ∈ Rm solves (D) if and only if λ is feasible for (D) and there exists a vector x ∈ Rn feasible for (P) such that

(i) if λi > 0 then ∑_{j=1}^n aij xj − bi = 0, for each i ∈ {1, . . . , m};

(ii) if ∑_{i=1}^m aij λi − cj < 0 then xj = 0, for each j ∈ {1, . . . , n}.
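The complementary slackness conditions can be verified directly at a primal-dual optimal pair of a small instance (our example; the optimal points are assumed known for this data):

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 3.0]])
b = np.array([2.0, 3.0])
c = np.array([3.0, 2.0])

x = np.array([0.0, 2.0])      # primal optimal point for this instance (assumed known)
lam = np.array([2.0, 0.0])    # dual optimal point for this instance (assumed known)

# feasibility for (P) and (D)
assert np.all(A @ x >= b - 1e-12) and np.all(x >= 0.0)
assert np.all(A.T @ lam <= c + 1e-12) and np.all(lam >= 0.0)

# (i)  lam_i * (sum_j a_ij x_j - b_i) = 0 for every i
assert np.allclose(lam * (A @ x - b), 0.0)
# (ii) x_j * (sum_i a_ij lam_i - c_j) = 0 for every j
assert np.allclose(x * (A.T @ lam - c), 0.0)

# equal objective values, certifying optimality via the weak duality theorem
assert abs(c @ x - b @ lam) < 1e-12
```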
7 Final remarks and discussions
1. For convex functions, the following criterion for subdifferentiability is known (see for example (Ekeland and Témam, 1999), Proposition 5.2 on page 22):

Proposition 7.1. Let F be a convex function from E into R ∪ {+∞}, finite and continuous at the point x0 ∈ int(dom F). Then ∂F(x) ≠ ∅ for all x ∈ int(dom F), and in particular ∂F(x0) ≠ ∅.

Continuity is a requirement because the set ∂F(x0) may be empty even if the function F is lower semicontinuous at x0. For example, the function F : R −→ R ∪ {+∞} defined by

F(x) = { −√(1 − x²) if |x| ≤ 1;  +∞ otherwise }

is lower semicontinuous and finite at 1, but ∂F(1) = ∅. Therefore Theorem 3.1 could be useful in cases where one cannot apply Proposition 5.2 in (Ekeland and Témam, 1999).
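The emptiness of ∂F(1) for F(x) = −√(1 − x²) can be seen numerically (a small check of ours): a subgradient y at x = 1 would have to dominate every difference quotient (F(1) − F(1 − h))/h, but these quotients behave like √(2/h) and blow up as h → 0:

```python
import math

def F(x):
    # F(x) = -sqrt(1 - x^2) on [-1, 1], +infinity outside
    return -math.sqrt(1.0 - x * x) if abs(x) <= 1.0 else math.inf

# y in dF(1) would require F(1 - h) >= F(1) - y*h for all small h > 0,
# i.e. y >= (F(1) - F(1 - h))/h for every h
quotients = [(F(1.0) - F(1.0 - h)) / h for h in (1e-2, 1e-4, 1e-6)]

assert quotients[0] < quotients[1] < quotients[2]   # the quotients increase ...
assert quotients[2] > 100.0                         # ... without bound, so dF(1) is empty
```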
2. Lemma 1 is known in a different form in (Strömberg, 1994), Theorem 3.6 on page 24. Our version provides necessary and sufficient conditions for optimality of a decomposition and emphasizes the importance of the condition ∂(F0 ⊕ F1)(x) ≠ ∅. Indeed, there can be situations where an optimal decomposition exists but ∂(F0 ⊕ F1)(x) = ∅. We illustrate this by the following example.

Example 1. Let E be a Banach space and S a closed convex subset of E, with S ≠ E and S ≠ ∅. Let ξ ∈ bd(S) be such that

(a) there is no supporting hyperplane at ξ;

(b) ξ is not in core S, i.e., ξ + td does not lie in S for any direction d in E and for all small real t, say 0 < t < εξ, where εξ = εξ(ξ, d).

An example of such a set S is given in (Figueiredo, 1989) on page 62. It is defined as follows:

S = { ξ ∈ ℓ² : ξj ≥ 0, ‖ξ‖² = ∑_{j=1}^∞ ξj² ≤ 1 },

and it is shown that S = bd(S) and that the points ξ̂ ∈ S with ξ̂j > 0 and ‖ξ̂‖ < 1 are not supporting points, i.e., there is no bounded linear functional f ∈ E∗ such that

sup_{ξ∈S} f(ξ) = f(ξ̂).

Consider two functions ϕ0, ϕ1 : E −→ R ∪ {+∞} defined by

ϕ0(x) = { 0 if x ∈ S;  +∞ if x ∉ S }  and  ϕ1(x) = { t if x = td, t ≤ εξ;  +∞ otherwise }.

Then the decomposition ξ = ξ + 0 is optimal for (ϕ0 ⊕ ϕ1)(ξ). However, the set ∂(ϕ0 ⊕ ϕ1)(ξ) = ∅. In fact,

∂(ϕ0 ⊕ ϕ1)(ξ) = ∂ϕ0(ξ) ∩ ∂ϕ1(0),

but ∂ϕ0(ξ) = ∅, because any y ∈ ∂ϕ0(ξ) would define a supporting hyperplane to S at the point ξ. Therefore subdifferentiability is not a necessary condition for the existence of an optimal decomposition.
3. We have used a duality approach to obtain new proofs for duality results concerning convex and linear programming. Theorem 3.1 on subdifferentiability of infimal convolution plays an important role in the proofs for both the convex and the linear programming problems. The existence of the element y∗ dual to x0,opt and x1,opt with respect to the functions F0 and F1 was crucial in the Key Lemma. In the convex programming problem,

F0(x) = f0(x)  and  F1(x) = { 0 if x ∈ −S;  +∞ if x ∈ Rn \ (−S) }.

The proof of Theorem 5.1 shows that the element y∗ is equal to ∇f0(x∗) and that y∗ is equal to −∑_{i=1}^m λi∗ ∇fi(x∗). Put together, these give requirement (1) in Theorem 5.1, which is basically the meaning of the Key Lemma in this special context. In the linear programming problem, we find that y∗ in the Key Lemma is equal to y∗ = ATλ∗, where λ∗ is an optimal solution for the dual problem of linear programming.
4. In our next study, we will use Theorem 3.1 and the Key Lemma to obtain mathematical characterizations of optimal decomposition for the K–, L– and E– functionals of the theory of interpolation.
References

H. Attouch and H. Brezis. Duality for the sum of convex functions in general Banach spaces. In: Aspects of Mathematics and its Applications, J. A. Barroso (ed.), North-Holland, Amsterdam, pages 125–133, 1986.

D. Azé. Duality for the sum of convex functions in general normed spaces. Arch. Math., 62:554–561, 1994.

I. Ekeland and R. Témam. Convex Analysis and Variational Problems. SIAM, 1999.

E. Ernst and M. Théra. On the necessity of the Moreau–Rockafellar–Robinson qualification condition in Banach spaces. Math. Program., Ser. B, pages 149–161, 2009.

D. G. de Figueiredo. Lectures on the Ekeland Variational Principle with Applications and Detours. Tata Institute of Fundamental Research, Bombay, 1989.

J. J. Moreau. Fonctionnelles convexes. Lecture notes, Collège de France, Paris, 1966.

R. T. Rockafellar. Extension of Fenchel's duality theorem for convex functions. Duke Math. J., 33:81–90, 1966.

R. T. Rockafellar. Convex Analysis. Princeton University Press, Princeton, New Jersey, 1972.

T. Strömberg. A Study of the Operation of Infimal Convolution. Luleå University of Technology, Sweden, 1994.