1 Introduction

Agency theory has been the subject of a growing literature in recent decades. One of its major virtues is that it allows a wide variety of problems to be handled within a single framework. The trouble, however, is that, to deal with the monotonicity constraints that typically arise in the analysis, it has become customary to adopt strong differentiability assumptions on variables that are endogenous to the problem. Among other advantages, if the function in question is continuous and piecewise differentiable, monotonicity can be expressed by requiring its derivative to have a constant sign. This allows the solution to be characterized by standard optimal control methods.

But piecewise differentiability is also important in ensuring that the candidate monotone functions satisfy the provisos needed for the solution to be well-behaved, in the sense of being regular enough to admit a clear-cut economic interpretation at every point of its relevant domain. The reason is that, under piecewise differentiability, the necessary conditions unambiguously define the shape of the policy function sought on its whole domain, including the corner points, which can then be associated with the prescription that different agents choose the same policy.

Nevertheless, the cost of piecewise differentiability lies, as is well known, in the loss of generality that results from neglecting those monotone functions that are potential candidates for the solution despite not being piecewise differentiable. For this reason, our goal in this paper is to relax the differentiability assumptions in principal-agent problems, while imposing on the endogenous variables of the model conditions that ensure a well-behaved solution. To do this, instead of undertaking the tedious task of removing the differentiability restrictions in each of the problems concerned, our strategy is to relax an archetypal problem whose basic model is shared with a large group of other principal-agent problems. This procedure allows us to identify the common properties of the necessary conditions from which the most interesting conclusions in the literature have been directly derived.

Accordingly, the next section recalls the main features and results of the now classical contribution of Mussa and Rosen (1978) on monopoly with product quality. Section 3 begins with an example where, unlike in the standard approach, the solution for the monopolist fails to be piecewise differentiable. This is used to justify our generalization of Mussa and Rosen’s problem, which replaces its differentiability restrictions by two mild provisos that leave all qualitative results unaffected by the change in the space of candidate solutions. In Sect. 4, the same type of generalization is extended to other results in the context of similar principal-agent and other screening problems. Our final conclusions, together with a brief summary, are found in Sect. 5. All analytical details have been relegated to an Appendix at the end of the paper.

2 The problem of monopoly with product quality

Mussa and Rosen (1978) examine the provision of a generic commodity that can be produced in a number of different varieties. The level of quality of each variety is represented by q, and the “breadth” of the product line is \( [\underline{q},\bar{q}] \). Despite the similarity of the goods, they are not perfect substitutes. In this economy, there is an indefinite number of consumers with taste parameters distributed according to a continuous density function f(θ) ≡ F′(θ) > 0 defined on the range \( [\underline{\theta},\bar{\theta}] \).

The seller knows the form of f(θ), but cannot distinguish among buyers prior to an actual sale. Therefore, the monopolist cannot practice price discrimination in the usual way and must instead apply a pricing policy, through a price-quality schedule, that allocates customers along the quality spectrum by a process of self-selection. Apart from this, each person has an identical utility function represented by:

$$ U(x,q;\theta ) = x + \theta q $$
(1)

Here x denotes a composite commodity that is differentiated from the generic type in question. Consumers’ valuations of quality vary in proportion to θ, and each person maximizes utility subject to the budget constraint:

$$ P(q) + x \le y $$
(2)

The function P(q) represents the price-quality schedule and y is income. P(q) indicates the unique price that, for all buyers of a given quality q, is required in an impersonal market where all quality varieties are sold. Both P(q) and y are measured in terms of x. Moreover, at points where P(q) is differentiable, utility maximization yields:

$$ P'[q(\theta )] = \theta $$
(3)

If q(θ) expresses the quality purchased by a person of type θ, consumer surplus accruing to any hypothetical purchaser θ will then be given by:

$$ z(\theta ) = \theta q(\theta ) - P[q(\theta )] $$
(4)

For any particular quality level q, the unit cost C(q) is assumed to be independent of the number of units produced of that variety. In addition, C′(q) > 0 and C″(q) > 0. The monopolist seeks to maximize profit, subject to the constraint of individual behavior, by choice of an assignment q(θ) and a non-decreasing P(q), namely:

Problem 1

Find the functions P(q) and q(θ) that: [Footnote 1]

$$ {\text{Maximize:}}\quad \int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{{\bar{\theta }}} {\{ P[q(\theta )] - C[q(\theta )]\} f(\theta )d\theta } $$
(5)
$$ {\text{subject}}\,{\text{to}}:\quad q(\theta )\,{\text{maximizes}}\,x + \theta \,q(\theta )\,{\text{under}}\,P(q) + x \le y $$
(6)
$$ P(q)\,{\text{is}}\,{\text{non-decreasing}} $$
(7)

Let {P*(q), q*(θ)} depict the solution. As it stands, Problem 1 seems rather difficult to handle, unless it can be transformed into another setting with P(·) not appearing explicitly. Mussa and Rosen (1978) considered, in fact, this equivalent program:

Problem 2

Find the functions z(θ) and q(θ) that:

$$ {\text{Maximize}}:\,\int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{{\bar{\theta }}} {\{ \theta \,q(\theta ) - z(\theta ) - C[q(\theta )]\} f(\theta )d\theta } $$
(8)

subject to:

$$ z(\theta ) = \int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{\theta } {q(\zeta )d\zeta } $$
(9)
$$ q(\theta )\,{\text{is}}\,{\text{non-decreasing}} $$
(10)

The equivalence between (5) and (8) follows from the definition of z(θ) in (4). The envelope condition (9) was derived by differentiating (4), substituting (3) and integrating between θ and \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } \) for \( z ( {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } } ) = 0 \). Of course, for these calculations to be valid, apart from the piecewise differentiability of q(θ) and P(q), we must suppose that Mussa and Rosen (1978) adopted the implicit assumption that z(θ) is an everywhere continuous function. Concerning condition \( z ( {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } } ) = 0 \) in (9), it stems from (10) and the fact that, at an “extensive” margin \( \tilde{\theta } ( { \ge \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } } ), \) it is always optimal for the monopolist to make the marginal consumer indifferent between buying and not buying. Finally, (10) was justified by using geometrical methods in combination with the assumptions mentioned.
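As a numerical illustration of how (4) and (9) fit together, the following minimal sketch takes a hypothetical assignment q(θ) = 2θ − 2 on [1, 2] (an assumption made only for this example, not taken from Mussa and Rosen). It builds z(θ) from (9) by numerical integration, recovers the price paid by each type from (4), and checks that no type gains by buying the bundle intended for another type. All function names are illustrative.

```python
# Illustrative sketch only: the assignment q(theta) = 2*theta - 2 on [1, 2]
# is a hypothetical choice, not taken from the paper.

def make_grid(lo, hi, n):
    """Return n + 1 equally spaced points on [lo, hi]."""
    step = (hi - lo) / n
    return [lo + i * step for i in range(n + 1)]

def surplus_from_assignment(thetas, q):
    """Envelope condition (9): z(theta) = integral of q from the lowest type,
    approximated with the trapezoidal rule and z(lowest type) = 0."""
    z = [0.0]
    for i in range(1, len(thetas)):
        width = thetas[i] - thetas[i - 1]
        z.append(z[-1] + 0.5 * (q[i] + q[i - 1]) * width)
    return z

def prices_from_surplus(thetas, q, z):
    """Definition (4) rearranged: P[q(theta)] = theta * q(theta) - z(theta)."""
    return [t * qi - zi for t, qi, zi in zip(thetas, q, z)]

def max_gain_from_deviation(thetas, q, prices):
    """Largest gain any type obtains by buying another type's bundle."""
    worst = 0.0
    for i, t in enumerate(thetas):
        own = t * q[i] - prices[i]
        best = max(t * qj - pj for qj, pj in zip(q, prices))
        worst = max(worst, best - own)
    return worst

thetas = make_grid(1.0, 2.0, 200)
q = [2.0 * t - 2.0 for t in thetas]      # non-decreasing, as (10) requires
z = surplus_from_assignment(thetas, q)
prices = prices_from_surplus(thetas, q, z)
gain = max_gain_from_deviation(thetas, q, prices)
print(f"z(2) = {z[-1]:.4f}, P[q(2)] = {prices[-1]:.4f}, max gain = {gain:.2e}")
```

In this hypothetical case z(θ) = (θ − 1)², so the lowest type receives zero surplus, and the gain from deviating is zero up to rounding error, which is the self-selection property that constraints (9)–(10) are meant to capture.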

Again restricting the set of candidates q(θ) to functions that are piecewise differentiable, Mussa and Rosen (1978) obtained these necessary conditions for Problem 2:

$$ {\text{MR}}(\theta ) - {\text{MC}}^{*}(\theta ) = 0\quad {\text{if}}\,\theta \,{\text{is}}\,{\text{a}}\,{\text{point}}\,{\text{of}}\,{\text{increase}}\,{\text{for}}\,q^{*} (\theta ) $$
(11)
$$ \int\limits_{{\theta_{1} }}^{{\theta_{2} }} {[{\text{MR}}(\theta ) - {\text{MC}}^{*}(\theta )]f (\theta )d\theta = 0\quad {\text{if}}\, [\theta_{ 1} ,\theta_{2} ]\,{\text{is}}\,{\text{a}}\,{\text{maximal}}\,{\text{interval}}\,{\text{of}}\,{\text{constancy}}\,{\text{for}}\,q^{*} (\theta )} $$
(12)

Here \( {\text{MC}}^{*} \left( \theta \right) = C^{\prime}\left[ {q^{*}\left( \theta \right)} \right] \) represents the marginal cost of producing an increment of quality for consumers of type θ at the optimum. Likewise, \( {\text{MR}}(\theta ) \equiv \theta - {\frac{1}{f(\theta )}}[1 - F(\theta )] \) stands for the marginal revenue accruing to the monopolist, and measures the gain in revenue associated with quality increments sold to consumers of type θ.

It must be emphasized in passing that Mussa and Rosen (1978) refer in (11) to points where q*′(θ) > 0, rather than to the more general concept of points of increase of q*(θ). However, we shall see below that (11) also applies at points of increase where q*(θ) does not have a positive derivative.

Using C″(q) > 0, Mussa and Rosen (1978) also managed to prove that:

$$ q^{*} (\theta )\,{\text{is}}\,{\text{continuous}}\,{\text{everywhere}}. $$
(13)

As a whole, conditions (11)–(13) mean that the optimal policy involves bunching consumers of different tastes onto the same quality level by imparting corners in P*(q) at some points, while equating marginal cost to marginal revenue elsewhere.
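For intuition on the pointwise condition (11), the sketch below evaluates MR(θ) under two assumptions made purely for illustration: tastes are uniformly distributed on [1, 2], and the cost function is C(q) = q²/2 (the same quadratic form used in the example of Sect. 3). Under these assumptions C′(q) = q, so (11) reduces to q(θ) = MR(θ) = 2θ − 2. The function names are hypothetical.

```python
# Illustrative sketch: uniform tastes on [1, 2] and C(q) = q**2 / 2 are
# assumptions made for this example only.

THETA_LO, THETA_HI = 1.0, 2.0

def density(theta):
    """Uniform density f(theta) on [THETA_LO, THETA_HI]."""
    return 1.0 / (THETA_HI - THETA_LO)

def cdf(theta):
    """Uniform distribution function F(theta)."""
    return (theta - THETA_LO) / (THETA_HI - THETA_LO)

def marginal_revenue(theta):
    """MR(theta) = theta - [1 - F(theta)] / f(theta)."""
    return theta - (1.0 - cdf(theta)) / density(theta)

def marginal_cost(q):
    """C'(q) for the quadratic cost C(q) = q**2 / 2."""
    return q

def pointwise_quality(theta):
    """Solve MR(theta) = C'(q) for q; with C'(q) = q this is q = MR(theta)."""
    return marginal_revenue(theta)

def is_non_decreasing(values):
    """Check the monotonicity constraint (10) on a list of values."""
    return all(b >= a for a, b in zip(values, values[1:]))

types = [THETA_LO + i * (THETA_HI - THETA_LO) / 10 for i in range(11)]
qualities = [pointwise_quality(t) for t in types]
for t, q in zip(types, qualities):
    print(f"theta = {t:.1f}  MR = {marginal_revenue(t):.2f}  q = {q:.2f}")
print("monotone:", is_non_decreasing(qualities))
```

Because MR(θ) is increasing in this case, constraint (10) is slack everywhere and the integral condition (12) never comes into play. Bunching becomes relevant only when the pointwise solution would fail to be non-decreasing.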

3 More general results

As has already been said, the trouble with the analysis of Mussa and Rosen (1978) lies in its differentiability assumptions, which limit the relevance of the results achieved. The following example illustrates this point and motivates most of the subsequent developments.

Example

Apart from the assumptions adopted in the last section, suppose:

  (a)
    $$ C\left( q \right) = {\frac{{q^{2} }}{2}} $$
  (b)
    On \( [\underline{\theta},\theta_{1} ) \) and \( (\theta_{2} ,\bar{\theta }] \), f(θ) is the Cantor singular function. Besides, on \( \left[ {\theta_{1} ,\theta_{2} } \right] \), f(θ) satisfies:

$$ {\text{MR}}(\theta_{1} ) = {\text{MR}}(\theta_{2} ) $$
(14)
$$ \int\limits_{{\theta_{1} }}^{{\theta_{2} }} {[{\text{MR}}(\theta ) - {\text{MR}}(\theta_{1} )]} f(\theta )d\theta = 0 $$
(15)
$$ \int\limits_{\theta }^{{\theta_{2} }} {[{\text{MR}}(\zeta ) - {\text{MR}}(\theta_{1} )]} f(\zeta )d\zeta > 0 $$
(16)

Recall that \( {\text{MR}}(\theta ) \equiv \theta - {\frac{1}{f(\theta )}}[1 - F(\theta )]. \) Now, let us consider the solution for q(θ):

$$ \bar{q}(\theta ) = {\text{MR}}(\theta_{1} )\quad {\text{on}}\, [\theta_{1} ,\theta_{2} ] $$
(17)
$$ \bar{q}(\theta ) = {\text{MR}}(\theta )\quad {\text{on[}}\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } ,\theta_{1} ){\text{ and }}(\theta_{2} ,\bar{\theta }] $$
(18)

Equations (17)–(18), together with assumption (b), imply that \( \bar{q}(\theta ) \) is a non-decreasing continuous function that has an uncountable set of non-differentiable points. It therefore fulfills constraint (10) without being piecewise differentiable. In fact, \( \bar{q}\left( \theta \right) \) is strictly increasing, except on the interval \( \left[ {\theta_{1} ,\theta_{2} } \right] \), where it is constant.

Now we shall check that \( \bar{q}\left( \theta \right) \) is a true solution to Problem 2 above. [Footnote 2] Let \( \bar{z}\left( \theta \right) \) be the value in (9) attained by z(θ) for \( q(\theta ) = \bar{q}\left( \theta \right). \) If \( \left\{ {\hat{q}(\theta ),\hat{z}(\theta )} \right\} \) represents any pair satisfying (9)–(10), it will be shown to provide at most the same level of profit as \( \left\{ {\bar{q}(\theta ),\bar{z}(\theta )} \right\}. \) Thus, by the concavity of the integrand in (8) with respect to q and z (since the Hessian matrix is negative semi-definite for all q and z):

$$ \begin{aligned} & \int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{{\bar{\theta }}} {\{ \theta \hat{q}(\theta ) - \hat{z}(\theta ) - C[\hat{q}(\theta )]\} f(\theta )d\theta \le \int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{{\bar{\theta }}} {\{ \theta \bar{q}(\theta ) - \bar{z}(\theta ) - C[\bar{q}(\theta )]\} } f(\theta )} d\theta \\ & \quad + \int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{{\bar{\theta }}} {\{ \left[ {\theta - C'[\bar{q}(\theta )]} \right]\left[ {\hat{q}(\theta ) - \bar{q}(\theta )} \right] - \left[ {\hat{z}(\theta ) - \bar{z}(\theta )} \right]\} f(\theta )d\theta } \\ \end{aligned} $$
(19)

At this point, it will suffice to prove that the last integral in (19) is non-positive. Using assumptions (a)–(b):

$$ \begin{aligned} & \int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{{\bar{\theta }}} {\left\{ {\left[ {\theta - \bar{q}(\theta ) - {\frac{1}{f(\theta )}}[1 - F(\theta )] + {\frac{1}{f(\theta )}}[1 - F(\theta )]} \right].\left[ {\hat{q}(\theta ) - \bar{q}(\theta )} \right] - \left[ {\hat{z}(\theta ) - \bar{z}(\theta )} \right]} \right\}f(\theta )d\theta } \\ & = \quad \int\limits_{{\theta_{1} }}^{{\theta_{2} }} {\left[ {{\text{MR}}(\theta ) - {\text{MR}}(\theta_{1} )} \right]\left[ {\hat{q}(\theta ) - \bar{q}(\theta_{1} )} \right]f(\theta )d\theta } \\ & = \quad \int\limits_{{\theta_{1} }}^{{\theta_{2} }} {\left[ {{\text{MR}}(\theta ) - {\text{MR}}(\theta_{1} )} \right]\left[ {\hat{q}(\theta ) - \hat{q}(\theta_{1} )} \right]f(\theta )d\theta } \\ \end{aligned} $$
(20)

The second integral stems from (17)–(18) and:

$$ \begin{aligned} & \int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{{\bar{\theta }}} {\left\{ {\left[ {{\frac{1}{f(\theta )}}[1 - F(\theta )]} \right] \left[ {\hat{q}(\theta ) - \bar{q}(\theta )} \right] - \left[ {\hat{z}(\theta ) - \bar{z}(\theta )} \right]} \right\}f(\theta )d\theta } \\ & = \quad \int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{{\bar{\theta }}} {\int\limits_{\theta }^{{\bar{\theta }}} {f(\zeta )d\zeta [\hat{q}(\theta ) - \bar{q}(\theta )]d\theta - \int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{{\bar{\theta }}} {\int\limits_{{\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\theta } }}^{\theta } {\left[ {\hat{q}(\zeta ) - \bar{q}(\zeta )} \right]d\zeta f(\theta )d\theta = 0} } } } \\ \end{aligned} $$
(21)

Here, the zero result can be checked by reversing the order of integration (Fubini’s Theorem) in either of the two double integrals, while leaving the other as it stands. As for the third integral in (20), it derives from a direct application of (15).
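The cancellation in (21) can also be checked numerically. The sketch below is only an illustration under assumed inputs: a hypothetical density f(θ) = 2θ/3 on [1, 2] and an arbitrary function h(θ) standing in for the difference between the two candidate assignments. It compares the first double integral, whose inner integral equals 1 − F(θ), with the second, whose inner integral is the difference in surplus implied by (9).

```python
import math

# Illustrative sketch: the density f(theta) = 2*theta/3 on [1, 2] and the
# function h are hypothetical choices made for this check only.

LO, HI, N = 1.0, 2.0, 2000
STEP = (HI - LO) / N
GRID = [LO + i * STEP for i in range(N + 1)]

def density(theta):
    """Assumed density f(theta); it integrates to one on [LO, HI]."""
    return 2.0 * theta / 3.0

def cdf(theta):
    """Distribution function F(theta) of the assumed density."""
    return (theta * theta - 1.0) / 3.0

def h(theta):
    """Stand-in for the difference q_hat(theta) - q_bar(theta)."""
    return math.sin(5.0 * theta)

def trapezoid(values):
    """Trapezoidal rule on the equally spaced GRID."""
    return STEP * (sum(values) - 0.5 * (values[0] + values[-1]))

def cumulative_trapezoid(values):
    """Running integral from LO up to each grid point."""
    out = [0.0]
    for a, b in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * STEP * (a + b))
    return out

h_vals = [h(t) for t in GRID]
# First double integral in (21): the inner integral of f equals 1 - F(theta).
first = trapezoid([(1.0 - cdf(t)) * hv for t, hv in zip(GRID, h_vals)])
# Second double integral in (21): the inner integral is z_hat - z_bar by (9).
z_diff = cumulative_trapezoid(h_vals)
second = trapezoid([zd * density(t) for zd, t in zip(z_diff, GRID)])
print(f"first = {first:.6f}, second = {second:.6f}, gap = {first - second:.2e}")
```

The two values agree up to discretization error, which is the content of the zero result in (21).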

Now, by Lebesgue’s theorem on bounded convergence, for any ε > 0:

$$ \begin{aligned} & \int\limits_{{\theta_{1} }}^{{\theta_{2} }} {{\frac{1}{\varepsilon }}\int\limits_{\theta }^{\theta + \varepsilon } {\left[ {{\text{MR}}(\zeta ) - {\text{MR}}(\theta_{1} )} \right]} f(\zeta )d} \zeta \left[ {\hat{q}(\theta ) - \hat{q}(\theta_{1} )} \right]d\theta \to \\ & \quad \int\limits_{{\theta_{1} }}^{{\theta_{2} }} {\left[ {{\text{MR}}(\theta ) - {\text{MR}}(\theta_{1} )} \right]\left[ {\hat{q}(\theta ) - \hat{q}(\theta_{1} )} \right]} f(\theta )d\theta ,\quad {\text{as }}\,\varepsilon \to 0\\ \end{aligned} $$
(22)

The limit is legitimate since \( \left[ {{\text{MR}}(\zeta ) - {\text{MR}}\left( {\theta_{1} } \right)} \right]f(\zeta ) \) is a continuous function and:

$$ \begin{aligned} & {\frac{1}{\varepsilon }}\int\limits_{\theta }^{\theta + \varepsilon } {\left[ {{\text{MR}}(\zeta ) - {\text{MR}}(\theta_{1} )} \right]f(\zeta )d\zeta } \left[ {\hat{q}(\theta ) - \hat{q}(\theta_{1} )} \right] \to \\ & \left[ {{\text{MR}}(\theta ) - {\text{MR}}(\theta_{1} )} \right]\left[ {\hat{q}(\theta ) - \hat{q}(\theta_{1} )} \right]f(\theta )\quad \quad {\text{as}}\,\,\varepsilon \to 0\\ \end{aligned} $$
(23)

Finally, the left-hand side of (22) is equal to: [Footnote 3]

$$ \begin{aligned} & {\frac{1}{\varepsilon }}\int\limits_{{\theta_{1} - \varepsilon }}^{{\theta_{2} - \varepsilon }} {\int\limits_{\theta }^{{\theta_{2} }} {\left[ {{\text{MR}}(\zeta ) - {\text{MR}}(\theta_{1} )} \right]f(\zeta )d\zeta \left[ {\hat{q}(\theta - \varepsilon ) - \hat{q}(\theta_{1} )} \right]} } \,d\theta \\ & - {\frac{1}{\varepsilon }}\int\limits_{{\theta_{1} }}^{{\theta_{2} }} {\int\limits_{\theta }^{{\theta_{2} }} {\left[ {{\text{MR}}(\zeta ) - {\text{MR}}(\theta_{1} )} \right]f(\zeta )d\zeta \left[ {\hat{q}(\theta ) - \hat{q}(\theta_{1} )} \right]} d} \theta \\ & = {\frac{1}{\varepsilon }}\int\limits_{{\theta_{1} - \varepsilon }}^{{\theta_{1} }} {\int\limits_{\theta }^{{\theta_{2} }} {\left[ {{\text{MR}}(\zeta ) - {\text{MR}}(\theta_{1} )} \right]f(\zeta )d\zeta \left[ {\hat{q}(\theta - \varepsilon ) - \hat{q}(\theta_{1} )} \right]} d} \theta \\ & - {\frac{1}{\varepsilon }}\int\limits_{{\theta_{2} - \varepsilon }}^{{\theta_{2} }} {\int\limits_{\theta }^{{\theta_{2} }} {\left[ {{\text{MR}}(\zeta ) - {\text{MR}}(\theta_{1} )} \right]f(\zeta )d\zeta \left[ {\hat{q}(\theta ) - \hat{q}(\theta_{1} )} \right]} d} \theta \\ & + {\frac{1}{\varepsilon }}\int\limits_{{\theta_{1} }}^{{\theta_{2} - \varepsilon }} {\int\limits_{\theta }^{{\theta_{2} }} {\left[ {{\text{MR}}(\zeta ) - {\text{MR}}(\theta_{1} )} \right]f(\zeta )d\zeta \left[ {\hat{q}(\theta - \varepsilon ) - \hat{q}(\theta )} \right]} d} \theta \\ \end{aligned} $$
(24)

Due to (15), the first term on the right of (24) tends to zero as \( \varepsilon \to 0. \) The second also tends to zero, when \( \varepsilon \to 0. \) The third term is clearly non-positive, in the light of (16) and the fact that, since \( \hat{q}(\theta ) \) is non-decreasing, \( \hat{q}(\theta - \varepsilon ) \le \hat{q}(\theta ). \) It follows that \( \{ \bar{q}(\theta ),\bar{z}(\theta )\} \) turns out to be a true solution to Problem 2 in the simple case here considered.

One can observe from (15) and (17)–(18) that \( \{ \bar{q}(\theta ),\bar{z}(\theta )\} \) fulfills conditions (11)–(12). Moreover, the continuity of \( \bar{q}(\theta ) \) implies that proviso (13) is also satisfied. The example suggests in this sense that piecewise differentiability may be an excessively strong requirement that rules out many optimal policies stemming from the necessary conditions for an optimum. It seems natural therefore to explore the possibility of extending the relevance of conditions (11)–(13) to a more general framework where piecewise differentiability is not inevitably present. This will be done with the help of:

Restriction i: Jumps in q(θ) can only occur on intervals all of whose points are points of increase.

Here θ is said to be a point of increase of q(θ) if, for all θ″ and θ′ such that θ″ > θ > θ′, we have q(θ″) > q(θ) > q(θ′). Of course, when every point of increase belongs to an open interval all of whose points are points of increase, the function concerned must be strictly increasing on that interval. But points of increase may form sets of a more complex structure, such as nowhere-dense sets and other possible types of Cantor sets.

Restriction ii: q(θ) cannot contain points of increase on intervals of singularity.

As is well known, a function is singular on an interval if its derivative vanishes almost everywhere there. Conspicuous examples of non-constant singular functions are Lebesgue’s singular function, also called the Cantor function, and the (paradoxical) strictly increasing singular function [cf. Takács (1978) for many examples]. Proviso ii means that if q(θ) is continuous and singular on an interval, it is constant there.
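To make the notion of a singular function concrete, the following sketch evaluates the Cantor function on [0, 1] from the ternary expansion of its argument, which is a standard construction. The helper name `cantor` and the truncation depth are choices made here for illustration. The function is continuous and non-decreasing and rises from 0 to 1, yet it is constant on every removed middle-third interval, so its derivative vanishes almost everywhere.

```python
# Illustrative sketch of the Cantor (Lebesgue singular) function on [0, 1].
# The helper name and the truncation depth are choices made for this example.

def cantor(x, depth=40):
    """Approximate the Cantor function at x using `depth` ternary digits."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    result, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)            # next ternary digit of the argument
        if digit == 1:
            # x lies in a removed middle third, where the function is flat
            return result + scale
        if digit == 2:
            result += scale       # ternary digit 2 becomes binary digit 1
        x -= digit
        scale /= 2.0
    return result

grid = [i / 1000 for i in range(1001)]
values = [cantor(x) for x in grid]

# Share of grid steps on which the function does not change at all.
flat_steps = sum(1 for a, b in zip(values, values[1:]) if a == b)
print("value at 0.25:", round(cantor(0.25), 6))
print("value on the middle third (0.4, 0.5, 0.6):",
      cantor(0.4), cantor(0.5), cantor(0.6))
print("flat grid steps:", flat_steps, "of", len(grid) - 1)
```

A candidate q(θ) that behaved like this function on some interval would be continuous and singular there without being constant, which is the situation that Restriction ii excludes.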

Definition

A function q(θ) is said to be well-behaved if it fulfils Restrictions i–ii.

A notable subset of well-behaved functions is that of absolutely continuous functions, i.e. those expressible as an integral of their own derivative. Consequently, all continuous and piecewise differentiable functions q(θ) are well-behaved, but the converse is far from true. In fact, a non-decreasing q(θ) may contain a countably infinite set of jumps, together with an uncountable null set of non-differentiable points, and still be well-behaved. Note that the property is clearly less demanding than the regularity condition imposed in Theorem 4.2 of Milgrom (2004, p. 113), since it allows for intervals on which q(θ) is continuous without being absolutely continuous.

Recall that the piecewise differentiability of q(θ) and P(q), together with the continuity of z(θ), was used by Mussa and Rosen (1978) when showing the correspondence between (6)–(7) and (9)–(10). Notwithstanding this, the next proposition allows Problem 1 to be replaced by Problem 2 without resorting to such assumptions:

Proposition 1

Constraints (9)–(10) are equivalent to constraints (6)–(7) within the set of all candidate trajectories q(θ) that are well-behaved.

Proof

See Ruiz del Portal (2007b). \( \square \)

The proof of Proposition 1 is a direct adaptation of the proof of Theorem 1 in Mirrlees (1971). Our main result can now be enunciated as:

Theorem

Let {z*(θ), q*(θ)} be a well-behaved solution to Problem 2. Then, it satisfies conditions (11)–(13) above.

Proof

See the Appendix. \( \square \)

The Theorem confirms the qualitative properties alleged by Mussa and Rosen (1978), without imposing the piecewise differentiability of q(θ). Other requirements present in Mussa and Rosen (1978), such as the differentiability of C′(q) or the continuity of f(θ), are not needed for the proof in the Appendix either, as can easily be checked.

In addition, our Theorem helps to explain why the solution of the example above satisfies the standard conditions (11)–(13) despite not being piecewise differentiable. Thus, now we can see that, since \( \bar{q}(\theta ) \) is a well-behaved solution, sufficient conditions (17)–(18) are also necessary for Problem 2. In reality, the fact that constraint (10) is binding on \( \left[ {\theta_{1} ,\theta_{2} } \right] \) in the optimum implies that the only way to show that (17)–(18) are the true necessary conditions is by invoking our Theorem.

It should be emphasized, on the other hand, that no advanced optimization technique is required for the derivation of the Theorem, since the procedure in the Appendix just consists in transforming Problem 2 into another program where we do not have to deal directly with constraint (10) above. The reason for the transformation, which is the key point of the proof, is that optimal control problems do not in general allow for constraints restricting functions to be monotone (and well-behaved). In the new program, the constraints replacing (10), i.e. (35)–(37) below, can easily be handled with the help of any standard theorem in optimal control theory that provides for equality-phase constraints. As to the concrete choice of Theorem 12.1 in Makowski and Neustadt (1974), it is justified by its scope of application, which is wide enough to extend our Theorem to a large variety of problems, such as those discussed in the next section.

Concerning the role played by Restrictions i–ii in the derivation of (11)–(13), it must be stressed that, without them, instead of these necessary conditions our analysis in the Appendix would lead us to the statement of necessity:

$$ \begin{aligned} \int\limits_{{\theta^{\prime}}}^{{\theta^{\prime\prime}}} {\left[ {{\text{MR}}(\theta ) - {\text{MC}}^{*}(\theta )} \right]} f(\theta )d\theta = 0 & \quad {\text{if}}\,\theta^{\prime\prime} ,\theta^{\prime}\,{\text{are}}\,{\text{points}}\,{\text{of}}\,{\text{increase}}\,{\text{of}}\,q^{*} (\theta )\,\underline{\text{not}} \\ & \quad \underline{\text{contained in an open interval of singularity}}. \\ \end{aligned} $$
(25)

Restriction ii permits deleting the underlined part of (25), hence excluding from the set of candidates q(θ) those functions that do not have a staircase shape on intervals of singularity. In turn, Restriction i establishes, through Lemma 3 below, the equivalence between (25), once its underlined part is deleted, and condition (11). Therefore, Restriction i guarantees that the equality \( {\text{MR}}\left( \theta \right) = {\text{MC}}^{*} \left( \theta \right) \) applies at any point of increase of q*(θ), even when this point does not belong to an interval of strict increase. Together, Restrictions i–ii also entail the continuity of q*(θ) at every end-point of any maximal interval of constancy. This implies that all intervals where q*(θ) has a staircase shape are intervals of constancy, thus allowing conditions (12)–(13) to be derived from (25) by a continuity argument.

In passing, it is interesting to check that “well-behavedness” is a binding requirement, in the sense that no solution to Problem 2 can fail to be well-behaved and still satisfy conditions (11)–(12) in the Theorem. First, the failure of Restrictions i–ii implies that q*(θ) cannot be constant along any maximal interval of singularity, which prevents condition (12) from ever holding. Second, Restriction i is needed for (11) because, owing to the continuity of MR(θ) and C′(q), jumps in q*(θ) are impossible on domains where (11) holds. Finally, a continuous q*(θ) cannot violate ii and at the same time fulfill (11) since, as can be seen by totally differentiating (11), this would mean that q*(θ) is also continuously differentiable almost everywhere. [Footnote 4]

4 Other principal-agent problems

As is well known, the problem of monopoly with product quality belongs to a broader family of problems that share with it a common structure. This family includes, among other topics, optimum nonlinear taxation [e.g., Seade (1977), Brunner (1993)]; non-linear pricing [e.g., Spence (1977), Armstrong (1996)]; the theory of the monopolist [e.g., Goldman et al. (1984) and Rochet and Choné (1998)]; the problem of optimal insurance [e.g., Stiglitz (1977)]; the so-called “principal agent problem” [e.g., Guesnerie and Laffont (1984)]; the optimal design of auctions [e.g., Milgrom (2004)]; the theory of incentives [e.g., Laffont (2000)].

A permanent feature in this family of problems is the existence of a vector parameter θ denoting a number of characteristics (i.e. ability, wage, taste, age, etc.) that are distributed in the population obeying a continuous density function f(θ). The principal ignores the characteristics of each agent, since his knowledge of θ is confined to a statistical level. The aim is to characterize a vector P(q) of policy-functions and a vector q(θ) of decision profiles, which optimize an objective-function (e.g. social welfare, consumer surplus, profit, pay-off, etc.) subject to several constraints. One of these constraints reflects individual behavior and contains the searched policy-functions.

As a second aspect in common, the solution is characterized by deriving the necessary conditions of an “instrumental program” where, as in Problem 2 above, P(q) is absent and the restriction of individual behavior is only implied. In its place we find, provided an appropriate single-crossing proviso is satisfied, one or several differential (or integral) equations supplemented by one or several monotonicity constraints denoting, respectively, first and second-order conditions for utility maximization. Customarily, the justification of the instrumental problem is based on the existence of a so-called “Constraint Simplification Lemma”, or of a “Constraint Reduction Theorem”.

Another affinity among these problems is that, thanks to their piecewise differentiability assumptions on q(θ), which guarantee that a well-behaved solution in the sense of our definition is achieved, the necessary conditions invariably exhibit two types of statements. [Footnote 5] First, there appear one or several pointwise conditions, similar to (11), holding at points where the solution exhibits a non-vanishing derivative. Second, there arise one or several integral conditions, applying like (12) on intervals of constancy, which reflect the existence of corners in the optimal policy-functions P(q).

All of these considerations suggest the possibility of extending the necessary conditions in principal-agent problems to a new framework where differentiability restrictions are not present, as we have done with the problem of monopoly with product quality. A strong argument for this is that, after inspecting the proof of our Theorem above, one can check that, for most of the problems mentioned, the procedure in the Appendix suggests a fairly straightforward derivation of necessary conditions at the optimum.

In any case, two possibilities must be distinguished with regard to the models concerned:

4.1 P(q), q(θ) and θ are just scalars

When this is the case, the method of proof of Theorem 1 in Mirrlees (1969, 1971) applies in full, as in our Proposition 1, and a single differential (or integral) equation, supplemented by a single monotonicity constraint, will suffice to depict the restriction expressing individual behavior. Therefore, a distinction between the original problem and an analogous instrumental program can be established, perhaps after a few adaptations, without invoking any kind of differentiability assumption on P(q) and q(θ). Now the monotonicity constraint will only require q(θ) to be well-behaved, with Restrictions i–ii expressed in terms, not just of points of increase, but of points of strict monotonicity so as to allow for constraints of non-increase.

It can be shown that, following the steps in the Appendix with respect to the resulting instrumental program, we shall reach both a pointwise condition and two interval conditions, similar to (11)–(12), as the necessary provisos that characterize the solution. This will be so provided there exist two circumstances, in connection with problems considered here, which ensure achieving the provisos mentioned. On the one hand, the transformation of the monotonicity constraint on q(θ) into identical constraints to (34)–(37) below may always be done as seems perfectly obvious; of course, the inequality in (37) must be reversed in case we face a monotonicity constraint of non-increase. On the other, the theorem considered in the Appendix, i.e. Makowski and Neustadt (1974, Theorem 12.1), presents a wide enough scope of applicability to allow for a complete characterization of the solution to the instrumental program in question.

In fact, once the monotonicity constraint has been transformed, all we must do is convert the resulting program into a minimization problem of Mayer, like Problem 3 in the Appendix. Applying Theorem 12.1 in Makowski and Neustadt (1974) to it will invariably lead us to the structure of necessary conditions described above. Aspects such as the adoption of a more general utility function will not involve any significant departure from the basic model, as long as a suitable single-crossing condition is assumed. The same can be said of the inclusion of additional constraints denoting production possibilities, or any other restriction on the principal, since this will introduce a new addend in the resulting necessary conditions, but nothing else. All this has been confirmed in Ruiz del Portal (2007a) for the problem of optimum income taxation, thus supporting the idea that the derivation of similar results for other principal-agent problems is fairly routine when P(q), q(θ) and θ are just scalars.

Incidentally, an additional advantage of our approach in the Appendix is that the continuity assumption, present in all the problems considered here with regard to the density function f(θ), may be relaxed to some extent. Thus, one can assume f(θ) to be discontinuous on an at most countable set of points without modifying the line of argument and type of results in the Appendix. [Footnote 6] In contrast, even when f(θ) is assumed to be continuous, Mussa and Rosen’s conclusion on the global continuity of q*(θ) will not naturally arise as a condition parallel to (13) above, unless some additional assumptions can be adopted for the particular problem in question.

4.2 P(q), q(θ) or θ may be vectors

Here, the greater complexity involved explains why the solution is sometimes characterized under the so-called “first-order approach”, i.e. a heuristic method for dealing with the instrumental program that neglects the monotonicity constraints on the assumption that they will end up being satisfied at the optimal solution (see footnote 7). Only a few contributions, such as Rochet and Choné (1998), take the monotonicity constraints into account when θ, q(θ) and P(q) can be vectors.

However, whether one adopts the first-order approach or takes into account the monotonicity constraints that express second-order conditions for utility maximization, it seems hard to extend the results of the preceding section to some of these problems. To begin with, we have to confine the analysis to models in which θ is one-dimensional since, to the best of the author's knowledge, there is no theorem in optimal control theory that allows for multidimensional θ and offers the possibilities of application of Theorem 12.1 in Makowski and Neustadt (1974).

On the other hand, one may conjecture that, at least when θ is a scalar but P(q) and q(θ) are still vectors, an equation like (38) below can always be found by proceeding as in the Appendix. Even so, the resulting pointwise condition equivalent to (11) will then hold only almost everywhere. The reason is that the sort of proof employed in Lemma 3 below fails to work without assuming that q*(θ) is continuous except on an at most countable set of points, something that the monotonicity constraint implies when q(θ) is a scalar.
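
For completeness, the scalar fact invoked here is the classical result that a monotone function has at most countably many discontinuities. A brief sketch of the argument follows, assuming for illustration that q is non-decreasing and finite on a bounded interval [\underline{\theta}, \bar{\theta}]; the symbols j, D and D_n are introduced only for this sketch.

```latex
\begin{align*}
&\text{Jump at an interior point:}
  && j(\theta) = q(\theta^{+}) - q(\theta^{-}) \ge 0,
     \qquad D = \{\theta : j(\theta) > 0\}. \\[4pt]
&\text{Bound on total jumps:}
  && \sum_{k=1}^{m} j(\theta_k) \le q(\bar{\theta}) - q(\underline{\theta})
     \quad \text{for any distinct } \theta_1, \dots, \theta_m \in D. \\[4pt]
&\text{Large jumps are finitely many:}
  && D_n = \{\theta : j(\theta) > 1/n\}
     \;\Longrightarrow\;
     \# D_n \le n \bigl( q(\bar{\theta}) - q(\underline{\theta}) \bigr). \\[4pt]
&\text{Countable union:}
  && D = \bigcup_{n \ge 1} D_n \quad \text{is at most countable.}
\end{align*}
```

The two endpoints can add at most two further points of discontinuity, and the non-increasing case follows by applying the same argument to -q.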

5 Conclusions

We have reexamined the question of characterizing solutions to principal-agent problems, from the necessary conditions for an optimum, in economies where individual characteristics θ are unknown. The standard approach adopts strong assumptions on the candidates for the solution, typically restricting the analysis to decision profiles q(θ) that are piecewise differentiable. However, even in environments where continuity of the solution q*(θ) is formally established, piecewise differentiability implies a non-negligible loss of generality. This limitation has been illustrated with an example showing not only that a non-piecewise differentiable q*(θ) exists, but also that if such a q*(θ) is well-behaved, in the sense of not being too uneven, it will fulfil the same necessary conditions as under piecewise differentiability.

Taking the example as a starting point, the present paper has relaxed the standard assumptions by considering well-behaved decision profiles, namely those such that (i) jumps in q(θ) occur only on intervals all of whose points are points of strict monotonicity, and (ii) intervals of singularity for q(θ) cannot contain points of strict monotonicity. In doing so, we have first relaxed piecewise differentiability in the quality provision model of Mussa and Rosen (1978), and then demonstrated that the same sort of generalization applies to a large class of agency problems. As a consequence, existing results prove to be more general than initially noted: they hold under weaker conditions than those usually adopted, not only for endogenous objects like q(θ) or the policy-function P(q), but also for an exogenous object like the density function of types f(θ). More precisely, the conclusions are:

  (1) Qualitative results in Mussa and Rosen (1978) still apply after replacing the piecewise differentiability of P(q) and q(θ) by Restrictions i–ii.

  (2) Conclusion 1 holds as well in connection with programs of the principal-agent type such that P(q), q(θ) and θ are one-dimensional. The only exception here is the result on the global continuity of q*(θ), which seems to depend heavily on the particular characteristics of the problem of Mussa and Rosen (1978).

  (3) Conclusion 2 also applies even if f(θ) is allowed to be discontinuous on an at most countable set of points.

These conclusions considerably extend the scope of principal-agent theory, implying that the main results in the literature are robust when P(q), q(θ) and θ are scalars. It would therefore be worthwhile to explore whether similar conclusions can be reached in models where P(q), q(θ) or θ may be multidimensional.