Friday 13 July 2018

Liberal leadership in ideas: the welfare state


I have never liked the term "welfare state" with its paternalist tone, preferring "social state", derived from the German Sozialstaat and fitting with the concept of the social market as a regulated, fair and free economy. It was interesting to read today in The Economist (Repairing the safety net - The welfare state needs updating) that the founding spirit of the UK welfare state, the Liberal William Beveridge, didn't like the term either. However, and more importantly, in that article and the introductory leader the point is made that it was the liberal philosophy of Beveridge that informed the identification of the need, scope and form of the UK welfare reforms of the late 1940s. The main thrust of these articles in The Economist is that, to redesign the welfare state, we must return to liberal values and to creative thought on the mutually supporting nature of effective welfare and effective wealth creation.

As chance would have it, this is a topic I addressed in a recent post, Hayek and the welfare state, in which the argument was made that a principled and humane approach to welfare was not only compatible with but should be an essential complement to the market economy. This is very much the tone of the pieces in The Economist too.

Years of tinkering and short-term, politically motivated meddling from left and right have led to the UK's social protections becoming a bureaucratic entanglement that is costly and provides at best a second-rate service. In addition, the weakening of the liberal tradition and of the Liberal Party, and the consequent decrease in their influence, has contributed to this state of affairs. This is not the time to roll back to 1945 in terms of detail and implementation but, just as the market has evolved to become more diverse and international, social protection must also be thought through again. Neither is it a mere matter of picking up The Orange Book, although it does provide a valuable starting point. What should be returned to is the core liberal philosophy, with its humane side personified by Beveridge dealing with the “Five Giants” (disease, idleness, ignorance, squalor and want) and its competitive market foundation for the creation of wealth. We can look to rounded enlightened figures such as Adam Smith and John Stuart Mill as well as the more austere Friedrich Hayek. Nor should the still very pertinent thinking of the German social market economists such as Walter Eucken be neglected, nor that of others across the world. One of the most recent champions of liberal enlightenment values, Steven Pinker in Enlightenment Now, was at pains to stress that it is the combination of social care and wealth creation that has given us the progress in global well-being that he documents in such detail.

Reform is needed to address the affordability and fairness of a social provision system that is designed to provide the safeguards that foster wealth creation rather than undermine it. The social safety measures must provide quality services that people are proud to use and that make them proud of the society that provides them. Services such as the NHS do not provide this quality and are currently condemned not to be capable of delivering it, as outlined in a recent opinion piece by Matthew Parris.

To repeat the essential points from my earlier post, four principles can be proposed to help design social insurance that can enhance market dynamism and economic freedom in a Free-Market Welfare State:
  • Risk and Entrepreneurship.  As the term “safety net” suggests, social insurance can enhance risk-taking and entrepreneurship by ensuring failure is not catastrophic.
  • Search and Adjustment Costs. Workers who are laid off in periods of market restructuring should be ensured a smooth transition through appropriate wage replacements and active labour-market policies. While your job may not be secure, your employment is. 
  • Benefit Portability. Markets work best when social benefits follow the individual and are detached from any particular firm or market structure. (In the UK many people are trapped in a firm by penalties imposed on their pension entitlement. In contrast, the German system decouples this, as recommended.)
  • Migration Robustness. Welfare benefits should be payments or services resulting from insurance funds to which people have contributed while working in the host country; migrants who claim such benefits should therefore not be perceived to be a great problem. There need to be further humanitarian safeguards for refugees and others in dire need, as opposed to economic migrants.


Conditional probability: Renyi axioms


In earlier posts the relationship of the material conditional to conditional probability and the role of Leibniz in the early philosophy of probability were discussed. In both posts the case for taking conditional probability as fundamental was made or implied. How far this will resolve the difficulties in combining aspects of propositional logic with probability theory remains to be seen, but it is worth taking time to explain a full axiomisation with conditional probability as fundamental. A further consideration is the clarification of the distinct roles of conditional probability in the epistemic and the objective (ontological) interpretations.

In his Foundations of Probability, Renyi provided an alternative axiomisation to that of Kolmogorov that takes conditional probability as the fundamental notion, giving a direct axiomisation of quantitative conditional probability; otherwise he stays as close as possible to Kolmogorov. In brief, Renyi's conditional probability space $(\Omega, \mathcal{F}, \mathcal{G}, P(F | G))$ is defined as follows. The set $\Omega$ is the sample space of elementary events, $\mathcal{F}$ is a $\sigma$-field of subsets of $\Omega$ (so far as with Kolmogorov) and $\mathcal{G}$ is a subset of $\mathcal{F}$ (called the set of admissible conditions) having the properties:
(a) $ G_1, G_2 \in \mathcal{G} \Rightarrow G_1 \cup G_2 \in \mathcal{G}$,
(b) $\exists \{G_n\}_{n=1}^{\infty} \subseteq \mathcal{G}$ such that $\bigcup_{n=1}^{\infty} G_n = \Omega,$
(c) $\emptyset \notin \mathcal{G}$,
$P$ is the conditional probability function satisfying the following four axioms.
R0. $ P : \mathcal{F} \times \mathcal{G} \rightarrow [0, 1]$,
R1. $ (\forall G \in \mathcal{G} ) ,  P(G | G) = 1.$
R2. $(\forall G \in \mathcal{G}),  P(\centerdot | G)$ is a countably additive measure on $\mathcal{F}$.
R3. $(\forall G_1, G_2 \in \mathcal{G}) G_2 \subseteq G_1 \Rightarrow P(G_2 | G_1) > 0$, $$(\forall F \in \mathcal{F}) P(F|G_2 ) = { \frac{P(F \cap G_2 | G_1)}{P(G_2 | G_1)}}.$$
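To make the axioms concrete, here is a minimal sketch of a finite conditional probability space. It assumes (this is not in Renyi's text as summarised above) a four-point sample space with hypothetical strictly positive weights, takes $\mathcal{F}$ to be the power set and $\mathcal{G}$ to be all non-empty subsets, and defines $P(F|G)$ as the weight of $F \cap G$ divided by the weight of $G$. The names `OMEGA`, `WEIGHT`, `P` and `check_axioms` are illustrative only. On a finite space countable additivity reduces to finite additivity, which is what is checked.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical finite example: sample space with strictly positive weights.
OMEGA = frozenset({1, 2, 3, 4})
WEIGHT = {1: Fraction(1), 2: Fraction(2), 3: Fraction(3), 4: Fraction(4)}


def subsets(s):
    """All subsets of s as frozensets (here the sigma-field F is the power set)."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


F_SETS = subsets(OMEGA)
G_SETS = [g for g in F_SETS if g]  # admissible conditions: all non-empty sets


def w(s):
    """Total weight of a set."""
    return sum((WEIGHT[x] for x in s), Fraction(0))


def P(f, g):
    """Conditional probability P(F | G) generated by the weights."""
    return w(f & g) / w(g)


def check_axioms():
    """Return True if R0-R3 hold for every admissible condition."""
    for g in G_SETS:
        if P(g, g) != 1:                       # R1
            return False
        for f1 in F_SETS:
            if not 0 <= P(f1, g) <= 1:         # R0
                return False
            for f2 in F_SETS:                  # R2 (finite additivity)
                if not (f1 & f2) and P(f1 | f2, g) != P(f1, g) + P(f2, g):
                    return False
    for g1 in G_SETS:                          # R3 for nested conditions
        for g2 in G_SETS:
            if g2 <= g1:
                if not P(g2, g1) > 0:
                    return False
                for f in F_SETS:
                    if P(f, g2) != P(f & g2, g1) / P(g2, g1):
                        return False
    return True
```

Calling `check_axioms()` runs through every pair of sets; exact rational arithmetic via `Fraction` avoids spurious failures from rounding. This is only one simple way of generating a space that satisfies the axioms, not the general case.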
What has this gained over the better-known Kolmogorov formulation?

A number of examples have been highlighted by Stephen Mumford, Rani Lill Anjum and Johan Arnt Myrstad in What Tends to Be, Chapter 6. These have been analysed by them using absolute probabilities as fundamental, that is, within a Kolmogorov-type framework, and these examples will be revisited here using Renyi's formulation. The critique of Mumford et al. is based on the development of an ontological point of view that has the potential to clarify physical propensities as degrees of causal disposition. The explicit clarification of the examples within Renyi's axiomisation shows that by adopting it the path is open to mathematically modelling physical propensities as causal dispositions.

Here is the first example that is thought to indicate a problem with absolute probability (absolute probability will be denoted by $\mu$ below to avoid confusion with Renyi's function $P$).
P1. If $\mu(A) = 1$ then $\mu(A | B) =1$, where $\mu$ is Kolmogorov's absolute probability.
We can calculate this result from Kolmogorov's conditional probability postulate as follows: since $\mu(A)=1$ gives $\mu(A \cap B) = \mu(B)$, for any $B$ with $\mu(B)>0$ we have $\mu(A|B) = \mu(A \cap B)/\mu(B) = \mu(B)/\mu(B)=1$. Why is this problematic? It is not at all problematic if you stay inside the formal mathematics, but it is if $\mu(A|B)$ is to be understood as a degree of implication. Is it not reasonable that there must exist a condition under which the probability of $A$ decreases? A consequence of Renyi's theory is that these Kolmogorov absolute probabilities can be obtained by conditioning on the entire sample space
$$ \mu(A) \equiv P(A|\Omega).$$
Then $\mu(A)=1$ means $P(A|\Omega)=1$ and again (by R3.)
$$P(A|B) = { \frac{P(A \cap B | \Omega)}{P(B |\Omega)}}=1 $$
independently of the choice of $B$, which must be a subset of $\Omega$, thus giving the same result. However, if we are not working within a global conditioning on the entire $\Omega$ but on a proper subset of $\Omega$, called $G$ say, then $\mu(A)=1$ has no consequence for $P(A|G)$. In addition, it is now possible to pick another conditioning subset of $\Omega$, $G'$, such that $G' \not\subseteq G$; then R3. does not apply and therefore the values of $P(A|G)$ and $P(A|G')$ have to be evaluated separately. That is, it is a modelling choice. How they are evaluated depends on whether an epistemic or an objective interpretation of $P$ is being used.

A further problematic consequence of Kolmogorov's conditional probability arises when $A$ and $B$ are probabilistically independent:
P2. $\mu (A\cap B)=\mu(A )\mu(B)$ implies $\mu(A|B)=\mu(A)$.
In general Renyi's formulation does not allow this analysis to be carried out. This is because the Kolmogorov conditional probability formula only holds under special conditions, see R3. Independence, in Renyi's scheme, is only defined with reference to some conditioning set, $C$ say. In which case probabilistic independence is defined by the condition
$$ P(A \cap B |C) = P(A|C)P(B|C)$$
and as a consequence it is only if  $B \subseteq C$ that
$$   P(A|B ) = { \frac{P(A \cap B | C)}{P(B | C)}} = P(A|C)$$
that is, only if $C$ includes $B$. Therefore, $P(A|C)$ being large only implies $P(A|B)$ is equally large when $C$ includes $B$, using the mapping to the material implication in propositional logic as shown in the earlier post.
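The role of the inclusion $B \subseteq C$ can be illustrated with a small numerical sketch. It assumes a hypothetical six-point sample space with equal weights, so that $P(F|G)$ is simply the fraction of the points of $G$ that lie in $F$; the sets `A`, `C`, `B_INSIDE` and `B_OUTSIDE` are chosen purely for illustration. Both candidate sets are independent of $A$ relative to $C$, but only the one contained in $C$ inherits the value $P(A|C)$.

```python
from fractions import Fraction

# Hypothetical example: equal weights on a six-point sample space.
OMEGA = frozenset(range(1, 7))


def P(f, g):
    """P(F | G) = |F ∩ G| / |G| for a non-empty condition G."""
    return Fraction(len(f & g), len(g))


def independent_given(a, b, c):
    """Independence of a and b relative to the conditioning set c."""
    return P(a & b, c) == P(a, c) * P(b, c)


A = frozenset({1, 2})
C = frozenset({1, 2, 3, 4})
B_INSIDE = frozenset({1, 3})          # contained in C
B_OUTSIDE = frozenset({1, 3, 5, 6})   # not contained in C

# Both sets are independent of A relative to C ...
both_independent = (independent_given(A, B_INSIDE, C)
                    and independent_given(A, B_OUTSIDE, C))

# ... but only conditioning on the set inside C reproduces P(A | C).
inside_matches = P(A, B_INSIDE) == P(A, C)
outside_matches = P(A, B_OUTSIDE) == P(A, C)
```

Here `both_independent` and `inside_matches` come out true while `outside_matches` is false, which is the behaviour described above for a conditioning set that is not included in $C$.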

The third example, P3., is that, regardless of the probabilistic relation between $A$ and $B$, the Kolmogorov conditional probability definition implies that whenever the probability of $A$ and $B$ together is high, $\mu(A|B)$ is high and so is $\mu(B|A)$:
P3. $(\mu(A \cap B) \sim 1) \Rightarrow ((\mu(A|B) \sim 1) \land (\mu(B|A) \sim 1))$
As above this carries over into Renyi's axiomisation only for the case of conditionalisation on the whole sample space. If another conditioning set is used, call it $C$ again, then P3. does not hold in general. It does hold, or its equivalent does, when both $A$ and $B$ are subsets of $C$ but that is then a reasonable conclusion for the special case of both $A$ and $B$ included in $C$.

Tuesday 10 July 2018

Leibniz and the propensity interpretation of probability

The point of focus here is the propensity interpretation of probability theory, in which probabilities are physical tendencies that cause events. Contemporary interest in the interpretation is down to Karl Popper and it has been picked up by Mellor. It is now playing a role in the dispositional metaphysics of objective chance. The origins and initial philosophical discussion of probability can be traced to Pascal and Leibniz and, it is argued, something close to the propensity interpretation is attributable to Leibniz too. This role of Leibniz came as a surprise on reading The Emergence of Probability by Ian Hacking. Hacking presents Leibniz as the first philosopher of probability and principal guide to the early development of the theory. In addition, as is often the case, the thinking of Leibniz was ahead of his own epoch and many of his points can best be appreciated only following developments in the 20th century.

The origin of probability as a useful science is primarily attributed to Blaise Pascal (1623-1662) and Pierre de Fermat (1601-1665) in a correspondence motivated by a request from the Chevalier de Méré for mathematical guidance on games of chance. The answer that Pascal and Fermat developed is that Probability Theory is built upon a fundamental set of equally likely outcomes. This approach is somewhat circular but can be interpreted as an argument based on symmetry, and this leads Leibniz naturally to the argument from indifference in the interpretation of the theory of probability.

The principle of indifference can take various forms:
  • If there is no reason that one event or outcome will happen more often than another then they are equiprobable.
  • If there is no reason to prefer the outcome of one event over another then they are equiprobable.
  • If it is believed that one event will be no more likely to happen than another then they are equiprobable.
The interpretations of probability that derive from Pascal’s principle of symmetry (or equally likely cases) must be distinguished from the logical interpretation. Like so much of his work, most of Leibniz's thoughts on the relationship between probability and logic were not published in his lifetime; however, there are important letters. Here is a representative and published quotation from Nouveaux Essais sur l’entendement humain:
J’ay dit plus d’une fois qu’il faudroit une nouvelle espece de Logique, qui traiteroit des degrés de probabilité, ... (I have said more than once that there is need of a new type of logic, which will deal with the degrees of probability ...)
Leibniz was more optimistic than Locke that this can be done; Locke viewed it as “impossible to reduce to precise rules the various degrees wherein men give their assent.” Leibniz believed that a logical analysis of conditional implication would yield such rules; however, this is still considered problematic. The relationship that he saw here was that probability is useful when there is insufficient knowledge to make a rigorous deduction. Leibniz's logical approach began from legal considerations (he trained as a lawyer), where there is uncertainty in the determination of a question of right (e.g., to property) or guilt. His approach is also important for the emphasis that conditional or relational probabilities are fundamental.

As a young man of 19, Leibniz published a paper proposing a numerical measure of proof for legal cases: “degrees of probability.” His goal was to render jurisprudence into an axiomatic deductive system akin to Euclidean geometry. So, the goal was to transform evidence (a legal notion), into something to be measured by some allocation of weight that will make calculation of justice possible. However, he was also convinced that there had to be an objective and correct situation. From this he developed a dual interpretation of probability:
  • Epistemic - dealing with uncertainty due to lack of knowledge
  • Objective - dealing with the degree of feasibility for possibilities to be physically realised.
The epistemic view came first and initially dominated, and its conditional character was inherited through its emergence from legal considerations. A bridge is provided by the legal term cases, or casu in Latin, which also means events. Events happen and are part of the standard terminology in modern probability theory. Another term, important across Leibniz's philosophy, is possibility. Leibniz associated equi-possibility with probability and asserted that probability is a degree of possibility. Here he means by possibility the power to achieve various events. In a letter to Bourguet (in Die Philosophischen Schriften von Gottfried Wilhelm Leibniz: Band 3, ed. Gerhardt) Leibniz states:
L'art de conjecturer est fondée sur ce qui est plus ou moins facile, ou bien plus ou moins faisable ... (The art of conjecture is founded on that which is more or less easy or, better, more or less feasible ...)
So, there are now degrees of feasibility that are not dependent on any state of knowledge. However, these degrees of feasibility, propensities, objective possibilities can be themselves objects of knowledge.

Leibniz distinguishes between the epistemic probability that some possibility is realised and the physical, objective or ontological propensity for some possibility to exist. The relationship between the two is still problematic. For Leibniz, every possibility tends to exist and so every possible world has its tendency to exist to a degree that depends on its feasibility. Leibniz had access to a metaphysical synthesis that provides important insights even if we cannot subscribe to it.

Monday 9 July 2018

Material implication and conditional probability

A simple argument shows that in general the ratio formula for conditional probability cannot be the probability of the material conditional. But there is still controversy over both.


Despite the undoubted success of probability theory in providing tools for inference, statistical analysis and decision making, there remain concerns about its foundations. A major concern is with the status of conditional probability and its relationship with logical implication (the indicative conditional). In propositional logic, material implication provides the formal concept. Although this is often glossed over in standard texts, it is taken seriously by E. W. Adams. However, his solution, giving primacy to conditional probability, is also open to criticism. These points are of practical importance, as the status of inference and its foundations in logic, probability and set theory are fundamental to the development of Artificial Intelligence.

Adams’ thesis is that the assertability of the indicative conditional $A \to B$ is given by the conditional probability of $B$, given $A$. For example, he writes: “Take a conditional which is highly assertible, say, ‘If unemployment drops sharply, the unions will be pleased’; it will invariably be one whose consequent is highly probable given the antecedent. And, indeed, the probability that the unions will be pleased given unemployment drops sharply is very high”

The default standard foundation of probability theory is the axiomisation of A.N. Kolmogorov. This takes as one of its primitives a function denoting the probability of a set and these sets are called random events. An event is something that happens or has the potential to happen.

In Kolmogorov's theory a probability space $\left( \Omega, \Sigma,\mu \right)$ consists of a set $\Omega$ (called the sample space), a $\sigma$-algebra $\Sigma$ of subsets of $\Omega$ (i.e., a set of subsets of $\Omega$ containing $\Omega$ and closed under complementation and countable union, but not necessarily consisting of all subsets of $\Omega$) whose elements are called measurable sets, and a probability measure $\mu:\Sigma \rightarrow \lbrack 0,1\rbrack$ satisfying the following properties:
P1. ${\mu}\left(X \right){\geq 0}$ for all $X \in \Sigma$
P2. ${\mu}\left( \Omega \right){= 1}$
P3. ${\mu}\left( {\bigcup}_{i = 1}^{\infty}{\ }X_{i} \right){= \ }\sum_{i = 1}^{\infty}{\mu(}X_{i}{)}$, if the $X_{i}$'s are pairwise disjoint members of $\Sigma$.
P4. $\mu(A | B) = \frac{\mu(A \cap B) }{\mu(B)}$, for $\mu(B) > 0$
Postulate P4 provides an analysis of conditional probability. It is more often referred to as the definition. However, conditional probability was current as a concept prior to the axiomisation, in the sense of
The probability of $A$ given $B$,
The probability of "if $B$ then $A$"
or
The probability that $B$ implies $A$.
 In the usage prior to the formalisation of probability $A$ and $B$ are not sets but usually statements or propositions. So a relationship between propositions and sets is needed.

In propositional logic, material implication is a rule of replacement that allows for a conditional statement to be replaced by a disjunction in which the antecedent is negated. The rule states that $P$ implies $Q$ is logically equivalent to not-$P$ (in symbols $\neg P$) or $Q$.
$$ P \to Q \Leftrightarrow \neg P\lor Q$$
where $\Leftrightarrow$ denotes logical equivalence.

There is a straightforward mapping between the sets and connectives in the set-based axiomisation and the propositions and connectives in propositional logic. The correspondence of connectives is:
  • $\cup$ corresponds to $\lor $
  • $\cap$ corresponds to $\land$
  • $\Omega$ corresponds to $\mathbf{t}$ (the single extension of all tautologies)
  • $\emptyset$ corresponds to $\mathbf{f}$ (the single extension of all falsehoods)
  • The set complement ($\bar{A}$ for any $A \in \Sigma$) corresponds to $\neg$ (negation). 
This would mean
  • $\bar{A} \cup B$ corresponds to $P \to Q$, where proposition $P$ pertains to the event represented by $A$ and $Q$ pertains to the event represented by $B$.
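So, what is the relationship between $\mu(B | A)$ and $\mu(\bar{A} \cup B)$? A simple analysis shows that they are only equal in very special cases. Consider the partition of $\Omega$ shown in the diagram below, in which $a = \mu(A \cap \bar{B})$, $c = \mu(A \cap B)$, and $b$ and $d$ are the probabilities of the two remaining cells, which lie outside $A$, so that $a+b+c+d=1$.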

 From this it follows:
$$ \mu(B|A) = \frac{c}{a+c}$$
and
$$ \mu(\bar{A} \cup B) = b+c+d = 1-a$$
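Therefore, in this case,
$$ \mu(B|A) = \mu(\bar{A} \cup B) \Rightarrow a\,(1-a-c) = 0, $$
that is, either $a = 0$, so that $\mu(B|A) = 1$, or $a + c = \mu(A) = 1$.

So, apart from the degenerate case in which $B$ is certain given $A$, the two terms are only equal when $A$ is the certain event. In general, the ratio formula for conditional probability cannot be the probability of the material conditional. In general, the relationship is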
$$\mu(\bar{A} \cup B) = 1 - \mu(A)(1- \mu(B|A))$$
This is equivalent to the result stated by E. W. Adams in "The Logic of Conditionals: An Application of Probability to Deductive Logic" page 3.
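The general relationship can be checked numerically. The sketch below assumes the reading of the partition implied by the formulas above, namely $a = \mu(A \cap \bar{A \cap B})$ restricted to $A$, i.e. $a = \mu(A \cap \bar{B})$ and $c = \mu(A \cap B)$, so that $\mu(A) = a + c$ and $\mu(\bar{A} \cup B) = 1 - a$. Since $a = \mu(A)(1 - \mu(B|A))$, the general relationship follows directly. The function names and the sample values are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def conditional(a, c):
    """mu(B|A) = c / (a + c), with a = mu(A and not-B), c = mu(A and B)."""
    return c / (a + c)


def material(a):
    """mu(not-A or B) = 1 - a, the probability of the material conditional."""
    return 1 - a


def general_relation_rhs(a, c):
    """Right-hand side 1 - mu(A) * (1 - mu(B|A)) of the general relationship."""
    mu_a = a + c
    return 1 - mu_a * (1 - conditional(a, c))


# A generic case: mu(A) = 1/2, split evenly between B and not-B.
a, c = Fraction(1, 4), Fraction(1, 4)
generic_cond = conditional(a, c)   # conditional probability mu(B|A)
generic_mat = material(a)          # probability of the material conditional

# The case where A is the certain event: a + c = 1.
a1, c1 = Fraction(1, 3), Fraction(2, 3)
certain_cond = conditional(a1, c1)
certain_mat = material(a1)
```

In the generic case the two quantities differ (one half against three quarters), while in the case $\mu(A) = 1$ they coincide, and in both cases `general_relation_rhs` agrees with `material`.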

The morphism between propositional logic and set theory is used extensively in interpreting theories of probability. It preserves structure but does not extend to implication and it does not entail that meaning or ontological status is preserved. It is from the direction of metaphysical analysis of the ontological status of conditionals, both logical and probabilistic, that progress may be made. In a recently published book,  What Tends to Be, Rani Lill Anjum and Stephen Mumford provide a synthesis of this analysis.

In probability theory alternative axiom systems may be the answer and candidates exist from Renyi and Popper. However, the ontological analysis may indicate that the eventual practical answers lie in something more akin to physics rather than logic or mathematics. Future posts will engage critically with this work.

Tuesday 3 July 2018

Beyond the Evidence Base - the strength of causal explanations

The term "evidence based" is often used in statements in health care or science policy that are intended to indicate respect for a scientific approach. Indeed evidence is essential for testing scientific theories and specific statements but science provides something much more powerful and that is an explanatory theory.

Theories arise in science through a critical process that incorporates much debate and draws on past theories, philosophy (if only implicitly) and, of course, evidence. Having passed several tests and often having gone through several formulations a theory will be accepted quite generally as the best current explanation of the facts in the domain to which it applies. The main point to be made here is that the power of the theory goes beyond, and cannot be derived from, the evidence. This provides the ability to understand and predict states of affairs that are not covered by the current evidence base.

The power of explanatory theory can be used to eliminate, provisionally, courses of action and to guide positive proposals. As an example of how a philosophical analysis can contribute to clarifying these issues, there is a recent paper by Rani Lill Anjum, "Evidence Based or Person Centered? An Ontological Debate", that uses the example of health care to analyse the limitations of the "evidence based" approach. This critiques the positivist underpinning of Evidence Based Medicine and provides a strong alternative. Lip service is still paid to positivism by some prominent scientists, but following the work of Karl Popper and others its limitations are clear. However, the work of Anjum with her colleague Stephen Mumford is developing philosophical tools that provide a conceptual framework for developing comprehensive causal explanations founded on a dispositional ontology.

Because scientific theories provide explanations that go beyond the evidence base, they can make strong statements about situations where the evidence is missing and may be too difficult or expensive to generate. However, there is a risk that "evidence based" arguments will be used to undermine the power that theory provides to spell out the consequences of misguided actions. Climate change provides a simple example of an area where well-established theory can make statements of global significance. The well-established consequences of adding CO2 to the atmosphere, together with the input that mankind has indeed added vast quantities of CO2, provide a very strong and simple case for human-driven climate change. That is, that human action is a cause of climate change. In greater detail the same theories can quantify and provide testable predictions, and an increased evidence base is the output rather than the input.