# passingconcierge

#
**passingconcierge**
t1_j2dx9el wrote

Reply to **In opposite : could you list things cheap today that will be unaffordable in 2030 ? (and why)** by **salutbobby**

Anything *managed* by software. Software subscription models are going to kill innovation.

#
**passingconcierge**
t1_iz480o8 wrote

Reply to comment by **owlthatissuperb** in **Causal Explanations Considered Harmful: On the logical fallacy of causal projection** by **owlthatissuperb**

> When I'm talking about labeled vs unlabeled, what I really mean is that we have some intuition for how the labeled dataset might behave. E.g. "an increase in money supply causes an increase in inflation" is a better causal hypothesis than "an increase the president's body temperature causes an increase in inflation". We can make that judgement having never seen data, based on our understanding of the system.

What you have here is a circular argument. You are arguing that we can label variables with theory-driven labels and so infer causality between those labels. You have already theorised causality without the data. So the data is not the source of explanation; it is merely a means to, rhetorically, assert that causality is the explanation. You have a causal explanation in mind, you label the data informed by that causal explanation, and then you carry out a mathematical operation on the labelled numbers and so, *because you have labelled them*, you infer a causal explanation.

So you are correct: you can make a *judgement* without seeing the data. The data adds nothing to your understanding of the system *because* you have started from a theory, a model, and carried out your activities with the causal relationship in mind. The data does not "contain causal knobs".

> Having made that hypothesis, we can look back to see if the data support it. The combination of a reasonable causal mechanism, plus correlated data, is typically seen as evidence of causation.

I would argue that what you are doing here is establishing rules for a rhetoric. Let us assume that we both accept mathematics is a kind of unbiased source of knowledge. This is a broad and possibly unwarranted assumption that would need refining, but accept it, broadly, for now.

You have a set of data which you *recognise* as x and y values. You have no theoretical labels to attach to them. But you list them and, being lazy, you use a spreadsheet to tell you that the y column can be derived from the x column by

```
f(x) = x^2 with R^2 = 1
```
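That spreadsheet step can be sketched in code. This is a minimal sketch: the column values are hypothetical, and the fit is checked by computing the coefficient of determination directly rather than via any spreadsheet function.

```python
# Hypothetical unlabeled columns: x = 1..99, where y happens to equal x^2
xs = list(range(1, 100))
ys = [x ** 2 for x in xs]

# Candidate model the spreadsheet proposes: f(x) = x^2
predictions = [x ** 2 for x in xs]

# Coefficient of determination: R^2 = 1 - SS_res / SS_tot
mean_y = sum(ys) / len(ys)
ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot

print(r_squared)  # 1.0
```

The residuals are identically zero, so R^2 comes out as exactly 1: the data "100% supports" the model, in precisely the sense discussed below.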

So you are happy. The coefficient of determination ( R^2 ) tells you that the data "100% supports" the y=x^2 hypothesis. You are happy until someone comes along and says, have you considered

```
f(x) = x * x
f(x) = sqrt(g(x)), g(x) = x * x
f(x) = (x * x * x) / x
f(x) = (x * x * x * x) / (x * x)
f(x) = (x^n) / (x^(n-2)) for all n > 2
```
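A quick numeric check, as a sketch over hypothetical positive sample values, that these are all the same relationship in different clothes:

```python
# Syntactically different, extensionally identical ways of squaring x
formulations = [
    lambda x: x * x,
    lambda x: (x * x * x) / x,
    lambda x: (x * x * x * x) / (x * x),
    lambda x: x ** 5 / x ** 3,  # one instance of the general x^n family
]

for x in [1.0, 2.0, 3.0, 10.0]:
    results = {f(x) for f in formulations}
    # Every formulation yields the same value of y for this x
    assert results == {x * x}
```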

You object that this is all just messing about with variations on squaring things. I agree. But I point out that all I am doing is *showing* that there is more than one way to express a *relationship of x to y* but, generally, avoiding the use of y as a label.

So when you have f(x) = sqrt(g(x)), g(x) = x * x, it is an awful circumlocution, but it demonstrates that you can have a whole range of things "happening" to avoid using y. Which raises an interesting point about your notions of labelling data.

For a moment, pretend x can be relabelled "money supply" and y can be relabelled "inflation". We have the data set, as before, {(1,1),(2,4),(3,9), ..., ( n,n^2 )}, and we are supposing that the relationship is f(x) = sqrt(g(x)), g(x) = x * x, or it is f(x) = x * x. First things first:

```
f(x) is clearly to be relabelled as inflation.
g(x) is also inflation (see your point^1 below)
sqrt(g(x)) is money supply
```

Your point is that labelling clarifies *causality*. Now, in mathematics it is permissible to rearrange a formula. But you are inferring *causality*, and the only symbol common to all of the *formulations* is the equals sign - which you might be holding in place of "causes". That does correspond to your notion of Directed Acyclic Graphs, but it then places a huge constraint on what you can actually say with labels.

So, because we have two formulations that you definitely agree on - the ones in the footnote - you can, rhetorically, say that we cannot tell if the causal case is

```
y=x^2 .................... y is caused by x^2
x=sqrt(y) ................ x is caused by sqrt(y)
```

which is then translated into

```
inflation is caused by squaring the money supply
the money supply is caused by square rooting inflation
```
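That symmetry is easy to exhibit. In this sketch the labels are hypothetical: both readings reproduce the same data set exactly, so goodness of fit alone cannot choose between them.

```python
import math

# Hypothetical labelled data: (money_supply, inflation) pairs with y = x^2
data = [(n, n ** 2) for n in range(1, 100)]

# Reading 1: inflation is caused by squaring the money supply
forward = all(inflation == money ** 2 for money, inflation in data)

# Reading 2: money supply is caused by square-rooting inflation
backward = all(money == math.isqrt(inflation) for money, inflation in data)

print(forward, backward)  # True True
```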

What this highlights is that you now actually need, back in the labels, some meaningful understanding of what "squaring the money supply" is and what "square rooting inflation" is. Because, to be causally coherent, these cannot just be vacuous utterances. This example is incredibly simple.

Just imagine what would happen if your chosen econometric methodology dictates the use of linear regression. You then have a philosophical need to explain x and y in terms of a lot of mathematical structuring around squares, roots, differences, and so on.

Which might boil down to me saying, "I do not think that the equals sign is a synonym for causality". But it might also be saying that "data adds nothing to causal explanation in economics".

Quite literally, you have shown two possible formulae for a simple relationship. Which suggests that there is, at best, a 1 in 2 chance (p = 0.5) of randomly selecting the "correct" relationship - where, here, "correct" requires that the relationship expresses something causal. This becomes worse when you realise that it is possible to express x^2 in an infinite variety of ways, driving p effectively to 0. This means that you are never really talking about causation.
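The "infinite variety" point can be made concrete as a sketch: a single parametrised family already supplies endlessly many syntactically distinct formulations of the same relationship, so the chance of randomly picking any one designated "causal" formulation shrinks towards zero.

```python
def formulation(n):
    """One member of an unbounded family of ways to write x^2."""
    return lambda x: x ** n / x ** (n - 2)

# Any finite sample of the family agrees on the data...
assert all(formulation(n)(2.0) == 4.0 for n in range(2, 100))

# ...and with k candidate formulations, the chance of randomly picking
# the one designated "causal" is 1/k, which tends to 0 as k grows.
chances = [1 / k for k in (2, 10, 1000)]
print(chances)  # [0.5, 0.1, 0.001]
```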

Which leaves you in the position that econometrics is a good source of rhetorical support for causation but only really provides evidence of correlation: that there is, indeed, a pattern in the data. That pattern in the data does not, in any way, vouchsafe your theoretical causal explanation with certainty. Even if you label it.

^1 E.g. in your x->x^2 example, if all you had were a list of Xs and Ys, you couldn't tell if the operation was y=x^2 or x=sqrt(y). Without any knowledge of what the Xs and Ys refer to, you're stuck.

#
**passingconcierge**
t1_iz22jo3 wrote

Reply to comment by **owlthatissuperb** in **Causal Explanations Considered Harmful: On the logical fallacy of causal projection** by **owlthatissuperb**

> If you have a starting hypothesis (e.g. an increase in the money supply will cause inflation), you can very much go back and look at historical data to find support for your hypothesis.

You can express "increase in money supply" and "inflation" as "just a bunch of variable labels". So the two scenarios you sketch are identical in every sense apart from the first having named variables and the second having anonymous variables. Which gives the appearance that you are attributing causality on the basis of some pre-existing theory about "money supply" and "inflation". Which runs the risk of creating a circular definition. In essence, you are ignoring the insights of Hume and the response of Kant regarding the insights of Hume.

I am happy to agree that if we have two columns of numbers

```
1 1
2 4
3 9
: :
99 9,801
```

we could agree that the *relationship* between the first column and the second is that the second is the square of the first. That establishes that there is a mathematical relationship but that mathematical relationship does not guarantee any kind of causality. Although, if you take the position of Tegmark - the Mathematical Universe Hypothesis - the existence of a mathematical relationship guarantees reality but not necessarily causality. Which leaves you in the same situation: data sets, labelled or not, do not reveal causality. For that you need a theory of knowledge that gives warrant to the knowledge that x=9 therefore y=81 is a causal relationship and simply labelling the numbers with "money supply equals nine therefore inflation equals eighty one" does not establish that.
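The part of that agreement a machine can deliver is the arithmetic, and nothing more. A sketch with the same hypothetical columns: the labelling step remains a free choice the numbers cannot arbitrate.

```python
# The two unlabeled columns from the table above
first = list(range(1, 100))
second = [n ** 2 for n in first]

# The mathematical relationship is machine-checkable:
relationship_holds = all(b == a ** 2 for a, b in zip(first, second))
print(relationship_holds)  # True

# But the columns are indifferent to how we label them; both dictionaries
# below are equally well-formed, and neither contains any causal content.
story_one = {"money supply": first, "inflation": second}
story_two = {"inflation": first, "money supply": second}
```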

Which largely points to there being no "causal knobs" inside data sets. There may be something about a data set that has some kind of "establishes causality" about it, but it is not simply doing mathematical manipulations or matching variable labelling. There is something rhetorical going on that you really are not making clear.

#
**passingconcierge**
t1_iyzm86f wrote

Reply to comment by **owlthatissuperb** in **Causal Explanations Considered Harmful: On the logical fallacy of causal projection** by **owlthatissuperb**

> The only way to infer causality is to reach into a system and modify it.

This seems, to me, to be an unfounded strong claim about inference that entails *causality* always being obliged to be empirical. Which, essentially, reduces econometrics, as it exists, to being entirely *correlative* knowledge *because it is composed entirely of historical data*.

What if there is no "cause knob" but, also, the set of data, C, at time 0 always results in the specific set of data, E, at time x>0, and a random set of data, Rn, at any time n where n<>x? There is nothing to modify, since modifying C changes the set and so there is no transition C->E. Which, essentially, means you have frustrated, prevented, blocked - *essentially interrupted* - the causal connection between C and E. This might not seem to be clearly expressed but it does actually require that causality is holistically considered: you have to take all the nodes and arcs of the graph into account.

You might say that this is simply a description of correlation and always was, and your claim might seem convincing. But how do you *exclude* causality? Even at a vanishingly small probability, the statement that "C causes E" is a legitimate claim to make, even if you must qualify it by saying *but only once in a billion*. You might say *one in a billion means it will never happen*. Which is not a great claim. The probability of winning the lottery is, say, one in a billion - *or tens of billions* - yet there has been more than one lottery winner since lotteries began. The point being that, just because something has a low probability of happening does not *forbid* it happening.

> "when A changes, the B tends to change too" doesn't get you there (even if e.g. there's a time delay).

So, the idea here is not proven by your claims. You can infer causality by passively looking at data. Econometrics does it all the time. The deeper problem being that we live in a Universe that is deeply causal. Which suggests that starting from an assumption that there is "no causality involved" is a flawed premise. A flawed premise that is easily rejected because the data was *created by a person not a random process* and, therefore, you need good reason to reject the notion that the data "has" causality locked into it.

The idea of causality as being purely mechanistic, which is what it seems you are supposing here, is not the only way you can reason about causality.

#
**passingconcierge**
t1_isrqmly wrote

Reply to **Spooky artificial intelligence found to accurately predict the future by 99%** by **Stephen_P_Smith**

All this article tells you is that it is possible to frame a question that you know the answer to and then to have a statistical system extract the answer you first thought of from a data set. That is more a caution about the problems of taking AI systems uncritically at face value and perhaps the need for double blinding in predictive systems.

#
**passingconcierge**
t1_j6b4d46 wrote

Reply to comment by **cavillchallenger** in **Seeking passage to use for Eulogy from Hitchhiker's Guide to the Galaxy.** by **cavillchallenger**

It is available at the Internet Archive