Category:Surprised by the Gambler’s and Hot Hand Fallacies

From Big Physics
Revision by Jinshanw, 23:42, Saturday 9 December 2017


Miller, Joshua Benjamin and Sanjurjo, Adam, Surprised by the Gambler's and Hot Hand Fallacies? A Truth in the Law of Small Numbers (November 15, 2016). IGIER Working Paper No. 552. Available at SSRN: https://ssrn.com/abstract=2627354 or http://dx.doi.org/10.2139/ssrn.2627354

Abstract

We prove that a subtle but substantial bias exists in a standard measure of the conditional dependence of present outcomes on streaks of past outcomes in sequential data. The magnitude of this novel form of selection bias generally decreases as the sequence gets longer, but increases in streak length, and remains substantial for a range of sequence lengths often used in empirical work. The bias has important implications for the literature that investigates incorrect beliefs in sequential decision making---most notably the Hot Hand Fallacy and the Gambler's Fallacy. Upon correcting for the bias, the conclusions of prominent studies in the hot hand fallacy literature are reversed. The bias also provides a novel structural explanation for how belief in the law of small numbers can persist in the face of experience.

Summary and comments

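This fascinating paper, which Li Keqiang recommended to me, discusses the following. Suppose we toss a coin [math]\displaystyle{ N }[/math] times and keep a record as follows: whenever a head (H) comes up, we write down the outcome of the next toss. Within this record we then compute the proportion of heads, [math]\displaystyle{ p^{H}_{1} }[/math], and check whether it is close to the coin's intrinsic probability [math]\displaystyle{ q^{H} }[/math].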

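The example can be generalized: start recording only after [math]\displaystyle{ k }[/math] consecutive heads have been observed, compute [math]\displaystyle{ p^{H}_{k} }[/math], and compare it with [math]\displaystyle{ q^{H} }[/math]. For simplicity we use [math]\displaystyle{ k=1 }[/math] here.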
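The recording rule can be sketched in a few lines of Python. This is only an illustration: the function names `record_after_streak` and `proportion_of_heads` are made up here, heads are encoded as 1 and tails as 0, and "after k consecutive heads" is read as "after a run of at least k heads".

```python
def record_after_streak(seq, k=1):
    """Return the outcomes that immediately follow a run of at least k heads.

    seq is a list of 0/1 values, with 1 standing for heads (H).
    """
    record = []
    streak = 0  # length of the run of heads ending just before the current toss
    for outcome in seq:
        if streak >= k:
            record.append(outcome)
        streak = streak + 1 if outcome == 1 else 0
    return record


def proportion_of_heads(record):
    """Proportion of heads in a record, or None when the record is empty."""
    if not record:
        return None
    return sum(record) / len(record)


seq = [1, 1, 0, 1, 1, 1, 0, 0]  # H H T H H H T T
print(record_after_streak(seq, k=1))  # [1, 0, 1, 1, 0]
print(record_after_streak(seq, k=2))  # [0, 1, 0]
print(proportion_of_heads(record_after_streak(seq, k=1)))  # 0.6
```

Returning `None` for an empty record makes the 0/0 case explicit, so a caller has to decide what to do with sequences that produce no record.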

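The background to this question is the hot-hand effect: does the success rate go up after a run of successful shots? Conversely, there is the gambler's fallacy: does the probability of heads go down after a run of heads? The former is of course more complicated in practice, because a run of good play may really change the atmosphere on court and thereby change the shooting percentage. Earlier theoretical and empirical work [1] indicated that in theory [math]\displaystyle{ p^{H}_{k}=q^{H} }[/math], and that actual basketball statistics indeed show no hot-hand effect.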

The present paper challenges that earlier work. It argues that in theory [math]\displaystyle{ p^{H}_{k}\neq q^{H} }[/math], and that actual basketball statistics do show a hot-hand effect.

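At first sight, if this result were correct, it would not only overturn the earlier paper but also shake the theory itself: [math]\displaystyle{ p^{H}_{k} }[/math] is just a conditional probability, so how could it differ from [math]\displaystyle{ q^{H} }[/math] for independent events such as coin tosses? It looks too astonishing, too significant, and too unlikely to be true.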

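After reading the paper [2] carefully, I found that the issue is really one of how the statistics are computed. The question "what is the proportion of heads in this record" is not well defined; it can be read in two ways. In the first reading, the statistic described above is computed within a single run of the experiment, where a run means one sequence of [math]\displaystyle{ N }[/math] outcomes, [math]\displaystyle{ x_{1}, x_{2}, \cdots, x_{N} }[/math]. In the second reading, it is computed over the collection of a great many runs, that is, over a large set of sequences [math]\displaystyle{ \left\{x_{1}, x_{2}, \cdots, x_{N}\right\} }[/math]. The two answers can differ. Taking [math]\displaystyle{ k=1 }[/math] as an example, the first reading amounts to computing the ratio [math]\displaystyle{ \frac{HH}{HH+HT} }[/math] on a single sequence. If these per-sequence ratios are then averaged over many sequences, the result is [math]\displaystyle{ \left\langle\frac{HH}{HH+HT}\right\rangle }[/math]. The second reading amounts to computing the ratio of ensemble averages directly, [math]\displaystyle{ \frac{\left\langle HH\right\rangle}{\left\langle HH+HT \right\rangle} }[/math].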

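In other words, the first definition is [math]\displaystyle{ p^{Sample}\left(H\right)_{1}=\frac{1}{\left|S^{*}\right|}\sum_{s\in S^{*}}\frac{\sum_{j} x^{s}_{j}x^{s}_{j+1}=HH}{\sum_{j} x^{s}_{j}x^{s}_{j+1}=HH,HT} }[/math], where each sum over [math]\displaystyle{ j }[/math] counts the pairs of neighbouring tosses that match the listed patterns, and [math]\displaystyle{ S^{*} }[/math] is the set of samples left after discarding those that produce no record and would therefore make the denominator 0.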

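The usual conditional probability, [math]\displaystyle{ q^{H}=P\left(x_{j+1}=H|x_{j}=H\right)=\frac{\sum_{s} x^{s}_{j}x^{s}_{j+1}=HH}{\sum_{s}x^{s}_{j}x^{s}_{j+1}=HH,HT} }[/math], is computed at a fixed [math]\displaystyle{ j }[/math] by summing over a large sample of sequences, that is, in the sense of the second reading but with [math]\displaystyle{ j }[/math] held fixed. It is therefore no surprise that the first calculation gives a theoretically different result. Strictly speaking, even the second calculation is not identical to this usual conditional probability.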
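The fixed-position conditional probability can be checked by brute force. The sketch below assumes a fair coin ([math]\displaystyle{ q^{H}=1/2 }[/math]), so that all [math]\displaystyle{ 2^{N} }[/math] sequences are equally likely and the ensemble sum becomes an exhaustive enumeration. The function name `fixed_position_conditional` and the 0-based index `j` are choices made for this illustration.

```python
from fractions import Fraction
from itertools import product


def fixed_position_conditional(n, j):
    """P(x_{j+1}=H | x_j=H) over all 2**n equally likely sequences (0-based j)."""
    hh = 0      # sequences with H at position j and H at position j+1
    h_any = 0   # sequences with H at position j, followed by either outcome
    for seq in product((0, 1), repeat=n):
        if seq[j] == 1:
            h_any += 1
            hh += seq[j + 1]
    return Fraction(hh, h_any)


for j in range(2):
    print(j, fixed_position_conditional(3, j))  # 1/2 for every j
```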

In the second calculation we are interested in [math]\displaystyle{ p^{Ensemble}\left(H\right)_{1}=\frac{\sum_{s,j} x^{s}_{j}x^{s}_{j+1}=HH}{\sum_{s,j} x^{s}_{j}x^{s}_{j+1}=HH,HT} }[/math]. In this case, however, the ratio equals [math]\displaystyle{ q^{H} }[/math] for each individual [math]\displaystyle{ j }[/math], so summing numerator and denominator over [math]\displaystyle{ j }[/math] still gives [math]\displaystyle{ q^{H} }[/math].
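The difference between the two definitions can be made concrete by exact enumeration. The following is a minimal sketch, assuming a fair coin so that every sequence of length [math]\displaystyle{ N }[/math] has the same weight. The helper names `count_hh_ht` and `sample_and_ensemble` are invented for this illustration.

```python
from fractions import Fraction
from itertools import product


def count_hh_ht(seq):
    """Count HH and HT pairs of neighbouring tosses in one sequence."""
    hh = sum(1 for a, b in zip(seq, seq[1:]) if a == 1 and b == 1)
    ht = sum(1 for a, b in zip(seq, seq[1:]) if a == 1 and b == 0)
    return hh, ht


def sample_and_ensemble(n):
    """Exact p_sample and p_ensemble for a fair coin and sequences of length n."""
    ratios = []    # one HH/(HH+HT) per sequence that belongs to S*
    total_hh = 0   # HH pairs pooled over all sequences
    total_ht = 0   # HT pairs pooled over all sequences
    for seq in product((0, 1), repeat=n):
        hh, ht = count_hh_ht(seq)
        total_hh += hh
        total_ht += ht
        if hh + ht > 0:  # the sequence produces at least one record
            ratios.append(Fraction(hh, hh + ht))
    p_sample = sum(ratios) / len(ratios)
    p_ensemble = Fraction(total_hh, total_hh + total_ht)
    return p_sample, p_ensemble


p_sample, p_ensemble = sample_and_ensemble(3)
print(p_sample, p_ensemble)  # 5/12 1/2
```

For [math]\displaystyle{ N=3 }[/math] the six sequences in [math]\displaystyle{ S^{*} }[/math] are THT, THH, HTT, HTH, HHT and HHH, with ratios 0, 1, 0, 0, 1/2 and 1, whose mean is 5/12. The pooled counts are 4 HH pairs against 4 HT pairs, which gives 1/2.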

Now that it is clear when theoretically different values arise and what they mean, how are the statistics actually computed in practice?

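To some extent, if the statistic is computed separately for each game and the overall average is then taken over these per-game results, the procedure is closer to case one, where theory gives a value different from [math]\displaystyle{ q^{H} }[/math].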

If all outcomes are pooled before the statistic is computed, the procedure is closer to case two, where theory gives [math]\displaystyle{ q^{H} }[/math].

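In other words, the result comes purely from the difference in how the statistics are computed. If by the hot-hand effect people mean an average over individual games, the calculation of [2] should be used; if they mean an overall impression formed from many different games pooled together, the calculation of [1] should be used.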

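A deeper reason: statistics is always concerned with ensemble averages, not with averages within a sample. If a single toss counts as one sample, the sample space has to be generated by repeating identical tosses a great many times. If N tosses made in some order or manner count as one sample, we still have to repeat such runs of N tosses a great many times to generate the sample space. Hence the limit of N going to infinity carries no statistical meaning; only letting the number of systems in the ensemble, S, go to infinity is a statistical limit. In fact, from the point of view of probability theory, [math]\displaystyle{ p^{Sample}\left(H\right)_{1} }[/math] has no meaning. This shows how important a correct understanding of the concepts is. How people actually form their impression of the hot-hand effect in practice is, of course, another matter.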
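To separate the roles of N and S, one can take the ensemble limit exactly and look at what is left. The sketch below assumes a fair coin and replaces the ensemble of S sequences by an exhaustive enumeration of all [math]\displaystyle{ 2^{N} }[/math] sequences, which is what S going to infinity converges to. The function name `exact_p_sample` and the particular values of N are illustrative choices.

```python
from fractions import Fraction
from itertools import product


def exact_p_sample(n):
    """Exact sample-averaged proportion of H after H for a fair coin, length n."""
    total = Fraction(0)
    kept = 0  # number of sequences in S*, i.e. with at least one record
    for seq in product((0, 1), repeat=n):
        # Outcomes that directly follow a head in this sequence.
        records = [b for a, b in zip(seq, seq[1:]) if a == 1]
        if records:
            total += Fraction(sum(records), len(records))
            kept += 1
    return total / kept


for n in (3, 4, 5, 8, 12):
    value = exact_p_sample(n)
    print(n, value, float(value))
```

The first three values are 5/12, 17/42 and 49/120. Each value is already the full ensemble average at that N, so a gap from 1/2 cannot be removed by collecting more sequences of the same length.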

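Back to the significance of this work: the paper claims that this probability explains why intuitions such as the gambler's fallacy are reasonable. That is completely wrong. In the setting of the gambler's fallacy, the corresponding calculation is the ensemble average at fixed j, i.e. the standard mathematical conditional probability, so the answer should be exactly the theoretically correct one and not the conditional probability they define. Their conditional probability arises only when the statistics are computed their way: first compute the ratio within each sequence, then take the ensemble average of these ratios.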

Short summary in English

  1. Consider a coin whose probability of landing heads (H) is fixed at [math]\displaystyle{ q^{H} }[/math].
  2. The usual mathematical conditional probability is defined as [math]\displaystyle{ P\left(x_{j+1}=H|x_{j}=H\right)=\frac{\sum_{s} x^{s}_{j}x^{s}_{j+1}=HH}{\sum_{s}x^{s}_{j}x^{s}_{j+1}=HH,HT} }[/math], where [math]\displaystyle{ j }[/math] is fixed, and we have [math]\displaystyle{ P\left(x_{j+1}=H|x_{j}=H\right)=q^{H} }[/math]. The key point is that the sum [math]\displaystyle{ \sum_{s} }[/math] runs over the ensemble of sequences, not within each sequence.
  3. Another definition is [math]\displaystyle{ p^{Ensemble}\left(H\right)_{1}=\frac{\sum_{s,j} x^{s}_{j}x^{s}_{j+1}=HH}{\sum_{s,j} x^{s}_{j}x^{s}_{j+1}=HH,HT} }[/math]. It can be proved that this definition gives the same value as the one above.
  4. [2] defines [math]\displaystyle{ p^{Sample}\left(H\right)_{1}=\frac{1}{\left|S^{*}\right|}\sum_{s\in S^{*}}\frac{\sum_{j} x^{s}_{j}x^{s}_{j+1}=HH}{\sum_{j} x^{s}_{j}x^{s}_{j+1}=HH,HT} }[/math]: the numbers of HH and HT are first counted on each single sequence, and the resulting ratios are then averaged over the whole ensemble of sequences. Here [math]\displaystyle{ S^{*} }[/math] is the set of sequences with [math]\displaystyle{ \sum_{j} x^{s}_{j}x^{s}_{j+1}=HH,HT \gt 0 }[/math], which avoids [math]\displaystyle{ \frac{0}{0} }[/math].
  5. It is not clear to me which definition is used in [1].
  6. Which one should be used in practice when people talk about the hot-hand effect? If each game is averaged first and these per-game averages are then averaged over a set of games, [math]\displaystyle{ p^{Sample}\left(H\right)_{1} }[/math] should be used. If instead the records from all games are pooled first, [math]\displaystyle{ p^{Ensemble}\left(H\right)_{1} }[/math] should be used.

Next steps

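Implement both calculations on the data of the original paper, or on a larger NBA data set, and compare the outcome with the results of the two papers. That would settle the question completely.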

The above explanation conceptually distinguishes the usual mathematical [math]\displaystyle{ P\left(x_{j+1}=H|x_{j}=H\right) }[/math], [math]\displaystyle{ p^{Ensemble}\left(H\right)_{1} }[/math] and [math]\displaystyle{ p^{Sample}\left(H\right)_{1} }[/math]. It is clear and satisfying to me already. However, in order to indeed have a complete picture and provide an end-of-story answer to the original question about the hot-hand effect, one should go and collect the same data or a large data set and apply both [math]\displaystyle{ p^{Ensemble}\left(H\right)_{1} }[/math] and [math]\displaystyle{ p^{Sample}\left(H\right)_{1} }[/math], and furthermore compare the results against those in [2] and [1].
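As a starting point for such a comparison, both statistics could be computed from per-game shot records roughly as follows. This is a sketch under assumptions: each game is taken to be a list of 0/1 values (1 = hit, 0 = miss), only the [math]\displaystyle{ k=1 }[/math] case is handled, the function names are invented here, and the `games` list is made-up data for illustration, not taken from [1] or [2].

```python
def hits_after_hit(shots):
    """Return (HH, HT): counts of hits and misses that directly follow a hit."""
    hh = sum(1 for a, b in zip(shots, shots[1:]) if a == 1 and b == 1)
    ht = sum(1 for a, b in zip(shots, shots[1:]) if a == 1 and b == 0)
    return hh, ht


def p_sample(games):
    """Average of the per-game ratios HH/(HH+HT), skipping games with no record."""
    ratios = []
    for shots in games:
        hh, ht = hits_after_hit(shots)
        if hh + ht > 0:
            ratios.append(hh / (hh + ht))
    return sum(ratios) / len(ratios)


def p_ensemble(games):
    """Pooled ratio HH/(HH+HT), with counts summed over all games first."""
    counts = [hits_after_hit(shots) for shots in games]
    total_hh = sum(hh for hh, _ in counts)
    total_ht = sum(ht for _, ht in counts)
    return total_hh / (total_hh + total_ht)


# Made-up shot records, for illustration only.
games = [
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(p_sample(games), p_ensemble(games))
```

On this toy input the per-game ratios are 1/2, 0 and 1 (the last game produces no record), so `p_sample` is 0.5, while the pooled counts are 4 hits against 2 misses after a hit, so `p_ensemble` is 2/3.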

References

  1. Gilovich, T., R. Vallone, and A. Tversky (1985): “The Hot Hand in Basketball: On the Misperception of Random Sequences,” Cognitive Psychology, 17, 295–314.
  2. Miller, Joshua Benjamin and Sanjurjo, Adam, Surprised by the Gambler's and Hot Hand Fallacies? A Truth in the Law of Small Numbers (November 15, 2016). IGIER Working Paper No. 552. Available at SSRN: https://ssrn.com/abstract=2627354 or http://dx.doi.org/10.2139/ssrn.2627354

Appendix: program

conditionalP.py
# http://www.bigphysics.org/index.php/%E5%88%86%E7%B1%BB:Surprised_by_the_Gambler%E2%80%99s_and_Hot_Hand_Fallacies
import random

L = 1000000  # number of sequences (trials) in the ensemble
N = 3  # length of each sequence
records = 0  # number of sequences that produce at least one record (TTT, for example, produces none)
sample = 0.0  # running sum of the per-sequence proportions
Pensemble = 0  # number of 1s recorded over all sequences
Qensemble = 0  # number of 0s recorded over all sequences
for trial in range(L):
    Psample = 0  # number of 1s recorded in this sequence
    Qsample = 0  # number of 0s recorded in this sequence
    flag = 0  # 1 if the previous toss in this sequence was 1; reset here so sequences stay independent
    for i in range(N):
        r = random.randint(0, 1)  # can be replaced with a better, specialized random number generator
        if flag == 1:
            if r == 1:
                Psample += 1
                Pensemble += 1  # accumulated over the whole ensemble
            else:
                Qsample += 1
                Qensemble += 1  # accumulated over the whole ensemble
        flag = r
    if Psample > 0 or Qsample > 0:  # skip sequences without any record to avoid 0/0
        records += 1
        sample += Psample / (Psample + Qsample)
ensemble = Pensemble / (Pensemble + Qensemble)

print("Psample=", sample / records, "Pensemble=", ensemble)

Output:

Psample= 0.40503107107842035 Pensemble= 0.5001166990311979

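The second value is close to 1/2. The first value comes from a run in which flag was not reset at the start of each sequence, so the last toss of one sequence acted as the preceding toss for the next. The program then effectively samples sequences of length 4, for which the exact value is 17/42 ≈ 0.4048. For independent sequences of length N=3 the exact value is 5/12 ≈ 0.4167.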
