Statistics: Success Stories and Cautionary Tales

LESSON 1

1.1 WHAT IS STATISTICS?

Statistics is a collection of procedures and principles for gathering data and analyzing information in order to help people make decisions when faced with uncertainty.

"The odds of finding two identical fingerprints were 1 in 64 billion."
- Francis Galton

The probability that two randomly chosen individuals have the same DNA profile is 3 × 10^-11; when two probes are used for the comparison at the same time, the probability that two individuals match completely is less than 5 × 10^-19.

Forensic ballistics: every gun barrel has unique characteristics, and these characteristics mark every bullet it fires.

Airlines save money by sampling.

Beat the Dealer (the earliest English-language original edition)

1.2 SEVEN STATISTICAL STORIES WITH MORALS

"There are three kinds of lies: lies, damned lies, and statistics."
- Benjamin Disraeli (British Prime Minister, 1804-1881)

CASE STUDY 1
Who Are Those Speedy Drivers?

"What's the fastest you have ever driven a car?"
- Penn State University, 1994

Males (87 students), fastest speed (mph):
110 109 90 140 105 150 120 110 110 90 115 95 145 140 110 105 85 95 100 115
124 95 100 125 140 85 120 115 105 125 102 85 120 110 120 115 94 125 80 85
140 120 92 130 125 110 90 110 110 95 95 110 105 80 100 110 130 105 105 120
90 100 105 100 120 100 100 80 100 120 105 60 125 120 100 115 95 110 101 80
112 120 110 115 125 55 90

Females (102 students), fastest speed (mph):
80 75 83 80 100 100 90 75 95 85 90 90 90 120 85 100 120 75 85 80
70 85 110 85 75 105 95 75 70 90 70 82 85 100 90 95 90 110 80 80
110 110 95 75 130 95 110 110 80 90 105 90 110 75 100 90 110 85 90 80
80 85 50 80 100 80 80 80 95 100 90 100 95 80 80 50 88 90 90 85
70 90 30 85 85 87 85 90 85 75 90 102 80 100 95 110 80 95 90 80
90

[Figure: Dotplots of responses to "What's the fastest you've ever driven?", one panel for males and one for females; horizontal axis: fastest speed (mph)]

Five-number summary of responses to "What's the fastest you've ever driven?"

              Males (87 students)    Females (102 students)
Median               110                      89
Quartiles          95    120               80    95
Extremes           55    150               30    130

A river with an average depth of 0.4 m is by no means safer than a swimming pool with an average depth of 0.6 m.

Definition: The median is the value in the middle when the numbers are put in order. The lower quartile and upper quartile are (roughly) the medians of the lower and upper halves of the data.

Moral of the story
Simple summaries of data can tell an interesting story and are easier to digest than long lists.

CASE STUDY 2
Disaster in the Skies?

"Planes get closer in midair as traffic control errors rise. Errors by air traffic controllers climbed from 746 in fiscal 1997 to 878 in fiscal 1998, an 18% increase."
- USA TODAY, Levin, 1999

"The errors per million flights handled by controllers climbed from 4.8 to 5.5."

5.5 / 4.8 = 114.6%, so the error rate rose by about 14.6%, less than the 18% rise in the raw count.

Definition: The rate is simply the number of times something occurs per number of opportunities for it to occur. The baseline rate is the rate at a beginning time period or under specific conditions.

Moral of the story
When discussing the change in the rate or risk of occurrence of something, make sure you also include the base rate or baseline risk.

CASE STUDY 3
Did Anyone Ask You Whom You've Been Dating?

"According to a new USA Today/Gallup Poll of teenagers across the country, 57 percent of teens who go out on dates say they've been out with someone of another race or ethnic group."
- USA TODAY, Peterson, 1997

"In most cases, parents aren't a major obstacle. Sixty-four percent of teens say their parents don't mind that they date interracially, or wouldn't mind if they did."
- Sacramento Bee, Hiram, 1997

Question 1
How could the polltakers manage to ask so many teenagers these questions?

Question 2
Could such a small sample possibly tell us anything about the millions of teenagers in the United States?
Yes, if those teens constituted a random sample from the population of interest.

Question 3
How accurate could this sample possibly be?
The results of this poll are accurate to within a margin of error of about 4.5% (95% confidence interval).

Moral of the story
A representative sample of only a few thousand, or perhaps even a few hundred, can give reasonably accurate information about a population of many millions.

CASE STUDY 4
Who Are Those Angry Women?

"A well-conducted survey can be very informative, but a poorly conducted one can be a complete disaster."
- Statistics: Concepts and Controversies, David S. Moore

"The women who responded were fed up with men and eager to fight them. For example, 91% of those who were divorced said they had initiated the divorce. The anger of women toward men became the theme of the book."
- Women and Love, Shere Hite

Shere Hite sent questionnaires to 100,000 women asking about love, sex, and relationships.

The Hite sample exemplifies one of the most common problems with surveys: the sample data may not represent the population.

Extensive nonresponse from a random sample, or the use of a self-selected (i.e., all-volunteer) sample, will probably produce biased results.

Moral of the story

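The definition of the median and quartiles in Case Study 1 can be tried out with a short Python sketch. It is only an illustration: the function name `five_number_summary` is hypothetical, the handling of an odd number of values (dropping the middle value from both halves) is one convention among several, and the input is just the first 18 male responses from the Case Study 1 list, not the full sample.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Quartiles follow the definition in Case Study 1: the medians of the
    lower and upper halves of the ordered data. When the count is odd, the
    middle value is left out of both halves (an assumption; conventions
    differ between textbooks and software).
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half + n % 2:]  # skip the middle value when n is odd
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


# First 18 male responses from Case Study 1 (an illustrative subset only).
first_male_responses = [110, 109, 90, 140, 105, 150, 120, 110, 110,
                        90, 115, 95, 145, 140, 110, 105, 85, 95]

print(five_number_summary(first_male_responses))
```

For this subset the median is 110 and the quartiles are 95 and 120, with extremes of 85 and 150. Statistical software may report slightly different quartiles because it often interpolates between values.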

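The arithmetic behind Case Study 2 can be checked with a few lines of Python. This is a minimal sketch: `percent_change` is a hypothetical helper name, and the only inputs are the figures quoted from USA TODAY in the case study.

```python
def percent_change(old, new):
    """Relative change from old to new, expressed as a percentage."""
    if old == 0:
        raise ValueError("percent change from a zero baseline is undefined")
    return (new - old) / old * 100


# Figures quoted in Case Study 2 (USA TODAY, 1999).
errors_1997, errors_1998 = 746, 878   # raw counts of controller errors
rate_1997, rate_1998 = 4.8, 5.5       # errors per million flights

count_increase = percent_change(errors_1997, errors_1998)
rate_increase = percent_change(rate_1997, rate_1998)

print(f"Error count rose by {count_increase:.1f}%")
print(f"Error rate rose by {rate_increase:.1f}%")
```

The count rises by about 17.7% (reported as 18%), while the rate rises by about 14.6%. Both percentages describe a change from a baseline of fewer than 5 errors per million flights, which is why the moral asks for the base rate alongside the change.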

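The margin of error quoted in Case Study 3 can be related to sample size with the usual normal-approximation formula for a proportion. The sketch below assumes a simple random sample, uses the conservative value p = 0.5, and takes z = 1.96 for 95% confidence. The function names are hypothetical, and the poll's actual sample size and method are not given in the case study.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    p=0.5 gives the most conservative (largest) margin; z=1.96 corresponds
    to 95% confidence under the normal approximation.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1 - p) / n)


def sample_size_for_margin(margin, p=0.5, z=1.96):
    """Smallest sample size whose margin of error does not exceed `margin`."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    return math.ceil((z / margin) ** 2 * p * (1 - p))


# Sample size consistent with the 4.5% margin quoted in Case Study 3.
n_needed = sample_size_for_margin(0.045)
print(n_needed, round(margin_of_error(n_needed), 4))
```

Under these assumptions, a sample of roughly 475 respondents is enough for a margin of about 4.5%. The population size does not appear in the formula, which is consistent with the moral that a few hundred well-chosen respondents can describe a population of millions.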