Data Analysis (SAS Descriptive Statistics Procedures).ppt

Uploaded by: max****ui   Document ID: 11543564   Upload date: 2020-04-28   Format: PPT   Pages: 33   Size: 799 KB
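Data Analysis: SAS Software
Descriptive Statistical Analysis Procedures
School of Information, Zhang Jianxin, 2010.3-6

SAS procedures for descriptive statistics and plotting

    proc means
    proc univariate
    proc corr
    proc plot / proc gplot
    proc capability


proc means (1): statement format

The main statements that control the MEANS procedure are:

    proc means data=input-data-set;
        var    variable-list;
        class  variable-list;
        by     variable-list;
        freq   variable;
        weight variable;
        id     variable-list;
        output out=output-data-set statistic-keyword=variable-names;
    run;


proc means (2): what each statement does

var     Names the numeric variables for which the simple descriptive statistics are computed, and the order in which they are reported.
by      Computes the statistics separately for each group defined by the by variables. The input data set must already be sorted by the by variables.
class   Like by, defines groups of observations and computes the statistics for each group. The output layout differs from that of by, and the data do not need to be sorted by the class variables beforehand.
freq    Names a numeric variable whose value gives the number of times the corresponding observation occurs in the input data set.
weight  Names a variable whose value is the weight of the corresponding observation.
id      Adds one or more variables to the output data set so that its observations can be identified. The value written is the largest value the id variable takes in the group of input observations that produced the output observation.


proc means (3): statistic keywords available on the proc means statement

    Keyword    Meaning
    n          number of nonmissing observations
    nmiss      number of missing observations
    mean       arithmetic mean
    stderr     standard error of the mean
    sum        weighted sum
    std        standard deviation
    var        variance
    cv         coefficient of variation, as a percentage
    uss        weighted (uncorrected) sum of squares
    css        weighted sum of squared deviations from the mean
    skewness   skewness, a measure of asymmetry
    kurtosis   kurtosis, a measure of how heavy or flat the tails are
    mode       mode, the most frequent value
    sumwgt     sum of the weights
    max        maximum
    min        minimum
    range      range, max - min
    median     median (middle value)
    t          t statistic for the hypothesis that the population mean is 0
    prt        two-tailed p-value of that t statistic
    clm        upper and lower confidence limits
    lclm       lower confidence limit
    uclm       upper confidence limit


proc means (4): options on the output statement

out=output-data-set
    Names the output data set.
statistic-keyword=variable-names
    Selects the statistics to write to the output data set and gives the names they take there.

The MEANS procedure places no limit on the number of output statements, so several output statements can be used to create several data sets with different contents.


proc means (5): SAS program

    data examp1;
        input x @@;   /* @@ holds the line so that several values are read per line */
    cards;
    70.4 72.0 76.5 74.3 76.5 77.6 67.3 72.0 75.0 74.3
    73.5 79.5 73.5 74.7 65.0 76.5 81.6 75.4 72.7 72.7
    67.2 76.5 72.7 70.4 77.2 68.8 67.3 67.3 67.3 72.7
    75.8 73.5 75.0 72.7 73.5 73.5 72.7 81.6 70.3 74.3
    73.5 79.5 70.4 76.5 72.7 77.2 84.3 75.0 76.5 70.4
    ;
    proc means data=examp1 n mean cv skewness kurtosis range median;
        var x;
    run;

Output:

    The MEANS Procedure
    Analysis Variable: x

    N     Mean          Coeff of Variation   Skewness     Kurtosis     Range         Median
    50    73.7460000    5.4083794            0.1540111    0.3581179    19.3000000    73.5000000


proc univariate (1): univariate analysis

Two approaches are commonly used to analyze experimental data on a single measured variable:

    Graphical methods: stem-and-leaf plot, box plot, and normal probability plot.
    Descriptive statistics: moments, quantiles, extreme values, and the frequency table.


proc univariate (2): statement format

The main statements that control the UNIVARIATE procedure are:

    proc univariate data=input-data-set;
        var    variable-list;
        by     variable-list;
        freq   variable;
        weight variable;
        id     variable-list;
        output output-options;
    run;


Textbook Example 1.1, examp1_1 (SAS program)

    data examp1_1;
        input x @@;
    cards;
    74.3 78.8 68.8 78.0 70.4 80.5 80.5 69.7 71.2 73.5
    79.5 75.6 75.0 78.8 72.0 72.0 72.0 74.3 71.2 72.0
    75.0 73.5 78.8 74.3 75.8 65.0 74.3 71.2 69.7 68.0
    73.5 75.0 72.0 64.3 75.8 80.3 69.7 74.3 73.5 73.5
    75.8 75.8 68.8 76.5 70.4 71.2 81.2 75.0 70.4 68.0
    70.4 72.0 76.5 74.3 76.5 77.6 67.3 72.0 75.0 74.3
    73.5 79.5 73.5 74.7 65.0 76.5 81.6 75.4 72.7 72.7
    67.2 76.5 72.7 70.4 77.2 68.8 67.3 67.3 67.3 72.7
    75.8 73.5 75.0 72.7 73.5 73.5 72.7 81.6 70.3 74.3
    73.5 79.5 70.4 76.5 72.7 77.2 84.3 75.0 76.5 70.4
    ;
    proc univariate data=examp1_1;
        var x;
    run;


Textbook Example 1.1, examp1_1 (SAS result 1)

    The UNIVARIATE Procedure
    Variable: x

    Moments
    N                 100           Sum Weights        100
    Mean              73.66         Sum Observations   7366
    Std Deviation     3.94008153    Variance           15.5242424
    Skewness          0.06007521    Kurtosis           0.03386864
    Uncorrected SS    544116.46     Corrected SS       1536.9
    Coeff Variation   5.34901103    Std Error Mean     0.39400815

    Basic Statistical Measures
    Location               Variability
    Mean     73.66000      Std Deviation          3.94008
    Median   73.50000      Variance               15.52424
    Mode     73.50000      Range                  20.00000
                           Interquartile Range    4.60000


Textbook Example 1.1, examp1_1 (SAS result 2)

    The UNIVARIATE Procedure
    Quantiles (Definition 5)

    Quantile      Estimate
    100% Max      84.30
    99%           82.95
    95%           80.50
    90%           79.15
    75% Q3        75.80
    50% Median    73.50
    25% Q1        71.20
    10%           68.40
    5%            67.30
    1%            64.65
    0% Min        64.30


proc capability (capability analysis procedure)

PROC CAPABILITY is designed for process capability analysis, including:

    Histograms and comparative histograms.
    Cumulative distribution function plots (cdf plots).
    Quantile-quantile plots (Q-Q plots), probability plots, and probability-probability plots (P-P plots). These plots facilitate the comparison of a data distribution with various theoretical distributions.
    Goodness-of-fit tests for a variety of distributions including the normal.
    Statistical intervals (prediction, tolerance, and confidence intervals) for a normal population.


Textbook Example 1.2, examp1_4 (SAS program)

The data step is the same as for examp1_1 (the same 100 observations), with the data set named examp1_4. The analysis step is:

    proc capability data=examp1_4;
        histogram x / normal(mu=est sigma=est);
        cdfplot   x / normal(mu=est sigma=est);
        qqplot    x / normal(mu=est sigma=est);
    run;


Textbook Example 1.2, examp1_4 (SAS result)

    The CAPABILITY Procedure
    Fitted Normal Distribution for x

    Parameters for Normal Distribution
    Parameter    Symbol    Estimate
    Mean         Mu        73.66
    Std Dev      Sigma     3.940082

    Quantiles for Normal Distribution
    Percent    Observed    Estimated
    1.0        64.6500     64.4940
    5.0        67.3000     67.1791
    10.0       68.4000     68.6106
    25.0       71.2000     71.0025
    50.0       73.5000     73.6600
    75.0       75.8000     76.3175
    90.0       79.1500     78.7094
    95.0       80.5000     80.1409
    99.0       82.9500     82.8260

[Figure: Textbook Example 1.2, examp1_4, SAS histogram]
[Figure: Textbook Example 1.2, examp1_4, SAS distribution function plot]
[Figure: Textbook Example 1.2, examp1_4, SAS Q-Q plot]


Textbook Example 1.2, examp1_6 (SAS program)

The data step is again the same as for examp1_1, with the data set named examp1_6. The plot option requests the stem-and-leaf plot, box plot, and normal probability plot:

    proc univariate data=examp1_6 plot;
        var x;
    run;


Textbook Example 1.2, examp1_6 (SAS result)

The stem is the integer part of the value and each leaf is its first decimal digit.

    Stem  Leaf                #
    84    3                   1
    83
    82
    81    266                 3
    80    355                 3
    79    555                 3
    78    0888                4
    77    226                 3
    76    5555555             7
    75    00000004688888     14
    74    333333337           9
    73    55555555555        11
    72    00000007777777     14
    71    2222                4
    70    34444444            8
    69    777                 3
    68    00888               5
    67    23333               5
    66
    65    00                  2
    64    3                   1

In the Boxplot column, the largest and smallest values (84.3 and 64.3) are marked 0 as outliers, the box edges (+-----+) fall on stems 75 and 71 (Q3 = 75.8, Q1 = 71.2), and the median line with the mean marker (*--+--*) falls on stem 73.


Textbook Example 1.2, examp1_8 (SAS program)

    data examp1_8;
        input x @@;
    cards;
    25 45 50 54 55 61 64 68 72 75 75 78 79 81 83 84
    84 84 85 86 86 86 87 89 89 89 90 91 91 92 100
    ;
    /* Normality tests on the full data set */
    proc univariate data=examp1_8 normal;
    run;

    /* Histogram with a fitted Weibull curve, full data set */
    proc capability data=examp1_8 graphics noprint;
        histogram x / weibull vscale=proportion;
    run;

    /* Drop the smallest observation, 25 */
    data delmin;
        set examp1_8;
        if x = 25 then delete;
    run;

    /* Weibull fit again, without the value 25 */
    proc capability data=delmin graphics noprint;
        histogram x / weibull vscale=proportion;
        cdfplot   x / weibull;
    run;


Textbook Example 1.2, examp1_8 (SAS result 1)

    Tests for Normality
    Test            Statistic       p Value
    Shapiro-Wilk    W  0.863287     [not recoverable]
    [remaining rows of this table not recoverable]

    Weibull goodness-of-fit tests (W-Sq, A-Sq, Chi-Sq): only the p-values 0.013 and 0.073 survive; the rows they belong to are not recoverable.

Conclusion: for the data set with the value 25 deleted, the hypothesis of a Weibull distribution is accepted.

[Figure: Textbook Example 1.2, examp1_8, SAS result 2]


proc corr (1)

proc corr (the correlation analysis procedure) computes correlation coefficients between variables, including the Pearson product-moment correlation and the weighted product-moment correlation. It can also produce three nonparametric measures of association: Spearman's rank correlation, Kendall's tau-b, and Hoeffding's measure of dependence, D.

The proc corr statement invokes the CORR procedure and is the only required statement. If only the proc corr statement is used, the procedure computes the correlations between all numeric variables in the input data set. The remaining statements are optional.


proc corr (2)

The CORR procedure is generally controlled by the following statements:

    proc corr data=data-set;
        var     variable-list;
        with    variable-list;
        partial variable-list;
        weight  variable;
        freq    variable;
        by      variable-list;
    run;


Textbook Example 1.3, examp1_9 (SAS program)

    data examp1_9;
        input x y @@;
    cards;
    68 971   63 892   70 1125   6 82    65 931
    9 112    10 162   12 321    20 315  30 375
    33 462   27 352   21 305    5 84    14 229
    27 332   17 185   53 703    62 872  65 740
    ;
    run;
    proc corr data=examp1_9 pearson spearman cov;
    run;


Textbook Example 1.3, examp1_9 (SAS result 1)

    The CORR Procedure
    2 Variables: x y

    Covariance Matrix, DF = 19
            x             y
    x       570.4500      7845.0789
    y       7845.0789     112404.2632

    Simple Statistics
    Variable    N     Mean         Std Dev      Median       Minimum     Maximum
    x           20    33.85000     23.88410     27.00000     5.00000     70.00000
    y           20    477.50000    335.26745    342.00000    82.00000    1125


Textbook Example 1.3, examp1_9 (SAS result 2)

    The CORR Procedure

    Pearson Correlation Coefficients, N = 20
    Prob > |r| under H0: Rho=0
            x           y
    x       1.00000     0.97971
                        <.0001
    y       0.97971     1.00000
            <.0001

    Spearman Correlation Coefficients, N = 20
    Prob > |r| under H0: Rho=0
            x           y
    x       1.00000     0.97366
                        <.0001
    y       0.97366     1.00000
            <.0001


Correlation tables for six variables, x1 to x6

Each cell shows the correlation coefficient with its p-value (Prob > |r| under H0: Rho=0) underneath. The slide text does not preserve the program, the data, or the coefficient type for these two tables.

First table (only the x1 row is recoverable):

            x1          x2          x3          x4          x5          x6
    x1      1.00000     0.87024     -0.36576    -0.38969    -0.49308    -0.22630
                        <.0001      0.1128      0.0894      0.0272      0.3374

Second table:

            x1          x2          x3          x4          x5          x6
    x1      1.00000     0.81423     -0.37070    -0.38020    -0.57774    -0.19902
                        <.0001      0.1076      0.0982      0.0076      0.4002
    x2      0.81423     1.00000     -0.23770    -0.54190    -0.72473    -0.19940
            <.0001                  0.3129      0.0136      0.0003      0.3993
    x3      -0.37070    -0.23770    1.00000     0.13662     0.17924     0.09841
            0.1076      0.3129                  0.5657      0.4496      0.6798
    x4      -0.38020    -0.54190    0.13662     1.00000     0.65620     0.32263
            0.0982      0.0136      0.5657                  0.0017      0.1653
    x5      -0.57774    -0.72473    0.17924     0.65620     1.00000     0.69521
            0.0076      0.0003      0.4496      0.0017                  0.0007
    x6      -0.19902    -0.19940    0.09841     0.32263     0.69521     1.00000
            0.4002      0.3993      0.6798      0.1653      0.0007

Question: how are the variables correlated?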
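
The slides list the var, with, and partial statements of proc corr but only ever run the bare proc corr statement, and they show no program for the x1 to x6 tables. The following is a minimal sketch of how those optional statements might be used to explore that question. The data set name work.sixvars is hypothetical (the original data are not in the slides), and the choice of which variables to pair or partial out is only an illustration.

```sas
/* Hypothetical data set: work.sixvars is a placeholder for a data set  */
/* holding the numeric variables x1-x6; it is not given in the slides.  */

/* Full 6 x 6 matrix of Pearson and Spearman coefficients with p-values */
proc corr data=work.sixvars pearson spearman;
    var x1-x6;
run;

/* VAR together with WITH gives a rectangular table:                    */
/* WITH variables form the rows, VAR variables form the columns.        */
proc corr data=work.sixvars pearson;
    var  x1 x2;
    with x4 x5 x6;
run;

/* PARTIAL removes the linear effect of x5 before correlating x1 and x2 */
proc corr data=work.sixvars pearson;
    var     x1 x2;
    partial x5;
run;
```

When reading the resulting tables, a small p-value under a coefficient indicates that the hypothesis Rho=0 is rejected for that pair. In the second table of the slides, for instance, x1 and x2 have 0.81423 with p below .0001, while x3 has no p-value below 0.10 with any other variable.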