Selecting Input Distribution

Introduction
The data on the input random variables of interest can be used in the following ways:
1. The data values themselves are used directly in the simulation. This is called trace-driven simulation.
2. The data values are used to define an empirical distribution function in some way.
3. Standard techniques of statistical inference are used to "fit" a theoretical distribution form to the data, and hypothesis tests are performed to determine the goodness of fit.

Different approaches
- Approach 1 is used to validate a simulation model by comparing model output for an existing system with the corresponding output of the system itself.
- Approach 1 has two drawbacks: the simulation can reproduce only what happened historically, and there is seldom enough data to make all the simulation runs.
- Approach 2 avoids these shortcomings, because any value between the observed minimum and maximum can be generated. So approach 2 is preferred over approach 1.
- If a theoretical distribution can be found that fits the observed data (approach 3), then it is preferred over approach 2.

Approach 3 vs. approach 2:
- An empirical distribution may have irregularities if only a small number of data points is available. Approach 3 smooths out the data and may provide information on the overall underlying distribution.
- In approach 2, it is usually not possible to generate values outside the range of the observed data in the simulation.
- If one wants to test the performance of the simulated system under extreme conditions, that cannot be done using approach 2.
- There may be compelling (physical) reasons in some situations for using a particular theoretical distribution. Even then, it is better to get empirical support for that distribution from the observed data.
- A theoretical distribution is a compact way of representing a set of data values.
- In approach 2, if $n$ data points are available from a continuous distribution, then $2n$ values (the data and the corresponding cumulative distribution function values) must be entered and stored in the computer to represent the empirical distribution in many simulation languages. This becomes cumbersome when the data set is large.

Sources of randomness for common simulation experiments
- Manufacturing: processing times, machine operating times before a downtime, machine repair times, etc.
- Computer: inter-arrival times of jobs, job types, processing requirements of jobs, etc.
- Communication: inter-arrival times of messages, message types and lengths, etc.
- Mechanical systems: fluid flow in pipes, accumulation of dirt on the pipe walls, manufacturing defect size and location on a mechanical boundary, etc.

Parameters of distribution
- A location parameter specifies an abscissa location point of a distribution's range of values. Usually it is the midpoint (e.g. the mean) or the lower endpoint of the distribution's range.
- As the location parameter changes, the associated distribution merely shifts left or right without otherwise changing.
- A scale parameter determines the scale (or unit) of measurement of the values in the range of the distribution.
- A change in the scale parameter compresses or expands the associated distribution without altering its basic form.
- A shape parameter determines, distinct from location and scale, the basic form or shape of a distribution within the general family of distributions.
- A change in the shape parameter alters a distribution's characteristics (e.g. skewness) more fundamentally than a change in location or scale.
- Some distributions (e.g. normal, exponential) do not have a shape parameter, while others have several (the beta distribution has two).

Empirical distributions
For ungrouped data:
Let $X_{(i)}$ denote the $i$th smallest of the $X_j$'s, so that $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$.

For grouped data:
Suppose that the $n$ $X_j$'s are grouped in $k$ adjacent intervals $[a_0, a_1), [a_1, a_2), \dots, [a_{k-1}, a_k)$, so that the $j$th interval contains $n_j$ observations, with $n_1 + n_2 + \dots + n_k = n$. Let a piecewise linear function $G$ be such that $G(a_0) = 0$ and $G(a_j) = (n_1 + n_2 + \dots + n_j)/n$; then:

$$
G(x) =
\begin{cases}
0 & \text{if } x < a_0, \\
G(a_{j-1}) + \dfrac{x - a_{j-1}}{a_j - a_{j-1}} \left[ G(a_j) - G(a_{j-1}) \right] & \text{if } a_{j-1} \le x < a_j,\ j = 1, 2, \dots, k, \\
1 & \text{if } a_k \le x.
\end{cases}
$$

Verifying Independence
- Most statistical tests assume IID input.
- At times, simulation experiments have inputs that are dependent by nature, e.g. the hourly temperature in a city.
- Two graphical ways of studying independence:
  - Correlation plot: a plot of the sample correlation $\hat{\rho}_j$ for $j = 0, 1, 2, \dots, l$. If the $\hat{\rho}_j$ (for $j \ge 1$) differ from 0 by a significant amount, then this is strong evidence that the $X_i$'s are not independent.
  - Scatter plot: a plot of the pairs $(X_i, X_{i+1})$ for $i = 1, 2, \dots, n-1$. If the $X_i$'s are independent, the points in this plot are scattered randomly. A trend indicates dependence.

Clues from summary statistics
- For symmetric distributions, the mean and median should match. In the sample data, if these values are sufficiently close to each other, we can think of a symmetric distribution ...
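The statements under "Parameters of distribution" about shifting and stretching can be written down as a change of variable. This is a sketch under assumed notation: the symbols $\gamma$ (location) and $\beta$ (scale) and the base density $f_X$ are not in the slides and are introduced here only for illustration.

```latex
% Assumption: X has density f_X; gamma (location) and beta > 0 (scale) are illustrative symbols.
\[
  Y = \gamma + \beta X
  \quad\Longrightarrow\quad
  f_Y(y) = \frac{1}{\beta}\, f_X\!\left(\frac{y - \gamma}{\beta}\right),
\]
\[
  \mathrm{E}[Y] = \gamma + \beta\,\mathrm{E}[X],
  \qquad
  \mathrm{Var}(Y) = \beta^{2}\,\mathrm{Var}(X).
\]
```

Changing $\gamma$ only slides the density along the axis, and changing $\beta$ only stretches or compresses it. A shape parameter, by contrast, cannot be absorbed into a linear transformation of this kind.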
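The empirical distributions of approach 2 can be sketched in a few lines of Python. For ungrouped data the slide's formula did not survive, so the code assumes the commonly used continuous piecewise-linear form, $F(x) = \frac{i-1}{n-1} + \frac{x - X_{(i)}}{(n-1)(X_{(i+1)} - X_{(i)})}$ for $X_{(i)} \le x < X_{(i+1)}$, with $F = 0$ below $X_{(1)}$ and $F = 1$ from $X_{(n)}$ on. The grouped version follows the piecewise-linear $G$ defined by $G(a_0) = 0$ and $G(a_j) = (n_1 + \dots + n_j)/n$. All function names are hypothetical.

```python
from bisect import bisect_right
import random


def ungrouped_cdf(data):
    """Piecewise-linear empirical CDF (assumed form; needs >= 2 distinct values)."""
    xs = sorted(data)
    n = len(xs)

    def F(x):
        if x < xs[0]:
            return 0.0
        if x >= xs[-1]:
            return 1.0
        i = bisect_right(xs, x)          # xs[i-1] <= x < xs[i], so i is the 1-based rank
        lo, hi = xs[i - 1], xs[i]
        return (i - 1) / (n - 1) + (x - lo) / ((n - 1) * (hi - lo))

    return F


def grouped_cdf(edges, counts):
    """Piecewise-linear G through (a_j, (n_1 + ... + n_j) / n)."""
    n = sum(counts)
    cum = [0.0]
    for c in counts:
        cum.append(cum[-1] + c / n)

    def G(x):
        if x < edges[0]:
            return 0.0
        if x >= edges[-1]:
            return 1.0
        j = bisect_right(edges, x)       # edges[j-1] <= x < edges[j]
        frac = (x - edges[j - 1]) / (edges[j] - edges[j - 1])
        return cum[j - 1] + frac * (cum[j] - cum[j - 1])

    return G


def sample_ungrouped(data, rng=random):
    """Inverse-transform draw from the ungrouped empirical distribution."""
    xs = sorted(data)
    p = (len(xs) - 1) * rng.random()     # position along the n-1 linear segments
    i = int(p)                           # 0-based segment index, at most n-2
    return xs[i] + (p - i) * (xs[i + 1] - xs[i])
```

Every value returned by `sample_ungrouped` lies between the smallest and largest observation, which is the limitation of approach 2 noted earlier: extreme conditions outside the observed range are never generated.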
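The two graphical checks under "Verifying Independence" need only the lag-$j$ sample correlations and the consecutive pairs. The sketch below assumes one common estimator, $\hat{\rho}_j = \hat{C}_j / S^2(n)$ with $\hat{C}_j = \sum_{i=1}^{n-j} (X_i - \bar{X})(X_{i+j} - \bar{X}) / (n-j)$; the slides do not say which estimator they use, and the function names are hypothetical. Plotting is left out so that the block depends only on the standard library.

```python
def lag_correlation(x, j):
    """Sample correlation at lag j (assumed estimator: C_j / S^2)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    cov = sum((x[i] - mean) * (x[i + j] - mean) for i in range(n - j)) / (n - j)
    return cov / var


def correlation_plot_values(x, max_lag):
    """Values to draw in a correlation plot, for lags 1..max_lag."""
    return [lag_correlation(x, j) for j in range(1, max_lag + 1)]


def scatter_pairs(x):
    """Pairs (X_i, X_{i+1}) to draw in a lag-1 scatter plot."""
    return list(zip(x[:-1], x[1:]))


if __name__ == "__main__":
    trending = list(range(20))           # strongly dependent: steady upward trend
    print(correlation_plot_values(trending, 3))
    print(scatter_pairs(trending)[:3])
```

For the trending series the lag-1 correlation is well above zero and the scatter pairs fall on a straight line, both of which signal dependence. For independent data the correlations should hover near zero and the pairs should show no pattern.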
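The mean-versus-median clue can be checked numerically. The sketch below measures the gap between the sample mean and the sample median in units of the sample standard deviation; this scaling, the function name, and any threshold for "sufficiently close" are assumptions and do not come from the slides.

```python
import statistics


def mean_median_gap(data):
    """(mean - median) / stdev; near 0 suggests symmetry (hypothetical helper)."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    spread = statistics.stdev(data)      # sample standard deviation, needs n >= 2
    return (mean - median) / spread


if __name__ == "__main__":
    symmetric = [1, 2, 3, 4, 5]
    right_skewed = [1, 1, 1, 2, 10]
    print(mean_median_gap(symmetric))
    print(mean_median_gap(right_skewed))
```

A gap near zero is consistent with a symmetric distribution, while a clearly positive gap points to a long right tail. How close is close enough depends on the sample size and remains a judgment call.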