Visualization and Data Mining

2. Outline
   - Graphical excellence and lie factor
   - Representing data in 1, 2, and 3-D
   - Representing data in 4+ dimensions
     - Parallel coordinates
     - Scatterplots
     - Stick figures

3. Napoleon Invasion of Russia, 1812
   [Figure: Napoleon]

4. Marley, 1885

5. [Figure from http://www.odt.org/Pictures/minard.jpg (www.odt.org), used by permission]

6. Snow's Cholera Map, 1855

7. Asia at night

8. South and North Korea at night
   [Figure labels: Seoul, South Korea; North Korea. Notice how dark it is.]

9. Visualization Role
   - Support interactive exploration
   - Help in result presentation
   - Disadvantage: requires human eyes
   - Can be misleading

10. Bad Visualization: Spreadsheet

    Year   Sales
    1999   2,110
    2000   2,105
    2001   2,120
    2002   2,121
    2003   2,124

    What is wrong with this graph?

11. Bad Visualization: Spreadsheet with misleading Y axis
    (Year / Sales data as on slide 10)
    - The Y-axis scale gives the WRONG impression of a big change

12. Better Visualization
    (Year / Sales data as on slide 10)
    - An axis scale from 0 to 2000 gives the correct impression of a small change

13. Lie Factor
    - Tufte requirement: 0.95 < Lie Factor < 1.05
    (E.R. Tufte, "The Visual Display of Quantitative Information", 2nd edition)

14. Tufte's Principles of Graphical Excellence
    - Give the viewer the greatest number of ideas, in the shortest time, with the least ink, in the smallest space.
    - Tell the truth about the data!
    (E.R. Tufte, "The Visual Display of Quantitative Information", 2nd edition)

15. Visualization Methods
    - Visualizing in 1-D, 2-D and 3-D
      - well-known visualization methods
    - Visualizing more dimensions
      - Parallel Coordinates
      - Other ideas

16. 1-D (Univariate) Data
    - Representations
      - Tukey box plot (low, high, middle 50%, mean)
      - Histogram

17. 2-D (Bivariate) Data
    - Scatter plot (axes: price, mileage)

18. 3-D Data (projection)
    [Figure: 3-D projection; axis label: price]

19. Lie Factor = 14.8
    (E.R. Tufte, "The Visual Display of Quantitative Information", 2nd edition)

20. 3-D image (requires 3-D blue and red glasses)
    - Taken by Mars Rover Spirit, Jan 2004

21. Visualizing in 4+ Dimensions
    - Scatterplots
    - Parallel Coordinates
    - Chernoff faces
    - Stick Figures

22. Multiple Views
    - Give each variable its own display

          A  B  C  D  E
      1   4  1  8  3  5
      2   6  3  4  2  1
      3   5  7  2  4  3
      4   2  6  3  1  5

    - Problem: does not show correlations

23. Scatterplot Matrix
    - Represent each possible pair of variables in its own 2-D scatterplot (car data)
    - Q: Useful for what?
      A: linear correlations (e.g. horsepower & weight)
    - Q: Misses what?
      A: multivariate effects

24. Parallel Coordinates
    - Encode variables along a horizontal row
    - Vertical line specifies values
    - [Figure: dataset in Cartesian coordinates; same dataset in parallel coordinates]
    - Invented by Alfred Inselberg while at IBM, 1985

25. Example: Visualizing Iris Data
    - Iris setosa
    - Iris versicolor
    - Iris virginica

26. Flower Parts
    - Petal, a non-reproductive part of the flower
    - Sepal, a non-reproductive part of the flower

27. Parallel Coordinates
    - Sepal Length: 5.1

28. Parallel Coordinates: 2-D
    - Sepal Length: 5.1
    - Sepal Width: 3.5

29. Parallel Coordinates: 4-D
    - Sepal Length: 5.1
    - Sepal Width: 3.5
    - Petal Length: 1.4
    - Petal Width: 0.2

30. Parallel Visualization of Iris data
    [Figure: the point (5.1, 3.5, 1.4, 0.2) drawn among the Iris data]

31. Parallel Visualization Summary
    - Each data point is a line
    - Similar points correspond to similar lines
    - Lines crossing over correspond to negatively correlated attributes
    - Interactive exploration and clustering
    - Problems: order of axes, limited to 20 dimensions

32. Chernoff Faces
    - Encode different variables' values in characteristics of a human face
    - http://www.cs.uchicago.edu/wiseman/chernoff/
    - http:/ applets:

33. Interactive Face

34. Chernoff faces, example

35. Stick Figures
    - Two variables are mapped to X, Y axes
    - Other variables are mapped to limb lengths and angles
    - Texture patterns can show data characteristics

36. Stick figures, example
    - Census data showing age, income, sex, education, etc.
    - Closed figures correspond to women and we can see more of t
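The lie factor on slide 13 can be checked numerically against the sales figures on slides 10 to 12. Tufte defines it as the size of the effect shown in the graphic divided by the size of the effect in the data, where "size of effect" is the relative change between two values. The sketch below is illustrative only: the truncated baseline of 2,100 is an assumption, since the axis actually used on slide 11 is not recoverable, and the helper names are hypothetical.

```python
def effect_size(first, second):
    """Relative change from first to second (Tufte's 'size of effect')."""
    return abs(second - first) / abs(first)


def lie_factor(data_first, data_second, shown_first, shown_second):
    """Effect shown in the graphic divided by the effect in the data."""
    return effect_size(shown_first, shown_second) / effect_size(data_first, data_second)


def bar_height(value, axis_baseline):
    """Visible bar height when the Y axis starts at axis_baseline."""
    return value - axis_baseline


# Sales figures from the slides.
sales = {1999: 2110, 2000: 2105, 2001: 2120, 2002: 2121, 2003: 2124}
first, last = sales[1999], sales[2003]

# Assumed truncated axis starting at 2,100 (hypothetical baseline).
truncated = lie_factor(first, last, bar_height(first, 2100), bar_height(last, 2100))

# Axis starting at 0: bar heights equal the data values.
zero_based = lie_factor(first, last, bar_height(first, 0), bar_height(last, 0))

print(f"truncated axis:  lie factor = {truncated:.1f}")
print(f"zero-based axis: lie factor = {zero_based:.2f}")
```

Under that assumed baseline the truncated chart falls far outside Tufte's 0.95 to 1.05 band, while the zero-based chart sits exactly at 1.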
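The parallel-coordinates construction on slides 24 to 31 can be sketched without any plotting library: each variable gets its own vertical axis, each data point becomes a polyline with one vertex per axis, and two lines cross between neighbouring axes when the points are ordered oppositely on them. The first row below is the flower shown on the slides; the other two rows, the min-max scaling, and the function names are assumptions made for this sketch.

```python
AXES = ["sepal length", "sepal width", "petal length", "petal width"]

ROWS = [
    (5.1, 3.5, 1.4, 0.2),  # the flower used on the slides
    (7.0, 3.2, 4.7, 1.4),  # illustrative row (assumed)
    (6.3, 3.3, 6.0, 2.5),  # illustrative row (assumed)
]


def axis_ranges(rows):
    """Return (min, max) per variable, used to scale each vertical axis."""
    return [(min(column), max(column)) for column in zip(*rows)]


def polyline(row, ranges):
    """Map one data point to (axis_index, height in [0, 1]) vertices."""
    points = []
    for axis_index, (value, (low, high)) in enumerate(zip(row, ranges)):
        height = 0.5 if high == low else (value - low) / (high - low)
        points.append((axis_index, height))
    return points


def lines_cross(row_a, row_b, axis_index):
    """True if the two polylines cross between axis_index and the next axis."""
    left = row_a[axis_index] - row_b[axis_index]
    right = row_a[axis_index + 1] - row_b[axis_index + 1]
    return left * right < 0


ranges = axis_ranges(ROWS)
for row in ROWS:
    print(polyline(row, ranges))
```

In this small sample the first two rows cross between the sepal length and sepal width axes but not between the two petal axes, which is the kind of pattern the summary slide associates with negatively correlated attributes.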
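The scatterplot matrix on slide 23 gives one panel to every pair of variables, and its stated strength is exposing linear correlations. As a rough sketch, the code below enumerates those panels for the five variables A to E on slide 22 and computes a Pearson correlation for each. The table values are read from the flattened slide text, so their row/column assignment is an assumption, and `pearson` is a hand-written helper rather than a library call.

```python
from itertools import combinations
from math import sqrt

# Columns A..E as read from slide 22 (row/column assignment is assumed).
DATA = {
    "A": [4, 6, 5, 2],
    "B": [1, 3, 7, 6],
    "C": [8, 4, 2, 3],
    "D": [3, 2, 4, 1],
    "E": [5, 1, 3, 5],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# One panel per unordered pair of variables: d * (d - 1) / 2 panels.
panels = {(a, b): pearson(DATA[a], DATA[b]) for a, b in combinations(DATA, 2)}

# List the panels, strongest linear relationship first.
for (a, b), r in sorted(panels.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{a} vs {b}: r = {r:+.2f}")
```

Each panel only ever looks at two variables at a time, which is why the slide notes that multivariate effects are missed.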