Data Mining
Yanci Zhang

What is Data Mining?
  Extraction of implicit, previously unknown and potentially useful information from data
  Exploration and analysis of large quantities of data, by automatic or semi-automatic means, to discover meaningful patterns

Process of Knowledge Discovery
  Stages: Raw data, Data Warehouse, Patterns, Knowledge
  Steps:
    Data cleaning and integration
    Data transformation, selection, and mining
    Pattern evaluation and knowledge presentation

Example: NBA (1/2)
  Play-by-play information
    Who is on the court
    Who shoots
  Coaches want to know
    Who works best?
    What combination of strategies works best?

Example: NBA (2/2)
  Advanced Scout is a data mining tool to answer these questions
    Data collection
    Data preprocessing: cleaning, transformations, enrichment
    Data mining
    Interpretation and knowledge discovery

What Is (Not) Data Mining?
  What is not data mining
    Look up a phone number in a phone directory
    Query a web search engine for information about "Amazon"
  What is data mining
    Certain names are more prevalent in certain US locations (O'Brien, O'Rourke, O'Reilly in the Boston area)
    Group together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, A)

Why Data Mining?
  Data rich but information poor
  We are drowning in data, but starving for knowledge
  [Cartoon: "Wow, so much data! How to make full use of it?" / "Shovel! I will mine data." / label: Data tomb]

Tasks
  Prediction Methods
    Use some variables to predict unknown or future values of other variables
  Description Methods
    Find human-interpretable patterns that describe the data

Applications
  Data analysis and decision support
    Market analysis and management
      Beer and diapers
    Risk analysis and management
      Credit card risk analysis and control
    Fraud detection and detection of unusual patterns

Applications
  Text mining and Web mining
  Stream data mining
  DNA and bio-data analysis
    Similarity search and comparison among DNA sequences
    Association analysis: identification of co-occurring gene sequences
    Path analysis: linking genes to different disease development stages
    Visualization tools and genetic data analysis

Challenges
  Scalability
  Dimensionality
  Complex and Heterogeneous Data
  Data Quality
  Data Ownership and Distribution
  Privacy Preservation
  Streaming Data

Assignments
  Group
    Group16: PC and MAC 10
    Group17: PC and MAC 11
    Group18: What is augmented reality?
    Group38: What are Graphics Processing Units (GPU)?
  Individual:
    Write an English article: Applications of Data mining (300 words)
  Deadline: 2011-11-10
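
Supplementary sketch: the "Beer and diapers" item under Applications is usually told as a market-basket example, in which a retailer looks for items that tend to be bought together. The slides give no data or algorithm, so the Python below is only a minimal illustration under stated assumptions: the transactions are invented, and the helper names `pair_support` and `confidence` are hypothetical. It counts how often item pairs co-occur (support) and estimates how often a basket containing one item also contains another (confidence).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions, invented purely for illustration.
transactions = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "diapers"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def pair_support(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def confidence(baskets, antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


pairs = pair_support(transactions)
beer_diapers_support = pairs[("beer", "diapers")] / len(transactions)
beer_to_diapers = confidence(transactions, "beer", "diapers")

print("support(beer, diapers):", beer_diapers_support)
print("confidence(beer -> diapers):", beer_to_diapers)
```

Counting every pair directly is feasible only for small data. With many items and transactions this brute-force approach runs into the scalability challenge listed above.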