Contextually-related Entities - UW Faculty Web Server

Uploaded by: ra****d  Document ID: 252828522  Upload date: 2024-11-20  Format: PPT  Pages: 34  Size: 247KB
Integrating Finite-state Morphologies with Deep LFG Grammars
Tracy Holloway King

FST and deep grammars
  - Finite state tokenizers and morphologies can be integrated into deep processing systems
  - Integrated tokenizers
      - eliminate the need for preprocessing
      - allow the grammar writer more control over the input
  - Morphologies
      - eliminate the need to list (multiple) surface forms in the lexicon
      - eliminate the need for lexical entries for words with predictable subcategorization frames

Talk outline
  - Basic integrated system
  - Integrating morphology FSTs
  - Interaction of tokenization and morphology

Basic Architecture
  - Input string
  - (Shallow markup)
  - Tokenizing FSTs
  - Morphology FSTs
  - LFG grammar and lexicons
  - Constituent-structure (tree)
  - Functional-structure (AVM)

Example steps through the system
  - Input string: Boys appeared.
  - Tokenizing: boys TB appeared TB . TB
  - Morphology:
      boy +Noun +Pl
      appear +Verb +PastBoth +123SP
      . +Punct
  - C-structure/F-structure: next slides

C-structure tree

F-structure AVM

The wider system: XLE
  - Handwritten grammars for various languages
      - Substantial for English, German, Japanese, Norwegian
      - Also: Arabic, Chinese, Urdu, Korean, Welsh, Malagasy, Turkish
  - Robustness mechanisms
      - Fragment grammar rules
      - Morphological guessers
      - Skimming when resource limits approached
  - Ambiguity management (packing)
      - Compute all analyses (no "aggressive pruning")
      - Propagate packed ambiguities across processing modules
  - Stochastic disambiguation
      - MaxEnt models to select from packed (f-)structures
  - Other processing available: generation, semantics, transfer/rewriting
  - Comparisons to other systems/tasks
      - Parsing WSJ (Riezler et al., ACL 2002)
      - Comparison to Collins model 3 (Riezler et al., NAACL 2004)

FST Morphologies
  - Associate surface form with
      - a lemma (stem/canonical form)
      - a set of tags
  - Process is non-deterministic
      - can have many analyses for one surface form
      - grammar has to be able to deal with multiple analyses (morphological ambiguity)
  - Issue: can the grammar control rampant morphological ambiguity?
      - Arabic vowelless representations

Example Morphology Output
  turnips     turnip    +Noun +Pl
  Mary        Mary      +Prop +Giv +Fem +Sg
  falls       fall      +Noun +Pl
              fall      +Verb +Pres +3sg
  broken      break     +Verb +PastPerf +123SP
              broken    +Verb +PastPart +Adj
  New York    New York  +Prop +Place +USAState +Prefer
              New York  +Prop +Place +City +Prefer
              plus analyses of New and York

Morphologies and lexicons
  - Without a morphology, need to list all surface forms in the lexicon
      - bad for English
      - horrible for languages like Finnish and Arabic
  - With a morphology, one entry for the stem form
      go V XLE (V-INTRANS go).
      for: go, goes, going, gone, went
  - With additional integration, words with predictable subcategorization frames need no entry

Basic idea
  - Run surface forms of words through the morphology to produce stems and tags
      - MorphConfig file specifies which morphologies the grammar uses
  - Look up stems and tags in the lexicon
  - Sublexical phrase structure rules build syntactic nodes covering the stems and tags
  - Standard grammar rules build larger phrases

Lexical entries for tags
  boys = boy +Noun +Pl
  boy    N XLE (NOUN boy).
  +Noun  N_SFX XLE (PERS 3) (EXISTS NTYPE).
  +Pl    NNUM_SFX XLE (NUM pl).

Sublexical rules for tags
  - Build up lexical nodes from stem plus tags
  - Rules are identical to standard phrase structure rules
      - Except display can hide the sublexical information
  N --> N_BASE N_SFX_BASE NNUM_SFX_BASE.
  N
    N_BASE         boy
    N_SFX_BASE     +Noun
    NNUM_SFX_BASE  +Pl

Resulting structures
  N
    N_BASE         boy
    N_SFX_BASE     +Noun
    NNUM_SFX_BASE  +Pl
  PRED   boy
  PERS   3
  NUM    pl
  NTYPE  common

Lexical entries
  - Stems with unpredictable subcategorization frames need entries
      - verbs
      - adjectives with obliques (proud of her)
      - nouns with that complements (the idea that he laughed)
  - Most lexical items have predictable frames determined by part of speech
      - common and proper nouns
      - adjectives
      - adverbs
      - numbers

-unknown lexical entry
  - Match any stem to the entry
  - Provide desired functional information
      - %stem will pass in the appropriate surface form (i.e., the lemma/stem)
  - Constrain application via morphological tag possibilities
  -unknown N XLE (NOUN %stem);
           A XLE (ADJ %stem);
           ADV XLE (ADVERB %stem).

-unknown example
  The box boxes.
  - Lexicon entries:
      box      V XLE (V-INTRANS %stem).
      -unknown N XLE (NOUN %stem); ADV ; A .
  - Morphology output:
      box = box +Noun +Sg | +Verb +Non3Sg
      boxes = box +Noun +Pl | +Verb +3Sg
  - Build up four effective lexical entries
      - 1 noun, 1 verb, 1 adverb, 1 adjective
      - adverb and adjective fail sublexically
      - noun and verb relevant for the sentence

Inflectional morphology summary
  - Integrating FST morphologies significantly decreases lexicon development
  - Verbs and other unpredictable items are listed only under their stem form
  - Predictable items such as nouns are processed via -unknown and never listed in the lexicon

Guessers
  - Even large industrial FST morphologies are not complete
  - Novel words usually have regular morphology
  - Build an FST guesser based on this
      - Words with capital letters are proper nouns (Saakashvili)
      - Words ending in ed are past tense verbs
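
The -unknown example on the slides ("The box boxes.") can be traced with a small sketch. This is not XLE code and not the author's implementation: the FST morphology is replaced by a hard-coded table of the analyses quoted on the slides, and the sublexical rules are reduced to a single check that an entry's category agrees with the first morphological tag. The names MORPHOLOGY, LEXICON, UNKNOWN, CATEGORY_TAG, candidate_entries and analyze are hypothetical, and the +Adv tag is an assumption, since the slides do not show an adverb analysis.

```python
from typing import Dict, List, Tuple

Analysis = Tuple[str, List[str]]  # (stem, morphological tags)
Entry = Tuple[str, str]           # (category, template name)

# Stand-in for the FST morphology: surface form -> analyses.
# The analyses are the ones quoted on the slides.
MORPHOLOGY: Dict[str, List[Analysis]] = {
    "boys": [("boy", ["+Noun", "+Pl"])],
    "box": [("box", ["+Noun", "+Sg"]), ("box", ["+Verb", "+Non3Sg"])],
    "boxes": [("box", ["+Noun", "+Pl"]), ("box", ["+Verb", "+3Sg"])],
}

# Explicit stem entries: only items with unpredictable subcategorization.
LEXICON: Dict[str, List[Entry]] = {"box": [("V", "V-INTRANS")]}

# The -unknown entry: categories offered to every stem.
UNKNOWN: List[Entry] = [("N", "NOUN"), ("A", "ADJ"), ("ADV", "ADVERB")]

# Simplified stand-in for the sublexical rules: the part-of-speech tag
# each category must combine with. "+Adv" is an assumed tag name.
CATEGORY_TAG: Dict[str, str] = {
    "N": "+Noun", "V": "+Verb", "A": "+Adj", "ADV": "+Adv",
}


def candidate_entries(stem: str) -> List[Entry]:
    """Explicit entries for the stem plus everything -unknown offers."""
    return LEXICON.get(stem, []) + UNKNOWN


def analyze(surface: str) -> List[Tuple[str, str, List[str]]]:
    """Return the (category, instantiated template, tags) readings that survive."""
    readings = []
    for stem, tags in MORPHOLOGY.get(surface, []):
        for category, template in candidate_entries(stem):
            # An entry survives only if its category matches the
            # part-of-speech tag of this morphological analysis.
            if CATEGORY_TAG[category] == tags[0]:
                readings.append((category, f"{template} {stem}", tags))
    return readings


if __name__ == "__main__":
    for word in ["boys", "box", "boxes"]:
        print(word, analyze(word))
```

For the stem box there are four candidate entries (one explicit verb entry plus three from -unknown). Only the noun and verb readings survive for box and boxes, because no morphological analysis carries an adjective or adverb tag. This matches the slide's remark that the adverb and adjective entries fail sublexically.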