Create Photo-Realistic Talking Face

Uploaded by: e****s | Document ID: 243293268 | Uploaded: 2024-09-20 | Format: PPT | Pages: 35 | Size: 1.72 MB
Changbo Hu
This work was done during a visit to Microsoft Research China, with Baining Guo and Bo Zhang.

Outline
  - Introduction to the talking face
  - Motivations
  - System overview
  - Techniques
  - Conclusions

Introduction
  - What is a talking face?
      - Face (lip) animation, driven by voice
  - Applications
  - The process of building a talking face
      - Face model
      - Motion capture
      - Mapping between audio and video
      - Rendering
  - Photo-realistic?

Literature
  - Waters, 93: DECface, 2D wireframe model
  - Terzopoulos, 95: Skin and muscle model
  - Bregler, 97: Video Rewrite, sample-image based
  - TS Huang, 98: Mesh model from range data
  - Poggio, 98: MikeTalk, viseme morphing
  - Guenter, 99: Making Faces, 3D from multiple cameras
  - Zhengyou Zhang, 00: 3D face modeling from video through the epipolar constraint
  - Cosatto, 00: Planar quads model

Some face models
  [Figure: example face models]

Motivations
  - Aim: a graphical interface for a conversational agent
  - Photo-realistic
  - Driven by Chinese
  - Smooth connection between sentences
  - Extended from "Video Rewrite"

System overview: pipeline of the system (1)
  [Diagram boxes: Video with sound; Images; Sound; Pose tracking; Phoneme segmentation; Annotation; Lip motion tracking; Training database]

System overview: pipeline of the system (2)
  [Diagram boxes: New text; TTS system; Wav sound; Segmentation; Triphone sequence; Training database; Synthesized triphone sequence; Lip motion sequence; Background sequence; Rewrite to faces]

Techniques
  - Analysis
      - Audio processing
      - Image processing
  - Synthesis
      - Lip image
      - Background image
      - Stitch together

Audio part: Sound Segmentation
  - Given the wav file and the script
  - Use an HMM to train the segmentation system
  - Segment the wav file into a phoneme sequence
  - Example of the segmentation result (phoneme, start, end; units are not stated on the slide):

      Phoneme   Start   End
      SILOPEN       0    23
      SILOPEN      24    42
      s            43    61
      if4          62    74
      j            75    80
      ia1          81    97
      sh           98   109
      ang1        110   121
      y           122   130
      e4          131   133
      y           134   145
      in2         146   154
      h           155   164
      ang2        165   194

Annotation with Phoneme
  - Phonemes are used to annotate the video frames
  - Each phoneme in a sentence corresponds to a short stretch of the video sequence
  [Diagram labels: Training sentence; Audio frames; Video frames; Phoneme sequence; Phoneme1 and the frames for Phoneme1; Phoneme2 and the frames for Phoneme2]

Phoneme Distance Analysis
  - Phoneme and triphone basics
  - Chinese phonemes vs. English phonemes
  - Distance metric definitions
  - Results

Phoneme Basics
  - Phonemes represent the basic elements of speech. All possible speech can be represented by a combination of phonemes.
      - CH, JH, S, EH, EY, OY, AE, SIL
  - A triphone is three consecutive phonemes. It captures not only pronunciation characteristics but also context information.
      - T-IY-P, IY-P-AA, P-AA-T

Chinese Phoneme vs. English
  - Chinese phonemes fall into two basic groups: Initials and Finals.
      - Initials: B, P, M, F, ...
      - Finals: a3, o1, e2, eng3, iang4, ue5, ...
  - Each Chinese final has 5 tones: 1, 2, 3, 4, 5.
      - Different tones: a1, a2, a3, a4, a5.
  - Chinese finals are not actually basic elements of speech.
      - For example: iang1, iao1, uang1, iong1
  - The Chinese phoneme set is much larger than the English one.

Phoneme Distance Analysis
  - Define the distance between any two phonemes.
  - Since we synthesize only video, not sound, tone is ignored.
  - Lip shape motion is the core element of the distance metric.

Phoneme Distance Analysis
  [Diagram: for each of two phonemes, the sample videos are time-aligned to a uniform length and averaged into one average video per phoneme.]
  - By comparing the two aligned average videos, we generate the distance matrix of the whole phoneme set.

Image part: Pose Tracking
  - Assume a planar model for the face
  - Standard minimization method to find the transform matrix (affine transform) (Black, 95)
  - A mask restricts tracking to the part of the face of interest
  [Figure: template picture and mask image]

Pose Tracking
  - Motion prediction using parameters with physical meaning

Pose Tracking
  - Some tracking results:
  [Figure: tracking results]

Lip Motion Tracking
  - Uses Eigen Points (Covell, 91)
  - Feature points include the jaw, lips and teeth
  - Training database specified manually
  - Automatic tracking through all pose-tracked images

Lip Motion Tracking
  [Figure: training database (hand-labeled) and automatic tracking results]

Synthesizing new sentences
  - New text is converted to wav by the TTS system
  - The wav is segmented into a phoneme sequence
  - DP is used to find an optimal video sequence from the training database
  - Triphone videos are time-aligned and stitched together
  - The lip sequence is transformed and pasted onto the background faces

Lip sequence synthesis
  [Diagram: triphone clips from the training database (Triphone 1 to Triphone C) are selected to form the optimal sequence for the new phoneme sequence.]

Dynamic Programming
  [Diagram: graph from Begin to End through candidate nodes Triphone1 to Triphone5]

Edge Cost Definition
  - Two parts:
      - Phoneme distance: the 3 phoneme distances added together
      - Lip shape distance over the overlapping portion of the triphone videos
  - The two parts are combined as a weighted sum

Background video generation
  - The background is a video sequence of the virtual character saying something else
  - Similarity measurement of the background
  - Select a "standard frame"
      - The frame with the maximal number of frames similar to it
  - Filter out frames with jerkiness

Stitch the time-aligned result to background faces
  - Write back with a mask
  - Transform the synthesized lip to the background face
  [Figure: mask image for the write-back operation; original background frame; write-back result for the same frame]

More video results

Conclusion and Future Work
  - Pose tracking and lip motion tracking
  - Size of the training database
  - Talking face with expression
  - Real-time generation?
  - Fast modeling for different people
  - Animation

Thank you
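
The "Phoneme Basics" slide describes a triphone as three consecutive phonemes, and the distance analysis ignores tone. A minimal Python sketch of both steps follows. The function names are hypothetical, and it assumes tones are written as a trailing digit 1 to 5, as in the segmentation example (ang2, in2).

```python
def strip_tone(phoneme):
    """Drop a trailing tone digit (assumed to be 1-5), e.g. 'ang2' -> 'ang'."""
    return phoneme.rstrip("12345")


def triphones(phonemes):
    """Return every run of three consecutive phonemes, joined with '-'."""
    return ["-".join(phonemes[i:i + 3]) for i in range(len(phonemes) - 2)]


# The example from the slide: T IY P AA T
print(triphones(["T", "IY", "P", "AA", "T"]))  # ['T-IY-P', 'IY-P-AA', 'P-AA-T']
```

A sequence shorter than three phonemes yields an empty list, so sentence boundaries would need padding (for example with a silence label) if every phoneme must appear as the centre of a triphone.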
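
The "Phoneme Distance Analysis" slides time-align the sample videos of each phoneme to a uniform length, average them, and compare the averages to fill a distance matrix. The sketch below illustrates that idea under several assumptions that are not in the slides: each video is reduced to a list of lip-shape feature vectors, alignment is plain linear resampling, and the distance is the mean per-frame Euclidean distance. All names are hypothetical.

```python
import math


def resample(track, length):
    """Linearly resample a list of feature vectors to `length` frames."""
    if len(track) == 1 or length == 1:
        return [list(track[0]) for _ in range(length)]
    out = []
    for k in range(length):
        pos = k * (len(track) - 1) / (length - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(track) - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(track[lo], track[hi])])
    return out


def average_track(tracks, length):
    """Time-align all sample tracks of one phoneme and average them frame by frame."""
    aligned = [resample(t, length) for t in tracks]
    dims = len(aligned[0][0])
    return [
        [sum(t[k][d] for t in aligned) / len(aligned) for d in range(dims)]
        for k in range(length)
    ]


def track_distance(a, b):
    """Mean Euclidean distance between corresponding frames of two aligned tracks."""
    return sum(math.dist(x, y) for x, y in zip(a, b)) / len(a)


def distance_matrix(samples, length=3):
    """samples maps phoneme -> list of tracks; returns {(p, q): distance}."""
    avg = {p: average_track(tracks, length) for p, tracks in samples.items()}
    return {(p, q): track_distance(avg[p], avg[q]) for p in avg for q in avg}


# Toy one-dimensional "lip opening" tracks for two phonemes.
samples = {
    "a": [[[0.0], [2.0]], [[0.0], [1.0], [2.0]]],
    "b": [[[1.0], [3.0]]],
}
matrix = distance_matrix(samples)
print(matrix[("a", "b")])
```

Linear resampling is the simplest possible alignment; the slides do not say which alignment method the system actually used.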
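
The "Dynamic Programming" and "Edge Cost Definition" slides describe choosing an optimal sequence of triphone clips with a cost that is a weighted sum of a phoneme distance (three phoneme distances added together) and a lip shape distance over the overlap between clips. The following is a sketch of one way to set this up, not the authors' implementation. It assumes the phoneme part compares each target triphone with the candidate clip's triphone, the lip part compares the end of one clip with the start of the next, and each clip is a dict with the hypothetical keys `triphone`, `start_lip` and `end_lip`.

```python
def select_clips(targets, candidates, phoneme_dist, lip_dist, w_phon=1.0, w_lip=1.0):
    """Pick one candidate clip per target triphone with minimum total cost.

    targets:    list of target triphones, each a tuple of 3 phonemes
    candidates: candidates[i] is the list of database clips usable at position i
    Returns (list of chosen candidate indices, total cost).
    """
    def phon_cost(target, clip):
        # Three phoneme distances added together.
        return sum(phoneme_dist(a, b) for a, b in zip(target, clip["triphone"]))

    best = [w_phon * phon_cost(targets[0], c) for c in candidates[0]]
    back = []
    for i in range(1, len(targets)):
        new_best, pointers = [], []
        for clip in candidates[i]:
            # Cost of arriving from each clip at the previous position.
            costs = [
                best[j] + w_lip * lip_dist(prev["end_lip"], clip["start_lip"])
                for j, prev in enumerate(candidates[i - 1])
            ]
            j = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[j] + w_phon * phon_cost(targets[i], clip))
            pointers.append(j)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path backwards.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path, total


# Toy example with a 0/1 phoneme distance and scalar lip shapes.
targets = [("a", "b", "c"), ("b", "c", "d")]
candidates = [
    [{"triphone": ("a", "b", "c"), "start_lip": 0.0, "end_lip": 0.0},
     {"triphone": ("a", "b", "x"), "start_lip": 0.0, "end_lip": 5.0}],
    [{"triphone": ("b", "c", "d"), "start_lip": 5.0, "end_lip": 0.0},
     {"triphone": ("b", "c", "x"), "start_lip": 0.5, "end_lip": 0.0}],
]
path, cost = select_clips(
    targets, candidates,
    phoneme_dist=lambda a, b: 0 if a == b else 1,
    lip_dist=lambda a, b: abs(a - b),
)
print(path, cost)  # [1, 0] 1.0
```

In the toy example the cheapest path accepts a slightly wrong triphone at the first position because its lip shape joins the second clip without a jump, which is the trade-off the weights `w_phon` and `w_lip` control.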
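
The "Background video generation" slide picks as the "standard frame" the frame with the maximal number of frames similar to it. A minimal sketch of that selection rule is shown below. The slides do not define the similarity measure, so this assumes each frame is summarised by a pose parameter vector and that two frames are similar when their Euclidean distance is within a threshold; both choices and the function name are hypothetical.

```python
import math


def standard_frame(poses, threshold):
    """Return the index of the frame that has the most frames within `threshold` of it."""
    counts = [
        sum(1 for q in poses if math.dist(p, q) <= threshold)
        for p in poses
    ]
    return max(range(len(poses)), key=counts.__getitem__)


# Four frames; the second is close to both of its neighbours.
poses = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0)]
print(standard_frame(poses, threshold=0.15))  # 1
```

The pairwise comparison is quadratic in the number of frames, which is acceptable for a short background clip but would need indexing for long footage.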
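
The final synthesis step writes the synthesized lip image back into the background frame through a mask. One common way to do this is per-pixel blending, sketched below. It assumes the lip image has already been transformed into the background frame's coordinates, that images are nested lists of grey values of equal size, and that mask values lie in [0, 1] with 1 meaning "take the lip pixel"; the slides do not specify the mask format.

```python
def write_back(background, lip, mask):
    """Blend `lip` into `background` pixel by pixel using `mask` as the weight."""
    return [
        [m * l + (1.0 - m) * b for b, l, m in zip(brow, lrow, mrow)]
        for brow, lrow, mrow in zip(background, lip, mask)
    ]


background = [[10, 10], [10, 10]]
lip = [[50, 50], [50, 50]]
mask = [[1, 0.5], [0, 0]]
print(write_back(background, lip, mask))  # [[50.0, 30.0], [10.0, 10.0]]
```

A mask with soft edges (values between 0 and 1 near the boundary) avoids a visible seam where the lip region meets the rest of the face.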