http://www.ens-lyon.fr/~jylexcel/MUMPS   http://www.enseeiht.fr

MUMPS: A Multifrontal Massively Parallel Solver

IMPLEMENTATION

- Distributed multifrontal solver
- MPI / F90 based (C user interface also available)
- Stability based on partial pivoting
- Dynamic distributed scheduling to accommodate both numerical fill-in and multi-user environments
- Use of BLAS, LAPACK, ScaLAPACK

Main features:

MUMPS solves large systems of linear equations of the form Ax = b by factorizing A into A = LU or A = LDL^T.

- Symmetric or unsymmetric matrices (partial pivoting)
- Parallel factorization and solve phases (uniprocessor version also available)
- Iterative refinement and backward error analysis
- Various matrix input formats: assembled format, distributed assembled format, sum of elemental matrices
- Null space functionalities (experimental): rank detection and null space basis
- Partial factorization and Schur complement matrix
- Version for complex arithmetic

[Figure: A fully asynchronous distributed solver (VAMPIR trace, 8 processors).]

AVAILABILITY

MUMPS is available free of charge for non-commercial use. It has been used on a number of platforms (Cray T3E, Origin 2000, IBM SP, Linux clusters) by a few hundred current users (finite elements, chemistry, simulation, aeronautics, ...). If you are interested in obtaining MUMPS for your own use, please refer to the MUMPS home page.

[Figure: BMW car body, 148770 unknowns, 5396386 nonzeros (MSC.Software).]

Competitive performance

The MUMPS package performs well relative to other parallel sparse solvers; for example, the table below compares it with the SuperLU code from Demmel and Li. These results are taken from "Analysis and comparison of two general solvers for distributed memory computers", ACM TOMS, 27, 388-421.

Factorisation time in seconds of large matrices on the CRAY T3E ("-" on 1 processor = not enough memory). Columns give the number of processors.

| Matrix | Ordering | Solver | 1 | 4 | 8 | 16 | 32 | 64 | 128 |
|---|---|---|---|---|---|---|---|---|---|
| bbmat | AMD | MUMPS | - | 44.8 | 23.6 | 15.7 | 12.6 | 10.1 | 9.5 |
| bbmat | AMD | SuperLU | - | 64.7 | 36.6 | 21.3 | 12.8 | 9.2 | 7.2 |
| bbmat | ND (metis) | MUMPS | - | 32.1 | 10.8 | 12.3 | 10.4 | 9.1 | 7.8 |
| bbmat | ND (metis) | SuperLU | - | 132.9 | 72.5 | 39.8 | 23.5 | 15.6 | 11.1 |
| ecl32 | AMD | MUMPS | - | 53.1 | 31.3 | 20.7 | 14.7 | 13.5 | 12.9 |
| ecl32 | AMD | SuperLU | - | 106.8 | 56.7 | 31.2 | 18.3 | 12.3 | 8.2 |
| ecl32 | ND (metis) | MUMPS | - | 23.9 | 13.4 | 9.7 | 6.6 | 5.6 | 5.4 |
| ecl32 | ND (metis) | SuperLU | - | 48.5 | 26.6 | 15.7 | 9.6 | 7.6 | 5.6 |

The MUMPS package has been partially supported by the Esprit IV Project PARASOL and by CERFACS, ENSEEIHT-IRIT, INRIA Rhône-Alpes, LBNL-NERSC, PARALLAB and RAL. The authors are Patrick Amestoy, Jean-Yves L'Excellent, Iain Duff and Jacko Koster. Functionalities related to rank-revealing were first implemented by M. Tuma (Institute of Computer Science, Academy of Sciences of the Czech Republic) while he was at CERFACS. We are also grateful to C. Bousquet, C. Daniel, A. Guermouche, G. Richard, S. Pralet and C. Vömel, who have been working on specific parts of this software.

This poster was prepared by Jean-Yves L'Excellent (Jean-Yves.).

CURRENT RESEARCH: ACTIVE RESEARCH IS FEEDING THE MUMPS SOFTWARE PLATFORM

Reorderings and optimization of the memory usage

MUMPS uses state-of-the-art reordering techniques (AMD, AMF, ND, SCOTCH, PORD, METIS). These techniques have a strong impact on the parallelism and the number of operations, and we are currently studying their impact on the dynamic memory usage of MUMPS. In particular, we designed algorithms to optimize the memory occupation of the multifrontal stack. Future work includes dynamic memory load balancing and the design of an out-of-core version.

Best decrease obtained using our algorithm to decrease the stack for each reordering technique. Results obtained by A. Guermouche (PhD student in the INRIA ReMaP project).

| Matrix | Reordering | Memory decrease (%) |
|---|---|---|
| thermal | PORD | 73.5 |
| THREAD | METIS | 30.4 |
| af23560 | AMF | 32.2 |
| xenon2 | SCOTCH | 24.7 |
| rma10 | AMD | 17.6 |

Mixing dynamic and static scheduling strategies

MUMPS uses a completely dynamic approach with distributed scheduling and scales well up to around 100 processors. Introducing more static information helps reduce the cost of the dynamic decisions and makes MUMPS more scalable.

[Table: Effect of injecting more static information into the dynamic scheduling of MUMPS. Rectangular grids of increasing size, ND. Results obtained by C. Vömel (PhD CERFACS) on a CRAY T3E.]

Mixing MPI and OpenMP on clusters of SMP

We report on a preliminary experiment of hybrid parallelism on one node (16 processors) of an IBM SP. The best results are obtained when using 8 MPI processes with 2 OpenMP threads each. Regular problem from an 11-point discretization (cubic grid 64x64x64), ND used. Results obtained by S. Pralet (PhD CERFACS).

Platforms with heterogeneous network (clusters of SMP)

In the MUMPS scheduling, work is given to processors according to their load. Giving a penalty to the load of processors on a distant node helps perform tasks with high communication on the same node and improves the performance.

[Table: Effect of taking the hybrid network into account. Matrix PRE2, SCOTCH, 2 nodes of 16 processors of an IBM SP. Results obtained by S. Pralet (PhD CERFACS).]
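The load-penalty idea described under "Platforms with heterogeneous network" can be illustrated with a small sketch. This is not MUMPS code: the `Processor` record, the `choose_workers` function, the multiplicative form of the penalty and the value 1.5 are all assumptions made for illustration. The poster only states that the load of processors on a distant node is penalized.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    rank: int    # process identifier (hypothetical field)
    node: int    # SMP node hosting the process (hypothetical field)
    load: float  # current workload estimate, arbitrary units


def penalized_load(proc, master_node, penalty):
    """Load used for ranking: processors on another node look busier."""
    if proc.node == master_node:
        return proc.load
    # Assumption: a multiplicative penalty; an additive one would also work.
    return proc.load * penalty


def choose_workers(procs, master_node, count, penalty=1.5):
    """Return the ranks of the `count` processors with the smallest penalized load."""
    ranked = sorted(
        procs,
        key=lambda p: (penalized_load(p, master_node, penalty), p.rank),
    )
    return [p.rank for p in ranked[:count]]


procs = [
    Processor(rank=0, node=0, load=10.0),
    Processor(rank=1, node=0, load=12.0),
    Processor(rank=2, node=1, load=9.0),
    Processor(rank=3, node=1, load=11.0),
]

# Without a penalty, the least-loaded processors win regardless of node.
plain = choose_workers(procs, master_node=0, count=2, penalty=1.0)
# With a penalty, slightly busier processors on the master's node are preferred.
biased = choose_workers(procs, master_node=0, count=2, penalty=1.5)
print(plain, biased)  # prints [2, 0] [0, 1]
```

With the penalty set to 1.0 the selection ignores node placement; raising it keeps communication-heavy work on the master's node unless the remote processors are much less loaded.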