David Cronk MPI-IO for EQM APPLICATIONS

MPI-I/O for EQM APPLICATIONS
David Cronk, Innovative Computing Lab, University of Tennessee
June 20, 2001

OUTLINE
- Introduction
  - What is parallel I/O?
  - Why do we need parallel I/O?
  - What is MPI-I/O?
- MPI-I/O
  - Derived data types and views
  - Data access: non-collective, collective, and split collective access
- Examples
  - LBMPI - Bob Maier (ARC)
  - CE-QUAL-ICM - Victor Parr (UT-Austin)

INTRODUCTION
What is parallel I/O?
- Multiple processes accessing a single file
- Often, both memory and file access are non-contiguous
  - Ghost cells cause non-contiguous data access
  - Block or cyclic distributions cause non-contiguous file access
- We want to access data and files with as few I/O calls as possible

NON-CONTIGUOUS ACCESS
(figure: local memory mapping non-contiguously into the file)

INTRODUCTION (cont)
Why use parallel I/O?
- Many users do not have time to learn the complexities of I/O optimization. Compare writing an array one element per record with writing it in a single direct-access record:

      INTEGER dim
      PARAMETER (dim=10000)
      INTEGER*4 out_array(dim)

      OPEN (fh, ...)
      WRITE (fh) (out_array(I), I=1,dim)

      rl = 4*dim
      OPEN (fh, ..., ACCESS='DIRECT', RECL=rl)
      WRITE (fh, REC=1) out_array

- Use of parallel I/O can simplify coding: a single read/write operation vs. multiple read/write operations
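To make the non-contiguity concrete, here is a minimal sketch (plain Python, not MPI; all names are hypothetical) of why a block distribution forces scattered file accesses: each process's block touches a separate run of elements in every row of the global array.

```python
# Illustrative sketch (not MPI): compute the (offset, length) runs, in elements,
# that one process touches when a 2-D row-major array stored in a single file
# is block-distributed across a process grid. Names are made up for illustration.

def file_runs(n_rows, n_cols, p_row, p_col, br, bc):
    """Contiguous (start, length) runs owned by the process holding
    block (p_row, p_col), where each block is br x bc elements."""
    runs = []
    for r in range(p_row * br, (p_row + 1) * br):
        start = r * n_cols + p_col * bc   # first element of this row's slice
        runs.append((start, bc))
    return runs

# A 4x4 array split into 2x2 blocks: process (0,1) owns columns 2..3 of rows 0..1,
# which land in two separate runs of the file -> non-contiguous access.
print(file_runs(4, 4, 0, 1, 2, 2))   # -> [(2, 2), (6, 2)]
```

With naive I/O, each run would be a separate read/write call; MPI-I/O lets one call describe all of them.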
- Parallel I/O potentially offers significant performance improvement over traditional approaches

INTRODUCTION (cont)
Traditional approaches:
- Each process writes to a separate file
  - Often requires an additional post-processing step
  - Without post-processing, restarts must use the same number of processors
- Results are sent to a master processor, which collects them and writes them out to disk
- Each processor calculates its position in the file and writes individually

What is MPI-I/O?
- MPI-I/O is a set of extensions to the original MPI standard
- It is an interface specification: it does NOT give implementation specifics
- It provides routines for file manipulation and data access
- Calls to MPI-I/O routines are portable across a large number of architectures

DERIVED DATATYPES & VIEWS
- Derived datatypes are not part of MPI-I/O, but they are used extensively in conjunction with it
- A filetype is really a datatype expressing the access pattern of a file
- Filetypes are used to set views

NON-CONTIGUOUS MEMORY ACCESS
MPI_TYPE_CREATE_SUBARRAY
- NDIMS - number of dimensions
- ARRAY_OF_SIZES - number of elements in each dimension of the full array
- ARRAY_OF_SUBSIZES - number of elements in each dimension of the sub-array
- ARRAY_OF_STARTS - starting position of the sub-array within the full array, in each dimension
- ORDER - MPI_ORDER_C or MPI_ORDER_FORTRAN
- OLDTYPE - datatype stored in the full array
- NEWTYPE - handle to the new datatype

(figure: a 102x102 local array with corners (0,0), (0,101), (101,0), (101,101); the interior (1,1) through (100,100) holds real data and the one-element border holds ghost cells)

      INTEGER sizes(2), subsizes(2), starts(2), dtype, ierr
      sizes(1) = 102
      sizes(2) = 102
      subsizes(1) = 100
      subsizes(2) = 100
      starts(1) = 1
      starts(2) = 1
      CALL MPI_TYPE_CREATE_SUBARRAY(2, sizes, subsizes, starts,
     &     MPI_ORDER_FORTRAN, MPI_REAL8, dtype, ierr)

NON-CONTIGUOUS FILE ACCESS
MPI_FILE_SET_VIEW(FH, DISP, ETYPE, FILETYPE, DATAREP, INFO, IERROR)
- The file has holes in it from the processor's perspective
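The subarray description above can be sketched outside MPI: the following illustrative Python (hypothetical helper, not an MPI call) enumerates the flattened column-major offsets that a subarray type selects, mirroring the 102x102 array with a 100x100 interior. The ghost-cell border is skipped without any copying.

```python
# Illustrative sketch (not MPI): enumerate the flattened element offsets that an
# MPI_TYPE_CREATE_SUBARRAY-style description selects from a Fortran-order
# (column-major) array. 2-D case only, for clarity.

def subarray_offsets(sizes, subsizes, starts):
    """Column-major offsets of the sub-array inside the full array."""
    offsets = []
    for j in range(starts[1], starts[1] + subsizes[1]):      # second dim varies slowest
        for i in range(starts[0], starts[0] + subsizes[0]):  # first dim is contiguous
            offsets.append(i + j * sizes[0])
    return offsets

# The deck's ghost-cell example: 102x102 full array, 100x100 interior at (1,1).
offs = subarray_offsets((102, 102), (100, 100), (1, 1))
print(len(offs), offs[0])   # 10000 interior elements; the first sits at 1 + 1*102 = 103
```

A single datatype describing all 10000 offsets is what lets one MPI-I/O call replace 100 strided writes.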
- Example: multi-dimensional array access - use MPI_TYPE_CREATE_SUBARRAY()

DISTRIBUTED ARRAY ACCESS
(figure: a 200x200 global array with corners (0,0), (0,199), (199,0), (199,199), block-distributed across four processes)

      sizes(1) = 200
      sizes(2) = 200
      subsizes(1) = 100
      subsizes(2) = 100
      starts(1) = 0
      starts(2) = 0
      CALL MPI_TYPE_CREATE_SUBARRAY(2, sizes, subsizes, starts,
     &     MPI_ORDER_FORTRAN, MPI_INTEGER, filetype, ierr)
      CALL MPI_TYPE_COMMIT(filetype, ierr)
      CALL MPI_FILE_SET_VIEW(fh, 0, MPI_INTEGER, filetype,
     &     'native', MPI_INFO_NULL, ierr)

NON-CONTIGUOUS FILE ACCESS (cont)
- A multi-dimensional array distributed with a block distribution
- Irregularly distributed arrays

IRREGULARLY DISTRIBUTED ARRAYS
MPI_TYPE_CREATE_INDEXED_BLOCK
- COUNT - number of blocks
- LENGTH - elements per block
- MAP - array of displacements
- OLD - old datatype
- NEW - new datatype

(figure: MAP_ARRAY = 0 1 2 4 7 11 12 15 20 22, the displacements of this process's elements)

      CALL MPI_TYPE_CREATE_INDEXED_BLOCK(10, 1, map_array,
     &     MPI_INTEGER, filetype, ierr)
      CALL MPI_TYPE_COMMIT(filetype, ierr)
      disp = 0
      CALL MPI_FILE_SET_VIEW(fh, disp, MPI_INTEGER, filetype,
     &     'native', MPI_INFO_NULL, ierr)

DATA ACCESS
- Positioning: explicit offsets, individual file pointers, shared file pointers
- Synchronism: blocking, non-blocking
- Coordination: non-collective, collective

COLLECTIVE I/O
(figure: the memory layout on 4 processors is gathered into an MPI temporary memory buffer before being written contiguously)

EXAMPLE #1
- Bob Maier - ARC
- Production-level Fortran code, challenge problem
- Every X iterations, write a restart file
- At conclusion, write an output file
- On an SP with 512 processors: 12 hrs computation, 12 hrs I/O
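The indexed-block description can also be expanded by hand. The following sketch (plain Python, not MPI; element size assumed to be 4-byte integers) turns a COUNT/LENGTH/MAP description into the byte displacements it selects, using the slide's map array.

```python
# Illustrative sketch (not MPI): MPI_TYPE_CREATE_INDEXED_BLOCK describes COUNT
# blocks of LENGTH elements each, placed at the element displacements in MAP.
# Expand such a description into byte offsets for a given element size.

def indexed_block_bytes(blocklength, disps, elem_size):
    """Byte offset of every element selected by an indexed-block description."""
    out = []
    for d in disps:
        for k in range(blocklength):
            out.append((d + k) * elem_size)
    return out

map_array = [0, 1, 2, 4, 7, 11, 12, 15, 20, 22]   # the slide's MAP_ARRAY
print(indexed_block_bytes(1, map_array, 4))
# -> [0, 4, 8, 16, 28, 44, 48, 60, 80, 88]
```

Once this pattern is registered as a filetype and installed with MPI_FILE_SET_VIEW, the "holes" between those offsets are skipped automatically on every read and write.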
EXAMPLE #1 (cont)
- Conceptually, four 3-dimensional arrays
- Implemented with a single 4-dimensional array (improved cache-hit ratio)
- Uses ghost cells
- Writes out to 4 separate files
- Block-block data distribution
- Memory access is completely non-contiguous

EXAMPLE #1 - SOLUTION

      set up array with size of file
      set up array with subsize of file
      set up array with size of local arrays
      set up array with subsize of memory
      set up array with starting positions in file
      set up array with starting positions in memory
      disp = 0
      call mpi_type_create_subarray(3, file_sizes, file_subsizes,
     &     file_starts, MPI_ORDER_FORTRAN, MPI_REAL8, filetype, ierr)
      call mpi_type_commit(filetype, ierr)
      do vars = 1, 4
         mem_starts(1) = vars - 1
         call mpi_type_create_subarray(4, mem_sizes, mem_subsizes,
     &        mem_starts, MPI_ORDER_FORTRAN, MPI_REAL8, mem_type, ierr)
         call mpi_type_commit(mem_type, ierr)
         call mpi_file_open(...)
         call mpi_file_set_view(fh, disp, MPI_REAL8, filetype,
     &        'native', ...)
         call mpi_file_write_all(fh, Z, 1, mem_type, ...)
         call mpi_file_close(fh, ierr)
      enddo

LBMPI - PERFORMANCE
(performance table: the original code took 5204 seconds; the MPI-I/O timings from the chart are not recoverable)

EXAMPLE #2
- Victor Parr - UT-Austin
- Production-level Fortran code performing EPA simulations (the CE-QUAL-ICM message-passing code)
- A typical production run performs a 10-year simulation, dumping output for every simulation month
- Irregular grid and irregular data distribution
- High ratio of ghost cells

EXAMPLE #2 (cont)
(figure: output file layout, beginning with a header)

EXAMPLE #2 - CURRENT
- Each processor writes all output (including ghost cells) to a process-specific file
- A post-processor reads in the process-specific files, determines whether each value is from a resident cell, places resident values in the appropriate position in a global output array, and writes the global array to a global output file

EXAMPLE #2 - SOLUTION
(figure: local cell numbers 1 2 4 7 9 10 11 14 20 24 and their file displacements 0 1 3 6 8 9 10 13 19 23; local values 32 63 7 21 44 2 77 31 55 19 are reordered into file order 2 7 19 21 31 32 44 55 63 77, and Mem_map is permuted to match)

DONE ONCE:
      create mem_map
      create file_map
      sort file_map
      permute mem_map to match file_map
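The per-variable memory subarray in the loop above relies on the variable index being the first (fastest-varying) Fortran dimension, so one variable's values are strided through memory every NVARS elements. A minimal sketch of that stride arithmetic, with made-up sizes (this is illustration, not the production code):

```python
# Illustrative sketch (not MPI): with four conceptual 3-D arrays packed into one
# Fortran 4-D array dimensioned (nvars, nx, ...), variable v occupies every
# nvars-th memory slot starting at v-1. A subarray type with subsize 1 in the
# first dimension selects exactly these slots, with no packing loop.

def variable_offsets(nvars, nx, v):
    """Flat column-major offsets of variable v (1-based) in an (nvars, nx) array."""
    return [(v - 1) + k * nvars for k in range(nx)]

# 4 variables over 5 grid points: variable 2 sits at every 4th slot from offset 1.
print(variable_offsets(4, 5, 2))   # -> [1, 5, 9, 13, 17]
```

Each loop iteration commits one such memory type and writes it against the same 3-D file view, so each variable's file stays contiguous even though memory is not.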
      call mpi_type_create_indexed_block(num, 1, mem_map,
     &     MPI_DOUBLE_PRECISION, memtype, ierr)
      call mpi_type_commit(memtype, ierr)
      call mpi_type_create_indexed_block(num, 1, file_map,
     &     MPI_DOUBLE_PRECISION, filetype, ierr)
      call mpi_type_commit(filetype, ierr)
      disp = size of initial header in bytes

DONE FOR EACH OUTPUT:
      call mpi_file_set_view(fh, disp, MPI_DOUBLE_PRECISION,
     &     filetype, 'native', MPI_INFO_NULL, ierr)
      call mpi_file_write_all(fh, buf, 1, memtype, status, ierr)
      disp = disp + total number of bytes written by all processes

CONCLUSIONS
- MPI-I/O potentially offers significant improvement in I/O performance
- This improvement can be attained with minimal effort on the part of the user
- Simpler programming with fewer calls to I/O routines
- Easier program maintenance due to a simple API
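The one-time sort-and-permute step can be sketched as follows (plain Python, not MPI; the cell numbers are made up for illustration). Sorting the file displacements keeps the file view monotonically increasing, as MPI requires of a filetype, while permuting mem_map the same way keeps each memory slot paired with its file slot.

```python
# Illustrative sketch (not MPI) of Example #2's one-time setup: each process
# owns an irregular set of global cells; build a sorted file_map and a
# matching permuted mem_map.

def build_maps(cells):
    """cells[i] = global cell number (1-based) held in local memory slot i.
    Returns (file_map, mem_map): file_map sorted 0-based file displacements,
    mem_map the memory slot feeding each file slot."""
    order = sorted(range(len(cells)), key=lambda i: cells[i])
    file_map = [cells[i] - 1 for i in order]   # 0-based, monotonically increasing
    mem_map = order                            # permutation applied to memory slots
    return file_map, mem_map

cells = [7, 2, 11, 4]            # local slots 0..3 hold these global cells
print(build_maps(cells))         # -> ([1, 3, 6, 10], [1, 3, 0, 2])
```

With memtype built from mem_map and filetype from file_map, a single MPI_FILE_WRITE_ALL per output dump replaces the per-process files and the entire post-processing pass.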