MPI-I/O for EQM APPLICATIONS
David Cronk
Innovative Computing Lab, University of Tennessee
June 20, 2001

OUTLINE
- Introduction
  - What is parallel I/O
  - Why do we need parallel I/O
  - What is MPI-I/O
- MPI-I/O
  - Derived datatypes and views
  - Data access
    - Non-collective access
    - Collective access
    - Split collective access
- Examples
  - LBMPI - Bob Maier (ARC)
  - CE-QUAL-ICM - Victor Parr (UT-Austin)

INTRODUCTION
What is parallel I/O?
- Multiple processes accessing a single file
- Often, both memory and file access are non-contiguous
  - Ghost cells cause non-contiguous memory access
  - Block or cyclic distributions cause non-contiguous file access
- Want to access data and files with as few I/O calls as possible

[Figure: non-contiguous transfer between local memory and a shared file]

INTRODUCTION (cont)
Why use parallel I/O?
- Many users do not have time to learn the complexities of I/O optimization:

      integer dim
      parameter (dim=10000)
      integer*4 out_array(dim)

      ! unformatted sequential write, element list via implied DO
      OPEN (fh, ...)
      WRITE (fh) (out_array(i), i = 1, dim)

      ! unformatted direct-access write of the whole array as one record
      rl = 4*dim
      OPEN (fh, ..., ACCESS='DIRECT', RECL=rl)
      WRITE (fh, REC=1) out_array

- Use of parallel I/O can simplify coding
  - Single read/write operation vs. multiple read/write operations
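The "single vs. multiple operations" point can be illustrated outside Fortran. Below is a minimal Python sketch (file handling and data are invented for illustration) showing that a loop of per-element writes and one record-sized write of rl = 4*dim bytes produce identical files, while issuing very different numbers of I/O calls:

```python
import os
import struct
import tempfile

dim = 10000
out_array = list(range(dim))          # stand-in for integer*4 out_array(dim)

# Approach 1: one write call per element (many small I/O operations)
f1 = tempfile.NamedTemporaryFile(delete=False)
for v in out_array:
    f1.write(struct.pack("<i", v))    # one 4-byte integer per call
f1.close()

# Approach 2: one record-sized write (RECL = 4*dim bytes, a single call)
rl = 4 * dim
record = struct.pack(f"<{dim}i", *out_array)
f2 = tempfile.NamedTemporaryFile(delete=False)
f2.write(record)
f2.close()

# Both files hold identical bytes; only the call count differs.
data1 = open(f1.name, "rb").read()
data2 = open(f2.name, "rb").read()
assert len(record) == rl
assert data1 == data2
os.unlink(f1.name)
os.unlink(f2.name)
```

The same trade-off drives MPI-I/O's design: describe the whole transfer once with a datatype, then move it in a single call.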
- Parallel I/O potentially offers significant performance improvement over traditional approaches

INTRODUCTION (cont)
Traditional approaches:
- Each process writes to a separate file
  - Often requires an additional post-processing step
  - Without post-processing, restarts must use the same number of processors
- Results are sent to a master processor, which collects them and writes them out to disk
- Each processor calculates its position in a shared file and writes individually

INTRODUCTION (cont)
What is MPI-I/O?
- MPI-I/O is a set of extensions to the original MPI standard
- It is an interface specification: it does NOT give implementation specifics
- It provides routines for file manipulation and data access
- Calls to MPI-I/O routines are portable across a large number of architectures

DERIVED DATATYPES & VIEWS
- Derived datatypes are not part of MPI-I/O, but they are used extensively in conjunction with it
- A filetype is really a datatype expressing the access pattern of a file
- Filetypes are used to set file views

Non-contiguous memory access: MPI_TYPE_CREATE_SUBARRAY
- NDIMS - number of dimensions
- ARRAY_OF_SIZES - number of elements in each dimension of the full array
- ARRAY_OF_SUBSIZES - number of elements in each dimension of the sub-array
- ARRAY_OF_STARTS - starting position of the sub-array within the full array, in each dimension
- ORDER - MPI_ORDER_C or MPI_ORDER_FORTRAN
- OLDTYPE - datatype of the elements stored in the full array
- NEWTYPE - handle to the new datatype

NONCONTIGUOUS MEMORY ACCESS
[Figure: a 102x102 array, corner indices (0,0) to (101,101), whose 100x100 interior, indices (1,1) to (100,100), is the sub-array; the outer ring holds ghost cells]

      INTEGER sizes(2), subsizes(2), starts(2), dtype, ierr
      sizes(1) = 102
      sizes(2) = 102
      subsizes(1) = 100
      subsizes(2) = 100
      starts(1) = 1
      starts(2) = 1
      CALL MPI_TYPE_CREATE_SUBARRAY(2, sizes, subsizes, starts,
     &     MPI_ORDER_FORTRAN, MPI_REAL8, dtype, ierr)

NONCONTIGUOUS FILE ACCESS

      MPI_FILE_SET_VIEW(FH, DISP, ETYPE, FILETYPE, DATAREP, INFO, IERROR)

- The file can appear to have holes in it from each processor's perspective
- Multi-dimensional array access: MPI_TYPE_CREATE_SUBARRAY()
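To make the subarray pattern concrete, here is a plain-Python sketch (not MPI code; the helper name is my own) that computes the flat, column-major element offsets occupied by the 100x100 interior of the 102x102 array above — the access pattern that MPI_TYPE_CREATE_SUBARRAY with MPI_ORDER_FORTRAN describes:

```python
def subarray_offsets(sizes, subsizes, starts):
    """Flat element offsets (Fortran column-major order) of a 2-D
    sub-array within a full array, as MPI_TYPE_CREATE_SUBARRAY with
    MPI_ORDER_FORTRAN would lay them out."""
    n1, n2 = sizes          # full-array extents
    s1, s2 = subsizes       # sub-array extents
    o1, o2 = starts         # zero-based starting coordinates
    offsets = []
    for j in range(o2, o2 + s2):        # columns vary slowest
        for i in range(o1, o1 + s1):    # rows vary fastest
            offsets.append(j * n1 + i)
    return offsets

# The slide's example: a 102x102 array with a 100x100 interior at (1,1)
offs = subarray_offsets((102, 102), (100, 100), (1, 1))
assert len(offs) == 100 * 100
assert offs[0] == 1 * 102 + 1       # first interior element
assert offs[1] - offs[0] == 1       # contiguous within a column
assert offs[100] - offs[99] == 3    # skips the 2 ghost cells at a column boundary
```

The same arithmetic, applied to a file rather than memory, is what gives the file view its "holes".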
Distributed array access
[Figure: a 200x200 global array, corner indices (0,0), (0,199), (199,0), (199,199), split in 100x100 blocks across four processors]

      sizes(1) = 200
      sizes(2) = 200
      subsizes(1) = 100
      subsizes(2) = 100
      starts(1) = 0
      starts(2) = 0
      CALL MPI_TYPE_CREATE_SUBARRAY(2, SIZES, SUBSIZES, STARTS,
     &     MPI_ORDER_FORTRAN, MPI_INTEGER, FILETYPE, IERR)
      CALL MPI_TYPE_COMMIT(FILETYPE, IERR)
      CALL MPI_FILE_SET_VIEW(FH, 0, MPI_INTEGER, FILETYPE,
     &     'native', MPI_INFO_NULL, IERR)

Irregularly distributed arrays: MPI_TYPE_CREATE_INDEXED_BLOCK
- COUNT - number of blocks
- LENGTH - number of elements per block
- MAP - array of displacements
- OLD - old datatype
- NEW - new datatype

[Figure: MAP_ARRAY = 0 1 2 4 7 11 12 15 20 22, the file displacement of each local element]

      CALL MPI_TYPE_CREATE_INDEXED_BLOCK(10, 1, MAP,
     &     MPI_INTEGER, FILETYPE, IERR)
      CALL MPI_TYPE_COMMIT(FILETYPE, IERR)
      DISP = 0
      CALL MPI_FILE_SET_VIEW(FH, DISP, MPI_INTEGER, FILETYPE,
     &     'native', MPI_INFO_NULL, IERR)

DATA ACCESS
- Positioning: explicit offsets, individual file pointers, shared file pointers
- Synchronism: blocking, non-blocking
- Coordination: non-collective, collective

COLLECTIVE I/O
[Figure: memory layout on 4 processors, gathered through an MPI temporary memory buffer before being written to the file]

EXAMPLE #1
- Bob Maier - ARC
- Production-level Fortran code
- Challenge problem
- Every X iterations, writes a restart file
- At conclusion, writes an output file
- On an SP with 512 processors: 12 hrs computation, 12 hrs I/O
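For the block-distributed array shown earlier, each rank's STARTS array follows directly from its coordinates in the process grid. A small Python sketch (the helper name and the column-major rank ordering are my assumptions, and the array is assumed to divide evenly):

```python
def block_starts(global_sizes, proc_grid, rank):
    """Zero-based starts of this rank's block in a 2-D block-block
    distribution, assuming column-major rank ordering and even division."""
    n1, n2 = global_sizes
    p1, p2 = proc_grid
    assert n1 % p1 == 0 and n2 % p2 == 0
    coord1, coord2 = rank % p1, rank // p1   # rank -> grid coordinates
    return (coord1 * (n1 // p1), coord2 * (n2 // p2))

# The slides' 200x200 array on a 2x2 process grid:
assert block_starts((200, 200), (2, 2), 0) == (0, 0)       # the slide's rank-0 case
assert block_starts((200, 200), (2, 2), 1) == (100, 0)
assert block_starts((200, 200), (2, 2), 2) == (0, 100)
assert block_starts((200, 200), (2, 2), 3) == (100, 100)
```

Each rank feeds its own starts into MPI_TYPE_CREATE_SUBARRAY, so all four ranks share one collective write while touching disjoint regions of the file.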
EXAMPLE #1 (cont)
- Conceptually, four 3-dimensional arrays
- Implemented with a single 4-dimensional array
  - Improved cache-hit ratio
- Uses ghost cells
- Writes out to 4 separate files
- Block-block data distribution
- Memory access is completely non-contiguous

EXAMPLE #1 - Solution

      ! set up arrays with the sizes, subsizes, and starting positions
      ! of the global array in the file (file_sizes, file_subsizes,
      ! file_starts) and of the local arrays in memory (mem_sizes,
      ! mem_subsizes, mem_starts)
      disp = 0
      call mpi_type_create_subarray(3, file_sizes, file_subsizes,
     &     file_starts, MPI_ORDER_FORTRAN, MPI_REAL8, filetype, ierr)
      call mpi_type_commit(filetype, ierr)
      do vars = 1, 4
         mem_starts(1) = vars - 1
         call mpi_type_create_subarray(4, mem_sizes, mem_subsizes,
     &        mem_starts, MPI_ORDER_FORTRAN, MPI_REAL8, mem_type, ierr)
         call mpi_type_commit(mem_type, ierr)
         call mpi_file_open(...)
         call mpi_file_set_view(fh, disp, MPI_REAL8, filetype,
     &        'native', MPI_INFO_NULL, ierr)
         call mpi_file_write_all(fh, Z, 1, mem_type, status, ierr)
         call mpi_file_close(fh, ierr)
      enddo

LBMPI - PERFORMANCE
[Performance charts; original run: 5204 seconds]

EXAMPLE #2
- Victor Parr - UT-Austin
- Production-level Fortran code performing EPA simulations (CE-QUAL-ICM message-passing code)
- Typical production run performs a 10-year simulation, dumping output for every simulation month
- Irregular grid and irregular data distribution
- High ratio of ghost cells

EXAMPLE #2 (cont)
[Figure: layout of the global output file, beginning with a header]

EXAMPLE #2 - CURRENT
- Each processor writes all output (including ghost cells) to a process-specific file
- A post-processor reads in the process-specific files:
  - determines whether each value is from a resident cell
  - places resident values in the appropriate position in a global output array
  - writes out the global array to a global output file

EXAMPLE #2 - SOLUTION
[Figure: cell numbers 1 2 4 7 9 10 11 14 20 24 with file displacements 0 1 3 6 8 9 10 13 19 23; local values 32 63 7 21 44 2 77 31 55 19 reordered to file order 2 7 19 21 31 32 44 55 63 77; Mem_map = 9 3 23 6 13 0 8 19 1 10]

DONE ONCE:
- create mem_map
- create file_map
- sort file_map
- permute mem_map to match file_map

      call mpi_type_create_indexed_block(num, 1, mem_map,
     &     MPI_DOUBLE_PRECISION, memtype, ierr)
      call mpi_type_commit(memtype, ierr)
      call mpi_type_create_indexed_block(num, 1, file_map,
     &     MPI_DOUBLE_PRECISION, filetype, ierr)
      call mpi_type_commit(filetype, ierr)
      disp = size of initial header in bytes

DONE FOR EACH OUTPUT:

      call mpi_file_set_view(fh, disp, MPI_DOUBLE_PRECISION, filetype,
     &     'native', MPI_INFO_NULL, ierr)
      call mpi_file_write_all(fh, buf, 1, memtype, status, ierr)
      disp = disp + total number of bytes written by all processes

CONCLUSIONS
- MPI-I/O potentially offers significant improvement in I/O performance
- This improvement can be attained with minimal effort on the part of the user
- Simpler programming with fewer calls to I/O routines
- Easier program maintenance due to a simple API
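The "sort the file map, permute the memory map to match" step from Example #2 can be sketched in plain Python (the maps below are invented for illustration, not the slide's numbers). The sort matters because an MPI file view's filetype must list displacements in monotonically nondecreasing order:

```python
# Invented example: each resident cell has a displacement in local
# memory (mem_map) and a displacement in the global file (file_map).
mem_map = [4, 0, 7, 2, 5]      # where each cell lives locally (hypothetical)
file_map = [13, 2, 40, 8, 21]  # where each cell goes in the file (hypothetical)

# Sort by file displacement, carrying the memory displacements along,
# so both indexed-block datatypes list cells in ascending file order.
pairs = sorted(zip(file_map, mem_map))
file_map_sorted = [f for f, _ in pairs]
mem_map_permuted = [m for _, m in pairs]

assert file_map_sorted == [2, 8, 13, 21, 40]
assert mem_map_permuted == [0, 2, 4, 5, 7]
```

After this one-time permutation, each output dump is a single collective write, with no post-processing pass.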