JAHPF  http://www.tokyo.rist.or.jp/jahpf/ 

JAHPF:
Japan Association for High Performance Fortran

JAHPF (Japan Association for High Performance Fortran) is a coalition of about 20 compiler specialists and about 25 high performance computing users in Japan, formed to promote High Performance Fortran (HPF). We believe that HPF is the only solution that provides a clear and easily understood programming interface for massively parallel or distributed-memory computing systems, and that it can make these systems more familiar research tools for most HPC users. However, HPF is still maturing, and several hurdles remain before it is really usable.

JAHPF started in 1996, and its major goals are:
(1) Proposing a set of HPF extensions (HPF/JA) to improve the performance and applicability of High Performance Fortran,
(2) Accumulating experience in parallelizing with HPF, and feeding it back into the HPF specification and compiler implementations,
(3) Developing HPF benchmark programs, and
(4) Preparing documents for user-friendly parallel programming.

This June, we published a Japanese translation of the HPF 2.0 specification through Springer-Verlag Tokyo (ISBN 4-431-70822-7). It includes the newly designed HPF extensions, HPF/JA 1.0, as an appendix. The English version of the HPF/JA specification is available at http://www.tokyo.rist.or.jp/jahpf/spec/jahpf-e.html.
The extensions are designed to give users more control over sophisticated parallelizations and communication optimizations. They include parallelization of loops with complicated reductions, asynchronous communication, user-controllable SHADOW, and communication pattern reuse for irregular remote data accesses.
JAHPF has also put considerable effort into evaluating HPF's applicability to real-world HPC applications. Five applications have been successfully parallelized so far through close cooperation between user and vendor members, and four more are in progress. Compilers capable of handling the HPF/JA extensions are now under development by the vendor members and will be available next year.

Benchmarking
We have so far developed and tested four benchmark programs drawn from real-world scientific computation. These programs have been trimmed to be relatively compact (a few thousand lines on average) but still contain the core part of the calculation.
(1) CARPARRI: Molecular dynamics (Car-Parrinello)
(2) ESPAC2: 2D electrostatic plasma (PIC method)
(3) IMPACT3D: 3D CFD code with TVD scheme
(4) CIP2D: 2D CFD code with CIP method
We have also developed an experimental HPF code, called NJR, for analyzing global climate change on the high performance computer "Earth Simulator." We have evaluated the performance of this code on NEC Cenju-4, Hitachi SR2201 and Fujitsu VPP5000.


Overview of HPF/JA 1.0 Specification

Reduction kind
Specifying the reduction kind in HPF/JA extends the range of loops that can be parallelized compared with the original HPF. In HPF/JA, reduction variables may be referred to in any context, even in procedure calls.
Example: HPF/JA enables the following DO loop to be parallelized (HPF 2.0 does not).
            z=0.0
      !HPFJ INDEPENDENT,REDUCTION(MAX:z)
            DO i=1,n
              IF(z<A(i)) z=A(i)
            END DO
The reduction kind can be one of the following:
1) + * .AND. .OR. .EQV. .NEQV.
2) MAX MIN IAND IOR IEOR
3) FIRSTMAX LASTMAX FIRSTMIN LASTMIN
Keywords in the third group can be used to get results similar to MAXLOC and MINLOC in Fortran90.
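
The effect of a reduction kind can be modeled in serial code. The following Python sketch is only an illustration of the semantics, not HPF/JA itself: each simulated processor reduces its own block, and the partial results are then combined with the same operator. The helper names (split_blocks, parallel_max, parallel_firstmax) are hypothetical, and the tie-breaking rule for the FIRSTMAX-style reduction (the smallest index wins) is an assumption based on the analogy with MAXLOC.

```python
from functools import reduce


def split_blocks(n, nprocs):
    """Return (start, stop) ranges of a BLOCK distribution of n elements."""
    size = -(-n // nprocs)  # ceiling division
    return [(p * size, min((p + 1) * size, n)) for p in range(nprocs)]


def parallel_max(a, nprocs=4, initial=0.0):
    """Model of REDUCTION(MAX:z) with z initialized to `initial`."""
    # Each "processor" reduces its own block independently.
    partials = [max(a[lo:hi], default=initial)
                for lo, hi in split_blocks(len(a), nprocs)]
    # The partial results are combined with the same operator.
    return max([initial] + partials)


def parallel_firstmax(a, nprocs=4):
    """Model of a FIRSTMAX-style reduction on a non-empty list.

    Returns (value, index) of the first maximum.
    Assumption: ties are resolved toward the smallest index.
    """
    def local(lo, hi):
        best = lo
        for i in range(lo + 1, hi):
            if a[i] > a[best]:  # strict '>' keeps the first occurrence
                best = i
        return (a[best], best)

    partials = [local(lo, hi)
                for lo, hi in split_blocks(len(a), nprocs) if lo < hi]
    # Blocks are visited in index order, so a strict comparison
    # keeps the earlier block when the values are equal.
    return reduce(lambda x, y: y if y[0] > x[0] else x, partials)
```

For example, parallel_firstmax([3.0, 7.0, 1.0, 7.0, 2.0]) yields the value 7.0 together with index 1 (zero-based), regardless of how many simulated processors are used.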
 

Asynchronous communication
The ASYNCHRONOUS directive specifies that communication may be overlapped with computation, in a way similar to asynchronous I/O in the HPF 2.0 approved extensions.

Example: The communication overhead of redistribution is hidden.
      !HPFJ ASYNCHRONOUS(ID=q1) BEGIN
      !HPF$ REDISTRIBUTE A(*,CYCLIC)
      !HPFJ END ASYNCHRONOUS
                 ... computation that hides the communication ...
      !HPFJ ASYNCWAIT(ID=q1)
The ASYNCHRONOUS block can include: array assignment statements, FORALL statements and REDISTRIBUTE, REALIGN and REFLECT (shown below) directives.
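
The start/compute/wait ordering can be sketched in Python with a background task. This is only a model of the control flow: the function names (redistribute_cyclic, overlapped_step) are hypothetical, the "redistribution" is a plain list shuffle rather than real interprocessor communication, and a Python thread is merely a stand-in for the runtime's asynchronous transfer.

```python
from concurrent.futures import ThreadPoolExecutor


def redistribute_cyclic(columns, nprocs):
    """Stand-in for a CYCLIC redistribution: column j goes to processor j mod nprocs."""
    layout = [[] for _ in range(nprocs)]
    for j, col in enumerate(columns):
        layout[j % nprocs].append(col)
    return layout


def overlapped_step(columns, nprocs, work):
    """Start the redistribution, run unrelated work, then wait for completion."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Corresponds to the ASYNCHRONOUS ... END ASYNCHRONOUS block.
        pending = pool.submit(redistribute_cyclic, columns, nprocs)
        # Computation that must not touch the array being redistributed.
        result = work()
        # Corresponds to ASYNCWAIT: block until the transfer has finished.
        layout = pending.result()
    return layout, result
```

The redistributed array should be used only after the wait, which is why overlapped_step returns the new layout only once pending.result() has completed.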

Explicit access to SHADOW and local data
SHADOW is a set of additional array elements that extends the array section on each processor. Compilers use it as a buffer for the corresponding elements on the neighboring processors, but in the original HPF users cannot access it. HPF/JA allows users to access the shadow explicitly.
The REFLECT directive specifies that each value of the shadow data should be updated with the value of the corresponding data object stored in its owner. (Fig.1)
When the EXT_HOME clause is specified with the ON directive instead of the HOME clause, the set of owner processors is extended with the processors that own the SHADOW areas. The LOCAL clause then guarantees access without interprocessor communication. (Fig.2)
The LOCAL directive/clause eliminates redundant communication to the shadow and other data objects. While the RESIDENT directive/clause (in the original HPF) does not rule out communication inside the active processor set, the LOCAL directive/clause asserts that no communication is needed at all.
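
The interplay of shadow areas, REFLECT and communication-free local access can be modeled in a few lines of Python. The sketch assumes a one-dimensional array, a BLOCK distribution whose block size divides the array length, and a shadow width of one element on each side; the function names are hypothetical and no real communication takes place.

```python
def distribute_with_shadow(a, nprocs):
    """BLOCK-distribute `a`; each piece gets one shadow cell on each side.

    Assumption: len(a) is a multiple of nprocs. Shadow cells start as None.
    """
    size = len(a) // nprocs
    return [[None] + a[p * size:(p + 1) * size] + [None]
            for p in range(nprocs)]


def reflect(pieces):
    """Model of REFLECT: refresh every shadow cell from the owning neighbor."""
    for p, piece in enumerate(pieces):
        if p > 0:
            piece[0] = pieces[p - 1][-2]   # left neighbor's last owned element
        if p < len(pieces) - 1:
            piece[-1] = pieces[p + 1][1]   # right neighbor's first owned element


def local_stencil(piece):
    """Compute b(i) = a(i-1) + a(i+1) using only data held in `piece`.

    After reflect() this needs no data from other processors, which is
    what a LOCAL assertion would state. Elements at the global boundary
    (shadow still None) are copied unchanged.
    """
    out = []
    for k in range(1, len(piece) - 1):
        left, right = piece[k - 1], piece[k + 1]
        out.append(piece[k] if left is None or right is None
                   else left + right)
    return out
```

With a = [1, 2, ..., 8] on four simulated processors, calling reflect once and then local_stencil on every piece reproduces the serial stencil result [1, 4, 6, 8, 10, 12, 14, 8].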

Communication schedule reuse for irregular array accesses
The INDEX_REUSE directive asserts that access patterns for indirectly accessed arrays remain the same through consecutive invocations of loops. The directive can be placed before independent DO or FORALL loops to assert that indices of the specified array have the same values as those for the previous loop invocation.
Example: The access pattern for the specified arrays is recorded on the first invocation and may be reused later.
      !HPFJ INDEX_REUSE(frag) A,B
      !HPF$ INDEPENDENT
            DO i=1,n
              . . . =A(lx(i))+B(ly(i))
            END DO
If the directive is specified and the compiler knows that the data and computation mappings remain the same across consecutive loop invocations, it can reuse the communication schedule built for the first invocation.
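
Schedule reuse can be illustrated with a small cache. In this Python sketch, a "schedule" is simply the grouping of requested global indices by owning processor under an assumed BLOCK distribution. The class ScheduleCache, its `reuse` argument and the `builds` counter are hypothetical names introduced for illustration; they are not part of HPF/JA.

```python
class ScheduleCache:
    """Caches one communication schedule per indirectly accessed array."""

    def __init__(self, block):
        self.block = block      # elements owned by each processor (assumed BLOCK)
        self.schedules = {}     # array name -> {owner: [(position, global index)]}
        self.builds = 0         # how many times a schedule was constructed

    def _build(self, index):
        """Group the requested global indices by owning processor."""
        self.builds += 1
        schedule = {}
        for pos, g in enumerate(index):
            schedule.setdefault(g // self.block, []).append((pos, g))
        return schedule

    def gather(self, name, array, index, reuse):
        """Fetch array[index[i]] for all i, rebuilding the schedule only if needed.

        `reuse` plays the role of the user's assertion that the index
        values are unchanged since the previous invocation.
        """
        if not (reuse and name in self.schedules):
            self.schedules[name] = self._build(index)
        out = [None] * len(index)
        for owner, items in self.schedules[name].items():
            for pos, g in items:    # a real runtime would send one message per owner
                out[pos] = array[g]
        return out
```

Calling gather twice for the same array, with reuse asserted on the second call, builds the schedule only once. If the index array has actually changed, the assertion is wrong and the stale schedule returns wrong elements, which is why the responsibility lies with the user.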


last updated November 19, 1999
RIST