MPI code debugging and verification Tool MARMOT

 


 

The example program

            cg-tutorial-marmot-exercise.c

is a simple program in which every process computes the sum of all ranks. However, we deliberately introduced some mistakes to trigger some of MARMOT's warnings.
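The exercise source itself is not reproduced here, so the following is only a minimal sketch of how "every process computes the sum of all ranks" can work. It assumes a ring scheme, which the `right` neighbour in the MPI_Issend call of the exercise suggests but does not confirm. To keep it runnable without an MPI installation, the processes are simulated with plain arrays; the function name `ring_sum` is hypothetical and not part of the exercise.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simulates `size` processes arranged in a ring, without using MPI.
 * Every process starts with its own rank in its send buffer. In each
 * step it passes the buffer to its right neighbour and adds the value
 * arriving from its left neighbour to its local sum. After `size`
 * steps, sums[r] holds the sum of all ranks for every process r.
 * Returns 0 on success, -1 on invalid input or allocation failure. */
int ring_sum(int size, int *sums)
{
    if (size <= 0 || sums == NULL)
        return -1;

    int *send = malloc((size_t)size * sizeof *send);
    int *recv = malloc((size_t)size * sizeof *recv);
    if (send == NULL || recv == NULL) {
        free(send);
        free(recv);
        return -1;
    }

    for (int r = 0; r < size; r++) {
        send[r] = r;   /* first message is the process's own rank */
        sums[r] = 0;
    }

    for (int step = 0; step < size; step++) {
        /* "Communication": each process sends to its right neighbour. */
        for (int r = 0; r < size; r++) {
            int right = (r + 1) % size;
            recv[right] = send[r];
        }
        /* Each process accumulates and forwards the received value. */
        for (int r = 0; r < size; r++) {
            sums[r] += recv[r];
            send[r] = recv[r];
        }
    }

    free(send);
    free(recv);
    return 0;
}
```

Calling `ring_sum(4, sums)` with an array of four integers fills every entry with 0 + 1 + 2 + 3 = 6. In a real MPI program, the first inner loop corresponds to the send/receive pair between neighbours.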

1.      Submit the job via the Resource Broker (RB)

You need the following files

     cg-tutorial-marmot-exercise (binary)

     cg-tutorial-marmot-exercise.jdl

to start the binary via RB.

Run

      edg-job-submit cg-tutorial-marmot-exercise.jdl

2.      Retrieve the log files

·         Run

          edg-job-get-output ***

to retrieve the log files

cg-tutorial-marmot-exercise.err

cg-tutorial-marmot-exercise.out

·         In case the previous step fails, have a look at the provided log files:

          cg-tutorial-marmot-exercise.err

          cg-tutorial-marmot-exercise.out

3.      Submit the job via Migrating Desktop

You need the file

     cg-tutorial-marmot-exercise (binary)

somewhere on a computing element for starting the binary via Migrating Desktop. Unfortunately, the home directories in the CrossGrid testbed have different names on different computing elements, so one has to know the correct name of one’s home directory.

  1. Start the job wizard → general → commandline.

  2. Fill in the job description.

  3. Specify the job type and node number.

  4. Specify the hostname.

  5. Specify your home directory on this host.

  6. Specify your output files and set the Refresh period in seconds for them, for example 2 seconds. For real applications, use a larger value such as 10 seconds; otherwise the files may not refresh properly.

  7. Setting environment variables in the Environment window did not work for me. To set an environment variable, edit the .bashrc in your home directory on the computing element where you want to run your application. For example, if you set

export TRACE_CALLS=0

     there, the log file will not contain tracing of calls.

  8. Submit your job, possibly after saving it.

 

4.      Retrieve the log file

Open Job Monitor, select your job and click on Details → Files → Stderr → Visualize to display MARMOT’s log file.

In case the previous steps fail, have a look at the provided files:

          cg-tutorial-marmot-exercise.err

          cg-tutorial-marmot-exercise.out

The log file cg-tutorial-marmot-exercise.err will contain warnings, for example

18 rank 2 performs MPI_Type_struct

WARNING: MPI_Type_struct: blocklength[0]=0!

WARNING: MPI_Type_struct: datatype[0] is Fortran-Type!

WARNING: MPI_Type_struct: datatype[1] is optional!

WARNING: MPI_Type_struct: blocklength[0]=0!

WARNING: MPI_Type_struct: blocklength[0]=0!

19 rank 0 performs MPI_Type_struct

20 rank 1 performs MPI_Type_struct

21 rank 2 performs MPI_Type_struct

WARNING: MPI_Type_struct: datatype[0] is Fortran-Type!

WARNING: MPI_Type_struct: datatype[1] is optional!

WARNING: MPI_Type_struct: datatype[0] is Fortran-Type!

WARNING: MPI_Type_struct: datatype[1] is optional!

22 rank 0 performs MPI_Type_commit

23 rank 1 performs MPI_Type_commit

24 rank 0 performs MPI_Type_commit

25 rank 1 performs MPI_Type_commit

26 rank 2 performs MPI_Type_commit

NOTE: MPI_Type_commit: Datatype already committed!

NOTE: MPI_Type_commit: Datatype already committed!

27 rank 0 performs MPI_Address

28 rank 1 performs MPI_Address

29 rank 2 performs MPI_Type_commit

NOTE: MPI_Type_commit: Datatype already committed!

30 rank 0 performs MPI_Address

31 rank 1 performs MPI_Address

32 rank 2 performs MPI_Address

33 rank 0 performs MPI_Type_struct

34 rank 1 performs MPI_Type_struct

35 rank 2 performs MPI_Address

36 rank 0 performs MPI_Type_commit

37 rank 1 performs MPI_Type_commit

38 rank 2 performs MPI_Type_struct

39 rank 0 performs MPI_Issend

40 rank 1 performs MPI_Issend

41 rank 2 performs MPI_Type_commit

WARNING: MPI_Issend: count=0 !

WARNING: MPI_Issend: datatype is for reduction functions!

WARNING: MPI_Issend: count=0 !

WARNING: MPI_Issend: datatype is for reduction functions!

42 rank 0 performs MPI_Recv

WARNING: MPI_Recv: count = 0!

43 rank 1 performs MPI_Recv

44 rank 2 performs MPI_Issend

 

The warnings for MPI_Issend refer, for example, to the call

        /* This will produce warnings:
         * the count is 0,
         * the types are for reduction functions.
         */
        MPI_Issend(&int_send_buf, 0, MPI_LONG_INT, right, MSG_TAG,
                   MPI_COMM_WORLD, &request);

into which we deliberately put some errors.
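For a quick overview of a log like the excerpt above, it can help to tally how many lines are warnings and how many are notes. The following is a small sketch, assuming the log text has already been read into a NUL-terminated buffer and that MARMOT's messages start with `WARNING:` or `NOTE:` as in the excerpt. The helper name `count_prefixed_lines` is hypothetical and not part of MARMOT.

```c
#include <stddef.h>
#include <string.h>

/* Counts the lines in `log` that begin with `prefix`, for example
 * "WARNING:" or "NOTE:". Leading spaces and tabs on a line are skipped
 * before the comparison. `log` must be a NUL-terminated string. */
size_t count_prefixed_lines(const char *log, const char *prefix)
{
    size_t count = 0;
    size_t prefix_len = strlen(prefix);
    const char *line = log;

    while (*line != '\0') {
        /* Skip indentation at the start of the current line. */
        const char *p = line;
        while (*p == ' ' || *p == '\t')
            p++;

        if (strncmp(p, prefix, prefix_len) == 0)
            count++;

        /* Advance to the start of the next line, if there is one. */
        const char *newline = strchr(line, '\n');
        if (newline == NULL)
            break;
        line = newline + 1;
    }
    return count;
}
```

To use it on the real output, read cg-tutorial-marmot-exercise.err into a buffer first and pass that buffer as `log`. Comparing the counts before and after fixing a call in the exercise shows whether the corresponding warnings have disappeared.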

 
