## Structure Mapping

In the example below, only the structure elements _S.a_, _S.b_, and _S.p_ of the _S_ structure appear in __map__ clauses of a __target__ construct. Only these components have corresponding variables and storage on the device. Hence, the large arrays _S.buffera_ and _S.bufferb_ and the _S.x_ component have no storage on the device and cannot be accessed there.

Also, since the pointer member _S.p_ is used in an array section of a __map__ clause, the device storage of the array section _S.p[:N]_ is *attached* to the pointer member _S.p_ on the device. Explicitly mapping the pointer member _S.p_ is optional in this case.

Note: The buffer arrays and the _x_ member have been grouped together, so that the components that will reside on the device are all together (without gaps). This allows the runtime to optimize the transfer and the storage footprint on the device.

In [None]:
//%compiler: clang
//%cflags: -fopenmp

/*
* name:       target_struct_map.1
* type:       C
* version: omp_5.1
*/
#include <stdio.h>
#include <stdlib.h>
#define N 100
#define BAZILLION 2000000

struct foo {
  char buffera[BAZILLION];
  char bufferb[BAZILLION];
  float x;
  float a, b;
  float *p;
};

#pragma omp begin declare target
void saxpyfun(struct foo *S)
{
  int i;
  for(i=0; i<N; i++)
    S->p[i] = S->p[i]*S->a + S->b;
}
#pragma omp end declare target

int main()
{
  struct foo S;
  int i;

  S.a = 2.0;
  S.b = 4.0;
  S.p = (float *)malloc(sizeof(float)*N);
  for(i=0; i<N; i++) S.p[i] = i;

  #pragma omp target map(alloc:S.p) map(S.p[:N]) map(to:S.a, S.b)
  saxpyfun(&S);

  printf(" %4.0f %4.0f\n", S.p[0], S.p[N-1]);
        //     4  202  <- output

  free(S.p);
  return 0;
}

The following example is a slight modification of the above example for a C++ class. In the member function _SAXPY::driver_, the array section _p[:N]_ is *attached* to the pointer member _p_ on the device.

In [None]:
//%compiler: clang
//%cflags: -fopenmp

/*
* name:       target_struct_map.2
* type:       C++
* version: omp_5.1
*/
#include <cstdio>
#include <cstdlib>
#define N 100

class SAXPY {
  private:
   float a, b, *p;
  public:
   float buffer[N];

   SAXPY(float arg_a, float arg_b){ a=arg_a; b=arg_b; }
   void driver();
   void saxpyfun(float *q);
};

#pragma omp begin declare target
void SAXPY::saxpyfun(float *q)
{
  for(int i=0; i<N; i++)
    buffer[i] = q[i]*a + b;
}
#pragma omp end declare target

void SAXPY::driver()
{
  p = (float *) malloc(N*sizeof(float));
  for(int i=0; i<N; i++) p[i]=i;

  #pragma omp target map(alloc:p) map(to:p[:N]) map(to:a,b) \
              map(from:buffer[:N])   // attach(p) to device_malloc()
  {
    saxpyfun(p);
  }

  free(p);
}

int main()
{
  SAXPY my_saxpy(2.0,4.0);

  my_saxpy.driver();

  printf(" %4.0f %4.0f\n", my_saxpy.buffer[0], my_saxpy.buffer[N-1]);
         //   4   202     <- output

  return 0;
}

The next example shows two ways in which the structure may be *incorrectly* mapped.

In Case 1, the array section _S1.p[:N]_ is first mapped in an enclosing __target__ __data__ construct, and the __target__ construct then implicitly maps the structure _S1_. The initial map of the array section does not map the base pointer _S1.p_; it maps only the elements of the array section. Furthermore, the implicit map is not sufficient to ensure pointer attachment for the structure member _S1.p_ (refer to the conditions for pointer attachment described in Section 6.3). Consequently, the dereference operation _S1.p[i]_ in the call to _saxpyfun_ will probably fail because _S1.p_ contains a host address.

In Case 2, an array section is again mapped on an enclosing __target__ __data__ construct. This time, the nested __target__ construct explicitly maps _S2.p_, _S2.a_, and _S2.b_. But as in Case 1, this does not satisfy the conditions for pointer attachment: the construct must map a list item for which _S2.p_ is a base pointer, and it must do so when _S2.p_ is already present on the device or will be created on the device as a result of the same construct.

In [None]:
//%compiler: clang
//%cflags: -fopenmp

/*
* name:       target_struct_map.3
* type:       C
* version: omp_5.1
*/
#include <stdio.h>
#include <stdlib.h>
#define N 100
#define BAZILLION 2000000

struct foo {
  char buffera[BAZILLION];
  char bufferb[BAZILLION];
  float x;
  float a, b;
  float *p;
};

#pragma omp begin declare target
void saxpyfun(struct foo *S)
{
  int i;
  for(i=0; i<N; i++)
    S->p[i] = S->p[i] * S->a + S->b; // S->p[i] invalid
}
#pragma omp end declare target

int main()
{
  struct foo S1, S2;
  int i;

  // Case 1

  S1.a = 2.0;
  S1.b = 4.0;
  S1.p = (float *)malloc(sizeof(float)*N);
  for(i=0; i<N; i++) S1.p[i] = i;

  // No pointer attachment for S1.p here
  #pragma omp target data map(S1.p[:N])
  #pragma omp target // implicit map of S1
  saxpyfun(&S1);

  // Case 2

  S2.a = 2.0;
  S2.b = 4.0;
  S2.p = (float *)malloc(sizeof(float)*N);
  for(i=0; i<N; i++) S2.p[i] = i;

  // No pointer attachment for S2.p here either
  #pragma omp target data map(S2.p[:N])
  #pragma omp target map(S2.p, S2.a, S2.b) // implicit map of S2
  saxpyfun(&S2);

  // These print statements may not execute because the
  // above code is invalid
  printf(" %4.0f %4.0f\n", S1.p[0], S1.p[N-1]);
  printf(" %4.0f %4.0f\n", S2.p[0], S2.p[N-1]);

  free(S1.p);
  free(S2.p);
  return 0;
}

The following example correctly implements pointer attachment cases that involve implicit structure maps.

In Case 1, members _p_, _a_, and _b_ of the structure _S1_ are explicitly mapped by the __target__ __data__ construct, to avoid mapping parts of _S1_ that aren't required on the device. The mapped _S1.p_ is attached to the array section _S1.p[:N]_, and remains attached while it exists on the device (for the duration of the __target__ __data__ region). Due to the _S1_ reference inside the nested __target__ construct, the construct implicitly maps _S1_ so that the reference refers to the corresponding storage created by the enclosing __target__ __data__ region. Note that only the members _a_, _b_, and _p_ may be accessed from this storage.

In Case 2, only the storage for the array section _S2.p[:N]_ is mapped by the __target__ __data__ construct. The nested __target__ construct explicitly maps _S2.a_ and _S2.b_ and explicitly maps an array section for which _S2.p_ is a base pointer. This satisfies the conditions for _S2.p_ becoming an attached pointer. The array section in this case is zero-length, but the effect would be the same if the length were a positive integer less than or equal to _N_. There is also an implicit map of the containing structure _S2_, again due to the reference to _S2_ inside the construct. The effect of this implicit map permits access only to members _a_, _b_, and _p_, as for Case 1.

In Case 3, there is no __target__ __data__ construct. The __target__ construct explicitly maps _S3.a_ and _S3.b_ and explicitly maps an array section for which _S3.p_ is a base pointer. Again, there is an implicit map of the structure referenced in the construct, _S3_. This implicit map also causes _S3.p_ to be implicitly mapped, because no other part of _S3_ is present prior to the construct being encountered. The result is an attached pointer _S3.p_ on the device. As for Cases 1 and 2, this implicit map only ensures that storage for the members _a_, _b_, and _p_ is accessible within the corresponding _S3_ that is created on the device.

In [None]:
//%compiler: clang
//%cflags: -fopenmp

/*
* name:       target_struct_map.4
* type:       C
* version: omp_5.1
*/
#include <stdio.h>
#include <stdlib.h>
#define N 100
#define BAZILLION 2000000

struct foo {
  char buffera[BAZILLION];
  char bufferb[BAZILLION];
  float x;
  float a, b;
  float *p;
};

#pragma omp begin declare target
void saxpyfun(struct foo *S)
{
  int i;
  for(i=0; i<N; i++)
    S->p[i] = S->p[i]*S->a + S->b;
}
#pragma omp end declare target

int main()
{
  struct foo S1, S2, S3;
  int i;

  // Case 1

  S1.a = 2.0;
  S1.b = 4.0;
  S1.p = (float *)malloc(sizeof(float)*N);
  for(i=0; i<N; i++) S1.p[i] = i;

  // The target data construct results in pointer attachment for S1.p.
  // Explicitly mapping S1.p, S1.a, and S1.b rather than S1 avoids
  // mapping the entire structure (including members buffera, bufferb,
  // and x).
  #pragma omp target data map(S1.p[:N],S1.p,S1.a,S1.b)
  #pragma omp target //implicit map of S1
  saxpyfun(&S1);


  // Case 2

  S2.a = 2.0;
  S2.b = 4.0;
  S2.p = (float *)malloc(sizeof(float)*N);
  for(i=0; i<N; i++) S2.p[i] = i;

  // The target construct results in pointer attachment for S2.p.
  #pragma omp target data map(S2.p[:N])
  #pragma omp target map(S2.p[:0], S2.a, S2.b) // implicit map of S2
  saxpyfun(&S2);

  // Case 3

  S3.a = 2.0;
  S3.b = 4.0;
  S3.p = (float *)malloc(sizeof(float)*N);
  for(i=0; i<N; i++) S3.p[i] = i;

  // The target construct results in pointer attachment for S3.p.
  // Note that S3.p is implicitly mapped due to the implicit map of S3
  // (but corresponding storage is NOT created for members buffera,
  // bufferb, and x).
  #pragma omp target map(S3.p[:N], S3.a, S3.b)  // implicit map of S3
  saxpyfun(&S3);

  printf(" %4.0f %4.0f\n", S1.p[0], S1.p[N-1]);  //OUT1 4 202
  printf(" %4.0f %4.0f\n", S2.p[0], S2.p[N-1]);  //OUT2 4 202
  printf(" %4.0f %4.0f\n", S3.p[0], S3.p[N-1]);  //OUT3 4 202

  free(S1.p);
  free(S2.p);
  free(S3.p);
  return 0;
}