StarPU Handbook
Data Management

This section describes the data management facilities provided by StarPU. We show how to use existing data interfaces in Data Interfaces, but developers can design their own data interfaces if required.

Typedefs

typedef struct _starpu_data_state * starpu_data_handle_t
typedef struct starpu_arbiter * starpu_arbiter_t

Enumerations

enum  starpu_data_access_mode {
  STARPU_NONE, STARPU_R, STARPU_W, STARPU_RW,
  STARPU_SCRATCH, STARPU_REDUX, STARPU_COMMUTE, STARPU_SSEND,
  STARPU_LOCALITY, STARPU_ACCESS_MODE_MAX
}

Basic Data Management API

Data management is done at a high level in StarPU: rather than accessing a mere list of contiguous buffers, tasks may manipulate data that are described by a high-level construct which we call a data interface.

An example of a data interface is the "vector" interface, which describes a contiguous data array on a specific memory node. This interface is a simple structure containing the number of elements in the array, the size of the elements, and the address of the array in the appropriate address space (this address may be invalid if there is no valid copy of the array in the memory node). More information on the data interfaces provided by StarPU is given in Data Interfaces.
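The vector description above can be pictured with a small plain-C model. This is only a sketch: the structure vector_desc, its field names, the helper functions, and the convention that a zero address means "no buffer on this node" are all assumptions made for illustration, not StarPU's actual interface layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a "vector" data interface on one memory node.
 * Field names are illustrative and are not StarPU's actual layout. */
struct vector_desc {
    uintptr_t ptr;    /* address of the array in that node's address space;
                         0 is used here to mean "no buffer on this node" */
    size_t nx;        /* number of elements in the array */
    size_t elemsize;  /* size of one element, in bytes */
};

/* Describe an existing array (or NULL when the node holds no copy). */
struct vector_desc vector_describe(void *array, size_t nx, size_t elemsize)
{
    assert(elemsize > 0);
    struct vector_desc d = { (uintptr_t)array, nx, elemsize };
    return d;
}

/* Total footprint of the described array, in bytes. */
size_t vector_bytes(const struct vector_desc *d)
{
    return d->nx * d->elemsize;
}

/* Whether this node currently has a buffer for the vector. */
int vector_has_buffer(const struct vector_desc *d)
{
    return d->ptr != 0;
}
```

A task implementation would receive a pointer to such a description and read the element count and address from it, rather than receiving a raw buffer.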

When a piece of data managed by StarPU is used by a task, the task implementation is given a pointer to an interface describing a valid copy of the data that is accessible from the current processing unit.

Every worker is associated with a memory node, which is a logical abstraction of the address space from which the processing unit gets its data. For instance, the memory node associated with the different CPU workers represents main memory (RAM), and the memory node associated with a GPU is the DRAM embedded on the device. Every memory node is identified by a logical index which is accessible from the function starpu_worker_get_memory_node(). When registering a piece of data to StarPU, the specified memory node indicates where the piece of data initially resides (we also call this memory node the home node of a piece of data).

void starpu_data_register (starpu_data_handle_t *handleptr, int home_node, void *data_interface, struct starpu_data_interface_ops *ops)
void starpu_data_ptr_register (starpu_data_handle_t handle, unsigned node)
void starpu_data_register_same (starpu_data_handle_t *handledst, starpu_data_handle_t handlesrc)
void starpu_data_unregister (starpu_data_handle_t handle)
void starpu_data_unregister_no_coherency (starpu_data_handle_t handle)
void starpu_data_unregister_submit (starpu_data_handle_t handle)
void starpu_data_invalidate (starpu_data_handle_t handle)
void starpu_data_invalidate_submit (starpu_data_handle_t handle)
void starpu_data_set_wt_mask (starpu_data_handle_t handle, uint32_t wt_mask)
void starpu_data_set_name (starpu_data_handle_t handle, const char *name)
void starpu_data_set_coordinates_array (starpu_data_handle_t handle, int dimensions, int dims[])
void starpu_data_set_coordinates (starpu_data_handle_t handle, unsigned dimensions,...)
int starpu_data_fetch_on_node (starpu_data_handle_t handle, unsigned node, unsigned async)
int starpu_data_prefetch_on_node (starpu_data_handle_t handle, unsigned node, unsigned async)
int starpu_data_idle_prefetch_on_node (starpu_data_handle_t handle, unsigned node, unsigned async)
void starpu_data_wont_use (starpu_data_handle_t handle)
starpu_data_handle_t starpu_data_lookup (const void *ptr)
int starpu_data_request_allocation (starpu_data_handle_t handle, unsigned node)
void starpu_data_query_status (starpu_data_handle_t handle, int memory_node, int *is_allocated, int *is_valid, int *is_requested)
void starpu_data_advise_as_important (starpu_data_handle_t handle, unsigned is_important)
void starpu_data_set_reduction_methods (starpu_data_handle_t handle, struct starpu_codelet *redux_cl, struct starpu_codelet *init_cl)
struct starpu_data_interface_ops * starpu_data_get_interface_ops (starpu_data_handle_t handle)
void starpu_data_set_user_data (starpu_data_handle_t handle, void *user_data)
void * starpu_data_get_user_data (starpu_data_handle_t handle)

Access registered data from the application

#define STARPU_ACQUIRE_NO_NODE
#define STARPU_ACQUIRE_ALL_NODES
#define STARPU_DATA_ACQUIRE_CB(handle, mode, code)
int starpu_data_acquire (starpu_data_handle_t handle, enum starpu_data_access_mode mode)
int starpu_data_acquire_cb (starpu_data_handle_t handle, enum starpu_data_access_mode mode, void(*callback)(void *), void *arg)
int starpu_data_acquire_cb_sequential_consistency (starpu_data_handle_t handle, enum starpu_data_access_mode mode, void(*callback)(void *), void *arg, int sequential_consistency)
int starpu_data_acquire_on_node (starpu_data_handle_t handle, int node, enum starpu_data_access_mode mode)
int starpu_data_acquire_on_node_cb (starpu_data_handle_t handle, int node, enum starpu_data_access_mode mode, void(*callback)(void *), void *arg)
int starpu_data_acquire_on_node_cb_sequential_consistency (starpu_data_handle_t handle, int node, enum starpu_data_access_mode mode, void(*callback)(void *), void *arg, int sequential_consistency)
void starpu_data_release (starpu_data_handle_t handle)
void starpu_data_release_on_node (starpu_data_handle_t handle, int node)
starpu_arbiter_t starpu_arbiter_create (void) STARPU_ATTRIBUTE_MALLOC
void starpu_data_assign_arbiter (starpu_data_handle_t handle, starpu_arbiter_t arbiter)
void starpu_arbiter_destroy (starpu_arbiter_t arbiter)

Macro Definition Documentation

#define STARPU_ACQUIRE_NO_NODE

This macro can be used to acquire data, but not require it to be available on a given node, only enforce R/W dependencies. This can for instance be used to wait for tasks which produce the data, but without requesting a fetch to the main memory.

#define STARPU_ACQUIRE_ALL_NODES

This is the same as STARPU_ACQUIRE_NO_NODE, but will lock the data on all nodes, preventing them from being evicted for instance. This is mostly useful inside StarPU only.

#define STARPU_DATA_ACQUIRE_CB (   handle,
  mode,
  code 
)

STARPU_DATA_ACQUIRE_CB() is the same as starpu_data_acquire_cb(), except that the code to be executed in a callback is directly provided as a macro parameter, and the data handle is automatically released after it. This makes it easy to execute code which depends on the value of some registered data. This is non-blocking too and may be called from task callbacks.

Typedef Documentation

typedef struct _starpu_data_state * starpu_data_handle_t

StarPU uses starpu_data_handle_t as an opaque handle to manage a piece of data. Once a piece of data has been registered to StarPU, it is associated with a starpu_data_handle_t which keeps track of the state of the piece of data over the entire machine, so that we can maintain data consistency and locate data replicates for instance.

typedef struct starpu_arbiter * starpu_arbiter_t

This is an arbiter, which implements an advanced but centralized management of concurrent data accesses. See Concurrent Data Accesses for the details.

Enumeration Type Documentation

enum starpu_data_access_mode

This datatype describes a data access mode.

Enumerator:
STARPU_NONE 

TODO

STARPU_R 

read-only mode.

STARPU_W 

write-only mode.

STARPU_RW 

read-write mode. This is equivalent to STARPU_R|STARPU_W.

STARPU_SCRATCH 

A temporary buffer is allocated for the task, but StarPU does not enforce data consistency: each device has its own buffer, independently from each other (even for CPUs), and no data transfer is ever performed. This is useful for temporary variables, to avoid allocating and freeing buffers inside each task. Currently, no behavior is defined concerning the relation with the STARPU_R and STARPU_W modes and the value provided at registration, i.e. the value of the scratch buffer is undefined at entry of the codelet function. Defining at least the initial value is being considered for future extensions. For now, data to be used in STARPU_SCRATCH mode should be registered with node -1 and a NULL pointer, since the value of the provided buffer is simply ignored.

STARPU_REDUX 

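reduction mode. Per-worker buffers are initialized and then reduced with the codelets set by starpu_data_set_reduction_methods().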

STARPU_COMMUTE 

STARPU_COMMUTE can be passed along with STARPU_W or STARPU_RW to express that StarPU can let tasks commute. This is useful e.g. when bringing a contribution into some data, which can be done in any order (but still requires sequential consistency against reads or non-commutative writes).

STARPU_SSEND 

used in starpu_mpi_insert_task() to specify that the data has to be sent using a synchronous and non-blocking mode (see starpu_mpi_issend()).

STARPU_LOCALITY 

used to tell the scheduler which data is the most important for the task, and should thus be used to try to group tasks on the same core or cache, etc. For now only the ws and lws schedulers take this flag into account, and only when rebuilt with the USE_LOCALITY flag defined in the src/sched_policies/work_stealing_policy.c source code.
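The way access modes combine can be sketched with plain bit flags. The enumeration below is a local stand-in: the MODE_* names, their numeric values, and the helper functions are assumptions made for this illustration and are not taken from the StarPU headers. It only mirrors two statements from the descriptions above: read-write is the union of read and write, and the commute flag accompanies a write or read-write access.

```c
#include <assert.h>

/* Illustrative flag values, assumed for this sketch (not copied from StarPU). */
enum access_mode {
    MODE_NONE    = 0,
    MODE_R       = 1 << 0,
    MODE_W       = 1 << 1,
    MODE_RW      = MODE_R | MODE_W,
    MODE_SCRATCH = 1 << 2,
    MODE_REDUX   = 1 << 3,
    MODE_COMMUTE = 1 << 4
};

/* True if the mode includes read access. */
int mode_reads(int mode)  { return (mode & MODE_R) != 0; }

/* True if the mode includes write access. */
int mode_writes(int mode) { return (mode & MODE_W) != 0; }

/* The commute flag is a modifier: it must accompany W or RW. */
int mode_is_valid_commute(int mode)
{
    if (!(mode & MODE_COMMUTE))
        return 1;              /* no commute flag, nothing to check */
    return mode_writes(mode);
}

/* Add the commute modifier to a writing mode. */
int mode_add_commute(int base)
{
    assert(mode_writes(base));
    return base | MODE_COMMUTE;
}

/* Two accesses may be reordered only if both are valid commutative writes;
 * reads and non-commutative writes keep their sequential order. */
int modes_may_commute(int a, int b)
{
    return (a & MODE_COMMUTE) && (b & MODE_COMMUTE)
        && mode_is_valid_commute(a) && mode_is_valid_commute(b);
}
```

With the real API, the corresponding combination would be written as STARPU_RW|STARPU_COMMUTE when declaring a task's access to a handle.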

Function Documentation

void starpu_data_register ( starpu_data_handle_t *  handleptr,
int  home_node,
void *  data_interface,
struct starpu_data_interface_ops *  ops 
)

Register a piece of data into the handle located at the handleptr address. The data_interface buffer contains the initial description of the data in the home_node. The ops argument is a pointer to a structure describing the different methods used to manipulate this type of interface. See starpu_data_interface_ops for more details on this structure. If home_node is -1, StarPU will automatically allocate the memory when it is used for the first time in write-only mode. Once such a data handle has been automatically allocated, it is possible to access it using any access mode. Note that StarPU supplies a set of predefined types of interface (e.g. vector or matrix) which can be registered by means of helper functions (e.g. starpu_vector_data_register() or starpu_matrix_data_register()).

void starpu_data_ptr_register ( starpu_data_handle_t  handle,
unsigned  node 
)

Register that a buffer for handle on node will be set. This is typically used by starpu_*_ptr_register helpers before setting the interface pointers for this node, to tell the core that the buffer is now allocated.

void starpu_data_register_same ( starpu_data_handle_t *  handledst,
starpu_data_handle_t  handlesrc 
)

Register a new piece of data into the handle handledst with the same interface as the handle handlesrc.

void starpu_data_unregister ( starpu_data_handle_t  handle)

This function unregisters a data handle from StarPU. If the data was automatically allocated by StarPU because the home node was -1, all automatically allocated buffers are freed. Otherwise, a valid copy of the data is put back into the home node in the buffer that was initially registered. Using a data handle that has been unregistered from StarPU results in undefined behaviour. If the value of the data in the home node does not need to be updated, the function starpu_data_unregister_no_coherency() can be used instead.

void starpu_data_unregister_no_coherency ( starpu_data_handle_t  handle)

This is the same as starpu_data_unregister(), except that StarPU does not put back a valid copy into the home node, in the buffer that was initially registered.

void starpu_data_unregister_submit ( starpu_data_handle_t  handle)

Destroy the data handle once it is not needed anymore by any submitted task. No coherency is assumed.

void starpu_data_invalidate ( starpu_data_handle_t  handle)

Destroy all replicates of the data handle immediately. After data invalidation, the first access to the handle must be performed in write-only mode. Accessing invalidated data in read mode results in undefined behaviour.

void starpu_data_invalidate_submit ( starpu_data_handle_t  handle)

Submits invalidation of the data handle after completion of previously submitted tasks.

void starpu_data_set_wt_mask ( starpu_data_handle_t  handle,
uint32_t  wt_mask 
)

This function sets the write-through mask of a given data (and its children), i.e. a bitmask of nodes where the data should always be replicated after modification. It also prevents the data from being evicted from these nodes when memory gets scarce. When the data is modified, it is automatically transferred into those memory nodes. For instance, a 1<<0 write-through mask means that the CUDA workers will commit their changes in main memory (node 0).
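As the 1<<0 example suggests, each bit of the mask stands for one memory node. The helpers below are a minimal sketch of how such a mask could be built and inspected; wt_mask_add() and wt_mask_has() are hypothetical names introduced here, not StarPU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Add memory node `node` to a write-through mask (one bit per node). */
uint32_t wt_mask_add(uint32_t mask, unsigned node)
{
    assert(node < 32);   /* a uint32_t mask can only name nodes 0..31 */
    return mask | (UINT32_C(1) << node);
}

/* Whether modifications must be propagated to memory node `node`. */
int wt_mask_has(uint32_t mask, unsigned node)
{
    assert(node < 32);
    return (int)((mask >> node) & 1u);
}
```

A mask built this way, for instance wt_mask_add(0, 0) for main memory, is what would be passed as the wt_mask argument of starpu_data_set_wt_mask().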

void starpu_data_set_name ( starpu_data_handle_t  handle,
const char *  name 
)

Set the name of the data, to be shown in various profiling tools.

void starpu_data_set_coordinates_array ( starpu_data_handle_t  handle,
int  dimensions,
int  dims[] 
)

Set the coordinates of the data, to be shown in various profiling tools. dimensions is the size of the dims array. This can be for instance the tile coordinates within a big matrix.

void starpu_data_set_coordinates ( starpu_data_handle_t  handle,
unsigned  dimensions,
  ... 
)

Set the coordinates of the data, to be shown in various profiling tools. dimensions is the number of subsequent int parameters. This can be for instance the tile coordinates within a big matrix.
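The relation between the array form and the variadic form can be sketched in plain C. Everything below is hypothetical (the coords structure, the MAX_DIMS limit, and both function names); it only illustrates the calling convention in which dimensions gives the number of int arguments that follow, and it uses unsigned for the count in both forms for simplicity.

```c
#include <assert.h>
#include <stdarg.h>

#define MAX_DIMS 8   /* arbitrary limit chosen for this sketch */

/* Hypothetical storage for the coordinates attached to one piece of data. */
struct coords {
    unsigned ndims;
    int dims[MAX_DIMS];
};

/* Array form: `dimensions` is the number of entries in dims[]. */
void coords_set_array(struct coords *c, unsigned dimensions, const int dims[])
{
    assert(dimensions <= MAX_DIMS);
    c->ndims = dimensions;
    for (unsigned i = 0; i < dimensions; i++)
        c->dims[i] = dims[i];
}

/* Variadic form: `dimensions` is the number of int arguments that follow. */
void coords_set(struct coords *c, unsigned dimensions, ...)
{
    int dims[MAX_DIMS];
    va_list ap;

    assert(dimensions <= MAX_DIMS);
    va_start(ap, dimensions);
    for (unsigned i = 0; i < dimensions; i++)
        dims[i] = va_arg(ap, int);   /* each coordinate is read as an int */
    va_end(ap);

    coords_set_array(c, dimensions, dims);
}
```

For a tile at row 3, column 5 of a matrix, the variadic call would be coords_set(&c, 2, 3, 5). Passing fewer arguments than dimensions announces is undefined behaviour in C, so the count must match exactly.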

int starpu_data_fetch_on_node ( starpu_data_handle_t  handle,
unsigned  node,
unsigned  async 
)

Issue a fetch request for a given data to a given node, i.e. request that the data be replicated to the given node as soon as possible, so that it is available there for tasks. If the async parameter is 0, the call will block until the transfer has completed; otherwise the call will return immediately, after having just queued the request. In the latter case, the request will asynchronously wait for the completion of any task writing on the data.

int starpu_data_prefetch_on_node ( starpu_data_handle_t  handle,
unsigned  node,
unsigned  async 
)

Issue a prefetch request for a given data to a given node, i.e. request that the data be replicated to the given node when there is room for it, so that it is available there for tasks. If the async parameter is 0, the call will block until the transfer has completed; otherwise the call will return immediately, after having just queued the request. In the latter case, the request will asynchronously wait for the completion of any task writing on the data.

int starpu_data_idle_prefetch_on_node ( starpu_data_handle_t  handle,
unsigned  node,
unsigned  async 
)

Issue an idle prefetch request for a given data to a given node, i.e. request that the data be replicated to the given node, so that it is available there for tasks, but only when the bus is really idle. If the async parameter is 0, the call will block until the transfer has completed; otherwise the call will return immediately, after having just queued the request. In the latter case, the request will asynchronously wait for the completion of any task writing on the data.

void starpu_data_wont_use ( starpu_data_handle_t  handle)

Advise StarPU that this handle will not be used in the near future, and is thus a good candidate for eviction from GPUs. StarPU will thus write its value back to its home node when the bus is idle, and select this data in priority for eviction when memory gets low.

starpu_data_handle_t starpu_data_lookup ( const void *  ptr)

Return the handle corresponding to the data pointed to by the ptr host pointer.

int starpu_data_request_allocation ( starpu_data_handle_t  handle,
unsigned  node 
)

Explicitly ask StarPU to allocate room for a piece of data on the specified memory node.

void starpu_data_query_status ( starpu_data_handle_t  handle,
int  memory_node,
int *  is_allocated,
int *  is_valid,
int *  is_requested 
)

Query the status of handle on the specified memory_node.

void starpu_data_advise_as_important ( starpu_data_handle_t  handle,
unsigned  is_important 
)

This function lets the application specify that a piece of data can be discarded without impacting the application.

void starpu_data_set_reduction_methods ( starpu_data_handle_t  handle,
struct starpu_codelet *  redux_cl,
struct starpu_codelet *  init_cl 
)

This sets the codelets to be used for handle when it is accessed in the mode STARPU_REDUX. Per-worker buffers will be initialized with the codelet init_cl, and reduction between per-worker buffers will be done with the codelet redux_cl.
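The roles of the two codelets can be pictured with a sequential plain-C model of a sum reduction. This is a sketch under stated assumptions, not StarPU code: init_buffer() and redux_buffers() are hypothetical stand-ins for what init_cl and redux_cl would compute, NWORKERS is an arbitrary count, and the round-robin assignment of values to workers replaces the scheduling that StarPU would perform.

```c
#include <assert.h>
#include <stddef.h>

#define NWORKERS 4   /* arbitrary worker count for this sketch */

/* Stand-in for init_cl: set a buffer to the neutral element of the sum. */
void init_buffer(double *buf) { *buf = 0.0; }

/* Stand-in for redux_cl: fold the contribution `src` into `dst`. */
void redux_buffers(double *dst, const double *src) { *dst += *src; }

/* Sum n values. Value i is accumulated by worker i % NWORKERS into that
 * worker's private buffer; the private buffers are reduced at the end. */
double reduce_sum(const double *values, size_t n)
{
    double per_worker[NWORKERS];
    double result;

    assert(values != NULL || n == 0);

    /* Every per-worker buffer starts from the neutral element. */
    for (int w = 0; w < NWORKERS; w++)
        init_buffer(&per_worker[w]);

    /* Each worker contributes only to its own buffer. */
    for (size_t i = 0; i < n; i++)
        per_worker[i % NWORKERS] += values[i];

    /* Final reduction between the per-worker buffers. */
    init_buffer(&result);
    for (int w = 0; w < NWORKERS; w++)
        redux_buffers(&result, &per_worker[w]);
    return result;
}
```

Because each worker only touches its own buffer, contributions need no synchronization until the final reduction, which is the point of accessing data in a reduction mode.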

struct starpu_data_interface_ops * starpu_data_get_interface_ops ( starpu_data_handle_t  handle)

todo

void starpu_data_set_user_data ( starpu_data_handle_t  handle,
void *  user_data 
)

This sets the "user_data" field for the handle to user_data. It can then be retrieved with starpu_data_get_user_data(). user_data can be any application-defined value, for instance a pointer to an object-oriented container for the data.

void * starpu_data_get_user_data ( starpu_data_handle_t  handle)

This retrieves the "user_data" field previously set for the handle.

int starpu_data_acquire ( starpu_data_handle_t  handle,
enum starpu_data_access_mode  mode 
)

The application must call this function prior to accessing registered data from main memory outside tasks. StarPU ensures that the application will get an up-to-date copy of the data in main memory located where the data was originally registered, and that all concurrent accesses (e.g. from tasks) will be consistent with the access mode specified in the mode argument. starpu_data_release() must be called once the application does not need to access the piece of data anymore. Note that implicit data dependencies are also enforced by starpu_data_acquire(), i.e. starpu_data_acquire() will wait for all tasks scheduled to work on the data, unless they have been disabled explicitly by calling starpu_data_set_default_sequential_consistency_flag() or starpu_data_set_sequential_consistency_flag(). starpu_data_acquire() is a blocking call, so it cannot be called from tasks or from their callbacks (in that case, starpu_data_acquire() returns -EDEADLK). Upon successful completion, this function returns 0.

int starpu_data_acquire_cb ( starpu_data_handle_t  handle,
enum starpu_data_access_mode  mode,
void(*)(void *)  callback,
void *  arg 
)

Asynchronous equivalent of starpu_data_acquire(). When the data specified in handle is available in the appropriate access mode, the callback function is executed. The application may access the requested data during the execution of this callback. The callback function must call starpu_data_release() once the application does not need to access the piece of data anymore. Note that implicit data dependencies are also enforced by starpu_data_acquire_cb() in case they are not disabled. Contrary to starpu_data_acquire(), this function is non-blocking and may be called from task callbacks. Upon successful completion, this function returns 0.

int starpu_data_acquire_cb_sequential_consistency ( starpu_data_handle_t  handle,
enum starpu_data_access_mode  mode,
void(*)(void *)  callback,
void *  arg,
int  sequential_consistency 
)

Equivalent of starpu_data_acquire_cb() with the possibility of enabling or disabling data dependencies. When the data specified in handle is available in the appropriate access mode, the callback function is executed. The application may access the requested data during the execution of this callback. The callback function must call starpu_data_release() once the application does not need to access the piece of data anymore. Note that implicit data dependencies are also enforced by starpu_data_acquire_cb_sequential_consistency() in case they are not disabled specifically for the given handle or by the parameter sequential_consistency. Similarly to starpu_data_acquire_cb(), this function is non-blocking and may be called from task callbacks. Upon successful completion, this function returns 0.

int starpu_data_acquire_on_node ( starpu_data_handle_t  handle,
int  node,
enum starpu_data_access_mode  mode 
)

This is the same as starpu_data_acquire(), except that the data will be available on the given memory node instead of main memory. STARPU_ACQUIRE_NO_NODE and STARPU_ACQUIRE_ALL_NODES can be used instead of an explicit node number.

int starpu_data_acquire_on_node_cb ( starpu_data_handle_t  handle,
int  node,
enum starpu_data_access_mode  mode,
void(*)(void *)  callback,
void *  arg 
)

This is the same as starpu_data_acquire_cb(), except that the data will be available on the given memory node instead of main memory. STARPU_ACQUIRE_NO_NODE and STARPU_ACQUIRE_ALL_NODES can be used instead of an explicit node number.

int starpu_data_acquire_on_node_cb_sequential_consistency ( starpu_data_handle_t  handle,
int  node,
enum starpu_data_access_mode  mode,
void(*)(void *)  callback,
void *  arg,
int  sequential_consistency 
)

This is the same as starpu_data_acquire_cb_sequential_consistency(), except that the data will be available on the given memory node instead of main memory. STARPU_ACQUIRE_NO_NODE and STARPU_ACQUIRE_ALL_NODES can be used instead of an explicit node number.

void starpu_data_release ( starpu_data_handle_t  handle)

This function releases the piece of data acquired by the application either by starpu_data_acquire() or by starpu_data_acquire_cb().

void starpu_data_release_on_node ( starpu_data_handle_t  handle,
int  node 
)

This is the same as starpu_data_release(), except that it releases data that was made available on the given memory node instead of main memory. The node parameter must be exactly the same as in the corresponding starpu_data_acquire_on_node* call.

starpu_arbiter_t starpu_arbiter_create ( void  )

This creates a data access arbiter. See Concurrent Data Accesses for the details.

void starpu_data_assign_arbiter ( starpu_data_handle_t  handle,
starpu_arbiter_t  arbiter 
)

This makes accesses to handle managed by arbiter.

void starpu_arbiter_destroy ( starpu_arbiter_t  arbiter)

This destroys the arbiter. This must only be called after all data assigned to it have been unregistered.