StarPU Handbook
Out Of Core

Introduction

When using StarPU, one may need to store more data than what the main memory (RAM) can store. This part describes the method to add a new memory node on a disk and to use it.

Similarly to what happens with GPUs (it's actually exactly the same code), when available main memory becomes scarce, StarPU will evict unused data to the disk, thus leaving room for new allocations. Whenever some evicted data is needed again for a task, StarPU will automatically fetch it back from the disk.

The principle is that one first registers a disk location, seen by StarPU as a void*, which can be for instance a Unix path for the stdio, unistd or unistd_o_direct backends, or a leveldb database for the leveldb backend, an HDF5 file path for the HDF5 backend, etc. The disk backend opens this place with the plug method.

StarPU can then start using it to allocate room and store data there with the disk write method, without user intervention.

The user can also use starpu_disk_open() to explicitly open an object within the disk, e.g. a file name in the stdio or unistd cases, or a database key in the leveldb case, and then use starpu_*_register functions to turn it into a StarPU data handle. StarPU will then use this file as an external source of data, and automatically read and write data as appropriate.

Use a new disk memory

To use a disk memory node, you have to register it with this function:

int new_dd = starpu_disk_register(&starpu_disk_unistd_ops, (void *) "/tmp/", 1024*1024*200);

Here, we use the unistd backend to perform the read/write operations, i.e. read/write. The call also takes the path under which files are stored, as well as the maximum size StarPU may use on the disk, in bytes.

Don't forget to check the return value! The examples below, for instance, test for -ENOENT, which is returned when the path cannot be used.

This can also be achieved by just setting environment variables:

export STARPU_DISK_SWAP=/tmp
export STARPU_DISK_SWAP_BACKEND=unistd
export STARPU_DISK_SWAP_SIZE=200

The backend can be set to stdio (some caching is done by libc), unistd (only caching in the kernel), unistd_o_direct (no caching), leveldb, or hdf5.
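
The same variables can also be set from within the program, in the same way the disk_copy example below sets STARPU_LIMIT_CPU_MEM before calling starpu_init(). The following is a minimal sketch assuming a POSIX setenv(); the helper name configure_disk_swap is hypothetical and not part of StarPU. The size is given in MiB, matching STARPU_DISK_SWAP_SIZE=200 above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* for setenv() */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not a StarPU function): export the three swap
 * variables. It is meant to be called before starpu_init().
 * Returns 0 on success, -1 on invalid arguments or setenv() failure. */
int configure_disk_swap(const char *path, const char *backend, unsigned size_mib)
{
	char size_str[32];

	if (path == NULL || backend == NULL || size_mib == 0)
		return -1;
	snprintf(size_str, sizeof(size_str), "%u", size_mib);
	if (setenv("STARPU_DISK_SWAP", path, 1) != 0)
		return -1;
	if (setenv("STARPU_DISK_SWAP_BACKEND", backend, 1) != 0)
		return -1;
	if (setenv("STARPU_DISK_SWAP_SIZE", size_str, 1) != 0)
		return -1;
	return 0;
}
```

A call such as configure_disk_swap("/tmp", "unistd", 200) placed before starpu_init() would then be equivalent to the three export lines above.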

When that register call is made, StarPU will benchmark the disk. This can take some time.

Warning: the size has to be at least STARPU_DISK_SIZE_MIN bytes!

StarPU will then automatically try to evict unused data to this new disk. One can also use the standard StarPU memory node API to prefetch data etc., see the Standard Memory Library and the Data Interfaces.

The disk is unregistered during starpu_shutdown().

Data Registration

StarPU will only be able to achieve Out-Of-Core eviction if it controls memory allocation. For instance, if the application does the following:

p = malloc(1024*1024*sizeof(float));
fill_with_data(p);
starpu_matrix_data_register(&h, STARPU_MAIN_RAM, (uintptr_t) p, 1024, 1024, 1024, sizeof(float));

StarPU will not be able to release the corresponding memory, since it is the application that allocated it: StarPU cannot know how it was allocated, and thus how to release it. One thus has to use the following instead:

starpu_matrix_data_register(&h, -1, (uintptr_t) NULL, 1024, 1024, 1024, sizeof(float));
starpu_task_insert(&cl_fill_with_data, STARPU_W, h, 0);

This makes StarPU perform the allocation itself when the task running cl_fill_with_data gets executed. If it needs to, StarPU will then be able to release the memory after having pushed the data to the disk.
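
For reference, here is a minimal sketch of what cl_fill_with_data could look like. Only the codelet name comes from the text above; the fill pattern, the kernel name fill_matrix and the wrapper name fill_with_data_cpu are assumptions made for illustration. The kernel uses the leading dimension reported by StarPU rather than assuming that rows are contiguous. The StarPU-specific part is compiled only when starpu.h is found, so that the kernel can also be tried on its own.

```c
#include <stddef.h>

/* Plain C kernel: write a recognisable pattern into an nx x ny matrix
 * whose consecutive rows are ld elements apart in memory. */
void fill_matrix(float *m, unsigned nx, unsigned ny, unsigned ld)
{
	unsigned i, j;

	for (j = 0; j < ny; j++)
		for (i = 0; i < nx; i++)
			m[(size_t)j * ld + i] = (float)(j * nx + i);
}

#if defined(__has_include)
#if __has_include(<starpu.h>)
#include <starpu.h>

/* CPU implementation: unpack the matrix interface and call the kernel. */
static void fill_with_data_cpu(void *buffers[], void *cl_arg)
{
	(void)cl_arg;
	fill_matrix((float *)STARPU_MATRIX_GET_PTR(buffers[0]),
		    STARPU_MATRIX_GET_NX(buffers[0]),
		    STARPU_MATRIX_GET_NY(buffers[0]),
		    STARPU_MATRIX_GET_LD(buffers[0]));
}

/* The single buffer is accessed in write-only mode (STARPU_W), so there
 * is no previous content that StarPU would have to load first. */
struct starpu_codelet cl_fill_with_data =
{
	.cpu_funcs = {fill_with_data_cpu},
	.nbuffers = 1,
	.modes = {STARPU_W},
};
#endif
#endif
```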

Using Wont Use

By default, StarPU uses a Least-Recently-Used (LRU) algorithm to determine which data should be evicted to the disk. This algorithm can be given hints about which data will not be used in the near future with starpu_data_wont_use(), for instance:

starpu_task_insert(&cl_work, STARPU_RW, h, 0);
starpu_data_wont_use(h);

StarPU will mark the data as "inactive" and will tend to evict that data to the disk in preference to other data.
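
To make the effect of the hint concrete, here is a toy model of the eviction choice. It is not StarPU's implementation: the structure, its field names and the function pick_victim are invented for illustration. It only encodes the idea stated above, namely that data marked inactive is preferred for eviction, and that the least recently used entry is chosen otherwise.

```c
#include <stddef.h>

/* Toy bookkeeping for one piece of data (hypothetical, not a StarPU type). */
struct toy_entry
{
	unsigned long last_use; /* logical timestamp of the last access */
	int inactive;           /* non-zero once the application said "won't use" */
};

/* Return the index of the entry to evict, or -1 if n is 0.
 * Inactive entries are preferred over active ones; among entries with the
 * same status, the least recently used one is picked. */
int pick_victim(const struct toy_entry *e, size_t n)
{
	size_t i, best;

	if (n == 0)
		return -1;
	best = 0;
	for (i = 1; i < n; i++)
	{
		int cur_inactive = e[i].inactive != 0;
		int best_inactive = e[best].inactive != 0;

		if (cur_inactive != best_inactive)
		{
			if (cur_inactive)
				best = i;
		}
		else if (e[i].last_use < e[best].last_use)
			best = i;
	}
	return (int)best;
}
```

In this model, an entry that was used very recently is still chosen first once it is flagged inactive, which mirrors what calling starpu_data_wont_use() right after submitting the last task that uses a handle is meant to achieve.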

Examples: disk_copy

/* Try to write into disk memory
 * Use the eviction mechanism to push data from main RAM to the disk
 */
#include <starpu.h>
#include <errno.h>
#include <stdlib.h>
#include <stdio.h>
#include <math.h>

/* size of one vector */
#define NX (30*1000000/sizeof(double))
#define FPRINTF(ofile, fmt, ...) do { if (!getenv("STARPU_SSILENT")) {fprintf(ofile, fmt, ## __VA_ARGS__); }} while(0)

int main(int argc, char **argv)
{
	double *A, *F;

	/* limit main RAM to force StarPU to push data to the disk */
	setenv("STARPU_LIMIT_CPU_MEM", "160", 1);

	/* Initialize StarPU with default configuration */
	int ret = starpu_init(NULL);
	if (ret == -ENODEV) goto enodev;

	/* register a disk */
	int new_dd = starpu_disk_register(&starpu_disk_unistd_ops, (void *) "/tmp/", 1024*1024*200);
	/* can't write on /tmp/ */
	if (new_dd == -ENOENT) goto enoent;

	/* allocate two memory spaces */
	starpu_malloc_flags((void **)&A, NX*sizeof(double), STARPU_MALLOC_COUNT);
	starpu_malloc_flags((void **)&F, NX*sizeof(double), STARPU_MALLOC_COUNT);

	FPRINTF(stderr, "TEST DISK MEMORY \n");

	unsigned int j;
	/* A holds the reference values; F starts with wrong values that the
	 * chain of copies must overwrite */
	for (j = 0; j < NX; ++j)
	{
		A[j] = j;
		F[j] = -(double)j;
	}

	starpu_data_handle_t vector_handleA, vector_handleB, vector_handleC, vector_handleD, vector_handleE, vector_handleF;

	/* register the vectors in StarPU; B to E are allocated by StarPU itself */
	starpu_vector_data_register(&vector_handleA, STARPU_MAIN_RAM, (uintptr_t)A, NX, sizeof(double));
	starpu_vector_data_register(&vector_handleB, -1, (uintptr_t) NULL, NX, sizeof(double));
	starpu_vector_data_register(&vector_handleC, -1, (uintptr_t) NULL, NX, sizeof(double));
	starpu_vector_data_register(&vector_handleD, -1, (uintptr_t) NULL, NX, sizeof(double));
	starpu_vector_data_register(&vector_handleE, -1, (uintptr_t) NULL, NX, sizeof(double));
	starpu_vector_data_register(&vector_handleF, STARPU_MAIN_RAM, (uintptr_t)F, NX, sizeof(double));

	/* copy vector A->B, B->C... */
	starpu_data_cpy(vector_handleB, vector_handleA, 0, NULL, NULL);
	starpu_data_cpy(vector_handleC, vector_handleB, 0, NULL, NULL);
	starpu_data_cpy(vector_handleD, vector_handleC, 0, NULL, NULL);
	starpu_data_cpy(vector_handleE, vector_handleD, 0, NULL, NULL);
	starpu_data_cpy(vector_handleF, vector_handleE, 0, NULL, NULL);

	/* StarPU does not need to manipulate the arrays anymore so we can stop
	 * monitoring them */
	starpu_data_unregister(vector_handleA);
	starpu_data_unregister(vector_handleB);
	starpu_data_unregister(vector_handleC);
	starpu_data_unregister(vector_handleD);
	starpu_data_unregister(vector_handleE);
	starpu_data_unregister(vector_handleF);

	/* check if computation is correct */
	int try = 1;
	for (j = 0; j < NX; ++j)
	{
		if (A[j] != F[j])
		{
			printf("Fail A %f != F %f \n", A[j], F[j]);
			try = 0;
		}
	}

	/* free last vectors */
	starpu_free_flags(A, NX*sizeof(double), STARPU_MALLOC_COUNT);
	starpu_free_flags(F, NX*sizeof(double), STARPU_MALLOC_COUNT);

	/* terminate StarPU, no task can be submitted after */
	starpu_shutdown();

	if (try)
		FPRINTF(stderr, "TEST SUCCESS\n");
	else
		FPRINTF(stderr, "TEST FAIL\n");
	return (try ? EXIT_SUCCESS : EXIT_FAILURE);

enodev:
	return 77;
enoent:
	starpu_shutdown();
	return 77;
}

Examples: disk_compute

/* Try to write into disk memory
 * Use files on the disk as an external source of data for StarPU
 */
#include <starpu.h>
#include <errno.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <math.h>

#define NX (1024)

int main(int argc, char **argv)
{
	/* Directory backing the disk node. It is not defined in the original
	 * listing; /tmp is assumed here, as in the disk_copy example. */
	const char *base = "/tmp";

	/* Initialize StarPU with default configuration */
	int ret = starpu_init(NULL);
	if (ret == -ENODEV) goto enodev;

	/* Initialize path and name */
	char pid_str[16];
	int pid = getpid();
	snprintf(pid_str, 16, "%d", pid);

	const char *name_file_start = "STARPU_DISK_COMPUTE_DATA_";
	const char *name_file_end = "STARPU_DISK_COMPUTE_DATA_RESULT_";

	char * path_file_start = malloc(strlen(base) + 1 + strlen(name_file_start) + 1);
	strcpy(path_file_start, base);
	strcat(path_file_start, "/");
	strcat(path_file_start, name_file_start);

	char * path_file_end = malloc(strlen(base) + 1 + strlen(name_file_end) + 1);
	strcpy(path_file_end, base);
	strcat(path_file_end, "/");
	strcat(path_file_end, name_file_end);

	/* register a disk; the size must be at least STARPU_DISK_SIZE_MIN bytes */
	int new_dd = starpu_disk_register(&starpu_disk_unistd_ops, (void *) base, STARPU_DISK_SIZE_MIN);
	/* can't write on /tmp/ */
	if (new_dd == -ENOENT) goto enoent;

	unsigned dd = (unsigned) new_dd;

	printf("TEST DISK MEMORY \n");

	/* Imagine you want to compute some data */
	int *A;
	int *C;

	starpu_malloc_flags((void **)&A, NX*sizeof(int), STARPU_MALLOC_COUNT);
	starpu_malloc_flags((void **)&C, NX*sizeof(int), STARPU_MALLOC_COUNT);

	unsigned int j;
	/* fill the input vector, and zero the result vector */
	for (j = 0; j < NX; ++j)
	{
		A[j] = j;
		C[j] = 0;
	}

	/* you create a file to store the vector ON the disk */
	FILE * f = fopen(path_file_start, "wb+");
	if (f == NULL)
		goto enoent2;

	/* store it in the file */
	fwrite(A, sizeof(int), NX, f);

	/* close the file */
	fclose(f);

	/* create a file to store result */
	f = fopen(path_file_end, "wb+");
	if (f == NULL)
		goto enoent2;

	/* fill it with zeros */
	fwrite(C, sizeof(int), NX, f);

	/* close the file */
	fclose(f);

	/* And now, you want to use your data in StarPU */
	/* Open the files ON the disk; names are relative to the disk path */
	void * data = starpu_disk_open(dd, (void *) name_file_start, NX*sizeof(int));
	void * data_result = starpu_disk_open(dd, (void *) name_file_end, NX*sizeof(int));

	starpu_data_handle_t vector_handleA, vector_handleC;

	/* register vector in starpu */
	starpu_vector_data_register(&vector_handleA, dd, (uintptr_t) data, NX, sizeof(int));

	/* and do what you want with it, here we copy it into another vector */
	starpu_vector_data_register(&vector_handleC, dd, (uintptr_t) data_result, NX, sizeof(int));

	starpu_data_cpy(vector_handleC, vector_handleA, 0, NULL, NULL);

	/* free them */
	starpu_data_unregister(vector_handleA);
	starpu_data_unregister(vector_handleC);

	/* close them in StarPU */
	starpu_disk_close(dd, data, NX*sizeof(int));
	starpu_disk_close(dd, data_result, NX*sizeof(int));

	/* check results */
	f = fopen(path_file_end, "rb+");
	if (f == NULL)
		goto enoent;

	/* read the result back; a short read counts as a failure */
	size_t nread = fread(C, sizeof(int), NX, f);

	/* close the file */
	fclose(f);

	int try = (nread == NX);
	for (j = 0; j < NX; ++j)
	{
		if (A[j] != C[j])
		{
			printf("Fail A %d != C %d \n", A[j], C[j]);
			try = 0;
		}
	}

	starpu_free_flags(A, NX*sizeof(int), STARPU_MALLOC_COUNT);
	starpu_free_flags(C, NX*sizeof(int), STARPU_MALLOC_COUNT);

	unlink(path_file_start);
	unlink(path_file_end);

	free(path_file_start);
	free(path_file_end);

	/* terminate StarPU, no task can be submitted after */
	starpu_shutdown();

	if (try)
		printf("TEST SUCCESS\n");
	else
		printf("TEST FAIL\n");
	return (try ? EXIT_SUCCESS : EXIT_FAILURE);

enodev:
	return 77;
enoent2:
enoent:
	unlink(path_file_start);
	unlink(path_file_end);

	free(path_file_start);
	free(path_file_end);

	starpu_shutdown();
	return 77;
}

Performances

Scheduling heuristics for Out-of-core are still relatively experimental. The tricky part is that you usually have to find a compromise between favoring locality (which avoids transferring data back and forth with the disk) and favoring the critical path, i.e. taking priorities into account to avoid a lack of parallelism at the end of the task graph.

In particular, it is better to avoid assigning different priorities to tasks with low priority, since that will make the scheduler want to schedule them by level of priority, at the expense of locality.

The scheduling algorithms worth trying are thus dmdar and lws, which favor data locality over priorities. Further work in this area is planned.
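
One way to try these policies is the STARPU_SCHED environment variable, which can be exported in the shell or set from the program before starpu_init(). The sketch below assumes a POSIX setenv(); the helper name select_locality_scheduler is hypothetical and not part of StarPU, and it deliberately accepts only the two policies named above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* for setenv() */
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a StarPU function): request one of the two
 * locality-oriented policies through STARPU_SCHED. It is meant to be
 * called before starpu_init().
 * Returns 0 on success, -1 if the name is not "dmdar" or "lws", or if
 * setenv() fails. */
int select_locality_scheduler(const char *name)
{
	if (name == NULL)
		return -1;
	if (strcmp(name, "dmdar") != 0 && strcmp(name, "lws") != 0)
		return -1;
	return setenv("STARPU_SCHED", name, 1) == 0 ? 0 : -1;
}
```

The policy can alternatively be requested through the sched_policy_name field of struct starpu_conf passed to starpu_init(), which avoids touching the environment.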

Disk functions

There are various ways to operate a disk memory node, described by the structure starpu_disk_ops. For instance, the variable starpu_disk_unistd_ops uses read/write functions.

All structures are in Out Of Core.