Adding a new function to Simple Parallel R INTerface (SPRINT)
=============================================================

SPRINT is a framework for making parallel algorithms available to R users.
It is designed to be relatively easy to extend.

SPRINT is made up of two components: the R<->SPRINT interface and the compute
cluster itself. The two communicate via MPI.

     Processor 0                Processors 1-n
+-------------------+     +-------------------------+
| +---------------+ |     |    +---------------+    |
| |       R       | |     |    |       R       |    |
| |               | |     |    |               |    |
| +---------------+ |     |    +---------------+    |
| | SPRINT-R stub | |     |  | Wait for cmd code |  |
| +-----------------------------------------------+ |
| |                    SPRINT                     | |
| |     +-----------------------------------+     | |
| |     |               ptest               |     | |
| |     |               pcor                |     | |
| |     +-----------------------------------+     | |
| +-----------------------------------------------+ |
+-------------------+     +-------------------------+

When the user starts R with SPRINT loaded, the compute cluster goes into
a wait state. When R reaches a function that is in SPRINT, the SPRINT-R stub
sends a command to SPRINT via MPI. The MPI message contains an enumeration
code that represents a function, which causes SPRINT to wake up and execute
that function.

The idea behind SPRINT is to allow parallel processing of data from within R
without being constrained by R. Functions created in SPRINT should have
interfaces similar to those of their serial R equivalents.

Data required by the parallelised function is also passed via MPI. The creator
of the function is responsible for creating that data flow.

Afterwards, the new function is also responsible for passing data back to R.
This does not have to be the result of the processing; it could be a file
handle or a simple error code. However, bear in mind that the parallelised
function should match the functionality of the original R function as
closely as possible.

This document develops an example SPRINT function, which requires:
 o creating an R stub (so that R can call the function)
 o creating the C equivalent of the R stub
 o writing the function to run in the compute cluster
 o finally, connecting the different parts

SPRINT is organised into the following directory structure:

 / -- root dir        Contains configure scripts, etc.
   |
   |- exec
   |- inst
   |- man
   |- po              Contains translations (not used).
   |- R               Contains the R stubs (see 1. below).
   |- src             Function header files and sprint itself. The source
      |               code for the R<->sprint interface (also called
      |               sprint!) lives here.
      |- sprint       The source code for the compute farm executable, sprint.
      |- algorithms   Where you place your new functions, one directory each.
         |- pcor
            |- implementation
            |- interface
         |- ptest
            |- implementation
            |- interface
         |- pFunction
            |- implementation
            |- interface
   |- tests

1. Create a Stub in R
---------------------

Add a file in the "R" directory to perform any appropriate actions (in the R
domain), then call the underlying C. Appropriate actions may include parameter
sanity checking and other housekeeping.

We start off with a single R function that calls the backend function
without any parameters.

--- R/pexample.R --------------------------------------------------------------
pexample <- function()
{
    # No R-side checks are needed here; call the C interface directly.
    .Call("pexample")
}
-------------------------------------------------------------------------------
	
2. Add the interface function
-----------------------------

These are the C functions which are called by the R stubs. Like the R stubs
which they mirror, they are likely to perform argument checking and general
housekeeping. For each new function, create a directory in the "algorithms"
directory and add the directories "implementation" and "interface" to it.
The interface function lives in the "interface" directory. Most interface
functions won't need much editing apart from the commandCode.

--- src/algorithms/pexample/interface/pexample.c ------------------------------
#include <stdio.h>
#include <mpi.h>
#include <Rdefines.h>
#include "../sprint.h"
#include "pexample.h"

// note that all data from R is of type SEXP
SEXP pexample()
{
    SEXP result;
    int response;
    enum commandCodes commandCode;
    int message = 10;

    MPI_Initialized(&response);
    if (response) {
        DEBUG("MPI is init'ed in pexample\n");
    } else {
        DEBUG("MPI is NOT init'ed in pexample\n");
    }

    // broadcast command to other processors
    commandCode = EXAMPLE;
    MPI_Bcast(&commandCode, 1, MPI_INT, 0, MPI_COMM_WORLD);

    // run the function on this processor as well
    response = example(1, message);

    // return the C result to R as a numeric vector of length one
    PROTECT(result = NEW_NUMERIC(1));
    REAL(result)[0] = (double) response;
    UNPROTECT(1);

    return result;
}
-------------------------------------------------------------------------------

--- pexample.h ----------------------------------------------------------------
#ifndef _INTERFACE_PEXAMPLE_H
#define _INTERFACE_PEXAMPLE_H

// anything you want

#endif
-------------------------------------------------------------------------------

3. Implement the main function
------------------------------

These functions, again written in C, implement the actual parallel algorithm
you are interested in performing. They run in the compute cluster, make use
of MPI for communication and perform some useful work.

This file can be saved in the "implementation" directory for your function.
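
--- src/algorithms/pexample/implementation/example.c --------------------------
#include <stdarg.h>
#include <stdio.h>
#include "mpi.h"

// LOG is assumed to be provided by a SPRINT header that is not shown here.

// The signature matches the variadic declaration in functions.c. n is taken
// to be the number of optional arguments that follow.
int example(int n, ...)
{
    int result = 1;
    int pool, rank;
    int message = 0;
    va_list ap;

    // read the message only if the caller passed one
    if (n > 0) {
        va_start(ap, n);
        message = va_arg(ap, int);
        va_end(ap);
    }

    MPI_Comm_size(MPI_COMM_WORLD, &pool);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    LOG(stdout, "Process %i of %i says '%d'\n", rank, pool, message);

    result = 0;

    return result;
}
-------------------------------------------------------------------------------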

4. Connecting the R/C stubs to the compute cluster
--------------------------------------------------

Declare the new functions and include them in the command code list.

--- src/functions.h -----------------------------------------------------------
/** Lists all the functions available, ensure that TERMINATE is first and
 *  LAST is last
 **/
enum commandCodes {TERMINATE = 0, TEST, EXAMPLE, LAST};
-------------------------------------------------------------------------------

Then add them to the look-up table in common/functions.c. These are extern
functions with a variable number of arguments. The functions must be added
to the look-up table in the *same order* as the enumeration.

--- src/common/functions.c ----------------------------------------------------
#include <stdio.h>
#include "../functions.h"

/**
 * Declare the various command functions as external
 **/
extern int test(int n,...);
extern int example(int n,...);

/**
 * This is a dummy operation which can be used where a command code exists
 * but does not represent a useful function.
 **/
int voidCommand()
{
    printf("Void command called, I would not expect this.\n");

    return 1;
}

/**
 * This array of function pointers ties up with the commandCode enumeration.
 **/
commandFunction commandLUT[] = {voidCommand,   /* TERMINATE */
                                test,          /* TEST */
                                example,       /* EXAMPLE */
                                voidCommand};  /* LAST */
-------------------------------------------------------------------------------

Update the NAMESPACE file so R can find the new functions:

--- NAMESPACE -----------------------------------------------------------------
# Namespace file for sprint

useDynLib(sprint)

export(ptest)
export(pexample)
-------------------------------------------------------------------------------

Finally, add the new object files to the Makefile, as is done for the
existing functions.