Writing vectorial operation such as v = L.applyTo(u) carries an abstraction penalty due to the creation and copy of the returned array. Optimization of the above expression would cause it to be written as L.applyTo(u,v). This would move towards a functional rather than object-oriented style of programming and would decrease the readability of the code since it is not immediately obvious which of u and v are the input and output array.
An implementation of a "disposable" array is proposed which reduces the abstraction penalty by eliminating unnecessary copying of the returned array. This allows vectorial operations and more generally functions returning an array to be written in the more natural style.
The application of a tridiagonal operator to an array is currently implemented as shown in the following code.
Array TridiagonalOperator::applyTo(const Array& v) const { // preconditions - omitted for brevity // 1. allocate space for the result array int N = size(); Array result(N); // matricial product - omitted for brevity // boundary conditions are applied - omitted for brevity // 2. the array is returned by copy return result; }
The above method is then used, i.e., in an explicit finite-difference scheme as following:
// an initial condition is chosen Array v = ...; // the scheme is used for (int i=0; i<timeSteps; i++) { // the operator is applied // 3. the returned array is copied back into the input v = L.applyTo(v); }
At each step of the for loop, 1, 2 and 3 are executed. This results in:
Before implementing the proposal in this QuEP, the above can be already improved by making the result array a mutable data member of the operator and have applyTo return it by const reference. This would result in:
A disposable or temporary array class can be implemented which inherits from Array and redefines its copy semantics. A copy constructor and assignment operator would be added to Array which copy a disposable array with the same semantics, namely, the inner pointer to the allocated storage space and the corresponding informations are simply swapped between the array and the disposable array. After such operation the former will contain the data previously stored in the disposable array, while the latter will contain the data previously stored in the array which will be no longer used. This inexpensive pointer-copy operation will hereafter be referred to as shallow copy, as opposed to the usual deep copy in which each element of the source array is copied to the corresponding element of the target.
A draft implementation of such TempArray can be written as follows.
class Array { // This is needed for TempArray to access the Array data members. // Alternatively, they could be declared protected instead // of private. friend class TempArray; public: // constructors Array(const Array& a) { /* deep copy as usual */ } Array(const TempArray& a) { // shallow copy std::swap(pointer_,a.pointer_); std::swap(n_,a.n_); std::swap(bufferSize)_,a.bufferSize_); } // assignment operators Array& operator=(const Array& a) { /* deep copy as usual */ } Array& operator=(const TempArray& a) { /* shallow copy as above */ } // etc. }; class TempArray : public Array { public: // constructors TempArray(const TempArray& a) { /* shallow copy */ } TempArray(const Array& a) { /* deep copy - a is not disposable! */ } // assignment operators TempArray& operator=(const TempArray& a) { /* shallow copy */ } TempArray& operator=(const Array& a) { /* deep copy */ } // allow for resizing after having been invalidated by a copy void resize(size_t n); };
The tridiagonal operator application could now be written as:
TempArray TridiagonalOperator::applyTo(const Array& v) const { // 1. allocate space for the result array int N = size(); TempArray result(N); // matricial product - omitted for brevity // boundary conditions are applied - omitted for brevity // 2. the disposable array is returned by (shallow) copy return result; } // an initial condition is chosen Array v = ...; // the scheme is used for (int i=0; i<timeSteps; i++) { // 3. the returned array is (shallowly) copied back into the input v = L.applyTo(v); }
This would result in:
Moreover, the use of the disposable array can be combined with the previously introduced technique of storing the result array as a mutable data member. The implementation of applyTo would be in this case:
TempArray& TridiagonalOperator::applyTo(const Array& v) const { // 1. make sure that the disposable array is of the right size // since it could have been invalidated by a copy int N = size(); result_.resize(N); // matricial product - omitted for brevity // boundary conditions are applied - omitted for brevity // 2. the disposable array is returned by reference return result; } // an initial condition is chosen Array v = ...; // the scheme is used for (int i=0; i<timeSteps; i++) { // 3. the returned array is (shallowly) copied back into the input v = L.applyTo(v); }
This would result in:
The above seems perfectly fine on paper even if not really cuspy. I implemented it and had it working in my previous life as a physicists. The only care to be taken while coding is to use disposable arrays as such---i.e., as temporary objects to be used for storing values to be immediately passed to a "real" array and not as persistent storage.
However, unlikely as it seems to me, I am not entirely sure that it could not be broken by some optimizing compiler trying to do things too smart for its own good. Does anyone know something of the inner working of such beasts? Could (s)he give any informed advice of this?
Finally, I would not advise to run around waving one's hands in the air and shouting in joy at the prospect of huge performance gains. The disposable array trick merely takes away some abstraction penalty from functions returning an array. It won't make the implementation of the corresponding algorithms any faster, and therefore the performance gain will be small if returning the array by value was not a large part of the total function time.
A disposable array implementation was proposed which can in principle remove the abstraction penalty imposed by returning an array from a function.
This can allow one to give such function a more natural interface than the one requesting the output array to be previously allocated and passed to the function.
Feedback on the above proposal should be posted to the QuantLib-dev mailing list.