MADNESS 0.10.1
madness::SystolicMatrixAlgorithm< T > Class Template Reference (abstract)

Base class for parallel algorithms that employ a systolic loop to generate all row pairs in parallel.

#include <systolic.h>


Public Member Functions

 SystolicMatrixAlgorithm (DistributedMatrix< T > &A, int tag, int nthread=ThreadPool::size()+1)
 A must be a column distributed matrix with an even column tile >= 2.
 
virtual ~SystolicMatrixAlgorithm ()
 
virtual bool converged (const TaskThreadEnv &env) const =0
 Invoked simultaneously by all threads after each sweep to test for convergence.
 
virtual void end_iteration_hook (const TaskThreadEnv &env)
 Invoked by all threads at the end of each iteration before convergence test.
 
int64_t get_coldim () const
 Returns length of column.
 
ProcessID get_rank () const
 Returns rank of this process in the world.
 
int64_t get_rowdim () const
 Returns length of row.
 
World & get_world () const
 Returns a reference to the world.
 
virtual void kernel (int i, int j, T *rowi, T *rowj)=0
 Threadsafe routine to apply the operation to rows i and j of the matrix.
 
void run (World &world, const TaskThreadEnv &env)
 Invoked by the task queue to run the algorithm with multiple threads.
 
void solve_sequential ()
 Invoked by the user to run the algorithm with one thread mostly for debugging.
 
virtual void start_iteration_hook (const TaskThreadEnv &env)
 Invoked by all threads at the start of each iteration.
 
- Public Member Functions inherited from madness::TaskInterface
 TaskInterface (const TaskAttributes &attr)
 Create a new task with zero dependencies and given attributes.
 
 TaskInterface (int ndepend, const char *caller, const TaskAttributes attr=TaskAttributes())
 
 TaskInterface (int ndepend=0, const TaskAttributes attr=TaskAttributes())
 Create a new task with ndepend dependencies (default 0) and given attributes.
 
virtual ~TaskInterface ()
 
World & get_world () const
 
virtual void run (World &)
 Runs a single-threaded task ... derived classes must implement this.
 
- Public Member Functions inherited from madness::PoolTaskInterface
 PoolTaskInterface ()
 Default constructor.
 
 PoolTaskInterface (const TaskAttributes &attr)
 
virtual ~PoolTaskInterface ()=default
 Destructor.
 
void execute ()
 
void set_nthread (int nthread)
 Call this to reset the number of threads before the task is submitted.
 
- Public Member Functions inherited from madness::TaskAttributes
 TaskAttributes (const TaskAttributes &attr)
 Copy constructor.
 
 TaskAttributes (unsigned long flags=0)
 Sets the attributes to the desired values.
 
virtual ~TaskAttributes ()
 
int get_nthread () const
 Get the number of threads.
 
bool is_generator () const
 Test if the generator attribute is true.
 
bool is_high_priority () const
 Test if the high priority attribute is true.
 
bool is_stealable () const
 Test if the stealable attribute is true.
 
template<typename Archive >
void serialize (Archive &ar)
 Serializes the attributes for I/O.
 
TaskAttributes & set_generator (bool generator_hint)
 Sets the generator attribute.
 
TaskAttributes & set_highpriority (bool hipri)
 Sets the high priority attribute.
 
void set_nthread (int nthread)
 Set the number of threads.
 
TaskAttributes & set_stealable (bool stealable)
 Sets the stealable attribute.
 
- Public Member Functions inherited from madness::DependencyInterface
 DependencyInterface (int ndep, const char *caller)
 
 DependencyInterface (int ndep=0)
 
virtual ~DependencyInterface ()
 Destructor.
 
void dec ()
 Decrement the number of dependencies and invoke the callback if ndepend==0.
 
void dec_debug (const char *caller)
 
void inc ()
 Increment the number of dependencies.
 
void inc_debug (const char *caller)
 Same as inc(), but keeps track of the caller; calling dec_debug() will signal an error if no matching inc_debug() has been invoked.
 
int ndep () const
 Returns the number of unsatisfied dependencies.
 
void notify ()
 Invoked by callbacks to notify of dependencies being satisfied.
 
void notify_debug (const char *caller)
 Overload of CallbackInterface::notify_debug(), updates dec()
 
bool probe () const
 Returns true if ndepend == 0 (no unsatisfied dependencies).
 
void register_callback (CallbackInterface *callback)
 Registers a callback that will be executed when ndepend==0; immediately invoked if ndepend==0.
 
void register_final_callback (CallbackInterface *callback)
 Registers the final callback to be executed when ndepend==0; immediately invoked if ndepend==0.
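The counting behaviour described for inc(), dec(), probe() and register_callback() can be illustrated with a small stand-alone model. This is a single-threaded sketch, not the MADNESS implementation: the class name `DependencyCounter` is hypothetical, and it stores `std::function` objects where the real interface takes `CallbackInterface *` and must be thread-safe.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical toy model of a dependency counter (not madness::DependencyInterface).
class DependencyCounter {
public:
    explicit DependencyCounter(int ndep = 0) : ndepend_(ndep) { assert(ndep >= 0); }

    // One more unsatisfied dependency.
    void inc() { ++ndepend_; }

    // One dependency satisfied; run the pending callbacks when the count hits zero.
    void dec() {
        assert(ndepend_ > 0);
        if (--ndepend_ == 0) fire();
    }

    bool probe() const { return ndepend_ == 0; }
    int ndep() const { return ndepend_; }

    // Invoked immediately if nothing is outstanding, otherwise deferred.
    void register_callback(std::function<void()> cb) {
        if (probe()) cb();
        else callbacks_.push_back(std::move(cb));
    }

private:
    void fire() {
        std::vector<std::function<void()>> ready;
        ready.swap(callbacks_);  // leave the list empty before invoking
        for (auto& cb : ready) cb();
    }

    int ndepend_;
    std::vector<std::function<void()>> callbacks_;
};
```

In this model a callback registered while two dependencies are outstanding runs on the second dec(), and a callback registered afterwards runs at once.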
 
- Public Member Functions inherited from madness::CallbackInterface
virtual ~CallbackInterface ()
 

Private Member Functions

void cycle ()
 Cycles data around the loop ... only one thread should invoke this.
 
virtual void get_id (std::pair< void *, unsigned short > &id) const
 Get the task id.
 
void iteration (const int nthread)
 
void unshuffle ()
 Call this after iterating to restore correct order of rows in original matrix.
 

Private Attributes

DistributedMatrix< T > & A
 
const int64_t coldim
 A(coldim,rowdim)
 
std::vector< T * > iptr
 
std::vector< T * > jptr
 Indirection for implementing cyclic buffer !! SHOULD BE VOLATILE ?????
 
std::vector< int64_t > map
 Used to keep track of actual row indices.
 
const int64_t nlocal
 No. of local pairs.
 
const int64_t nproc
 No. of processes with rows of the matrix (not size of world)
 
const ProcessID rank
 Rank of current process.
 
const int64_t rowdim
 A(coldim,rowdim)
 
const int tag
 MPI tag to be used for messages.
 

Additional Inherited Members

- Static Public Member Functions inherited from madness::TaskAttributes
static TaskAttributes generator ()
 
static TaskAttributes hipri ()
 
static TaskAttributes multi_threaded (int nthread)
 
- Static Public Attributes inherited from madness::TaskInterface
static bool debug = false
 
- Static Public Attributes inherited from madness::TaskAttributes
static const unsigned long GENERATOR = 1ul<<8
 Mask for generator bit.
 
static const unsigned long HIGHPRIORITY = GENERATOR<<2
 Mask for priority bit.
 
static const unsigned long NTHREAD = 0xff
 Mask for nthread byte.
 
static const unsigned long STEALABLE = GENERATOR<<1
 Mask for stealable bit.
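The four masks above imply a simple bit layout: the low byte holds the thread count, and bits 8, 9 and 10 hold the generator, stealable and high-priority flags. The following stand-alone sketch mirrors the constants in a hypothetical `sketch` namespace and packs them by hand; the helper functions `pack`, `nthread_of` and `is_stealable` are illustrative assumptions, not part of TaskAttributes.

```cpp
#include <cassert>

namespace sketch {  // hypothetical mirror of the listed mask values

constexpr unsigned long NTHREAD      = 0xff;            // low byte: thread count
constexpr unsigned long GENERATOR    = 1ul << 8;        // bit 8
constexpr unsigned long STEALABLE    = GENERATOR << 1;  // bit 9
constexpr unsigned long HIGHPRIORITY = GENERATOR << 2;  // bit 10

static_assert(GENERATOR == 0x100ul, "generator flag is bit 8");
static_assert(STEALABLE == 0x200ul, "stealable flag is bit 9");
static_assert(HIGHPRIORITY == 0x400ul, "priority flag is bit 10");

// Pack a thread count and three boolean attributes into one flags word.
inline unsigned long pack(int nthread, bool generator, bool stealable, bool hipri) {
    assert(nthread >= 0 && nthread <= 255);  // must fit in the NTHREAD byte
    return (static_cast<unsigned long>(nthread) & NTHREAD)
         | (generator ? GENERATOR : 0ul)
         | (stealable ? STEALABLE : 0ul)
         | (hipri ? HIGHPRIORITY : 0ul);
}

inline int nthread_of(unsigned long flags) { return static_cast<int>(flags & NTHREAD); }
inline bool is_stealable(unsigned long flags) { return (flags & STEALABLE) != 0; }

}  // namespace sketch
```

Because the thread count occupies only one byte in this layout, values above 255 cannot be represented, which is why the sketch asserts on the range.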
 
- Protected Member Functions inherited from madness::TaskInterface
virtual void run (const TaskThreadEnv &env)
 Override this method to implement a multi-threaded task.
 
- Protected Member Functions inherited from madness::CallbackInterface
virtual void notify_debug_impl (const char *caller)
 
- Static Protected Member Functions inherited from madness::PoolTaskInterface
template<typename fnobjT >
static std::enable_if<!(detail::function_traits< fnobjT >::value||detail::memfunc_traits< fnobjT >::value)>::type make_id (std::pair< void *, unsigned short > &id, const fnobjT &)
 
template<typename fnT >
static std::enable_if< detail::function_traits< fnT >::value||detail::memfunc_traits< fnT >::value >::type make_id (std::pair< void *, unsigned short > &id, fnT fn)
 

Detailed Description

template<typename T>
class madness::SystolicMatrixAlgorithm< T >

Base class for parallel algorithms that employ a systolic loop to generate all row pairs in parallel.
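To make "systolic loop" concrete, the sketch below builds a serial round-robin schedule: the rows are laid out in a line, entries at mirrored positions are paired, and all but the first entry rotate by one slot between steps. After n-1 steps every pair of the n rows has met exactly once. The function name `systolic_pairs` is hypothetical, and this only models the ordering idea; the real class spreads the rows across processes and moves them with MPI messages, and its exact rotation order may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using PairList = std::vector<std::pair<int, int>>;

// Serial model of a round-robin pairing schedule for n rows (n even).
// Returns, for each of the n-1 steps, the n/2 disjoint pairs formed at that step.
std::vector<PairList> systolic_pairs(int n) {
    assert(n >= 2 && n % 2 == 0);
    std::vector<int> line(n);
    for (int k = 0; k < n; ++k) line[k] = k;

    std::vector<PairList> steps;
    for (int step = 0; step < n - 1; ++step) {
        PairList pairs;
        for (int k = 0; k < n / 2; ++k) {
            const int a = line[k];
            const int b = line[n - 1 - k];  // mirrored position
            pairs.emplace_back(std::min(a, b), std::max(a, b));
        }
        steps.push_back(pairs);
        // Keep line[0] fixed; rotate the remaining n-1 entries right by one slot.
        std::rotate(line.begin() + 1, line.end() - 1, line.end());
    }
    return steps;
}
```

The pairs within one step are disjoint, which is what lets them be processed concurrently.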

Constructor & Destructor Documentation

◆ SystolicMatrixAlgorithm()

template<typename T >
madness::SystolicMatrixAlgorithm< T >::SystolicMatrixAlgorithm ( DistributedMatrix< T > &  A,
int  tag,
int  nthread = ThreadPool::size()+1 
)
inline

A must be a column distributed matrix with an even column tile >= 2.

It is assumed that it is the main thread invoking this.

Parameters
[in,out] A  The matrix on which the algorithm is performed and which is modified in-place
[in] tag  The MPI tag used for communication (obtain from world.mpi.comm().unique_tag() )
[in] nthread  The number of local threads to use (default is the main thread plus all threads in the pool)

References madness::SystolicMatrixAlgorithm< T >::coldim, madness::SystolicMatrixAlgorithm< T >::iptr, madness::SystolicMatrixAlgorithm< T >::jptr, lo, MADNESS_ASSERT, madness::SystolicMatrixAlgorithm< T >::map, madness::SystolicMatrixAlgorithm< T >::nlocal, madness::SystolicMatrixAlgorithm< T >::nproc, p(), madness::SystolicMatrixAlgorithm< T >::rank, and madness::TaskAttributes::set_nthread().

◆ ~SystolicMatrixAlgorithm()

template<typename T >
virtual madness::SystolicMatrixAlgorithm< T >::~SystolicMatrixAlgorithm ( )
inline virtual

Member Function Documentation

◆ converged()

template<typename T >
virtual bool madness::SystolicMatrixAlgorithm< T >::converged ( const TaskThreadEnv &  env) const
pure virtual

Invoked simultaneously by all threads after each sweep to test for convergence.

There is a thread barrier before and after the invocation of this routine.

Parameters
[in] env  The madness thread environment in case synchronization between threads is needed during computation of the convergence condition.

Implemented in madness::SystolicFixOrbitalOrders, madness::SystolicPMOrbitalLocalize, and TestSystolicMatrixAlgorithm< T >.

Referenced by madness::SystolicMatrixAlgorithm< T >::run().

◆ cycle()

template<typename T >
void madness::SystolicMatrixAlgorithm< T >::cycle ( )
inline private

◆ end_iteration_hook()

template<typename T >
virtual void madness::SystolicMatrixAlgorithm< T >::end_iteration_hook ( const TaskThreadEnv &  env)
inline virtual

Invoked by all threads at the end of each iteration before convergence test.

There is a thread barrier before and after the invocation of this routine. Note that the converged() method is const whereas this can modify the class.

Parameters
[in] env  The madness thread environment in case synchronization between threads is needed during startup.

Reimplemented in madness::SystolicFixOrbitalOrders and madness::SystolicPMOrbitalLocalize.

Referenced by madness::SystolicMatrixAlgorithm< T >::iteration().

◆ get_coldim()

template<typename T >
int64_t madness::SystolicMatrixAlgorithm< T >::get_coldim ( ) const
inline

Returns length of column.

References madness::SystolicMatrixAlgorithm< T >::coldim.

◆ get_id()

template<typename T >
virtual void madness::SystolicMatrixAlgorithm< T >::get_id ( std::pair< void *, unsigned short > &  id) const
inline private virtual

Get the task id.

Parameters
id  The id to set for this task

Reimplemented from madness::PoolTaskInterface.

References madness::PoolTaskInterface::make_id().

◆ get_rank()

template<typename T >
ProcessID madness::SystolicMatrixAlgorithm< T >::get_rank ( ) const
inline

Returns rank of this process in the world.

References madness::SystolicMatrixAlgorithm< T >::rank.

◆ get_rowdim()

template<typename T >
int64_t madness::SystolicMatrixAlgorithm< T >::get_rowdim ( ) const
inline

Returns length of row.

References madness::SystolicMatrixAlgorithm< T >::rowdim.

◆ get_world()

template<typename T >
World & madness::SystolicMatrixAlgorithm< T >::get_world ( ) const
inline

◆ iteration()

template<typename T >
void madness::SystolicMatrixAlgorithm< T >::iteration ( const int  nthread)
inline private

◆ kernel()

template<typename T >
virtual void madness::SystolicMatrixAlgorithm< T >::kernel ( int  i,
int  j,
T *  rowi,
T *  rowj 
)
pure virtual

Threadsafe routine to apply the operation to rows i and j of the matrix.

Parameters
[in] i  First row index in the matrix
[in] j  Second row index in the matrix
[in] rowi  Pointer to row i of the matrix (to be modified by kernel in-place)
[in] rowj  Pointer to row j of the matrix (to be modified by kernel in-place)

Implemented in TestSystolicMatrixAlgorithm< T >.

Referenced by madness::SystolicMatrixAlgorithm< T >::iteration().
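As an illustration of what a kernel with this signature might do, the sketch below applies a one-sided Jacobi rotation that makes rows i and j mutually orthogonal in place. The choice of operation is an assumption for the example; the source does not say what any particular kernel computes. The struct `RowOrthogonalizer` and its `rowdim` member are hypothetical stand-ins (in the real class the row length would come from get_rowdim()). The routine touches only its two rows and no shared state, which is the property that makes such a kernel safe to call from several threads on disjoint pairs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-alone kernel: orthogonalize two rows by a plane rotation.
struct RowOrthogonalizer {
    std::int64_t rowdim;  // row length (assumed to match get_rowdim())

    void kernel(int i, int j, double* rowi, double* rowj) const {
        assert(i != j && rowi != rowj);

        // Squared norms and inner product of the two rows.
        double aii = 0.0, ajj = 0.0, aij = 0.0;
        for (std::int64_t k = 0; k < rowdim; ++k) {
            aii += rowi[k] * rowi[k];
            ajj += rowj[k] * rowj[k];
            aij += rowi[k] * rowj[k];
        }
        // Already orthogonal to working precision: nothing to do.
        if (std::abs(aij) <= 1e-14 * std::sqrt(aii * ajj)) return;

        // Smaller root of t^2 + 2*zeta*t - 1 = 0 gives the rotation tangent.
        const double zeta = (ajj - aii) / (2.0 * aij);
        const double sign = (zeta >= 0.0) ? 1.0 : -1.0;
        const double t = sign / (std::abs(zeta) + std::sqrt(1.0 + zeta * zeta));
        const double c = 1.0 / std::sqrt(1.0 + t * t);
        const double s = c * t;

        // Rotate both rows in place.
        for (std::int64_t k = 0; k < rowdim; ++k) {
            const double a = rowi[k];
            const double b = rowj[k];
            rowi[k] = c * a - s * b;
            rowj[k] = s * a + c * b;
        }
    }
};
```

The rotation preserves the sum of the two squared row norms, so repeated sweeps over all pairs only redistribute weight between rows.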

◆ run()

template<typename T >
void madness::SystolicMatrixAlgorithm< T >::run ( World &  world,
const TaskThreadEnv &  env 
)
inline virtual

Invoked by the task queue to run the algorithm with multiple threads.

This is a collective call ... all processes in world should submit this task.

Reimplemented from madness::TaskInterface.

References madness::SystolicMatrixAlgorithm< T >::converged(), madness::SystolicMatrixAlgorithm< T >::iteration(), madness::TaskThreadEnv::nthread(), and madness::SystolicMatrixAlgorithm< T >::unshuffle().

Referenced by madness::SystolicMatrixAlgorithm< T >::solve_sequential().
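The call relationships listed on this page (run() references iteration() and converged(); iteration() references the two hooks and kernel()) suggest a sweep-until-converged driver. The following single-threaded toy model shows that ordering only. The classes `ToySystolic` and `CountingSweep` are hypothetical, the plain nested loop over pairs replaces the systolic schedule, and thread barriers, MPI traffic and the final unshuffle() are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-threaded stand-in for the driver (not the MADNESS class).
class ToySystolic {
public:
    ToySystolic(int nrow, int rowdim)
        : nrow_(nrow), rowdim_(rowdim),
          data_(static_cast<std::size_t>(nrow) * rowdim, 0.0) {
        assert(nrow >= 2 && rowdim >= 1);
    }
    virtual ~ToySystolic() = default;

    virtual void start_iteration_hook() {}
    virtual void end_iteration_hook() {}
    virtual void kernel(int i, int j, double* rowi, double* rowj) = 0;
    virtual bool converged() const = 0;

    // Repeat full sweeps until the subclass reports convergence.
    void solve_sequential() {
        do {
            iteration();
        } while (!converged());
    }

protected:
    double* row(int i) { return data_.data() + static_cast<std::size_t>(i) * rowdim_; }

private:
    // One sweep: start hook, kernel on every row pair, end hook.
    void iteration() {
        start_iteration_hook();
        for (int i = 0; i < nrow_; ++i)
            for (int j = i + 1; j < nrow_; ++j)
                kernel(i, j, row(i), row(j));
        end_iteration_hook();
    }

    int nrow_;
    int rowdim_;
    std::vector<double> data_;
};

// Hypothetical subclass: counts calls and reports convergence after two sweeps.
class CountingSweep : public ToySystolic {
public:
    using ToySystolic::ToySystolic;

    int kernel_calls = 0;
    int sweeps_started = 0;
    int sweeps_finished = 0;

    void start_iteration_hook() override { ++sweeps_started; }
    void end_iteration_hook() override { ++sweeps_finished; }
    void kernel(int, int, double* rowi, double* rowj) override {
        assert(rowi != rowj);
        ++kernel_calls;
    }
    bool converged() const override { return sweeps_finished >= 2; }
};
```

Note how the state that converged() reads is updated in end_iteration_hook(), since converged() itself is const.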

◆ solve_sequential()

template<typename T >
void madness::SystolicMatrixAlgorithm< T >::solve_sequential ( )
inline

Invoked by the user to run the algorithm with one thread mostly for debugging.

This is a collective call ... all processes in world should call this routine.

References madness::SystolicMatrixAlgorithm< T >::run().

◆ start_iteration_hook()

template<typename T >
virtual void madness::SystolicMatrixAlgorithm< T >::start_iteration_hook ( const TaskThreadEnv &  env)
inline virtual

Invoked by all threads at the start of each iteration.

There is a thread barrier before and after the invocation of this routine.

Parameters
[in] env  The madness thread environment in case synchronization between threads is needed during startup.

Reimplemented in madness::SystolicFixOrbitalOrders, madness::SystolicPMOrbitalLocalize, and TestSystolicMatrixAlgorithm< T >.

Referenced by madness::SystolicMatrixAlgorithm< T >::iteration().

◆ unshuffle()

template<typename T >
void madness::SystolicMatrixAlgorithm< T >::unshuffle ( )
inline private

Call this after iterating to restore correct order of rows in original matrix.

At the end of each iteration the matrix rows are logically back in their correct order. However, due to indirection to reduce data motion, if the local column dimension is not a factor of the number of cycles the underlying data may be in a different order. This restores sanity.

Only one thread should invoke this routine.

References madness::SystolicMatrixAlgorithm< T >::coldim, madness::BaseTensor::dims(), madness::SystolicMatrixAlgorithm< T >::iptr, madness::SystolicMatrixAlgorithm< T >::jptr, madness::SystolicMatrixAlgorithm< T >::nlocal, madness::SystolicMatrixAlgorithm< T >::nproc, madness::Tensor< T >::ptr(), madness::SystolicMatrixAlgorithm< T >::rank, madness::SystolicMatrixAlgorithm< T >::rowdim, madness::BaseTensor::size(), and T().

Referenced by madness::SystolicMatrixAlgorithm< T >::run().
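The idea of moving pointers instead of data, and then restoring the physical order at the end, can be sketched in isolation. In the model below a pointer table records where each logical row currently lives; rotating the table is cheap, and a final pass copies the rows back so that logical and physical order agree again. The function name `unshuffle_rows` and the flat `storage` layout are assumptions for illustration; the real routine works on the local tile through iptr and jptr.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// ptr[k] points at the storage that currently holds logical row k.
// Copies the rows so logical row k starts at storage[k * rowdim] again,
// then resets the pointer table to that identity layout.
void unshuffle_rows(std::vector<double>& storage,
                    std::vector<double*>& ptr,
                    std::size_t rowdim) {
    const std::size_t nrow = ptr.size();
    assert(storage.size() == nrow * rowdim);

    // Gather rows in logical order into a scratch buffer.
    std::vector<double> ordered(storage.size());
    for (std::size_t k = 0; k < nrow; ++k)
        std::copy(ptr[k], ptr[k] + rowdim, ordered.data() + k * rowdim);

    // Copy back in place so the storage buffer keeps its address.
    std::copy(ordered.begin(), ordered.end(), storage.begin());
    for (std::size_t k = 0; k < nrow; ++k)
        ptr[k] = storage.data() + k * rowdim;
}
```

A scratch copy keeps the sketch simple; an in-place permutation would avoid the extra buffer at the cost of more bookkeeping.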

Member Data Documentation

◆ A

template<typename T >
DistributedMatrix<T>& madness::SystolicMatrixAlgorithm< T >::A
private

◆ coldim

template<typename T >
const int64_t madness::SystolicMatrixAlgorithm< T >::coldim
private

◆ iptr

template<typename T >
std::vector<T*> madness::SystolicMatrixAlgorithm< T >::iptr
private

◆ jptr

template<typename T >
std::vector<T*> madness::SystolicMatrixAlgorithm< T >::jptr
private

◆ map

template<typename T >
std::vector<int64_t> madness::SystolicMatrixAlgorithm< T >::map
private

◆ nlocal

template<typename T >
const int64_t madness::SystolicMatrixAlgorithm< T >::nlocal
private

◆ nproc

template<typename T >
const int64_t madness::SystolicMatrixAlgorithm< T >::nproc
private

◆ rank

template<typename T >
const ProcessID madness::SystolicMatrixAlgorithm< T >::rank
private

◆ rowdim

template<typename T >
const int64_t madness::SystolicMatrixAlgorithm< T >::rowdim
private

◆ tag

template<typename T >
const int madness::SystolicMatrixAlgorithm< T >::tag
private

MPI tag to be used for messages.

Referenced by madness::SystolicMatrixAlgorithm< T >::cycle().


The documentation for this class was generated from the following file: