The problem is expressed as a collection of independent tasks (i.e. autonomous agents) operating on a large data set (i.e. a central repository),
The solution involves efficiently managing all accesses by the agents while maintaining data consistency, a task can be the execution of an agent, or the operation where each agent is updating the repository.
The problem is expressed in terms of searching a solution space to make a decision or find an optimal solution
The solution involves dividing the search space into sub-spaces and efficiently bound and prune the search spaces
The problem is expressed as a two-phase program with a phase-1 mapping a single function onto independent sets of data and a phase-2 reducing the result from each set of data into a final answer
The solution is composed of the efficient distribution and load balancing of the mapping phase, and efficiently collecting the results in the reducing phase
The problem is expressed in terms of a set of transformations to a stream of data
The solution involves a clean partitioning of data transformations in “filters” and data transportation through “pipes” such that data transformations on different sections of the stream can happen concurrently
The problem is expressed in terms of natural sub-problems where by optimally solving a sequence of local problems, one can arrive at a globally optimal solution
The solution involves exploiting this property to efficient construction and execution of local problems and assembling them to arrive at the global optimal solution
The problem is expressed as models to help understand noisy real-world measurements
The solution involves measuring real-world input and use statistical inference to fit the measurement to a model (represented by a graph denoting the conditional independence structure between random variables) to help understand it
The problem is expressed in terms of statistical sampling of solution space with experiments using different parameter settings
The solution involves the efficient construction and execution of experiments and efficient analysis of the results to arrive at a useful approximation of the solution
The problem is expressed as a set of tasks operating on data
The solution involves determining an efficient granularity of tasks to avoid excessive task management overhead and an efficient task abstraction to strike a balance on the amount of task interactions vs. total work done
The problem is expressed as a sequential traversal through a data structure in some sequence
The solution involves determining a parallelization of the problem that involve extra work in handling the parallelism
The problem is expressed as the challenge of updating a piece of memory (summary record) by one process/thread (experiment) at a time
The solution is composed of software locks and semaphore, or explicit hardware support to manage the exclusivity of accesses/updates
The problem is expressed as the challenge of efficiently executing instruction sequences in lock step across lanes (for multiple problems) to amortize instruction management overheads
The solution is composed of efficient control flow and data structure to allow instruction fetch and data fetch overheads to be shared across SIMD lanes.