- take the two sorted arrays A & B (assume both are of size n) and make partitions of size log n in one of them, let's say A
- Now considering there are n/logn processors, assign each partition to a processor. On each partition take the last element (l) and do a binary search to find a cut point in the other array B such that all elements in B are <= l. Cut points of two such partitions in A correspond to a partition in B which can then be sequentially merged by the processor.
Span is O(log n); Work is O(n); so parallelism is O(n/logn)
Detailed information here [1]