Sunday, May 18, 2014

Find running median from a stream of integers

Problem:
Given that integers are read from a data stream. Find median of elements read so for in efficient way.

Solution:
We can use a max heap on left side to represent elements that are less than effective median, and a min heap on right side to represent elements that are greater than effective median.

After processing an incoming element, the number of elements in heaps differ at most by 1 element (because we re-balance the heaps if needed). When both heaps contain same number of elements, we pick average of heaps root data as effective median. When the heaps are not balanced, we select effective median from the root of heap containing more elements.


NOTE: Finding running median from a stream of data is a tough problem, and finding an exact solution with memory constraints efficiently is probably impossible for the general case. On the other hand, if the data has some characteristics we can exploit, we can develop efficient specialized solutions. For example, if we know that the data is an integral type, then we can use counting sort, which can give you a constant memory constant time algorithm. Heap based solution is a more general solution because it can be used for other data types (doubles) as well. And finally, if the exact median is not required and an approximation is enough, you can just try to estimate a probability density function for the data and estimate median using that.

Complexity:
time - O(1)
space - O(n) or O(1)
Links and credits:
http://stackoverflow.com/questions/10657503/find-running-median-from-a-stream-of-integers

Sunday, May 4, 2014

Maximum and minimum of an array using minimum number of comparisons

Problem:
Find maximum and minimum of an array using minimum number of comparisons.

Solution:
http://www.geeksforgeeks.org/maximum-and-minimum-in-an-array/

1. Linear: 1 + 2*(n - 2)
2. Divide and conqure: 1.5*n - 2 (if n is a power of 2, otherwise more comparisions are required)
3. Compare in pairs: also ~1.5*n.


Complexity:
time - O(n)
space - O(1) or O(log n) (for method #2)
Links and credits:
http://www.geeksforgeeks.org/maximum-and-minimum-in-an-array/