Friday, 15 June 2012

Interviewstreet Median Challenge Part 2 Solution C++


InterviewStreet Median Challenge 


Problem Statement


The median of M numbers is defined as the middle number after sorting them in order, if M is odd or the average number of the middle 2 numbers (again after sorting) if M is even. Given an empty number list at first. Then you can add or remove some number from the list. For each add or remove operation, output the median of numbers in the list.
Example : For a set of m = 7 numbers, { 9, 2, 8, 4, 1, 3, 10 } the median is the third number in sorted set { 1, 2, 3, 4, 8, 9, 10 } which is 4. Similarly for set of m = 4, { 4, 1, 10, 3 }, the median is the average of second and the third element in the sorted set { 1, 3, 4, 10 } which is (3+4)/2 = 3.5  
Input: The first line contains n, the number of operations. The next n lines is either "a x" or "r x" which indicates the operation is add or remove.
Output: For each operation: output the median after doing the operation. If the operation is remove and the number x is not in the list, output "Wrong!" in a single line. If the result is an integer DO NOT output decimal point. And if the result is a double number, DO NOT output trailing 0s.
Other conditions:   0 < n <= 100,000 and for each "a x" or "r x" , x will fit in 32-bit integer.
Input:
7
r 1
a 2
a 3
a 2
r 2
r 3
r 2
Output:
Wrong!
2
2.5
2
2.5
2
Wrong!
Note: As evident from the last line of the input, if after remove operation the list becomes empty you have to print "Wrong!" ( quotes are for clarity ).


Algorithm

1) Use two multisets - first one will keep small n/2 numbers, minset and other one will keep next n/2 numbers, maxset.
2) For insert operation:
     if the number is greater than max of minset, add it to maxset
     else add it to minset
3) For remove operation:
     if the number is in minset, remove it
     else if number is in maxset, remove it
     else do nothing
4) After every insert/remove operation, adjust size of both sets i.e. make it n/2
     if(size-of-maxset > size-of-minset) remove first element from maxset and insert it in minset
     if(size-of-minset > size-of-maxset + 1) remove last element from minset and insert it in maxset
5) Calculate Median as..
     if(size of both minset and maxset is zero) No median
     else if (size of minset > size of maxset) max of minset is median
     else if (size of minset = size of maxset) avg of max of minset and min of maxset is median

If you have a different approach in mind, please share it in comments section for our readers. 


Solution

// All test cased passed.
#include<iostream>
#include<set>

using namespace std;

int main()
{
int n;
cin>>n;
multiset<int> s1,s2;
int b; char c;
int flag = 0;
long long int median=0;
multiset<int>::iterator it;
for(int i=0;i<n;i++) {
    flag = 0;
    cin >> c >> b;
    if(c=='a') {
       if(s1.size() == 0) s1.insert(b);
       else {
          it = s1.end(); it--;
          int max = *it;
          if(b > max) s2.insert(b);
          else s1.insert(b);
       }
    }
    else if (c == 'r') {
       it = s1.find(b);
       if(it!=s1.end() && *it == b) s1.erase(it);
       else {
          it = s2.find(b);
          if(it!=s2.end() && *it == b) s2.erase(it);
          else flag = 1;
       }
    }
 
    // adjust size(make it n/2) of both two sets
    if(s2.size() > s1.size()) {
       it = s2.begin();
       s1.insert(*it);
       s2.erase(it);  
    }
    if(s1.size() > s2.size()+1) {
       it = s1.end();it--;
       s2.insert(*it);
       s1.erase(it);
    }
  
    // finding median
    if(flag == 1 || (s1.size() == 0 && s2.size() == 0))
      cout << "Wrong!\n";
    else if(s1.size() != s2.size()) {
       if(s1.size() > 0) {
         it = s1.end(); it--;
         cout << *it << endl;
       }
       else if(s2.size() > 0) {
         // program might not enter here
         it = s2.begin();
         cout << *it << endl;
       }
    }
    else if(s1.size()== s2.size()){
       it = s1.end(); it--;
       median = *it;
       it = s2.begin();
       median += *it;
       if(median%2==0)
              printf("%.0lf\n", median/2.);
       else
              printf("%.1lf\n", median/2.);
    }
    else cout << "Wrong!\n";
}





Time Complexity:

Insertion : O(logn)
Deletion : O(logn)
Median : O(1)



View another algorithm for Median problem here:
justprogrammng.blogspot.in/2012/06/interviewstreet-median-challenge.html

10 comments:

  1. I tried this in java. Where the closest that I get for an multiset is PriorityQueue ( where the elements need to be in natural order and should allow duplicates), but that times out :(

    ReplyDelete
  2. @Jithu...yes the problem can be solved using two PriorityQueues as we need to look at extremes of both priority queues and right closest you get for a multiset is PriorityQueue in Java :)

    Why dont you post your solution and we both will figure out what went wrong?

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. @Jithu: reason for time limit error using PriorityQueue is remove(Object ) in PriorityQueue is O(n).
    I'm stil trying this using PriorityQueue. I can understand reason for time limit error but i'm getting wrong answer errors. (embarrassing! for such a simple problem :( )
    I tried different inputs but unable to reproduce wrong error for any set of inputs.. not sure what i'm missing.. i am getting 4 wrong answers and 3 time limit errors.. here is my solution http://dsacodesnippets.blogspot.in/2012/06/median.html

    ReplyDelete
    Replies
    1. When you are calculating your median average it is possible to get an under/over flow if either integers are very large. My solution was not passing the judge for that reason.

      Delete
    2. Pranalee: Use 'long' instead of 'int' for sum as x is 32-bit number but 2x is not. Plus you need to avoid trailing zeroes.

      Use this code snippet:
      long sum = s1.peek() + s2.peek();
      if (sum % 2 == 0)
      System.out.printf("%.0f\n", sum/2.0);
      else
      System.out.printf("%.1f\n", sum/2.0);

      Delete
    3. I have used a sorted List and binary search to find out medians. All tests except 1, 3 and 10 are failing with wrong answers. I'm not sure what the problem is. Please see if you can spot it. TIA. The code is at http://kjspages.blogspot.in/2012/06/public-class-solution-param-args-public.html

      Delete
    4. @Pranalee: What language are you using?

      Removing an object from the python heaps is O(n), but this can be cut down to O(logn), done by using calling the python library function opposed to the heapify.
      The algo manages to run in time but I can't figure out where my program goes wrong, im getting wrong answers for test cases 7,8 and 9.

      Delete
  5. Hi,
    Only 3 of the test cases pass. Please let me know where the error is
    http://ideone.com/KxqNC

    ReplyDelete
  6. passes only 2 test case..any body please explain what's went wrong in this code.

    #include
    #define MAX 100000

    int buff[MAX]={-1};
    int index=0;

    void print_median()
    {
    double j,i;
    int k;
    if(index%2==1)
    printf("%d\n",buff[(index-1)/2]);
    else
    {
    j=(buff[index/2]+buff[(index/2)-1])/2.0f;
    i=j-(int)j;
    k=j;
    if(i!=0.0f)
    printf("%.1lf\n",j);
    else
    printf("%d\n",k);
    }
    }

    void append(int x)
    {
    int j,k;
    if(index==0)
    {
    buff[0]=x;
    index++;
    }
    else if(x>buff[index-1])
    {
    buff[index]=x;
    index++;
    }
    else{
    for(j=0;jj;k--)
    {
    buff[k]=buff[k-1];
    }
    buff[j]=x;
    index++;
    break;
    }

    }
    }
    print_median();
    }

    void rem(int x)
    {
    int i,k=0;
    for(i=0;i<index;i++)
    {
    if(buff[i]==x)
    {
    for(k=i;k<index-1;k++)
    {
    buff[k]=buff[k+1];
    }
    index--;
    buff[index]=-1;
    break;
    }
    }
    if(k==0)
    printf("Wrong!\n");
    else
    print_median();
    }

    int main()
    {
    int n,x;
    char c[2];
    scanf("%d",&n);
    while(n--)
    {
    fflush(stdin);
    scanf("%1s %d",c,&x);
    //printf("\nindex=%d c=%c\n",index,c[0]);
    if(c[0]=='a')
    append(x);
    else
    rem(x);


    }

    return 0;
    }

    ReplyDelete