Wednesday, November 21, 2012

UVA 12532 - Interval Product

http://uva.onlinejudge.org/external/125/12532.html

You are given a sequence of \(N (1 \leqslant N \leqslant 10^{5})\) integers \(X_{1}, X_{2}, \cdots, X_{n}\). You have to perform \(K (1 \leqslant K \leqslant 10^{5})\) operations over this sequence, which can be:

(1) Change an arbitrary value of the sequence.

(2) Given a range \([i, j]\) return if the product \(X_{i}, X_{i+1}, ..., X_{j-1}, X_{j}\) is positive, negative or zero.

As usual we are going to start asking ourselves if it's possible to solve this problem under the 2 seconds time constraint using the brute force approach? The answer is "maybe" after the UVA administrator install the new Quantum Computers servers :), because this is not the case a Brute Force solution is going to timeout.

To solve this problem I am going to describe two approaches, the first one using Segment Tree and the second one using Binary Indexed Trees (BIT):

In case that the reader is not familiar with this data structures I personally recommend the following TopCoder tutorials:
  1. Binary Indexed Trees by boba5551
  2. Range Minimum Query and Lowest Common Ancestor by danielp

(1) Solution using Segment Tree


The main problem of the Brute Force approach is that we need \(O(j - i + 1)\) time to answer each of the second operation queries. Using Segment Trees we can reduce this time to \(O(log N)\) where \(N\) is the size of our sequence.

The main idea is to construct a Segment Tree of the sub-intervals products of our sequence, for example, given the sequence \((-2, 6, 0, -1)\) we got the following tree:

There is just a little issue with this, the values of the given sequence are in the range \(-100 \leqslant X_{i} \leqslant 100\). This implies that we are in risk to get an overflow/underflow for some sequences, an easy one is \((100,100,100,100,100)\), if we execute the second operation over range \([0, 4]\) we got the value \(100^{5}\) that clearly surpass the 32 bit integer capacity.

Let's remember that this problem care just for the sign (e.g. negative, positive) between certain interval \([i,j]\). So to avoid the overflow problem we simply need to substitute all the negative numbers with -1 and all the positives values by 1. After applying the previous operations we got a tree that look like this:


This solution will give us an overall time complexity of \(O(K logN)\) which is enough to pass the 2 seconds time limit.



(2) Solution using Binary Indexed Trees  


The solution with BIT is a little bit less intuitive than the previous exposed, but we can easily came with it by asking ourselves the correct questions.

1. When the product of an interval \([i,j]\) becomes zero?

When there is one or more zeros in the interval.

2. When the product of an interval \([i,j]\) becomes negative?

When there is not zeros in the interval, and the total number of negative number between \([i,j]\) is odd.

3. When the product of an interval \([i,j]\) becomes positive?

When none of the other conditions holds then product of the interval \([i,j]\) is positive.

After considering these three questions it seems important to keep track of the frequency of zeros and negative numbers for any interval \([i,j]\) of our sequence. The easiest way to accomplish this task is maintaining two arrays with the cumulative frequencies of both types, for example:



So if we want to know the numbers of zeros between certain interval \([i,j]\) we just need to apply the old trick of \(freq[j] - freq[i - 1]\) (assuming that \(freq[0]\) is 0).

\(\\ freq[4] - freq[1] = 1 \\ freq[4] = 2 \\ freq[1] = 1 \\ 2 - 1 = 1 \)

Until this point everything seems just fine, however there is another problem we haven't consider yet, what happen when we update one of the values of our sequence (operation #1) ? if that's the case we also need to update our cumulative frequency array. It's not hard to see that even when we can perform operation #2 in \(O(1)\) time, the operation #1 is going to take in the worst case scenario \(O(N)\) time, that due to the vast amount of queries we can't afford for this problem.

Considering this problems, the necessity of using BIT seems more evident. We just need to carefully maintain the cumulative frequency updated after each operation is performed. The overall time complexity of this solution is \(O(KlogN)\).

3 comments:

Tausiq said...

Thanks mate! you have been a great help :)

Pavel Simo said...

You're welcome :)

Anand said...

Very nice tutorial. It finally helped me getting accepted my first Segment tree submission in JAVA at UVA..
Thanks bro.. :)