10132 - FileFragmentation
N identical bit strings ( composed of only ones and zeroes ) have been split into 2*N parts, such that each string is split into exactly two parts, of possibly unequal length. You have to reconstruct the original string.
The main idea here is that since every string is split into exactly 2 parts, it follows that if lMax is the length of the longest fragment, and lMin is the length of the shortest fragment, then a string of length lMax would obviously be paired with a string of length lMin, in the original string. To generalize, a fragment of length l[K] would be paired with one of length l[M], such that l[K] + l[M] = lMax + lMin = length of the original string.
What we need to determine is this pairing, and the order in which these string pairs need to be concatenated.
This can be done as follows :
- Sort all the strings by length. Let the length of all the strings be l...l[K] in sorted order
- Pair all strings of length l, with those of length l[K], in all possible orders ( i.e., The string of length l first, followed by the one of length l[K], and the reverse order ). At most, 8 such distinct strings may be formed, as there may be atmost 2 distinct strings of any given length. Add these to a set to prevent repetition of duplicates.
- Now, similarly find all strings that can be formed by pairing a string of length l, with some string of length l[K-1]. Find the intersection of this set, with the set formed in step 2. Let us call this intersection set "Result"
- Repeat the above procedure with all strings of length l[i] and l[K+1-i], for all valid values for i. For each such value of i, take the intersection of the set of possible strings that may be formed, with the "Result" set obtained previously, and call this intersection the new "Result" set
- Continue the above process till the "Result" set contains just one string. This is the final answer, i.e, the original string.
- If all i values have been tried, and the "Result" set contains more than one string, output any of its elements, as in such a case, there is no unique answer.
- Be careful regarding base cases ( Like, for example, the input may contain just 2 fragments; i.e., there was just one copy of the original string that was split )
- There may be more than one solution to a particular problem instance. In such a situation, output any of the possible solutions.
It is advisable to use object-oriented languages such as C++/JAVA etc. which have readymade implementations of the necessary data structures ( like the Set data structure etc. )
1 011 0111 01110 111 0111 10111