Puzzle time - naming the majority
-
What can I do with the counter? If I can manipulate it arbitrarily, it's simple: You can just remember everything by encoding it into the counter value. I don't even need the additional name storage. For instance, if the binary encoding of the list so far is
b1
, and I encounter a new name whose binary encoding isb2
, I update the counter tob1++b2
, where++
concatenates strings of binary numbers.So I assume there must be some restriction, such as that the counter can only be incremented/decremented once for every name or something like that.
-
I think it all depends on the size of the counter.
If the counter size is unlimited, it's simple by just remembering everything in the counter.
If the counter size is fixed (say, only numbers < 100), then I think I can prove that it's impossible.
If it's something like "log of input length", then things become less clear.
-
I think you can do it if you know for sure that a majority name exists, using only one counter increase or decrease in each step (which is the restriction I proposed in my first post).
Here's how:
The counter c starts with 0. Let's call the remembered name n.
Now, do this whenever a name m is read out:
If c = 0 then set n := m, increment c
else if n = m then increment c else decrement c.I think if a majority name exist, then it will be in n at the end.
Why? Because if there is a majority name, then there will be more increases than decreases.
I don't think you can also detect if a majority name exists without having something like linear space.
-
I think that’s the solution Klaus. Read the question carefully it doesn’t specify you must detect whether it’s the name was called a majority of the time. Just that you detect it if there is. Still, it seems a little sloppy that you’ll get an answer regardless and not know whether the majority condition was satisfied.
I won’t know whether that’s the solution until Saturday morning.
-
I think one of the interesting properties of this algorithm is that it will also work if the list of names is infinite, i.e., it will always give you the name of the majority name (if it exists) up to the current "time". In other words, it's a "streaming algorithms", similar to, say, the "100 day moving average" algorithm for financial streams. For streaming algorithms, it is critically important that the memory requirement does not grow with stream length (because it is infinite) and that answers can be produced piece-by-piece.
That said, the algorithm above does technically require log(input length) memory for the counter, so it would eventually exceed the capacity of any computer when given an infinite stream, but the logarithm grows so slowly that it's not really an issue in practice.
-
SOLUTION: It makes sense to use the counter to keep track of how well the name currently in memory is doing. Then, whenever that name is heard again, you can increment the counter; when some other name is heard, you must decrement it. The counter begins at 0; if it ever threatens to dip below zero, the current name in memory is replaced by the name being heard, and the counter is incremented from 0 to 1.
This may seem a bit dubious at first, because the new name you are putting into the counter may be the first occurrence of that name, while in the meantime your previous name (and other names) may have occurred many times. For example, if the list were "Alice, Bob, Alice, Bob, Alice, Bob, Charlie," Alice's name would remain in your memory until the end when Charlie becomes your nominee. But that's OK, since no name has a majority.
If a name does occur more than half the time, it's guaranteed to be the one in your memory at the end. Why? Suppose the majority name is Mary, and assume (for convenience) that a name is heard every minute. Divide time into intervals during which the same name remains in memory. Then each interval except possibly the last is an even number of minutes in length, and its memory-name was heard exactly half the time during the interval.
It follows that Mary's name is heard at most half the time within every interval except the last, therefore more than half the time in the last interval — which must therefore end with Mary's name in memory.
Here's an example: Suppose the name list is A, M, A, M, M, A, B, B, M, M, A, M, M, B, M. Then the intervals are of length 4 (with "A" in memory), 2 (with "M"), 4 (with "B"), 2 (with "A" again), and finally 3 (with "M"), the counter ending at value 1.