Simulating a Weighted Dice
Doing and re-doing the same thing in different ways tends to be a good way to build expertise. By asking the same question numerous times in technical interviews, I realized repetition can open new sights even with very simple tasks. In this post I’ll walk through one of my favorite interviewing questions and several quite different ways of approaching it.
The task is straightforward: write some code to simulate rolling a multi-sided, weighted dice. An example input might be the 4-sided dice with probabilities
[0.1, 0.1, 0.3, 0.5], where the output should be an integer between 1 and 4, such that 4 is 5x as likely as 1.
A common approach is to assign each outcome to a unique a segment between 0 and 1. For example, consider the following assignment:
Since the length of each segment is proportional to its desired probability, we can just generate a random number between 0 and 1, see which segment it falls in, and return the corresponding side label.
In Python 3, you might write the above approach like this:
What about rolling the same dice many times?
The above approach takes O(N) time for each dice roll - it takes linear time to build
cumulative_weights, and also linear time in the
for loop to find the right position. Can we save some time if our users want to roll the same dice many times? It turns out that we can. First, since the cumulative weights are sorted, we can use a binary search to eliminate the
for loop. Second, we can compute the cumulative weights only once, and reuse it across multiple dice rolls.
At this stage, most candidates write something like this:
Can we do better?
One problem with the above approach is that we exposed some implementation details to the user. Users now have to think about what to do with cumulative weights, which they didn’t have to think about before.
The natural object-oriented way to improve this is to introduce a class:
A fair number of the candidates I interview struggle to come up with a solution that takes less than linear time for each roll and also doesn’t expose implementation internals.
While discussing this problem with a group of friends, someone accused me of being “too OOP” when I explained that I nudge candidates towards the class solution above. Fair enough. Here’s another way using generators:
There are surely many other approaches as well. Challenging yourself to come up with other approaches could be a good way to discover new programming paradigms - for example, how would you reimplement this using closures?
If you enjoyed thinking about this question, or perhaps are considering asking it in your own interviews, here are some follow-on questions to think about:
- What if we no longer have access to a random number generator like
random.random(), and our only source of randomness is something like
randomCoinToss() -> boolean?
- What if
randomCoinToss()is very expensive? How can we minimize the number of times we call it?
- How would you test your
roll()function? What are some challenges to writing tests for a nondeterministic function, and how might we overcome them?
I like interview questions that cover algorithms and problem solving, and also have a code organization or design element. In my experience, the roll-a-weighted-dice question gives candidates an opportunity to show off both these skills, which is probably why I ask it quite often these days.