Increasing the MTBF

Increasing the MTBF, or the overall reliability of the system, is an expensive operation. That doesn't mean that you shouldn't do it, just that it involves a lot of testing, defensive programming, and hardware considerations.

Effectively, the goal here is to eliminate all bugs and failures. Since this is generally infeasible in any reasonably sized system (it would take a near-infinite amount of time and money), the best you can do is approach that goal.

When my father worked at Bell-Northern Research (later part of Nortel Networks), he was responsible for coming up with a model to predict bug discovery rates, and a model for estimating the number of bugs remaining. As luck would have it, a high-profile prediction of when the next bug would be discovered turned out to be bang on and astonished everyone, especially management, who had claimed "There are no more bugs left in the software!"

Once you've established a predicted bug discovery rate, you can estimate how much test effort it will cost you to discover a given percentage of the bugs. Armed with this knowledge, you can use a cost model to make an informed decision about when it's feasible to declare the product shippable. Note that this decision also has to be weighed against business factors, such as the timing of your initial public offering (IPO), the status of your competition, and so on. An important trade-off is the cost to fix a problem once the system is in the field. Using the models, you can trade off testing cost against repair cost.
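
To make the testing-versus-repair trade-off concrete, here is a minimal sketch in Python. It is not the model used at Bell-Northern Research, which isn't described here. It assumes a simple exponential discovery curve (in the style of the Goel-Okumoto reliability growth model), in which the expected number of bugs found after t hours of testing is N * (1 - exp(-b * t)). The function names, and the parameters total_bugs (N), detection_rate (b), test_cost_per_hour, and field_fix_cost, are all hypothetical and chosen purely for illustration.

```python
import math


def expected_bugs_found(total_bugs, detection_rate, test_hours):
    """Expected bugs discovered after test_hours of testing.

    Assumes an exponential discovery curve: N * (1 - exp(-b * t)).
    """
    return total_bugs * (1.0 - math.exp(-detection_rate * test_hours))


def total_cost(total_bugs, detection_rate, test_cost_per_hour,
               field_fix_cost, test_hours):
    """Cost of testing plus the cost of fixing the remaining bugs in the field."""
    found = expected_bugs_found(total_bugs, detection_rate, test_hours)
    remaining = total_bugs - found
    return test_cost_per_hour * test_hours + field_fix_cost * remaining


def cheapest_test_hours(total_bugs, detection_rate, test_cost_per_hour,
                        field_fix_cost):
    """Test duration that minimizes total_cost under the assumed model.

    Setting the derivative of total_cost to zero gives
    t = ln(field_fix_cost * N * b / test_cost_per_hour) / b.
    If that ratio is 1 or less, testing never pays for itself in this
    model, so the answer is zero hours.
    """
    ratio = field_fix_cost * total_bugs * detection_rate / test_cost_per_hour
    if ratio <= 1.0:
        return 0.0
    return math.log(ratio) / detection_rate


if __name__ == "__main__":
    # Illustrative numbers only; none of these come from real project data.
    n, b = 200.0, 0.002            # assumed total bugs and discovery rate per hour
    hourly, field = 100.0, 5000.0  # assumed cost per test hour and per field fix

    hours = cheapest_test_hours(n, b, hourly, field)
    print("test for about", round(hours), "hours")
    print("expected bugs found:", round(expected_bugs_found(n, b, hours)))
    print("total cost:", round(total_cost(n, b, hourly, field, hours)))
```

Under these assumptions, the point to stop testing is where one more hour of testing costs as much as the field repairs it is expected to prevent. A real decision would also fold in the schedule and competitive pressures mentioned above, which this sketch ignores.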