I have seen many mails and forum posts which imply, more or less, that increasing the number of threads working on a task will speed things up overall. In many cases this is a dangerous path to follow, because it has the major drawback of adding overhead.
In some cases there is a range where increasing parallelism does improve the situation. But that only holds under very particular performance characteristics. Basically, what needs to fit is the ratio of in-system execution time vs. wait time for other systems. And the critical side condition for those other systems is that there is no resource congestion or conflict there: if you decide to add more load (by increasing the number of threads on your side) to an external database that is already maxed out, how do you expect your code to run faster?
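To make the compute/wait ratio a bit more concrete, here is a minimal sketch of the well-known back-of-the-envelope sizing heuristic (threads ≈ cores × (1 + wait/compute)). The 5 ms / 45 ms figures are purely made-up illustration values, and the whole estimate silently assumes the external systems can actually absorb the extra load:

```java
// Back-of-the-envelope thread-pool sizing based on the compute/wait ratio.
// The figures below (5 ms compute, 45 ms wait) are assumptions for illustration;
// measure your own workload before trusting any such estimate.
public class PoolSizeEstimate {

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();

        double computeMillis = 5;   // time spent on our own CPU per task (assumed)
        double waitMillis    = 45;  // time spent waiting on external systems (assumed)

        // Classic heuristic: threads ~= cores * (1 + wait/compute).
        // Only meaningful if the systems we wait for are not already maxed out.
        int suggestedThreads = (int) Math.ceil(cores * (1 + waitMillis / computeMillis));

        System.out.printf("cores=%d, wait/compute=%.1f -> suggested threads=%d%n",
                cores, waitMillis / computeMillis, suggestedThreads);
    }
}
```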
Furthermore, there is neither a one-size-fits-all number of threads to recommend, nor can one always say that adding threads will help at all. The opposite can happen quite easily, and it always needs proper testing under full load. Putting such load onto the system is harder than most people think, unfortunately. But the benefit will very often be a profoundly improved understanding of the application and especially its TRUE bottlenecks. Parallel execution will almost always yield results very different from a single-threaded run (e.g. when started from your IDE).
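The measuring itself does not have to be elaborate to be instructive. Below is a rough, hypothetical probe (not a proper benchmark: no JIT warm-up handling, and busyWork is just a placeholder for the real unit of work) that runs the same batch of tasks with different thread counts and prints the wall-clock time, so the "best" number is measured rather than guessed:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical harness: run the same batch of tasks with different thread counts
// and compare wall-clock times instead of assuming more threads are faster.
public class ThreadCountProbe {

    static void busyWork() {
        // Placeholder for the real unit of work; here just some CPU churn.
        double x = 0;
        for (int i = 0; i < 200_000; i++) x += Math.sqrt(i);
        if (x < 0) System.out.println(x); // keep the JIT from removing the loop
    }

    static long runWith(int threads, int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        CompletableFuture<?>[] futures = new CompletableFuture<?>[tasks];
        for (int i = 0; i < tasks; i++) {
            futures[i] = CompletableFuture.runAsync(ThreadCountProbe::busyWork, pool);
        }
        CompletableFuture.allOf(futures).join();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        pool.shutdown();
        return elapsedMillis;
    }

    public static void main(String[] args) throws Exception {
        int tasks = 2_000;
        for (int threads : new int[] {1, 2, 4, 8, 16, 32}) {
            System.out.printf("%2d threads -> %d ms%n", threads, runWith(threads, tasks));
        }
    }
}
```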
Let me close by saying that in a surprisingly high number of scenarios (again with a number of pre-conditions attached), single-threaded execution will give the highest performance (a term which, by the way, we have not defined at all here). This is because the overhead mentioned at the beginning kicks in. If you manage to run your code such that you reduce or even eliminate cache misses on the CPU, you can see improvements by a factor of 10,000+ (and that really is a factor, not percent). In general, though, the effort to achieve that outweighs the usefulness; unless we talk about things like high-frequency trading, of course.
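As a minimal sketch of the cache-miss mechanism (not the extreme case above, just the general effect), the snippet below sums the same 2D array twice: once in memory order and once striding across rows. The work is identical; only the access pattern, and therefore the cache behaviour, differs. The array size and the measured difference are illustrative and depend entirely on the hardware:

```java
// Illustration of the cache effect: identical work, different access patterns.
public class CacheMissDemo {

    static final int N = 4_096;
    static final int[][] data = new int[N][N];

    static long rowMajor() {
        long sum = 0;
        for (int row = 0; row < N; row++)
            for (int col = 0; col < N; col++)
                sum += data[row][col];      // sequential access, cache-friendly
        return sum;
    }

    static long columnMajor() {
        long sum = 0;
        for (int col = 0; col < N; col++)
            for (int row = 0; row < N; row++)
                sum += data[row][col];      // strided access, many cache misses
        return sum;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {       // repeat a few times to let the JIT warm up
            long t1 = System.nanoTime();
            long a = rowMajor();
            long t2 = System.nanoTime();
            long b = columnMajor();
            long t3 = System.nanoTime();
            System.out.printf("row-major %d ms, column-major %d ms (sums %d/%d)%n",
                    (t2 - t1) / 1_000_000, (t3 - t2) / 1_000_000, a, b);
        }
    }
}
```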