I want to use the highest possible number of threads (to use less computers) but without making the bottleneck to be in the client.
I have used JMeter a fair bit and found it is not great at generating really high load. On a 2Ghz Core2 Duo with 2Gb memory you can reasonably expect about 100 threads.
That being said, it is best to run it on your hardware so that the CPU of the PC does not peak at 100% - a stable 80%-90% is best otherwise the results are affected.
I have also tried WAPT 5 - it successfully ran 1000+ threads from the same PC. It is not free but it is more useable than JMeter but doesn't have all of the features.
Outdated answer since at least version 2.6 see https://stackoverflow.com/a/11922239/460802 for a more up to date one.