These products store your entire data set in memory (RAM) before processing it. If your data has more than 500,000 rows (even after significant summarizing to reduce its size), these tools will crash on most platforms. How do you get around this? Unless you use SAS or SQL Server Data Mining or a few other products (which ones?), my feeling is that you have to write your own code in a high-level language such as C++ or Java (or Perl / Python if a lot of string processing is required), combined with a powerful external sorting tool such as syncsort (sorting the data first lets you get by with small in-memory hash tables or stacks), and a powerful string-matching tool such as grep.
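For concreteness, here is a minimal Python sketch of the kind of stream-and-aggregate code I mean: the file is scanned one row at a time, so only a small hash table (one entry per distinct key) ever sits in memory rather than the full data set. The file name, column positions, and the average-per-key statistic are hypothetical, just to illustrate the pattern.

```python
from collections import defaultdict
import csv

def aggregate_by_key(path, key_col=0, value_col=3):
    """Scan a large delimited file once, keeping only per-key running
    totals in memory instead of loading every row."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)                      # skip the header row
        for row in reader:
            key = row[key_col]
            totals[key] += float(row[value_col])
            counts[key] += 1
    return {k: totals[k] / counts[k] for k in totals}

if __name__ == "__main__":
    # e.g. average transaction amount per customer, without loading all rows
    averages = aggregate_by_key("transactions.csv", key_col=0, value_col=3)
    for key, avg in sorted(averages.items())[:10]:
        print(key, avg)
```

If the file is pre-sorted by key (with syncsort, or GNU sort as a free alternative), you can go further and keep only the current key's running totals in memory, writing each group's result out as soon as the key changes.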
How do you handle this problem? Do you proceed differently? Please don't tell me I should do sampling - I cannot afford to sample our very large data set because it is not well balanced.