I've got a SQL Server table with about 50,000 rows in it. I want to select about 5,000 of those rows at random. I've thought of a complicated way, creating a temp table with a "random number" column, copying my table into that, looping through the temp table and updating each row with
select top 10 percent * from [yourtable] order by newid()
In response to the "pure trash" comment concerning large tables: you could do it like this to improve performance.
select * from [yourtable] where [yourPk] in (select top 10 percent [yourPk] from [yourtable] order by newid())
The cost of this will be the key scan of values plus the join cost, which on a large table with a small percentage selection should be reasonable.