I think that "2x" estimate might actually be a little conservative in some cases. An 8x1 architecture clocked at ~600MHz is not out of the question, IMHO. It could well be an 8x2/16x1 configuration of some kind, although these reports of 110-150 million transistors don't really support that idea. Their mindset when it comes to the design of Loki certainly seems to be extremely performance-oriented.
nVidia are targeting a 550-600MHz core for the NV40 (a "true" 8-piper), paired with 700-800MHz (that's 1.4-1.6GHz DDR) memory, so expect similar performance leaps from them too.
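For anyone who wants to sanity-check the figures being thrown around, here's a rough back-of-envelope sketch in Python. It only computes theoretical peaks (pipelines x clock for pixel rate, times texture units per pipe for texel rate, and 2x the memory clock for the DDR effective rate). The function names are my own, the 600MHz clock is just the speculated figure from above, and none of this says anything about real-world performance.

```python
# Back-of-envelope peak figures only; actual throughput will be lower.
# Function names are made up for illustration, clocks are the rumoured ones.

def peak_fill_rate(pipelines, tmus_per_pipe, core_mhz):
    """Return (Mpixels/s, Mtexels/s) theoretical peaks for an NxM layout."""
    pixels = pipelines * core_mhz        # one pixel per pipe per clock
    texels = pixels * tmus_per_pipe      # one texel per texture unit per clock
    return pixels, texels

def ddr_effective_mhz(memory_mhz):
    """DDR transfers on both clock edges, so the effective rate doubles."""
    return 2 * memory_mhz

# Compare the configurations mentioned above at the speculated ~600MHz core.
for name, pipes, tmus in [("8x1", 8, 1), ("8x2", 8, 2), ("16x1", 16, 1)]:
    pixels, texels = peak_fill_rate(pipes, tmus, 600)
    print(f"{name}: {pixels} Mpixels/s, {texels} Mtexels/s")

# Memory clocks quoted for NV40: 700-800MHz physical.
for mem in (700, 800):
    print(f"{mem}MHz memory -> {ddr_effective_mhz(mem)}MHz effective (DDR)")
```

Note that an 8x2 and a 16x1 part end up with the same peak texel rate at a given clock, but the 16x1 has double the peak pixel rate, which is part of why the transistor count matters for guessing which one it is.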
x-bit are wrong on the process details - it's a 0.13u part, not 0.15u. I believe they're re-using some physical elements from RV350.
MuFu.