infuzion wrote:Interesting... could you roll your own rsqrtps, or does it just have to be opend in SM's Assembly?
As noted, I'm after the performance that only rsqrtps can give me: it's a single machine instruction, and there's currently no way to get it in SM (unless the assembler module has an undocumented "emit" pseudo-op). While I'm waiting, I can use sqrtps instead, but that's substantially slower.
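For anyone wondering what "rolling your own" might look like, here is a rough sketch in plain C rather than SM assembly. It uses the well-known integer bit trick for an initial guess and then one Newton-Raphson step, applied to four floats to mirror the four lanes of a packed-single register. The names rsqrt_approx and rsqrt4_approx are made up for illustration, the code assumes 32-bit IEEE-754 floats and positive, normal inputs, and it only approximates the value the instruction computes. It should not be expected to match the speed of the single hardware instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: approximate 1/sqrt(x) for one positive float.
 * Assumes float is a 32-bit IEEE-754 single. */
static float rsqrt_approx(float x)
{
    uint32_t bits;
    float y;

    assert(x > 0.0f);                  /* sketch does not handle 0, negatives, NaN */

    memcpy(&bits, &x, sizeof bits);    /* reinterpret the float's bit pattern */
    bits = 0x5f3759dfu - (bits >> 1);  /* classic magic-constant initial guess */
    memcpy(&y, &bits, sizeof y);

    /* One Newton-Raphson step for f(y) = 1/(y*y) - x tightens the guess. */
    y = y * (1.5f - 0.5f * x * y * y);
    return y;
}

/* Hypothetical helper: apply the approximation to four lanes, as a packed
 * instruction would, but one element at a time. */
static void rsqrt4_approx(const float in[4], float out[4])
{
    for (int i = 0; i < 4; ++i) {
        out[i] = rsqrt_approx(in[i]);
    }
}
```

With one refinement step the result is typically within a fraction of a percent of the true value, which may or may not be good enough depending on what the results feed into.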