+ SSE2=64bit audio, SSE3=faster FFT... please

Suggest new features, components or other changes to the software

Moderator: electrogear

Does your music computer have SSE2 (equal to or greater than P4, Athlon64...)?

Poll ended at Wed Feb 21, 2007 11:36 pm

Yes, I have at least one SSE2 computer now: 18 votes (78%)
No, but I will buy a new Windows computer within 6 months (e.g. dual core): 0 votes
No, but I will buy a new dual core Mac soon: 0 votes
No, and I'm unsure about buying a new computer soon: 5 votes (21%)
No: I'm building hardware synths only now: 0 votes
 
Total votes : 23

Postby infuzion on Fri Jan 19, 2007 11:36 pm

It seems that increasing the bit depth for floating point DSP is more efficient than oversampling, and since SSE2 has been out for 6+ years (over 3 for AMD), could SM please have SSE2 processing? It might be easier to put it in phases (it could be marketed that SM has "preliminary 64bit processing"):

1) SSE2 instructions opened for the ASM primitive (see below)
2) Introduce 64bit Stream connectors (just to connect ASM to ASM)
3) have 64bit Mono VST in/out
4) add converters to & from Poly/Pack4, Mono, & 64bit stereo
4a) rename Pack to Pack4, add Pack64Mono (merge 2 64bit Mono into a single 64bit stream)
4b) have the converters to 64bit use noise generators for the lower 32 bits (or a connector in)
5) add primitives, starting with
+ Signal Analyser
+ Wave Reader (I assume wave tables can handle 64bit)
+ Wave Table Read
(edit2009: for some reason I had listed Envelope Control, Voices to 64bit, ASIO In/Out with 64bit... not really needed though)
++ somewhere here Code that can make 64bit ASM
+ Delay
+ ADSR
+ OSC primitives
+ everything else...

I hope phase 1 & perhaps 2 could be added to SM quickly please. I understand the other phases would take a while... perhaps not until SM 1.5 or 2.0.

Here are the commands I hope will be added (edit2009: Why do I have Double-FP listed?)
Code:
addpd //Packed Double-FP Add
andnpd //Bit-Wise Logical And Not (for Double-FP)
andpd //Bit-Wise Logical And (for Double-FP)
cmppd //Packed Double-FP Compare
cvtdq2pd //Packed Signed INT32 to Packed Double-FP Conversion
cvtpd2pi //Packed Double-FP to Packed INT32 Conversion
cvtpd2ps //Packed Double-FP to Packed Single-FP Conversion
cvtpi2pd //Packed Signed INT32 to Packed Double-FP Conversion
cvtps2pd //Packed Single-FP to Packed Double-FP Conversion
divpd //Packed Double-FP Divide
maxpd //Packed Double-FP Maximum
minpd //Packed Double-FP Minimum
movapd //Move Aligned Packed Double-FP
movhpd //Move High Packed Double-FP movement
movhlps //Move High Single-FP to Low {faster AMD vs Intel}
movhps //Move High Packed Single-FP
movlpd //Move Low Packed Double-FP movement
movlps //Move Low Packed Single-FP
mulpd //Packed Double-FP Multiply
orpd //Bit-Wise Logical OR for Double-FP Data
shufpd //Shuffle Double-FP
shufps //Shuffle Single-FP, requested before
sqrtpd //Packed Double-FP Square Root
subpd //Packed Double-FP Subtract
unpckhpd //Unpack High Packed Double-FP Data shuffle
unpcklpd //Unpack Low Packed Double-FP Data shuffle
unpcklps //useful for merging 2 mono streams into 1
xorpd //Bit-wise Logical XOR for Double-FP Data

* rounding controlled by MXCSR register (bits 13 and 14)
ldmxcsr //Load MXCSR register
stmxcsr //Save MXCSR register state
(SM default of rounded off?)

misc:
s64in/out, s64intin/out, s64boolin/out,
float64 (needed?)
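
For the MXCSR rounding control mentioned above (bits 13 and 14), here is a minimal sketch in Python of how the field is laid out. It only does the bit arithmetic on a value you would get from stmxcsr or hand to ldmxcsr; it does not touch the real register. The function names are made up for illustration. The power-on default of 0x1F80 has both bits clear, which means round to nearest.

```python
# Rounding-control (RC) field of the MXCSR register: bits 13 and 14.
RC_SHIFT = 13
RC_MASK = 0b11 << RC_SHIFT

ROUNDING_MODES = {
    0b00: "round to nearest (even)",
    0b01: "round down (toward -inf)",
    0b10: "round up (toward +inf)",
    0b11: "round toward zero (truncate)",
}

MXCSR_POWER_ON_DEFAULT = 0x1F80  # all exceptions masked, RC = 00


def get_rounding_mode(mxcsr: int) -> str:
    """Describe the rounding mode encoded in an MXCSR value (as saved by stmxcsr)."""
    return ROUNDING_MODES[(mxcsr & RC_MASK) >> RC_SHIFT]


def set_rounding_mode(mxcsr: int, rc: int) -> int:
    """Return a new MXCSR value (for ldmxcsr) with the RC field replaced by rc."""
    if rc not in ROUNDING_MODES:
        raise ValueError("rc must be a 2-bit value")
    return (mxcsr & ~RC_MASK) | (rc << RC_SHIFT)


if __name__ == "__main__":
    print(get_rounding_mode(MXCSR_POWER_ON_DEFAULT))
    print(hex(set_rounding_mode(MXCSR_POWER_ON_DEFAULT, 0b11)))
```

So whether SM's default is "rounded off" comes down to whether anything changes those two bits from the 00 default.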

Examples of how to get current 32bit streams translated back & forth to 64bit {the numbers in braces are latencies for AMD,Intel}:
Code:
//Poly/Pack4 64bit process
streamin in;streamout out;
float temp32;
movaps xmm7,in;
cvtps2pd xmm0,xmm7; //{3,3-2} move positions 0&1 into 64bit float

// 64bit process with xmm0

cvtpd2ps xmm6,xmm0; //{10,11-10}
movlps temp32,xmm6; //{2,6-1} not needed if xmm6 isn't used in 64bit processing
movaps xmm7,in; //{2,6-1} not needed if xmm7 wasn't used
movhlps xmm7,xmm7; //{2,6} bounce positions 2&3 to 0&1
cvtps2pd xmm0,xmm7; //{3,3-2} move positions 0&1 into 64bit float

// 64bit process with xmm0, likely just copy the above

cvtpd2ps xmm7,xmm0; //{10,11-10} recycle xmm7
movlps xmm6,temp32; //{2,6-1} not needed if xmm6 isn't used in 64bit processing
movlhps xmm6,xmm7; //{2,6-1} merge both stages (movlhps, since movhps only takes a memory operand)
movaps out,xmm6;
// fini {total extra latency:37-30,57-32}
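
To check the data movement of the Poly/Pack4 wrapper without SM, here is a rough Python model. Assumptions: a register is just a list of 4 singles (or 2 doubles), `process64` is a stand-in for whatever 64bit DSP you would do (here a 0.5 gain), and the final merge is modelled as movlhps, the register-to-register counterpart of movhps. It says nothing about latency, only about which value ends up in which position.

```python
import struct


def f32(x):
    """Round a double to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def cvtps2pd(src):
    """Widen positions 0 and 1 of a 4-single register to 2 doubles (exact)."""
    return [float(src[0]), float(src[1])]


def cvtpd2ps(src):
    """Narrow 2 doubles to singles in positions 0 and 1; positions 2 and 3 become 0."""
    return [f32(src[0]), f32(src[1]), 0.0, 0.0]


def movhlps(dest, src):
    """Copy the high pair of src into the low pair of dest."""
    return [src[2], src[3], dest[2], dest[3]]


def movlhps(dest, src):
    """Copy the low pair of src into the high pair of dest."""
    return [dest[0], dest[1], src[0], src[1]]


def process64(pair):
    """Placeholder for the 64bit processing: a simple 0.5 gain."""
    return [x * 0.5 for x in pair]


def pack4_via_64bit(stream_in):
    xmm7 = list(stream_in)
    xmm6 = cvtpd2ps(process64(cvtps2pd(xmm7)))  # positions 0 and 1
    xmm7 = movhlps(xmm7, xmm7)                  # bounce positions 2&3 to 0&1
    xmm7 = cvtpd2ps(process64(cvtps2pd(xmm7)))  # positions 2 and 3
    return movlhps(xmm6, xmm7)                  # merge both stages


if __name__ == "__main__":
    print(pack4_via_64bit([1.0, 2.0, 3.0, 4.0]))
```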
While translating 3-4 streams into 64bit will need twice as many instructions of DSP code as 32bit, if you merge 2 mono streams into one, you can use almost half as many instructions, since you don't need two separate ASM routines:
Code:
//dual mono 64bit process
monoin iLeft;monoin iRight;
monoout oLeft; monoout oRight;
movlps xmm7,iRight; //{2,6-1} assuming Mono is position 0
unpcklps xmm7,iLeft; //{5,4-2} position 0=right, 1=left
cvtps2pd xmm0,xmm7; //{3,3-2}

// 64bit process with xmm0

cvtpd2ps xmm7,xmm0; //{10,11-10}
movlps oRight,xmm7; //{2,4-1} hopefully position 1 is ignored by monoout
shufps xmm7,xmm7,1; //{4,6-2} move pos1 to 0 (immediate 1 selects position 1; 2 would select position 2)
movlps oLeft,xmm7; //{2,4-1} hopefully position 1 is ignored by monoout
// fini {total extra latency (-4 movaps):20,20-12?}
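
The dual mono merge can be modelled the same way. This is only a sketch in Python: registers are lists of 4 values, mono is assumed to sit in position 0, and `process64` is a placeholder gain. Note the model uses a shufps immediate of 1 to bring position 1 down to position 0; an immediate of 2 would pick position 2, which cvtpd2ps has just zeroed.

```python
import struct


def f32(x):
    """Round a double to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def unpcklps(dest, src):
    """Interleave the low pairs: [dest0, src0, dest1, src1]."""
    return [dest[0], src[0], dest[1], src[1]]


def shufps(dest, src, imm):
    """Positions 0,1 are picked from dest, positions 2,3 from src, 2 bits each."""
    return [
        dest[imm & 3],
        dest[(imm >> 2) & 3],
        src[(imm >> 4) & 3],
        src[(imm >> 6) & 3],
    ]


def cvtps2pd(src):
    return [float(src[0]), float(src[1])]


def cvtpd2ps(src):
    return [f32(src[0]), f32(src[1]), 0.0, 0.0]


def process64(pair):
    """Placeholder for the 64bit processing: a simple 0.5 gain."""
    return [x * 0.5 for x in pair]


def dual_mono_via_64bit(left, right):
    xmm7 = [right, 0.0, 0.0, 0.0]                   # mono assumed in position 0
    xmm7 = unpcklps(xmm7, [left, 0.0, 0.0, 0.0])    # position 0=right, 1=left
    xmm7 = cvtpd2ps(process64(cvtps2pd(xmm7)))
    o_right = xmm7[0]
    xmm7 = shufps(xmm7, xmm7, 1)                    # move position 1 to 0
    o_left = xmm7[0]
    return o_left, o_right


if __name__ == "__main__":
    print(dual_mono_via_64bit(0.25, 0.75))
```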


sources:
http://www.synthmaker.com/dokuwiki/doku ... _urls#sse2

edit: almost forgot my connector suggestions (bottom 2 are SM's for ref):
[Image: connector suggestions]
I was trying to have cool colors, with 2 of something that looked wider than the Mono dot. I personally like the middle colors, with somewhere between the right & left lines.

edit2: forgot to suggest that dithering down from double to single FP should be rounded off by default

cheers
Last edited by infuzion on Fri Dec 11, 2009 5:55 pm, edited 2 times in total.
Need help? First search the forum & WiKi, then post in the help forum with a clear topic, request, & OSM. Then please WiKi the correct solution. If you want my personal assistance, I charge by the hour or for an exchange of services.
infuzion
smstar
smstar
 
Posts: 6169
Joined: Wed May 04, 2005 8:02 pm
Location: Earth, USA, CO, Denver

Postby rl on Sun Jan 21, 2007 10:36 pm

++vote

Also I'd like to have a double type for the high level code module, even when stream connectors only support floats.
rl
dsp wiz
 
Posts: 1494
Joined: Mon Feb 07, 2005 10:24 pm
Location: de.earth.universe.known

Postby infuzion on Mon Jan 22, 2007 3:55 pm

Thanks for the feedback rl; I was beginning to think people were apathetic to my suggestion (not that that is a bad thing... if no one else wants SSE2, then why should the devs waste their time?)
rl wrote:Also I'd like to have a double type for the high level code module, even when stream connectors only support floats.
So, you're saying Code support should be priority 1.5 on my list? 64bit Code would be cool, but I hope more like 2.5 or 3.5... I hope to start chaining together 64bit ASM sooner, since it's hard to code things like delays & envelopes w/Hop8, which are even harder if done all in one block.

Besides, you can take the ASM output of the old single type Code, paste it into a text editor, find/replace the SSE commands w/SSE2, add my wrapper (if it works ;) ), & copy/paste the results into the 64bit ASM. A bit of work, but not a big deal if you only convert Code when it's finished & releasable.

I wonder how that would work? Make variables declared FloatD vs Float?

Postby sambean on Mon Jan 22, 2007 4:10 pm

How about a code/asm SDK, help cut down on requests and such ;)

Sam
Posts++
sambean
dsp wiz+
 
Posts: 1011
Joined: Thu Jun 02, 2005 12:14 am
Location: Ilha Formosa

Postby rl on Mon Jan 22, 2007 10:35 pm

infuzion wrote:So, you're saying Code support should be priority 1.5 on my list?

No, your suggestions absolutely make sense. It's just that a double type is what I'm missing most, e.g. this algorithm needs double precision, otherwise the filter is unstable.
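
As a generic illustration of that kind of problem (this is not the algorithm referred to above, just a textbook two-pole resonator chosen as an assumption), here is a small Python sketch. The feedback coefficient 2*cos(w) for a 1 Hz resonator at 44.1 kHz is so close to 2 that a single-precision float cannot hold the difference, so it rounds to exactly 2.0. With the coefficient at exactly 2.0 both poles land on z = 1 and the output grows without bound instead of oscillating.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def resonator_coeff(freq_hz: float, sample_rate: float) -> float:
    """Feedback coefficient c = 2*cos(w) of y[n] = c*y[n-1] - y[n-2]."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    return 2.0 * math.cos(w)


def run_resonator(c: float, steps: int) -> float:
    """Iterate y[n] = c*y[n-1] - y[n-2] from y[0]=0, y[1]=1 and return y[steps]."""
    y_prev, y = 0.0, 1.0
    for _ in range(steps - 1):
        y_prev, y = y, c * y - y_prev
    return y


c64 = resonator_coeff(1.0, 44100.0)  # coefficient kept as a double
c32 = f32(c64)                       # same coefficient stored as a single

if __name__ == "__main__":
    print("double coefficient:", repr(c64))
    print("single coefficient:", repr(c32))
    print("double output after 100000 samples:", run_resonator(c64, 100000))
    print("single output after 100000 samples:", run_resonator(c32, 100000))
```

The recurrence itself runs in double precision in both cases; only the stored coefficient differs, which is enough to turn a bounded sine into a ramp.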

On the other hand I haven't really started on ASM optimizations. I guess when starting, I will end up with similar suggestions to yours.

Postby infuzion on Tue Jan 23, 2007 12:00 am

rl wrote:On the other hand I haven't really started on ASM optimizations. I guess when starting, I will end up with similar suggestions to yours.
I'm somewhat comfortable w/ASM now, at least the SSE math stuff.

Postby infuzion on Thu Jan 25, 2007 3:21 pm

hmmm... I'd thought there would be more voters by now... I guess people who use SM don't have a computer? :o

Thanks to those who have voted so far, & for the feedback!

Postby Tom7777 on Fri Jan 26, 2007 8:47 am

I voted - Yes, I have at least one SSE2 computer now

The reason I didn't say anything though is I'm not really a coder, and this seems code related with all those code requests, so maybe that's the reason there's not a load of posts and votes. :)
Tom7777
smychopath
 
Posts: 3932
Joined: Wed Mar 16, 2005 10:46 pm

Postby Nu Audio Science on Sat Jan 27, 2007 12:17 pm

I voted cause all my puters are SSE2 but i didn't comment cause i'm crap at doing anything with SM unless you count booting it up and staring at it blankly O:)
Oh blimey
Nu Audio Science
smunatic
 
Posts: 2237
Joined: Thu May 05, 2005 12:29 am

Postby infuzion on Tue Jan 30, 2007 3:56 am

Thanks for everyone's honest replies & votes!
I'm surprised by the % of "not planning to buy" votes; this means over 10% of our readership are using computers over 3 years old? I suppose we have a lot of poor students or something then...

For those who think a new computer might be too much: I've seen many dual-core desktops & laptops under $500USD here:
http://DealNews.com
& many stores are having fire sales to dump XP computers for Vista ones.

While I normally vote for the underdog (AMD), I'd have to say that for price, wattage, and heat efficiency, it seems a Core 2 Duo will outperform an AMD dual core when it comes to SSE/SSE2. I'd pick up an AMD X2 only when a comparable Core 2 Duo is out of $ reach. "Core Duo" (not 2) is fine, but avoid PentiumD at all costs: too hot & not worth it IMHO.

Postby matti on Thu Feb 01, 2007 6:42 am

infuzion wrote:hmmm... I'd thought there would be more voters by now...

I cast my vote to support the idea. 64 bit would be very useful for people who use asm code modules.

I just recently upgraded from a 600MHz P3 to a 2x2.8GHz PD. I'd say it's worth it (the cheapest Core2Duos cost twice as much), unless you live somewhere where the normal temp inside a room is way over 40 degrees Celsius. Single core AMD setups might give more punch (per euro) if you run just one prog, but Intel dual cores have that magical stability to them. Everything runs so smoooth..
matti
essemilian
 
Posts: 472
Joined: Thu Nov 02, 2006 5:23 pm
Location: Finland

Postby infuzion on Thu Feb 01, 2007 8:06 am

matti wrote:64 bit would be very useful for people who use asm code modules.

I just recently upgraded from a 600MHz P3 to a 2x2.8GHz PD. I'd say it's worth it (the cheapest Core2Duos cost twice as much), unless you live somewhere where the normal temp inside a room is way over 40 degrees Celsius. Single core AMD setups might give more punch (per euro) if you run just one prog, but Intel dual cores have that magical stability to them. Everything runs so smoooth..
Well, sooner or later, I hope 64bit floats would happen throughout all SM primitives, but it would just be a matter of adding a few commands to the ASM, no real programming AFAIK.

The PentiumD is a bargain chip for sure... but:
* To me, hotter = shorter lifespan & perhaps more instability (thinking I could take the computer to live locations, etc)
* Core2Duos are great overclockers, PentDs are simply too hot to try IMHO.
* I'm not sure, but according to some latency charts it seems that SSE commands take fewer cycles on Core2 vs PentD... a verification would be great.
* I'm lazy, & only want to upgrade every 3-4 years, so spending a bit of extra cash will keep me from upgrading early, thus saving time. But I only spend about $300 above cheap to do so, not $900.

matti, please don't think I'm saying you are wrong for grabbing your PD; I've gone the cheapest route before as well. I'm just letting others know some considerations of mine, to equip them to make a well educated choice. Actually, thinking about what you said, I'd suggest that if you find a cheap cheap used PentiumD system in great condition (test it with a burn-in program & temp gauge), that could be the best bargain!

Postby matti on Thu Feb 01, 2007 8:57 am

infuzion wrote:The PentiumD is a bargain chip for sure... but:
* To me, hotter = shorter lifespan & perhaps more instability (thinking I could take the computer to live locations, etc)

Yeah, there's no right or wrong way to go. It's always a balance between features, disadvantages, power and price.

I vote YES !

Postby medanby on Sat Feb 03, 2007 12:08 am

Having just built a nice new dual-core 64bit box of extravagance, it would be great to make use of all the instructions.
Perhaps the CODE module could have instructions to enable creating assembly with or without MMX, SSE, SSE2, etc. Some more MATHS functions in CODE would be nice as well.
The CODE and ASSEMBLY modules make this product very attractive and quite distinct from others.

Looking forward to installing SM 1.0.2

Cheers,

Michael
medanby
essemer
 
Posts: 10
Joined: Mon Nov 06, 2006 12:17 am

Postby infuzion on Sat Feb 03, 2007 6:04 am

Well, if you don't have at least SSE1, then you have a 7+ year old computer, & are most likely not able to afford SM, nor any commercial VSTs made with it. It would be neat to have the Code switch from 32bit SSE to 64bit SSE2 by an input... I guess we'd need a CPU primitive that tells us if the host computer has SSE2 abilities or not. (But it may be a moot point if it takes a few years for SSE2 to get added to SM, & the SSE1-only computers will also be 7+ years old at that point. I'm indifferent, just don't want the dev team working TOO hard)
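
For what such a CPU check would look at: the CPUID instruction (leaf 1) reports SSE in EDX bit 25, SSE2 in EDX bit 26 and SSE3 in ECX bit 0. Below is a minimal Python sketch of just the decoding. Python cannot execute CPUID itself, so the register values are passed in, and the function names and the two path labels are made up for illustration; this is not how SM does or would do it.

```python
# CPUID leaf 1 feature bits relevant to this thread.
SSE_BIT_EDX = 25
SSE2_BIT_EDX = 26
SSE3_BIT_ECX = 0


def decode_sse_features(edx: int, ecx: int) -> dict:
    """Decode SSE/SSE2/SSE3 support from the EDX and ECX values of CPUID leaf 1."""
    return {
        "sse": bool((edx >> SSE_BIT_EDX) & 1),
        "sse2": bool((edx >> SSE2_BIT_EDX) & 1),
        "sse3": bool((ecx >> SSE3_BIT_ECX) & 1),
    }


def pick_code_path(edx: int, ecx: int) -> str:
    """Choose between a hypothetical 64bit SSE2 path and the 32bit SSE path."""
    features = decode_sse_features(edx, ecx)
    return "64bit SSE2" if features["sse2"] else "32bit SSE"


if __name__ == "__main__":
    sse_only_edx = 1 << SSE_BIT_EDX
    sse2_edx = (1 << SSE_BIT_EDX) | (1 << SSE2_BIT_EDX)
    print(pick_code_path(sse_only_edx, 0))
    print(pick_code_path(sse2_edx, 1))
```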

Exactly what math functions in Code do you want Michael?
