Changes
The most notable change is the capability to work horizontally in a register, as opposed to the more or less strictly vertical operation of all previous SSE instructions. More specifically, instructions to add and subtract the multiple values stored within a single register have been added. These instructions can be used to speed up the implementation of a number of DSP and 3D operations. There is also a new instruction to convert floating point values to integers without having to change the global rounding mode, thus avoiding costly pipeline stalls. Finally, the extension adds LDDQU, an alternative misaligned integer vector load that has better performance on NetBurst-based platforms for loads that cross cache-line boundaries.
CPUs with SSE3
* AMD:
** Opteron (since Stepping E4)
** Sempron (since Palermo, Stepping E3)
** Athlon 64 (since Venice Stepping E3 and San Diego Stepping E4)
** Athlon 64 FX (since San Diego Stepping E4)
** Athlon 64 X2
** Phenom 64 X2
** Turion family
** K10 family
** APU family (including without GPU)
** FX Series
New instructions
Common instructions
Arithmetic
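;ADDSUBPD
:''Add-Subtract-Packed-Double''
:* Input: { A0, A1 }, { B0, B1 }
:* Output: { A0 - B0, A1 + B1 }
;ADDSUBPS
:''Add-Subtract-Packed-Single''
:* Input: { A0, A1, A2, A3 }, { B0, B1, B2, B3 }
:* Output: { A0 - B0, A1 + B1, A2 - B2, A3 + B3 }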
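The alternating subtract/add pattern matches the shape of a complex multiplication, (a + bi)(c + di) = (ac - bd) + (ad + bc)i. The following is a minimal sketch in portable C that models the lane behaviour of ADDSUBPS with plain arrays rather than using the real instruction; the function names `addsub_ps` and `complex_mul_ps` are hypothetical and chosen for illustration only. In real code the corresponding compiler intrinsic would be `_mm_addsub_ps` from `<pmmintrin.h>`.

```c
#include <stddef.h>

/* Scalar model of ADDSUBPS on four single-precision lanes:
   even-indexed lanes subtract, odd-indexed lanes add. */
static void addsub_ps(const float a[4], const float b[4], float out[4])
{
    for (size_t i = 0; i < 4; ++i)
        out[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
}

/* Multiply two pairs of complex numbers stored interleaved as
   { re0, im0, re1, im1 }, finishing with a single add-subtract step. */
static void complex_mul_ps(const float x[4], const float y[4], float out[4])
{
    float t1[4], t2[4];
    for (size_t i = 0; i < 4; i += 2) {
        t1[i]     = x[i] * y[i];         /* re_x * re_y */
        t1[i + 1] = x[i] * y[i + 1];     /* re_x * im_y */
        t2[i]     = x[i + 1] * y[i + 1]; /* im_x * im_y */
        t2[i + 1] = x[i + 1] * y[i];     /* im_x * re_y */
    }
    addsub_ps(t1, t2, out);              /* { re, im, re, im } of the products */
}
```

Calling `complex_mul_ps` with x = { 1, 2, 3, 4 } and y = { 5, 6, 7, 8 } multiplies (1 + 2i) by (5 + 6i) and (3 + 4i) by (7 + 8i) in one pass.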
AOS (Array of Structures)
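;HADDPD
:''Horizontal-Add-Packed-Double''
:* Input: { A0, A1 }, { B0, B1 }
:* Output: { A0 + A1, B0 + B1 }
;HADDPS
:''Horizontal-Add-Packed-Single''
:* Input: { A0, A1, A2, A3 }, { B0, B1, B2, B3 }
:* Output: { A0 + A1, A2 + A3, B0 + B1, B2 + B3 }
;HSUBPD
:''Horizontal-Subtract-Packed-Double''
:* Input: { A0, A1 }, { B0, B1 }
:* Output: { A0 - A1, B0 - B1 }
;HSUBPS
:''Horizontal-Subtract-Packed-Single''
:* Input: { A0, A1, A2, A3 }, { B0, B1, B2, B3 }
:* Output: { A0 - A1, A2 - A3, B0 - B1, B2 - B3 }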
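A typical use of the horizontal forms is reducing the lanes of one register to a single value, for example at the end of a dot product. The sketch below models HADDPS in portable C with plain arrays instead of the real instruction; `hadd_ps` and `dot4` are hypothetical names used only for this illustration. The corresponding compiler intrinsic would be `_mm_hadd_ps` from `<pmmintrin.h>`.

```c
/* Scalar model of HADDPS: sums of adjacent pairs of a fill the low
   two lanes, sums of adjacent pairs of b fill the high two lanes. */
static void hadd_ps(const float a[4], const float b[4], float out[4])
{
    /* Compute into a temporary so that out may alias a or b. */
    float r[4] = { a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3] };
    for (int i = 0; i < 4; ++i)
        out[i] = r[i];
}

/* Dot product of two 4-element vectors: one vertical multiply,
   then two horizontal adds reduce four lanes to one value. */
static float dot4(const float x[4], const float y[4])
{
    float p[4], s[4];
    for (int i = 0; i < 4; ++i)
        p[i] = x[i] * y[i];
    hadd_ps(p, p, s);   /* s = { p0+p1, p2+p3, p0+p1, p2+p3 } */
    hadd_ps(s, s, s);   /* every lane now holds p0+p1+p2+p3 */
    return s[0];
}
```

Two horizontal adds are needed because each one only halves the number of distinct partial sums.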
;LDDQU
:As stated above, this is an alternative misaligned integer vector load. It can be helpful for video compression tasks.
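;MOVDDUP, MOVSHDUP, MOVSLDUP
:These duplicate selected elements of the source operand: MOVDDUP copies the low double-precision value into both halves of the destination, MOVSHDUP copies each odd-indexed single-precision value into the even-indexed position below it, and MOVSLDUP copies each even-indexed single-precision value into the odd-indexed position above it. They are useful for complex-number arithmetic and for waveform calculations such as audio processing.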
;FISTTP
:Like the older x87 FISTP instruction, but ignores the floating point control register's rounding mode settings and uses the "chop" (truncate) mode instead. Allows omission of the expensive loading and re-loading of the control register in languages such as C, where the standard requires float-to-int conversion to truncate.
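The truncation requirement mentioned for FISTTP can be seen in any C program. The following minimal sketch shows the conversion that such an instruction serves; `to_int_trunc` is a hypothetical helper name, and whether a given compiler actually emits FISTTP for it depends on the target and the compiler flags, which is an assumption not verified here.

```c
/* C requires a floating-point to integer conversion to discard the
   fractional part (truncate toward zero), regardless of the current
   floating-point rounding mode. */
static int to_int_trunc(double x)
{
    return (int)x;
}
```

Under the default round-to-nearest-even mode, a rounding conversion would turn 2.5 into 2 and 3.5 into 4, whereas the cast above yields 2 and 3.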
Other instructions
;MONITOR, MWAIT
:The MONITOR instruction is used to specify a memory address for monitoring, while the MWAIT instruction puts the processor into a low-power state and waits for a write event to the monitored address.