35.2.4.1. FLOATING POINT VECTOR SUMMARY
+-----------+
| IA-64 | Floating Point
+-----------+
Vector Data Format:
[ SINGLE x 2 ] = 64 bit
FPABS - FP Parallel Absolute Value
FPACK - FP Pack
FPAMAX.sf - FP Parallel Absolute Maximum
FPAMIN.sf - FP Parallel Absolute Minimum
FPCMP.frel.sf - FP Parallel Compare
FPCVT.fx/fxu(.trunc).sf - Convert Parallel FP to Integer
FPMA.sf - FP Parallel Multiply Add
FPMAX.sf - FP Parallel Maximum
FPMERGE.ns/s/se - FP Parallel Merge
FPMIN.sf - FP Parallel Minimum
FPMPY.sf - FP Parallel Multiply
FPMS.sf - FP Parallel Multiply Subtract
FPNEG - FP Parallel Negate
FPNEGABS - FP Parallel Negate Absolute Value
FPNMA.sf - FP Parallel Negative Multiply Add
FPNMPY.sf - FP Parallel Negative Multiply
FPRCPA.sf - FP Parallel Reciprocal Approximation
FPRSQRTA.sf - FP Parallel Reciprocal Square Root Approximation
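The multiply-add family above (FPMA, FPMS, FPNMA) works lane by lane on the two singles of a register. A minimal Python sketch of that behaviour follows. The helper names and the (high, low) tuple model are assumptions made for illustration, and rounding through a double only approximates the hardware's single rounding step.

```python
import struct

def to_f32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fpma(f3, f4, f2):
    """Lane-wise f3 * f4 + f2 on pairs of singles."""
    return tuple(to_f32(a * b + c) for a, b, c in zip(f3, f4, f2))

def fpms(f3, f4, f2):
    """Lane-wise f3 * f4 - f2 on pairs of singles."""
    return tuple(to_f32(a * b - c) for a, b, c in zip(f3, f4, f2))

def fpnma(f3, f4, f2):
    """Lane-wise -(f3 * f4) + f2 on pairs of singles."""
    return tuple(to_f32(-(a * b) + c) for a, b, c in zip(f3, f4, f2))

x = (1.5, 2.0)
y = (2.0, 4.0)
z = (0.25, 1.0)
ma = fpma(x, y, z)     # (3.25, 9.0)
ms = fpms(x, y, z)     # (2.75, 7.0)
nma = fpnma(x, y, z)   # (-2.75, -7.0)
```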
+----------+
| SPARC |
+----------+
SPARC V9 Extended Instructions. (UltraSPARC)
UltraSPARC = V9 + VIS I
Vector Data Format:
Pixel: [ BYTE x 4 ] = 32 bit (8-bit Unsigned Integer)
Fixed Point: [ WORD x 4 ] = 64 bit (16-bit Signed Fixed Point)
Fixed Point: [ DWORD x 2 ] = 64 bit (32-bit Signed Fixed Point)
FPADD16 - Partitioned Add VIS I
FPADD16S VIS I
FPADD32 VIS I
FPADD32S VIS I
FPSUB16 - Partitioned Subtract VIS I
FPSUB16S VIS I
FPSUB32 VIS I
FPSUB32S VIS I
FPACK16 - Pixel formatting VIS I
FPACK32 VIS I
FPACKFIX VIS I
FEXPAND VIS I
FPMERGE VIS I
FMUL8x16 - Partitioned Multiply VIS I
FMUL8x16AU VIS I
FMUL8x16AL VIS I
FMUL8SUx16 VIS I
FMUL8ULx16 VIS I
FMULD8SUx16 VIS I
FMULD8ULx16 VIS I
FCMPGT16 - Pixel compare VIS I
FCMPGT32 VIS I
FCMPLE16 VIS I
FCMPLE32 VIS I
FCMPNE16 VIS I
FCMPNE32 VIS I
FCMPEQ16 VIS I
FCMPEQ32 VIS I
EDGE8 - Edge handling VIS I
EDGE8L VIS I
EDGE8N VIS II
EDGE8LN VIS II
EDGE16 VIS I
EDGE16L VIS I
EDGE16N VIS II
EDGE16LN VIS II
EDGE32 VIS I
EDGE32L VIS I
EDGE32N VIS II
EDGE32LN VIS II
PDIST - Pixel component Distance VIS I
LDDFA - Load Array/etc VIS I
LDDA - Load Quadword atomic VIS I
STDFA - Partial store/etc VIS I
ARRAY8 - 3D Array addressing VIS I
ARRAY16 VIS I
ARRAY32 VIS I
ALIGNADDRESS - Alignment VIS I
ALIGNADDRESS_LITTLE VIS I
FALIGNDATA VIS I
RDASR - Read Graphic status reg US
WRASR - Write Graphic status reg US
SIAM - Set Interval Arithmetic mode VIS II
FZERO FZEROS - Logical VIS I
FONE FONES VIS I
FSRC1 FSRC1S VIS I
FSRC2 FSRC2S VIS I
FNOT1 FNOT1S VIS I
FNOT2 FNOT2S VIS I
FOR FORS VIS I
FNOR FNORS VIS I
FAND FANDS VIS I
FNAND FNANDS VIS I
FXOR FXORS VIS I
FXNOR FXNORS VIS I
FORNOT1 FORNOT1S VIS I
FORNOT2 FORNOT2S VIS I
FANDNOT1 FANDNOT1S VIS I
FANDNOT2 FANDNOT2S VIS I
BMASK - Set GSR.MASK for Shuffle instr. VIS II
BSHUFFLE - Shuffle VIS II
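Two of the VIS operations listed above are easy to model on plain 64-bit integers: the partitioned add (FPADD16), where each 16-bit lane wraps independently, and the pixel distance (PDIST), which accumulates the sum of absolute byte differences. The sketch below is an illustration only; the function names mirror the mnemonics but the Python signatures are assumptions.

```python
MASK16 = 0xFFFF

def fpadd16(a, b):
    """Add four independent 16-bit lanes held in 64-bit integers."""
    out = 0
    for lane in range(4):
        shift = 16 * lane
        s = ((a >> shift) & MASK16) + ((b >> shift) & MASK16)
        out |= (s & MASK16) << shift   # carry out of the lane is discarded
    return out

def pdist(a, b, acc):
    """Add the sum of absolute differences of eight unsigned bytes to acc."""
    for lane in range(8):
        shift = 8 * lane
        acc += abs(((a >> shift) & 0xFF) - ((b >> shift) & 0xFF))
    return acc

sum16 = fpadd16(0x0001_FFFF_7FFF_0010, 0x0001_0001_0001_0020)
dist = pdist(0x0102030405060708, 0x0807060504030201, 0)
print(hex(sum16))  # 0x2000080000030
print(dist)        # 32
```

Note how the lane holding 0xFFFF + 0x0001 wraps to 0x0000 without disturbing its neighbour.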
+----------+
| MIPS |
+----------+
MIPS V
.PS [ SINGLE x 2 ] = 64 bit PS - paired single
LUXC1 - Load PS Unaligned
LDXC1 - Load PS Aligned
LWXC1 - Load PS
SUXC1 - Store PS Unaligned
SDXC1 - Store PS Aligned
SWXC1 - Store PS
ALNV.PS - Align Variable (handles vectors that are not 8-byte aligned)
Example:
luxc1 f0,bc
ldxc1 f1,de
f0 B C
f1 D E
alnv.ps f2,f0,f1,T0
f2 C D
ADD.PS/SUB.PS - Add/Subtract
MUL.PS - Multiply
ABS.PS - Absolute Value
MOV.PS - Move
NEG.PS - Negate
CVT.S.PU - Convert from PS Upper to S
CVT.S.PL - Convert from PS Lower to S
CVT.PS.S - Convert to PS from 2 S
PLL.PS - PS from two PS (Low,Low)
PLU.PS - PS from two PS (Low,Up)
PUL.PS - PS from two PS (Up,Low)
PUU.PS - PS from two PS (Up,Up)
C.XX - FP Vector Compare/Branch
MADD.PS - Multiply/Add
MSUB.PS - Multiply/Sub
NMADD.PS - Negate Multiply/Add
NMSUB.PS - Negate Multiply/Sub
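The ALNV.PS example above (B C and D E combining into C D) can be sketched in Python. This is a hedged model, not the architectural definition: registers are modelled as (upper, lower) tuples, a big-endian layout is assumed, and the function name and error handling are illustrative.

```python
def alnv_ps(fs, ft, rs):
    """Model of ALNV.PS on (upper, lower) pairs, big-endian assumed.

    Only the low three bits of rs matter: 0 means the data was already
    aligned, 4 means it straddles two doublewords.
    """
    offset = rs & 7
    if offset == 0:
        return fs                  # aligned: take the first register as is
    if offset == 4:
        return (fs[1], ft[0])      # lower half of fs, upper half of ft
    raise ValueError("offset must be 0 or 4 for paired-single data")

f0 = ("B", "C")
f1 = ("D", "E")
f2 = alnv_ps(f0, f1, 4)
print(f2)  # ('C', 'D')
```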
MIPS64 is a superset of the MIPS V Instruction Set Architecture.
It contains a new data type, paired-single. This data type provides
2-way SIMD capability: two 32-bit single-precision floating-point
values packed in one 64-bit register.
Vector Data Format:
.PS [ SINGLE x 2 ] = 64 bit
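To make the paired-single layout concrete, the sketch below packs two singles into one 64-bit value and performs a lane-wise add in the manner of ADD.PS. The helper names are hypothetical, and placing the upper element in the most significant 32 bits is an assumption of this illustration.

```python
import struct

def pack_ps(upper, lower):
    """Pack two singles into one 64-bit integer (upper in the high bits)."""
    return int.from_bytes(struct.pack(">ff", upper, lower), "big")

def unpack_ps(reg):
    """Split a 64-bit integer back into its (upper, lower) singles."""
    return struct.unpack(">ff", reg.to_bytes(8, "big"))

def add_ps(x, y):
    """Lane-wise add of two packed registers, in the manner of ADD.PS."""
    xu, xl = unpack_ps(x)
    yu, yl = unpack_ps(y)
    return pack_ps(xu + yu, xl + yl)

reg = pack_ps(1.0, -2.0)
print(hex(reg))                                 # 0x3f800000c0000000
total = unpack_ps(add_ps(reg, pack_ps(0.5, 0.5)))
print(total)                                    # (1.5, -1.5)
```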
MIPS-3D ASE (Application Specific Extensions):
ADDR.PS - F.P. Reduction add
MULR.PS - F.P. Reduction multiply
RECIP1.S - Reciprocal with reduced precision result
RECIP1.D
RECIP1.PS
RECIP2.S - Reciprocal 2nd step
RECIP2.D
RECIP2.PS
RSQRT1.S - Reciprocal square root with reduced precision result
RSQRT1.D
RSQRT1.PS
RSQRT2.S - Reciprocal square root 2nd step
RSQRT2.D
RSQRT2.PS
CVT.PS.PW - Convert two 32-bit integers to F.P. paired-single
CVT.PW.PS - Convert F.P. paired-single to two 32-bit integers
CABS.cond.S - F.P. Absolute Values Compare
CABS.cond.D
CABS.cond.PS
BC1ANY2F cc - Branch on any of two FP condition codes false
BC1ANY2T cc - Branch on any of two FP condition codes true
BC1ANY4F cc - Branch on any of four FP condition codes false
BC1ANY4T cc - Branch on any of four FP condition codes true
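The RECIP1/RECIP2 and RSQRT1/RSQRT2 pairs above split a reciprocal into a reduced-precision seed and a second refinement step. The usual technique behind such a split is Newton-Raphson iteration. The sketch below shows that iteration in plain Python under stated assumptions: the seed value is made up, the function name is hypothetical, and the code does not claim to reproduce the exact operation RECIP2 performs.

```python
def refine_reciprocal(a, x0):
    """One Newton-Raphson step toward 1/a from the estimate x0."""
    error = 1.0 - a * x0      # how far a * x0 is from 1
    return x0 + x0 * error    # a multiply-add finishes the step

a = 3.0
x0 = 0.3                      # stand-in for a reduced-precision seed
x1 = refine_reciprocal(a, x0)
x2 = refine_reciprocal(a, x1)
print(x0, x1, x2)
```

Each step roughly squares the relative error, which is why one or two refinement steps after a coarse seed are normally enough for single precision.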
+----------+
| x86 |
+----------+
KNI (Katmai New Instructions) SIMD [Pentium 3]
----------------------------------
XMM Registers (128-bit)
Vector Data Format:
[ SINGLE x 4 ] = 128 bit
ADDPS - Packed Single F.P Add
ANDNPS - Bit-wise Logical And-Not for Single-FP
ANDPS - Bit-wise Logical And For Single-FP
CMPPS - Packed Single-FP Compare
CVTPI2PS - Packed Signed INT32 to Packed Single-FP Conversion
CVTPS2PI - Packed Single-FP to Packed INT32 Conversion
CVTTPS2PI - Packed Single-FP to Packed INT32 Conversion (Truncate)
DIVPS - Packed Single FP Divide
MAXPS - Packed Single FP Maximum
MINPS - Packed Single FP Minimum
MOVAPS - Move Aligned Four Packed Single-FP
MOVUPS - Move Unaligned Four Packed Single-FP
MULPS - Packed Single-FP Multiply
ORPS - Bit-wise Logical OR for Single-FP Data
RCPPS - Packed Single-FP Reciprocal
RSQRTPS - Packed Single-FP Square Root Reciprocal
SHUFPS - Shuffle Single-FP
SQRTPS - Packed Single-FP Square Root
SUBPS - Packed Single-FP Subtract
XORPS - Bit-wise Logical XOR for Single-FP Data
(And some other operations)
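Of the operations listed above, SHUFPS is the least obvious from its name alone: an 8-bit immediate selects two elements from the destination and two from the source. A small Python model, with registers as four-element lists (index 0 is the lowest lane) and a function name chosen for illustration:

```python
def shufps(dest, src, imm8):
    """Model of SHUFPS: two lanes picked from dest, two from src."""
    return [
        dest[imm8 & 3],          # result lane 0 <- dest, selector bits 1:0
        dest[(imm8 >> 2) & 3],   # result lane 1 <- dest, selector bits 3:2
        src[(imm8 >> 4) & 3],    # result lane 2 <- src,  selector bits 5:4
        src[(imm8 >> 6) & 3],    # result lane 3 <- src,  selector bits 7:6
    ]

a = [1.0, 2.0, 3.0, 4.0]
broadcast = shufps(a, a, 0x00)   # [1.0, 1.0, 1.0, 1.0]
reverse = shufps(a, a, 0x1B)     # [4.0, 3.0, 2.0, 1.0]
print(broadcast, reverse)
```

Using the same register for both operands, as here, turns the instruction into a general four-lane permute.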
Willamette SIMD2 Extensions [Pentium 4]:
-------------------------
[ DOUBLE x 2 ] = 128 bit
ADDPD - Add packed Double-FP
ANDNPD - AND-NOT packed Double-FP
ANDPD - AND packed Double-FP
CMPxxPD - Packed Double-FP Compare
DIVPD - Divide packed Double-FP
MAXPD - Packed Double-FP Maximum
MINPD - Packed Double-FP Minimum
MULPD - Multiply packed Double-FP
ORPD - OR packed Double-FP
SHUFPD - Shuffle
SQRTPD - Packed Double-FP Square Root
SUBPD - Sub packed Double-FP
XORPD - XOR packed Double-FP
Also, conversions:
CVTPD2PI CVTPI2PD
CVTPD2DQ
CVTPD2PS CVTPS2PD
CVTTPD2PI
CVTTPD2DQ
And moving:
MOVAPD - Aligned
MOVHPD - High
MOVLPD - Low
MOVMSKPD - Extract Sign Mask
MOVUPD - Unaligned
Packs/Unpacks operations not described here.
Prescott New Instructions (PNI):
---------------------------
ADDSUBPD - Double FP Add/Sub
ADDSUBPS - Single FP Add/Sub
HADDPD - Packed double-FP Horizontal Add
HADDPS - Packed single-FP Horizontal Add
HSUBPD - Packed double-FP Horizontal Sub
HSUBPS - Packed single-FP Horizontal Sub
LDDQU - Load Unaligned 128-bit integer
MOVDDUP - Move one double-FP and duplicate
MOVSHDUP - Move one single-FP high and duplicate
MOVSLDUP - Move one single-FP low and duplicate
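The horizontal and add/sub operations above differ from ordinary packed arithmetic in which lanes they combine. The Python sketch below models HADDPS and ADDSUBPS on four-element lists (index 0 is the lowest lane) and uses two horizontal adds to finish a dot product. Function names follow the mnemonics, but the list-based signatures are assumptions of this illustration.

```python
def haddps(dest, src):
    """Model of HADDPS: sums of adjacent pairs, dest pairs first."""
    return [dest[0] + dest[1], dest[2] + dest[3],
            src[0] + src[1], src[2] + src[3]]

def addsubps(dest, src):
    """Model of ADDSUBPS: subtract in even lanes, add in odd lanes."""
    return [dest[0] - src[0], dest[1] + src[1],
            dest[2] - src[2], dest[3] + src[3]]

a = [1.0, 2.0, 3.0, 4.0]
b = [5.0, 6.0, 7.0, 8.0]

products = [x * y for x, y in zip(a, b)]   # what MULPS would produce
partial = haddps(products, products)
dot = haddps(partial, partial)[0]          # every lane now holds the sum
print(dot)                                 # 70.0

mixed = addsubps(a, [10.0, 20.0, 30.0, 40.0])
print(mixed)                               # [-9.0, 22.0, -27.0, 44.0]
```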
+-----------+
|  PowerPC  |
+-----------+
PowerPC G4 AltiVec ISA extension:
Vector Data Format:
V-registers (128 bit)
[ 4 x SINGLE FP ] = 128 bit
LVEWX - Load Vector Element Word Indexed
LVX - Load Vector Indexed
LVXL - Load Vector Indexed LRU
STVEWX - Store Vector Element Word Indexed
STVX - Store Vector Indexed
STVXL - Store Vector Indexed LRU
VADDFP - Vector Add FP
VCFSX - Vector Convert from Signed Fixed-point word
VCFUX
VCMPBFP/VCMPBFP. - Vector Compare Bounds FP
VCMPEQFP/VCMPEQFP. - Vector Compare Equal-to-FP
VCMPGEFP/VCMPGEFP. - Vector Compare Greater Than or Equal to FP
VCMPGTFP/VCMPGTFP. - Vector Compare Greater Than FP
VCTSXS - Vector Convert to Signed Fixed-point Word Saturate
VCTUXS - Vector Convert to Unsigned Fixed-point Word Saturate
VEXPTEFP - Vector 2 Raised to the Exponent Estimate Floating Point
VLOGEFP - Vector Log2 Estimate Floating Point
VMADDFP - Vector Multiply Add Floating Point
VMAXFP - Vector Maximum Floating Point
VMINFP - Vector Minimum FP
VNMSUBFP - Vector Negative Multiply-Subtract FP
VREFP - Vector Reciprocal Estimate FP
VRFIM - Vector Round to FP Integer toward Minus Infinity
VRFIN - Vector Round to FP Integer Nearest
VRFIP - Vector Round to FP Integer toward Plus Infinity
VRFIZ - Vector Round to FP Integer toward Zero
VRSQRTEFP - Vector Reciprocal Square Root Estimate Floating Point
VSUBFP - Vector Subtract FP
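The four rounding operations in the list above (VRFIM, VRFIN, VRFIP, VRFIZ) differ only in rounding direction. The sketch below models them on four-element lists with Python's standard functions. It rests on two assumptions: that "nearest" resolves ties to the even integer, which is how Python's round() behaves, and that losing the sign of a zero result is acceptable for illustration.

```python
import math

def vrfim(v):
    """Round each lane toward minus infinity."""
    return [float(math.floor(x)) for x in v]

def vrfip(v):
    """Round each lane toward plus infinity."""
    return [float(math.ceil(x)) for x in v]

def vrfiz(v):
    """Round each lane toward zero."""
    return [float(math.trunc(x)) for x in v]

def vrfin(v):
    """Round each lane to the nearest integer (ties to even assumed)."""
    return [float(round(x)) for x in v]

v = [-1.5, -0.5, 0.5, 2.5]
down = vrfim(v)      # [-2.0, -1.0, 0.0, 2.0]
up = vrfip(v)        # [-1.0, 0.0, 1.0, 3.0]
toward0 = vrfiz(v)   # [-1.0, 0.0, 0.0, 2.0]
nearest = vrfin(v)   # [-2.0, 0.0, 0.0, 2.0]
print(down, up, toward0, nearest)
```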