X-bit Labs has an overview of the new instructions offered by Intel's 22nm Haswell:
The instructions fit into several categories:
AVX2 - Integer data types expanded to 256-bit SIMD. AVX2’s integer support is particularly useful for processing visual data commonly encountered in consumer imaging and video processing workloads. With Haswell, Intel will have AVX for floating point, and AVX2 for integer data types.
Bit manipulation instructions are useful for compressed databases, hashing, large number arithmetic, and a variety of general purpose codes.
Gather is useful for vectorizing codes with nonadjacent data elements. Haswell gathers are masked for safety (like the conditional loads and stores introduced in Intel AVX), which favors their use in codes with clipping or other conditionals.
Any-to-Any permutes – useful shuffling operations. Haswell adds support for DWORD and QWORD granularity permutes across an entire 256-bit register.
Vector-Vector Shifts: We added shifts with vector shift controls, where each element is shifted by its own count. These are critical in vectorizing loops with variable shifts.
Floating Point Multiply Accumulate – Intel’s floating-point multiply accumulate significantly increases peak flops and provides improved precision, which further benefits transcendental mathematics. These instructions are broadly usable in high performance computing, professional quality imaging, and face detection. They operate on scalar, 128-bit packed, and 256-bit packed single- and double-precision data types.
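The precision claim for multiply accumulate comes from rounding once instead of twice. The sketch below is a pure-Python model of that behavior, not the hardware instruction itself: `fma_model` and `mul_then_add` are hypothetical helper names, and exact rational arithmetic stands in for the fused unit's unrounded intermediate product.

```python
from fractions import Fraction


def fma_model(a: float, b: float, c: float) -> float:
    """Model a fused multiply-add: form a*b + c exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no intermediate rounding
    return float(exact)  # single rounding to the nearest double


def mul_then_add(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded before the add."""
    return a * b + c


# Inputs chosen so that the exact product 1 - 2**-60 is not representable
# as a double and rounds to 1.0 when the multiply is done on its own.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

unfused = mul_then_add(a, b, c)  # product rounds to 1.0, so the sum is 0.0
fused = fma_model(a, b, c)       # keeps the low-order term: -2**-60

print("separate multiply and add:", unfused)
print("fused model:              ", fused)
```

With these inputs the separate multiply loses the small term entirely, while the fused model preserves it. That kind of cancellation shows up in polynomial evaluation, which is one reason transcendental routines benefit.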
The vector instructions build upon the expanded (256-bit) register state added in Intel AVX, and as such are supported by any operating system that supports Intel AVX.
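For readers unfamiliar with the lane-wise behavior of masked gathers and vector-vector shifts, here is a minimal scalar model in Python. It is a sketch of the semantics only: the function names `masked_gather` and `variable_shift_left` are made up for illustration, Python lists stand in for 256-bit registers holding eight 32-bit lanes, and real code would use the corresponding compiler intrinsics.

```python
def masked_gather(base, indices, mask, fallback):
    """Lane i loads base[indices[i]] if mask[i] is set, else keeps fallback[i].

    Masked-off lanes never touch memory, so an out-of-range index in such a
    lane is harmless. This is the safety property of masked gathers.
    """
    return [base[idx] if on else keep
            for idx, on, keep in zip(indices, mask, fallback)]


def variable_shift_left(values, counts, width=32):
    """Shift each lane left by its own count.

    Counts of `width` or more produce 0, matching the behavior of the AVX2
    per-element logical shifts.
    """
    lane_mask = (1 << width) - 1
    return [((v << c) & lane_mask) if c < width else 0
            for v, c in zip(values, counts)]


table = [10, 20, 30, 40, 50, 60, 70, 80]
indices = [7, 0, 3, 99, 2, 5, 1, 4]   # lane 3 holds an invalid index
mask = [True, True, True, False, True, True, True, True]
gathered = masked_gather(table, indices, mask, [0] * 8)

values = [1, 1, 1, 1, 3, 3, 0x80000000, 5]
counts = [0, 1, 4, 31, 1, 32, 1, 2]
shifted = variable_shift_left(values, counts)

print(gathered)
print(shifted)
```

Lane 3 of the gather is masked off, so its invalid index is never dereferenced and the fallback value is kept. In the shift, each lane uses a different count, which is what a loop with a data-dependent shift amount needs in order to vectorize.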