
Overview of the new instructions (cut and pasted from the PDF):

AVX2 promotes the vast majority of 128-bit integer SIMD instruction sets to operate with 256-bit wide YMM registers. AVX2 instructions are encoded using the VEX prefix and require the same operating system support as AVX. Generally, most of the promoted 256-bit vector integer instructions follow the 128-bit lane operation, similar to the promoted 256-bit floating-point SIMD instructions in AVX. Newer functionalities in AVX2 generally fall into the following categories:

• Fetching non-contiguous data elements from memory using vector-index memory addressing. These “gather” instructions introduce a new memory-addressing form, consisting of a base register and multiple indices specified by a vector register (either XMM or YMM). Data elements sizes of 32 and 64-bits are supported, and data types for floating-point and integer elements are also supported.

• Cross-lane functionalities are provided with several new instructions for broadcast and permute operations. Some of the 256-bit vector integer instructions promoted from legacy SSE instruction sets also exhibit cross-lane behavior, e.g. VPMOVZ/VPMOVS family.

• AVX2 complements the AVX instructions that are typed for floating-point operation with a full complement of equivalent set for operating with 32/64-bit integer data elements.

• Vector shift instructions with per-element shift count. Data elements sizes of 32 and 64-bits are supported.
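To make the gather and per-element shift bullets concrete, here is a rough scalar model in Python. It is only a sketch of the semantics, not the real instructions: the function names `gather` and `shift_left_variable` are made up for illustration, memory is modelled as a list indexed by element rather than by byte, and the mask operand that the real gather instructions take is left out.

```python
def gather(memory, base, indices, scale=1):
    """Scalar model of a vector gather: lane k loads memory[base + indices[k] * scale].

    Assumption: memory is a Python list indexed by element, not by byte.
    """
    return [memory[base + i * scale] for i in indices]


def shift_left_variable(values, counts, bits=32):
    """Scalar model of a per-element left shift: lane k is shifted by counts[k].

    Counts of `bits` or more produce 0, and results are truncated to `bits` bits.
    """
    mask = (1 << bits) - 1
    return [(v << c) & mask if c < bits else 0 for v, c in zip(values, counts)]


# Eight 32-bit lanes, as in one YMM register.
memory = list(range(100, 132))
lanes = gather(memory, base=4, indices=[0, 3, 1, 7, 2, 2, 9, 5])

# Each lane gets its own shift count.
shifted = shift_left_variable([1, 1, 1, 0xFFFFFFFF], [0, 4, 31, 8])
```

The point of the model is that the indices come from a vector, so each lane can read from an unrelated location, and likewise each lane can carry its own shift count instead of sharing one.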



I find the "gather" functionality especially intriguing.

APL and J have "transpose" instructions that let you rearrange the dimensions of a multidimensional array. I figured out a while back how that could be done without having to move all the individual elements in memory; basically, keep track of the 'step size' between elements on each axis, and you can shuffle them around as much as you like. Of course, once you've done that, you're jumping back and forth all over the array to pull in each element when/where you want it.
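The step-size trick can be sketched in a few lines of Python. This is my own minimal illustration, not how APL or J actually implement it: `StridedView` and its methods are hypothetical names, and the layout is assumed to be row-major over a flat list.

```python
class StridedView:
    """A view over a flat buffer described by a shape and a per-axis step size."""

    def __init__(self, data, shape, strides=None, offset=0):
        self.data = data
        self.shape = tuple(shape)
        if strides is None:
            # Assumption: row-major layout, so the last axis has step 1.
            strides, step = [], 1
            for n in reversed(self.shape):
                strides.append(step)
                step *= n
            strides.reverse()
        self.strides = tuple(strides)
        self.offset = offset

    def transpose(self, *axes):
        """Reorder the axes by permuting shape and strides; no element moves."""
        return StridedView(
            self.data,
            [self.shape[a] for a in axes],
            [self.strides[a] for a in axes],
            self.offset,
        )

    def flat_index(self, *idx):
        """Position in the flat buffer of the element at the given coordinates."""
        return self.offset + sum(i * s for i, s in zip(idx, self.strides))

    def __getitem__(self, idx):
        return self.data[self.flat_index(*idx)]


data = list(range(6))            # 2 x 3 matrix: [[0, 1, 2], [3, 4, 5]]
a = StridedView(data, (2, 3))    # strides (3, 1)
t = a.transpose(1, 0)            # shape (3, 2), strides (1, 3)
```

Here `t[2, 1]` and `a[1, 2]` read the same slot of the same buffer. The cost shows up in `flat_index`: walking along a row of `t` steps through the buffer 3 elements at a time instead of 1.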

Well, looky what those "gather" instructions do! Promising. Very promising...
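Roughly what I have in mind, again as a scalar Python sketch rather than real AVX2 code: the step sizes of the transposed view turn directly into an index vector, and one gather pulls in a whole row. `row_indices` and `gather` are hypothetical helpers, and indexing is by element, not by byte.

```python
def row_indices(offset, row, row_stride, col_stride, ncols):
    """Flat positions of every element in one row of a strided view."""
    start = offset + row * row_stride
    return [start + c * col_stride for c in range(ncols)]


def gather(memory, indices):
    """Scalar stand-in for a vector gather: one load per index."""
    return [memory[i] for i in indices]


# A 4 x 8 row-major matrix where element (r, c) holds 10 * r + c.
rows, cols = 4, 8
data = [10 * r + c for r in range(rows) for c in range(cols)]

# The transposed view has shape (8, 4) and strides (1, 8): swapping the axes
# swaps the step sizes, and the data itself stays where it is.
t_row_stride, t_col_stride = 1, cols

# Row 3 of the transposed view is column 3 of the original matrix.
indices = row_indices(0, 3, t_row_stride, t_col_stride, rows)
row3 = gather(data, indices)
```

With a hardware gather, the list comprehension in `gather` would become a single instruction fed by the index vector. Whether that actually beats plain strided loads would depend on the implementation, so treat this as a guess at the use case.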



