Floating-point down convert and narrow to BFloat16 (top, predicated)
Convert active 32-bit single-precision elements from the source vector to BFloat16 format, and place the results in the odd-numbered 16-bit elements of the destination vector, leaving the even-numbered elements unchanged. Inactive elements in the destination vector register remain unmodified.
Unlike the BFloat16 matrix multiplication and dot product instructions, this instruction honors all of the FPCR bits that apply to single-precision arithmetic. It can also generate a floating-point exception that causes cumulative exception bits in the FPSR to be set, or a synchronous exception to be taken, depending on the enable bits in the FPCR.
ID_AA64ZFR0_EL1.BF16 indicates whether this instruction is implemented.
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | Pg | Zn | Zd |
if (!HaveSVE() && !HaveSME()) || !HaveBF16Ext() then UNDEFINED; integer g = UInt(Pg); integer n = UInt(Zn); integer d = UInt(Zd);
<Zd> |
Is the name of the destination scalable vector register, encoded in the "Zd" field. |
<Pg> |
Is the name of the governing scalable predicate register P0-P7, encoded in the "Pg" field. |
<Zn> |
Is the name of the first source scalable vector register, encoded in the "Zn" field. |
CheckSVEEnabled(); integer elements = VL DIV 32; bits(PL) mask = P[g]; bits(VL) operand = if AnyActiveElement(mask, 32) then Z[n] else Zeros(); bits(VL) result = Z[d]; for e = 0 to elements-1 if ElemP[mask, e, 32] == '1' then bits(32) element = Elem[operand, e, 32]; Elem[result, 2*e+1, 16] = FPConvertBF(element, FPCR[]); Z[d] = result;
Internal version only: isa v33.11seprel, AdvSIMD v29.05, pseudocode v2021-09_rel, sve v2021-09_rc3d ; Build timestamp: 2021-10-06T11:41
Copyright © 2010-2021 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.