CPYFPWTRN, CPYFMWTRN, CPYFEWTRN

Memory Copy Forward-only, writes unprivileged, reads non-temporal. These instructions perform a memory copy. The prologue, main, and epilogue instructions are expected to be run in succession and to appear consecutively in memory: CPYFPWTRN, then CPYFMWTRN, and then CPYFEWTRN.

CPYFPWTRN performs some preconditioning of the arguments suitable for using the CPYFMWTRN instruction, and performs an implementation defined amount of the memory copy. CPYFMWTRN performs an implementation defined amount of the memory copy. CPYFEWTRN performs the last part of the memory copy.

Because the amount copied by CPYFPWTRN and by CPYFMWTRN is implementation defined, an implementation can optimize how the copy is divided between the three instructions.

The memory copy performed by these instructions is in the forward direction only, so the instructions are suitable for a memory copy only where there is no overlap between the source and destination locations, or where the source address is greater than the destination address.
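The overlap rule above can be illustrated in ordinary C. This is a minimal sketch, not part of the architecture: `forward_copy_is_safe` and `forward_copy` are hypothetical helper names, and the byte-wise loop only stands in for the lowest-address-first order of a forward-only copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when a forward-only copy of n bytes from
 * src to dst gives the same result as a copy through a temporary buffer.
 * That holds when the ranges do not overlap, or when src is above dst
 * (src == dst is trivially fine as well). */
static int forward_copy_is_safe(uintptr_t dst, uintptr_t src, size_t n)
{
    if (n == 0 || src >= dst)
        return 1;
    /* src is below dst: safe only if the source range ends at or before dst. */
    return (dst - src) >= n;
}

/* Plain byte-wise forward copy, lowest address first. */
static void forward_copy(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```

A caller that cannot rule out a source range that starts below and overlaps the destination would need a copy that can also run backwards, which these forward-only instructions do not provide.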

The architecture supports two algorithms for the memory copy: option A and option B. Which algorithm is used is implementation defined.

Portable software should not assume that the choice of algorithm is constant.

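After execution of CPYFPWTRN, option A (which results in encoding PSTATE.C = 0):

- If Xn<63> == 1, the copy size is saturated to 0x7FFFFFFFFFFFFFFF.
- Xs holds the original Xs + saturated Xn.
- Xd holds the original Xd + saturated Xn.
- Xn holds -1 * saturated Xn + an implementation defined number of bytes copied.
- PSTATE.{N,Z,V} are set to {0,0,0}.

After execution of CPYFPWTRN, option B (which results in encoding PSTATE.C = 1):

- If Xn<63> == 1, the copy size is saturated to 0x7FFFFFFFFFFFFFFF.
- Xs holds the original Xs + an implementation defined number of bytes copied.
- Xd holds the original Xd + an implementation defined number of bytes copied.
- Xn holds the saturated Xn - an implementation defined number of bytes copied.
- PSTATE.{N,Z,V} are set to {0,0,0}.

For CPYFMWTRN, option A (encoded by PSTATE.C = 0), the format of the arguments is:

- Xn is treated as a signed 64-bit number and holds -1 * the number of bytes remaining to be copied in the memory copy in total.
- Xs holds the lowest address that the copy is copied from, minus Xn.
- Xd holds the lowest address that the copy is made to, minus Xn.
- At the end of the instruction, Xn is written back with -1 * the number of bytes remaining to be copied in the memory copy in total.

For CPYFMWTRN, option B (encoded by PSTATE.C = 1), the format of the arguments is:

- Xn holds the number of bytes remaining to be copied in the memory copy in total.
- Xs holds the lowest address that the copy is copied from.
- Xd holds the lowest address that the copy is copied to.
- At the end of the instruction, Xn is written back with the number of bytes remaining to be copied in the memory copy in total, Xs with the lowest address that has not been copied from, and Xd with the lowest address that has not been copied to.

For CPYFEWTRN, option A (encoded by PSTATE.C = 0), the format of the arguments is:

- Xn is treated as a signed 64-bit number and holds -1 * the number of bytes remaining to be copied in the memory copy in total.
- Xs holds the lowest address that the copy is copied from, minus Xn.
- Xd holds the lowest address that the copy is made to, minus Xn.
- At the end of the instruction, Xn is written back with 0.

For CPYFEWTRN, option B (encoded by PSTATE.C = 1), the format of the arguments is:

- Xn holds the number of bytes remaining to be copied in the memory copy in total.
- Xs holds the lowest address that the copy is copied from.
- Xd holds the lowest address that the copy is copied to.
- At the end of the instruction, Xn is written back with 0, Xs with the lowest address that has not been copied from, and Xd with the lowest address that has not been copied to.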
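The two register encodings can be compared with a small C model. This is a sketch derived from the Operation pseudocode on this page, in which option A accesses memory at Xs + Xn and Xd + Xn with a negative Xn, while option B advances Xs and Xd directly. The struct and function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the state handed from one stage to the next. */
struct mops_state {
    uint64_t xd, xs, xn;  /* register values left by the previous stage */
    int c;                /* PSTATE.C: 0 selects option A, 1 selects option B */
};

/* Number of bytes still to be copied. */
static uint64_t mops_bytes_remaining(const struct mops_state *st)
{
    assert(st->c == 0 || st->c == 1);
    /* Option A keeps Xn as a negative two's complement count. */
    return st->c == 0 ? (uint64_t)0 - st->xn : st->xn;
}

/* Lowest destination address that has not been written yet. */
static uint64_t mops_next_dst(const struct mops_state *st)
{
    /* Option A biases Xd by the saturated size, so Xd + Xn is the next byte. */
    return st->c == 0 ? st->xd + st->xn : st->xd;
}

/* Lowest source address that has not been read yet. */
static uint64_t mops_next_src(const struct mops_state *st)
{
    return st->c == 0 ? st->xs + st->xn : st->xs;
}
```

For example, a 100-byte copy from 0x2000 to 0x1000 whose prologue copied 16 bytes would leave Xd = 0x1064, Xs = 0x2064, Xn = -84 under option A and Xd = 0x1010, Xs = 0x2010, Xn = 84 under option B. Both describe the same remaining work, which is why software has to treat the registers as opaque between the three instructions.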

Integer
(FEAT_MOPS)

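Bits     Field    Value
31:30    sz       variable
29:27    (fixed)  011
26       (fixed)  0
25:24    (fixed)  01
23:22    op1      variable
21       (fixed)  0
20:16    Rs       variable
15:12    op2      1001
11:10    (fixed)  01
9:5      Rn       variable
4:0      Rd       variable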

Epilogue (op1 == 10)

CPYFEWTRN [<Xd>]!, [<Xs>]!, <Xn>!

Main (op1 == 01)

CPYFMWTRN [<Xd>]!, [<Xs>]!, <Xn>!

Prologue (op1 == 00)

CPYFPWTRN [<Xd>]!, [<Xs>]!, <Xn>!

if !HaveFeatMOPS() then UNDEFINED;
if sz != '00' then UNDEFINED;

integer d = UInt(Rd);
integer s = UInt(Rs);
integer n = UInt(Rn);
bits(4) options = op2;

MOPSStage stage;
case op1 of
    when '00' stage = MOPSStage_Prologue;
    when '01' stage = MOPSStage_Main;
    when '10' stage = MOPSStage_Epilogue;
    otherwise SEE "Memory Copy and Memory Set";

if d == s || s == n || d == n then UNDEFINED;
if d == 31 || s == 31 || n == 31 then UNDEFINED;
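The decode rules above can be sketched as a C function that takes a 32-bit instruction word. This is an illustrative model only: the type and function names are hypothetical, the HaveFeatMOPS() check is left out, and the fixed-bit mask is derived from the encoding diagram on this page (bits 29:24 = 011001, bit 21 = 0, bits 15:12 = 1001, bits 11:10 = 01).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum mops_stage { MOPS_PROLOGUE, MOPS_MAIN, MOPS_EPILOGUE };
enum decode_result { DEC_OK, DEC_OTHER_ENCODING, DEC_UNDEFINED };

/* Hypothetical container for the decoded fields. */
struct cpyf_fields {
    enum mops_stage stage;
    unsigned d, s, n;     /* register numbers taken from Rd, Rs, Rn */
    unsigned options;     /* op2, which is 1001 for these instructions */
};

static enum decode_result decode_cpyfwtrn(uint32_t insn, struct cpyf_fields *out)
{
    assert(out != NULL);

    /* Fixed bits from the encoding diagram; anything else is not this page. */
    if ((insn & 0x3f20fc00u) != 0x19009400u)
        return DEC_OTHER_ENCODING;

    if (((insn >> 30) & 0x3u) != 0)          /* sz != '00' */
        return DEC_UNDEFINED;

    switch ((insn >> 22) & 0x3u) {           /* op1 selects the stage */
    case 0: out->stage = MOPS_PROLOGUE; break;
    case 1: out->stage = MOPS_MAIN;     break;
    case 2: out->stage = MOPS_EPILOGUE; break;
    default: return DEC_OTHER_ENCODING;      /* SEE "Memory Copy and Memory Set" */
    }

    out->d = insn & 0x1fu;
    out->s = (insn >> 16) & 0x1fu;
    out->n = (insn >> 5) & 0x1fu;
    out->options = (insn >> 12) & 0xfu;

    /* The three registers must be distinct and must not be register 31. */
    if (out->d == out->s || out->s == out->n || out->d == out->n)
        return DEC_UNDEFINED;
    if (out->d == 31 || out->s == 31 || out->n == 31)
        return DEC_UNDEFINED;
    return DEC_OK;
}
```

Under these assumptions the word 0x19019440 decodes as the prologue form with d = 0, s = 1, n = 2, and changing op1 to 10 (0x19819440) gives the epilogue form with the same registers.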

Assembler Symbols

<Xd>

For the epilogue and main variant: is the 64-bit name of the general-purpose register that holds an encoding of the destination address, encoded in the "Rd" field.

For the prologue variant: is the 64-bit name of the general-purpose register that holds the destination address and is updated by the instruction, encoded in the "Rd" field.

<Xs>

For the epilogue and main variant: is the 64-bit name of the general-purpose register that holds an encoding of the source address, encoded in the "Rs" field.

For the prologue variant: is the 64-bit name of the general-purpose register that holds the source address and is updated by the instruction, encoded in the "Rs" field.

<Xn>

For the epilogue variant: is the 64-bit name of the general-purpose register that holds an encoding of the number of bytes to be transferred and is set to zero at the end of the instruction, encoded in the "Rn" field.

For the main variant: is the 64-bit name of the general-purpose register that holds an encoding of the number of bytes to be transferred, encoded in the "Rn" field.

For the prologue variant: is the 64-bit name of the general-purpose register that holds the number of bytes to be transferred and is updated by the instruction to encode the remaining size and destination, encoded in the "Rn" field.

Operation

CheckMOPSEnabled();

integer N = MaxBlockSizeCopiedBytes();
bits(64) toaddress = X[d];
bits(64) fromaddress = X[s];
bits(64) cpysize = X[n];
bits(64) stagecpysize;
bits(8*N) readdata;
integer B;

if HaveMTE2Ext() then SetTagCheckedInstruction(TRUE);

boolean supports_option_a = MemCpyOptionA();
(racctype, wacctype) = MemCpyAccessTypes(options);

if stage == MOPSStage_Prologue then
    if cpysize<63> == '1' then cpysize = 0x7FFFFFFFFFFFFFFF<63:0>;

    if supports_option_a then
        PSTATE.C = '0';
        // Copy in the forward direction offsets the arguments.
        toaddress = toaddress + cpysize;
        fromaddress = fromaddress + cpysize;
        cpysize = Zeros(64) - cpysize;
    else
        PSTATE.C = '1';
    PSTATE.N = '0';
    PSTATE.V = '0';
    PSTATE.Z = '0';

    // IMP DEF selection of the amount covered by pre-processing.
    stagecpysize = CPYPreSizeChoice(toaddress, fromaddress, cpysize);
    assert stagecpysize<63> == cpysize<63> || stagecpysize == Zeros();
    if SInt(cpysize) > 0 then
        assert SInt(stagecpysize) <= SInt(cpysize);
    else
        assert SInt(stagecpysize) >= SInt(cpysize);
else
    boolean zero_size_exceptions = MemCpyZeroSizeCheck();

    // Check if this version is consistent with the state of the call.
    if zero_size_exceptions || SInt(cpysize) != 0 then
        if supports_option_a then
            if PSTATE.C == '1' then
                boolean wrong_option = TRUE;
                boolean from_epilogue = stage == MOPSStage_Epilogue;
                MismatchedMemCpyException(supports_option_a, d, s, n, wrong_option, from_epilogue, options);
        else
            if PSTATE.C == '0' then
                boolean wrong_option = TRUE;
                boolean from_epilogue = stage == MOPSStage_Epilogue;
                MismatchedMemCpyException(supports_option_a, d, s, n, wrong_option, from_epilogue, options);

    bits(64) postsize = CPYPostSizeChoice(toaddress, fromaddress, cpysize);
    assert postsize<63> == cpysize<63> || SInt(postsize) == 0;

    if stage == MOPSStage_Main then
        stagecpysize = cpysize - postsize;

        // Check if the parameters to this instruction are valid.
        if MemCpyParametersIllformedM(toaddress, fromaddress, cpysize) then
            boolean wrong_option = FALSE;
            boolean from_epilogue = FALSE;
            MismatchedMemCpyException(supports_option_a, d, s, n, wrong_option, from_epilogue, options);
    else
        stagecpysize = postsize;

        // Check if the parameters to this instruction are valid for the epilogue.
        if (cpysize != postsize || MemCpyParametersIllformedE(toaddress, fromaddress, cpysize)) then
            boolean wrong_option = FALSE;
            boolean from_epilogue = TRUE;
            MismatchedMemCpyException(supports_option_a, d, s, n, wrong_option, from_epilogue, options);

if supports_option_a then
    while SInt(stagecpysize) != 0 do
        // IMP DEF selection of the block size that is worked on. While many
        // implementations might make this constant, that is not assumed.
        B = CPYSizeChoice(toaddress, fromaddress, cpysize);
        assert B <= -1*SInt(stagecpysize);

        readdata<B*8-1:0> = Mem[fromaddress + cpysize, B, racctype];
        Mem[toaddress + cpysize, B, wacctype] = readdata<B*8-1:0>;
        cpysize = cpysize + B;
        stagecpysize = stagecpysize + B;

        if stage != MOPSStage_Prologue then
            X[n] = cpysize;
else
    while UInt(stagecpysize) > 0 do
        // IMP DEF selection of the block size that is worked on. While many
        // implementations might make this constant, that is not assumed.
        B = CPYSizeChoice(toaddress, fromaddress, cpysize);
        assert B <= UInt(stagecpysize);

        readdata<B*8-1:0> = Mem[fromaddress, B, racctype];
        Mem[toaddress, B, wacctype] = readdata<B*8-1:0>;
        fromaddress = fromaddress + B;
        toaddress = toaddress + B;
        cpysize = cpysize - B;
        stagecpysize = stagecpysize - B;

        if stage != MOPSStage_Prologue then
            X[n] = cpysize;
            X[d] = toaddress;
            X[s] = fromaddress;

if stage == MOPSStage_Prologue then
    X[n] = cpysize;
    X[d] = toaddress;
    X[s] = fromaddress;
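The copy loops in the Operation pseudocode can be exercised with a simplified C model. This is a sketch under stated assumptions, not the architectural behavior: memory is a flat byte array indexed by address, the mismatch and ill-formed-parameter exceptions are omitted, access types are ignored, and the constants `PRE_BYTES`, `POST_BYTES`, and `BLOCK` are hypothetical stand-ins for the implementation defined choices made by CPYPreSizeChoice, CPYPostSizeChoice, and CPYSizeChoice.

```c
#include <assert.h>
#include <stdint.h>

enum stage { PROLOGUE, MAIN, EPILOGUE };

struct cpu { uint64_t xd, xs, xn; int c; };

/* Hypothetical IMPLEMENTATION DEFINED choices for this model only. */
#define PRE_BYTES  3u   /* most the prologue copies */
#define POST_BYTES 5u   /* most that is left for the epilogue */
#define BLOCK      4u   /* largest block per loop iteration */

static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }

/* Runs one stage; option_a selects the register encoding (PSTATE.C = 0). */
static void cpyf_stage(struct cpu *r, uint8_t *mem, enum stage st, int option_a)
{
    uint64_t to = r->xd, from = r->xs, size = r->xn;
    assert(mem != 0);

    if (st == PROLOGUE) {
        if (size >> 63) size = 0x7FFFFFFFFFFFFFFFull;   /* saturate */
        r->c = option_a ? 0 : 1;
        if (option_a) { to += size; from += size; size = 0 - size; }
    }

    /* Bytes outstanding, independent of the encoding in use. */
    uint64_t remaining = option_a ? 0 - size : size;
    uint64_t post = min_u64(remaining, POST_BYTES);
    uint64_t todo = st == PROLOGUE ? min_u64(remaining, PRE_BYTES)
                  : st == MAIN     ? remaining - post
                  :                  remaining;

    while (todo > 0) {
        uint64_t b = min_u64(todo, BLOCK);
        for (uint64_t i = 0; i < b; i++) {
            if (option_a) mem[to + size + i] = mem[from + size + i];
            else          mem[to + i] = mem[from + i];
        }
        if (option_a) size += b;                         /* counts up to zero */
        else { from += b; to += b; size -= b; }          /* counts down to zero */
        todo -= b;
    }
    r->xd = to; r->xs = from; r->xn = size;
}

/* Prologue, main, and epilogue in succession, as the instructions expect. */
static void cpyf_all(struct cpu *r, uint8_t *mem, int option_a)
{
    cpyf_stage(r, mem, PROLOGUE, option_a);
    cpyf_stage(r, mem, MAIN, option_a);
    cpyf_stage(r, mem, EPILOGUE, option_a);
}
```

In this model both options copy the same bytes and finish with Xn equal to zero. They differ in the final addresses: option A leaves Xd and Xs at the original values plus the size, fixed since the prologue, while option B reaches the same values by advancing them as it copies.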


Internal version only: isa v33.11seprel, AdvSIMD v29.05, pseudocode v2021-09_rel, sve v2021-09_rc3d ; Build timestamp: 2021-10-06T11:41

Copyright © 2010-2021 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.