We have a `SpecFromElem` specialization for `i8` and `u8` (which provides a small performance win over plain `extend` for `Vec`), but we could look into a specialization for wider integer types, which would involve duplicating wider elements with `rep stosw` (or `d` or `q`). It may need benchmarking to ensure that it's actually faster, as mentioned by @joshtriplett.

I am not sure how to reproduce the `rep stosw` assembly; someone who knows how could take up this issue.
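
One possible way to get a `rep stosw` to benchmark against is inline assembly. The following is only a minimal sketch, assuming an x86_64 target; `from_elem_u16` and `fill_u16` are hypothetical names used for illustration and are not part of `alloc`, and an actual specialization would not necessarily be written this way. On other targets the sketch falls back to `Vec::resize`.

```rust
/// Hypothetical helper (not a std API): fill `n` uninitialized `u16` slots
/// starting at `ptr` with `elem`, using `rep stosw` on x86_64.
#[cfg(target_arch = "x86_64")]
fn fill_u16(v: &mut Vec<u16>, elem: u16, n: usize) {
    assert!(v.capacity() >= n);
    unsafe {
        // `rep stosw` stores AX to [RDI] RCX times, advancing RDI by 2 each time.
        // Rust guarantees the direction flag is clear on entry to `asm!`.
        core::arch::asm!(
            "rep stosw",
            inout("rcx") n => _,              // element count, clobbered
            inout("rdi") v.as_mut_ptr() => _, // destination pointer, clobbered
            in("ax") elem,                    // value to store
            options(nostack, preserves_flags),
        );
        // All `n` elements are now initialized.
        v.set_len(n);
    }
}

/// Portable fallback for targets without `rep stosw`.
#[cfg(not(target_arch = "x86_64"))]
fn fill_u16(v: &mut Vec<u16>, elem: u16, n: usize) {
    v.resize(n, elem);
}

/// Hypothetical equivalent of `vec![elem; n]` for `u16`.
pub fn from_elem_u16(elem: u16, n: usize) -> Vec<u16> {
    let mut v: Vec<u16> = Vec::with_capacity(n);
    fill_u16(&mut v, elem, n);
    v
}
```

Comparing `from_elem_u16(x, n)` against `vec![x; n]` across a range of `n` would show whether the string instruction actually beats the loop the compiler already generates.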
Previous discussion: https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/SpecForElem.20for.20other.20integers