K 11
svm:headrev
V 44
ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f:333413

K 10
svn:author
V 3
mjg
K 8
svn:date
V 27
2018-05-09T15:16:25.461392Z
K 7
svn:log
V 470
amd64: depessimize bcmp for small buffers

Adapt assembly generated by clang for memcmp and use it for <= 64 sized
compares (which are the vast majority).

Sample result of doing stats on Broadwell (% of samples):
before: 4.0 kernel     bcmp                 cache_lookup
after : 0.7 kernel     bcmp                 cache_lookup

The routine is most definitely still not optimal. Anyone interested in
spending time improving it is welcome to take over.

Reviewed by:	kib

END
