Reorder some instructions and change the register usage to reduce
the number of pipeline stalls. Also use the bfextu and bfins
instructions for bitfield manipulations instead of shifting and
masking.
This makes gzipping a 80MB file approximately 2% faster.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>