CALLER_ARG_SPLAT is not necessary for method_missing. We just need
to unshift the method name into the arguments.
This optimizes all method_missing calls:
* mm(recv) ~9%
* mm(recv, *args) ~215% for args.length == 200
* mm(recv, *args, **kw) ~55% for args.length == 200
* mm(recv, **kw) ~22%
* mm(recv, kw: 1) ~100%
Note that empty argument splats do get slower with this approach,
by about 30-40%. Other than non-empty argument splats, other
argument splats are faster, with the speedup depending on the
number of arguments.