Exllama 2 vs v2. Weirdly, inference seems to speed up over time.
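
One way to check whether the speedup is real is to time several back-to-back generations and compare tokens per second for each run. This is only a rough sketch: `tokens_per_second` and its `generate` callable are hypothetical stand-ins, not part of either library, and `generate` is assumed to run one full generation and return the number of tokens it produced. If the backend runs asynchronously on a GPU, the callable should wait for the work to finish before returning, otherwise the timings will be off.

```python
import time
from typing import Callable, List


def tokens_per_second(
    generate: Callable[[], int],
    n_runs: int = 5,
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Time n_runs calls to generate() and return tokens/s for each run.

    generate is a hypothetical stand-in (assumption): it should run one
    full generation and return how many tokens it produced.
    """
    rates = []
    for _ in range(n_runs):
        start = clock()
        n_tokens = generate()
        elapsed = clock() - start
        # Guard against a zero-length interval from a coarse clock.
        rates.append(n_tokens / elapsed if elapsed > 0 else float("inf"))
    return rates


if __name__ == "__main__":
    # Dummy workload standing in for a real model call (assumption).
    def fake_generate() -> int:
        time.sleep(0.01)
        return 128

    for i, rate in enumerate(tokens_per_second(fake_generate, n_runs=3), 1):
        print(f"run {i}: {rate:.1f} tok/s")
```

If the first run or two come out clearly slower than the rest, the effect may just be one-time startup cost rather than a steady speedup, so it can be worth discarding those runs before comparing the two versions.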