Around mid-2024 I tried feeding Hong Kong court cases to long-context (local) LLMs to see how they fared.
It didn't work well. Although most of them claimed to support long contexts, they mostly just failed (got repetitive, etc.).
Since I wasn't really using them for anything important, I just set it aside.
[Fast forward an eternity later (i.e. a couple of months)]
I'm pleasantly surprised to find that more recent models (even ones as "ancient" as Llama-3.2 3B) now handle such long contexts pretty well. Llama-3.2 3B was still the worst of the bunch, while the recent gemma-3 models did really well.
The only issue is that gemma-3 27B is a bit slow.
I didn't really bother to check whether the difference was due to model quality or llama.cpp bugs, though I suspect more of the former than the latter (Llama-3.1 8B occasionally glitches out too).
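For anyone curious what such a run looks like, here's a minimal sketch using llama-cpp-python (one common way to drive llama.cpp from a script). The model filename, context size, input path, and prompt wording are all my assumptions for illustration, not the exact setup I used:

```python
# Sketch: summarising a long HK judgment with a local GGUF model.
# Paths and parameters below are hypothetical placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="models/gemma-3-27b-it-Q4_K_M.gguf",  # hypothetical file name
    n_ctx=32768,      # context window must be large enough for the judgment
    n_gpu_layers=-1,  # offload all layers to GPU if one is available
    verbose=False,
)

# Hypothetical input file containing the full text of one court case.
with open("hk_case.txt", "r", encoding="utf-8") as f:
    judgment = f.read()

prompt = (
    "Summarise the following Hong Kong court judgment, noting the parties, "
    "the issues, and the holding:\n\n" + judgment + "\n\nSummary:"
)

out = llm(prompt, max_tokens=1024, temperature=0.2)
print(out["choices"][0]["text"])
```

The earlier failures showed up at exactly this step: the model would start looping partway through the output instead of producing a coherent summary.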
Anyway, that's kinda good news; maybe I actually can build a comprehensive database of HK court case summaries and commentary...