Biased test of GPT-4 era LLMs (300+ models, DeepSeek-R1 included)


Intro

From time to time I play with various models I can run locally (on a 16GB VRAM GPU), checking out their conversational and reasoning capabilities. I don't fully trust public benchmarks, as I've encountered multiple models with great scores on them that then felt pretty dumb in real conversations and in scenarios those benchmarks don't cover. They didn't generalize well, if you will.

My solution was shooting some basic math questions (elementary school level) at them to filter out models that couldn't even add or multiply small numbers. Those that did well qualified as potentially usable (read: not utterly stupid). But it was a process that could be automated and improved, so... I did exactly that.

Approach

I started about 3 weeks ago, and after multiple iterations I ended up with a kind of semi-private benchmark of mine (not strictly private, as I tested those questions on online models - Chat Arena, etc.), made up of 83 multi-domain questions in the current revision (v7). Number of questions per category:

Math: 9
Logic: 8
Anatomy: 3
Sex: 11
Literature: 11
Geography: 4
History: 5
Varia: 5
Censorship: 3
Roleplaying: 5
Scenario: 5
Science: 7
Metaphor: 7

The included questions are NOT hard problems, but more like basic reasoning checks across a range of contexts - things I wish models could intuitively grasp during a conversation. Every question has clear and precise instructions, and there is only one best answer - usually just a single word or a number. A sample question looks like this:

Answer with just a single number, and nothing else. Let's assume an hour would have 40 minutes. If I started my journey on 23:35, and finished it 10 past midnight - how many minutes it would take, then?

I know it's harder for models to answer this way, but that's done on purpose. It tests whether the model can understand the question, follow the instructions, and then figure out the correct answer without having to write it all down first. Only an exact match is counted as correct, so models that cannot stop yapping will fail badly.
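The exact-match check is easy to sketch. Below is a minimal, hypothetical grader (the name `grade` is mine, not the author's actual harness). For the sample question, my reading of the arithmetic: with 40-minute hours, 23:35 to midnight is 40 - 35 = 5 minutes, plus 10 past midnight, so the expected answer would presumably be 15.

```python
# Hypothetical sketch of the exact-match grading described above.
# Only a reply that equals the expected answer (after trimming
# surrounding whitespace) counts as correct - any extra prose fails.
def grade(reply: str, expected: str) -> bool:
    return reply.strip() == expected.strip()

# Sample question: 40 - 35 = 5 minutes to midnight, plus 10 more = 15.
print(grade("15", "15"))                   # concise answer passes
print(grade("It took 15 minutes.", "15"))  # yapping fails
```

Note that this is deliberately strict: there is no partial credit for an answer buried inside a longer sentence.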

The test suite covers understanding of adults-only areas of life (with a system message clearly stating that the model will be talking with an adult user in such cases), but it doesn't require generating explicit content - smart models available on the Chat Arena could answer those questions with their default system message. Roleplaying questions were pretty basic (1 or 2 turns), just to check the model's ability to follow less common system messages and to speak from a perspective other than that of a casual AI assistant.

DeepSeek-R1 and thinking models 🧠

As I was progressing with my tests, DeepSeek-R1 got released - the first open-source model that could actually be used as an alternative to the OpenAI o1 models. Distilled versions were released along with it, usable with llama.cpp. I got curious about those, so… I changed my mind a bit and included thinking models in my tests, too - those distilled R1s have strictly structured output, so the thinking part can easily be separated from the normal output. During my tests, all thinking models were given a 2000-token budget for the thinking process. For locally launched distilled R1s, only the output after the </think> tag was considered the answer (llama.cpp doesn't split reasoning_content from content like the DeepSeek API does).
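That separation can be sketched in a few lines - assuming the usual `<think>…</think>` wrapping of distilled-R1 output (the function name is mine):

```python
# Sketch: split a distilled-R1 completion into (thinking, answer),
# since llama.cpp returns both parts in a single string.
def split_thinking(output: str) -> tuple[str, str]:
    thinking, tag, answer = output.partition("</think>")
    if not tag:  # no closing tag - treat the whole output as the answer
        return "", output.strip()
    return thinking.removeprefix("<think>").strip(), answer.strip()

raw = "<think>5 minutes to midnight, plus 10 more.</think>\n15"
thinking, answer = split_thinking(raw)  # answer == "15"
```

Everything before the tag is discarded for grading; only `answer` is compared against the expected answer.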

Questions were cross-checked with ChatGPT, Gemini 2, and Claude 3.5 Sonnet, and every expected answer (sometimes after pointing out errors in the AI's reasoning) was confirmed as the ultimately best one (the most logical and simplest solution that meets all the requirements of the question - Occam's razor ❤️).

Other details

Temperature was always set to 0.3 (except for some online models with a fixed temperature), and each question was repeated 10 times with different seeds (to check consistency and reduce the impact of RNG). 80 of the 83 questions are single-turn. The final score was measured as the percentage of correct answers out of the total number of attempts (830).
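As a sanity check on that arithmetic (the counts below are illustrative, not raw data from the run): 83 questions × 10 seeds gives 830 attempts, and a score like o1's 89.8 corresponds to roughly 745 exact-match answers.

```python
# Scoring arithmetic: 83 questions x 10 seeded repetitions = 830 attempts;
# the final score is the percentage of exact-match answers.
QUESTIONS, SEEDS = 83, 10
attempts = QUESTIONS * SEEDS        # 830
correct = 745                       # illustrative count, ~o1's level
score = round(100 * correct / attempts, 1)
print(score)  # 89.8
```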

I picked the quants that allowed me to fully load a given model onto my GPU. My test cases didn't require a big context, so sometimes I picked a slightly bigger quant and further reduced the context size - that way I didn't use much VRAM for context. The only exceptions to that rule were: 1) Llama-3.3-70B-Instruct (I tried it out of curiosity, but it was only partially loaded into VRAM, and thus super-slow - not really usable locally on my GPU), and 2) Athene-V2-Chat (the highest-ranking open-weights ~70B model on the Chat Arena leaderboard, not available on OpenRouter).

After testing multiple local models I got curious how online models would score on my benchmark, so… I spent a few $ and tested multiple popular online models, too.

And now - let's meet the winners 😎.

Best thinking online models

  1. 🥇 o1-2024-12-17 (89.8) 🧠

  2. 🥈 o1-preview-2024-09-12 (85.1) 🧠

  3. 🥉 o3-mini-2025-01-31 (84.1) 🧠

Best thinking online models (non-OpenAI)

  1. 🥇 deepseek-r1 (80.2) 🧠

  2. 🥈 gemini-2.0-flash-thinking-exp-01-21 (74.7) 🧠

  3. 🥉 deepseek-r1-distill-llama-70b (58.2) 🧠

Best non-thinking 😅 online models

  1. 🥇 gpt-4o-2024-08-06 (73.3)

  2. 🥈 grok-2-1212 (73.1)

  3. 🥉 gemini-2.0-pro-exp-02-05 (72.5)

Best ~70B models

  1. 🥇 llama-3.3-70b-instruct (63.7)

  2. 🥈 llama-3.1-sonar-large-128k-chat (62.5)

  3. 🥉 qwen-2-72b-instruct (62.5)

Best 27-69B models

  1. 🥇 Gemmasutra-Pro-27B-v1-IQ4_XS.gguf (58.0)

  2. 🥈 gemma-2-27b-it-abliterated.Q3_K_M.gguf (56.9)

  3. 🥉 FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview-IQ3_XS.gguf (55.9) 🧠

Best 18-26B models

  1. 🥇 Mistral-Small-24B-Instruct-2501-abliterated-Q3_K_M.gguf (49.6)

  2. 🥈 ChatWaifu_v2.0_22B.i1-IQ4_XS.gguf (49.5)

  3. 🥉 Cydonia-24B-v2-Q3_K_M.gguf (48.1)

Best 14-17B models

  1. 🥇 Qwen2.5-Coder-14B-Instruct-abliterated-Q6_K.gguf (49.5)

  2. 🥈 Rombos-Coder-V2.5-Qwen-14b-Q6_K.gguf (47.5)

  3. 🥉 Qwen2.5-Coder-14B-Instruct-Q6_K.gguf (46.9)

Best 10-13B models

  1. 🥇 MN-12B-Lyra-v4-Q6_K-imat.gguf (45.5)

  2. 🥈 ArliAI-RPMax-12B-v1.1-Q6_K.gguf (42.2)

  3. 🥉 Gemma-The-Writer-N-Restless-Quill-10B-D_AU-Q6_k.gguf (41.7)

Best 9B models

  1. 🥇 Smegmma-9B-v1.Q6_K.gguf (53.9)

  2. 🥈 Smegmma-Deluxe-9B-v1-Q6_K.gguf (53.6)

  3. 🥉 MT3-Gen4-gemma-2-9B.Q6_K.gguf (51.8)

Best 8B models

  1. 🥇 Wingless_Imp_8B-Q6_K.gguf (44.2)

  2. 🥈 L3-8B-Lunaris-v1-Q6_K.gguf (43.1)

  3. 🥉 DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.Q6_K.gguf (42.8)

Best 7B models

  1. 🥇 Qwen2.5-7B-Instruct-1M-abliterated.Q6_K.gguf (42.9)

  2. 🥈 Qwen2.5-7B-CelestialHarmony-1M.Q6_K.gguf (41.1)

  3. 🥉 HomerCreativeAnvita-Mix-Qw7B.Q6_K.gguf (41.0)

Best 3-6B models

  1. 🥇 Llama-3.2-3B-Instruct-abliterated.Q8_0.gguf (31.1)

  2. 🥈 Eximius_Persona_5B-Q6_K.gguf (27.7)

  3. 🥉 Phi-3-mini-4k-instruct-Q8_0.gguf (27.6)

Best 1-2B models

  1. 🥇 granite-3.1-2b-instruct-Q8_0.gguf (22.8)

  2. 🥈 Qwen2.5-1.5B-Instruct-Q8_0.gguf (20.4)

  3. 🥉 EXAONE-3.5-2.4B-Instruct-abliterated.Q8_0.gguf (20.2)

Final words

I'm also attaching the scores of all the models I've tested. I measured many classics, so if you're curious how they compare to each other, I might have an answer in this table. Just please keep in mind that these scores are one of many possible perspectives, and models with a lower score can be fun to talk with, too. Intelligence is a multi-dimensional creature, so a single number will never tell you the whole truth 🙃.

| Model | Score (v7.2) | Params (B) | Type |
|---|---|---|---|
| o1-2024-12-17 | 89.8 | ? | API (OpenAI) |
| o1-preview-2024-09-12 | 85.1 | ? | API (OpenAI) |
| o3-mini-2025-01-31 | 84.1 | ? | API (OpenAI) |
| deepseek-r1 | 80.2 | 671 | API (OpenRouter) |
| gemini-2.0-flash-thinking-exp-01-21 | 74.7 | ? | API (Gemini) |
| gpt-4o-2024-08-06 | 73.3 | ? | API (OpenAI) |
| grok-2-1212 | 73.1 | ? | API (xAI) |
| o1-mini-2024-09-12 | 73.0 | ? | API (OpenAI) |
| gemini-2.0-pro-exp-02-05 | 72.5 | ? | API (Gemini) |
| chatgpt-4o-latest | 72.4 | ? | API (OpenAI) |
| grok-beta | 70.4 | ? | API (xAI) |
| gemini-1.5-pro-002 | 69.4 | ? | API (Gemini) |
| gemini-2.0-flash-exp | 68.3 | ? | API (Gemini) |
| gemini-2.0-flash | 66.4 | ? | API (Gemini) |
| claude-3.5-sonnet | 65.7 | ? | API (OpenRouter) |
| gpt-4-0613 | 65.1 | ? | API (OpenAI) |
| gpt-4-turbo-2024-04-09 | 64.9 | ? | API (OpenAI) |
| llama-3.3-70b-instruct | 63.7 | 70 | API (OpenRouter) |
| qwen-max | 63.4 | ? | API (OpenRouter) |
| llama-3.1-sonar-large-128k-chat | 62.5 | 70 | API (OpenRouter) |
| qwen-2-72b-instruct | 62.5 | 72 | API (OpenRouter) |
| deepseek-chat | 61.8 | 671 | API (OpenRouter) |
| llama-3.2-90b-vision-instruct | 61.6 | 90 | API (OpenRouter) |
| mistral-large-2411 | 59.9 | 123 | API (OpenRouter) |
| Llama-3.3-70B-Instruct-IQ3_XXS.gguf | 59.8 | 70 | llama.cpp |
| mistral-large-2407 | 59.4 | 123 | API (OpenRouter) |
| llama-3.1-70b-instruct | 59.3 | 70 | API (OpenRouter) |
| llama-3.1-lumimaid-70b | 58.3 | 70 | API (OpenRouter) |
| deepseek-r1-distill-llama-70b | 58.2 | 70 | API (OpenRouter) |
| claude-3-opus | 58.0 | ? | API (OpenRouter) |
| Gemmasutra-Pro-27B-v1-IQ4_XS.gguf | 58.0 | 27 | llama.cpp |
| l3.3-euryale-70b | 57.8 | 70 | API (OpenRouter) |
| hermes-3-llama-3.1-405b | 57.0 | 405 | API (OpenRouter) |
| gemma-2-27b-it-abliterated.Q3_K_M.gguf | 56.9 | 27 | llama.cpp |
| llama-3-70b-instruct | 56.7 | 70 | API (OpenRouter) |
| llama-3.1-405b-instruct | 56.6 | 405 | API (OpenRouter) |
| minimax-01 | 56.4 | 456 | API (OpenRouter) |
| Llama-3.3-70B-Instruct-IQ2_S.gguf | 56.0 | 70 | llama.cpp |
| Athene-V2-Chat.i1-IQ3_XXS.gguf | 55.9 | 72 | llama.cpp |
| FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview-IQ3_XS.gguf | 55.9 | 32 | llama.cpp |
| Big-Tiger-Gemma-27B-v1-IQ4_XS.gguf | 55.8 | 27 | llama.cpp |
| gemini-1.5-flash-002 | 54.3 | ? | API (Gemini) |
| qwen-2.5-72b-instruct | 54.3 | 72 | API (OpenRouter) |
| Q2.5-32B-Slush.i1-IQ3_XS.gguf | 54.0 | 32 | llama.cpp |
| Qwen2.5-32B-AGI-IQ3_XS.gguf | 54.0 | 32 | llama.cpp |
| Smegmma-9B-v1.Q6_K.gguf | 53.9 | 9 | llama.cpp |
| claude-3.5-haiku | 53.6 | ? | API (OpenRouter) |
| Smegmma-Deluxe-9B-v1-Q6_K.gguf | 53.6 | 9 | llama.cpp |
| gemma-2-27b-it-IQ4_XS.gguf | 53.4 | 27 | llama.cpp |
| Gemmasutra-Pro-27B-v1-IQ3_XS.gguf | 53.3 | 27 | llama.cpp |
| gemma-2-27b-it-abliterated.Q3_K_S.gguf | 53.1 | 27 | llama.cpp |
| 32B-Qwen2.5-Kunou-v1-IQ3_XS.gguf | 52.9 | 32 | llama.cpp |
| Qwen2.5-32b-RP-Ink-IQ3_XS.gguf | 52.9 | 32 | llama.cpp |
| qwen-plus | 52.4 | ? | llama.cpp |
| Replete-LLM-V2.5-Qwen-32b-IQ3_XS.gguf | 52.4 | 32 | llama.cpp |
| TheBeagle-v2beta-32B-MGS-IQ3_XS.gguf | 52.3 | 32 | llama.cpp |
| hermes-3-llama-3.1-70b | 52.0 | 70 | API (OpenRouter) |
| ultiima-32B.i1-IQ3_XS.gguf | 52.0 | 32 | llama.cpp |
| MT3-Gen4-gemma-2-9B.Q6_K.gguf | 51.8 | 9 | llama.cpp |
| Qwen2.5-32B-Instruct-abliterated.i1-IQ3_XS.gguf | 51.8 | 32 | llama.cpp |
| DeepSeek-R1-Distill-Qwen-32B-IQ3_XS.gguf | 51.6 | 32 | llama.cpp |
| gemini-2.0-flash-lite-preview-02-05 | 51.4 | ? | API (Gemini) |
| Qwen2.5-32B-Instruct-CFT-IQ3_XS.gguf | 51.4 | 32 | llama.cpp |
| Rombos-LLM-V2.5-Qwen-32b.i1-IQ3_XS.gguf | 51.4 | 32 | llama.cpp |
| LIMO.i1-IQ3_XS.gguf | 51.0 | 32 | llama.cpp |
| gemma-2-27b-it | 50.8 | 27 | API (OpenRouter) |
| gpt-4o-mini-2024-07-18 | 50.8 | ? | API (OpenAI) |
| qwen-2-vl-72b-instruct | 50.8 | 72 | llama.cpp |
| gemma-2-27b-it-IQ3_XS.gguf | 50.7 | 27 | llama.cpp |
| Bespoke-Stratos-32B.i1-IQ3_XS.gguf | 50.6 | 32 | llama.cpp |
| Qwen2.5-Gutenberg-Doppel-32B-IQ3_XS.gguf | 50.6 | 32 | llama.cpp |
| llama-3.1-nemotron-70b-instruct | 50.2 | 70 | API (OpenRouter) |
| Qwen2.5-32B-Instruct-IQ3_XS.gguf | 50.1 | 32 | llama.cpp |
| Mistral-Small-24B-Instruct-2501-abliterated-Q3_K_M.gguf | 49.6 | 24 | llama.cpp |
| Sky-T1-32B-Preview-IQ3_XS.gguf | 49.6 | 32 | llama.cpp |
| ChatWaifu_v2.0_22B.i1-IQ4_XS.gguf | 49.5 | 22 | llama.cpp |
| Qwen2.5-Coder-14B-Instruct-abliterated-Q6_K.gguf | 49.5 | 14 | llama.cpp |
| Tiger-Gemma-9B-v3-Q6_K.gguf | 49.3 | 9 | llama.cpp |
| Big-Tiger-Gemma-27B-v1-IQ3_XS.gguf | 49.2 | 27 | llama.cpp |
| eva-llama-3.33-70b | 49.2 | 70 | API (OpenRouter) |
| Rombo-LLM-V3.0-Qwen-32b-IQ3_XS.gguf | 48.8 | 32 | llama.cpp |
| EZO-Qwen2.5-32B-Instruct.i1-IQ3_XS.gguf | 48.7 | 32 | llama.cpp |
| gemini-1.5-flash-8b-001 | 48.7 | 8 | API (Gemini) |
| Dumpling-Qwen2.5-32B-IQ3_XS.gguf | 48.4 | 32 | llama.cpp |
| FuseChat-Gemma-2-9B-Instruct.Q6_K.gguf | 48.4 | 9 | llama.cpp |
| Gemma2-9b-it-Boku-v3-Q6_K.gguf | 48.4 | 9 | llama.cpp |
| gpt-3.5-turbo-0125 | 48.2 | ? | API (OpenAI) |
| Cydonia-24B-v2-Q3_K_M.gguf | 48.1 | 24 | llama.cpp |
| inflection-3-productivity | 48.1 | ? | API (OpenRouter) |
| claude-3-sonnet | 48.0 | ? | API (OpenRouter) |
| Pantheon-RP-Pure-1.6.2-22B-Small-IQ4_XS.gguf | 47.8 | 22 | llama.cpp |
| Sky-T1-32B-Flash-IQ3_XS.gguf | 47.7 | 32 | llama.cpp |
| Rombos-Coder-V2.5-Qwen-14b-Q6_K.gguf | 47.5 | 14 | llama.cpp |
| gemma-2-9b-it-SimPO.Q6_K.gguf | 47.3 | 9 | llama.cpp |
| Mistral-Small-Instruct-2409-IQ4_XS.gguf | 47.3 | 22 | llama.cpp |
| Aster-G2-9B-v1.Q6_K.gguf | 47.2 | 9 | llama.cpp |
| Mistral-Small-Drummer-22B-IQ4_XS.gguf | 47.2 | 22 | llama.cpp |
| Gemma-2-Ataraxy-9B-Q6_K.gguf | 47.1 | 9 | llama.cpp |
| Gemma-The-Writer-9B-D_AU-Q6_k.gguf | 47.1 | 9 | llama.cpp |
| Mistral-Small-22B-ArliAI-RPMax-v1.1-IQ4_XS.gguf | 47.1 | 22 | llama.cpp |
| Mistral-Small-24B-Instruct-2501-Q3_K_M.gguf | 47.1 | 24 | llama.cpp |
| MS-Meadowlark-22B.i1-IQ4_XS.gguf | 47.0 | 22 | llama.cpp |
| OpenThinker-32B-abliterated.i1-IQ3_XS.gguf | 47.0 | 32 | llama.cpp |
| Gemma-The-Writer-Mighty-Sword-9B-D_AU-Q6_k.gguf | 46.9 | 9 | llama.cpp |
| gemma2-9B-sunfall-v0.5.2.Q6_K.gguf | 46.9 | 9 | llama.cpp |
| Qwen2.5-Coder-14B-Instruct-Q6_K.gguf | 46.9 | 14 | llama.cpp |
| STILL-2.i1-IQ3_XS.gguf | 46.9 | 32 | llama.cpp |
| mistral-small-24b-instruct-2501 | 46.6 | 24 | API (OpenRouter) |
| Mistral-Small-24B-Instruct-2501-Q4_K_M.gguf | 46.5 | 24 | llama.cpp |
| Qwentile2.5-32B-Instruct-IQ3_XS.gguf | 46.5 | 32 | llama.cpp |
| SauerkrautLM-gemma-2-9b-it.Q6_K.gguf | 46.5 | 9 | llama.cpp |
| Mistral-Small-24B-Instruct-2501-IQ3_XS.gguf | 46.3 | 24 | llama.cpp |
| gemma2-gutenberg-9B.Q6_K.gguf | 46.1 | 9 | llama.cpp |
| Darkest-muse-v1-Q6_K.gguf | 45.9 | 9 | llama.cpp |
| Cydonia-22B-v1.3.i1-IQ4_XS.gguf | 45.7 | 22 | llama.cpp |
| MN-12B-Lyra-v4-Q6_K-imat.gguf | 45.5 | 12 | llama.cpp |
| Pantheon-RP-1.6.2-22B-Small-IQ4_XS.gguf | 45.5 | 22 | llama.cpp |
| Quill-v1.Q6_K.gguf | 45.5 | 9 | llama.cpp |
| G2-9B-Aletheia-v1.Q6_K.gguf | 45.3 | 9 | llama.cpp |
| OpenThinker-32B-IQ3_XS.gguf | 45.3 | 32 | llama.cpp |
| Mistral-Small-Instruct-2409-abliterated.i1-IQ4_XS.gguf | 45.2 | 22 | llama.cpp |
| Virtuoso-Small-Q6_K.gguf | 45.2 | 14 | llama.cpp |
| dolphin-mixtral-8x22b | 44.9 | 141 | API (OpenRouter) |
| gemma-2-Ifable-9B.Q6_K.gguf | 44.8 | 9 | llama.cpp |
| yi-large | 44.8 | ? | API (OpenRouter) |
| Cydonia-v1.3-Magnum-v4-22B.i1-IQ4_XS.gguf | 44.3 | 22 | llama.cpp |
| gemma-2-9b-it-Q6_K.gguf | 44.3 | 9 | llama.cpp |
| magnum-v4-22b-IQ4_XS.gguf | 44.3 | 22 | llama.cpp |
| Cydonia-22B-v1.3.Q4_K_S.gguf | 44.2 | 22 | llama.cpp |
| Wingless_Imp_8B-Q6_K.gguf | 44.2 | 8 | llama.cpp |
| gemma-2-9b-it-DPO.Q6_K.gguf | 44.1 | 9 | llama.cpp |
| meissa-qwen2.5-14b-instruct-q6_k.gguf | 44.1 | 14 | llama.cpp |
| Mistral-Small-24B-ArliAI-RPMax-v1.4.i1-IQ4_XS.gguf | 44.1 | 24 | llama.cpp |
| MS-Schisandra-22B-v0.2.i1-IQ4_XS.gguf | 44.1 | 22 | llama.cpp |
| Qwen2.5-Coder-32B-Instruct-IQ3_XS.gguf | 43.9 | 32 | llama.cpp |
| TQ2.5-14B-Neon-v1.Q6_K.gguf | 43.6 | 14 | llama.cpp |
| CogitoZ14.Q6_K.gguf | 43.5 | 14 | llama.cpp |
| Primal-Opus-14B-Optimus-v1.Q6_K.gguf | 43.4 | 14 | llama.cpp |
| Dolphin3.0-Mistral-24B.Q4_K_M.gguf | 43.3 | 24 | llama.cpp |
| G2-9B-Blackout-R1.Q6_K.gguf | 43.3 | 9 | llama.cpp |
| Megatron-Corpus-14B-Exp.Q6_K.gguf | 43.3 | 14 | llama.cpp |
| TQ2.5-14B-Sugarquill-v1.Q6_K.gguf | 43.3 | 14 | llama.cpp |
| L3-8B-Lunaris-v1-Q6_K.gguf | 43.1 | 8 | llama.cpp |
| Gemma-2-9B-It-SPPO-Iter3-Q6_K | 43.0 | 9 | llama.cpp |
| Impish_QWEN_14B-Q6_K.gguf | 43.0 | 14 | llama.cpp |
| Virtuoso-Small-v2-Q6_K.gguf | 43.0 | 14 | llama.cpp |
| 0x-lite-Q6_K.gguf | 42.9 | 14 | llama.cpp |
| L3-Aethora-15B.Q6_K.gguf | 42.9 | 15 | llama.cpp |
| L3-Lunaris-v1-15B.Q6_K.gguf | 42.9 | 15 | llama.cpp |
| Qwen2.5-7B-Instruct-1M-abliterated.Q6_K.gguf | 42.9 | 7 | llama.cpp |
| DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.Q6_K.gguf | 42.8 | 8 | llama.cpp |
| NQLSG-Qwen2.5-14B-MegaFusion-v2.Q6_K.gguf | 42.8 | 14 | llama.cpp |
| DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored.Q6_K.gguf | 42.7 | 8 | llama.cpp |
| L3-8B-Stheno-v3.2-Q6_K-imat.gguf | 42.7 | 8 | llama.cpp |
| TQ2.5-14B-Aletheia-v1.Q6_K.gguf | 42.7 | 14 | llama.cpp |
| L3-8B-Stheno-v3.1-Q6_K-imat.gguf | 42.5 | 8 | llama.cpp |
| L3-SAO-MIX-8B-V1.Q6_K.gguf | 42.5 | 8 | llama.cpp |
| Tifa-Deepsex-14b-CoT.Q6_K.gguf | 42.4 | 14 | llama.cpp |
| nova-lite-v1 | 42.3 | ? | API (OpenRouter) |
| Qwen2.5-14B-Instruct-Q6_K.gguf | 42.3 | 14 | llama.cpp |
| ArliAI-RPMax-12B-v1.1-Q6_K.gguf | 42.2 | 12 | llama.cpp |
| command-r-plus-08-2024 | 42.0 | 104 | API (OpenRouter) |
| Qwen2.5-Coder-14B-Instruct-Uncensored.Q6_K.gguf | 42.0 | 14 | llama.cpp |
| Qwentessential-14B-v1.Q6_K.gguf | 42.0 | 14 | llama.cpp |
| SAO_LightMix-8B-Model_Stock.Q6_K.gguf | 42.0 | 8 | llama.cpp |
| ZYH-LLM-Qwen2.5-14B-Q6_K.gguf | 42.0 | 14 | llama.cpp |
| claude-3-haiku | 41.9 | ? | API (OpenRouter) |
| Hathor_RP-v.01-L3-8B.Q6_K.gguf | 41.9 | 8 | llama.cpp |
| Gemma-The-Writer-N-Restless-Quill-10B-D_AU-Q6_k.gguf | 41.7 | 10 | llama.cpp |
| Lamarck-14B-v0.7-Q6_K.gguf | 41.7 | 14 | llama.cpp |
| Gemma-2-Ataraxy-v2-9B-Q6_K.gguf | 41.6 | 9 | llama.cpp |
| EVA-Gutenberg3-Qwen2.5-32B.i1-IQ3_XS.gguf | 41.4 | 32 | llama.cpp |
| EVA-Qwen2.5-32B-v0.2.i1-IQ3_XS.gguf | 41.4 | 32 | llama.cpp |
| SaoRPM-2x8B.Q6_K.gguf | 41.3 | 14 | llama.cpp |
| Bielik-11B-v2.3-Instruct-Q6_K.gguf | 41.2 | 11 | llama.cpp |
| Dinobot-Opus-14B-Exp.Q6_K.gguf | 41.2 | 14 | llama.cpp |
| Megatron-Opus-14B-Exp.Q6_K.gguf | 41.2 | 14 | llama.cpp |
| Qwen2.5-7B-CelestialHarmony-1M.Q6_K.gguf | 41.1 | 7 | llama.cpp |
| Dolphin3.0-Mistral-24B.i1-IQ3_XS.gguf | 41.0 | 24 | llama.cpp |
| HomerCreativeAnvita-Mix-Qw7B.Q6_K.gguf | 41.0 | 7 | llama.cpp |
| qwen-turbo | 41.0 | ? | API (OpenRouter) |
| Rombos-LLM-V2.6-Qwen-14b-Q6_K.gguf | 40.8 | 14 | llama.cpp |
| Saka-14B.Q6_K.gguf | 40.8 | 14 | llama.cpp |
| Meta-Llama-3-8B-Instruct.Q6_K.gguf | 40.7 | 8 | llama.cpp |
| Q2.5-Instruct-1M_Harmony.Q6_K.gguf | 40.7 | 7 | llama.cpp |
| Hathor_Stable-v0.2-L3-8B.Q6_K.gguf | 40.6 | 8 | llama.cpp |
| Impish_QWEN_7B-Q6_K.gguf | 40.6 | 7 | llama.cpp |
| L3-Umbral-Mind-RP-v1.0-8B-Q6_K-imat.gguf | 40.6 | 8 | llama.cpp |
| Sombrero-Opus-14B-Elite5.Q6_K.gguf | 40.6 | 14 | llama.cpp |
| G2-DA-Nyxora-27b-V2-IQ3_XS.gguf | 40.5 | 27 | llama.cpp |
| nous-hermes-2-mixtral-8x7b-dpo | 40.5 | 47 | API (OpenRouter) |
| lfm-40b | 40.4 | 40 | API (OpenRouter) |
| SAINEMO-reMIX.Q6_K.gguf | 40.1 | 12 | llama.cpp |
| Vikhr-Nemo-12B-Instruct-R-21-09-24-Q6_K.gguf | 40.1 | 12 | llama.cpp |
| Cydonia-22B-v1-IQ4_XS.gguf | 40.0 | 22 | llama.cpp |
| magnum-12b-v2.5-kto-Q6_K.gguf | 40.0 | 12 | llama.cpp |
| Chocolatine-2-14B-Instruct-v2.0b3-Q6_K.gguf | 39.9 | 14 | llama.cpp |
| magnum-12b-v2-Q6_K.gguf | 39.9 | 12 | llama.cpp |
| ZYH-LLM-Qwen2.5-14B-V2-Q6_K.gguf | 39.9 | 14 | llama.cpp |
| Gemmasutra-9B-v1c-Q6_K.gguf | 39.8 | 9 | llama.cpp |
| magnum-v3-27b-kto-IQ3_XS.gguf | 39.8 | 27 | llama.cpp |
| nova-pro-v1 | 39.8 | ? | API (OpenRouter) |
| Odin-Q6_K.gguf | 39.8 | 9 | llama.cpp |
| Qwen2.5-7B-Instruct-1M-Q6_K.gguf | 39.8 | 7 | llama.cpp |
| MN-GRAND-Gutenburg-Lyra4-Lyra-12B-DARKNESS-D_AU-Q6_k.gguf | 39.6 | 12 | llama.cpp |
| nova-micro-v1 | 39.6 | ? | API (OpenRouter) |
| Nymph_8B.Q6_K.gguf | 39.6 | 8 | llama.cpp |
| Qwenvergence-14B-v10.Q6_K.gguf | 39.6 | 14 | llama.cpp |
| Rocinante-12B-v1.1-Q6_K.gguf | 39.6 | 12 | llama.cpp |
| Chocolatine-2-14B-Instruct-v2.0.3.Q6_K.gguf | 39.5 | 14 | llama.cpp |
| Evac-Opus-14B-Exp.Q6_K.gguf | 39.5 | 14 | llama.cpp |
| UnslopNemo-12B-v4.1-Q6_K.gguf | 39.3 | 12 | llama.cpp |
| Qwen2.5-7B-Instruct-1M-GRPO_logic_KK_5PPL-Q6_K.gguf | 39.2 | 7 | llama.cpp |
| claude-2.1 | 39.0 | ? | API (OpenRouter) |
| Llama-3-8B-Instruct-Finance-RAG.Q6_K.gguf | 39.0 | 8 | llama.cpp |
| Meissa-Qwen2.5-7B-Instruct.Q6_K.gguf | 38.9 | 7 | llama.cpp |
| Qwen2.5-7B-RRP-1M.Q6_K.gguf | 38.9 | 7 | llama.cpp |
| Captain_BMO-12B-Q6_K.gguf | 38.8 | 12 | llama.cpp |
| Llama-3-Lumimaid-8B-v0.1-OAS-Q6_K-imat.gguf | 38.8 | 8 | llama.cpp |
| Qwen2.5-14B-Instruct-abliterated.Q6_K.gguf | 38.8 | 14 | llama.cpp |
| gemma-2-9B-it-advanced-v2.1.Q6_K.gguf | 38.7 | 9 | llama.cpp |
| Qwen2.5-7B-Instruct-Uncensored.Q6_K.gguf | 38.7 | 7 | llama.cpp |
| SuperNova-Medius-Q6_K.gguf | 38.6 | 14 | llama.cpp |
| Magellanic-Opus-14B-Exp.Q6_K.gguf | 38.4 | 14 | llama.cpp |
| natsumura-assistant-1.1-llama-3.1-8B.Q6_K.gguf | 38.3 | 8 | llama.cpp |
| 3Blarenegv3-ECE-PRYMMAL-Martial-Q6_K.gguf | 38.2 | 7 | llama.cpp |
| Lumimaid-v0.2-8B.q6_k.gguf | 38.2 | 8 | llama.cpp |
| Llama-3-Instruct-8B-ORPO-v0.2-Q6_K.gguf | 38.1 | 8 | llama.cpp |
| MN-SlushoMix.Q6_K.gguf | 38.1 | 12 | llama.cpp |
| NeuralDaredevil-8B-abliterated.Q6_K.gguf | 38.1 | 8 | llama.cpp |
| Gemma-2-27B-Chinese-Chat.i1-IQ3_XS.gguf | 38.0 | 27 | llama.cpp |
| NightyGurps-14b-v1.1-Q6_K.gguf | 38.0 | 14 | llama.cpp |
| QRWKV6-32B-Instruct-Preview-v0.1.i1-IQ3_XS.gguf | 38.0 | 32 | llama.cpp |
| s1.1-32B-IQ3_XS.gguf | 38.0 | 32 | llama.cpp |
| ArliAI-Llama-3-8B-Formax-v1.0-Q6_K.gguf | 37.8 | 8 | llama.cpp |
| DeepSeek-R1-Distill-Qwen-14B-Q6_K.gguf | 37.8 | 14 | llama.cpp |
| L3-8B-Niitama-v1.Q6_K.gguf | 37.8 | 8 | llama.cpp |
| Llama-3-Soliloquy-8B-v2-Q6_K-imat.gguf | 37.8 | 8 | llama.cpp |
| Wayfarer_Eris_Noctis-12B.Q6_K.gguf | 37.8 | 12 | llama.cpp |
| Calcium-Opus-14B-Elite-1M.Q6_K.gguf | 37.7 | 14 | llama.cpp |
| EXAONE-3.5-32B-Instruct-abliterated.i1-IQ3_XS.gguf | 37.7 | 32 | llama.cpp |
| meta-llama-3.1-8b-instruct-abliterated.Q6_K.gguf | 37.7 | 8 | llama.cpp |
| L3.1-Dark-Planet-SpinFire-Uncensored-8B-D_AU-Q6_k.gguf | 37.6 | 8 | llama.cpp |
| Lumimaid-Magnum-v4-12B.Q6_K.gguf | 37.6 | 12 | llama.cpp |
| mistral-nemo-gutenberg3-12B.i1-Q6_K.gguf | 37.5 | 12 | llama.cpp |
| MN-DPT-Slush-q6_k.gguf | 37.5 | 12 | llama.cpp |
| Qwen2.5-7B-Instruct-Q6_K.gguf | 37.5 | 7 | llama.cpp |
| Tsunami-0.5x-7B-Instruct.Q6_K.gguf | 37.5 | 7 | llama.cpp |
| Hermes-3-Llama-3.1-8B.Q6_K.gguf | 37.3 | 8 | llama.cpp |
| L3-Dark-Planet-8B-max-D_AU-Q6_k.gguf | 37.3 | 8 | llama.cpp |
| L3-Nymeria-15B.Q6_K.gguf | 37.3 | 15 | llama.cpp |
| FuseChat-7B-VaRM.Q6_K.gguf | 37.2 | 7 | llama.cpp |
| MN-Slush.Q6_K.gguf | 37.2 | 12 | llama.cpp |
| Zurich-14B-GCv2-5m.Q6_K.gguf | 37.2 | 14 | llama.cpp |
| Atlas-Pro-7B-Preview-1M.Q6_K.gguf | 37.1 | 7 | llama.cpp |
| Tsunami-1.0-7B-Instruct.Q6_K.gguf | 37.1 | 7 | llama.cpp |
| Lumimaid-v0.2-12B.q6_k.gguf | 37.0 | 12 | llama.cpp |
| saiga_llama3_8b-q8_0.gguf | 37.0 | 8 | llama.cpp |
| 14B-Qwen2.5-Kunou-v1.Q6_K.gguf | 36.9 | 14 | llama.cpp |
| cybertron-v4-qw7B-MGS-Q6_K.gguf | 36.9 | 7 | llama.cpp |
| MN-GRAND-Gutenberg-Lyra4-Lyra-12B-MADNESS.Q6_K.gguf | 36.9 | 12 | llama.cpp |
| Tiamat-8B-1.2-Llama-3-DPO.Q6_K.gguf | 36.9 | 8 | llama.cpp |
| Mistral-Nemo-12B-ArliAI-RPMax-v1.2-Q6_K.gguf | 36.7 | 12 | llama.cpp |
| Daredevil-8B-abliterated.Q6_K.gguf | 36.6 | 8 | llama.cpp |
| Qwen2.5-14B-Instruct-1M-abliterated.Q6_K.gguf | 36.5 | 14 | llama.cpp |
| RigoChat-7b-v2.Q6_K.gguf | 36.5 | 7 | llama.cpp |
| L3.1-8B-Dark-Planet-Slush.Q6_K.gguf | 36.4 | 8 | llama.cpp |
| openchat-3.5-1210.Q6_K.gguf | 36.4 | 7 | llama.cpp |
| Llama-3-Instruct-8B-IPO-v0.2-Q6_K.gguf | 36.1 | 8 | llama.cpp |
| WebMind-7B-v0.1.Q6_K.gguf | 36.1 | 7 | llama.cpp |
| Bigger-Body-12b.Q6_K.gguf | 36.0 | 12 | llama.cpp |
| dolphin-mixtral-8x7b | 36.0 | 47 | API (OpenRouter) |
| Llama-3.3-70B-Instruct-IQ1_M.gguf | 36.0 | 70 | llama.cpp |
| magnum-v4-9b-Q6_K.gguf | 36.0 | 9 | llama.cpp |
| MiniCPM-o-2.6-Q6_K.gguf | 36.0 | 8 | llama.cpp |
| openhermes-2.5-mistral-7b.Q6_K.gguf | 36.0 | 7 | llama.cpp |
| RolePlayLake-7B-Toxic.Q6_K.gguf | 35.9 | 7 | llama.cpp |
| Virtuoso-Lite-Q6_K.gguf | 35.9 | 10 | llama.cpp |
| QRWKV6-32B-Instruct-Preview-v0.1-abliterated.i1-IQ3_XS.gguf | 35.8 | 32 | llama.cpp |
| UNA-SimpleSmaug-34b-v1beta.i1-IQ3_XS.gguf | 35.7 | 34 | llama.cpp |
| Beepo-22B.i1-IQ4_XS.gguf | 35.5 | 22 | llama.cpp |
| Llama-3.1-8B-ArliAI-Formax-v1.0-Q6_K.gguf | 35.5 | 8 | llama.cpp |
| NovaSpark.Q6_K.gguf | 35.5 | 8 | llama.cpp |
| DeepHermes-3-Llama-3-8B-q6.gguf | 35.4 | 8 | llama.cpp |
| Meta-Llama-3.1-8B-Instruct-Q8_0.gguf | 35.4 | 8 | llama.cpp |
| MN-12B-Mag-Mell-R1.Q6_K.gguf | 35.4 | 12 | llama.cpp |
| Qwen2.5-14B-Instruct-1M-Q6_K.gguf | 35.4 | 14 | llama.cpp |
| Meraj-Mini-Q6_K.gguf | 35.3 | 7 | llama.cpp |
| ResplendentAI-ChattyMix-8b-64k.Q6_K.gguf | 35.3 | 8 | llama.cpp |
| Daredevil-8B.Q6_K.gguf | 35.2 | 8 | llama.cpp |
| Fimbulvetr-11B-v2-Q6_K.gguf | 35.2 | 11 | llama.cpp |
| Phi-4-Model-Stock-v4.Q6_K.gguf | 35.2 | 14 | llama.cpp |
| Qwen2.5-14B-Instruct-1M-GRPO-Reasoning.Q6_K.gguf | 35.2 | 14 | llama.cpp |
| Qwen2.5-7B-Instruct-abliterated-v2.Q6_K.gguf | 35.2 | 7 | llama.cpp |
| Tiamat-7B.Q6_K.gguf | 35.2 | 7 | llama.cpp |
| EXAONE-3.5-7.8B-Instruct-abliterated.Q6_K.gguf | 35.1 | 8 | llama.cpp |
| oxy-1-small.Q6_K.gguf | 35.1 | 14 | llama.cpp |
| mixtral-8x22b-instruct | 34.9 | 141 | API (OpenRouter) |
| SauerkrautLM-Nemo-12b-Instruct-Q6_K.gguf | 34.8 | 12 | llama.cpp |
| BlenderLLM-Q6_K.gguf | 34.7 | 7 | llama.cpp |
| c4ai-command-r-08-2024-IQ3_XS.gguf | 34.7 | 32 | llama.cpp |
| Chaos_RP_l3_8B-Q6_K-imat.gguf | 34.7 | 8 | llama.cpp |
| COCO-7B-Instruct-1M.Q6_K.gguf | 34.7 | 7 | llama.cpp |
| Falcon3-10B-Instruct-q6_k.gguf | 34.7 | 10 | llama.cpp |
| InternLM2_5-20B-ArliAI-RPMax-v1.1.i1-Q4_K_M.gguf | 34.7 | 20 | llama.cpp |
| Meta-Llama-3.1-8B-Instruct-Q6_K.gguf | 34.7 | 8 | llama.cpp |
| palm-2-chat-bison | 34.7 | ? | API (OpenRouter) |
| Pygmalion-2-13B-Q6_K.gguf | 34.6 | 13 | llama.cpp |
| SauerkrautLM-7b-HerO.Q6_K.gguf | 34.6 | 7 | llama.cpp |
| EXAONE-3.5-32B-Instruct-IQ3_XS.gguf | 34.5 | 32 | llama.cpp |
| Llama-3-Instruct-8B-CPO-v0.2-Q6_K.gguf | 34.5 | 8 | llama.cpp |
| Llama-3.1-SauerkrautLM-8b-Instruct.Q6_K.gguf | 34.5 | 8 | llama.cpp |
| llama-3.2-11b-vision-instruct | 34.5 | 11 | API (OpenRouter) |
| silicon-maid-7b.Q6_K.gguf | 34.3 | 7 | llama.cpp |
| L3.1-8B-Slush-v1.1.Q6_K.gguf | 34.2 | 8 | llama.cpp |
| Llama-3-Instruct-8B-SPPO-Iter3-Q6_K.gguf | 34.1 | 8 | llama.cpp |
| magnum-32b-v2-IQ3_XS.gguf | 34.1 | 32 | llama.cpp |
| LlamaThink-8B-instruct-Q6_K.gguf | 34.0 | 8 | llama.cpp |
| magnum-v4-12b-Q6_K.gguf | 34.0 | 12 | llama.cpp |
| NemoMix-Unleashed-12B-Q6_K.gguf | 34.0 | 12 | llama.cpp |
| Aspire-8B-model_stock.Q6_K.gguf | 33.9 | 8 | llama.cpp |
| capybarahermes-2.5-mistral-7b.Q6_K.gguf | 33.9 | 7 | llama.cpp |
| L3-Aethora-15B-V2-Q6_K.gguf | 33.9 | 15 | llama.cpp |
| Meta-Llama-3.1-8B-Instruct-ablated-v1.Q6_K.gguf | 33.9 | 8 | llama.cpp |
| SJT-7B-1M.Q6_K.gguf | 33.9 | 7 | llama.cpp |
| glm-4-9b-chat-abliterated-Q6_K.gguf | 33.7 | 9 | llama.cpp |
| Nous-Hermes-2-SOLAR-10.7B-Q6_K.gguf | 33.7 | 11 | llama.cpp |
| Theia-21B-v1-IQ4_XS.gguf | 33.7 | 21 | llama.cpp |
| Dobby-Mini-Leashed-Llama-3.1-8B.Q6_K.gguf | 33.6 | 8 | llama.cpp |
| EVA-Qwen2.5-14B-v0.2.Q6_K.gguf | 33.6 | 14 | llama.cpp |
| c4ai-command-r7b-12-2024-Q6_K.gguf | 33.5 | 7 | llama.cpp |
| Llama-Krikri-8B-Instruct.Q6_K.gguf | 33.5 | 8 | llama.cpp |
| Selene-1-Mini-Llama-3.1-8B-Q6_K.gguf | 33.5 | 8 | llama.cpp |
| Theia-21B-v2-IQ4_XS.gguf | 33.5 | 21 | llama.cpp |
| Mistral-Nemo-Instruct-2407-Q6_K.gguf | 33.3 | 12 | llama.cpp |
| EXAONE-3.5-7.8B-Instruct-Q6_K.gguf | 33.0 | 8 | llama.cpp |
| Qwen2.5-Coder-7B-Instruct-Q6_K.gguf | 33.0 | 7 | llama.cpp |
| qwen2-7b-instruct-q6_k.gguf | 32.9 | 7 | llama.cpp |
| internlm2_5-20b-chat-q4_k_m.gguf | 32.7 | 20 | llama.cpp |
| Krutrim-2-instruct-Q6_K.gguf | 32.7 | 12 | llama.cpp |
| Llama-3.1-SuperNova-Lite.Q6_K.gguf | 32.7 | 8 | llama.cpp |
| Marco-o1-Q6_K.gguf | 32.7 | 7 | llama.cpp |
| MN-12b-RP-Ink.Q6_K.gguf | 32.3 | 12 | llama.cpp |
| magnum-v3-9b-customgemma2-Q6_K.gguf | 32.2 | 9 | llama.cpp |
| MN-12B-Celeste-V1.9.Q6_K.gguf | 32.2 | 12 | llama.cpp |
| Mythalion-13B-Q6_K.gguf | 32.2 | 13 | llama.cpp |
| c4ai-command-r7b-12-2024-abliterated-Q6_K.gguf | 32.0 | 7 | llama.cpp |
| glm-4-9b-chat-1m-Q6_K.gguf | 32.0 | 9 | llama.cpp |
| Llama-3-Instruct-8B-KTO-v0.2-Q6_K.gguf | 32.0 | 8 | llama.cpp |
| Llama-3-Instruct-8B-RDPO-v0.2-Q6_K.gguf | 31.9 | 8 | llama.cpp |
| Llama-3.1-8B-BookAdventures.Q6_K.gguf | 31.8 | 8 | llama.cpp |
| Llama-3.1-8B-Lexi-Uncensored_V2_Q8.gguf | 31.8 | 8 | llama.cpp |
| Llama-3.1-Storm-8B.Q6_K.gguf | 31.8 | 8 | llama.cpp |
| Dolphin3.0-Llama3.1-8B-Q6_K.gguf | 31.7 | 8 | llama.cpp |
| glm-4-9b-chat-Q6_K.gguf | 31.7 | 9 | llama.cpp |
| magnum-v3-9b-chatml-Q6_K.gguf | 31.7 | 9 | llama.cpp |
| Falcon3-Jessi-v0.4-7B-Slerp.Q6_K.gguf | 31.6 | 7 | llama.cpp |
| MergeMonster-Decensored-7b-20231125.q6_K.gguf | 31.6 | 7 | llama.cpp |
| Ministral-8B-Instruct-2410-Q6_K.gguf | 31.6 | 8 | llama.cpp |
| SnowLotus-v2-10.7B-Q6_K.gguf | 31.6 | 11 | llama.cpp |
| Llama-3-Instruct-8B-SLiC-HF-v0.2-Q6_K.gguf | 31.4 | 8 | llama.cpp |
| MS-sunfall-v0.7.0.i1-IQ4_XS.gguf | 31.4 | 22 | llama.cpp |
| Nera_Noctis-12B-Q6_K.gguf | 31.4 | 12 | llama.cpp |
| Nyanade_Stunna-Maid-7B-v0.2-Q6_K-imat.gguf | 31.4 | 7 | llama.cpp |
| Captain_Eris_Noctis-12B-v0.420.Q6_K.gguf | 31.3 | 12 | llama.cpp |
| Llama-3-SauerkrautLM-8b-Instruct.Q6_K.gguf | 31.3 | 8 | llama.cpp |
| Loyal-Macaroni-Maid-7B.Q6_K.gguf | 31.3 | 7 | llama.cpp |
| DaringLotus-v2-10.7B-Q6_K.gguf | 31.2 | 11 | llama.cpp |
| Llama-3-Refueled-Q6_K.gguf | 31.2 | 8 | llama.cpp |
| Luminis-phi-4.Q6_K.gguf | 31.2 | 14 | llama.cpp |
| MN-Dark-Planet-TITAN-12B-D_AU-Q6_k.gguf | 31.2 | 12 | llama.cpp |
| orca_mini_v3_7b.Q6_K.gguf | 31.2 | 7 | llama.cpp |
| bagel-8b-v1.0-Q6_K.gguf | 31.1 | 8 | llama.cpp |
| LLAMA-3_8B_Unaligned_BETA-Q6_K.gguf | 31.1 | 8 | llama.cpp |
| Llama-3-Instruct-8B-DPO-v0.2-Q6_K.gguf | 31.1 | 8 | llama.cpp |
| Llama-3-Instruct-8B-RRHF-v0.2-Q6_K.gguf | 31.1 | 8 | llama.cpp |
| Llama-3.2-3B-Instruct-abliterated.Q8_0.gguf | 31.1 | 3 | llama.cpp |
| MergeMonster-7b-20231124.q6_K.gguf | 31.1 | 7 | llama.cpp |
| IceLemonTeaRP-32k-7b.Q6_K.gguf | 31.0 | 7 | llama.cpp |
| L3.2-8X3B-MOE-Dark-Champion-Inst-18.4B-uncen-ablit_D_AU-Q5_k_s.gguf | 31.0 | 18 | llama.cpp |
| aya-expanse-32b-IQ3_XS.gguf | 30.8 | 32 | llama.cpp |
| internlm3-8b-instruct-q8_0.gguf | 30.8 | 9 | llama.cpp |
| L3-Super-Nova-RP-8B.Q6_K.gguf | 30.8 | 8 | llama.cpp |
| Llama-3-8B-Instruct-abliterated-v2.Q6_K.gguf | 30.8 | 8 | llama.cpp |
| Reformed-Christian-Bible-Expert-12B.Q6_K.gguf | 30.8 | 12 | llama.cpp |
| Falcon3-7B-Instruct-q6_k.gguf | 30.7 | 7 | llama.cpp |
| L3.1-8b-RP-Ink-Q6_K.gguf | 30.6 | 8 | llama.cpp |
| Falcon3-Mamba-7B-Instruct-q6_k.gguf | 30.5 | 7 | llama.cpp |
| lfm-7b | 30.5 | 7 | API (OpenRouter) |
| olmo-2-1124-13B-instruct-Q6_K.gguf | 30.4 | 14 | llama.cpp |
| Control-Nanuq-8B.Q6_K.gguf | 30.2 | 8 | llama.cpp |
| Llama-3-Instruct-8B-SimPO-v0.2-Q6_K.gguf | 30.2 | 8 | llama.cpp |
| Llama-3.1-8B-toxic-dpo-NoWarning.Q6_K.gguf | 30.2 | 8 | llama.cpp |
| Nera_Noctis-12B-v0.420.Q6_K.gguf | 30.2 | 12 | llama.cpp |
| Mistral-7B-OpenOrca.Q6_K.gguf | 30.1 | 7 | llama.cpp |
| RolePlayLake-7B.Q6_K.gguf | 30.1 | 7 | llama.cpp |
| magnum-v4-27b-IQ3_XS.gguf | 30.0 | 27 | llama.cpp |
| watt-tool-8B-Q6_K.gguf | 29.8 | 8 | llama.cpp |
| Llama-3-Hercules-5.0-8B-Q6_K.gguf | 29.6 | 8 | llama.cpp |
| kukulemon-7B-Q6_K-imat.gguf | 29.5 | 7 | llama.cpp |
| nemo-sunfall-v0.6.1.Q6_K.gguf | 29.5 | 12 | llama.cpp |
| orca_mini_v3_13b.Q6_K.gguf | 29.5 | 13 | llama.cpp |
| SeminalRP-22b.i1-IQ4_XS.gguf | 29.5 | 22 | llama.cpp |
| Llama3.1-8B-Chinese-Chat-Q6_K.gguf | 29.4 | 8 | llama.cpp |
| Yi-1.5-34B-Chat-IQ3_XS.gguf | 29.4 | 34 | llama.cpp |
| openchat-3.6-8b-20240522-Q6_K.gguf | 29.3 | 8 | llama.cpp |
| internlm3-8b-instruct-q6_k.gguf | 29.2 | 9 | llama.cpp |
| Qwen2.5-7B-Instruct-abliterated.Q6_K.gguf | 29.2 | 7 | llama.cpp |
| granite-3.1-8b-instruct-abliterated.Q6_K.gguf | 29.0 | 8 | llama.cpp |
| Human-Like-Mistral-Nemo-Instruct-2407.Q6_K.gguf | 28.9 | 12 | llama.cpp |
| LemonadeRP-4.5.3-Q6_K.gguf | 28.9 | 7 | llama.cpp |
| Llama-3.1-8B-UltraMedical.Q6_K.gguf | 28.9 | 8 | llama.cpp |
| Phi-4-Q6_K.gguf | 28.9 | 14 | llama.cpp |
| BeagleLake-7B.Q6_K.gguf | 28.8 | 7 | llama.cpp |
| internlm2_5-7b-chat-q6_k.gguf | 28.8 | 7 | llama.cpp |
| aya-expanse-8b-Q6_K.gguf | 28.7 | 8 | llama.cpp |
| SummLlama3.1-8B.Q6_K.gguf | 28.6 | 8 | llama.cpp |
| StableBeluga-7B-Q6_K.gguf | 28.4 | 7 | llama.cpp |
| Aura_Uncensored_l3_8B-Q6_K-imat.gguf | 28.3 | 8 | llama.cpp |
| granite-3.1-8b-instruct-Q6_K.gguf | 28.3 | 8 | llama.cpp |
| Llama-3-Instruct-8B-SimPO.Q6_K.gguf | 28.1 | 8 | llama.cpp |
| LLama3.1-Rhino-8B-RAG.Q6_K.gguf | 28.1 | 8 | llama.cpp |
| Pantheon-RP-1.5-12B-Nemo-Q6_K.gguf | 28.0 | 12 | llama.cpp |
| Bespoke-Stratos-7B.Q6_K.gguf | 27.7 | 7 | llama.cpp |
| BuRP_7B.Q6_K.gguf | 27.7 | 7 | llama.cpp |
| Eximius_Persona_5B-Q6_K.gguf | 27.7 | 5 | llama.cpp |
| MergeMonster-WritingStyle-7b-20231126.q6_K.gguf | 27.6 | 7 | llama.cpp |
| Phi-3-mini-4k-instruct-Q8_0.gguf | 27.6 | 4 | llama.cpp |
| BeagleLake-7B-Toxic.Q6_K.gguf | 27.5 | 7 | llama.cpp |
| causallm_14b.Q5_1.gguf | 27.1 | 14 | llama.cpp |
| Llama-3.2-3B-Instruct-Q8_0.gguf | 27.1 | 3 | llama.cpp |
| Violet_Twilight-v0.2.Q6_K.gguf | 27.1 | 12 | llama.cpp |
| orca-2-13b.Q6_K.gguf | 27.0 | 13 | llama.cpp |
| Phi-lthy4.Q6_K.gguf | 26.9 | 14 | llama.cpp |
| Poppy_Porpoise-0.72-L3-8B-Q6_K-imat.gguf | 26.6 | 8 | llama.cpp |
| FineTunedOnNovelAndFandom.Q6_K.gguf | 26.5 | 12 | llama.cpp |
| Hermes-2-Pro-Mistral-7B.Q6_K.gguf | 26.5 | 7 | llama.cpp |
| mistral-nemo-gutenberg-12B-v2.Q6_K.gguf | 26.5 | 12 | llama.cpp |
| Chocolatine-3B-Instruct-DPO-Revised.Q8_0.gguf | 26.4 | 3 | llama.cpp |
| internlm2_5-7b-chat-1m-q6_k.gguf | 26.4 | 7 | llama.cpp |
| Einstein-v7-Qwen2-7B.Q6_K.gguf | 26.1 | 7 | llama.cpp |
| Falcon3-10B-Instruct-abliterated.Q6_K.gguf | 26.1 | 10 | llama.cpp |
| GigaChat-20B-A3B-instruct-v1.5-q4_K_M.gguf | 25.8 | 20 | llama.cpp |
| LLaMA3-iterative-DPO-final-Q6_K.gguf | 25.8 | 8 | llama.cpp |
| MN-12B-Starcannon-v2.Q6_K.gguf | 25.8 | 12 | llama.cpp |
| wizardlm-2-8x22b | 25.8 | 141 | API (OpenRouter) |
| AceInstruct-7B-Q6_K.gguf | 25.7 | 7 | llama.cpp |
| Dobby-Mini-Unhinged-Llama-3.1-8B.Q6_K.gguf | 25.7 | 8 | llama.cpp |
| Erosumika-7B-v3-0.2-Q6_K-imat.gguf | 25.7 | 7 | llama.cpp |
| MythoLogic-Mini-7B-Q6_K.gguf | 25.4 | 7 | llama.cpp |
| command-r | 25.3 | 35 | API (OpenRouter) |
Llama-Spark-Q6_K.gguf

25.3

8

llama.cpp

MythoMax-L2-13B-Q6_K.gguf

25.3

13

llama.cpp

daybreak-kunoichi-2dpo-7b.Q6_K.gguf

25.2

7

llama.cpp

Llama-3.1-8B-Stheno-v3.4-Q6_K.gguf

25.2

8

llama.cpp

PyThagorean-3B.Q8_0.gguf

25.2

3

llama.cpp

Teleut-7b.Q6_K.gguf

25.2

7

llama.cpp

sorcererlm-8x22b

25.1

141

API (OpenRouter)

mini-magnum-12b-v1.1.Q6_K.gguf

24.9

12

llama.cpp

olmo-2-1124-7B-instruct-Q6_K.gguf

24.9

7

llama.cpp

aya-23-35B-IQ3_XXS.gguf

24.8

35

llama.cpp

DeepSeek-R1-Distill-Llama-8B-Q6_K.gguf

24.8

8

llama.cpp

QwQ-32B-Preview-IQ3_XS.gguf

24.8

32

llama.cpp

Captain-Eris-Violet-12B-GRPO-Q6_K.gguf

24.7

12

llama.cpp

codegemma-7b-it.Q6_K.gguf

24.7

7

llama.cpp

GigaChat-20B-A3B-instruct-q4_K_M.gguf

24.7

20

llama.cpp

Kunoichi-DPO-v2-7B-Q6_K-imatrix.gguf

24.7

7

llama.cpp

Configurable-Llama-3-8B-v0.3.Q6_K.gguf

24.6

8

llama.cpp

WestLake-7B-v2.Q6_K.gguf

24.6

7

llama.cpp

Xwin-MLewd-13B-V0.2-Q6_K.gguf

24.6

13

llama.cpp

codegemma-1.1-7b-it.Q6_K.gguf

24.3

7

llama.cpp

synatra-7b-v0.3-rp.Q6_K.gguf

24.3

7

llama.cpp

MythoMix-L2-13B.Q6_K.gguf

24.2

13

llama.cpp

Lexi-Llama-3-8B-Uncensored_Q8_0.gguf

24.1

8

llama.cpp

Qwen2.5-3B-Instruct-Q8_0.gguf

24.1

3

llama.cpp

EVA-Qwen2.5-14B-v0.1.Q6_K.gguf

23.9

14

llama.cpp

Geneva-12B-GCv2-5m.Q6_K.gguf

23.9

12

llama.cpp

LLaMA2-13B-Psyfighter2.Q6_K.gguf

23.9

13

llama.cpp

Rombo-LLM-V2.5-Qwen-3b.Q8_0.gguf

23.9

3

llama.cpp

remm-slerp-l2-13b.Q6_K.gguf

23.7

13

llama.cpp

TimeCrystal-L2-13B-Q6_K.gguf

23.7

13

llama.cpp

Falcon3-3B-Instruct-q8_0.gguf

23.6

3

llama.cpp

LLaMA2-13B-Tiefighter.Q6_K.gguf

23.5

13

llama.cpp

InfinityRP-v1-7B-Q6_K-imat.gguf

23.4

7

llama.cpp

Phi-3-mini-128k-instruct-Q8_0.gguf

23.4

4

llama.cpp

dolphin-2.8-experiment26-7b-Q6_K.gguf

23.3

7

llama.cpp

MythoMist-7B-Q6_K.gguf

23.3

7

llama.cpp

Yi-1.5-9B-Chat-Q6_K.gguf

23.3

9

llama.cpp

Azure_Dusk-v0.2_Q6_K.gguf

23.0

12

llama.cpp

Nemo-12b-Humanize-KTO-v0.1-Q6_K.gguf

22.9

12

llama.cpp

glm-edge-4B-chat-Q8_0.gguf

22.8

4

llama.cpp

granite-3.1-2b-instruct-Q8_0.gguf

22.8

2

llama.cpp

natsumura-storytelling-rp-1.0-llama-3.1-8b.Q6_K.gguf

22.8

8

llama.cpp

Eyas-17B-Instruct.Q5_K_M.gguf

22.7

17

llama.cpp

Pantheon-10.7B-Q6_K.gguf

22.7

11

llama.cpp

Qwen2.5-3B-Instruct-Abliterated.Q8_0.gguf

22.7

3

llama.cpp

Crimson_Dawn-v0.2_Q6_K.gguf

22.2

12

llama.cpp

Aura-4B.Q8_0.gguf

22.0

4

llama.cpp

loyal-piano-m7.Q6_K.gguf

22.0

7

llama.cpp

Pantheon-RP-1.6.1-12B-Nemo-Q6_K.gguf

22.0

12

llama.cpp

juanako-7b-una.Q6_K.gguf

21.9

7

llama.cpp

Hermes-3-Llama-3.2-3B.Q8_0.gguf

21.7

3

llama.cpp

rose-20b.Q4_K_M.gguf

21.7

20

llama.cpp

Eris_Remix_7B.Q6_K.gguf

21.6

7

llama.cpp

inflection-3-pi

21.4

?

API (OpenRouter)

LLAMA-3_8B_Unaligned_Alpha-Q6_K.gguf

21.4

8

llama.cpp

airoboros-l2-13b-3.1.1.Q6_K.gguf

21.3

7

llama.cpp

solar-10.7b-instruct-v1.0.Q6_K.gguf

21.3

11

llama.cpp

lfm-3b

21.2

3

API (OpenRouter)

SOVL_Llama3_8B-Q6_K-imat.gguf

21.1

8

llama.cpp

vicuna-13b-v1.5.Q6_K.gguf

20.8

13

llama.cpp

Llama-3.1-Uncensored-New.Q6_K.gguf

20.7

8

llama.cpp

airoboros-m-7b-3.1.2.Q6_K.gguf

20.6

7

llama.cpp

Llama-3-8B-Uncensored-0.3c.Q6_K.gguf

20.4

8

llama.cpp

Qwen2.5-1.5B-Instruct-Q8_0.gguf

20.4

1

llama.cpp

EXAONE-3.5-2.4B-Instruct-abliterated.Q8_0.gguf

20.2

2

llama.cpp

L3-Luna-8B.Q6_K.gguf

20.2

8

llama.cpp

L3.2-Rogue-Creative-Instruct-Uncensored-Abliterated-7B-GGUF

20.2

7

llama.cpp

Kunocchini.Q6_K.gguf

20.1

7

llama.cpp

L3-8B-Helium3.Q6_K.gguf

19.8

8

llama.cpp

Qwen1.5-14B-Chat.Q6_K.gguf

19.8

14

llama.cpp

Bielik-7B-Instruct-v0.1-Q6_K.gguf

19.6

7

llama.cpp

Peach-9B-8k-Roleplay-Q6_K.gguf

19.6

9

llama.cpp

dolphin-2.2.1-mistral-7b.Q6_K.gguf

19.5

7

llama.cpp

Reasoning-Llama-3.1-CoT-RE1.Q6_K.gguf

19.4

8

llama.cpp

Llama-3.1-Tulu-3-8B-Q6_K.gguf

19.3

8

llama.cpp

MiniCPM3-4B.Q8_0.gguf

19.3

3

llama.cpp

OpenThinker-7B-Q6_K.gguf

19.3

7

llama.cpp

Impish_LLAMA_3B-Q8_0.gguf

19.2

3

llama.cpp

Mistral-7B-Instruct-v0.3-Q6_K.gguf

19.0

7

llama.cpp

granite-3.1-2b-instruct-abliterated.Q6_K.gguf

18.6

2

llama.cpp

L3.1-8B-sunfall-v0.6.1-dpo.Q6_K.gguf

18.6

8

llama.cpp

Eurus-7b-kto.Q6_K.gguf

18.4

7

llama.cpp

Q25-1.5B-VeoLu-Q8_0.gguf

18.4

1

llama.cpp

dolphin-2.9.4-llama3.1-8b-Q6_K.gguf

18.3

8

llama.cpp

Llama-3SOME-8B-v1.Q6_K.gguf

18.3

8

llama.cpp

miniclaus-qw1.5B-UNAMGS-GRPO-Q8_0.gguf

18.2

1

llama.cpp

OpenThinker-7B-abliterated.Q6_K.gguf

18.1

7

llama.cpp

Average_Normie_v3.69_8B-Q6_K-imat.gguf

18.0

8

llama.cpp

orca-2-7b.Q6_K.gguf

17.8

7

llama.cpp

EXAONE-3.5-2.4B-Instruct-Q8_0.gguf

17.5

2

llama.cpp

Qwen2.5-Sex.Q8_0.gguf

17.5

1

llama.cpp

gemma-1.1-7b-it.Q6_K.gguf

17.3

7

llama.cpp

gemma-2-2b-it-Q8_0.gguf

17.3

2

llama.cpp

DeepSeek-V2-Lite-Chat.Q6_K.gguf

17.0

16

llama.cpp

dolphin-2.9-llama3-8b-q6_K.gguf

16.7

8

llama.cpp

miniclaus-qw1.5B-UNAMGS-Q8_0.gguf

16.7

1

llama.cpp

Impish_Mind-Q6_K.gguf

16.6

8

llama.cpp

Phi-3.5-mini-instruct-Q8_0.gguf

16.6

4

llama.cpp

Llama-3-Unholy-8B.q6_k.gguf

16.5

8

llama.cpp

Hercules-4.0-Mistral-v0.2-7B-Q6_K.gguf

16.3

7

llama.cpp

qwen2-1_5b-instruct-q8_0.gguf

16.3

2

llama.cpp

vicuna-13b-v1.5-16k.Q6_K.gguf

16.1

13

llama.cpp

EuroLLM-9B-Instruct-Q6_K.gguf

16.0

9

llama.cpp

aya-23-8B-Q6_K.gguf

15.5

8

llama.cpp

Index-1.9B-Chat-Q8_0.gguf

15.5

2

llama.cpp

Sky-T1-mini.Q6_K.gguf

15.4

7

llama.cpp

mistral-7b-instruct-v0.1.Q6_K.gguf

15.2

7

llama.cpp

phi-2-layla-v1-chatml-Q8_0.gguf

14.8

3

llama.cpp

OLMoE-1B-7B-0125-Instruct-Q6_K.gguf

14.7

7

llama.cpp

Codestral-22B-v0.1-IQ4_XS.gguf

14.5

22

llama.cpp

GritLM-7B.Q6_K.gguf

14.5

7

llama.cpp

PyThagorean-Tiny.Q8_0.gguf

14.5

1

llama.cpp

DeepSeek-R1-Distill-Qwen-7B-Q6_K.gguf

14.3

7

llama.cpp

Llama-3.2-1B-Instruct-Q8_0.gguf

14.2

1

llama.cpp

Wizard-Vicuna-30B-Uncensored.i1-IQ3_XS.gguf

14.1

30

llama.cpp

Llama-3-8B-ProLong-64k-Instruct.Q6_K.gguf

14.0

8

llama.cpp

alphamonarch-7b.Q6_K.gguf

13.9

7

llama.cpp

Llama-3-8B-ProLong-512k-Instruct.Q6_K.gguf

13.7

8

llama.cpp

Meditron3-8B.Q6_K.gguf

13.7

8

llama.cpp

MythoLogic-L2-13B.i1-Q6_K.gguf

13.5

13

llama.cpp

neural-chat-7b-v3-3-Q6_K.gguf

13.5

7

llama.cpp

LwQ-10B-Instruct.Q6_K.gguf

13.4

10

llama.cpp

Wayfarer-12B-Q6_K.gguf

13.4

12

llama.cpp

FastLlama-3.2-1B-Instruct-Q8_0.gguf

12.8

1

llama.cpp

Qwen1.5-7B-Chat.Q6_K.gguf

12.8

7

llama.cpp

Falcon3-1B-Instruct-q8_0.gguf

12.7

1

llama.cpp

L3-DARKEST-PLANET-16.5B-D_AU-Q6_k.gguf

12.5

16

llama.cpp

Pantheon-RP-1.0-8B-Llama-3-Q6_K.gguf

12.4

8

llama.cpp

orca_mini_v9_7_1B-Instruct.Q8_0.gguf

12.3

1

llama.cpp

Nous-Hermes-Llama-2-7B.Q6_K.gguf

12.2

7

llama.cpp

Lieutenant_BMO-10B-Q6_K.gguf

12.0

10

llama.cpp

L3-DARKEST-PLANET-Seven-Rings-Of-DOOM-16.5B-D_AU-Q6_k.gguf

11.9

16

llama.cpp

TinySwallow-1.5B-Instruct-Q8_0.gguf

11.8

2

llama.cpp

Megrez-3B-Instruct-abliterated.Q8_0.gguf

11.4

3

llama.cpp

mpt-7b-8k-chat.Q6_K.gguf

11.4

7

llama.cpp

Megrez-3B-Instruct-Q8_0.gguf

11.2

3

llama.cpp

L3-TheSpice-8b-v0.8.3-Q6_K-imat.gguf

11.1

8

llama.cpp

Llama-3.2-1B-Instruct-abliterated.Q8_0.gguf

10.6

1

llama.cpp

magnum-v2-4b-Q8_0.gguf

10.6

4

llama.cpp

internlm2_5-1_8b-chat-q8_0.gguf

10.5

2

llama.cpp

Teuken-7B-instruct-commercial-v0.4-Q6_K.gguf

10.5

7

llama.cpp

Janus-Pro-7B-LM.Q6_K.gguf

10.2

7

llama.cpp

RWKV-v6-Finch-14B-HF.Q6_K.gguf

10.2

14

llama.cpp

MythoBoros-13B.Q6_K.gguf

9.9

13

llama.cpp

Qwen2.5-1.5B-Instruct-abliterated.Q8_0.gguf

9.9

1

llama.cpp

orca_mini_v9_7_3B-Instruct.Q8_0.gguf

9.8

3

llama.cpp

SmolLM2-1.7B-Instruct-Q8_0.gguf

9.5

2

llama.cpp

Teuken-7B-instruct-research-v0.4-Q6_K.gguf

8.7

7

llama.cpp

llama-2-7b-chat.Q6_K.gguf

8.4

7

llama.cpp

Mistral-7B-Instruct-SimPO-Q6_K.gguf

8.4

7

llama.cpp

Qwen2-Boundless-Q8_0.gguf

8.4

1

llama.cpp

EstopianMaid-13B-Q6_K.gguf

8.1

13

llama.cpp

Qwen1.5-4B-Chat.Q8_0.gguf

8.1

4

llama.cpp

chatglm3-6b.Q6_K.gguf

7.8

6

llama.cpp

vicuna-7b-v1.5.Q6_K.gguf

7.7

7

llama.cpp

glm-edge-1.5B-chat-Q8_0.gguf

7.6

1

llama.cpp

Qwen2.5-0.5B-Instruct-Q8_0.gguf

7.2

1

llama.cpp

ShoriRP.v077.q6_k.gguf

7.1

7

llama.cpp

Zephyr-7B-beta-Q6_K.gguf

6.7

7

llama.cpp

L3-SMB-Grand-Horror-16.5B-V2-D_AU-Q6_k.gguf

6.4

16

llama.cpp

Yi-Coder-9B-Chat-Q6_K.gguf

6.4

9

llama.cpp

RWKV-v6-Finch-7B-World3-HF-Q6_K.gguf

5.8

7

llama.cpp

mixtral-8x7b-instruct

5.7

47

API (OpenRouter)

DeepSeek-R1-Distill-Qwen-1.5B-Q8_0.gguf

5.5

2

llama.cpp

llama-2-13b-chat.Q6_K.gguf

5.5

13

llama.cpp

RWKV-v6-Finch-7B-HF.Q6_K.gguf

5.5

7

llama.cpp

gorilla-openfunctions-v2-Q6_K.gguf

5.2

7

llama.cpp

HelpingAI-9B.Q6_K.gguf

5.2

9

llama.cpp

minerva-7b-instruct-v1.0-q6_k.gguf

5.1

7

llama.cpp

BioMistral-7B.Q6_K.gguf

4.9

7

llama.cpp

DeepScaleR-1.5B-Preview-Q8_0.gguf

4.8

2

llama.cpp

Wizard-Vicuna-13B-Uncensored.Q6_K.gguf

4.7

13

llama.cpp

qwen2-0_5b-instruct-q8_0.gguf

4.6

1

llama.cpp

SmallThinker-3B-Preview-Q8_0.gguf

4.6

3

llama.cpp

Wizard-Vicuna-7B-Uncensored.Q6_K.gguf

4.6

7

llama.cpp

gemma-1.1-2b-it.Q8_0.gguf

4.0

2

llama.cpp

Orion-14B-Chat.Q6_K.gguf

3.4

14

llama.cpp

Moistral-11B-v2.Q6_K.gguf

3.1

11

llama.cpp

oxy-1-micro.Q8_0.gguf

3.1

1

llama.cpp

SmolLM2-360M-Instruct-Q8_0.gguf

2.0

1

llama.cpp

stablelm-2-12b-chat.Q6_K.gguf

1.8

12

llama.cpp

stable-code-instruct-3b.Q8_0.gguf

1.7

3

llama.cpp

TinyDolphin-2.8-1.1b-Q8_0.gguf

1.7

1

llama.cpp

gemma-7b-it.Q6_K.gguf

1.1

7

llama.cpp

SmolLM-1.7B-Instruct-Q8_0.gguf

0.7

2

llama.cpp

WizardLM-2-7B-Q6_K.gguf

0.5

7

llama.cpp

gemma-2b-it.Q8_0.gguf

0.0

2

llama.cpp

SmolLM-360M-Instruct-Q8_0.gguf

0.0

1

llama.cpp
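If you want to slice the ranking yourself (say, to keep only models that fit a 16GB VRAM GPU), a few lines of Python are enough. This is a minimal sketch: the rows below are a hand-copied sample from the table above, and the 14B cutoff is just an arbitrary stand-in for "fits in 16GB of VRAM at Q6_K".

```python
# Sample rows copied from the table: (model, score, size in B params, backend).
rows = [
    ("Phi-4-Q6_K.gguf", 28.9, 14, "llama.cpp"),
    ("aya-expanse-8b-Q6_K.gguf", 28.7, 8, "llama.cpp"),
    ("wizardlm-2-8x22b", 25.8, 141, "API (OpenRouter)"),
    ("Llama-3.2-3B-Instruct-Q8_0.gguf", 27.1, 3, "llama.cpp"),
]

# Keep only small-enough models (<= 14B here), best score first.
local = sorted((r for r in rows if r[2] <= 14), key=lambda r: r[1], reverse=True)

for name, score, size, backend in local:
    print(f"{name}: {score} ({size}B, {backend})")
```

The same filter works on the full table once it is exported to CSV or similar.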

Donations

If you'd like to support my work → ☕ https://ko-fi.com/moonride ☕.

Changelog

2025-02-04 Identified a bug that caused answers to one question to go uncounted. Recalculating the scores improved the results of about 30 models. Also included the results from the new models I've tested.

2025-02-09 Identified a bug in the handling of communication errors and rejections. I needed to re-run the tests for one model and recalculate the scores for two others. Added more results.

2025-02-13 Added more results.

2025-02-17 Added more results.
