LLM Judge - Manual Review

Type your name to start (or resume) reviewing. If the name you type exactly matches one of the existing reviewers below, you will pick up that reviewer's progress where they left off.

Existing reviewers

Click a name to resume that reviewer's session, or type a new name in the field above to start a fresh one. Typing an existing reviewer's exact name in that field also resumes their session (names are case-sensitive).

| Reviewer | Done | Progress |
|---|---|---|
| Antonio | 4 / 100 | 4% |
| Simone | 3 / 100 | 3% |
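
The resume-or-start rule described above can be sketched as a small lookup. This is only an illustration of the exact, case-sensitive match, not the tool's actual code: the function name `resolve_session` and the `sessions` dictionary (reviewer name mapped to number of items done) are hypothetical.

```python
def resolve_session(typed_name, sessions):
    """Resume an existing reviewer's session or start a fresh one.

    `sessions` is a hypothetical mapping of reviewer name -> items done.
    The lookup is an exact string match, so it is case-sensitive.
    Returns (name, items_done, resumed).
    """
    if typed_name in sessions:
        # Exact match: pick up where that reviewer left off.
        return typed_name, sessions[typed_name], True
    # No exact match: register a new reviewer with no progress yet.
    sessions[typed_name] = 0
    return typed_name, 0, False


sessions = {"Antonio": 4, "Simone": 3}
resumed = resolve_session("Antonio", sessions)   # resumes Antonio's session
fresh = resolve_session("antonio", sessions)     # different case: new reviewer
```

Because the match is exact, "antonio" and "Antonio" are treated as two different reviewers in this sketch.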

Per-item review matrix

One row per item to review. The middle columns show what each LLM judge said; the right columns show what each human reviewer has decided. A dash means a verdict is missing.

| # | Case | Eval. model | Understanding point | ClaudeAI judge | GeminiAI judge | OpenAI judge | Antonio | Simone |
|---|---|---|---|---|---|---|---|---|
| 0 | op02 | ClaudeAI | String decryption | satisfied | satisfied | satisfied | satisfied | satisfied |
| 1 | op02 | KimiK26 | String decryption | satisfied | satisfied | satisfied | satisfied | satisfied |
| 2 | op02 | OpenAI | Obfuscated XOR operation | satisfied | not_satisfied | satisfied | satisfied | satisfied |
| 3 | op02 | OpenAI | String decryption | satisfied | not_satisfied | satisfied | satisfied | - |
| 4 | op02 | Qwen359B | String decryption | satisfied | satisfied | satisfied | - | - |
| 5 | op04 | OpenAI | Blowfish | satisfied | satisfied | satisfied | - | - |
| 6 | op04 | gemma431Bit | Blowfish | satisfied | satisfied | satisfied | - | - |
| 7 | op05 | GLM51 | filename comparison with browser name strings | satisfied | satisfied | satisfied | - | - |
| 8 | op05 | GLM51 | get current process path | satisfied | satisfied | satisfied | - | - |
| 9 | op05 | KimiK26 | get filename | satisfied | satisfied | satisfied | - | - |
| 10 | op05 | Qwen359B | obfuscated branch condition | satisfied | satisfied | satisfied | - | - |
| 11 | op05 | gemma431Bit | obfuscated branch condition | satisfied | satisfied | satisfied | - | - |
| 12 | op06 | ClaudeAI | calculating odd semiprime and its factors for AES-CBC | not_satisfied | not_satisfied | not_satisfied | - | - |
| 13 | op06 | OpenAI | calculating the time window at which the malware is running | not_satisfied | not_satisfied | not_satisfied | - | - |
| 14 | op06 | gemma431Bit | calculating the time window at which the malware is running | not_satisfied | not_satisfied | not_satisfied | - | - |
| 15 | op07 | OpenAI | Decryption via multiple arithmetic operations (MUL, SUB, ADD) | satisfied | not_satisfied | satisfied | - | - |
| 16 | op07 | OpenAI | String decryption | satisfied | not_satisfied | satisfied | - | - |
| 17 | op08 | ClaudeAI | zip data compression | satisfied | satisfied | satisfied | - | - |
| 18 | op08 | Qwen359B | browser extensions | not_satisfied | satisfied | not_satisfied | - | - |
| 19 | op08 | Qwen359B | zip data compression | not_satisfied | satisfied | not_satisfied | - | - |
| 20 | op09 | ClaudeAI | data encoding | satisfied | not_satisfied | not_satisfied | - | - |
| 21 | op09 | ClaudeAI | data verification | satisfied | not_satisfied | not_satisfied | - | - |
| 22 | op09 | ClaudeAI | xor data decryption | satisfied | not_satisfied | not_satisfied | - | - |
| 23 | op09 | GeminiAI | data encoding | satisfied | satisfied | satisfied | - | - |
| 24 | op09 | KimiK26 | data verification | not_satisfied | not_satisfied | not_satisfied | - | - |
| 25 | op11 | ClaudeAI | XOR encryption | satisfied | not_satisfied | satisfied | - | - |
| 26 | op11 | ClaudeAI | custom ROR-13 hashing | satisfied | not_satisfied | satisfied | - | - |
| 27 | op11 | ClaudeAI | dynamic API resolution via PEB | satisfied | not_satisfied | satisfied | - | - |
| 28 | op11 | KimiK26 | XOR encryption | satisfied | not_satisfied | satisfied | - | - |
| 29 | op11 | KimiK26 | custom ROR-13 hashing | satisfied | not_satisfied | satisfied | - | - |
| 30 | op11 | KimiK26 | dynamic API resolution via PEB | satisfied | not_satisfied | satisfied | - | - |
| 31 | op11 | OpenAI | XOR encryption | not_satisfied | satisfied | satisfied | - | - |
| 32 | op11 | OpenAI | custom ROR-13 hashing | not_satisfied | satisfied | satisfied | - | - |
| 33 | op11 | OpenAI | dynamic API resolution via PEB | not_satisfied | satisfied | satisfied | - | - |
| 34 | op11 | Qwen359B | custom ROR-13 hashing | not_satisfied | not_satisfied | not_satisfied | - | - |
| 35 | op12 | KimiK26 | execution of dynamic code | satisfied | satisfied | satisfied | - | - |
| 36 | op12 | Qwen359B | data unpacking | not_satisfied | satisfied | not_satisfied | - | - |
| 37 | op12 | Qwen359B | dynamic API resolution | not_satisfied | satisfied | not_satisfied | - | - |
| 38 | op12 | Qwen359B | execution of dynamic code | not_satisfied | satisfied | not_satisfied | - | - |
| 39 | op13 | KimiK26 | API resolution via PE EAT walking | satisfied | satisfied | satisfied | - | - |
| 40 | op14 | Qwen359B | aplib library | satisfied | satisfied | satisfied | - | - |
| 41 | op14 | Qwen359B | data decompression | satisfied | satisfied | satisfied | - | - |
| 42 | op15 | GeminiAI | base-64 decoding | satisfied | satisfied | satisfied | - | - |
| 43 | op16 | GLM51 | blowfish algorithm | satisfied | satisfied | satisfied | - | - |
| 44 | op16 | GeminiAI | blowfish algorithm | not_satisfied | satisfied | satisfied | - | - |
| 45 | op16 | GeminiAI | data decryption | not_satisfied | satisfied | satisfied | - | - |
| 46 | op16 | KimiK26 | blowfish algorithm | not_satisfied | satisfied | not_satisfied | - | - |
| 47 | op16 | KimiK26 | data decryption | not_satisfied | satisfied | not_satisfied | - | - |
| 48 | op16 | OpenAI | data decryption | satisfied | satisfied | satisfied | - | - |
| 49 | op16 | Qwen359B | data decryption | not_satisfied | not_satisfied | not_satisfied | - | - |
| 50 | op17 | ClaudeAI | check the number of processors is > 2 | satisfied | not_satisfied | not_satisfied | - | - |
| 51 | op17 | OpenAI | check the number of processors is > 2 | satisfied | not_satisfied | not_satisfied | - | - |
| 52 | op20 | GeminiAI | CRC32 hashing | not_satisfied | not_satisfied | not_satisfied | - | - |
| 53 | op21 | gemma431Bit | directory creation | satisfied | satisfied | satisfied | - | - |
| 54 | op23 | Qwen359B | dynamic API resolution | satisfied | not_satisfied | not_satisfied | - | - |
| 55 | op23 | Qwen359B | string hashing | satisfied | not_satisfied | not_satisfied | - | - |
| 56 | op24 | GLM51 | loop through all array items | not_satisfied | not_satisfied | not_satisfied | - | - |
| 57 | op25 | GeminiAI | the generated string is 77 characters long | satisfied | satisfied | satisfied | - | - |
| 58 | op26 | ClaudeAI | generate a random number | satisfied | satisfied | satisfied | - | - |
| 59 | op26 | OpenAI | generate a random number | satisfied | satisfied | satisfied | - | - |
| 60 | op26 | gemma431Bit | the random number is used to select the character | satisfied | satisfied | satisfied | - | - |
| 61 | op28 | ClaudeAI | get Windows directory | satisfied | satisfied | satisfied | - | - |
| 62 | op28 | gemma431Bit | get volume serial | satisfied | satisfied | satisfied | - | - |
| 63 | op29 | ClaudeAI | compare authority against hardcoded constant | satisfied | not_satisfied | satisfied | - | - |
| 64 | op29 | ClaudeAI | get token information | satisfied | not_satisfied | satisfied | - | - |
| 65 | op29 | ClaudeAI | open current process | satisfied | not_satisfied | satisfied | - | - |
| 66 | op29 | GLM51 | open current process | satisfied | satisfied | satisfied | - | - |
| 67 | op29 | GeminiAI | compare authority against hardcoded constant | satisfied | not_satisfied | not_satisfied | - | - |
| 68 | op29 | GeminiAI | get token information | satisfied | not_satisfied | not_satisfied | - | - |
| 69 | op29 | GeminiAI | open current process | satisfied | not_satisfied | not_satisfied | - | - |
| 70 | op30 | ClaudeAI | resolve import address table | satisfied | satisfied | satisfied | - | - |
| 71 | op30 | GLM51 | resolve import address table | satisfied | satisfied | satisfied | - | - |
| 72 | op30 | GeminiAI | execute exported function TcBootOtl | - | satisfied | satisfied | - | - |
| 73 | op30 | GeminiAI | resolve import address table | - | satisfied | satisfied | - | - |
| 74 | op30 | KimiK26 | relocate memory | satisfied | satisfied | satisfied | - | - |
| 75 | op30 | OpenAI | resolve import address table | satisfied | satisfied | satisfied | - | - |
| 76 | op31 | GeminiAI | compute murmurhash32 | satisfied | satisfied | satisfied | - | - |
| 77 | op32 | Qwen359B | loop the input buffer | satisfied | not_satisfied | satisfied | - | - |
| 78 | op32 | Qwen359B | set the buffer to the input character | satisfied | not_satisfied | satisfied | - | - |
| 79 | op32 | gemma431Bit | loop the input buffer | satisfied | not_satisfied | satisfied | - | - |
| 80 | op32 | gemma431Bit | set the buffer to the input character | satisfied | not_satisfied | satisfied | - | - |
| 81 | op33 | KimiK26 | RC4 PRGA phase | satisfied | satisfied | satisfied | - | - |
| 82 | op35 | GLM51 | data hashing | satisfied | satisfied | satisfied | - | - |
| 83 | op35 | OpenAI | data hashing | satisfied | satisfied | satisfied | - | - |
| 84 | op36 | GeminiAI | matching flag with hard-coded values | not_satisfied | satisfied | not_satisfied | - | - |
| 85 | op36 | Qwen359B | matching flag with hard-coded values | satisfied | satisfied | not_satisfied | - | - |
| 86 | op36 | gemma431Bit | matching flag with hard-coded values | satisfied | satisfied | not_satisfied | - | - |
| 87 | op37 | gemma431Bit | SHA1 implementation | not_satisfied | satisfied | not_satisfied | - | - |
| 88 | op38 | ClaudeAI | cycle the input buffer | satisfied | not_satisfied | not_satisfied | - | - |
| 89 | op38 | ClaudeAI | xor encrypt each byte | satisfied | not_satisfied | not_satisfied | - | - |
| 90 | op38 | GLM51 | cycle the input buffer | satisfied | not_satisfied | satisfied | - | - |
| 91 | op38 | GLM51 | xor encrypt each byte | satisfied | not_satisfied | satisfied | - | - |
| 92 | op38 | GeminiAI | cycle the input buffer | satisfied | not_satisfied | satisfied | - | - |
| 93 | op38 | GeminiAI | xor encrypt each byte | satisfied | not_satisfied | satisfied | - | - |
| 94 | op38 | KimiK26 | cycle the input buffer | satisfied | not_satisfied | satisfied | - | - |
| 95 | op38 | KimiK26 | xor encrypt each byte | satisfied | not_satisfied | satisfied | - | - |
| 96 | op38 | OpenAI | cycle the input buffer | satisfied | not_satisfied | satisfied | - | - |
| 97 | op38 | OpenAI | xor encrypt each byte | satisfied | not_satisfied | satisfied | - | - |
| 98 | op38 | gemma431Bit | cycle the input buffer | satisfied | not_satisfied | satisfied | - | - |
| 99 | op38 | gemma431Bit | xor encrypt each byte | satisfied | not_satisfied | satisfied | - | - |
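
When reading the matrix, it can help to reduce the three judge columns of a row to a majority verdict and note whether the judges disagree. The sketch below is one way to do that under stated assumptions: the helper name `summarize_judges` and the returned dictionary keys are hypothetical, and the only facts taken from the page are the verdict strings `satisfied` and `not_satisfied` and the dash for a missing verdict.

```python
from collections import Counter

MISSING = "-"  # the matrix shows a dash when a verdict is missing


def summarize_judges(verdicts):
    """Summarize one row's judge verdicts (hypothetical helper).

    Returns a dict with the majority verdict (None when there is a tie or
    no verdict at all), whether every judge gave the same verdict, and how
    many verdicts are missing.
    """
    present = [v for v in verdicts if v != MISSING]
    counts = Counter(present)
    top = counts.most_common(2)
    tie = len(top) == 2 and top[0][1] == top[1][1]
    majority = None if (not top or tie) else top[0][0]
    # Unanimous only if no verdict is missing and all present verdicts agree.
    unanimous = len(present) == len(verdicts) and len(counts) == 1
    return {
        "majority": majority,
        "unanimous": unanimous,
        "missing": len(verdicts) - len(present),
    }


# Judge columns of item 2 (ClaudeAI, GeminiAI, OpenAI judges).
split_row = summarize_judges(["satisfied", "not_satisfied", "satisfied"])
# Judge columns of item 72, where the ClaudeAI judge verdict is missing.
partial_row = summarize_judges(["-", "satisfied", "satisfied"])
```

Rows where `unanimous` is false, such as items 2 and 72 here, are the ones where a human verdict adds the most information.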

Your name is used only to track your progress and to attribute your verdicts in the output CSV.