Pull requests: Aleph-Alpha-Research/eval-framework

Pull requests list

fix(deps): update dependency numpy to >=2.5.0
#410 opened Jun 25, 2026 by aar-public-version-bump-bot Bot
1 task
fix: fix bugs in new SQUAD ma task
#408 opened Jun 24, 2026 by AuguB Contributor
fix(deps): update dependency openai to >=1.109.1,<3
#397 opened Jun 23, 2026 by aar-public-version-bump-bot Bot
1 task
refactor!: implement task naming convention
#379 opened Jun 19, 2026 by fsschneider Collaborator Draft
7 of 9 tasks
chore: release v1.0.0
#375 opened Jun 18, 2026 by martinreinhardt01 Collaborator
12 tasks
chore: manual upgrade of hf ecosystem packages
#362 opened Jun 13, 2026 by prabhuteja12 Collaborator Draft
12 tasks
chore(deps): lock file maintenance
#342 opened Jun 5, 2026 by aar-public-version-bump-bot Bot
1 task
chore(deps): update python to v3.14.5
#337 opened Jun 5, 2026 by aar-public-version-bump-bot Bot
1 task
refactor: Make linter stricter
#233 opened May 7, 2026 by martinreinhardt01 Collaborator
12 tasks
fix: minerva sympy memory limit
#223 opened Apr 20, 2026 by prabhuteja12 Collaborator Draft
4 of 12 tasks
feat: bpb implementations
#212 opened Apr 9, 2026 by prabhuteja12 Collaborator
12 tasks
Update citation year and add version+author to README
#159 opened Jan 26, 2026 by tfburns Collaborator
1 task done
chore: Bump pyasn1 from 0.6.1 to 0.6.2 in the uv group across 1 directory (labels: dependencies, python:uv)
#157 opened Jan 16, 2026 by dependabot Bot
docs: add LLM as judge guide
#151 opened Jan 12, 2026 by AhmedHammam-AA Collaborator
fix(main): duplicated task that are actually the same
#144 opened Jan 7, 2026 by benureau
3 of 13 tasks
fix(wmt): use HuggingFace datasets instead of sacrebleu
#137 opened Dec 19, 2025 by AhmedHammam-AA Collaborator
Remove leading space in ground truth formatting
#129 opened Dec 10, 2025 by SohirMaskey
3 of 13 tasks
hardcoded date for consistent evals
#99 opened Nov 4, 2025 by GrS-AA Collaborator Draft
13 tasks