Tagged: llm

1 post and 1 project

Posts

How I Benchmark LLMs on AL Code

An in-depth look at CentralGauge, an open-source benchmark for evaluating LLM performance on AL code generation for Business Central, covering task design, scoring methodology, and cross-model comparison results.

al, llm, benchmark, business-central, developer-tools

Projects

CentralGauge - AL Code Benchmark for LLMs

Active

An open-source benchmark for evaluating LLM performance on AL code generation for Microsoft Dynamics 365 Business Central, with 56 tasks across three difficulty tiers, real compilation, and test execution.

al, llm, benchmark, business-central, ai