AL Corpus

Active

A Rust CLI that uses tree-sitter parsing to extract structured training datasets from AL codebases and detect anti-patterns in them.

The tool parses AL codebases with tree-sitter-al and extracts structured JSONL datasets for LLM fine-tuning. It processes every .al file, extracts objects (tables, pages, codeunits, reports, enums) with full metadata, and captures every procedure and trigger along with its signature, parameters, variables, and call references. It then generates prompt/completion pairs from the procedure signatures and bodies.
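As a rough illustration of the last step, the sketch below turns one extracted procedure into a single JSONL line. It is not the tool's actual code: the `ProcedureRecord` struct, its field names, the prompt layout, and the `prompt`/`completion` keys are all assumptions made for the example, and JSON escaping is done by hand only to keep the snippet dependency-free.

```rust
/// Hypothetical record for one extracted procedure.
/// The field names are assumptions, not the tool's real schema.
#[derive(Debug, Clone)]
pub struct ProcedureRecord {
    pub object_type: String, // e.g. "codeunit"
    pub object_name: String, // e.g. "Sales Mgt"
    pub signature: String,   // e.g. "procedure GetTotal(): Decimal"
    pub body: String,        // source text of the procedure body
}

/// Escape a string so it can sit inside a JSON string literal.
fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 2);
    for ch in input.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters become \u00XX escapes.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build one JSONL line: the object context plus signature form the prompt,
/// and the procedure body is the completion.
pub fn to_jsonl_line(rec: &ProcedureRecord) -> String {
    let prompt = format!(
        "// {} \"{}\"\n{}",
        rec.object_type, rec.object_name, rec.signature
    );
    format!(
        "{{\"prompt\":\"{}\",\"completion\":\"{}\"}}",
        json_escape(&prompt),
        json_escape(&rec.body)
    )
}
```

Because every newline in the body is escaped, each record stays on exactly one line, which is what lets a downstream consumer read the file line by line.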

It also includes an anti-pattern labeler that detects ten common AL mistakes: CalcFields in loops, record operations in loops, missing SetLoadFields, unfiltered FindSet, hardcoded record IDs, and more. Each finding is flagged with a severity level.
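To make the first rule concrete, here is a deliberately simplified sketch of a "CalcFields in loops" check. The real labeler works on the tree-sitter syntax tree; this stand-in scans text tokens instead, handles only `repeat ... until` loops, and would be confused by `//` inside string literals. The `Finding` and `Severity` types, the rule name, and the choice of `Warning` are assumptions for illustration.

```rust
/// Hypothetical severity scale; the tool's actual levels are not documented here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Info,
    Warning,
    Error,
}

/// Hypothetical finding record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Finding {
    pub rule: &'static str,
    pub severity: Severity,
    pub line: usize, // 1-based line number
}

/// Flag `CalcFields` calls that appear inside a `repeat ... until` loop.
/// Token-based heuristic, not an AST walk.
pub fn find_calcfields_in_loops(source: &str) -> Vec<Finding> {
    let mut depth: usize = 0; // current repeat/until nesting depth
    let mut findings = Vec::new();

    for (idx, raw) in source.lines().enumerate() {
        // Drop line comments and lowercase, since AL keywords are case-insensitive.
        let code = raw.split("//").next().unwrap_or("").to_ascii_lowercase();

        // Walk identifier-like tokens in source order.
        for word in code.split(|c: char| !c.is_ascii_alphanumeric() && c != '_') {
            match word {
                "repeat" => depth += 1,
                "until" => depth = depth.saturating_sub(1),
                "calcfields" if depth > 0 => findings.push(Finding {
                    rule: "calcfields-in-loop",
                    severity: Severity::Warning,
                    line: idx + 1,
                }),
                _ => {}
            }
        }
    }
    findings
}
```

Working from the syntax tree avoids the false positives and negatives a text scan like this one invites, such as keywords inside strings or block comments.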

Parses 15,000+ files in under a minute. Feeds directly into al-train for fine-tuning pipelines.