Chapter 19. Profiling / Code Coverage

Table of Contents

What and why
Gathering profiling data
Annotating

What and why

Grammars tend to accumulate rules and conditions over time as exceptions and corner cases are discovered. These are very rarely removed again: they may still be useful, but nobody knows whether they really are. The profiling tools address that problem by letting you run a grammar against a large corpus and see exactly which rules and contexts are used, how often each is used (or that it is never used), and examples of the contexts in which they are used.

Gathering profiling data

When running a corpus through a grammar, the extra command-line flag --profile data.sqlite gathers code coverage, along with hit and miss data for every rule and condition, into an SQLite 3 database. Each run must use its own database, but the databases can subsequently be merged with cg-merge-annotations output.sqlite input-one.sqlite input-two.sqlite input-three.sqlite ....
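Since the profile is an ordinary SQLite 3 file, it can be inspected with any SQLite client before it is rendered. The following is a minimal sketch in Python that lists the tables in a profile database together with their row counts. It deliberately assumes nothing about the schema, which this chapter does not describe; the function name table_row_counts is hypothetical and not part of CG-3.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Return {table name: row count} for every table in the SQLite file.

    No knowledge of the profile schema is assumed: table names are read
    from SQLite's own catalogue (sqlite_master).
    """
    # Open read-only so that inspecting a profile can never modify it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        names = [
            name
            for (name,) in con.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier; double any embedded double quotes.
            quoted = '"' + name.replace('"', '""') + '"'
            (counts[name],) = con.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()
        return counts
    finally:
        con.close()
```

Calling table_row_counts("data.sqlite") on a freshly gathered profile is a quick way to check that the run actually recorded something before merging it with other runs.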

Annotating

Use cg-annotate data.sqlite /path/to/output to render the gathered data as HTML. This will create a /path/to/output/index.html file that you can open in a browser, alongside files with hit examples for each rule and context.
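The whole workflow (one profiling run per corpus part, a merge, then a render) can be sketched as a small Python helper that only builds the command lines. The --profile flag, cg-merge-annotations, and cg-annotate are taken from the text above. The binary name vislcg3, its --grammar flag, feeding each corpus part on standard input, and the file names part-N.sqlite and merged.sqlite are assumptions made for illustration, so adjust them to your installation.

```python
from pathlib import Path


def build_pipeline(grammar, corpus_parts, work_dir, html_dir):
    """Return (profile_cmds, merge_cmd, annotate_cmd).

    profile_cmds is a list of (argv, corpus_part) pairs; each corpus part
    is meant to be fed to its command on standard input (an assumption).
    Nothing is executed here; the function only builds argument lists.
    """
    work_dir = Path(work_dir)
    profile_cmds = []
    databases = []
    for index, part in enumerate(corpus_parts):
        # Each run must use its own database.
        db = str(work_dir / f"part-{index}.sqlite")
        databases.append(db)
        argv = [
            "vislcg3",                 # assumed binary name
            "--grammar", str(grammar),  # assumed flag for the grammar file
            "--profile", db,
        ]
        profile_cmds.append((argv, str(part)))

    merged = str(work_dir / "merged.sqlite")
    # Output database first, then the per-run input databases.
    merge_cmd = ["cg-merge-annotations", merged, *databases]
    # Render the merged data as HTML into html_dir.
    annotate_cmd = ["cg-annotate", merged, str(html_dir)]
    return profile_cmds, merge_cmd, annotate_cmd
```

Each returned argv list can be passed to subprocess.run, opening the corresponding corpus part as stdin for the profiling commands. Keeping the command construction separate from execution makes it easy to print and review the commands before running them against a large corpus.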

If the grammar includes other grammars, each grammar is rendered separately. In each rendering, the rules and conditions that matched are clickable and lead to a page showing an example context. The example uses # RULE TARGET BEGIN and # RULE TARGET END to mark exactly which cohort triggered the rule or condition.
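When many example pages need to be checked, the marked cohort can be pulled out programmatically. The sketch below assumes the example has already been reduced to plain text and that each marker sits on a line of its own; neither detail is confirmed by this chapter, and extract_targets is a hypothetical helper rather than part of CG-3.

```python
BEGIN = "# RULE TARGET BEGIN"
END = "# RULE TARGET END"


def extract_targets(example_text):
    """Return the text found between each BEGIN/END marker pair.

    Assumes each marker occupies its own line (an assumption about the
    example format). An END without a preceding BEGIN is ignored, and a
    BEGIN that is never closed yields nothing.
    """
    targets = []
    current = None  # None while outside a marked region
    for line in example_text.splitlines():
        stripped = line.strip()
        if stripped == BEGIN:
            current = []
        elif stripped == END:
            if current is not None:
                targets.append("\n".join(current))
            current = None
        elif current is not None:
            current.append(line)
    return targets
```

The function returns a list because nothing here guarantees that a page contains only one marked region; for a page with a single example the list has one element.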