Revieko — Report Fields Reference (Summary / Full report / JSON)

This document is a field reference for Revieko reports: what each field means, where it appears (PR comment / Full report / JSON), how to interpret it, and common reasons behind “weird” values.

Read first: docs/PR_REVIEW_QUICKSTART.md (the 3–5 minute review path).

This document is strictly a field dictionary.

0) Context: what “reports” exist

In a GitHub PR, there are usually two output levels:

PR comment (Summary)

Short: statuses, top-N hotspots, per-file structural risk, links to the Full report.

Full report (Markdown / JSON)

A complete artifact: tables, additional sections, internal/service fields, analysis limits.

1) Notation and conventions

1.1 Risks and scales

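  • *_risk ∈ [0,100]: any field whose name ends in _risk is a normalized score (good for thresholds and “traffic lights”).

  • *_risk_level ∈ {low, medium, high}: any field whose name ends in _risk_level is a discretized level for quick decisions.

  • *_density ∈ [0,1]: any field whose name ends in _density is the fraction of “signal” tokens/effects in a chosen area.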
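The suffix conventions above lend themselves to a quick sanity check when consuming a report. The sketch below is illustrative only: check_field is a hypothetical helper, not part of Revieko, and it assumes that per-file mappings (path → value) follow the same range as their scalar counterparts.

```python
RISK_LEVELS = {"low", "medium", "high"}


def check_field(name, value):
    """Return True if value fits the range implied by the field-name suffix."""
    # Per-file mappings (path -> value) are checked entry by entry (assumption).
    if isinstance(value, dict):
        return all(check_field(name, item) for item in value.values())
    # Test the longer suffix first: "_risk_level" does not end in "_risk",
    # but keeping this order makes the intent explicit.
    if name.endswith("_risk_level"):
        return value in RISK_LEVELS
    if name.endswith("_risk"):
        return isinstance(value, (int, float)) and 0 <= value <= 100
    if name.endswith("_density"):
        return isinstance(value, (int, float)) and 0 <= value <= 1
    return True  # no suffix convention applies to this field


print(check_field("struct_risk", 69.49))
print(check_field("control_risk_level", "low"))
```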

1.2 Locations and line numbers

In Markdown reports, hotspot locations are usually:

path/to/file.py:LINE_START-LINE_END

In JSON, typically:

  • file_path

  • line_start / line_end (1-based)

In PR mode, lines refer to the new version of the file (after applying the diff).
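A minimal sketch of turning the Markdown location string into the JSON-style triple. parse_location and Location are hypothetical names used only for illustration; the sketch assumes the string always has the path:START-END shape shown above.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    file_path: str
    line_start: int  # 1-based, inclusive
    line_end: int    # 1-based, inclusive


# The greedy path group keeps any earlier colons inside the path itself.
_LOCATION_RE = re.compile(r"^(?P<path>.+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_location(text):
    """Parse 'path/to/file.py:START-END' into a Location."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a hotspot location: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start < 1 or end < start:
        raise ValueError(f"invalid line range in: {text!r}")
    return Location(match["path"], start, end)


print(parse_location("path/to/file.py:12-30"))
```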

1.3 Fields may be empty

The Effects / Taint / Control columns in hotspots may be empty if:

  • that channel is disabled in the config/version,

  • heuristics found nothing in that specific window,

  • analysis is partial and that chunk wasn’t analyzed.

2) Summary (PR comment): fields and meaning

A typical header:

Status: High risk
struct_risk: 69.49
hotspots: 10
ci_status: warn
control_risk_level: low
analysis: full|partial

Below is the breakdown.

2.1 Status (human-readable status)

Where: Summary / Full report (Markdown)
Type: string (usually “Low/Medium/High risk”)
Meaning: a human-friendly risk/policy label.

In practice, Status usually aligns with risk_level and ci_status, but it is only a display label; for thresholds or gating, use those two fields instead.

2.2 risk_level

Where: Full report (Markdown), often in JSON (depends on CLI/integration mode)
Type: enum: low | medium | high
Meaning: a discrete structural-risk level (for the “dig deeper or not” decision).

Typical interpretation:

  • low: structurally similar to the repo’s typical code.

  • medium: review hotspots.

  • high: noticeable structural anomaly — almost always warrants manual inspection.

2.3 struct_risk

Where: Summary / Full report (Markdown)
Type: float 0..100
Meaning: global structural risk for the analyzed object (PR/diff).

Notes:

  • In PR reports, this is the overall risk for the diff.

  • File-level detail is in Per-file structural risk.

  • In JSON, the equivalent is usually global_struct_risk.

2.4 hotspots (count)

Where: Summary (short comment)
Type: int
Meaning: how many hotspots are displayed in this comment.

Important: this is not “coverage” and not “total anomalies”. It’s the number of printed hotspots (top-N), limited by output policy.

2.5 ci_status

Where: Summary / Full report (Markdown), sometimes JSON
Type: enum: ok | warn | fail
Meaning: CI/integration signal (how to display/gate).

Typical policy:

  • ok: no thresholds exceeded.

  • warn: there’s something to look at (soft gate).

  • fail: strict mode (hard gate) — depends on configuration.
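One way a CI wrapper might translate ci_status into a process exit code is sketched below. The function name and the strict flag are assumptions made for illustration; the actual gating behavior depends on how the integration is configured.

```python
def exit_code_for(ci_status, strict=False):
    """Map a ci_status value to a process exit code (0 = pass, 1 = block).

    The strict flag is hypothetical: it promotes "warn" to a hard gate.
    """
    if ci_status == "ok":
        return 0
    if ci_status == "warn":
        return 1 if strict else 0  # soft gate unless strict mode is requested
    if ci_status == "fail":
        return 1
    raise ValueError(f"unknown ci_status: {ci_status!r}")


print(exit_code_for("warn"))
print(exit_code_for("warn", strict=True))
```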

2.6 control_risk_level

Where: Summary / Full report (Markdown), JSON (if Control channel is enabled)
Type: enum: low | medium | high
Meaning: discrete level of “control complexity” (conditions / error paths / return tails).

Intuition:

  • low: branching/errors/return tails look normal.

  • high: conditions, error branches, and/or return tails are unusually dense or complex → hard to read, easy to get wrong.

2.7 analysis: full|partial

Where: Summary / Full report (Markdown)
Type: enum: full | partial
Meaning: coverage of supported file types.

  • full → every file changed in the PR is of a supported type and was analyzed (currently: Python).

  • partial → the PR includes changes in files CodeGuard doesn’t analyze (e.g., YAML/CSV/Markdown). Python files are still analyzed fully; “partial” refers to the PR as a whole.

Practical notes:

  • Don’t read partial as “Python was only half analyzed”.

  • Read partial as “some of the PR lies outside analysis scope and must be reviewed manually.”
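The header fields covered in this section can be read into a dictionary for scripting. This is a sketch under one assumption: the header arrives as plain "key: value" lines like the typical header shown earlier. A real PR comment may carry extra Markdown formatting that would need stripping first. parse_summary_header is a hypothetical helper.

```python
def parse_summary_header(text):
    """Parse 'key: value' header lines into a dict with typed numeric fields."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not key/value pairs
        fields[key.strip()] = value.strip()
    # Convert the two numeric fields; everything else stays a string.
    if "struct_risk" in fields:
        fields["struct_risk"] = float(fields["struct_risk"])
    if "hotspots" in fields:
        fields["hotspots"] = int(fields["hotspots"])
    return fields


header = """Status: High risk
struct_risk: 69.49
hotspots: 10
ci_status: warn
control_risk_level: low
analysis: partial"""

fields = parse_summary_header(header)
# "partial" means some changed files fall outside the analysis scope.
needs_manual_pass = fields.get("analysis") == "partial"
print(fields["struct_risk"], fields["hotspots"], needs_manual_pass)
```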

3) Per-file structural risk (table)

Where: Summary / Full report (Markdown)
Meaning: each file’s contribution to overall structural risk.

Columns:

  • File — path

  • Risk — file structural risk (0..100)

How to read:

  • “One file carries everything” → start there.

  • Risk is spread out → the PR likely changes a layer/style in multiple places.

JSON equivalent: file_struct_risk: { "path": number, ... }
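The two reading patterns above can be approximated from file_struct_risk. The sketch below is a rough heuristic, not Revieko's own logic: the dominance threshold of 0.6 and the idea of comparing the top file to the sum of all risks are assumptions chosen for illustration.

```python
def rank_files(file_struct_risk):
    """Return (path, risk) pairs, highest risk first."""
    return sorted(file_struct_risk.items(), key=lambda item: item[1], reverse=True)


def is_concentrated(file_struct_risk, dominance=0.6):
    """True if a single file accounts for most of the summed per-file risk.

    dominance is an assumed cut-off, not a documented threshold.
    """
    total = sum(file_struct_risk.values())
    if total <= 0:
        return False
    return max(file_struct_risk.values()) / total >= dominance


risks = {"api/handlers.py": 80.0, "api/utils.py": 5.0, "tests/test_api.py": 3.0}
print(rank_files(risks)[0])
print(is_concentrated(risks))
```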

4) Hotspots: per-row fields and how to interpret them

A Hotspot is a local line range that looks atypical.

4.1 Hotspot fields in Markdown (table)

A typical row:

| Location | Kind | Score | Effects | Taint | Control |

Column breakdown:

Location
Path and line range: file.py:START-END

Kind
Type: enum. Common values:

  • struct_pattern_break — a break in the typical structural pattern in this window (unusual IF/LOOP/RETURN markers, new “patterns”, style shift).

  • depth_spike — locally abnormal nesting depth.

  • control_outlier — window is overheated in control regimes (conditions/errors/return tails).

  • mixed — no single dominant factor, but the window still deviates.

Score
Type: float (ranking)
Meaning: review priority (higher → earlier in the list).

Important:

  • score is not bug probability and not an absolute “quality score”.

  • It exists to sort hotspots within a report.

Effects
Type: string / label / short hint
Meaning: semantic hint about side effects (if Semantic channel is enabled and something is found).

Typical effects (token/fragment level):
LOG, DB, FILE_IO, NET_IO, EXEC, ...

Taint
Type: string / label / short hint
Meaning: nearby data sources (if enabled), e.g.:

  • USER_INPUT — input data

  • SECRET — potential secrets

  • CONST — constants

Control
Type: string / label / short hint
Meaning: control context for the window, e.g.:
BRANCH_COND, ERROR_PATH, RETURN_PATH, INIT_PATH, CLEANUP_PATH, NORMAL

4.2 Hotspot fields in JSON

Typically:

  • file_path: string

  • line_start, line_end: int (1-based)

  • segment_id: string (e.g., hunk_0)

  • segment_kind: enum (diff_hunk, full_file, …)

  • kind: enum

  • score: float

  • (optional) effect_hint, taint_hint, control_hint
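A sketch of loading the JSON hotspot entries listed above and ordering them by review priority. The Hotspot dataclass and load_hotspots are hypothetical; the sketch assumes the entries arrive as a list of objects with these field names and silently ignores any extra keys.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Hotspot:
    file_path: str
    line_start: int  # 1-based
    line_end: int    # 1-based
    kind: str
    score: float
    segment_id: Optional[str] = None
    segment_kind: Optional[str] = None
    effect_hint: Optional[str] = None
    taint_hint: Optional[str] = None
    control_hint: Optional[str] = None


def load_hotspots(raw):
    """Build Hotspot objects from JSON dicts, sorted by score (highest first)."""
    known = {f.name for f in fields(Hotspot)}
    hotspots = [
        Hotspot(**{key: value for key, value in item.items() if key in known})
        for item in raw
    ]
    # score is a ranking value only; it orders hotspots within one report.
    return sorted(hotspots, key=lambda h: h.score, reverse=True)


raw = [
    {"file_path": "a.py", "line_start": 10, "line_end": 24,
     "kind": "depth_spike", "score": 0.41, "segment_id": "hunk_0"},
    {"file_path": "b.py", "line_start": 3, "line_end": 9,
     "kind": "control_outlier", "score": 0.87, "control_hint": "ERROR_PATH"},
]
for hotspot in load_hotspots(raw):
    print(f"{hotspot.file_path}:{hotspot.line_start}-{hotspot.line_end}", hotspot.kind)
```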

5) Control summary: table and fields

Where: Full report (Markdown) and JSON (if Control channel is enabled)

5.1 Control summary (Markdown)

A PR report may include a table:

| File | overall | branch_cond | error_path | return_path |

Meaning:

  • overall — overall control risk for the file (often aggregate/max).

  • branch_cond — branching/conditions risk.

  • error_path — error-branch risk.

  • return_path — “return/raise tail” risk.

5.2 Control summary (JSON): typical fields in control_summary.per_file[path]

A common “per-file aggregates” model includes:

Shares of control regimes

  • control_share_branch_cond

  • control_share_error_path

  • control_share_return_path

  • control_share_init_path

  • control_share_cleanup_path

Complexity metrics by regime

  • control_*_complexity_mean

  • control_*_complexity_max

Cross metrics: control × effects

  • branch_cond_effect_density

  • error_path_effect_density

  • return_path_effect_density

Control risks (0..100)

  • branch_cond_risk

  • error_path_risk

  • return_path_risk

  • overall_control_risk

5.3 How to read a “strong signal” in the Control layer

Examples:

  • High branch_cond_risk + branch_cond_effect_density > 0 → conditions contain effects such as FILE_IO/NET_IO/DB/LOG (often suspicious and hard to review).

  • High error_path_risk + high error_path_effect_density → error handlers perform many effects.

  • High return_path_risk + high return_path_effect_density → effectful logic right before returning (changes behavior at exit).
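The three examples above can be sketched as a check over one file's entry in control_summary.per_file. The cut-offs for “high” (HIGH_RISK and HIGH_DENSITY) are assumptions picked for illustration; the document does not define numeric thresholds.

```python
HIGH_RISK = 70.0     # assumed cut-off for a "high" risk value (0..100)
HIGH_DENSITY = 0.5   # assumed cut-off for a "high" effect density (0..1)


def strong_control_signals(metrics):
    """Return the control regimes that show a strong signal for one file."""
    signals = []
    # Conditions: any effect inside a high-risk condition counts.
    if (metrics.get("branch_cond_risk", 0) >= HIGH_RISK
            and metrics.get("branch_cond_effect_density", 0) > 0):
        signals.append("branch_cond")
    # Error and return paths: both risk and effect density must be high.
    for regime in ("error_path", "return_path"):
        if (metrics.get(f"{regime}_risk", 0) >= HIGH_RISK
                and metrics.get(f"{regime}_effect_density", 0) >= HIGH_DENSITY):
            signals.append(regime)
    return signals


metrics = {
    "branch_cond_risk": 82, "branch_cond_effect_density": 0.1,
    "error_path_risk": 75, "error_path_effect_density": 0.2,
    "return_path_risk": 90, "return_path_effect_density": 0.7,
}
print(strong_control_signals(metrics))
```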

6) Semantic layer: effect_summary, file_semantic_risk, taint

If the Semantic channel is enabled, JSON (and sometimes Markdown) includes effect/flow-related fields.

6.1 effect_summary

Where: JSON, sometimes Full report (Markdown)
Meaning: aggregates for effects/taint.

effect_summary.per_file[path] (per file)
Typical metrics:

  • effect_density_tail ∈ [0,1]
    Share of effectful tokens (NET_IO/DB/FILE_IO/EXEC/LOG) in the tail of the file/segment.
    Intuition: closer to 1 → the tail is densely effectful.

  • dangerous_flow_score ∈ [0,1]
    Coarse signal that USER_INPUT occurs near effects (NET_IO/DB/EXEC/LOG).
    > 0 → at least one potentially risky neighboring flow was detected.

  • secret_leak_score ∈ [0,1]
    Coarse signal that SECRET appears near LOG / NET_IO.
    > 0 → potential secret leakage.

effect_summary.global
Maxima/aggregates across all report files — quick answer to “is there any effect activity at all?”
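A minimal sketch of scanning effect_summary.per_file for the two “> 0” signals described above. flag_semantic_signals is a hypothetical helper, and the sketch assumes effect_summary is a dict with a per_file mapping of path → metrics.

```python
def flag_semantic_signals(effect_summary):
    """Return {path: [reasons]} for files with a non-zero flow or leak score."""
    flagged = {}
    for path, metrics in effect_summary.get("per_file", {}).items():
        reasons = []
        if metrics.get("dangerous_flow_score", 0) > 0:
            reasons.append("dangerous_flow")  # USER_INPUT seen near effects
        if metrics.get("secret_leak_score", 0) > 0:
            reasons.append("secret_leak")     # SECRET seen near LOG / NET_IO
        if reasons:
            flagged[path] = reasons
    return flagged


effect_summary = {
    "per_file": {
        "app/views.py": {"effect_density_tail": 0.4, "dangerous_flow_score": 0.3,
                         "secret_leak_score": 0.0},
        "app/models.py": {"effect_density_tail": 0.1, "dangerous_flow_score": 0.0,
                          "secret_leak_score": 0.0},
    }
}
print(flag_semantic_signals(effect_summary))
```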

6.2 file_semantic_risk and file_semantic_risk_level

Where: JSON
Type:

  • file_semantic_risk[path] ∈ [0,100]

  • file_semantic_risk_level[path] ∈ {low, medium, high}

Meaning: a repo-aware semantic risk estimate for the file (effect tail + suspicious flow + historical adjustment).

Common prioritization rule:
High struct_risk and high file_semantic_risk for the same file → that file almost always deserves first-pass manual review.
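The prioritization rule can be sketched as a filter over the JSON report. first_pass_files is a hypothetical helper; the struct_threshold of 70 is an assumed stand-in for “high” structural risk, while the semantic side uses the documented file_semantic_risk_level value.

```python
def first_pass_files(report, struct_threshold=70.0):
    """Files with high structural risk AND a 'high' semantic risk level."""
    struct = report.get("file_struct_risk", {})
    levels = report.get("file_semantic_risk_level", {})
    picked = [
        path for path, risk in struct.items()
        if risk >= struct_threshold and levels.get(path) == "high"
    ]
    # Highest structural risk first, so the reviewer starts at the top.
    return sorted(picked, key=lambda path: struct[path], reverse=True)


report = {
    "file_struct_risk": {"a.py": 88.0, "b.py": 74.0, "c.py": 91.0},
    "file_semantic_risk_level": {"a.py": "high", "b.py": "low", "c.py": "high"},
}
print(first_pass_files(report))
```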

6.3 taint (how to interpret)

Taint labels are usually not “security verdicts”; they describe nearby data sources:

  • USER_INPUT → track where the value goes next (especially near NET_IO/DB/EXEC/LOG).

  • SECRET → ensure it isn’t logged or sent over the network.

7) Invariants / rule violations (if rules are enabled)

If team rules (semantic invariants) are configured, the report may include:

  • invariant_violations: a list of violations

    • rule/id

    • location

    • short explanation

This is the most straightforward layer:
a rule is violated → here is the spot → fix it or explicitly exempt it.

8) Coverage & scope: what “partial analysis” means

8.1 Why a report can be partial

analysis=partial means the PR includes files outside the analysis profile (currently CodeGuard is focused on Python). Examples: YAML, CSV, Markdown, assets, data, configs, etc.

8.2 How to read partial correctly

  • For the Python part: the report is complete (per-file risk, hotspots, etc.).

  • For out-of-profile files: CodeGuard is silent because it doesn’t analyze them.

8.3 What the reviewer should do

  • Review Python via the report (start with top hotspots).

  • Review out-of-profile files manually: format correctness, backward compatibility, migrations, project standards, secret safety in configs, etc.
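To plan the manual part of the review, the changed files can be split by scope. The sketch below assumes that “Python” means files with a .py extension, which is a simplification; split_by_scope is a hypothetical helper, and the list of changed files would come from the PR itself.

```python
from pathlib import PurePosixPath


def split_by_scope(changed_files):
    """Split changed paths into (analyzed, manual_review) lists."""
    analyzed, manual_review = [], []
    for path in changed_files:
        # Assumption: the analysis profile is identified by the .py suffix.
        if PurePosixPath(path).suffix == ".py":
            analyzed.append(path)
        else:
            manual_review.append(path)
    return analyzed, manual_review


analyzed, manual_review = split_by_scope(
    ["src/app.py", "deploy/config.yaml", "docs/README.md", "data/seed.csv"]
)
print("via report:", analyzed)
print("manually:", manual_review)
```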

8.4 Product note (if relevant)

If the project needs YAML/Markdown/CSV to be covered by the “automatic radar” as well, that is a separate roadmap item: adding analyzers for new file types/languages and expanding the analysis scope.

9) Metadata and service fields

Depending on integration/version, you may see:

  • Generated / timestamp (Markdown header)

  • repo_root, pr_number (JSON)

  • Markdown / JSON links (in Summary)

  • expires_at_unix_s (in Summary, if Full report links are time-limited): Unix time (seconds) when the tokenized link/artifact may expire.
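A short sketch of interpreting expires_at_unix_s. link_expiry is a hypothetical helper; it assumes the value is Unix time in seconds, as described above, and compares it against the current UTC time.

```python
from datetime import datetime, timezone


def link_expiry(expires_at_unix_s, now=None):
    """Return (expiry as an aware UTC datetime, whether it has passed)."""
    expires_at = datetime.fromtimestamp(expires_at_unix_s, tz=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return expires_at, current >= expires_at


expires_at, expired = link_expiry(1_700_000_000)
print(expires_at.isoformat(), "expired" if expired else "still valid")
```

If the link has expired, the Full report would have to be obtained again through whatever mechanism the integration provides.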

10) Name mapping: Markdown ↔ JSON (common confusion)

  • Markdown struct_risk (PR Summary / Full report)
    ≈ JSON global_struct_risk

  • Markdown “Per-file structural risk”
    ≈ JSON file_struct_risk

  • Markdown “Hotspots” (top-N table)
    ≈ JSON hotspots (may be wider/richer in the full artifact)

  • Markdown analysis: full|partial
    ← derived from analysis_limits + fallback/limitations

11) Quick answers to “weird” values

Why is control_summary 0.0 everywhere?

  • Control channel is disabled, or

  • control regimes don’t stand out in this PR, or

  • analysis is partial and the relevant chunks weren’t analyzed.

Why are Effects/Taint empty in hotspots?

  • Semantic channel is disabled, or

  • heuristics found no signals in this window, or

  • “reviewer mode” is showing only structural hotspots.

Why are there “only 10 hotspots”, but I’m sure there are more anomalies?
That’s the top-N output limit in the PR comment. Open the Full report (Markdown/JSON).

© 2026 • Build systems that reconstruct the structure of reality