Parsers
Parsers convert file contents into JSON messages for the CDviz pipeline.
Where Parsers Are Used
- send command: Submit CI/CD artifacts directly from pipelines (--input-parser)
- OpenDAL source: Parse files from filesystem or cloud storage (parser = "...")
- transform command: Batch file processing (auto-detects format)
Quick Reference
| Parser | Format | Output | Auto-detected extensions | Use Case |
|---|---|---|---|---|
| auto | Auto-detect | Varies | - | Default; detects by file extension |
| json | JSON | 1 message | .json | Single JSON object per file |
| jsonl | JSON Lines | N messages | .jsonl, .ndjson | One message per line |
| csv_row | CSV | N messages | .csv | One message per row (header as keys) |
| text | Plain text | 1 message | - | Entire file as {"text": "..."} |
| text_line | Plain text | N messages | .txt, .log | One message per non-empty line |
| xml | XML | 1 message | .xml (feature flag) | XML converted to JSON structure |
| tap | TAP format | 1 message | .tap (feature flag) | Test Anything Protocol results |
| metadata | Any | 1 message | - | File metadata only (no content) |
Feature Flags
xml and tap require feature flags in the collector build:
cargo install cdviz-collector --features parser_xml,parser_tap
Built-in (always available): json, jsonl, csv_row, text, text_line, metadata, auto
Parsers and Transformers
Parsers produce an intermediate message — the body is raw parsed content (text, CSV row, XML-as-JSON, etc.), not a CDEvent. A transformer is required to map that body to a valid CDEvent before delivery to a sink.
Exception: json and jsonl parsers can be used directly when the source files are already valid CDEvents.
CLI Usage
# Source file is already a CDEvent — no transformer needed
cdviz-collector send --data @event.json --input-parser json --url $CDVIZ_URL
# Non-CDEvents input — provide a config with a transformer
cdviz-collector send --data @junit.xml --input-parser xml --config pipeline.toml --url $CDVIZ_URL
# Wrap a process: capture exit code and result files as CDEvents
cdviz-collector send --run testsuiterun_junit --url $CDVIZ_URL -- pytest --junit-xml=TEST-results.xml
Parser Reference
auto
parser = "auto" # default, can be omitted
Detects the format by file extension (see the Quick Reference table). Falls back to text_line for unknown extensions. Never selects metadata, which must be specified explicitly.
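The detection rule can be sketched in a few lines of Python. This is only an illustration of the behavior described here, not the collector's implementation: `detect_parser` is a hypothetical helper, the mapping is copied from the Quick Reference table, and two details are assumptions: that extensions are matched case-insensitively, and that `.xml` or `.tap` files fall back to text_line when the matching feature flag is absent.

```python
from pathlib import PurePosixPath

# Extension-to-parser mapping copied from the Quick Reference table.
BUILTIN = {
    ".json": "json",
    ".jsonl": "jsonl",
    ".ndjson": "jsonl",
    ".csv": "csv_row",
    ".txt": "text_line",
    ".log": "text_line",
}
# These two are only auto-detected when the feature flag is compiled in.
FEATURE_GATED = {".xml": "xml", ".tap": "tap"}


def detect_parser(path: str, features: frozenset = frozenset()) -> str:
    """Pick a parser name from the file extension (hypothetical helper)."""
    ext = PurePosixPath(path).suffix.lower()  # case-insensitivity is assumed
    if ext in BUILTIN:
        return BUILTIN[ext]
    if ext in FEATURE_GATED and f"parser_{FEATURE_GATED[ext]}" in features:
        return FEATURE_GATED[ext]
    # Unknown extensions fall back to text_line; metadata is never chosen.
    return "text_line"
```

For example, `detect_parser("out/events.ndjson")` yields `"jsonl"`, while a file without an extension yields `"text_line"`.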
[sources.ci_outputs.extractor]
type = "opendal"
kind = "fs"
polling_interval = "30s"
recursive = true
path_patterns = ["**/*.json", "**/*.xml", "**/*.tap", "**/*.log", "**/*.csv"]
parser = "auto"
parameters = { root = "/var/ci/outputs" }
json
parser = "json"
- Entire file → 1 message; body is the parsed JSON object
- Fails if the file contains invalid JSON or multiple JSON objects
[sources.cdevents_json.extractor]
type = "opendal"
kind = "fs"
polling_interval = "10s"
path_patterns = ["**/*.json"]
parser = "json"
parameters = { root = "./cdevents" }
jsonl
parser = "jsonl"
- Each non-empty line → 1 message; empty lines are skipped
- Each line must be individually valid JSON
- Use directly with send only when each line is already a valid CDEvent; otherwise use with a transformer
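The difference between json and jsonl can be illustrated with a short Python sketch. It mirrors the rules listed for the two parsers but is not the collector's code: `parse_json` and `parse_jsonl` are hypothetical names, and treating whitespace-only lines as empty and aborting on the first invalid line are assumptions.

```python
import json


def parse_json(content: str) -> list:
    """Whole file is one JSON value; extra objects raise an error."""
    return [json.loads(content)]


def parse_jsonl(content: str) -> list:
    """Return one parsed body per non-empty line."""
    bodies = []
    for number, line in enumerate(content.splitlines(), start=1):
        if not line.strip():
            continue  # empty lines are skipped
        try:
            bodies.append(json.loads(line))
        except json.JSONDecodeError as err:
            # Each line must be valid JSON on its own.
            raise ValueError(f"line {number} is not valid JSON: {err}") from err
    return bodies
```

A file holding two objects on two lines produces two messages with `parse_jsonl`, whereas `parse_json` rejects the same content because it contains more than one JSON object.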
csv_row
parser = "csv_row"
- First row is the header; column names become JSON keys
- Each data row → 1 message; all values are strings
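As a rough illustration of the message bodies this produces, the following Python sketch uses the standard `csv` module. `parse_csv_rows` is a hypothetical helper and the column names are made up for the example; the collector's own CSV dialect handling may differ.

```python
import csv
import io


def parse_csv_rows(content: str) -> list:
    """One dict per data row, keyed by the header row; values stay strings."""
    reader = csv.DictReader(io.StringIO(content))
    return [dict(row) for row in reader]


sample = (
    "service,version,replicas\n"
    "api,1.4.2,3\n"
    "web,2.0.0,5\n"
)
rows = parse_csv_rows(sample)
```

Here `rows[0]` is `{"service": "api", "version": "1.4.2", "replicas": "3"}`. Note that `replicas` is the string `"3"`, not a number, so any numeric conversion belongs in the transformer.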
[sources.release_exports.extractor]
type = "opendal"
kind = "s3"
polling_interval = "15m"
path_patterns = ["releases/**/*.csv"]
parser = "csv_row"
[sources.release_exports.extractor.parameters]
bucket = "company-release-exports"
region = "us-east-1"
Map CSV columns to CDEvents with a VRL transformer:
.context.type = "dev.cdevents.service.deployed.0.3.0"
.context.source = "csv-import"
.context.timestamp = .body.timestamp
.subject.id = .body.service
.subject.content.artifactId = "pkg:generic/" + .body.service + "@" + .body.version
text
parser = "text"
- Entire file → 1 message with body {"text": "...file content..."}
- Preserves all whitespace and line breaks; use text_line for per-line messages
- Requires a transformer to map body.text to a CDEvent
[sources.build_logs.extractor]
type = "opendal"
kind = "fs"
polling_interval = "1m"
path_patterns = ["**/build.log"]
parser = "text"
parameters = { root = "/var/ci/logs" }
text_line
parser = "text_line"
- Each non-empty line → 1 message with body {"text": "...line content..."}
- Selected by auto for .txt and .log files, and used as the fallback for unknown extensions
- Requires a transformer to map each line to a CDEvent
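The contrast between text and text_line is easiest to see side by side. The Python sketch below is illustrative only: both function names are hypothetical, and it assumes that whitespace-only lines count as empty and that line terminators are not part of the line body.

```python
def parse_text(content: str) -> list:
    """Whole file becomes a single body; whitespace is preserved."""
    return [{"text": content}]


def parse_text_line(content: str) -> list:
    """One body per non-empty line."""
    return [{"text": line} for line in content.splitlines() if line.strip()]


log = "step 1 ok\n\nstep 2 failed\n"
whole = parse_text(log)
lines = parse_text_line(log)
```

With this input, `whole` holds one body containing the full log including the blank line, while `lines` holds two bodies: `{"text": "step 1 ok"}` and `{"text": "step 2 failed"}`.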
Parse structured fields from each line in a VRL transformer:
# Parse ESLint format: "path:line:col - Severity: message"
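# parse_regex returns an object keyed by named capture groups
parts = parse_regex!(.body.text, r'^(?P<file>.+?):(?P<line>\d+):(?P<col>\d+) - (?P<severity>Error|Warning): (?P<message>.+)$')
.subject.content = {
  "file": parts.file,
  "line": to_int!(parts.line),
  "severity": parts.severity,
  "message": parts.message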
}
xml
parser = "xml"
IMPORTANT: Requires the parser_xml feature flag.
- Entire XML document → 1 message; attributes are prefixed with @, text content is stored as #text
- Commonly used for JUnit test reports (Jest, pytest, Maven Surefire, PHPUnit, xUnit)
- Requires a transformer (via --config or a configured source) to map the XML-as-JSON body to a CDEvent
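To make the `@` and `#text` conventions concrete, here is a minimal Python sketch of an XML-to-JSON conversion built on the standard library. It is not the collector's converter: `element_to_json` and `parse_xml` are hypothetical names, and several details are assumptions, namely that repeated child tags collapse into a list, that attribute values stay strings, and that an element with only text becomes a plain string.

```python
import xml.etree.ElementTree as ET


def element_to_json(elem):
    """Convert one element: attributes get an '@' prefix, text goes to '#text'."""
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_json(child)
        if child.tag in node:
            # Repeated tags collapse into a list (assumed convention).
            existing = node[child.tag]
            if not isinstance(existing, list):
                node[child.tag] = [existing]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if text:
        if not node:
            return text  # element with text only (assumed convention)
        node["#text"] = text
    return node


def parse_xml(content: str) -> dict:
    """Whole document becomes one body, keyed by the root tag."""
    root = ET.fromstring(content)
    return {root.tag: element_to_json(root)}


report = (
    '<testsuite name="unit" failures="1" errors="0">'
    '<testcase name="a"/>'
    '<testcase name="b"><failure message="boom">trace</failure></testcase>'
    "</testsuite>"
)
body = parse_xml(report)
```

In this sketch `body["testsuite"]["@failures"]` is the string `"1"`, which is why a transformer has to convert such values before doing arithmetic on them.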
# Send a JUnit report with a transformer config
cdviz-collector send --data @test-results.xml --input-parser xml --config pipeline.toml --url $CDVIZ_URL
# Or wrap the test runner — captures exit code and result files
cdviz-collector send --run testsuiterun_junit --url $CDVIZ_URL -- pytest --junit-xml=TEST-results.xml
Map JUnit results to a CDEvents test run with VRL:
suite = .body.testsuite
.context.type = "dev.cdevents.testsuiterun.finished.0.2.0"
.subject.content.outcome = if to_int!(suite["@failures"]) + to_int!(suite["@errors"]) > 0 {
  "failure"
} else {
  "pass"
}
.subject.content.testSuiteName = suite["@name"]
tap
parser = "tap"
IMPORTANT: Requires the parser_tap feature flag.
- Entire TAP file → 1 message with version, plan, and tests array
- Supports TAP 13: ok/not ok results, YAML diagnostic blocks, skip/todo directives
- TAP producers are available for JavaScript (tap, node-tap), Perl, Ruby, Go, and Rust
- Requires a transformer (via --config or a configured source) to map the parsed body to a CDEvent
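The shape of a parsed TAP body can be approximated with a small Python sketch. It is a simplified illustration, not the collector's parser: `parse_tap` is a hypothetical function, YAML diagnostic blocks are skipped rather than parsed, and apart from the top-level `version`, `plan`, and `tests` keys and the per-test `ok` flag, all field names (`number`, `description`, `directive`, `reason`, and the `start`/`end` plan keys) are assumptions.

```python
import re

VERSION = re.compile(r"^TAP version (\d+)$")
PLAN = re.compile(r"^(\d+)\.\.(\d+)")
RESULT = re.compile(r"^(not ok|ok)\b\s*(\d+)?\s*-?\s*(.*)$")
DIRECTIVE = re.compile(r"#\s*(skip|todo)\b\s*(.*)$", re.IGNORECASE)


def parse_tap(content: str) -> dict:
    """Collect version, plan, and test results from a TAP 13 stream."""
    body = {"version": None, "plan": None, "tests": []}
    for line in content.splitlines():
        line = line.rstrip()
        if m := VERSION.match(line):
            body["version"] = int(m.group(1))
        elif m := PLAN.match(line):
            body["plan"] = {"start": int(m.group(1)), "end": int(m.group(2))}
        elif m := RESULT.match(line):
            status, number, rest = m.groups()
            test = {
                "ok": status == "ok",
                "number": int(number) if number else None,
                "description": rest.strip(),
            }
            if d := DIRECTIVE.search(rest):
                # Split "name # SKIP reason" into description and directive.
                test["description"] = rest[: d.start()].strip()
                test["directive"] = d.group(1).lower()
                test["reason"] = d.group(2).strip()
            body["tests"].append(test)
        # Indented YAML diagnostics and comments are ignored in this sketch.
    return body


report = """TAP version 13
1..3
ok 1 - adds numbers
not ok 2 - divides by zero
  ---
  message: boom
  ...
ok 3 - slow test # SKIP not on CI
"""
body = parse_tap(report)
```

A transformer would then typically count the entries of `body["tests"]` by their `ok` flag to decide the overall outcome.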
# Send a TAP file with a transformer config
cdviz-collector send --data @test-results.tap --input-parser tap --config pipeline.toml --url $CDVIZ_URL
# Or wrap the test runner — captures exit code and TAP output files
cdviz-collector send --run testsuiterun_tap --url $CDVIZ_URL -- node --test --test-reporter=tap
Map TAP results to CDEvents with VRL:
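# filter takes its predicate as a trailing closure: filter(value) -> |index, item| { ... }
tests = array!(.body.tests)
passed_tests = filter(tests) -> |_index, t| { t.ok == true }
failed_tests = filter(tests) -> |_index, t| { t.ok == false }
passed = length(passed_tests)
failed = length(failed_tests)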
.context.type = "dev.cdevents.testsuiterun.finished.0.2.0"
.subject.content.outcome = if failed > 0 { "failure" } else { "pass" }
.subject.content.testSuiteName = .metadata.file_name
metadata
parser = "metadata"
- No file content is read; emits file metadata only
- 1 message per file; body is empty; fields: file_path, file_name, file_size, last_modified, content_type
- Must be specified explicitly; auto never selects this parser
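For a local file, the listed fields could be gathered roughly as in the Python sketch below. It only illustrates the idea of describing a file without reading it: `file_metadata` is a hypothetical helper, and the value formats (an ISO 8601 UTC timestamp, a content type guessed from the file name) are assumptions rather than the collector's documented output.

```python
import mimetypes
import os
from datetime import datetime, timezone


def file_metadata(path: str) -> dict:
    """Describe a file using stat only; the content is never opened."""
    info = os.stat(path)
    content_type, _ = mimetypes.guess_type(path)  # guessed from the name
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "file_path": path,
        "file_name": os.path.basename(path),
        "file_size": info.st_size,
        "last_modified": modified.isoformat(),
        "content_type": content_type,
    }
```

Because nothing is downloaded or parsed, this approach stays cheap even for large artifacts, which is the main reason to choose the metadata parser for binary files.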
Emit a CDEvent when a new artifact appears in S3 without downloading it:
[sources.artifact_tracking.extractor]
type = "opendal"
kind = "s3"
polling_interval = "1m"
path_patterns = ["artifacts/**/*.jar", "releases/**/*.tar.gz"]
parser = "metadata"
[sources.artifact_tracking.extractor.parameters]
bucket = "build-artifacts"
region = "us-east-1"
Map to CDEvents ArtifactPublished with VRL:
.context.type = "dev.cdevents.artifact.published.0.3.0"
.context.source = "s3-artifact-bucket"
.context.timestamp = .metadata.last_modified
.subject.id = "pkg:generic/" + .metadata.file_name