Artifact Types
OpenGov Data publishes several machine-generated text artifacts for each meeting. Each file has a different purpose.
VTT Transcript
The VTT transcript is the timestamped transcript in WebVTT caption format.
Use it when you want:
- timestamps
- caption-style transcript review
- source alignment with the original recording
- rough navigation by time
This is the closest text artifact to the transcription stage.
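As a rough illustration of time-based navigation, the sketch below pulls cue start times, end times, and text out of a VTT file. It relies only on the standard WebVTT cue timing line (`HH:MM:SS.mmm --> HH:MM:SS.mmm`, with hours optional). The function name `parse_vtt` is hypothetical, and nothing here is specific to how OpenGov Data generates its files.

```python
import re

# WebVTT cue timing line: "HH:MM:SS.mmm --> HH:MM:SS.mmm" (hours are optional).
CUE_TIMING = re.compile(
    r"^(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(hours, minutes, seconds, millis):
    """Convert the captured timestamp parts to seconds as a float."""
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_vtt(text):
    """Return a list of (start_seconds, end_seconds, cue_text) tuples.

    Hypothetical helper: header lines, cue identifiers, and NOTE blocks
    are skipped because they do not follow a timing line.
    """
    cues = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        match = CUE_TIMING.match(stripped)
        if match:
            parts = match.groups()
            current = [_seconds(*parts[:4]), _seconds(*parts[4:]), []]
            cues.append(current)
        elif not stripped:
            current = None  # a blank line ends the current cue
        elif current is not None:
            current[2].append(stripped)
    return [(start, end, " ".join(lines)) for start, end, lines in cues]
```

With the cues in hand, jumping to a point in the recording is a matter of finding the first cue whose start time is at or after the moment of interest.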
Sliced Transcript TXT
The sliced transcript is a plain-text version of the transcript produced during segmentation.
Use it when you want:
- a readable raw-ish transcript
- transcript text grouped by detected segments
- an intermediate artifact between VTT and normalized text
This file may still contain extra spacing and visible segment boundaries.
Segment JSONL
The segment JSONL file contains timestamped transcript segments in structured form.
Use it when you want:
- search indexing
- semantic retrieval
- timestamp-aware excerpts
- machine-readable transcript chunks
- future data processing
Each line is a JSON object representing one transcript segment.
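Because each line is a standalone JSON object, the file can be read one segment at a time without loading it whole. The sketch below assumes only that one-object-per-line layout. The function name `read_segments` is hypothetical, and no field names are assumed, since the segment schema is not described here.

```python
import json


def read_segments(path):
    """Yield one dict per non-empty line of a JSONL file.

    Hypothetical helper: field names inside each segment are not assumed.
    """
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as error:
                raise ValueError(
                    f"{path}: invalid JSON on line {line_number}"
                ) from error
```

A reasonable first step with an unfamiliar file is to inspect the keys of the first segment, for example `sorted(next(read_segments(path)).keys())`, before building an index on top of it.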
Normalized Transcript
The normalized transcript is the transcript cleaned up and reformatted as one sentence per line.
Use it when you want:
- easier reading
- summarization input
- text extraction
- copy/paste review
- comparing meeting content across files
This is usually the best artifact for human review.
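The sentence-per-line layout makes simple text extraction straightforward, since each line can be treated as one unit. As a minimal sketch, assuming only that layout, the hypothetical helper `find_sentences` returns every sentence that mentions a keyword, along with its line number.

```python
def find_sentences(text, keyword):
    """Return (line_number, sentence) pairs that contain keyword.

    Hypothetical helper: matching is case-insensitive and line numbers
    start at 1, so results can be looked up in the file directly.
    """
    needle = keyword.casefold()
    return [
        (number, sentence.strip())
        for number, sentence in enumerate(text.splitlines(), start=1)
        if needle in sentence.casefold()
    ]
```

This is a plain substring match, so searching for "motion" would also match a word such as "motions".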
Summary
The summary is a machine-generated overview of the meeting, published when available.
Use it when you want:
- a faster overview
- agenda item review
- motions, votes, and major discussion points
- a starting point before reading the transcript
Summaries are machine-generated and should be checked against the transcript and original recording before official use.
Important Note
All artifacts are machine-generated and may contain errors. Important claims should be verified against the original public meeting recording.