
Blob Storage Export Field Reference

This page lists every field exported by the Langfuse blob storage integration, organized by export table. For setup instructions and configuration, see Export to Blob Storage.

Types are described as they appear in JSON/JSONL exports. Timestamps use YYYY-MM-DD HH:MM:SS.ffffff format (e.g. 2024-05-29 13:46:19.963000) in UTC. See Notes on CSV exports for how types map in CSV format.
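In Python, timestamps in this format can be parsed with a strict `strptime` pattern. This is a minimal sketch assuming every value carries the six-digit fractional part shown above:

```python
from datetime import datetime, timezone

def parse_export_timestamp(value: str) -> datetime:
    # Langfuse export timestamps are UTC with microsecond precision,
    # e.g. "2024-05-29 13:46:19.963000".
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f").replace(
        tzinfo=timezone.utc
    )

ts = parse_export_timestamp("2024-05-29 13:46:19.963000")
```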

Export sources

The blob storage integration supports three export source modes (configurable per project in Project Settings > Integrations > Blob Storage):

| Mode | Blob paths written | Description |
| --- | --- | --- |
| Enriched observations (recommended) | `observations_v2/`, `scores/` | Each observation row includes trace-level fields (`user_id`, `session_id`, `trace_name`, etc.) directly. No warehouse-side JOIN needed for trace context. |
| Traces and observations (legacy) | `traces/`, `observations/`, `scores/` | Three separate files per time window. Observations do not include trace-level fields; join on `trace_id` in your warehouse. |
| Traces and observations (legacy) and enriched observations | All of the above | Writes both sets of observation files plus traces and scores. |

Scores are always exported regardless of mode.

We recommend Enriched observations for most use cases — it produces fewer files and avoids cross-file JOINs for trace context.

Traces (traces/)

Exported only in the **Traces and observations (legacy)** and **Traces and observations (legacy) and enriched observations** modes.

| Field | Type | Description | Usage notes |
| --- | --- | --- | --- |
| `id` | string | Unique trace identifier. | Primary key. Use to join with observations and scores via `trace_id`. |
| `timestamp` | string (timestamp) | Trace creation timestamp (event time). | Primary time axis for traces. Use for partitioning, filtering, and time-series analysis. |
| `name` | string | User-defined trace name (e.g. the top-level operation). | Useful for grouping and filtering traces by operation type. |
| `environment` | string | Environment label (e.g. `production`, `staging`). | Filter or partition by environment. |
| `project_id` | string | Langfuse project identifier. | All rows in one export belong to the same project. |
| `metadata` | object | User-supplied key-value metadata attached to the trace. | Arbitrary context. Extract keys relevant to your analytics. |
| `user_id` | string | End-user identifier associated with the trace. | Group by user for per-user analytics. |
| `session_id` | string | Session identifier grouping related traces. | Group traces into sessions for conversation-level analysis. |
| `release` | string | Application release/version tag. | Filter or compare across releases. |
| `version` | string | User-provided version string set via the SDK. | Track how changes to your application affect metrics over time. |
| `public` | boolean | Whether the trace is publicly shareable. | Filter for public/private traces. |
| `bookmarked` | boolean | Whether the trace is bookmarked in the Langfuse UI. | Filter for bookmarked items. |
| `tags` | array of strings | User-defined tags on the trace. | Multi-value filtering and grouping. |
| `input` | string | Trace input payload. | The top-level input to the traced operation. May be plain text or JSON; may be large. |
| `output` | string | Trace output payload. | The top-level output. May be plain text or JSON; may be large. |
| `created_at` | string (timestamp) | Row creation time. | System timestamp. Typically close to `timestamp` but may differ for late-arriving data. |
| `updated_at` | string (timestamp) | Last update time. | Useful for incremental processing: re-process rows where `updated_at` > last sync. |
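The `updated_at` watermark pattern above can be sketched in a few lines. This is a hypothetical illustration (the row dicts and watermark value are made up), relying on the fact that the export's `YYYY-MM-DD HH:MM:SS.ffffff` format sorts lexicographically:

```python
def rows_to_resync(rows, last_sync: str):
    # Select rows updated since the last sync watermark. The timestamp
    # format is lexicographically ordered, so string comparison works.
    return [r for r in rows if r["updated_at"] > last_sync]

# Hypothetical rows showing the shape of a traces/ export.
traces = [
    {"id": "t1", "updated_at": "2024-05-29 13:46:19.963000"},
    {"id": "t2", "updated_at": "2024-05-30 08:00:00.000000"},
]
fresh = rows_to_resync(traces, "2024-05-29 23:59:59.999999")
```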

Fields not in the trace export

These fields are not exported directly. Derive them in your warehouse:

| Field | How to derive |
| --- | --- |
| `total_cost` | Sum observation-level `total_cost` grouped by `trace_id` from the observations file. |
| `latency` | Compute `MAX(end_time) - MIN(start_time)` across observations per `trace_id`. |
| `observations` | Join the observations file on `trace_id` for the full list. |
| `scores` | Join the scores file on `trace_id` for the full list. |
| `html_path` | Construct as `{langfuse_host}/project/{project_id}/traces/{id}`. |

Observations (observations/)

Exported in the **Traces and observations (legacy)** and **Traces and observations (legacy) and enriched observations** modes.

These rows contain observation-level data only. Trace-level fields like user_id, session_id, and tags are not included — join the traces/ file on trace_id in your warehouse, or switch to the Enriched observations export mode.

| Field | Type | Description | Usage notes |
| --- | --- | --- | --- |
| `id` | string | Unique observation identifier. | Primary key. |
| `trace_id` | string | Parent trace identifier. | Join on this to get trace-level fields, or to link with scores. |
| `project_id` | string | Langfuse project identifier. | All rows in one export belong to the same project. |
| `environment` | string | Environment label. | Filter by environment. |
| `type` | string | Observation type: `SPAN`, `GENERATION`, or `EVENT`. | Generations are LLM calls; spans are arbitrary operations; events are point-in-time markers. |
| `parent_observation_id` | string or null | Parent observation ID (for nested observations). | Reconstruct the trace tree by walking parent pointers. Null for root-level observations. |
| `start_time` | string (timestamp) | When the observation started. | Primary time axis for observations. |
| `end_time` | string (timestamp) or null | When the observation ended. | Null for events and in-progress observations. |
| `name` | string | User-defined observation name. | Group/filter by name (e.g. function name, model call label). |
| `metadata` | object | User-supplied key-value metadata. | Arbitrary context. Extract keys relevant to your analytics. |
| `level` | string | Log level: `DEBUG`, `DEFAULT`, `WARNING`, `ERROR`. | Filter for errors or warnings. |
| `status_message` | string | Status or error message. | Inspect for debugging failed observations. |
| `version` | string | User-provided version string set via the SDK. | Informational. |
| `input` | string | Observation input payload. | For generations: the prompt/messages sent to the LLM. May be plain text or JSON; may be large. |
| `output` | string | Observation output payload. | For generations: the LLM response. May be plain text or JSON; may be large. |
| `provided_model_name` | string | Model name as provided by the user/SDK. | The raw model string (e.g. `gpt-4o`, `claude-sonnet-4-20250514`). This is what the API returns as `model`. |
| `model_parameters` | string | Model call parameters as a JSON-encoded string (e.g. `"{\"temperature\":0.7}"`). | Parse as JSON. Useful for analyzing how model settings affect quality/cost. |
| `usage_details` | object (string → integer) | Token usage breakdown by category. | Extract keys: `input` for input tokens, `output` for output tokens, `total` for total. May contain additional keys like `input_cached_tokens`, `reasoning_tokens`, etc. |
| `cost_details` | object (string → number) | Cost breakdown by category (USD). | Extract keys: `input` for input cost, `output` for output cost. |
| `completion_start_time` | string (timestamp) or null | When the first token was generated (for streaming). | Used to compute `time_to_first_token`. Null for non-streaming calls. |
| `prompt_name` | string | Name of the Langfuse prompt used, if any. | Filter for observations using a specific prompt. |
| `prompt_version` | integer or null | Version number of the Langfuse prompt used. | Track which prompt version was active. |
| `total_cost` | number | Total computed cost for this observation (USD). | Observation-level cost. Sum across a trace for trace-level cost. |
| `latency` | number or null | Duration in seconds (`end_time - start_time`). | Null when `end_time` is null. |
| `time_to_first_token` | number or null | Time to first token in seconds (`completion_start_time - start_time`). | Null when `completion_start_time` is null. Measures streaming responsiveness. |
| `model_id` | string or null | Langfuse model definition ID (resolved from `provided_model_name`). | Used to look up pricing. Null if no model definition matched. |
| `created_at` | string (timestamp) | Row creation time. | System timestamp. |
| `updated_at` | string (timestamp) | Last update time. | Incremental processing. |
| `prompt_id` | string | Langfuse prompt definition ID. | Use with `prompt_name`/`prompt_version` for prompt analytics. |
| `tool_calls` | array of strings | Raw tool/function call payloads from the LLM response. | Parse each element as JSON. Contains the full tool call objects. |
| `tool_call_names` | array of strings | Names of tools/functions called. | Quick filter/group without parsing full `tool_calls`. |
| `tool_definitions` | object | Tool/function schemas provided to the LLM. | May be an empty object `{}` when no tools were provided. |
| `usage_pricing_tier_name` | string or null | Name of the pricing tier used for cost calculation. | User-defined tier name from the model definition. Null if no tiered pricing applies. |
| `input_price` | string or null | Per-unit input price from the matched model definition. Decimal string. | Null if no model definition matched. Cast to numeric in your pipeline. |
| `output_price` | string or null | Per-unit output price from the matched model definition. Decimal string. | Null if no model definition matched. Cast to numeric in your pipeline. |
| `total_price` | string or null | Per-unit total price from the matched model definition. Decimal string. | Null if no model definition matched. Used for models with a flat per-call price. |
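Several observation fields need decoding before analysis: `model_parameters` and each `tool_calls` element are JSON-encoded strings, and the price fields are decimal strings. One reasonable decoding sketch (the sample row is hypothetical, and the choice of `Decimal` for prices is an assumption to preserve precision):

```python
import json
from decimal import Decimal

def decode_observation(row):
    # Decode JSON-encoded and string-typed fields of one observation row.
    out = dict(row)
    if row.get("model_parameters"):
        out["model_parameters"] = json.loads(row["model_parameters"])
    out["tool_calls"] = [json.loads(tc) for tc in row.get("tool_calls", [])]
    for price in ("input_price", "output_price", "total_price"):
        if row.get(price) is not None:
            out[price] = Decimal(row[price])  # decimal string -> exact Decimal
    return out

# Hypothetical row covering the string-encoded fields.
row = {
    "model_parameters": "{\"temperature\":0.7}",
    "usage_details": {"input": 500, "output": 120, "total": 620},
    "tool_calls": ["{\"name\":\"get_weather\",\"arguments\":{}}"],
    "input_price": "0.03",
    "output_price": None,
}
decoded = decode_observation(row)
```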

Trace-level fields not in legacy observations

These fields are absent from the observations/ file. Either join the traces/ file on trace_id, or switch to the Enriched observations export mode where they are included directly on each row.

- `user_id`
- `session_id`
- `trace_name`
- `tags`
- `release`
- `bookmarked`
- `public`

Enriched observations (observations_v2/)

Exported in the **Enriched observations** and **Traces and observations (legacy) and enriched observations** modes.

This file contains all the same fields as the observations/ file above, plus the following trace-level fields included directly on each row — no warehouse-side JOIN needed:

| Field | Type | Description | Usage notes |
| --- | --- | --- | --- |
| `user_id` | string | End-user identifier from the parent trace. | Directly available, no JOIN needed. |
| `session_id` | string | Session identifier from the parent trace. | Directly available, no JOIN needed. |
| `trace_name` | string | Name of the parent trace. | Group observations by their parent trace name. |
| `tags` | array of strings | Tags from the parent trace. | Directly available, no JOIN needed. |
| `release` | string | Release tag from the parent trace. | Directly available, no JOIN needed. |
| `bookmarked` | boolean | Bookmark flag from the parent trace. | Directly available. |
| `public` | boolean | Public flag from the parent trace. | Directly available. |

For integrations created on or after 2026-04-01, latency and time_to_first_token are in seconds (consistent with the observations/ file). For integrations created before that date, these fields are in milliseconds for backward compatibility.
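If your pipeline handles integrations from both sides of that cutoff, you may want to normalize to one unit. A minimal sketch, assuming you know each integration's creation date (the function and its inputs are illustrative, not part of the export):

```python
from datetime import date

# Cutoff from the docs: integrations created on/after this date export
# latency and time_to_first_token in seconds; older ones use milliseconds.
SECONDS_CUTOFF = date(2026, 4, 1)

def latency_seconds(value, integration_created: date):
    # Normalize an observations_v2/ latency value to seconds.
    if value is None:
        return None
    return value if integration_created >= SECONDS_CUTOFF else value / 1000.0
```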

Deriving trace-level aggregates from a single file

With the Enriched observations export, you can compute trace-level metrics from the observations_v2/ file alone — no cross-file JOIN needed. Group by trace_id and compute SUM(total_cost) for trace cost and MAX(end_time) - MIN(start_time) for trace latency.
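The same single-file approach extends beyond trace-level metrics. For example, because `user_id` is inline on every enriched row, per-user cost needs no join either (the sample rows below are hypothetical):

```python
from collections import defaultdict

def cost_per_user(enriched_rows):
    # Sum observation cost per end user straight from observations_v2/ rows.
    totals = defaultdict(float)
    for row in enriched_rows:
        totals[row["user_id"]] += row["total_cost"]
    return dict(totals)

rows = [
    {"user_id": "u1", "trace_id": "t1", "total_cost": 0.01},
    {"user_id": "u1", "trace_id": "t2", "total_cost": 0.02},
    {"user_id": "u2", "trace_id": "t3", "total_cost": 0.05},
]
totals = cost_per_user(rows)
```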

Scores (scores/)

Always exported regardless of export source mode. Only scores with aggregatable data types (NUMERIC, BOOLEAN, CATEGORICAL) are included.

| Field | Type | Description | Usage notes |
| --- | --- | --- | --- |
| `id` | string | Unique score identifier. | Primary key. |
| `timestamp` | string (timestamp) | Score creation timestamp (event time). | Primary time axis for scores. |
| `project_id` | string | Langfuse project identifier. | All rows in one export belong to the same project. |
| `environment` | string | Environment label. | Filter by environment. |
| `trace_id` | string | Associated trace identifier. | Join to get trace or observation context. |
| `observation_id` | string or null | Associated observation identifier (optional). | Null if the score is trace-level. Non-null if the score targets a specific observation. |
| `session_id` | string | Associated session identifier. | Direct access to session context without joining traces. |
| `dataset_run_id` | string or null | Associated dataset run identifier (if the score came from an evaluation run). | Links scores to experiment/evaluation runs. Null for ad-hoc or annotation scores. |
| `name` | string | Score name (e.g. `accuracy`, `helpfulness`, `hallucination`). | Group/filter by score metric name. |
| `value` | number | Numeric score value. | For BOOLEAN: 0 or 1. For CATEGORICAL: index of the category. For NUMERIC: the raw value. |
| `source` | string | Score source: `API`, `ANNOTATION`, `EVAL`. | `API` = programmatic via SDK, `ANNOTATION` = human annotation in the UI, `EVAL` = LLM-as-judge evaluator. |
| `comment` | string or null | Optional human comment or evaluator reasoning. | Context for the score. Useful for annotation workflows. |
| `data_type` | string | Score data type: `NUMERIC`, `BOOLEAN`, or `CATEGORICAL`. | Determines how to interpret `value` and `string_value`. |
| `string_value` | string or null | String representation for categorical scores. | The category label (e.g. `"positive"`, `"neutral"`). Null for numeric/boolean scores. |
| `created_at` | string (timestamp) | Row creation time. | System timestamp. |
| `updated_at` | string (timestamp) | Last update time. | Incremental processing. |

Enriching scores with trace/observation context

Scores do not include trace-level fields inline. To enrich:

| Export mode | How to get trace context |
| --- | --- |
| Traces and observations (legacy) | Join scores to the `traces/` file on `trace_id` for `user_id`, `environment`, `name`, etc. |
| Enriched observations | Join scores to the `observations_v2/` file on `trace_id`. Since multiple observations share the same `trace_id`, deduplicate first (e.g. pick one row per `trace_id`) to avoid multiplying score rows. Each observation row already includes `user_id`, `session_id`, `trace_name`, `tags`, `release`, `environment`. |
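The deduplicate-then-join step for the enriched mode can be sketched as follows. Keeping the first observation row seen per `trace_id` is enough, since every row of a trace carries the same trace-level fields (the sample rows are hypothetical):

```python
def trace_context_from_enriched(observation_rows):
    # One context dict per trace_id; first row wins, which avoids
    # multiplying score rows in the join below.
    context = {}
    for row in observation_rows:
        context.setdefault(row["trace_id"], {
            "user_id": row["user_id"],
            "session_id": row["session_id"],
            "trace_name": row["trace_name"],
        })
    return context

def enrich_scores(scores, observation_rows):
    ctx = trace_context_from_enriched(observation_rows)
    return [{**s, **ctx.get(s["trace_id"], {})} for s in scores]

obs = [
    {"trace_id": "t1", "user_id": "u1", "session_id": "s1", "trace_name": "chat"},
    {"trace_id": "t1", "user_id": "u1", "session_id": "s1", "trace_name": "chat"},
]
scores = [{"id": "sc1", "trace_id": "t1", "name": "accuracy", "value": 1.0}]
enriched = enrich_scores(scores, obs)
```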

Notes on CSV exports

In CSV format all values are represented as text. Key differences from JSON/JSONL:

| JSON type | CSV representation |
| --- | --- |
| string | Plain text value. |
| number | Numeric text (e.g. `1.23`). Parse as float in your pipeline. |
| integer | Numeric text without a decimal point (e.g. `1024`). |
| boolean | `true` or `false`. |
| null | Empty field. |
| array of strings | JSON-encoded string (e.g. `["tag1","tag2"]`). Parse the field as JSON. |
| object | JSON-encoded string (e.g. `{"input":500,"output":120}`). Parse the field as JSON. |
| string (timestamp) | `YYYY-MM-DD HH:MM:SS.ffffff` in UTC (e.g. `2024-05-29 13:46:19.963000`). Parse as timestamp in your pipeline. |

Price fields: input_price, output_price, and total_price are exported as quoted strings in JSON/JSONL (e.g. "0.03") to preserve decimal precision. In CSV they appear as plain text. Cast these to a numeric or decimal type in your warehouse. Other cost fields (total_cost, cost_details values) are exported as JSON numbers.

When loading CSV into a warehouse, cast timestamp fields to your warehouse's timestamp type, numeric fields to float/decimal, and parse JSON-encoded fields (objects, arrays) into native map/array types if your warehouse supports them.
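A minimal loader sketch following these rules: empty fields become null, named columns are parsed as JSON or timestamps, and everything else stays text for downstream casting. The sample CSV and column names are hypothetical:

```python
import csv
import io
import json
from datetime import datetime, timezone

def load_csv_rows(text, json_fields=(), timestamp_fields=()):
    # Load a CSV export, decoding JSON-encoded and timestamp columns.
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        parsed = {k: (None if v == "" else v) for k, v in row.items()}
        for f in json_fields:
            if parsed.get(f) is not None:
                parsed[f] = json.loads(parsed[f])
        for f in timestamp_fields:
            if parsed.get(f) is not None:
                parsed[f] = datetime.strptime(
                    parsed[f], "%Y-%m-%d %H:%M:%S.%f"
                ).replace(tzinfo=timezone.utc)
        rows.append(parsed)
    return rows

sample = (
    "id,tags,timestamp\n"
    't1,"[""a"",""b""]",2024-05-29 13:46:19.963000\n'
)
rows = load_csv_rows(sample, json_fields=("tags",), timestamp_fields=("timestamp",))
```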

File organization in blob storage

```
{project_id}/
├── traces/                   # Traces and observations (legacy) mode only
│   └── {timestamp}.{json|jsonl|csv}[.gz]
├── observations/             # Traces and observations (legacy) mode only
│   └── {timestamp}.{json|jsonl|csv}[.gz]
├── observations_v2/          # Enriched observations mode only
│   └── {timestamp}.{json|jsonl|csv}[.gz]
└── scores/                   # Always exported
    └── {timestamp}.{json|jsonl|csv}[.gz]
```

Files are partitioned by the configured export frequency (hourly, daily, or weekly). Each file covers one time window.

