
DTSSE Grafana Installation And Management

This page covers the DTSSE Managed Grafana instance, its PostgreSQL backend, the dashboard data ingestion job, and the main repos involved in day-to-day management.

Repositories and code paths

Use these repos for different parts of the stack:

  • Managed Grafana infrastructure: hmcts/grafana-infrastructure
    Main code paths: components/grafana/main.tf, components/grafana/grafana.tf, components/grafana/postgres.tf, components/grafana/action-group.tf, components/grafana/data.tf, components/grafana/variables.tf, environments/aat/aat.tfvars, environments/prod/prod.tfvars, azure-pipelines.yml
    Notes: this is the current source of truth for the DTSSE Managed Grafana stack.
  • Grafana folders, dashboards, datasources, and library panels: hmcts/grafana-infrastructure
    Main code paths: components/grafana-config/main.tf, components/grafana-config/folders-from-json.tf, components/grafana-config/dashboards-from-json.tf, components/grafana-config/datasources-from-json.tf, components/grafana-config/panels-from-json.tf, components/grafana-config/config/aat/, components/grafana-config/config/prod/, azure-pipelines.yml
    Notes: this is the dashboard-as-code control plane for DTSSE Managed Grafana.
  • Dashboard data ingestion app: hmcts/dtsse-dashboard-ingestion
    Main code paths: src/main/run.ts, src/main/executor.ts, src/main/query/interdependent.ts, src/main/interdependent/jenkins.metrics.ts, src/main/jenkins/cosmos.ts, azure-pipelines.yaml
    Notes: reads data from external systems and writes normalized rows into the DTSSE dashboard PostgreSQL database.
  • Runtime deployment of ingestion job: hmcts/cnp-flux-config
    Main code paths: apps/dtsse/dtsse-dashboard-ingestion/, apps/dtsse/aat/base/kustomization.yaml, apps/dtsse/prod/base/kustomization.yaml, apps/dtsse/automation/kustomization.yaml
    Notes: deploys the ingestion workload to AKS.
  • Shared Helm chart for the ingestion job: hmcts/hmcts-charts
    Main code paths: stable/dtsse-dashboard-ingestion/Chart.yaml, stable/dtsse-dashboard-ingestion/values.yaml
    Notes: defines the reusable CronJob/chart defaults.

Grafana instance and PostgreSQL

The Managed Grafana instance is created in grafana-infrastructure by azurerm_dashboard_grafana.main in components/grafana/grafana.tf. The same component also assigns the Azure roles Grafana Viewer, Grafana Editor, and Grafana Admin to Azure AD groups.

The PostgreSQL database that backs the dashboard data is created from components/grafana/postgres.tf using the shared terraform-module-postgresql-flexible module. The connection string is written to Key Vault as the db-url secret and is consumed later by the ingestion job.
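The db-url secret is a standard PostgreSQL connection string. As a hedged illustration (the example value below is an assumption, not the real secret), a sketch of pulling the secret and splitting out the host and database name when troubleshooting connectivity:

```shell
# Hypothetical example value; the real secret comes from Key Vault, e.g.:
#   az keyvault secret show --vault-name dtsse-aat --name db-url --query value -o tsv
DB_URL="postgres://grafana_user:example-password@dtsse-aat.postgres.database.azure.com:5432/dashboard"

# Split the connection string with shell parameter expansion
hostport="${DB_URL#*@}"    # everything after the credentials: host:port/db
host="${hostport%%:*}"     # host only
dbname="${hostport##*/}"   # database name

echo "host=$host db=$dbname"
```

The host extracted this way is what should appear in the PostgreSQL firewall or subnet configuration described below.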

The rest of the DTSSE Grafana support resources are created alongside it in the same repo:

The main environment-specific settings are:

One naming detail to remember: production is still named dtsse-grafana10-prod even though the deployed grafana_major_version is now 11.

Grafana major version upgrades

The Grafana major version is controlled by grafana_major_version in the environment tfvars files (environments/aat/aat.tfvars and environments/prod/prod.tfvars).

To perform a major upgrade:

  1. Change grafana_major_version in AAT first.
  2. Raise a PR in grafana-infrastructure and review the Terraform plan for aat produced by azure-pipelines.yml.
  3. Merge the change and validate the AAT instance, including the token-management and Grafana configuration stages in the same pipeline.
  4. Repeat the same process for prod.
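The change itself is a one-line tfvars edit. A sketch of the relevant line (whether the value is a quoted string or a number depends on the variable definition in components/grafana/variables.tf):

```hcl
# environments/aat/aat.tfvars (sketch)
# Bump the major version in AAT first, validate, then repeat in environments/prod/prod.tfvars.
grafana_major_version = "11"
```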

Useful related code:

PostgreSQL firewall setup

The PostgreSQL access model is environment-specific:

When public_access = true, components/grafana/postgres.tf creates two firewall rules, grafana1000 and grafana1001, using the Managed Grafana outbound IP addresses. This is the current production pattern.

When public_access = false, no firewall rules are created. AAT instead relies on the delegated PostgreSQL subnet looked up in components/grafana/data.tf.

If database connectivity breaks after a Grafana change, check:

  • The current value of public_access
  • The Grafana outbound IP list on the Managed Grafana resource
  • The generated azurerm_postgresql_firewall_rule entries
  • The db-url Key Vault secret written by Terraform
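The conditional firewall rules follow roughly the shape below. This is a sketch of the pattern, not a copy of components/grafana/postgres.tf; the resource type, server_id reference, and attribute names beyond outbound_ip are assumptions:

```hcl
# Sketch: one rule per Managed Grafana outbound IP, created only when public access is on
resource "azurerm_postgresql_flexible_server_firewall_rule" "grafana" {
  count = var.public_access ? length(azurerm_dashboard_grafana.main.outbound_ip) : 0

  # Yields grafana1000, grafana1001, matching the rule names described above
  name             = "grafana100${count.index}"
  server_id        = module.postgresql.instance_id # assumed module output name
  start_ip_address = azurerm_dashboard_grafana.main.outbound_ip[count.index]
  end_ip_address   = azurerm_dashboard_grafana.main.outbound_ip[count.index]
}
```

If the Managed Grafana outbound IPs change (for example after a recreate), a plan against this pattern regenerates the rules automatically.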

Access control via azure-access

Access is split between Terraform role assignments and Azure AD group membership:

Current default role mapping in grafana-infrastructure is:

  • Viewers: DTS CFT Developers, DTS SDS Developers, DTS SE - Grafana Readers
  • Editors: DTS Grafana Editors
  • Admins: DTS Platform Operations

Operational notes:

Grafana access is controlled by the Grafana role assignments in Terraform and by Azure AD group membership in azure-access. It is not governed by a separate Grafana-specific access-package flow in this implementation.

Dashboard configuration repo and creating new dashboards

Dashboard configuration is managed in hmcts/grafana-infrastructure under components/grafana-config.

The control points are:

Environment content is stored here:

Current structure:

  • folders-json/ defines Grafana folders and UIDs.
  • dashboard-json/<folder>/ stores exported dashboard JSON grouped by Grafana folder.
  • datasources-json/ stores datasource definitions such as azure-monitor.json and postgresql-dashboard.json.
  • panel-json/<category>/ stores reusable library panels.
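As a hedged illustration of the folders-json shape (the field names below follow the general Grafana folder model and are an assumption, not copied from the repo):

```json
{
  "uid": "dtsse-example",
  "title": "DTSSE Example Folder"
}
```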

This is the authoritative dashboard-as-code implementation for DTSSE Managed Grafana. cnp-flux-config still contains dashboard-as-code patterns for the platform monitoring stack, but it is not the control plane for DTSSE Managed Grafana.

Creating a new dashboard

Use this workflow:

  1. Create or update the dashboard in the Managed Grafana UI in AAT.
  2. Export the dashboard JSON from Grafana.
  3. Decide which folder it belongs to. If the folder does not exist yet, add a folder definition under components/grafana-config/config/aat/folders-json/ and components/grafana-config/config/prod/folders-json/.
  4. If you are introducing a new folder path, add the matching folder UID mapping for that path in components/grafana-config/dashboards-from-json.tf. The dashboard JSON path and the Terraform folder_mappings entry must stay aligned.
  5. Commit the dashboard JSON to components/grafana-config/config/aat/dashboard-json/<folder>/.
  6. Raise a PR in grafana-infrastructure and review the Terraform plan for aat.
  7. Merge the change and validate the dashboard in AAT.
  8. Promote the same JSON to components/grafana-config/config/prod/dashboard-json/<folder>/, then repeat the same PR and merge flow for prod.
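Before committing exported JSON, it is worth a quick sanity check that the export carries a stable uid and no instance-specific numeric id. This is a common dashboard-as-code convention rather than a repo-mandated step, and the sample file below is illustrative:

```shell
# Minimal exported-dashboard sample (illustrative only)
cat > example-dashboard.json <<'EOF'
{
  "uid": "dtsse-example",
  "id": null,
  "title": "DTSSE Example Dashboard"
}
EOF

# Warn if there is no uid, or if a concrete numeric id slipped through the export
if ! grep -q '"uid"' example-dashboard.json; then
  echo "missing uid"
elif grep -Eq '"id": *[0-9]+' example-dashboard.json; then
  echo "instance-specific numeric id present"
else
  echo "dashboard JSON looks committable"
fi
```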

If the dashboard needs a new datasource or reusable panel, add the supporting JSON in the same repo:

  • Datasource: components/grafana-config/config/<environment>/datasources-json/
  • Library panel: components/grafana-config/config/<environment>/panel-json/

If you introduce a new library-panel category, also update components/grafana-config/panels-from-json.tf so the new category is included in panel_folder_mappings.
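The mapping is a plain category-to-folder lookup. A sketch of the shape (the category names and local/variable structure are illustrative assumptions; the real definition is in panels-from-json.tf):

```hcl
# Sketch: each panel-json/<category>/ directory maps to a Grafana folder UID
locals {
  panel_folder_mappings = {
    "example-category"  = grafana_folder.example.uid
    "another-category"  = grafana_folder.another.uid
  }
}
```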

The Grafana service-account token is already managed in Key Vault and can be used for API export/import:

  • Token secret: grafana-auth
  • Token name secret: grafana-auth-name
  • Key Vault: dtsse-aat or dtsse-prod

Example export flow:

# AAT instance, resource group, and Key Vault names
export GRAFANA_NAME=dtsse-grafana-aat
export GRAFANA_RG=dtsse-aat
export GRAFANA_KV=dtsse-aat
export GRAFANA_UID=<dashboard_uid>

# Resolve the Grafana endpoint and the service-account token from Key Vault
export GRAFANA_URL=$(az grafana show -n "$GRAFANA_NAME" -g "$GRAFANA_RG" --query properties.endpoint -o tsv)
export GRAFANA_TOKEN=$(az keyvault secret show --vault-name "$GRAFANA_KV" --name grafana-auth --query value -o tsv)

# Export the dashboard JSON by UID
curl -sS \
  -H "Authorization: Bearer $GRAFANA_TOKEN" \
  "$GRAFANA_URL/api/dashboards/uid/$GRAFANA_UID" \
  -o "$GRAFANA_UID.json"

Import is the same pattern using POST /api/dashboards/db, but the standard DTSSE path is to commit the JSON into grafana-infrastructure and let the pipeline apply it.
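If a manual import is ever needed, the payload wraps the exported JSON in a small envelope. A sketch of building it in shell (the envelope fields follow the public Grafana HTTP API; the curl call is commented out because it needs a live endpoint and token):

```shell
# Exported dashboard JSON (illustrative sample)
DASHBOARD_JSON='{"uid": "dtsse-example", "title": "DTSSE Example Dashboard"}'

# Wrap it in the envelope expected by POST /api/dashboards/db
PAYLOAD="{\"dashboard\": ${DASHBOARD_JSON}, \"overwrite\": true}"

# curl -sS -X POST \
#   -H "Authorization: Bearer $GRAFANA_TOKEN" \
#   -H "Content-Type: application/json" \
#   -d "$PAYLOAD" \
#   "$GRAFANA_URL/api/dashboards/db"

echo "$PAYLOAD"
```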

How Jenkins publishes data to Cosmos

The write side of this pipeline lives in hmcts/cnp-jenkins-library, not in the Grafana infra repo.

The Jenkins-side Cosmos credentials are configured in cnp-flux-config/apps/jenkins/jenkins/jenkins.yaml, where Jenkins defines:

  • COSMOSDB_TOKEN_KEY
  • The azureCosmosDB credential with id cosmos-connection
  • The Cosmos endpoint https://${pipeline-metrics-account-name}.documents.azure.com:443/

Key code paths:

How the data ingestion job works

The DTSSE ingestion app is hmcts/dtsse-dashboard-ingestion, and it is deployed by Flux as the dtsse-dashboard-ingestion HelmRelease.

Deployment/runtime:

Current schedules are:

  • AAT: hourly
  • Production: two staggered CronJobs running every ten minutes
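In cron terms the schedules above look roughly like this. This is a sketch only; the real values live in the flux HelmRelease/values files, and the exact minute offsets are assumptions:

```yaml
# AAT: hourly
schedule: "0 * * * *"

# Production: two staggered CronJobs, each running every ten minutes
# job 1
schedule: "*/10 * * * *"
# job 2, offset by five minutes so the runs interleave
schedule: "5-59/10 * * * *"
```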

Secrets injected into the job include:

  • db-url
  • cosmos-key
  • cosmos-db-name
  • jenkins-databases
  • github-token
  • sonar-token
  • jira-token
  • snow-username
  • snow-password

Application flow:

  1. src/main/run.ts loads every query file in src/main/query.
  2. src/main/executor.ts runs database migrations first, then executes the queries.
  3. src/main/query/interdependent.ts runs the ordered datasets that depend on one another.
  4. src/main/interdependent/jenkins.metrics.ts reads Jenkins metrics from Cosmos, validates them, and writes normalized records into PostgreSQL.
  5. src/main/jenkins/cosmos.ts reads from the Cosmos containers pipeline-metrics, cve-reports, performance-metrics, and app-helm-chart-metrics.
  6. src/main/config.ts resolves all secrets from environment variables or Key Vault-mounted properties.

The job is therefore a "read from Cosmos and APIs, write to PostgreSQL" pipeline. Grafana then reads from PostgreSQL.

How to migrate a repo to another team

There are two different meanings of “move to another team” here.

1. Move a repository to another reporting team in Grafana

The dashboard database stores repository ownership in github.repository.team_id.

Use the dedicated Azure DevOps pipeline from dtsse-dashboard-ingestion/azure-pipelines.yaml:

  • Pipeline purpose: Update Team ID for GitHub Repository
  • Inputs:
    • GitHub repo URL(s)
    • Target team_id
    • Environment (aat or prod)

The pipeline fetches db-url from Key Vault and runs src/main/admin/github.update-team-id.sh, which updates github.repository.team_id directly in PostgreSQL.

The valid team_id values are listed in dtsse-dashboard-ingestion/README.md.
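What the pipeline does underneath amounts to an UPDATE against github.repository. A hedged approximation follows; the WHERE clause and the url column name are assumptions, and the real statement lives in src/main/admin/github.update-team-id.sh:

```shell
REPO_URL="https://github.com/hmcts/example-repo"   # illustrative input
TEAM_ID="example-team"                             # must be a valid team_id from the README

# Approximate SQL the helper script runs (only the team_id column is confirmed by this page)
SQL="UPDATE github.repository SET team_id = '${TEAM_ID}' WHERE url = '${REPO_URL}';"

# psql "$DB_URL" -c "$SQL"   # DB_URL comes from the db-url Key Vault secret
echo "$SQL"
```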

2. Move Flux ownership of the DTSSE namespace or workload

If the operational ownership of the namespace changes in cnp-flux-config, the main control points are:

  • apps/dtsse/base/kustomize.yaml: update TEAM_AAD_GROUP_ID and, if needed, the Slack channel.
  • CODEOWNERS: update the apps/dtsse/ owner entry.
  • Any namespace/environment kustomizations under apps/dtsse/: update references if the namespace itself is being restructured.

Useful background docs in cnp-flux-config:

If you are creating a new namespace rather than reusing dtsse, the helper scripts are:

This page was last reviewed on 9 March 2026. It needs to be reviewed again on 9 September 2026 by the page owner platops-build-notices.