Preventos Data API third-party daemon integration
This guide is for third-party developers building an unattended daemon, scheduled job, or integration service that periodically fetches measurement data from the Preventos Data API.
Integration overview
A typical daemon integration has four steps:
- Register or receive an Entra application identity for the third-party daemon.
- Authenticate with Microsoft Entra ID using client credentials.
- Call GET /data-catalog to discover the environments, sites, signals, and available time ranges that the application can read.
- Persist a batch plan from the catalog and fetch values on a schedule, for example once per day.
The Data API is read-only. Access is restricted by the authenticated application's permission set. A daemon should not hardcode site or signal assumptions without first reading the data catalog.
Credential renewal is handled by a separate control-plane API. See Client Credential Renewal for the planned self-service secret and certificate rotation workflow.
Public API configuration
Use these placeholders when configuring the daemon application. Preventos provides the actual values separately.
| Setting | Placeholder |
|---|---|
| Tenant ID | <tenant-id> |
| API base URL | <api-base-url> including the trailing / |
| API scope identifier | <api-scope> |
| APIM subscription key header | Ocp-Apim-Subscription-Key: <subscription-key> |
The Microsoft Entra External ID authority URL is derived from the tenant ID:
https://<tenant-id>.ciamlogin.com/<tenant-id>/
The third party receives its own application client ID, either a client secret or certificate details, and an API Management subscription key separately. Do not store client secrets, certificate passwords, private keys, or subscription keys in source control, logs, request samples, or exported batch files.
For OAuth 2.0 client credentials with MSAL, request the API resource's application permissions with the .default scope provided by Preventos:
<api-scope>
In app-only daemon flows, MSAL uses the resource /.default form so Entra ID can issue the application permissions that were granted to the daemon app.
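The two derived values above can be checked at startup before any token request is made. The following is a minimal sketch, not part of the Preventos API: the helper names build_authority and require_default_scope are hypothetical, and the check that the scope ends in /.default is an assumption based on the resource /.default form described here.

```python
def build_authority(tenant_id: str) -> str:
    """Derive the Entra External ID authority URL from the tenant ID."""
    tenant_id = tenant_id.strip()
    if not tenant_id:
        raise ValueError("tenant_id must not be empty")
    return f"https://{tenant_id}.ciamlogin.com/{tenant_id}/"


def require_default_scope(scope: str) -> str:
    """Reject scopes that are not in the resource '/.default' form (assumed check)."""
    scope = scope.strip()
    if not scope.endswith("/.default"):
        raise ValueError("client credentials flows expect a scope ending in '/.default'")
    return scope
```

Failing fast on a malformed scope tends to give a clearer error than the token endpoint's response would.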
Authentication
Use the OAuth 2.0 client credentials grant. MSAL is recommended because it handles authority configuration, token acquisition, token caching, retries for token requests, and certificate-based credentials consistently across platforms.
The daemon must send the resulting access token on every protected API request:
Authorization: Bearer <access-token>
When calling the public API Management endpoint, the daemon must also send its subscription key on every API request:
Ocp-Apim-Subscription-Key: <subscription-key>
All endpoints intended for third-party integrations require a valid bearer token. GET /version can be called by any authenticated entity. GET /data-catalog and all values endpoints additionally validate the request against the application's Data API permissions.
Client secret example in C#
Install the MSAL package in the daemon application:
dotnet add package Microsoft.Identity.Client
Acquire and use a token:
using System.Net.Http.Headers;
using Microsoft.Identity.Client;

var tenantId = Environment.GetEnvironmentVariable("DATA_API_TENANT_ID")
    ?? throw new InvalidOperationException("Missing DATA_API_TENANT_ID.");
var baseUrl = Environment.GetEnvironmentVariable("DATA_API_BASE_URL")
    ?? throw new InvalidOperationException("Missing DATA_API_BASE_URL.");
var scope = Environment.GetEnvironmentVariable("DATA_API_SCOPE")
    ?? throw new InvalidOperationException("Missing DATA_API_SCOPE.");
var clientId = Environment.GetEnvironmentVariable("DATA_API_CLIENT_ID")
    ?? throw new InvalidOperationException("Missing DATA_API_CLIENT_ID.");
var clientSecret = Environment.GetEnvironmentVariable("DATA_API_CLIENT_SECRET")
    ?? throw new InvalidOperationException("Missing DATA_API_CLIENT_SECRET.");
var subscriptionKey = Environment.GetEnvironmentVariable("DATA_API_SUBSCRIPTION_KEY")
    ?? throw new InvalidOperationException("Missing DATA_API_SUBSCRIPTION_KEY.");

var authority = $"https://{tenantId}.ciamlogin.com/{tenantId}/";

var app = ConfidentialClientApplicationBuilder
    .Create(clientId)
    .WithClientSecret(clientSecret)
    .WithAuthority(authority)
    .Build();

var authResult = await app
    .AcquireTokenForClient([scope])
    .ExecuteAsync();

using var http = new HttpClient
{
    BaseAddress = new Uri(baseUrl)
};
http.DefaultRequestHeaders.Authorization =
    new AuthenticationHeaderValue("Bearer", authResult.AccessToken);
http.DefaultRequestHeaders.TryAddWithoutValidation(
    "Ocp-Apim-Subscription-Key",
    subscriptionKey);

using var response = await http.GetAsync("data-catalog");
response.EnsureSuccessStatusCode();
var catalogJson = await response.Content.ReadAsStringAsync();
Client secret example in Node.js
Install MSAL for Node.js:
npm install @azure/msal-node
Acquire a token and read the data catalog:
import { ConfidentialClientApplication } from "@azure/msal-node";

const tenantId = process.env.DATA_API_TENANT_ID;
const baseUrl = process.env.DATA_API_BASE_URL;
const scope = process.env.DATA_API_SCOPE;
const clientId = process.env.DATA_API_CLIENT_ID;
const clientSecret = process.env.DATA_API_CLIENT_SECRET;
const subscriptionKey = process.env.DATA_API_SUBSCRIPTION_KEY;

if (!tenantId || !baseUrl || !scope || !clientId || !clientSecret || !subscriptionKey) {
  throw new Error("Missing Data API environment configuration.");
}

const authority = `https://${tenantId}.ciamlogin.com/${tenantId}/`;

const app = new ConfidentialClientApplication({
  auth: {
    clientId,
    clientSecret,
    authority,
  },
});

const tokenResult = await app.acquireTokenByClientCredential({
  scopes: [scope],
});
if (!tokenResult?.accessToken) {
  throw new Error("Token acquisition failed.");
}

const response = await fetch(new URL("data-catalog", baseUrl), {
  headers: {
    Authorization: `Bearer ${tokenResult.accessToken}`,
    "Ocp-Apim-Subscription-Key": subscriptionKey,
    Accept: "application/json",
  },
});
if (!response.ok) {
  throw new Error(`Data API request failed: ${response.status} ${response.statusText}`);
}

const catalog = await response.json();
console.log(catalog);
Client secret example in Python
Install MSAL for Python:
python -m pip install msal requests
Acquire a token and stream a one-day CSV export:
import os
from pathlib import Path

import msal
import requests

tenant_id = os.environ["DATA_API_TENANT_ID"]
base_url = os.environ["DATA_API_BASE_URL"]
scope = os.environ["DATA_API_SCOPE"]
client_id = os.environ["DATA_API_CLIENT_ID"]
client_secret = os.environ["DATA_API_CLIENT_SECRET"]
subscription_key = os.environ["DATA_API_SUBSCRIPTION_KEY"]

authority = f"https://{tenant_id}.ciamlogin.com/{tenant_id}/"

app = msal.ConfidentialClientApplication(
    client_id=client_id,
    client_credential=client_secret,
    authority=authority,
)

token_result = app.acquire_token_for_client(scopes=[scope])
if "access_token" not in token_result:
    raise RuntimeError(f"Token acquisition failed: {token_result.get('error_description')}")

headers = {
    "Authorization": f"Bearer {token_result['access_token']}",
    "Ocp-Apim-Subscription-Key": subscription_key,
    "Accept": "text/csv",
}
params = {
    "signalIds": "Velocity,Depth",
    "date": "2026-05-31",
    "tzi": "Europe/Helsinki",
    "format": "csv",
    "csvPreset": "finnish",
}

url = f"{base_url.rstrip('/')}/environments/100/sites/site-ref/values"
with requests.get(url, headers=headers, params=params, stream=True, timeout=120) as response:
    response.raise_for_status()
    with Path("site-ref-2026-05-31.csv").open("wb") as output:
        for chunk in response.iter_content(chunk_size=1024 * 1024):
            if chunk:
                output.write(chunk)
Client secret sample with curl
Get an access token:
curl --request POST "https://<tenant-id>.ciamlogin.com/<tenant-id>/oauth2/v2.0/token" \
--header "Content-Type: application/x-www-form-urlencoded" \
--data-urlencode "client_id=<client-id>" \
--data-urlencode "client_secret=<client-secret>" \
--data-urlencode "scope=<api-scope>" \
--data-urlencode "grant_type=client_credentials"
Call the API with the returned access token:
curl --request GET "<api-base-url>data-catalog" \
--header "Authorization: Bearer <access-token>" \
--header "Ocp-Apim-Subscription-Key: <subscription-key>" \
--header "Accept: application/json"
Create a self-signed certificate
Certificate authentication uses a key pair. The daemon keeps the private key and uses it to acquire tokens. Preventos, or the Entra application administrator, needs only the public certificate material. Do not send the private key, PFX password, or PFX file unless a secure private-key handover has been explicitly arranged.
Use short-lived certificates and rotate them before expiry. A 6 to 12 month validity period is usually a good default for daemon integrations. Store the private key or PFX in the host's protected secret store, deployment secret store, or another controlled location outside source control.
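A daemon can check its own certificate's expiry on each run and raise an alert while there is still time to rotate. The following is a minimal sketch: the function name needs_rotation and the 30-day margin are assumptions, not Preventos requirements, and the certificate's NotAfter timestamp is assumed to have been read elsewhere, for example from the output of the commands below.

```python
from datetime import datetime, timedelta, timezone


def needs_rotation(
    not_after: datetime,
    now: datetime,
    margin: timedelta = timedelta(days=30),  # assumed safety margin
) -> bool:
    """Return True once the certificate is within `margin` of its expiry."""
    if not_after.tzinfo is None or now.tzinfo is None:
        raise ValueError("both timestamps must be timezone-aware")
    return now >= not_after - margin


# Example: a certificate expiring at the end of 2026, checked in mid-December.
expiry = datetime(2026, 12, 31, tzinfo=timezone.utc)
print(needs_rotation(expiry, datetime(2026, 12, 15, tzinfo=timezone.utc)))
```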
Recommended files:
| File | Purpose |
|---|---|
| data-api-daemon.cer or data-api-daemon.crt | Public certificate to upload to the Entra application registration. |
| data-api-daemon.pfx | Private-key bundle used by the daemon application. Protect this file and its password. |
Windows PowerShell
Run these commands in PowerShell. The password prompt protects the exported PFX file.
$cert = New-SelfSignedCertificate `
-Subject "CN=preventos-data-api-daemon" `
-CertStoreLocation "Cert:\CurrentUser\My" `
-KeyAlgorithm RSA `
-KeyLength 2048 `
-HashAlgorithm SHA256 `
-KeySpec Signature `
-KeyExportPolicy Exportable `
-NotAfter (Get-Date).AddMonths(12)
$pfxPassword = Read-Host "PFX password" -AsSecureString
Export-Certificate `
-Cert $cert `
-FilePath ".\data-api-daemon.cer"
Export-PfxCertificate `
-Cert $cert `
-FilePath ".\data-api-daemon.pfx" `
-Password $pfxPassword
$cert | Select-Object Subject, Thumbprint, NotBefore, NotAfter
Give data-api-daemon.cer to the Entra application administrator. Configure the daemon with:
$env:DATA_API_CERTIFICATE = "C:\secure-path\data-api-daemon.pfx"
$env:DATA_API_CERTIFICATE_PASSWORD = "<pfx-password-from-secret-store>"
macOS or Linux with OpenSSL
Run these commands in a secure working directory. OpenSSL prompts for the private-key pass phrase and the PFX export password.
openssl req \
-x509 \
-newkey rsa:2048 \
-sha256 \
-days 365 \
-keyout data-api-daemon.key \
-out data-api-daemon.crt \
-subj "/CN=preventos-data-api-daemon"
openssl pkcs12 \
-export \
-out data-api-daemon.pfx \
-inkey data-api-daemon.key \
-in data-api-daemon.crt \
-name data-api-daemon
openssl x509 \
-in data-api-daemon.crt \
-outform DER \
-out data-api-daemon.cer
openssl x509 \
-in data-api-daemon.crt \
-noout \
-subject \
-fingerprint \
-dates
Give data-api-daemon.cer or data-api-daemon.crt to the Entra application administrator. Configure the daemon with:
export DATA_API_CERTIFICATE="/secure-path/data-api-daemon.pfx"
export DATA_API_CERTIFICATE_PASSWORD="<pfx-password-from-secret-store>"
Certificate example in C#
Prefer certificate authentication for long-running production integrations when possible. The private key must be protected by the host operating system, a secret store, or a managed deployment process.
using System.Security.Cryptography.X509Certificates;
using Microsoft.Identity.Client;

var tenantId = Environment.GetEnvironmentVariable("DATA_API_TENANT_ID")
    ?? throw new InvalidOperationException("Missing DATA_API_TENANT_ID.");
var scope = Environment.GetEnvironmentVariable("DATA_API_SCOPE")
    ?? throw new InvalidOperationException("Missing DATA_API_SCOPE.");
var clientId = Environment.GetEnvironmentVariable("DATA_API_CLIENT_ID")
    ?? throw new InvalidOperationException("Missing DATA_API_CLIENT_ID.");
var certificatePath = Environment.GetEnvironmentVariable("DATA_API_CERTIFICATE")
    ?? throw new InvalidOperationException("Missing DATA_API_CERTIFICATE.");
var certificatePassword = Environment.GetEnvironmentVariable("DATA_API_CERTIFICATE_PASSWORD");

var authority = $"https://{tenantId}.ciamlogin.com/{tenantId}/";

var certificate = X509CertificateLoader.LoadPkcs12FromFile(
    certificatePath,
    certificatePassword);

var app = ConfidentialClientApplicationBuilder
    .Create(clientId)
    .WithCertificate(certificate)
    .WithAuthority(authority)
    .Build();

var authResult = await app
    .AcquireTokenForClient([scope])
    .ExecuteAsync();
X509CertificateLoader is available in .NET 9 and later. For older .NET versions, use the X509Certificate2 constructor overload that matches the deployment platform and key-storage requirements.
Discover data with GET /data-catalog
The daemon should periodically refresh the data catalog and use it as the source for batch requests.
GET /data-catalog
Authorization: Bearer <access-token>
Ocp-Apim-Subscription-Key: <subscription-key>
Accept: application/json
The response is grouped by environment and site. It includes top-level values fetch limits:
| Field | Purpose |
|---|---|
| limits.maxRawDbPoints | Complete-export budget for estimated raw database points. |
| limits.referenceInterval | Reference interval used for the default budget. |
| limits.referenceRange | Reference range used for the default budget. |
The default limit is 44640 raw DB points, equivalent to 31 days at PT1M for one underlying DB query.
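The default budget follows directly from the reference range divided by the reference interval. A short check of that arithmetic, using the 31-day range and PT1M interval stated above:

```python
from datetime import timedelta

reference_range = timedelta(days=31)        # limits.referenceRange
reference_interval = timedelta(minutes=1)   # limits.referenceInterval (PT1M)

# Integer division of two timedeltas yields the number of whole intervals.
max_raw_db_points = reference_range // reference_interval
print(max_raw_db_points)  # 44640
```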
Each readable signal includes:
| Field | Purpose |
|---|---|
| id | Signal identifier to use in values requests. |
| displayName | Human-readable signal name. |
| unit | Measurement unit. |
| baseInterval | Expected measurement interval when known. |
| valuesUrl | Relative URL for fetching that signal. |
| ranges | Active time ranges for the signal, newest first. |
For site-level batch requests, each site also has a valuesUrl that already contains a signalIds=... query string for all signals the app can read at that site.
If the application has root-level permission, it must provide an explicit environment filter:
GET /data-catalog?environments=100,102
Authorization: Bearer <access-token>
Ocp-Apim-Subscription-Key: <subscription-key>
Non-root applications can omit the filter to receive the environments allowed by their permission set.
Build a periodic batch
Use the data catalog to create a stable batch plan:
- Read environments[].sites[].
- Select the sites and signals the integration needs.
- Store the environment ID, site ID, signal IDs, output format, timezone, and destination file or downstream target.
- On each scheduled run, apply a moving time window such as yesterday, the previous 24 hours, or the last completed local day.
- Refresh the catalog on a regular cadence so the daemon picks up permission, site, signal, and availability changes.
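The steps above can be sketched as a small function that turns a parsed catalog into batch items. This is an illustration under stated assumptions: the guide documents environments[].sites[] and the signal id field, but the environment and site id keys and the per-site signals key used here are guesses at the response shape, and build_batch_plan is a hypothetical helper name. Adjust the keys to the actual catalog response.

```python
def build_batch_plan(
    catalog: dict,
    wanted: dict[tuple[str, str], set[str]],
    *,
    tzi: str = "Europe/Helsinki",
    fmt: str = "csv",
) -> list[dict]:
    """Build batch items for the (environment, site) pairs listed in `wanted`.

    Only signals that the catalog reports as readable are kept, so the plan
    never contains a signal the application cannot fetch.
    """
    plan = []
    for env in catalog.get("environments", []):
        for site in env.get("sites", []):
            key = (str(env["id"]), str(site["id"]))  # assumed key names
            if key not in wanted:
                continue
            readable = [s["id"] for s in site.get("signals", [])]  # assumed key
            selected = [sid for sid in readable if sid in wanted[key]]
            if not selected:
                continue
            name = "-".join([key[1], *(sid.lower() for sid in selected)])
            plan.append({
                "name": name,
                "env": key[0],
                "siteId": key[1],
                "signalIds": selected,
                "timezone": tzi,
                "format": fmt,
            })
    return plan
```

Rebuilding the plan from a fresh catalog on each refresh means revoked signals drop out of the plan instead of failing at fetch time.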
Example batch item:
{
  "name": "site-ref-velocity-depth",
  "env": "100",
  "siteId": "site-ref",
  "signalIds": ["Velocity", "Depth"],
  "timezone": "Europe/Helsinki",
  "format": "csv",
  "csvPreset": "finnish"
}
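One way to derive the destination file for a batch item is to combine its name with the exported day, so that a re-run for the same day overwrites the same file rather than creating a duplicate. The following is a minimal sketch: the export_path helper, the exports root directory, and the directory layout are all hypothetical choices.

```python
from datetime import date
from pathlib import PurePosixPath

# File extension per response format supported by the API.
_EXTENSIONS = {"csv": "csv", "json": "json", "ndjson": "ndjson"}


def export_path(item: dict, day: date, root: str = "exports") -> PurePosixPath:
    """Return a deterministic, period-specific path for one batch item and day."""
    extension = _EXTENSIONS[item["format"]]
    filename = f"{item['name']}-{day.isoformat()}.{extension}"
    return PurePosixPath(root) / item["name"] / filename


item = {"name": "site-ref-velocity-depth", "format": "csv"}
print(export_path(item, date(2026, 5, 31)))
```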
For a once-per-day daemon, prefer date for a completed single day rather than from and to. Avoid fetching the current partial day unless the integration explicitly needs in-progress data:
GET /environments/100/sites/site-ref/values?signalIds=Velocity,Depth&date=2026-05-31&tzi=Europe/Helsinki&format=csv&csvPreset=finnish
Authorization: Bearer <access-token>
Ocp-Apim-Subscription-Key: <subscription-key>
Accept: text/csv
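The value for date can be computed from the current time in the site's timezone. The following is a minimal sketch with a hypothetical helper name. It takes any tzinfo, so a production daemon would typically pass zoneinfo.ZoneInfo("Europe/Helsinki"), while the example below uses a fixed +03:00 offset to stay independent of the host's timezone database.

```python
from datetime import date, datetime, timedelta, timezone, tzinfo


def last_completed_local_day(now: datetime, tz: tzinfo) -> date:
    """Return the most recent calendar day that has fully ended in `tz`."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    return now.astimezone(tz).date() - timedelta(days=1)


# 22:30 UTC on 31 May is already 01:30 on 1 June at UTC+03:00,
# so the last completed local day is 31 May.
helsinki_summer = timezone(timedelta(hours=3))
now = datetime(2026, 5, 31, 22, 30, tzinfo=timezone.utc)
print(last_completed_local_day(now, helsinki_summer).isoformat())  # 2026-05-31
```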
The API also supports JSON and NDJSON:
GET /environments/100/sites/site-ref/signals/Velocity/values?date=2026-05-31&tzi=Europe/Helsinki&format=ndjson
Authorization: Bearer <access-token>
Ocp-Apim-Subscription-Key: <subscription-key>
Accept: application/x-ndjson
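NDJSON suits streaming consumers because each line is a complete JSON document. The following is a minimal sketch of a line-by-line parser. It accepts any iterable of byte lines, such as the output of iter_lines() on a streamed requests response. The record fields in the sample are made up for illustration and do not describe the actual response schema.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one parsed record per non-empty NDJSON line."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank or keep-alive lines
        yield json.loads(line)


# Hypothetical sample lines; the real field names come from the API.
sample = [b'{"t": "2026-05-31T00:00:00+03:00", "v": 1.5}\n', b"\n", b'{"t": "2026-05-31T00:01:00+03:00", "v": 1.6}']
for record in iter_ndjson(sample):
    print(record["v"])
```

Because records are yielded one at a time, memory use stays flat regardless of response size.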
Fetch values
Use values endpoints only after reading data-catalog. The catalog tells the daemon which environments, sites, signals, time ranges, base intervals, and fetch limits are available to the caller.
The API has two values endpoints:
| Endpoint | Purpose |
|---|---|
| GET /environments/{env}/sites/{siteId}/values?signalIds={signalIds} | Site-level request for one or more comma-separated signals. Use the site's catalog valuesUrl when possible. |
| GET /environments/{env}/sites/{siteId}/signals/{signalId}/values | Signal-level request for one signal identified in the route. |
Both endpoints support the same date range, timezone, response format, aggregation, and metadata query parameters. Prefer site-level requests when the daemon naturally exports a group of signals from the same site, because the API can calculate derived signals from shared underlying telemetry without extra DB reads.
Request parameters
The most common query parameters for daemon integrations are:
| Parameter | Description |
|---|---|
| signalIds | Comma-separated signal IDs for site-level values requests. Use IDs from data-catalog. |
| date | Preferred parameter for fetching one completed local day. Do not combine with from and to. |
| from / to | Bounded time range for windows other than a single day. Provide both values for complete exports. Date-only values are accepted. |
| tzi | Time zone for date parsing and timestamp formatting, for example Europe/Helsinki. Defaults to UTC. |
| format | json, ndjson, or csv. When present, this overrides the HTTP Accept header. |
| take | Maximum number of records for sampling or partial reads. Values above 10000 are clamped to 10000. Avoid using take for complete periodic exports. |
| aggregation | Optional aggregation mode. Aggregation reduces output rows, but the raw telemetry fetch must still fit the API fetch limit. |
| interval | Aggregation bucket size, for example 01:00:00 or the ISO 8601 duration PT1H. |
| meta | Set false to suppress metadata. |
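The parameters above map naturally onto a stored batch item. The following is a minimal sketch that builds the relative path and query parameters for a site-level, single-day request. The build_values_request helper is hypothetical, and the item keys follow the example batch item in this guide. The path is relative because the API base URL includes the trailing slash.

```python
from datetime import date
from urllib.parse import quote


def build_values_request(item: dict, day: date) -> tuple[str, dict[str, str]]:
    """Return (relative_path, query_params) for one completed local day."""
    env = quote(str(item["env"]), safe="")
    site = quote(str(item["siteId"]), safe="")
    path = f"environments/{env}/sites/{site}/values"
    params = {
        "signalIds": ",".join(item["signalIds"]),
        "date": day.isoformat(),  # use `date` alone, never with from/to
        "tzi": item["timezone"],
        "format": item["format"],
    }
    if item.get("csvPreset"):
        params["csvPreset"] = item["csvPreset"]
    return path, params
```

The returned dictionary can be passed as the params argument of an HTTP client, which handles URL encoding of the values.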
Time zones and date parsing
If date, from, or to are supplied without timezone information, the API interprets them in the timezone named by tzi. If from or to include timezone information, that timezone information is used for the instant in the request. Returned timestamps are formatted in the specified tzi timezone.
For time range parameters, the API accepts either date-only values or ISO 8601 date-time values. A date-only from value is interpreted as midnight at the start of that day in the specified tzi timezone. A date-only to value is adjusted to the end of that day.
Examples:
| Goal | Query |
|---|---|
| Completed local day | date=2026-05-31&tzi=Europe/Helsinki |
| Whole calendar month with date-only values | from=2026-05-01&to=2026-05-31&tzi=Europe/Helsinki |
| Explicit 24-hour window | from=2026-05-31T00:00:00+03:00&to=2026-06-01T00:00:00+03:00 |
The API streams response data for all formats, including JSON, NDJSON, and CSV. Daemons should stream the response to storage or downstream processing instead of loading the full body into memory.
Plan safe fetch windows
For complete bounded exports, the API estimates raw DB reads before fetching telemetry:
estimatedRawDbPoints = ceil(rangeDuration / baseInterval) * dbQueryCount
dbQueryCount is the number of underlying DB telemetry query parameters, not the number of requested logical output signals. For example, requesting level and calculated flow may still use one DB query if both are produced from the same raw level telemetry. Requesting fewer signals helps only when it reduces the underlying DB query count.
Use data-catalog ranges, signal baseInterval, and top-level limits.maxRawDbPoints to split exports. With the default 44640 budget:
- One DB query at PT1M can fetch up to 31 days.
- Two DB queries at PT1M should be split into about 15.5-day windows.
- Aggregated responses must still fit the raw DB estimate because aggregation happens after telemetry is fetched.
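The estimate formula and the splitting rule above can be sketched client-side. This is an illustration under stated assumptions: the helper names are hypothetical, the DB query count is assumed to be known to the caller (for example from a previous 413 response), and how the API treats the boundary instant between adjacent windows is not specified here, so verify that against real responses before relying on it.

```python
import math
from datetime import datetime, timedelta


def estimate_raw_db_points(start: datetime, end: datetime,
                           base_interval: timedelta, db_query_count: int) -> int:
    """ceil(rangeDuration / baseInterval) * dbQueryCount, as given above."""
    return math.ceil((end - start) / base_interval) * db_query_count


def split_range(start: datetime, end: datetime, base_interval: timedelta,
                db_query_count: int, max_raw_db_points: int = 44640):
    """Split [start, end) into consecutive windows that each fit the budget."""
    points_per_window = max_raw_db_points // db_query_count
    if points_per_window < 1:
        raise ValueError("budget is too small for this DB query count")
    step = base_interval * points_per_window
    windows, cursor = [], start
    while cursor < end:
        upper = min(cursor + step, end)
        windows.append((cursor, upper))
        cursor = upper
    return windows


# Two DB queries at PT1M over 31 days: two windows of 15.5 days each.
may = (datetime(2026, 5, 1), datetime(2026, 6, 1))
for lower, upper in split_range(*may, timedelta(minutes=1), db_query_count=2):
    print(lower, upper, upper - lower)
```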
Requests with missing from or missing to keep the API's existing partial-read behavior and should be used only with take for sampling or catch-up probes. Complete periodic exports should always use date or bounded from and to.
If a bounded complete export is too large, the API returns 413 Payload Too Large with error: "request_too_large", the estimated raw DB points, the maximum budget, DB query count, and a recommended smaller range. If the API cannot estimate the request because a selected range has no baseInterval, it returns error: "request_size_unavailable".
Error handling and operations
Daemon applications should:
- Cache access tokens until their expiry and reacquire them when needed.
- Treat 401 Unauthorized as a token acquisition or token validation problem.
- Treat 403 Forbidden as missing, changed, or revoked Data API permission.
- Treat 400 Bad Request as an invalid environment, site, signal, date range, or query parameter.
- Treat 406 Not Acceptable as an unsupported format or Accept header.
- Treat 413 Payload Too Large as a signal to split the requested time range, or to inspect data-catalog when baseInterval is unavailable.
- Retry transient network failures and 5xx responses with bounded exponential backoff.
- Avoid retry storms; keep scheduled jobs idempotent by writing each export to a deterministic period-specific destination.
- Record the requested environment, site, signal IDs, date range, response format, HTTP status, and correlation information in logs, but never log bearer tokens or client credentials.
- Detect truncated response streams when downloading large responses.
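The status handling in the list above can be sketched as a small decision table with a capped backoff schedule. The action labels, the function names, and the 1-second base and 60-second cap are illustrative assumptions, not values prescribed by the API.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status code to the daemon's next action."""
    if 200 <= status < 300:
        return "ok"
    if status == 401:
        return "reacquire_token"    # token acquisition or validation problem
    if status == 403:
        return "check_permissions"  # permission missing, changed, or revoked
    if status in (400, 406):
        return "fix_request"        # invalid parameters, format, or Accept header
    if status == 413:
        return "split_range"        # request exceeds the raw DB point budget
    if 500 <= status < 600:
        return "retry"              # transient server-side failure
    return "fail"


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Exponential delays in seconds, bounded by `cap` and by `attempts`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


print(classify_status(503), backoff_delays(5))
```

Only the retry action should loop. The other actions need a configuration or request change, so retrying them unchanged would only add load. Adding random jitter to each delay helps avoid synchronized retries across daemons.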