Introduction
This guide helps users understand how InfoTiles ingests data, processes it, and presents insights through maps, visualisations, and dashboards. The goal is to help you quickly find relevant datasets, interpret analyses, and use filtering and search in a safe and efficient way.
The guide is practically oriented, focusing on day-to-day work with data, while also explaining core principles and assumptions that affect results.
How InfoTiles handles data
InfoTiles supports three main types of input data. The only requirement is that data is structured for machine readability — i.e. table format or database format.
- Data fetched from source systems and processed automatically by machine learning models.
- Datasets you upload directly into the solution as static tables (CSV, (geo)JSON, shapefile, or similar).
- Datasets fetched directly from an external source without processing — for example, publicly available weather data or a direct mirror of a specialist system.
Together these form a flexible data pipeline that can accommodate both continuous streams and periodic, manual updates.
Getting started
New to InfoTiles? See the Explore ready-to-use dashboards article and the Interact with a Dashboard guide for step-by-step instructions on opening and working with dashboards.
Using the Dashboard
Basic navigation and use
Dashboards are fully interactive: selecting, clicking, or marking an element (map, chart, or table) automatically filters all other visualisations.
- At the top of the dashboard you will find search, filter, and time selector controls, used to narrow which data is shown and which period the analysis covers.
- In the search field you can use free text to search directly on name, ID, or properties to find specific objects such as pumps, pipes, or zones.
- The map can be navigated with zoom and pan, and map layers can be toggled on and off to reduce complexity and focus on relevant objects.
- Clicking elements in charts sets the selected property as a filter, and all visualisations update immediately.
- You can filter by position or focus on objects within a geographic area by selecting / marking an area on the map.
- Active filters are shown at the top and can be removed by clicking "×" to return to the full overview.
- Individual visualisations can be maximised for a better overview — click the three-dot menu and choose Maximise.
- Tables can be used for sorting and downloading for further analysis. To export to CSV, click (…) in the top right and choose Download CSV. See also: How to export data as a CSV file.
Filter function and search function
Filters can be applied across maps, charts, and tables — for example by zone, ownership, material type, age, risk level, or time period. All filters work dynamically and update visualisations immediately.
If two datasets share columns/fields with the same name, you can filter both datasets simultaneously using the search function. When using the filter function (the green square), first select the Data View and then the query. The search field (highlighted in red) searches across everything: if the value "PS100" exists in both sources, it will appear in both result sets, even when the value is stored in fields with different names.
For example, you can display only pipes (from the pipes dataset) and breaks (from the work-order log) on pipes with a specific material type — even though the break records and pipe records are in different datasets.
Note: If a dashboard contains charts and visualisations based on different datasets, a filter applied to one dataset will cause visualisations based on other datasets to appear empty — the no results found icon will be shown.
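The cross-dataset filtering described above can be sketched in a few lines. This is an illustrative example, not InfoTiles internals; the field names and values are invented:

```python
# Two hypothetical datasets that share a "station" field.
pipes = [
    {"station": "PS100", "material": "PVC"},
    {"station": "PS200", "material": "PE"},
]
workorders = [
    {"station": "PS100", "type": "break"},
    {"station": "PS300", "type": "blockage"},
]

def apply_filter(rows, field, value):
    """Keep only rows where the shared field matches the filter value."""
    return [r for r in rows if r.get(field) == value]

# One filter value propagates to both datasets at once.
filtered_pipes = apply_filter(pipes, "station", "PS100")
filtered_orders = apply_filter(workorders, "station", "PS100")
# A dataset with no matching rows would render as "no results found".
```

This mirrors the behaviour in the note above: a visualisation built on a dataset with no matching rows ends up empty.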
Input Data, Data Quality, Analysis and Machine Learning
Processed input data — PipeFusion
All datasets starting with pf_ have been ingested via a machine learning tool called PipeFusion. PipeFusion uses data from Gemini VA as its source and will by default apply a filter for status: Drift (in service). Pipes from Gemini with other status values — removed, replaced, or projected — will not appear in PipeFusion results.
As part of quality assurance, missing nodes, missing connections, and incorrect pointers from Gemini are corrected. The original value from LSID/PSID is retained in the field pf_id_source, and changes are documented under pf_history. Total node count and pipe length may be updated as a result of corrections.
For analysis of raw data, dedicated data views exist for the relevant tables, prefixed with Gemini.
Datasets uploaded directly by the user — client datasets
Datasets uploaded as "dead" (static) tables should start with client_ to distinguish them from the rest, and to avoid conflicts between read and write permissions in the solution.
This is useful if you have data in e.g. Excel that you want to combine with other analyses. Be aware that datasets starting with client_ are not updated unless you or someone with access to the source file manually uploads a new version. You should therefore not use this method for data that needs to be updated more than once a year.
Tabular data (such as CSV) has its own shortcut on the landing page (reached by clicking the InfoTiles logo in the top left). See also: How to import data from a file.
It is also possible to load your own spatial data while working in the map: choose Add layer, followed by Upload file. Shapefile format is recommended, as coordinates are reprojected automatically without manual configuration.
Naming conventions for uploaded datasets
Good practice when uploading your own dataset is to include the date at the end of the filename — either just the year or the full date. Date formats should follow one of these:
- yyyymmdd (standard in the data world, easy to sort from newest to oldest)
- dd-mm-yyyy
Also give the dataset a meaningful name so colleagues understand what it contains. Avoid spaces and full stops in the name to prevent issues with system settings and file format parsing.
Examples of good dataset names:
- client_watersamples_2024
- client_basement_flooding_2018-2024
- client_sewagezones_20230416
- client_unidirectional_watersupply_16-04-2023
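The naming rules above can be checked mechanically. This is a hypothetical helper, not part of InfoTiles; it simply encodes the conventions described here (client_ prefix, no spaces or full stops, a date suffix in one of the accepted formats):

```python
import re

# Accepted date suffixes: yyyymmdd, yyyy, yyyy-yyyy, or dd-mm-yyyy.
PATTERN = re.compile(
    r"^client_[a-z0-9_]+_"
    r"(\d{8}|\d{4}(-\d{4})?|\d{2}-\d{2}-\d{4})$"
)

def is_valid_name(name: str) -> bool:
    """Return True if a dataset name follows the naming conventions."""
    return " " not in name and "." not in name and bool(PATTERN.match(name))
```

Running the good examples from the list above through this check returns True for all of them, while a name with spaces or full stops is rejected.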
Data directly from source
Datasets loaded directly from the source, without processing in PipeFusion, are named from the source system. For example: Gemini_workorder_diary.
Analyses Performed by PipeFusion
PipeFusion performs several analyses on datasets from the pipe database and work-order systems, together with associated observations and operational data (e.g. maintenance leak reports, pipe inspection reports) and other relevant sources such as subscriber databases and various sensor data and measurements (e.g. SCADA data, weather data, water quality data).
Key analyses and data generation performed by PipeFusion:
- Network connections and zone divisions
- Probability of failure/breakage
- Consequence of failure/breakage
- Risk of failure/breakage (based on probability and consequence)
- Inflow/infiltration calculations
- Water consumption for calculation and alerting of water leaks with associated area delineation
Network connections and zone divisions
Generating missing connections
PipeFusion identifies missing connections by building a graph model of the pipe network based on nodes and pipe elements, then testing connectivity, direction, and logical relationships in the network. The algorithm detects breaks in the structure — pipes that are close to each other without being connected, incorrect endpoints, or missing nodes. Based on spatial proximity and network logic, PipeFusion proposes automatic corrections that connect elements correctly. These corrections can be reviewed and validated before being used further in analyses.
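The core idea of proximity-based correction can be sketched simply. This is a simplified illustration, not the PipeFusion algorithm; coordinates, pipe IDs, and the tolerance are invented:

```python
import math
from itertools import combinations

# Hypothetical pipes, each given by its two endpoint coordinates.
pipes = {
    "p1": ((0.0, 0.0), (10.0, 0.0)),
    "p2": ((10.2, 0.1), (20.0, 0.0)),    # starts ~0.22 units from p1's end
    "p3": ((50.0, 50.0), (60.0, 50.0)),  # far from everything else
}

def missing_connection_candidates(pipes, tolerance=0.5):
    """Flag pairs of pipes whose endpoints lie within `tolerance` of each
    other but that do not already share a node."""
    candidates = []
    for (a, ends_a), (b, ends_b) in combinations(pipes.items(), 2):
        if set(ends_a) & set(ends_b):
            continue  # already connected at a shared node
        gap = min(math.dist(pa, pb) for pa in ends_a for pb in ends_b)
        if gap <= tolerance:
            candidates.append((a, b, round(gap, 2)))
    return candidates
```

Here p1 and p2 would be flagged as a candidate correction, to be reviewed before use, just as the article describes.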
Generating missing pipes
PipeFusion can generate new pipes by:
- Adding short pipes where they are missing.
- Splitting existing pipes into multiple components at nodes, to maintain a hydraulically correct model.
PipeFusion generates missing pipes by analysing the network structure and identifying logical breaks between nodes that should be connected but lack a registered pipe.
Service connections: The solution can also generate service connections from a municipal pipe to an uploaded point representing a subscriber (location of water meter, registered subscriber, building, or other relevant geopoint). PipeFusion does this by calculating an assumed route, which is logical for a machine; topography, physical obstacles, and practical conditions are not taken into account. Total pipe length in InfoTiles will therefore be higher than in the source system.
Missing connections: Below is an example where PipeFusion flags a possible missing connection or pipe between two points. Two points are shown as candidates for correction — where the actual fix must be made in the source system (Gemini VA).
Pipes not connected to the network (unconnected)
PipeFusion traverses all valid paths through the network. Pipes connected to each other in groups of at least 20 pipes are assigned a value under Group. Some pipes, however, are too far from anything to connect to, meaning that from a data perspective it is not possible to carry water to or collect water from them. This layer is designed to simplify searching for such pipes, which can be difficult to detect in Gemini using methods such as parallel scrolling or visual map analysis.
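A minimal sketch of the grouping idea, using plain breadth/depth traversal over an invented network (this is not the PipeFusion implementation):

```python
from collections import defaultdict

def connected_groups(edges):
    """edges: list of (pipe_id, node_a, node_b).
    Returns lists of pipe IDs, one list per connected component."""
    adj = defaultdict(set)
    for _, a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        components.append(comp)
    return [[pid for pid, a, _ in edges if a in comp] for comp in components]

# Hypothetical network: p1-p2 form one group, p9 is isolated.
edges = [("p1", "n1", "n2"), ("p2", "n2", "n3"), ("p9", "n8", "n9")]
groups = connected_groups(edges)
# Groups smaller than the threshold (20 in InfoTiles) would be flagged
# as unconnected; in this toy example every group is below it.
```

In the real solution, only components of at least 20 pipes receive a Group value; the remainder surface in the unconnected layer.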
Zone divisions for water (DMAs)
In Norway, we typically distinguish between two types of zone division for drinking water networks: metering zones and pressure zones. In English, zones in the drinking water network are often called DMAs (District Metering Areas, i.e. metering zones), though DMA can also stand for District Managing Areas, which is a broader concept. PipeFusion calculates two types of zones for water:
- Pressure zones: Zone type delimited by water works, reservoirs, booster stations, closed valves, and pressure reduction valves (PRVs). Check valves are classified as PRVs, as they behave the same way in the zone logic.
- Consumption zones (DMA): Use the same main objects as above, but replace PRVs with bulk meters. Note: the method for creating metering zones always starts from a "measurement point" with entity type "MM" in the table VA_LEQUIP in Gemini. If an area is expected to be a metering zone but shows no value, check which objects define the boundary — often a closed valve is missing or misclassified.
Zone divisions for sewage
Generating sewage zones
Input data: pipes of types combined sewer (AF), wastewater (SP), and stormwater (OV) with subcategories, plus stations (PAF, PSP, POV), treatment plants (RA), overflows (OVL), and outfalls (UTL/UTS).
Different municipalities use different practices for sewage zone definitions. PipeFusion generates zones based solely on network information (points and pipes) and does not account for or override municipality-specific zone practices.
The algorithm starts at treatment plants and travels backwards through the network until it stops at a node that is the starting point for a zone. After all pipes and points are analysed, an additional step traverses the network to map cross-connections and other special cases.
For sewage, the network is divided into zones by splitting it into connected sewage zones sorted by type of endpoint: pump station/pump, overflow, treatment plant, outfall, or "Ambiguous" (areas that can drain multiple ways due to cross-connections).
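The backward traversal described above can be sketched as follows. The network, node names, and zone names here are invented; the real algorithm also handles cross-connections and the "Ambiguous" category:

```python
from collections import defaultdict, deque

# Hypothetical directed sewage network: flow runs from -> to.
edges = [
    ("A", "B"), ("B", "RA1"),   # branch draining to treatment plant RA1
    ("C", "RA1"),
    ("D", "RA2"),               # separate branch draining to RA2
]
endpoints = {"RA1": "zone_RA1", "RA2": "zone_RA2"}

# Invert the flow direction so we can walk upstream from each endpoint.
upstream = defaultdict(list)
for src, dst in edges:
    upstream[dst].append(src)

zone_of = {}
for ep, zone in endpoints.items():
    queue = deque([ep])
    while queue:
        node = queue.popleft()
        if node in zone_of:
            continue  # already claimed; real networks may mark "Ambiguous"
        zone_of[node] = zone
        queue.extend(upstream[node])
```

Each node ends up tagged with the zone of the endpoint it drains to, which is the essence of starting at the treatment plant and travelling backwards.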
You can also filter on different network types in the map by toggling layers in the map menu: Sewage/wastewater, Stormwater, Combined, Other.
Outfall
Outfall zones are generated from outfall points, where water exits the system to the recipient; these correspond to entity type UTS/UTL in Gemini.
Zone naming
Zone naming is based on the station name (pf_station). When Gemini VA is the source for the pipe database, this field is linked to the REF/EXREF/STATION fields where the structured name (e.g. PS100) is stored.
Visual zone delimitation vs. zone as a pipe property
If you filter by geometry on the map, you will get all pipes visible within the zone, but not all of these are necessarily connected to the network in that zone. If you filter by zone name under the pipes' properties, you will only see the pipes that PipeFusion has connected to that zone.
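The difference between the two filters can be made concrete with a small sketch. All data here is invented, and the zone is simplified to a bounding box:

```python
# Hypothetical pipes: p2 lies visually inside the area of zone PS100,
# but PipeFusion has connected it to a different zone.
pipes = [
    {"id": "p1", "mid": (5, 5), "zone": "PS100"},
    {"id": "p2", "mid": (6, 4), "zone": "PS200"},
]
zone_bbox = (0, 0, 10, 10)  # xmin, ymin, xmax, ymax of zone PS100's extent

def in_bbox(point, bbox):
    x, y = point
    return bbox[0] <= x <= bbox[2] and bbox[1] <= y <= bbox[3]

# Geometric filter: everything visible within the area.
geometric = [p["id"] for p in pipes if in_bbox(p["mid"], zone_bbox)]
# Attribute filter: only pipes whose zone property matches.
attribute = [p["id"] for p in pipes if p["zone"] == "PS100"]
```

The geometric filter includes p2 because it lies inside the area, while the attribute filter excludes it, exactly the distinction described above.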
Risk calculations
This section describes how risk in the pipe network is calculated through a combination of probability of failure and consequence of failure. This provides a more complete and decision-supporting risk picture that can be used for targeted prioritisation of measures.
Probability of failure
Probability of failure is calculated using machine learning trained on historical failure and maintenance data from many Norwegian municipalities. The model uses, among other things, age, material, and other properties of the pipe network, combined with registered incidents and work orders. The algorithm also considers the network structure — how pipes and nodes are connected in the system — using a Graph Neural Network (GNN) that analyses connections and patterns in the network.
The result is a probability value indicating how likely it is that a component has already failed or will fail soon. Input data includes:
- Pipe database: pipe function (water, sewage, combined, stormwater), material type, year of installation, geographic position in the network.
- Work-order/operational data: failure history for pipes of comparable types, blockage history, pipe inspection data.
PipeFusion's calculated probability of failure is accurate for around 90% of pipes. Since you cannot know in advance which pipes the model is wrong about, you should always treat the result with healthy scepticism. One way to address this is to display a reference calculation based on empirical statistics in parallel, for example Norsk Vann's prioritisation table for pipe replacement, making it easy to spot large deviations between PipeFusion and expected results.
Consequence calculations
Consequence is calculated by analysing what happens in the network if a given element fails or stops functioning. The model is based on mathematical graph theory and requires the network to be stored as a graph database — it is therefore calculated on the result from PipeFusion, not raw data. The core principle is to calculate what share of the network loses connectivity if a pipe is removed.
Sewage: The sewage network is defined with flow direction, so the method starts at the outermost part of the network, an end node, and follows the flow direction while calculating backwards how many elements depend on the given pipe. The result is normalised based on position and hierarchy and given a value between 0 and 1. Categories 1–5 are also produced so results can be compared with other approaches or set up as a risk matrix following the DiVa method.
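The normalisation and binning step for sewage can be sketched as follows. The dependency counts are invented, and the exact normalisation used by PipeFusion (based on position and hierarchy) is more involved than this simple scaling:

```python
# Hypothetical counts of how many elements depend on each pipe.
dependents = {"p1": 120, "p2": 30, "p3": 0}
max_dep = max(dependents.values())

def consequence(count, max_count):
    """Scale a dependency count to a 0-1 score and bin it into
    categories 1-5 for use in a risk matrix."""
    score = count / max_count if max_count else 0.0
    category = min(5, int(score * 5) + 1)
    return score, category

scores = {pid: consequence(c, max_dep) for pid, c in dependents.items()}
```

The 0-1 score supports continuous ranking, while the 1-5 category allows comparison with other approaches such as a DiVa-style risk matrix.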
Water: Water supply is more complex as there is no fixed flow direction and pipes can belong to ring systems. First, all paths water can travel from the water works to an end node are calculated, and pipe connections belonging to many travel paths get higher consequence than those in only a few possible routes. Redundancy is also taken into account, so alternative supply routes can reduce consequence. Pipe dimension is used as a factor in consequence calculations within each ring system, so larger dimensions get higher consequence and smaller pipes are graded down based on their size relative to the largest pipe in their group.
Inflow and infiltration analyses
Inflow/infiltration (fremmedvann) is calculated per pump station by analysing incoming volumes of sewage based on available measurement data — flow measurements, level measurements in sumps, or pump run-times combined with known pump capacity. The model first establishes a reference level based on dry periods, where water added is mainly assumed to be sanitary sewage. Deviations above this level are interpreted as inflow/infiltration — typically from infiltration, leakage, or faulty connections.
- Gross inflow: Total volume of inflow handled by a pump, without accounting for what has already been added from upstream areas.
- Net inflow: Inflow volume originating within the pump's own catchment area — inflow from upstream pump stations is deducted based on the hierarchical structure of the pipe network. Net inflow gives a better picture of where in the network the inflow actually originates and where measures should be prioritised.
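The gross vs. net distinction is simple arithmetic once the station hierarchy is known. The volumes and hierarchy below are invented; real values come from SCADA data:

```python
# Hypothetical gross inflow per pump station (e.g. m3 per day).
gross_inflow = {"PS1": 40.0, "PS2": 25.0, "PS3": 90.0}
# Hypothetical hierarchy: PS1 and PS2 pump into PS3's catchment.
upstream_of = {"PS3": ["PS1", "PS2"]}

def net_inflow(station):
    """Net inflow = gross inflow minus inflow already counted upstream."""
    return gross_inflow[station] - sum(
        gross_inflow[s] for s in upstream_of.get(station, [])
    )
```

For PS3, 65 of the 90 units were already handled upstream, so only 25 originate within PS3's own catchment; that is where measures would be prioritised.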
InfoTiles SewerIntelligence calculates the following approximately in real time per pump station:
- Total volume of water pumped at each pump station
- Expected sewage volume (baseflow) at each pump station
- Volume and share of inflow in near real time, per hour and per pump station, including:
  - Total inflow volume transported
  - Net inflow volume originating within the pump's own catchment area, after upstream contributions are subtracted
Data sources: The SewerIntelligence module calculates inflow volumes using historical and real-time SCADA data (flow and level measurements) together with rainfall data. Measurement inputs used include: flow meters in pump stations, level sensors in pump sumps and at overflows, level sensors in flumes in the network, and flow meters at treatment plant inlets. The method is designed to work with only level measurements, as these are the minimum variable always available. If flow meters are available — in pumps or in the network — these can also be used, but are not required.
Expected baseflow per time unit per weekday (V_n) is determined using a combination of manual labelling and statistical analysis of raw data. To avoid being skewed by abnormal years (dry or wet), at least 2 years of data is preferred so that each season is represented. In the absence of long training data, the driest available scenario will be used as the baseline for inflow calculation.
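The dry-day baseline idea can be sketched as follows. The readings are invented, and the real model combines this kind of statistic with manual labelling and longer training periods:

```python
from statistics import median

# Hypothetical readings: (weekday 0=Monday, pumped volume, was_dry).
readings = [
    (0, 100.0, True), (0, 104.0, True), (0, 310.0, False),  # rain day
    (1, 95.0, True),  (1, 99.0, True),
]

def baseflow_per_weekday(readings):
    """Estimate expected baseflow V_n per weekday from dry days only,
    so that wet-weather inflow does not skew the baseline."""
    dry = {}
    for weekday, volume, was_dry in readings:
        if was_dry:
            dry.setdefault(weekday, []).append(volume)
    return {wd: median(vols) for wd, vols in dry.items()}

baseflow = baseflow_per_weekday(readings)
```

Volumes above the weekday baseline are then interpreted as inflow/infiltration, as described above.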
For a detailed explanation of SewerIntelligence outputs and how to interpret them, see: SewerIntelligence — Understanding the Output.
Identifying likely sources of inflow
Starting from a pump station where inflow is to be analysed, you can navigate directly to all upstream components connected to the selected pump. The analysis shows, among other things, distribution of ownership, estimated rehabilitation costs for selected elements, risk calculation per pipe component (based on component properties), age, material types, and other relevant parameters.
Components with the highest probability of failure are marked in red on the map and summarised in an associated table. In the work of reducing inflow, these calculations can be used as a basis for:
- Planning and prioritising inspections, with a focus on high-risk components
- Identifying new measurement points to detect inflow (e.g. at network branches to confirm or rule out inflow in delimited areas)
- Planning rehabilitation with assessment of cost and benefit