Saturday, 28 June 2025

UK Railway Industry for Dummies focusing on Rail Infrastructure

Rail assets are organised into large, hierarchical asset classes that are interdependent and together make up a rail system. Within each class, detailed lower-level assets are defined using taxonomies and ontologies tailored to each jurisdiction within the rail industry.  The operation and interaction of railway assets must conform to stringent rail regulations, and safety is a massive focus.

Taxonomy organises data hierarchically, while ontology models both hierarchies and complex relationships between entities and their properties. In the rail industry, ontologies are crucial for successfully modelling assets.
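
To make the difference concrete, here is a minimal sketch. The asset names and the predicates (isPartOf, supports, protects, poweredBy) are made up for illustration and are not taken from any published rail ontology. The taxonomy is a plain tree, while the ontology is a list of subject-predicate-object triples, which can express relationships that are not parent/child.

```javascript
// Taxonomy: a pure parent/child hierarchy (hypothetical asset classes).
const taxonomy = {
  name: "Rail Infrastructure",
  children: [
    { name: "Track", children: [{ name: "Rail", children: [] }, { name: "Sleeper", children: [] }] },
    { name: "Signalling", children: [{ name: "Signal", children: [] }] },
  ],
};

// Ontology: subject-predicate-object triples, so relationships other than
// "is a child of" can be expressed (predicates are illustrative only).
const ontology = [
  ["Rail", "isPartOf", "Track"],
  ["Sleeper", "supports", "Rail"],
  ["Signal", "protects", "Track"],
  ["Signal", "poweredBy", "Electrification"],
];

// Walk the taxonomy and return the path from the root to a named class.
function pathTo(node, target, trail = []) {
  const here = [...trail, node.name];
  if (node.name === target) return here;
  for (const child of node.children) {
    const found = pathTo(child, target, here);
    if (found) return found;
  }
  return null;
}

// Return every triple in which an entity appears as subject or object.
function relationsOf(entity) {
  return ontology.filter(([subject, , object]) => subject === entity || object === entity);
}

console.log(pathTo(taxonomy, "Sleeper").join(" > "));
console.log(relationsOf("Track"));
```

The tree can only answer "where does this sit in the hierarchy?", whereas the triples can also answer "what else touches this asset?".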

The picture shows examples of significant assets (high-level)

Main Railway Infrastructure Assets high-level overview.

An AI-generated image to explain commonly used railway terms.




The railways consist of "rolling stock, rail infrastructure, and environment"; these components have multiple relationships with one another.
1. Rolling stock is the trains.
2. Rail Infrastructure relates to: 
    2.1. Electrification/power/energy, generally used for power supply for signalling, train power, and telecoms.
    2.2. Telecommunication covers track-to-control and train-to-control communications, including sensors and IoT devices.
    2.3. Signalling keeps trains safely separated: it tells a train whether there is a train ahead of it and when to slow down.
    2.4. Track Engineering, also known as Rail Engineering and The Permanent Way, involves the rails, connectivity, support, extensive physics and geometry, steel rail installation and joining, ballast (the bed of crushed stone on which the sleepers sit), drainage, substructure, and sleepers. It gets detailed, down to how rails are joined (for example, fishplated joints) and even the welding process used, as well as fastening types, baseplates, and off-track maintenance such as hedge trimming (you won't believe the rules unless you work in the rail industry) ...
3. The environment refers to the conditions that existed before the railway, including the topography and type of terrain, bridges, and rivers.

The interdependencies within the rail industry are perfect for numerous AI scenarios.  As with all AI, you need high-quality data, and it must be secured appropriately.  Bringing together the information from the various business functions allows for automation, ML, AI and better decision-making.

Each country or jurisdiction has different rules for trains, and operators must comply with Health, Safety, and Environment (HSE) regulations.  Industry rules are adapted to each jurisdiction, and standards vary between regions.  For example, most jurisdictions have a track gauge requirement; in the UK, the standard gauge is 4 feet 8 1/2 inches (1435mm).  There are exceptions, such as heritage railways in the UK.  There are manufacturing standards for everything.  EN 13674, published in the UK as BS EN 13674, is the European standard for the steel rails themselves, including the steel grades used to manufacture the track to be installed.
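
As a quick worked check of the gauge figure, the imperial measurement converts to just over the nominal metric value. This is a small sketch; the function name is my own.

```javascript
const MM_PER_INCH = 25.4; // exact by definition

// Convert a feet-and-inches measurement to millimetres.
function toMillimetres(feet, inches) {
  return (feet * 12 + inches) * MM_PER_INCH;
}

// 4 ft 8 1/2 in is 56.5 inches.
const gaugeMm = toMillimetres(4, 8.5);

console.log(gaugeMm.toFixed(1)); // one decimal place
console.log(Math.round(gaugeMm)); // nearest whole millimetre
```

The result is 56.5 x 25.4 = 1435.1 mm, which rounds to the 1435 mm quoted for standard gauge.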

ISO 55000/1/2 addresses Physical Asset Management.  Building Information Modelling (BIM) enhances the design and construction process, and both apply to Rail Infrastructure.  There is generally a disconnect between Asset Management and BIM, and Industry Foundation Classes (IFC) try to help by providing a standardised set of asset definitions for the railway business; the current version is IFC 4.3.

  

References used: 

Permanent Way Institution (2023) Understanding Track Engineering. London: The PWI. Available at: https://www.thepwi.org/product/understanding-track-engineering/ (Accessed: 4 July 2025)

Camarazo, D., Roxin, A. and Lalou, M. (2024) Railway systems’ ontologies: A literature review and an alignment proposal. 26th International Conference on Information Integration and Web Intelligence (iiWAS2024), Bratislava, Slovakia, December 2024. Available at: https://ube.hal.science/hal-04797679/document (Accessed: 4 July 2025).

Network Rail (2021) Asset Management: Weather Resilience and Climate Change Adaptation Plan. London: Network Rail. Available at: https://www.networkrail.co.uk/wp-content/uploads/2021/11/Asset-Management-WRCCA-Plan.pdf (Accessed: 4 July 2025).


Thursday, 26 June 2025

openBIM for AEC understanding

Within the AEC industry, standards are necessary to ensure that all project stakeholders are speaking the same language, thereby improving collaboration.  We can also process data to automate various processes if the data is standardised.

BIM (Building Information Modelling) is used to improve collaboration on infrastructure projects.  ISO 19650 is the international standard for managing information using BIM, and BIM adoption is commonly described in levels.

Building Models contain 3D information that shows how assets fit together.  Each of these assets may contain properties that can be used for clash detection.  Think of a CAD diagram: it lays out the plans for a building so all parties can see the proposed plan.  As CAD technology advances, you can add more information about the project.  For example, as an electrician, I only want to see the layers that affect my work.  CAD can be extended further to show product and material information.
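
A minimal sketch of what clash detection boils down to, assuming each element is reduced to an axis-aligned bounding box. Real BIM tools work on full geometry with tolerances, and the element ids and coordinates below are invented for illustration.

```javascript
// Each box is { min: [x, y, z], max: [x, y, z] }, all in the same units.
function boxesClash(a, b) {
  // Two boxes overlap only if their extents overlap on every axis.
  return [0, 1, 2].every((i) => a.min[i] < b.max[i] && b.min[i] < a.max[i]);
}

// Compare every pair of elements and collect the ids of those that clash.
// The element shape { id, box } is a hypothetical simplification.
function findClashes(elements) {
  const clashes = [];
  for (let i = 0; i < elements.length; i++) {
    for (let j = i + 1; j < elements.length; j++) {
      if (boxesClash(elements[i].box, elements[j].box)) {
        clashes.push([elements[i].id, elements[j].id]);
      }
    }
  }
  return clashes;
}

const model = [
  { id: "duct-1", box: { min: [0, 0, 2], max: [5, 1, 3] } },
  { id: "beam-7", box: { min: [2, 0, 2.5], max: [3, 4, 3.5] } },
  { id: "pipe-3", box: { min: [6, 0, 0], max: [7, 1, 1] } },
];

console.log(findClashes(model)); // the duct and the beam overlap; the pipe is clear
```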

closedBIM: These were the original big BIM systems, including AutoCAD, Revit, and Bentley ProjectWise.  These tools feature visual editors and viewers, securely store the files needed for a project, and ensure that the appropriate people have access.  Each has its own proprietary formats.

openBIM: Lets you read other parties' data, which improves collaboration and consensus.  It also makes it easier to switch tools to reduce costs or get better features.  It consists of:

  1. IFC (common language)
  2. bSDD (industry common language)
  3. IDS (Requirement specification)
  4. BCF (check)
  5. openCDE (sharing with APIs)

Industry Foundation Classes (IFC) serve as the basis for standardising how information is handled.  IFC classes cover Materials, Geometry, and Spatial Structures, and include standards for location, such as geographic information.  Each industry adds to these base IFC classes.  The buildingSMART Data Dictionary (bSDD) complements IFC for specialised industries and sectors, publishing more specific, agreed-upon classifications and properties.  

Project Requirements: These can vary, but having an agreed-upon format, such as an Information Delivery Specification (IDS), is helpful. IDS is not mandatory and not yet widely used, but it ensures that requirements are stated precisely, so all collaborating parties clearly understand what is needed.

IDS can reference bSDD definitions, which build on IFC, so that the requirement specifications are precisely laid out.
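
The idea of a machine-checkable requirement can be sketched as follows. This is a heavily simplified stand-in: a real IDS is an XML document with a richer structure, and the requirement and element shapes below are my own invention. IfcDoor, Pset_DoorCommon and FireRating are used only as familiar IFC names.

```javascript
// A requirement: every element of a given class must carry a given property.
const requirement = {
  appliesTo: "IfcDoor",
  propertySet: "Pset_DoorCommon",
  property: "FireRating",
};

// Hypothetical, flattened view of model elements.
const elements = [
  { id: "D1", ifcClass: "IfcDoor", psets: { Pset_DoorCommon: { FireRating: "FD30" } } },
  { id: "D2", ifcClass: "IfcDoor", psets: { Pset_DoorCommon: {} } },
  { id: "W1", ifcClass: "IfcWall", psets: {} },
];

// Check only the elements the requirement applies to and report pass/fail per element.
function checkRequirement(req, items) {
  return items
    .filter((element) => element.ifcClass === req.appliesTo)
    .map((element) => {
      const value = element.psets[req.propertySet]?.[req.property];
      return { id: element.id, passed: value !== undefined && value !== "" };
    });
}

console.log(checkRequirement(requirement, elements));
```

Because the requirement is data rather than prose, every party can run the same check against their own model before handing it over.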

openCDE defines technical interfaces, .....

Thursday, 5 June 2025

AI Vendor Management - Formiti

AI is going crazy. You can build your own, but generally you need to look at a supplier, so it's worth understanding vendor management. As the controller using their service, you are at risk if they do not make their AI operations transparent.  It's a big business risk to my clients.  

GDPR is closely linked to AI, and if you use a service/vendor, the reputational risk and the risk of fines may still fall on you as the controller.  You need visibility into each vendor and how they are using AI; they in turn use their own vendors, so it's a nicely complex dependency problem.  You need to be aware of what you are relying on.

Ensure contracts with vendors cover AI: how they process your data and how their sub-processors do the same.

For example, we track website customer behaviour and use a vendor to clean up the data.  Without visibility, I have no idea whether they are using AI outside of the UK or EU.  Follow the dependency chains, as all of this needs to be transparent to the end customer if required.
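
Following the dependency chain is a graph walk. Here is a minimal sketch, assuming you keep a register of vendors and their sub-processors; the vendor names, fields and approved regions are hypothetical.

```javascript
// Hypothetical vendor register: where each vendor processes data, whether it
// uses AI, and which sub-processors it passes data on to.
const vendors = {
  "analytics-co": { region: "UK", usesAI: false, subProcessors: ["cleanse-ltd"] },
  "cleanse-ltd": { region: "EU", usesAI: true, subProcessors: ["model-host"] },
  "model-host": { region: "US", usesAI: true, subProcessors: [] },
};

const APPROVED_REGIONS = new Set(["UK", "EU"]);

// Walk the whole chain from one vendor and flag anything that needs attention.
function findRisks(start) {
  const risks = [];
  const seen = new Set();
  const stack = [[start, [start]]];
  while (stack.length > 0) {
    const [name, chain] = stack.pop();
    if (seen.has(name)) continue; // avoid loops in circular vendor relationships
    seen.add(name);
    const vendor = vendors[name];
    if (!vendor) {
      risks.push({ vendor: name, chain, reason: "not in register" });
      continue;
    }
    if (vendor.usesAI && !APPROVED_REGIONS.has(vendor.region)) {
      risks.push({ vendor: name, chain, reason: `AI processing in ${vendor.region}` });
    }
    for (const sub of vendor.subProcessors) stack.push([sub, [...chain, sub]]);
  }
  return risks;
}

console.log(findRisks("analytics-co"));
```

The direct vendor looks fine on its own; the risk only shows up two hops down the chain, which is the point of walking it.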

Monday, 2 June 2025

Copilot Studio 2025 Notes

Copilot Studio is fantastic and the AI integration is excellent, but the licensing is complex. Architects really need to understand licensing and billing, or AI costs will get out of control.  The Purview and governance features look very good.  There is also a Copilot Studio Cost Estimator (in preview, June 2025).

MS Build 2025: 

MCP Server in Preview - possible to collect data from other AI services or write back.

Connector Kit - So, you can add custom connectors from Power Platform Connectors, including Copilot Studio - great stuff.

Agent Flow - Added functionality to Power Automate flows (Copilot Studio aware), deployed via solutions.

Note: The M365 Agent Toolkit appears to be an interesting tool that allows agents to perform tasks using Office add-ins with VS Code.

Licensing

You need to be aware:

  • M365 agents - require all end users to have M365 Copilot licences, retailing at $30/user/month.  Alternatively, users can consume the agents using a PAYG model per message (it racks up quickly).  I can add these to MS Teams, and it appears that people with licences can ask the M365 agent, while others can view the results (I need to do more testing to understand this).
  • Copilot Studio - Requires a Copilot Studio maker licence at $30 retail. Users don't need a licence to use it, but you pay per message, and this can rack up quickly, so watch your usage. Buying bulk message credits can help reduce costs.
  • Each prompt generates multiple messages, which are all billable (complex to calculate).
  • If you use Copilot Studio and it calls Azure AI Foundry, it also bills tokens (also complex to estimate).
  • Copilot Studio calls AI Foundry through a Premium connector.
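
A back-of-the-envelope sketch of why per-message billing racks up. Every number here is a placeholder I picked for the arithmetic, not a published Microsoft rate; plug in your own figures or use the Cost Estimator.

```javascript
// Estimate monthly billable messages and cost.
// All inputs are assumptions supplied by the caller, not official pricing.
function estimateMonthlyCost({ promptsPerDay, messagesPerPrompt, pricePerMessage, days = 30 }) {
  const messages = promptsPerDay * messagesPerPrompt * days;
  return { messages, cost: messages * pricePerMessage };
}

const estimate = estimateMonthlyCost({
  promptsPerDay: 500, // placeholder usage
  messagesPerPrompt: 4, // placeholder: one prompt fans out into several billable messages
  pricePerMessage: 0.01, // placeholder rate
});

console.log(estimate.messages);
console.log(estimate.cost.toFixed(2));
```

With these placeholder inputs, 500 prompts a day become 60,000 billable messages a month, which is why the messages-per-prompt multiplier matters more than the headline rate.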

Monday, 26 May 2025

Playwright Post 6 - Automating MFA login for unattended Playwright runs against Canvas apps

Overview:  Modern security makes automating logins requiring MFA rather difficult.  This post looks at possible approaches to automate the login.

Option 1. Turn off MFA. Not really, but you can set a Conditional Access rule in Entra ID to not perform MFA. This is not an option in many enterprises.

Option 2. Time-based One-Time Password (TOTP). Microsoft Authenticator makes this pretty difficult. At least I can't do it, as the APIs are relatively limited. This is kind of expected, as it's a security measure.

Option 3. Programmatically acquire an access token without browser automation, using MSAL with a client secret or certificate (for confidential clients). 
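
A minimal sketch of Option 3. To stay dependency-free it calls the Microsoft identity platform v2.0 token endpoint directly with the client credentials grant instead of going through MSAL, so treat it as an illustration of the flow rather than the MSAL API. The config field names are my own, and it assumes Node 18+ for the built-in fetch. Note that this grant returns an app-only token with no signed-in user, so it only helps where the target API accepts application permissions.

```javascript
// Build the token request for the OAuth 2.0 client credentials grant.
function buildClientCredentialsRequest({ tenantId, clientId, clientSecret, scope }) {
  return {
    url: `https://login.microsoftonline.com/${tenantId}/oauth2/v2.0/token`,
    body: new URLSearchParams({
      grant_type: "client_credentials",
      client_id: clientId,
      client_secret: clientSecret,
      scope, // for this grant, typically "<resource>/.default"
    }),
  };
}

// Send the request and return the access token from the JSON response.
async function acquireAppToken(config) {
  const { url, body } = buildClientCredentialsRequest(config);
  // A URLSearchParams body is sent as application/x-www-form-urlencoded.
  const response = await fetch(url, { method: "POST", body });
  if (!response.ok) throw new Error(`Token request failed: ${response.status}`);
  const json = await response.json();
  return json.access_token;
}
```

Read the secret from an environment variable or a secret store when calling acquireAppToken, never from a file in the repo.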

Option 4.  Use Playwright to record the login and intercept the access token once logged in.  Then you can store it and use it.  There are a few easy options to get the session:

4.1. Retrieve the access token from the response once logged in

4.2. Retrieve from your local storage:

  // Read the ID token that ADAL stores in the browser (local or session storage)
  const token = await page.evaluate(() => {
    return window.localStorage.getItem('adal.idtoken') || window.sessionStorage.getItem('adal.idtoken');
  });

4.3. Retrieve the token using Playwright at the command run level
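
Option 4.1 can be sketched like this. It assumes the app's login makes a POST to the Entra ID v2.0 token endpoint from the browser and that the response body is JSON containing access_token; check the network tab for your app, as the URL fragment used here is an assumption. The function takes a Playwright Page, so there is nothing to import.

```javascript
// Wait for the token response during login and pull the access token out of it.
// `page` is a Playwright Page. The URL fragment is an assumption about the login flow.
async function captureAccessToken(page) {
  const response = await page.waitForResponse(
    (r) => r.url().includes("/oauth2/v2.0/token") && r.request().method() === "POST"
  );
  const body = await response.json();
  return body.access_token;
}
```

Start waiting before you trigger the login so the response is not missed: create the promise with captureAccessToken(page), perform the login steps, then await the promise.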

Note: Don't save the token to your repository. Also be aware that the Access/Bearer token will expire, depending on what your Entra ID sets. The default is about 1 hour.

Option 4.3.1. Like option 4.3, but use the refresh token to silently generate a new access token. You store the refresh token during the recorded login (by default, it lasts for 90 days) and use it to generate a new access token when needed.

Option 4.3.2.  Take it a step further back: use the authorisation code you get at the original login to obtain the refresh token, renew the refresh token, and generate a new access token to run your tests.

If you decide to store your access token, refresh token or code, don't store them in your code repo.  You know why, if you've made it this far.

Thought: as a refresh token is valid for 90 days on a sliding window, I've never used option 4.3.2. By storing the refresh token, all I need to do is use it to get an access token, and the refresh token is then good for 90 days from that point. 
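
A minimal sketch of the refresh step in option 4.3.1, calling the v2.0 token endpoint with the refresh_token grant. It assumes a public client registration that is allowed to redeem the refresh token outside the browser, and Node 18+ for the built-in fetch. The config field names and the five-minute safety margin are my own choices.

```javascript
// Build the token request for the OAuth 2.0 refresh token grant.
function buildRefreshRequest({ tenantId, clientId, refreshToken, scope }) {
  return {
    url: `https://login.microsoftonline.com/${tenantId}/oauth2/v2.0/token`,
    body: new URLSearchParams({
      grant_type: "refresh_token",
      client_id: clientId,
      refresh_token: refreshToken,
      scope,
    }),
  };
}

// True when the access token expires within the safety margin (times in ms since epoch).
function isExpiring(expiresAtMs, nowMs = Date.now(), marginMs = 5 * 60 * 1000) {
  return expiresAtMs - nowMs <= marginMs;
}

// Exchange the stored refresh token for a new access token.
async function refreshTokens(config) {
  const { url, body } = buildRefreshRequest(config);
  const response = await fetch(url, { method: "POST", body });
  if (!response.ok) throw new Error(`Refresh failed: ${response.status}`);
  const json = await response.json();
  return {
    accessToken: json.access_token,
    // Keep the replacement refresh token if one is returned, else keep the current one.
    refreshToken: json.refresh_token ?? config.refreshToken,
    expiresAtMs: Date.now() + json.expires_in * 1000,
  };
}
```

Before each test run, call isExpiring on the stored expiry and only call refreshTokens when it returns true, then write the returned refresh token back to your secret store.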

This is the plan I'm thinking of using:

Tuesday, 20 May 2025

Entra AAD Security Groups - Remember

Overview: I have lost count of the number of poor Active Directory and Azure Active Directory setups I have seen.  I don't think I've ever seen a good one, actually; certainly nothing large, over 5K users. 

I'm working with a multinational, and we need to improve the security.  Things are a little all over the place, oddly named and inconsistent; basically the norm for a 300k internal user enterprise with history and multiple acquisitions.

I identified a couple of properties that would create a really nice hierarchy; the issue is that I would need more than the allowed 5k Dynamic AAD Security Groups.  

Group Types to be aware of relating to Entra

1. Static AAD Security Groups

You have to add the users manually, or automate the process for anything but the smallest of Entra tenants.

Static AAD Security groups can be nested.

2. Dynamic AAD Security Groups

Up to 5,000 dynamic groups.

You can inherit Security groups or be inherited (no nesting).

3. Distribution AAD Groups

Used for email and calendars, not security.

4. O365 Groups/Teams Groups

They can inherit O365 groups or AAD Security groups.  They are managed within the org, so it is not the best idea to place heavy security on manually managed teams. 

Resolution:

I have a full hierarchy of users within divisions and subdivisions.  I add users statically, via automation, to their lowest-level AAD Security Group, and then nest each child group in its parent group.  This gives me multiple groups that contain more and more users as we go up the hierarchy.  Additive groups with positive security give me the best options.  
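
The roll-up this produces can be sketched as follows. Group and user names are hypothetical, and this only models the membership maths; it does not call Entra or Microsoft Graph.

```javascript
// Hypothetical hierarchy: users sit only in the lowest-level groups,
// and each parent group nests its child groups.
const groups = {
  "SG-Company": { users: [], children: ["SG-Engineering", "SG-Finance"] },
  "SG-Engineering": { users: [], children: ["SG-Eng-Track", "SG-Eng-Signalling"] },
  "SG-Eng-Track": { users: ["alice", "bob"], children: [] },
  "SG-Eng-Signalling": { users: ["carol"], children: [] },
  "SG-Finance": { users: ["dave"], children: [] },
};

// Resolve everyone who is effectively a member of a group via nesting.
function effectiveMembers(name, seen = new Set()) {
  if (seen.has(name)) return new Set(); // guard against circular nesting
  seen.add(name);
  const group = groups[name];
  const members = new Set(group.users);
  for (const child of group.children) {
    for (const user of effectiveMembers(child, seen)) members.add(user);
  }
  return members;
}

console.log([...effectiveMembers("SG-Eng-Track")]);
console.log([...effectiveMembers("SG-Engineering")]);
console.log(effectiveMembers("SG-Company").size);
```

Granting access at SG-Engineering reaches every engineering user without touching individual memberships, and the automation only ever writes to the leaf groups.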

Future Wishes:

If only Entra supported more Dynamic AAD Groups per tenant, or allowed Dynamic groups to be nested in static AAD groups.



Monday, 12 May 2025

Playwright Post 5 - Understanding how Playwright Works

Playwright as a tool consists of two main parts.

Part 1: Playwright Library: This automates the browser, and test code is commonly organised using the Page Object Model (POM). It provides a uniform API to run against the 3 main browser engines (Chromium, Firefox, and WebKit), automating tasks like navigating, clicking, filling in form data, and validating content on a web page. Classes include APIRequest, APIResponse, and BrowserContext. The worker process runs the API calls sequentially. Unified library API calls are sent to the browser context, which runs unaware of the calling context.  

The test script runs in Node.js and makes library API calls; there is no shared timing between the Node.js process (the controller) and the browser instance (for example, a running Chromium instance).
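
A minimal Page Object Model sketch to show the shape of those library calls. The class takes a Playwright Page; the selectors and the /login path are hypothetical. Each awaited call is one command sent from the Node.js side to the browser, in order.

```javascript
// Page object wrapping a login screen. `page` is expected to be a Playwright Page.
// Selectors and the /login path are made up for illustration.
class LoginPage {
  constructor(page) {
    this.page = page;
  }

  // Navigate the browser to the login screen.
  async open(baseUrl) {
    await this.page.goto(`${baseUrl}/login`);
  }

  // Fill in the form and submit; each step completes before the next is sent.
  async signIn(username, password) {
    await this.page.locator("#username").fill(username);
    await this.page.locator("#password").fill(password);
    await this.page.locator("button[type=submit]").click();
  }
}
```

A test would create the object with new LoginPage(page) and call open and signIn, keeping selectors out of the test body so a UI change is fixed in one place.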

Part 2: Test Runner: This part runs the Playwright tests.


Playwright Series