Thursday, 9 October 2025

Medallion Architecture in Fabric: a High-Level Organisation Design Pattern

Microsoft Fabric is excellent!  We still need to follow the good practices we have been using for years, such as making data accessible and secure.  Possibly the most used architecture for big data is the Medallion Architecture pattern: data is normally ingested in a fairly raw format into the bronze layer, then transformed in the silver layer into more meaningful and usable information. Lastly, the gold layer exposes data relationally to reporting tools using semantic models.

Overview: This document outlines my attempt to organise enterprise data in MS Fabric using a Medallion Architecture based on Fabric workspaces.  Shortcuts are better than imported data, but it depends on factors such as what the data source is, what data we need, how up-to-date the data must be, and the performance requirements of the systems involved.

The reports and semantic models can get data from other workspaces at any of the medallion layers.  This architecture lends itself well to using the new Direct Lake Query mode.
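The bronze-to-silver-to-gold flow can be sketched in plain code. This is a toy illustration with invented data; in Fabric these steps would typically be Spark notebooks writing Delta tables into Lakehouses in the relevant workspaces.

```python
# Toy medallion pipeline: bronze (raw, warts and all) -> silver
# (cleaned and conformed) -> gold (aggregated for reporting).
raw_events = [  # bronze: ingested as-is
    {"id": "1", "amount": "12.50", "country": "uk"},
    {"id": "2", "amount": "bad",   "country": "UK"},
    {"id": "3", "amount": "7.25",  "country": "de"},
]

def to_silver(rows):
    """Clean and conform: drop bad rows, normalise types and codes."""
    silver = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue  # a real pipeline would quarantine this row
        silver.append({"id": r["id"], "amount": amount,
                       "country": r["country"].upper()})
    return silver

def to_gold(rows):
    """Aggregate into a reporting-friendly shape (per-country totals)."""
    totals = {}
    for r in rows:
        totals[r["country"]] = totals.get(r["country"], 0.0) + r["amount"]
    return totals

gold = to_gold(to_silver(raw_events))
print(gold)  # {'UK': 12.5, 'DE': 7.25}
```

The point of the layering is that each hop has one job: bronze preserves the source, silver makes it trustworthy, gold makes it consumable by semantic models.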

Summary of a Design used by a Large Enterprise:

Medallion architecture using Fabric Workspaces.

Friday, 26 September 2025

Microsoft Fabric High-level architecture

Overview: Microsoft Fabric is an end-to-end analytics platform that unifies data movement, storage, processing, and visualisation. It integrates multiple services into a single SaaS experience, enabling organisations to manage their entire data lifecycle in one place.  OneLake is at the core of MS Fabric.

Image 1. One page High-Level Architecture of MS Fabric. 

European Fabric Conference in Vienna, Sept 2025: takeaways

FabConEurope25 in Vienna last week was terrific.  A great opportunity to meet Fabric and data experts and speak to the product teams, and the presentations were fantastic.  The hardest part was deciding which session to attend, as so many run at the same time.

My big takeaways:
  • Fabric SQL is excellent.  The HA, managed service, redundancy, and log shipping keep OneLake in near real-time.  Fabric SQL supports new native geospatial types.  SQL has temporal tables (old news), but row-, column- and object-level (incl. table) security is part of OneLake.  There are a couple of things security reviewers will query, but they are addressed.
  • Fabric Data Agent is interesting.  Connect to your SQL relational data and work with it.
  • User-defined functions (UDFs), including Translytical (write-back), HTTP in or out, wrapping stored procedures, notebooks, ... - amazing.
  • OneLake security is complex but can be understood, especially through its containers/layers: Tenant, Workspace, Item, and Data.  More is needed, but it's miles ahead of anything else, and Graph is the magic, so it will only continue to improve.  Amazing, but understand security.  Embrace Entra and OAuth; use keys only as a last resort.
  • Snowflake is our friend.  Parquet is fantastic, and Snowflake, including Iceberg, plays well with MS Fabric.  New versions of Delta Parquet are on the way (and this will make Fabric even stronger, supporting both existing and the latest formats).
  • Mirroring and shortcuts - don't ETL unless you need to: prefer a shortcut, then mirroring, then ETL.
  • Use workspaces to build out simple medallion architectures.
  • AI Search/Vector Search and SQL are crazy powerful.
  • New map functionality has arrived and more is arriving on Fabric.  Org Apps for Maps is going to be helpful in the map space.  PMTiles are native... (if you know, you know)
  • Dataverse is great with Fabric and shortcuts, as I learned from Scott Sewell at an earlier conference.  OneLake coupled with Dataverse is massively underutilised by most orgs.
  • Power BI also features new Mapping and reporting capabilities related to geospatial data.
  • Other storage: Cosmos DB (it has its place, and suddenly, with shortcuts, its biggest issue - cost - can be massively reduced with the right design decisions).  Postgres is becoming a first-class citizen, which is excellent on multiple levels.  The CDC stuff is fantastic already.
  • RTI on Fabric is going to revolutionise OpenTelemetry and AI, networking through the OSI model, application testing, digital twins, live monitoring, ...  I already knew this, but it keeps getting better.  EventHub and notebooks are my new best friends.  IoT is the future; we all knew this, but with Fabric it will be much easier to implement safely and get early value.
  • Direct Lake is a game changer for Power BI - not new, but it just keeps getting better and better thanks to MS Graph.
  • Managed Private Endpoints have improved and should be part of every company's governance.
  • Purview... It's excellent and solves/simplifies DLP, governance and permissions.  Even though I know more than most about DLP and governance, I'm out of my depth on Fabric Purview; hire one of those key folks from Microsoft here.
  • Warehouse lineage of data is so helpful.  
  • We need to understand Fabric Digital Twins, as it is likely to be a competitor or a solution we offer and integrate. 
  • Parquet is brilliant and fundamentally is why AI is so successful.
  • Powerful stuff in RDF for modelling domains - this is going to be a business in itself.  I'm clueless here, but I won't be in a few weeks.
Now the arr..
  • Pricing and capacity are not transparent.  Watch out for the unexpected monster bill!  That said, the monitoring and controls are in place, but having my tenant switched off if workloads aren't correctly set out doesn't sit well with me.  Resource governance at the workspace level will help; until then, fix the situation or design around it, but it will be more expensive.
  • Workspace resource reservation does not exist yet; however, it can be approximated using multiple Fabric tenants.  Cost control will improve significantly once workspace-level resource management arrives.
  • Licensing needs proper thought for an enterprise, including ours.  Reserved Fabric capacity is roughly 40% cheaper but cannot be suspended, so use it just as you would reserved capacity for most Azure services.  Good design results in much lower workload costs.  Once again, those who genuinely understand know my pain with the workload costs.
  • Vendors and partners are too far behind (probably due to the pace of innovation).
Microsoft Fabric is brilliant; it sits under one simple managed autoscaling umbrella.  It integrates and plays nicely with other solutions, has excellent access to Microsoft storage, and is compatible with most others.  Many companies will move onto Fabric or increase their usage in the short term, as it is clearly the leader in multiple Gartner segments, all under one hood.  AI will continue to drive its adoption by enterprises.
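The "shortcut, then mirror, then ETL" ordering from the takeaways can be captured as a tiny decision helper. The criteria names below are my own illustrative assumptions, not Fabric rules:

```python
def choose_integration(source_in_onelake_format: bool,
                       source_supports_mirroring: bool,
                       needs_heavy_transformation: bool) -> str:
    """Pick the cheapest integration style first: shortcut, then mirror,
    then ETL. The three criteria are invented for illustration."""
    if needs_heavy_transformation:
        return "ETL"          # data must be reshaped on the way in
    if source_in_onelake_format:
        return "shortcut"     # reference data in place, no copy
    if source_supports_mirroring:
        return "mirror"       # near-real-time managed replication
    return "ETL"              # fall back to copying and transforming

print(choose_integration(True, False, False))   # shortcut
print(choose_integration(False, True, False))   # mirror
```

The order matters because each step up the ladder adds cost and latency: a shortcut copies nothing, mirroring copies continuously but is managed for you, and ETL you own end to end.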

Sunday, 17 August 2025

What is GIS?

GIS stands for Geographic Information Systems, which are tools and techniques for capturing, managing, storing, processing, and analysing spatial data. It is part of the broader geospatial technology ecosystem, which also includes drones, remote sensing, and GPS.

Geospatial data (Raw)

Definition: Any data that includes a geographic component describing the location and attributes of features on Earth - raw information, such as points, lines, and polygons, with a real-world location associated with it.
Examples: A GPS position of a car or the address of a customer.

GIS data (Organised)

Definition: Geospatial data that is structured, stored, and analysed using Geographic Information System software.
Examples: a digital map of roads created from GPS data, or layers of data showing flood-risk areas.

Summary: Geospatial data is the foundation: It is the raw material for all things spatial. GIS is a toolset that may include tools like ArcGIS from Esri.

Other:
In the AEC space, building and asset management rely heavily on GIS alongside BIM.
ArcGIS is the industry leader in GIS tooling and comes in three versions: 
  • Desktop (ArcGIS Pro, Arc Toolbox, ArcCatalog),
  • Server, 
  • SaaS: ArcGIS Online (AGOL).

What do WGS84 and GeoJSON mean?

These are the most common standards for recording position (WGS84) and for encoding shape data with coordinates (GeoJSON).

WGS84 (World Geodetic System 1984) is the standard geographic coordinate reference system used globally. It represents positions on Earth using latitude and longitude in decimal degrees.

GeoJSON is a widely used format for encoding geographic data structures in JSON. According to RFC 7946, all GeoJSON coordinates must use WGS84 (EPSG:4326).
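Putting the two together: a minimal GeoJSON Feature, with the coordinates in WGS84. Note the RFC 7946 ordering of [longitude, latitude], which regularly trips people up. The location and property values are just example data:

```python
import json

# A minimal GeoJSON Feature. Per RFC 7946 coordinates are WGS84
# (EPSG:4326) and ordered [longitude, latitude] - not lat/long!
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [16.3738, 48.2082]},
    "properties": {"name": "Vienna"},
}

encoded = json.dumps(feature)          # GeoJSON is just JSON on the wire
decoded = json.loads(encoded)
lon, lat = decoded["geometry"]["coordinates"]
print(lon, lat)  # 16.3738 48.2082
```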

Thursday, 7 August 2025

GitHub Copilot with Claude Sonnet 4 is amazing, and GPT 5 is even better

I couldn't sleep, so I decided to build a Pulumi C# application that uses an existing MCP server. My forms will utilise the client to allow me to access my Azure subscriptions and resources - wow.  I built a really cool tool quickly - Claude Sonnet 4 is once again significantly better than GPT-4.1 for programming with GitHub Copilot.

Update Sept 2025: I'm now using GPT-5 over Claude Sonnet with GitHub Copilot when programming in VS Code.  Both feel about the same quality to me.

GitHub has a page for comparing AI models for GHCP, which is very useful.

I am using GPT-5-Codex, which "is a version of GPT-5 optimised for agentic coding in Codex".

I am also really liking GitHub Copilot code review.

Anthropic's Claude 4.5 is also excellent.

Wednesday, 30 July 2025

AI for developers and Architects

The cost of prototypes is unbelievably low using AI. 

Rapidly creating a prototype, especially with new or less well-known technology, is where I derive significant benefits from AI.

How to build application prototypes?

  1. Write/reverse-prompt/adjust instructions into a Markdown (md) file
  2. Use agentic AI (specialising in doc extraction) to extract and refine from the md file
  3. Run using an IDE-based copilot (VS Code with GitHub Copilot, Amazon Q, Cursor, Windsurf, Streamlit) 
Thoughts: Developers are adjusting to using AI to support software solutions.  The developer role will continue the trend of making technical implementation more accessible, allowing knowledgeable IT engineers or domain experts to build faster and better than citizen/amateur developers.  AI assists in complex decisions!

What needs to improve?
  • Knowledge is key.  AI needs narrow expertise at the right time, i.e. only domain knowledge, not influenced by other data.  The quality of the data used to train matters, and allows for dynamic reasoning.
  • Session/long-term context agreement/understanding, to improve the understanding between your IDE and me.  Remember how I prompt, and provide feedback on how I digest information.  Context between the human developer and the AI is paramount.
  • Control of IDE integration with coding copilots, with clear returns to the developer so they can make better decisions.  Context is paramount.
  • Governance & data (connectors, APIs, code for complex processes (maybe MCP), quality of data).

Retrieval-Augmented Generation (RAG)
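RAG retrieves the most relevant documents at query time and feeds them into the prompt, so the model answers from your data rather than from memory alone. A toy sketch, with naive word-overlap scoring standing in for the embedding/vector search a real system would use; the documents are invented:

```python
# Toy RAG flow: retrieve the most relevant documents, then build an
# augmented prompt for the model.
DOCS = [
    "Fabric shortcuts reference data in OneLake without copying it.",
    "GeoJSON encodes geographic features using WGS84 coordinates.",
    "Direct Lake lets Power BI query Delta tables without import.",
]

def retrieve(query, docs, k=1):
    """Rank docs by naive word overlap with the query (vector-search stand-in)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query):
    """Prepend the retrieved context so the model answers from our data."""
    context = "\n".join(retrieve(query, DOCS))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What coordinates does GeoJSON use?"))
```

Swap the retriever for embeddings plus a vector index and the print for an LLM call, and this is the whole pattern.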


Model Context Protocol (MCP)

MCP is a protocol (created by Anthropic) that allows an MCP client to connect to an MCP server, which exposes specialist knowledge and capabilities. Authentication uses OAuth to secure access.

My applications/agents use MCP to ask the MCP server, 'What can you do?' so that they know how to use the MCP server.

The MCP server, when built, informs the client of its capabilities and then performs actions, such as updates, using an API.

Summary: Use MCP to allow the client to talk to other resources/tools.
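The 'What can you do?' handshake can be sketched as capability discovery. The real protocol is JSON-RPC over stdio or HTTP; the class and method names below are invented to show the shape of the interaction, not the actual wire format:

```python
# Toy MCP-style interaction: the client discovers the server's tools
# first, then invokes them by name. All names here are invented.
class ToyMCPServer:
    def __init__(self):
        self._tools = {
            "get_weather": lambda city: f"Sunny in {city}",
        }

    def list_tools(self):               # client asks "what can you do?"
        return list(self._tools)

    def call_tool(self, name, *args):   # client invokes a capability
        return self._tools[name](*args)

class ToyMCPClient:
    def __init__(self, server):
        self.server = server
        self.tools = server.list_tools()  # discover capabilities up front

    def use(self, name, *args):
        if name not in self.tools:
            raise ValueError(f"server does not offer {name!r}")
        return self.server.call_tool(name, *args)

client = ToyMCPClient(ToyMCPServer())
print(client.tools)                         # ['get_weather']
print(client.use("get_weather", "Vienna"))  # Sunny in Vienna
```

The key idea survives the simplification: the client never hard-codes what the server can do; it asks, then acts on the answer.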

Agents-to-agent (A2A) 

A2A allows agents to work together.  Two agents can leverage each other: one agent solves the issue and returns the answer for the first agent to use.  Whereas MCP allows an agent to speak to a source, A2A lets agents complete a task and hand the result back to the calling agent.
 
Summary: Use A2A to talk to specialised agents; the specialised agent returns its answer to the calling agent.
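The A2A hand-off can be sketched as simple delegation between two objects. Everything here is invented for illustration, including the fixed exchange rate:

```python
# Toy agent-to-agent delegation: a general agent hands a sub-task to a
# specialist and uses the returned answer. Names and rate are invented.
class SpecialistAgent:
    """Knows one narrow thing: currency conversion at an assumed rate."""
    def solve(self, task):
        amount = task["amount_gbp"]
        return {"amount_eur": round(amount * 1.15, 2)}  # assumed GBP->EUR rate

class GeneralAgent:
    def __init__(self, specialist):
        self.specialist = specialist

    def handle(self, amount_gbp):
        # Delegate the sub-problem, then carry on with the result.
        answer = self.specialist.solve({"amount_gbp": amount_gbp})
        return f"That costs about {answer['amount_eur']} EUR"

agent = GeneralAgent(SpecialistAgent())
print(agent.handle(100))  # That costs about 115.0 EUR
```

In a real A2A setup the two agents would be separate services exchanging structured messages, but the contract is the same: the caller sends a task, the specialist sends back an answer.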

Wednesday, 2 July 2025

Artificial Intelligence as a mega trend

Overview of AI

The world has gone mad for AI.  I constantly see overhype and poor messaging, leading to a misunderstanding of its potential.

AI is not new; the ability to use it for commercial gain is new at the scale we now have.  AI excels at helping us identify patterns, gather information, and is primarily a decision support system (DSS).

AI is not intelligent, but rather good at making complex decisions, and it has biases that we teach it.

This means AI is useful for specialisation, not a generalisation of "smartness".  Now that ChatGPT et al. are wide-ranging, people assume they are general-purpose tools.  Actually, ChatGPT is a specialist in breaking down and grouping language based on a data source.  Those of us in the technology industry pretty much know that ChatGPT is a good Google (search engine).

So, what is AI going to be successful at?  Well, this is my prediction:

AI will have a massive impact on many industries:

1. Healthcare - guess what? More surgeons and people will be needed, not fewer.  Here, I focus on healthcare examples.  People need to interact with others; avatars are a joke - I can talk to Alexa already.  There is very little to nothing in this space except for snake oil salesmen.  Please prove me wrong! More skilled people are needed.

2. Software Development/IT - This is a significant one.  Programmers' roles will change significantly; those with a good understanding and deep knowledge will thrive, while those with superficial knowledge and no ability to truly understand and work through challenges will likely disappear.  Technologists will focus on challenging problems and introduce significant improvements to all business processes.  The amount of software will continue to grow.  There is not a lot of agentic, "smart AI" in the space, and we are 50 years away from it, imo.

3. Manufacturing - it won't make the impact that the media says it will.  We are good at manufacturing.  The sub-functions that will benefit include machine maintenance, sensor usage, and performance/behaviour monitoring.  This will allow you to improve machine maintenance (MM) and scheduling.  Think of railway lines; they need to be shut down, and it costs millions to trim hedges.  Imagine now that you know the level crossing "lifty uppy-downy" box/bar is showing signs of fatigue.  Shift the fix left and save the unscheduled breakdown; the train line and its knock-on effects shall see massive improvement.  We are already proficient in manufacturing and, to some extent, automation; if the AI is not significantly better, it is not worthwhile.  More skilled people are needed.

Machine Maintenance in Manufacturing.  AI is needed to mature MM. 

Techniques such as defect detection are already well-established using visual AI at the micron level.  Rubbish detection.  Using AI will be beneficial - sure, it will become cheaper and easier to acquire this system capability - but AI is merely an enabler, and it has been available for well over a decade.  More skilled people are needed.

4. Service Industry - Robots serving people? Please, it's mad, except at MacyD's (McDonald's), and honestly, minimum wage workers are pretty efficient there, and it would be too sterile.  Pushing out patties - well, if you need AI for this, you don't know what AI tries to do.  AI & automation are already in the processing and packaging processes.  The big stuff with AI will be in social media and advertising (and don't get me started there; automated advertising will absolutely fail - we need to invent a missile to destroy non-human posts).  More people will be required in these new and changed services.

Analogy:
1. Old technology: Hand-weaving material was a big, profitable business in Britain.  Along came looms; the workers got upset, broke the looms, and ended up in prison or broken - these were the Luddites (who refused to embrace technology).  The Luddites ended up broke, and all this could have been avoided by embracing technology, as they were the most knowledgeable about materials and production.  They were the natural experts.

2. Trend jumpers: Too many companies wanted to build looms; a handful of players did brilliantly and still exist today.  Think Microsoft and AWS; they are transitioning from being programming technology companies to AI technology companies.  They still solve the same problem of process improvement.  The weavers who decided to go into building and repairing looms did exceptionally well, but ultimately ran out of demand, and their prices were driven down by an excess of supply.  Still a good change, though many people also got hurt here.  Be careful inventing new technology in processes: get it right and you are a hero; get it wrong and go find a new job.  Lots of sales silver bullets are being produced.  There are tons of "AI experts", but mostly this is absolute rubbish.  With rare exceptions, you are not an AI expert unless AI was in your job description more than 5 years ago.  Beware the snake oil salesmen; nowadays they come in many forms, sizes and shapes :)

3. Embrace change:  Normal, common-sense (smart) people realised they actually had 4 options:

  1. Learn how to use a loom.  Use the technology available and use it to build garments faster;
  2. Build looms and support the loom business;
  3. Do nothing, continue to offer hand-weaving labour to the market.  So take your pension and hope like hell you win the lottery (I'm still backing this option for myself); or
  4. Expert hand craftsmen or women :) Become the best hand weaver in the world, and people pay you for your expertise; these people's descendants/businesses still exist.  But big surprise: it's hard, it takes a long time, and it's unlikely to make you rich.  So, sure, go do this if you are a genius in your field and love it, but don't die of surprise when you go broke or don't get the return you deserve for all that hard work.

Summary: Embrace technology and AI; it is only a decision support system.  More skilled people are needed, and since you have the background, being professional and embracing change means you are more in demand.  Sitting on your backside waiting for the lottery means you are like 90% of people, and you'll get 2 jet skis and a new husband! Yippee.

Healthcare

Good Use Case: Diagnostic medicine

Diagnostic medicine has become the centre of healthcare, and AI, which is better at detecting abnormalities than the best radiologist when using a single trained model, yields results in near real-time.  This means that consultant radiologists and specialists can receive reports of unparalleled quality in seconds.  GPs have the best guess within seconds, rather than, well... we all know this.

AI also provides probabilities, so it's easy to prioritise any life-threatening report to a specialist, allowing them to focus on the most challenging work with the in-depth information provided by the AI.

This is possible because we are dealing with a relatively narrow field of data that we have taught the AI to handle.  Think of X-rays: the results are far superior to those of an expensive resource (a radiologist) who takes at least 12 years to train, and more to get brilliant.
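The probability-driven prioritisation is worth making concrete. A minimal sketch of triaging AI radiology reports so the most urgent findings reach a specialist first; all the data, the field names, and the threshold are invented for illustration:

```python
# Sketch of prioritising AI radiology reports by model probability.
# Data, field names and the cut-off are invented; a clinical team
# would set and validate any real threshold.
reports = [
    {"patient": "A", "finding": "nodule",   "p_abnormal": 0.42},
    {"patient": "B", "finding": "fracture", "p_abnormal": 0.97},
    {"patient": "C", "finding": "clear",    "p_abnormal": 0.03},
]

URGENT_THRESHOLD = 0.9  # assumed cut-off for "route to specialist now"

# Highest probability of abnormality first.
queue = sorted(reports, key=lambda r: r["p_abnormal"], reverse=True)
urgent = [r["patient"] for r in queue if r["p_abnormal"] >= URGENT_THRESHOLD]

print([r["patient"] for r in queue])  # ['B', 'A', 'C']
print(urgent)                         # ['B']
```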

Should we stop training Radiologists and diagnosticians and allocate our resources to AI?  Absolutely not!!   

Radiologists should use the AI reports: validating, using the information, and extrapolating.  When an issue is detected, it must be fed back into the learning model, improving the AI.  AI should not act; it must only be used to support.  Acting should be restricted to notifying relying parties such as GPs.

Good Use case: Online GP Appointments and triage

If you have an issue, you go onto an NHS app that asks for your symptoms and a few follow-up questions.  It will only give you its best guess (this is already amazing, imo), which in turn triages your call into "go to your emergency department, they know you are coming", "let's book you an emergency appointment", or "this is my recommendation, and why".  Dr Google actually becomes useful (weird medical insider joke).  Honestly, we could do so much more, but care is given to the right people - "shift left" (the sooner you catch it, the cheaper and better the solution, which 100% applies to healthcare).
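The three-way routing described above can be sketched as a tiny function. The severity score and thresholds are entirely invented; a real triage system would be clinically designed and validated:

```python
# Sketch of three-way triage routing. Score scale and thresholds are
# invented for illustration only.
def triage(severity_score: float) -> str:
    """Map a symptom-severity score in [0, 1] to one of three routes."""
    if severity_score >= 0.8:
        return "go to your emergency department, they know you are coming"
    if severity_score >= 0.5:
        return "let's book you an emergency appointment"
    return "this is my recommendation, and why"

print(triage(0.9))  # emergency department route
print(triage(0.2))  # recommendation route
```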

Preventive medicine and nudging technology will profoundly improve people's lives and lifestyles.  Hook into ambulance services and driverless automated vehicles... people do the hard stuff and make the decisions; AI does, efficiently and quickly, the piece we as humans aren't good at.  Hopefully, you are seeing the difference between narrow and wide industries.

Bad Examples: Robot Surgery or treatment rooms

Robots replacing people in operating theatres is insane!!  A surgeon could use AI to obtain better diagnostic data more quickly; they could even utilise AI-assisted operations, where the system sends messages if it detects that the actions are not optimal or that a priority has clearly changed.  AI is brilliant for decision support; it's not a good idea to try to automate complex decision-making.


This post does not look at Strategy and Purpose

All AI solutions give themselves an exponentially better chance of success, regardless of industry, if they have a strategy, a purpose, and FAIR data (Findable, Accessible, Interoperable, and Reusable).