Showing posts with label OneLake. Show all posts

Wednesday, 30 April 2025

MS Fabric OneLake Shortcuts

"Shortcuts in Microsoft OneLake allow you to unify your data across domains, clouds, and accounts by creating a single virtual data lake for your entire enterprise." (MS Learn)

A shortcut lets data in an open storage format stay in the source system: only metadata is added to OneLake, and the data can then be queried from Fabric. The query load falls predominantly on the source system, e.g., Dataverse/Dynamics.

Clarification: A shortcut is automatically added to MS Fabric for each Dataverse. Dataverse creates Parquet files (an estimated 5-10% extra data storage, which counts against Dataverse storage). Via the shortcut, report writers or data engineers can access the Dataverse data as though it were inside MS Fabric's OneLake.

In short: Dataverse writes Parquet files, and MS Fabric reads them to generate dataset data.

"Shortcuts are objects in OneLake that point to other storage locations." (MS Learn)

External shortcuts (data is held at the source system) support any open storage format, including:

  • Apache Iceberg tables via Snowflake
  • Parquet files on Snowflake
  • Microsoft Dataverse
  • Azure Data Lake Storage (ADLS)
  • Google Cloud Storage
  • Databricks
  • Amazon S3 (including Iceberg tables)
  • Apache Spark (Iceberg)

Internal shortcuts supported:
  • SQL Databases: Connect to SQL databases within the Fabric environment.
  • Lakehouses: Reference data within different lakehouses.
  • Warehouses: Reference data stored in data warehouses.
  • Kusto Query Language (KQL) Databases: Connect to data stored in KQL databases.
  • Mirrored Azure Databricks Catalogs: Access data from mirrored Databricks catalogs.
I think these are also Internal shortcuts:
  • PostgreSQL
  • MySQL
  • MongoDB
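
Since a shortcut is just a named pointer to another storage location, it can be described by a small piece of metadata. The sketch below builds the request body I would expect to send to the Fabric REST API to create a shortcut to an ADLS Gen2 folder. The class `AdlsShortcut`, the helper `shortcut_endpoint`, and all IDs and URLs in the usage note are hypothetical, and the endpoint and field names reflect my understanding of the OneLake shortcuts API, so check them against the MS Learn reference before relying on them. No network call is made.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AdlsShortcut:
    """Hypothetical helper describing a shortcut to an ADLS Gen2 folder."""

    name: str           # shortcut name as it will appear in the lakehouse
    path: str           # where it appears in the item, e.g. "Files" or "Tables"
    location: str       # storage account URL (assumed value in the example)
    subpath: str        # container/folder inside the storage account
    connection_id: str  # ID of an existing Fabric connection (assumed)

    def to_request_body(self) -> dict:
        # Field names are assumed from the Create Shortcut REST API.
        return {
            "path": self.path,
            "name": self.name,
            "target": {
                "adlsGen2": {
                    "location": self.location,
                    "subpath": self.subpath,
                    "connectionId": self.connection_id,
                }
            },
        }


def shortcut_endpoint(workspace_id: str, item_id: str) -> str:
    """Build the (assumed) URL to POST a new shortcut to."""
    return (
        "https://api.fabric.microsoft.com/v1/"
        f"workspaces/{workspace_id}/items/{item_id}/shortcuts"
    )


if __name__ == "__main__":
    shortcut = AdlsShortcut(
        name="crm_raw",
        path="Files",
        location="https://example.dfs.core.windows.net",
        subpath="/container/crm",
        connection_id="00000000-0000-0000-0000-000000000000",
    )
    print(shortcut_endpoint("<workspace-id>", "<lakehouse-id>"))
    print(json.dumps(shortcut.to_request_body(), indent=2))
```

Note that only the pointer (name, path, and target) is sent to Fabric; the data itself stays in the storage account.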

Friday, 28 March 2025

Power BI Premium to MS Fabric Primer

Power BI Premium allows all users in your enterprise to consume (use) reports; licences for the report builders are bought separately.

Two methods of getting report data: Import Mode (data is not live/real-time, but retrieval is fast) and DirectQuery Mode (real-time data, but retrieval is slow).

All Power BI Premium subscriptions will automatically become MS Fabric during 2025. 

Direct Lake Mode

An additional method of querying, "Direct Lake Mode," combines the best aspects of both older methods: the data is real-time and retrieval is fast.

OneLake

  • Storage is in Delta Parquet 
  • Data is stored once, along with permissions, when copied into Fabric; the individual Lakehouses, warehouses, and transformations still rely on the original Parquet file data.
  • Shortcuts create a virtual pointer to a variety of data sources such as Snowflake, ADLS, ...
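
Because everything in OneLake, including shortcut targets, is addressed like a folder in one data lake, a table can be located with a single path. The sketch below builds such a path, assuming OneLake's ADLS-compatible `abfss://` addressing; the function name `onelake_abfss_path` and the workspace, lakehouse, and table names are hypothetical, and names containing spaces or special characters may need GUIDs instead.

```python
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_abfss_path(workspace: str, item: str, item_type: str,
                       relative_path: str) -> str:
    """Build an abfss:// path to a file or table folder in OneLake.

    workspace     -- workspace name (hypothetical in the example below)
    item          -- item name, e.g. a lakehouse
    item_type     -- item type suffix, e.g. "Lakehouse" or "Warehouse"
    relative_path -- path inside the item, e.g. "Tables/account"
    """
    # Strip stray slashes so the pieces join cleanly.
    rel = relative_path.strip("/")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}/{rel}"


if __name__ == "__main__":
    path = onelake_abfss_path("Sales", "Crm", "Lakehouse", "Tables/account")
    print(path)
    # In a Fabric notebook, where a `spark` session is already available,
    # the Delta table behind that path could then be read with:
    #   df = spark.read.format("delta").load(path)
```

The same path shape applies whether the folder holds data copied into Fabric or a shortcut to data held elsewhere, which is the point of the "one copy" idea.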

Great Visual Descriptions of the 3 options: Comprehensive Guide to Direct Lake Datasets in Microsoft Fabric

Wednesday, 15 November 2023

Ignite 2023 - Microsoft Fabric - Introduction

GA: Prepare your data for AI innovation with Microsoft Fabric—now generally available | Microsoft Fabric Blog

Microsoft Fabric is a unified platform: everything is brought in and made available for analysis in a single service.

OneLake - per Fabric instance.  Stores all data within the SaaS data lake (scales itself), automatically indexes data, and abides by AIP rules/labels.  Intelligent data foundations.

All data is held in the Delta Parquet format (same format for any source).  Data is ready to use.  One copy of data.

SaaS single service, no need to bring pieces together; data doesn't need to be moved from one source to another to slice it.  Data stays at the original source but can be worked with; this all falls under the OneLake concept.  Can query using multiple approaches.  Create a shortcut to files/folders/Databricks, and it becomes part of OneLake while the underlying data resides in the original location that is now linked (only works on Parquet and specific file types).

Mirroring in MS Fabric - get the same benefits as shortcuts, but can connect to databases including Snowflake, Dataverse, AWS S3 buckets & Cosmos DB.  Mirroring is always up to date in real time.  Data is stored in Delta Parquet format, so it can be used straight away.  With these 2 approaches you can use nearly any source.  There are lots of connectors, so you could use: Dataverse, Cosmos DB, Snowflake, SQL Server, blobs on S3, ...  You can then write queries across all the data.

Copilot in Microsoft Fabric will help bring in and analyse all the data.

Copilot for Power BI is impressive for building reports, but I need to play with it more.

Power BI is basically becoming Microsoft Fabric. The report generation piece is still called Power BI, but it falls under the MS Fabric product. Licensing for Power BI Pro is converted to MS Fabric, and you cannot stay on Power BI Premium.

MS Fabric has a new way to access data. It is impressive in that it is fast, real-time, stores data once, and carries ACLs/permissions with a lot of the data.  The ETL capabilities are amazing and configured for development.

MS Fabric also supports Real-Time Intelligence (RTI) and integration with Azure SQL.

Last updated: Feb 2025