Current State

With the current state of Fabric (and Data Factory and SQL Server Agent jobs), we almost always need to execute pipelines in sequences, which brings several disadvantages. In the worst case we end up with a monolithic parent pipeline that has to ensure every other pipeline runs in the required order. This complicates development and makes rolling changes out into existing environments harder. It also sometimes forces us (without a handcrafted job-control solution) to execute weekly or monthly jobs even when they are not needed. And even with a handcrafted job-control framework, we have to build patterns that wait for the successful execution of other pipelines or dataflows.

Cons of the Current State

From my point of view, there are several cons to this approach:

- No flexibility in dataflow design.
- Schedules are not objects in which multiple jobs can be started; they are more like a property tied to a single object.
- Unnecessary executions of pipelines just to keep them in sequence.
- Overhead of developing a job-control framework individually (we could actually sell ours :D).
- Big monolithic parent pipelines.
- A lot of time spent figuring out which pipelines are needed and where a newly developed pipeline fits in.

External Dependencies

There is a way around these issues. To be fair, it has some disadvantages too, but I think the advantages are much more appealing and outweigh them. Some tools already allow this, and there is room to improve on them. It is about so-called "external dependencies" and different dependency types.

For example, say we have three pipelines across two business domains:

- pip1_monthly_etl_dom_sales_orders_fact
- pip2_daily_etl_dom_sales_dim_customers
- pip3_daily_etl_dom_production_planning_fact

Let's say production_planning_fact needs sales order information and sales customer data, and that the data arrives at different points in time (a reality of our life). This is why we want to model a dependency from one task to another task that may reside elsewhere, instead of invoking pipelines from parent pipelines. So for pip3_daily_etl_dom_production_planning_fact we could say something like:

- Look up whether pip1_monthly_etl_dom_sales_orders_fact has executed successfully since the start of this month.
- Look up whether pip2_daily_etl_dom_sales_dim_customers has executed within a given timeframe (last 24 hours, same day as this task, etc.), and wait until it has completed.

If we fully embrace the idea of a layered design combined with business domains, we would need concepts like those in AUTOMIC or Apache Airflow. We could design one scheduler per layer-and-domain combination, where all of that combination's jobs reside. Objects in other layers or domains could then declare a dependency on them without us even knowing. Lineage would also become more telling, so we could see which data-processing tasks are not yet done, which domains need to be informed about incomplete data, and so on.
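To make that concrete, here is roughly how Apache Airflow expresses this kind of cross-pipeline dependency with its ExternalTaskSensor. This is only a minimal sketch: it assumes a recent Airflow 2.x, that the three pipelines are wrapped as DAGs with the same names, that pip2 is scheduled two hours before pip3, and that the monthly DAG's logical date falls on the first of the month (all assumptions on my side, not requirements).

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.empty import EmptyOperator
from airflow.sensors.external_task import ExternalTaskSensor

with DAG(
    dag_id="pip3_daily_etl_dom_production_planning_fact",
    start_date=datetime(2024, 1, 1),
    schedule="0 6 * * *",  # daily at 06:00
    catchup=False,
):
    # "Soft" monthly dependency: check that the sales orders fact already has
    # a successful run for the current month instead of re-running it.
    wait_for_sales_orders = ExternalTaskSensor(
        task_id="wait_for_pip1_monthly_sales_orders_fact",
        external_dag_id="pip1_monthly_etl_dom_sales_orders_fact",
        # Map this run's logical date to the monthly run's logical date
        # (assumes the monthly DAG is scheduled on the 1st of the month).
        execution_date_fn=lambda logical_date: logical_date.replace(
            day=1, hour=0, minute=0, second=0, microsecond=0
        ),
        allowed_states=["success"],
        mode="reschedule",  # release the worker slot while waiting
        timeout=6 * 60 * 60,
    )

    # "Hard" daily dependency: wait until today's customer dimension load
    # has completed successfully.
    wait_for_customers = ExternalTaskSensor(
        task_id="wait_for_pip2_daily_dim_customers",
        external_dag_id="pip2_daily_etl_dom_sales_dim_customers",
        execution_delta=timedelta(hours=2),  # assumes pip2 runs 2 h earlier
        allowed_states=["success"],
        mode="reschedule",
        timeout=6 * 60 * 60,
    )

    # Placeholder for the actual production planning load.
    load_fact = EmptyOperator(task_id="load_production_planning_fact")

    [wait_for_sales_orders, wait_for_customers] >> load_fact
```

The point is not the specific tool but the modeling style: pip3 declares what it depends on and how long it is willing to wait, and no parent pipeline needs to know about it.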
Idea 1: Scheduler and Environment

- A scheduler is a job that runs permanently and invokes multiple pipelines at different timestamps, with the possibility to add maintenance plans.
- Environments, as known from SSIS, would be a great way to define sets of parameters used when triggering pipelines.
- Prioritizing pipelines or schedulers would help ensure the most important jobs run first (just a priority field on a scheduler or a pipeline).
- Connections should be parameterizable objects (not directly tied to this :D).
- A new object, instead of invoking a pipeline: an external dependency or pipeline dependency with different wait types and options (run on complete, don't run on error, etc.).
- The current logging of a pipeline's last state could be used to drive dependency types (hard, soft, etc.); a hand-rolled version of such a lookup is sketched at the end of this post.

What's Needed

What would be needed for this is a bit of a redesign of how executions are triggered and scheduled within Data Factory.

Cons of External Dependencies

Regarding the cons:

- We would need to think much more modularly than we do today.
- Unless a tool can detect dependencies inside a stored procedure (scanning for FROMs and JOINs is hard, especially with CTEs and nested joins), developers have to model the dependencies themselves, which increases the risk of errors.
- Fragmented jobs.
- Deployments become somewhat harder.
- Resource-intensive waits and many jobs running in parallel could consume "too many" CUs.

Conclusion

Sorry for making this a bit longer, but I wanted to get my point across: ever since the days of SQL Server Agent jobs, the missing piece for me has often been a more flexible way to wait for other jobs, instead of having to think in sequentially processed items.
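For completeness, here is the hand-rolled run-history lookup mentioned above, as a minimal sketch only: it uses the azure-mgmt-datafactory Python SDK against a classic Data Factory (Fabric pipelines expose a different API, so treat this as an illustration of the pattern, not a drop-in solution), and the subscription, resource group, and factory names are placeholders.

```python
import time
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import RunFilterParameters, RunQueryFilter

# Placeholders: fill in your own Azure resources.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
FACTORY_NAME = "<data-factory-name>"

client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)


def wait_for_pipeline(pipeline_name: str,
                      lookback: timedelta,
                      poll_seconds: int = 60,
                      timeout: timedelta = timedelta(hours=6)) -> bool:
    """Block until `pipeline_name` has a successful run within `lookback`.

    Returns True as soon as a successful run is found, False if `timeout`
    expires first. A "soft" dependency would simply check once and not wait.
    """
    deadline = datetime.now(timezone.utc) + timeout
    while datetime.now(timezone.utc) < deadline:
        now = datetime.now(timezone.utc)
        runs = client.pipeline_runs.query_by_factory(
            RESOURCE_GROUP,
            FACTORY_NAME,
            RunFilterParameters(
                last_updated_after=now - lookback,
                last_updated_before=now,
                filters=[RunQueryFilter(operand="PipelineName",
                                        operator="Equals",
                                        values=[pipeline_name])],
            ),
        )
        if any(run.status == "Succeeded" for run in runs.value):
            return True
        time.sleep(poll_seconds)
    return False


# pip3 only starts once its external dependencies are satisfied
# (a real framework would use "start of this month" for pip1 rather than 31 days).
if (wait_for_pipeline("pip1_monthly_etl_dom_sales_orders_fact", timedelta(days=31))
        and wait_for_pipeline("pip2_daily_etl_dom_sales_dim_customers", timedelta(hours=24))):
    print("Dependencies met, trigger pip3_daily_etl_dom_production_planning_fact")
```

This is exactly the kind of boilerplate a first-class external dependency object in Fabric or Data Factory would make unnecessary.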