In Dataflow Gen2, I am extracting Business Central (BC365) data from an advanced/custom API. The data is salesInvoiceLines from BC (the parent of this entity is "salesInvoices").
There are 350,000 rows in salesInvoiceLines, so this is not a lot of data.
Even though there are not many rows, the Dataflow Gen2 refresh has extremely poor performance. By comparison, pulling GLentries (8 million rows) works easily just by selecting the GLentry API/table.
But for the 350,000 rows of salesInvoiceLines, the refresh can't even complete. Note that the query works when I select only, say, 10 rows, so I am completely sure the query itself is not the reason.
So my question is: how do I optimize the performance? Incremental load is not an option, and I have already tried staging enabled/disabled as well as Fast Copy. What am I doing wrong?
Here are my steps; I strongly suspect that the "Unpivot" and "Expand" steps are the problem. In the applied steps, I first navigate to "salesInvoices" (the parent) and then expand "salesInvoiceLines". Note that I cannot simply go straight to salesInvoiceLines, as I would then get this error: "(You must specify an Id or a Document Id (parent id from 'salesInvoices') to get the lines.)"
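For context, the shape of my query is roughly like the sketch below (simplified; the endpoint URL, tenant, and expanded column names are placeholders, not my actual custom API):

let
    // Connect to the Business Central API endpoint (placeholder URL and tenant).
    Source = OData.Feed("https://api.businesscentral.dynamics.com/v2.0/<tenant>/Production/api/v2.0", null, [Implementation = "2.0"]),
    // Navigate to the parent entity.
    salesInvoices = Source{[Name = "salesInvoices"]}[Data],
    // Expand the child lines from the navigation column (column list is illustrative);
    // the Unpivot step then runs on the expanded columns.
    salesInvoiceLines = Table.ExpandTableColumn(
        salesInvoices,
        "salesInvoiceLines",
        {"lineObjectNumber", "description", "quantity", "unitPrice", "amountIncludingTax"}
    )
in
    salesInvoiceLines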
Thanks.
My problem is solved, thank you so much.
It sounds like you're running into a performance bottleneck with Dataflow Gen2, so let me offer you some suggestions for optimizing your data flow:
Given the structure of your data and the requirement to navigate through "salesInvoices" to access "salesInvoiceLines", apply filters as early as possible in your dataflow to reduce the amount of data being processed. For example, filtering "salesInvoices" (e.g., by date) before expanding "salesInvoiceLines" means far fewer parent rows have to be expanded.
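As a rough sketch of that idea in the advanced editor (the endpoint URL, the invoiceDate column, and the expanded column names are assumptions about your custom API, so adjust them to your schema):

let
    Source = OData.Feed("https://api.businesscentral.dynamics.com/v2.0/<tenant>/Production/api/v2.0", null, [Implementation = "2.0"]),
    salesInvoices = Source{[Name = "salesInvoices"]}[Data],
    // Filter the parent BEFORE expanding the lines, so the predicate can fold
    // back to the source as an OData $filter and fewer invoices get expanded.
    FilteredInvoices = Table.SelectRows(salesInvoices, each [invoiceDate] >= #date(2024, 1, 1)),
    // Expand only the line columns that are actually needed downstream.
    Lines = Table.ExpandTableColumn(FilteredInvoices, "salesInvoiceLines", {"description", "quantity", "unitPrice"})
in
    Lines

Whether the filter actually folds back to the source depends on the connector and your custom API, so it is worth verifying once added.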
Perform unpivot operations as early as possible and only on the necessary columns. This can help reduce the data volume and improve performance.
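For illustration only (the column names are examples, not your real schema), a narrow unpivot could look like this:

let
    // Keep only the columns that are needed, then unpivot just the measure columns,
    // so the unpivot produces far fewer attribute/value rows.
    UnpivotLines = (expandedLines as table) as table =>
        let
            Trimmed = Table.SelectColumns(expandedLines, {"documentId", "description", "quantity", "unitPrice"}),
            Unpivoted = Table.UnpivotOtherColumns(Trimmed, {"documentId", "description"}, "Measure", "Value")
        in
            Unpivoted
in
    UnpivotLines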
If possible, process the data in smaller batches. You can create multiple dataflows, each handling a subset of the data, and then combine the results.
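One rough way to sketch the batching (the endpoint, the invoiceDate field, and the date boundaries are placeholders) is a parameterized slice query, with the slices kept as separate queries/dataflows or combined at the end:

let
    // Loads one date slice of invoice lines.
    GetLinesForRange = (fromDate as date, toDate as date) as table =>
        let
            Source = OData.Feed("https://api.businesscentral.dynamics.com/v2.0/<tenant>/Production/api/v2.0", null, [Implementation = "2.0"]),
            salesInvoices = Source{[Name = "salesInvoices"]}[Data],
            Filtered = Table.SelectRows(salesInvoices, each [invoiceDate] >= fromDate and [invoiceDate] < toDate),
            Lines = Table.ExpandTableColumn(Filtered, "salesInvoiceLines", {"description", "quantity", "unitPrice"})
        in
            Lines,
    // Combine a few slices here, or give each slice its own query/dataflow instead.
    Combined = Table.Combine({
        GetLinesForRange(#date(2023, 1, 1), #date(2024, 1, 1)),
        GetLinesForRange(#date(2024, 1, 1), #date(2025, 1, 1))
    })
in
    Combined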
Regards,
Nono Chen
If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.
I did everything, and smaller batches do work, but only occasionally; it is highly unstable. Also, the total dataset is only 350k rows, so I don't see why this should be an issue. If that's the case, Dataflow Gen2 is really not a scalable or viable solution for data ingestion, especially given the transient errors and debug codes that are very hard to understand. Any other suggestions?
I understand your frustration. If you have already done these things, then you might want to consider whether it is related to capacity.
Capacity issues can sometimes affect the performance of Dataflow Gen2. Make sure the capacity backing your Dataflow Gen2 has enough resources to handle your data volume; insufficient capacity can cause performance degradation or job failures.
You can get detailed refresh activity information in the Monitoring Center, including the status, start and end time, duration, and CU usage of each activity. This may be helpful to you.
If you are still not sure what the issue is, you can create a support ticket and a dedicated Microsoft engineer will resolve the issue for you.
It would be great if you could share the root cause or solution in this thread once you find it, to help others with similar problems.
Power BI Support: https://powerbi.microsoft.com/en-us/support/
For how to create a support ticket, please refer to "How to create a support ticket in Power BI" on the Microsoft Fabric Community.
Thank you for your understanding.
Regards,
Nono Chen
If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.