Showing posts with label DAX. Show all posts
Showing posts with label DAX. Show all posts

01 June 2024

📊Graphical Representation: Graphics We Live By (Part VIII: List of Items in Power BI)

Graphical Representation Series
Graphical Representation Series

Introduction

There are situations in which one needs to visualize only the rating, other values, or ranking of a list of items (e.g. shopping cart, survey items) on a scale (e.g. 1 to 100, 1 to 10) for a given dimension (e.g. country, department). Besides tables, in Power BI there are 3 main visuals that can be used for this purpose: the clustered bar chart, the line chart (aka line graph), respectively the slopegraph:

Main Display Methods

Main Display Methods

For a small list of items and dimension values probably the best choice would be to use a clustered bar chart (see A). If the chart is big enough, one can display also the values as above. However, the more items in the list, respectively values in the dimension, the more space is needed. One can maybe focus then only on a subset of items from the list (e.g. by grouping several items under a category), respectively choose which dimension values to consider. Another important downside of this method is that one needs to remember the color encodings. 

This downside applies also to the next method - the use of a line chart (see B) with categorical data, however applying labels to each line simplifies its navigation and decoding. With line charts the audience can directly see the order of the items, the local and general trends. Moreover, a line chart can better scale with the number of items and dimension values.

The third option (see C), the slopegraph, looks like a line chart though it focuses only on two dimension values (points) and categorizes the line as "down" (downward slope), "neutral" (no change) and "up" (upward slope). For this purpose, one can use parameters fields with measures. Unfortunately, the slopegraph implementation is pretty basic and the labels overlap which makes the graph more difficult to read. Probably, with the new set of changes planned by Microsoft, the use of conditional formatting of lines would allow to implement slope graphs with line charts, creating thus a mix between (B) and (C).

This is one of the cases in which the Y-axis (see B and C) could be broken and start with the meaningful values. 

Table Based Displays

Especially when combined with color encodings (see C & G) to create heatmap-like displays or sparklines (see E), tables can provide an alternative navigation of the same data. The color encodings allow to identify the areas of focus (low, average, or high values), while the sparklines allow to show inline the trends. Ideally, it should be possible to combine the two displays.  

Table Displays and the Aster Plot

One can vary the use of tables. For example, one can display only the deviations from one of the data series (see F), where the values for the other countries are based on AUS. In (G), with the help of visual calculations one can also display values' ranking. 

Pie Charts

Pie charts and their variations appear nowadays almost everywhere. The Aster plot is a variation of the pie charts in which the values are encoded in the height of the pieces. This method was considered because the data used above were encoded in 4 similar plots. Unfortunately, the settings available in Power BI are quite basic - it's not possible to use gradient colors or link the labels as below:

Source Data as Aster Plots

Sankey Diagram

A Sankey diagram is a data visualization method that emphasizes the flow or change from one state (the source) to another (the destination). In theory it could be used to map the items to the dimensions and encode the values in the width of the lines (see I). Unfortunately, the diagram becomes challenging to read because all the lines and most of the labels intersect. Probably this could be solved with more flexible formatting and a rework of the algorithm used for the display of the labels (e.g. align the labels for AUS to the left, while the ones for CAN to the right).

Sankey Diagram

Data Preparation

A variation of the above image with the Aster Plots which contains only the plots was used in ChatGPT to generate the basis data as a table via the following prompts:

  • retrieve the labels from the four charts by country and value in a table
  • consolidate the values in a matrix table by label country and value
The first step generated 4 tables, which were consolidated in a matrix table in the second step. Frankly, the data generated in the first step should have been enough because using the matrix table required an additional step in DAX.

Here is the data imported in Power BI as the Industries query:

let
    Source = #table({"Label","Australia","Canada","U.S.","Japan"}
, {
 {"Credit card","67","64","66","68"}
, {"Online retail","55","57","48","53"}
, {"Banking","58","53","57","48"}
, {"Mobile phone","62","55","44","48"}
, {"Social media","74","72","62","47"}
, {"Search engine","66","64","56","42"}
, {"Government","52","52","58","39"}
, {"Health insurance","44","48","50","36"}
, {"Media","52","50","39","23"}
, {"Retail store","44","40","33","23"}
, {"Car manufacturing","29","29","26","20"}
, {"Airline/hotel","35","37","29","16"}
, {"Branded manufacturing","36","33","25","16"}
, {"Loyalty program","45","41","32","12"}
, {"Cable","40","39","29","9"}
}
),
    #"Changed Types" = Table.TransformColumnTypes(Source,{{"Australia", Int64.Type}, {"Canada", Int64.Type}, {"U.S.", Number.Type}, {"Japan", Number.Type}})
in
    #"Changed Types"

Transforming (unpivoting) the matrix to a table with the values by country:

IndustriesT = UNION (
    SUMMARIZECOLUMNS(
     Industries[Label]
     , Industries[Australia]
     , "Country", "Australia"
    )
    , SUMMARIZECOLUMNS(
     Industries[Label]
     , Industries[Canada]
     , "Country", "Canada"
    )
    , SUMMARIZECOLUMNS(
     Industries[Label]
     , Industries[U.S.]
     , "Country", "U.S."
    )
    ,  SUMMARIZECOLUMNS(
     Industries[Label]
     , Industries[Japan]
     , "Country", "Japan"
    )
)

Notes:
The slopechart from MAQ Software requires several R language libraries to be installed (see how to install the R language and optionally the RStudio). Run the following scripts, then reopen Power BI Desktop and enable running visual's scripts.

install.packages("XML")
install.packages("htmlwidgets")
install.packages("ggplot2")
install.packages("plotly")

Happy (de)coding!

29 April 2024

⚡️Power BI: Working with Visual Calculations (Part III: Matrix Tables with Square Numbers as Example)

Introduction

In the previous post I exemplified various operations that can be performed with visual calculations on simple tables based on square numbers. Changing the simple table to a matrix table doesn't bring any benefit. The real benefit comes when one restructures the table to store only a cell per row in a table. 

Data Modelling

For this the Magic5 table can be transformed via the following code, which creates a second table (e.g. M5):

M5 = UNION (
    SUMMARIZECOLUMNS(
     Magic5[Id]
     , Magic5[R]
     , Magic5[Index]
     , Magic5[C1]
     , "Col", "C1"
    )
    , SUMMARIZECOLUMNS(
     Magic5[Id]
     , Magic5[R]
     , Magic5[Index]
     , Magic5[C2]
     , "Col", "C2"
    )
    , SUMMARIZECOLUMNS(
     Magic5[Id]
     , Magic5[R]
     , Magic5[Index]
     , Magic5[C3]
     , "Col", "C3"
    )
    ,  SUMMARIZECOLUMNS(
     Magic5[Id]
     , Magic5[R]
     , Magic5[Index]
     , Magic5[C4]
     , "Col", "C4"
    )
    , SUMMARIZECOLUMNS(
      Magic5[Id]
     , Magic5[R]
     , Magic5[Index]
     , Magic5[C5]
     , "Col", "C5"
    )
)

Once this done, one can add the column [Col] as values for the matrix in a new visual. From now on, all the calculations can be done on copies of this visual. 

Simple Operations

The behavior of the RUNNINGSUM and other functions is different when applied on a matrix table because the formula is applied to every cell of the N*N table, a column with the result being added for each existing column of the matrix.

Moreover, there are four different ways of applying the formula based on the Axis used. ROW calculates the formula by the row within a column:

Run SumByRow(C) = RUNNINGSUM([C], ROWS)
Output:
R C Run Sum(C) C Run Sum(C) C Run Sum(C) C Run Sum(C) C Run Sum(C)
R1 18 18 25 25 2 2 9 9 11 11
R2 4 22 6 31 13 15 20 29 22 33
R3 15 37 17 48 24 39 1 30 8 41
R4 21 58 3 51 10 49 12 42 19 60
R5 7 65 14 65 16 65 23 65 5 65

By providing COLUMNS as parameter for the Axis makes the calculation run by the column within a row: 

Run SumByCol(C) = RUNNINGSUM([C], COLUMNS)
Output:
R C Run Sum(C) C Run Sum(C) C Run Sum(C) C Run Sum(C) C Run Sum(C)
R1 18 18 25 43 2 45 9 54 11 65
R2 4 4 6 10 13 23 20 43 22 65
R3 15 15 17 32 24 56 1 57 8 65
R4 21 21 3 24 10 34 12 46 19 65
R5 7 7 14 21 16 37 23 60 5 65

By providing ROW COLUMNS as parameter for the Axis makes the calculation run by the column and then continuing the next column (without resetting the value at the end of the column):
Run SumByRow-Col(C) = RUNNINGSUM([C],ROWS COLUMNS)
Output:
R C Run Sum(C) C Run Sum(C) C Run Sum(C) C Run Sum(C) C Run Sum(C)
R1 18 18 25 90 2 132 9 204 11 271
R2 4 22 6 96 13 145 20 224 22 293
R3 15 37 17 113 24 169 1 225 8 301
R4 21 58 3 116 10 179 12 237 19 320
R5 7 65 14 130 16 195 23 260 5 325

By providing COLUMNS ROWS as parameter for the Axis makes the calculation run by the row and then continuing the next row (without resetting the value at the end of the column):
Run SumByCol-Row = RUNNINGSUM([C],COLUMNS ROWS)
Output:
R C Run Sum(C) C Run Sum(C) C Run Sum(C) C Run Sum(C) C Run Sum(C)
R1 18 18 25 43 2 45 9 54 11 65
R2 4 69 6 75 13 88 20 108 22 130
R3 15 145 17 162 24 186 1 187 8 195
R4 21 216 3 219 10 229 12 241 19 260
R5 7 267 14 281 16 297 23 320 5 325

Ranking

RANK can be applied independent of the values, or considering the value with ASC or DESC sorting:
RankByRow = RANK(DENSE,ROWS) -- ranking by row independent of values
RankByRow ASC = RANK(DENSE,ROWS, ORDERBY([C],ASC)) -- ranking by row ascending
RankByRow DESC = RANK(DENSE,ROWS, ORDERBY([C], DESC)) -- ranking by row descending
RankByRow-Col ASC = RANK(DENSE,ROWS COLUMNS, ORDERBY([C],ASC)) -- ranking by row columns ascending
RankByRow-Col DESC = RANK(DENSE,ROWS COLUMNS, ORDERBY([C], DESC)) -- ranking by row columns ascending

[RankByRow-Col ASC] matches the actual numbers from the matrix and is thus useful when sorting any numbers accordingly. 

Differences

Differences can be calculated between any of the cells of the matrix:
DiffToPrevByRow = [C] - PREVIOUS([C])  -- difference to previous record
DiffToPrevByRow* = IF(NOT(IsBlank(PREVIOUS([C]))), [C] - PREVIOUS([C])) -- extended difference to previous record
DiffToPrevByRow-Col = [C] - PREVIOUS([C],, ROWS COLUMNS) -- difference to previous record by ROWS COLUMNS
DiffToFirstByRow = [C] - FIRST([C]) -- difference to first record
DiffToPrevByCol = [C] - FIRST([C], COLUMNS) -- difference to previous record COLUMNS

Ranking = RANK(DENSE, ROWS COLUMNS, ORDERBY([C], ASC)) -- ranking of values by ROWS COLUMNS
OffsetDiffToPrevByRow = [C] - calculate([C], OFFSET(1, ROWS, ORDERBY([Ranking],DESC))) -- difference to the previous record by ROW
OffsetDiffToPrevByRow-Col = [C] - calculate([C], OFFSET(1, ROWS COLUMNS, ORDERBY([Ranking],DESC))) -- difference to the previous record by ROW

Ranking has been introduced to facilitate the calculations based on OFFSET.

The other functions [1] can be applied similarly.

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Microsoft Learn (2024) Power BI: Using visual calculations [preview] (link)

⚡️Power BI: Working with Visual Calculations (Part II: Simple Tables with Square Numbers as Example)

Introduction

The records behind a visual can be mentally represented as a matrix, the visual calculations allowing to tap into this structure intuitively and simplify many of the visualizations used. After a general test drive of the functionality, it makes sense to dive deeper into the topic to understand more about the limitations, functions behavior and what it takes to fill the gaps. This post focuses on simple tables, following in a next post to focus on matrices and a few other topics. 

For exemplification, it makes sense to use a simple set of small numbers that are easy to work with, and magic squares seem to match this profile. A magic square is a matrix of positive sequential numbers in which each row, each column, and both main diagonals are the same [1]. Thus, a square of order N has N*N numbers from 1 to N*N, the non-trivial case being order 3. However, from the case of non-trivial squares, the one of order 5 provides a low order and allows hopefully the minimum needed for exemplification:

18252911
46132022
15172418
213101219
71416235
17131925

Data Modeling

One magic square should be enough to exemplify the various operations, though for testing purposes it makes sense to have a few more squares readily available. Each square has an [Id], [C1] to [C5] corresponds to matrix's columns, while [R] stores a row identifier which allows to sort the values the way they are stored in the matrix:

let
    Source = #table({"Id","C1","C2","C3","C4","C5","R"}
, {
{1,18,25,2,9,11,"R1"},
{1,4,6,13,20,22,"R2"},
{1,15,17,24,1,8,"R3"},
{1,21,3,10,12,19,"R4"},
{1,7,14,16,23,5,"R5"},
{2,1,7,13,19,25,"R1"},
{2,14,20,21,2,5,"R2"},
{2,22,3,9,15,16,"R3"},
{2,10,11,17,23,4,"R4"},
{2,18,24,5,6,12,"R5"},
{3,1,2,22,25,15,"R1"},
{3,9,10,16,11,19,"R2"},
{3,17,23,13,5,7,"R3"},
{3,24,12,6,20,3,"R4"},
{3,14,18,8,4,21,"R5"},
{4,22,6,3,18,16,"R1"},
{4,4,14,11,15,21,"R2"},
{4,5,8,12,23,17,"R3"},
{4,25,13,19,7,1,"R4"},
{4,9,24,20,2,10,"R5"},
{5,5,9,20,25,6,"R1"},
{5,13,15,2,11,24,"R2"},
{5,21,1,23,3,17,"R3"},
{5,19,18,4,14,10,"R4"},
{5,7,22,16,12,8,"R5"}
}
),
    #"Changed Type to Number" = Table.TransformColumnTypes(Source,{{"C1", Int64.Type}, {"C2", Int64.Type}, {"C3", Int64.Type}, {"C4", Int64.Type}, {"C5", Int64.Type}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type to Number",{{"Id", Order.Ascending}, {"R", Order.Ascending}}),
    #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1, Int64.Type)
in
    #"Added Index"

The column names and the row identifier could have been numeric values from 1 to 5, though it could have been confounded with the actual numeric values.

In addition, the columns [C1] to [C5] were formatted as integers and an index was added after sorting the values after [Id] and [R]. Copy the above code as a Blank Query in Power BI and change the name to Magic5. 

Prerequisites

For the further steps you'll need to enable visual calculations in Power BI Developer via:
File >> Options and settings >> Options >> Preview features >> Visual calculations >> (check)

Into a Table visual drag and drop [R], [C1] to [C5] as column and make sure that the records are sorted ascending by [R]. To select only a square, add a filter based on the [Id] and select the first square. Use further copies of this visual for further tests. 

Some basic notions of Algebra are recommended but not a must. If you worked with formulas in Excel, then you are set to go. 

In Mathematics a matrix starts from the top left side and one moves on the rows (e.g. 18, 25, 2, ...) and then on the columns. With a few exceptions in which the reference is based on the latest value from a series (see Exchange rates), this is the direction that will be followed. 

Basic Operations

Same as in Excel [C1] + [C2] creates a third column in the matrix that stores the sum of the two. The sum can be further applies to all the columns:

Sum(C) = [C1] + [C2] + [C3] + [C4] + [C5] -- sum of all columns (should amount to 65)

The column can be called "Sum", "Sum(C)" or any other allowed unique name, though the names should be meaningful, useful, and succinct, when possible.

Similarly, one can work with constants, linear or nonlinear transformations (each formula is a distinct calculation):

constant = 1 -- constant value
linear = 2*[C1] + 1 -- linear translation: 2*x+1
linear2 = 2*[C1] + [constant] -- linear translation: 2*x+1
quadratic = Power([C1],2) + 2*[C1] + 1 -- quadratic translation: x^2+2*x+1 quadratic2 = Power([C1],2) + [linear] -- quadratic translation: x^2+2*x+1
Output:
R C1 constant linear linear2 quadratic quadratic2
R1 18 1 37 37 361 361
R2 4 1 9 9 25 25
R3 15 1 31 31 256 256
R4 21 1 43 43 484 484
R5 7 1 15 15 64 64
Please note that the output was duplicated in Excel (instead of making screenshots).

Similarly, can be build any type of formulas based on one or more columns.

With a simple trick, one can use DAX functions like SUMX, PRODUCTX, MINX or MAXX as well:

Sum2(C) = SUMX({[C1], [C2], [C3], [C4], [C5]}, [Value]) -- sum of all columns
Prod(C) = PRODUCTX({[C1], [C2], [C3], [C4], [C5]}, [Value]) -- product of all columns
Avg(C) = AVERAGEX({[C1], [C2], [C3], [C4], [C5]}, [Value]) -- average of all columns
Min(C) = MINX({[C1], [C2], [C3], [C4], [C5]}, [Value]) -- minimum value of all columns
Max(C) = MAXX({[C1], [C2], [C3], [C4], [C5]}, [Value]) -- maximum value of all columns
Count(C) = COUNTX({[C1], [C2], [C3], [C4], [C5]},[Value]) -- counts the number of columns
Output:
C1 C2 C3 C4 C5 Sum(C) Avg(C) Prod(C) Min(C) Max(C) Count(C)
18 25 2 9 11 65 13 89100 2 25 5
4 6 13 20 22 65 13 137280 4 22 5
15 17 24 1 8 65 13 48960 1 24 5
21 3 10 12 19 65 13 143640 3 21 5
7 14 16 23 5 65 13 180320 5 23 5

Unfortunately, currently there seems to be no way available for applying such calculations without referencing the individual columns. 

Working across Rows

ROWNUMBER and RANK allow to rank a cell within a column independently, respectively dependently of its value:

Ranking = ROWNUMBER() -- returns the rank in the column (independently of the value)
RankA(C) = RANK(DENSE, ORDERBY([C1], ASC)) -- ranking of the value (ascending) 
RankD(C) = RANK(DENSE, ORDERBY([C1], DESC)) -- ranking of the value (descending) 
Output:
R C1 Ranking RankA(C) RankD(C)
R1 18 1 4 2
R2 4 2 1 5
R3 15 3 3 3
R4 21 4 5 1
R5 7 5 2 4

PREVIOUS, NEXT, LAST and FIRST allow to refer to the values of other cells within the same column:

Prev(C) = PREVIOUS([C1]) -- previous cell
Next(C) = NEXT([C1])  -- next cell
First(C) = FIRST([C1]) -- first cell
Last(C) = LAST([C1]) -- last cell
Output:
R C1 Prev(C) NextC) First(C) Last(C)
R1 18 4 18 7
R2 4 18 15 18 7
R3 15 4 21 18 7
R4 21 15 7 18 7
R5 7 21 18 7

OFFSET is a generalization of these functions

offset(2) = calculate([C1], offset(2)) -- 
offset(-2) = calculate([C1], offset(-2))
Ind = ROWNUMBER() -- index
inverse = calculate([C1], offset(6-2*[Ind])) -- inversing the values based on index
Output:
R C1 offset(2) offset(-2) ind inverse
R1 18 15 1 7
R2 4 21 2 21
R3 15 7 18 3 15
R4 21 4 4 4
R5 7 15 5 18

The same functions allow to calculate the differences for consecutive values:

DiffToPrev(C) = [C1] - PREVIOUS([C1]) -- difference to previous 
DiffToNext(C) = [C1] - PREVIOUS([C1]) -- difference to next 
DiffTtoFirst(C) = [C1] - FIRST([C1]) -- difference to first
DiffToLast(C) = [C1] - LAST([C1]) -- difference to last
Output:
R C1 DiffToPrev(C) DiffToNextC) DiffToFirst(C) DiffToLast(C)
R1 18 18 14 0 11
R2 4 -14 -11 -14 -3
R3 15 11 -6 -3 8
R4 21 6 14 3 14
R5 7 -14 7 -11 0

DAX makes available several functions for working across the rows of the same column. Two of the useful functions are RUNNINGSUM and MOVINGAVERAGE:

Run Sum(C) = RUNNINGSUM([C1]) -- running sum
Moving Avg3(C) = MOVINGAVERAGE([C1], 3) -- moving average for the past 3 values
Moving Avg2(C) = MOVINGAVERAGE([C1], 2) -- moving average for the past 2 values

Unfortunately, one can use only the default sorting of the table with the functions that don't support the ORDERBY parameter. Therefore, when the table needs to be sorted descending and the RUNNINGSUM calculated ascending, for the moment there's no solution to achieve this behavior. However, it appears that Microsoft is planning to implement a solution for this issue.

RUNNINGSUM together with ROWNUMBER can be used to calculate a running average:

Run Avg(C) = DIVIDE(RUNNINGSUM([C1]), ROWNUMBER()) -- running average
Output:
R C1 Run Sum(C) Moving Avg3(C) Moving Avg2(C) Run Avg(C)
R1 18 18 18 18 18
R2 4 22 11 11 11
R3 15 37 12.33 9.5 12.33
R4 21 58 13.33 18 14.5
R5 7 65 14.33 14 13

With a mathematical trick that allows to transform a product into a sum of elements by applying the Exp (exponential) and Log (logarithm) functions (see the solution in SQL), one can run the PRODUCT across rows, though the values must be small enough to allow their multiplication without running into issues:

Ln(C) = IFERROR(LN([C1]), Blank()) -- applying the natural logarithm
Sum(Ln(C)) = RUNNINGSUM([Ln(C)]) -- running sum
Run Prod(C) = IF(NOT(ISBLANK([Sum(Ln(C))])), Exp([Sum(Ln(C))])) -- product across rows
Output:
R C1 Ln(C) Sum(Ln(C)) Run Prod(C)
R1 18 2.89 2.89 18
R2 4 1.39 4.28 72
R3 15 2.71 6.98 1080
R4 21 3.04 10.03 22680
R5 7 1.95 11.98 158760

These three calculations could be brought into a single formula, though the result could be more difficult to troubleshoot. The test via IsBlank is necessary because otherwise the exponential for the total raises an error. 

Considering that when traversing a column it's enough to remember the previous value, one can build MIN and MAX functionality across a column: 

Run Min = IF(OR(Previous([C1]) > [C1], IsBlank(Previous([C1]))), [C1], Previous([C1])) -- minimum value across rows
Run Max = IF(OR(Previous([C1]) < [C1], IsBlank(Previous([C1]))), [C1], Previous([C1])) -- maximum across rows

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Wikipedia (2024) Magic Squares (online)
[2] Microsoft Learn (2024) Power BI: Using visual calculations [preview] (link)

20 April 2024

⚡️🗒️Power BI: Visual Calculations [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources.
Last updated: 20-Apr-2024

[feature] Visual Calculations (aka Visual Calcs)

  • {definition} a type of DAX calculation that's defined and executed directly on a visual [1]
  • {benefit} make it easier to create calculations that were previously hard to create
    • leads to simpler DAX, easier maintenance, and better performance [1]
    • reuse the results from its components [2]
  • still in public preview
    • ⇒ must be enabled manually
  • are stored on the visual
    • ⇒ can refer to any data in the visual 
      • incl. columns, measures, or other visual calculations [1]
    • ⇒ anything in the model must be added to the visual before the visual calculation can refer to it [1]
    •  ⇒ they can refer to the visual structure
      • ⇒ leads to more flexibility
  • combine the simplicity of context from calculated columns with the on-demand calculation flexibility from measures [1]
    • ⇐ the context is "visible"
  • operate on aggregated data instead of the detail level [1]
    • ⇒ leads to performance benefits
  • introduce a set of functions specific to visual calculations [1]
    • {category} medium-level functions
      • {function} COLLAPSE
        • the calculation is evaluated at a higher level of the axis [1]
      • {function} COLLAPSEALL
        • the calculation is evaluated at the total level of the axis [1]
      • {function} EXPAND
        • the calculation is evaluated at a lower level of the axis [1]
      • {function}EXPANDALL
        • the calculation is evaluated at the leaf level of the axis [1]
      • {function} FIRST
        • refers to the first row of an axis [1]
      • {function} ISATLEVEL
        • reports whether a specified column is present at the current level.
      • {function} LAST
        • refers to the last row of an axis [1]
      • {function} NEXT
        • refers to a next row of an axis [1]
      • {function} PREVIOUS
        • refers to a previous row of an axis [1]
      • {function} RANGE
        • refers to a slice of rows of an axis [1]
    • {category} high-level functions
      • {function} MOVINGAVERAGE
        • adds a moving average on an axis [1]
      • {function} RUNNINGSUM
        • adds a running sum on an axis  [1]
    • {category} low-level functions
      • ⇐ {exception} are available in standard DAX
      • {function} INDEX
      • {function} OFFSET
      • {function} RANK
      • {function} ROWNUMBER
      • {function}WINDOW
      • {function} ORDERBY
      • {function} PARTITIONBY
      • {function} MATCHBY
  • {default} most of them are evaluated row-by-row [1]
    • ⇐ like a calculated column
    • there's no need to add an aggregation function [1]
      •  it's better not to add such aggregates when they're not necessary [1]
  • {operation} create calculation
    • adds the visual calculation to the visual
  • {operation} hide calculation
    • calculations that aren't needed in the visual can be hidden [2]
  • {operation} copy calculation
    • copies the calculation between visuals and if intermediary steps are not there, they will be copied as well [planned] [2]
  • {feature} templates 
    • ready available calculation constructs 
    • {benefit} make it easier to write common calculations [1]
  • {feature|planned} support for Scanner API [2]
  • {parameter} axis 
    • influences how the visual calculation traverses the visual matrix [1]
    • {default} set to the first axis in the visual
    • {value} ROWS
      • the visual calculation is evaluated row-by-row in the visual matrix, from top to bottom. [1]
    • {value} COLUMNS
      • the visual calculation is evaluated row-by-row in the visual matrix, from left to right [1]
    • {value} ROWS COLUMNS
      • calculates vertically across rows from top to bottom, continuing column by column from left to right [1]
    • {value} COLUMNS ROWS
      • calculates horizontally across columns from left to right, continuing row by row from top to bottom [1]
    • {warning} not all visuals provide all axes, and some visuals provide no axes [1]
  • {parameter| reset 
    • influences if and when the function resets its value to 0 or switches to a different scope while traversing the visual matrix [1]
    • expects there to be multiple levels on the axis [1]
      • ⇐ use PARTITIONBY if there's only one level on the axis [1]
    • {value|default} NONE
      • means the visual calculation is never restarted [1]
    • {value} HIGHESTPARENT 
      • resets the calculation when the value of the highest parent on the axis changes [1]
    • {value} LOWESTPARENT 
      • resets the calculations when the value of the lowest parent on the axis changes [1]
    • {value} numerical value
      • refers to the fields on the axis, with the highest field being one [1]
  • {limitation} functions that rely on model relationships  aren't available
    • e.g. USERELATIONSHIP, RELATED or RELATEDTABLE
  • {limitation} not all visual types are supported [1]
    • ⇐ for the full list of limitations see [1]
  • {limitation} one can't filter on visual calculations [1]
  • {limitation} underlying data can't be exported [1]
  • {limitation} don't support conditional formatting

References
[1] Microsoft Learn (2024) Power BI: Using visual calculations [preview] (link)
[2] SSBI Central (2024) Visual Calculations - Making DAX easier, with Jeroen ter Heerdt(link)

19 April 2024

⚡️🗒️Power BI: Working with Visual Calculations (Part I: Test Drive) [new feature]

Introduction

I recently watched a webcast with Jeroen (Jay) ter Heerdt (see [2]) in which he introduces visual calculations, a type of DAX calculation that's defined and executed directly on a visual [1]. Visual calculations provide an approach of treating a set of data much like an Excel table, allowing to refer to any field available on a visual and write formulas, which simplifies considerably the solutions used currently for ranking records, running averages and other windowing functions. 

The records behind a visual can be mentally represented as a matrix, while the visual calculations can refer to any column from the matrix, allowing to add new columns and include the respective columns in further calculations. Moreover, if a column is used in a formula, it's not recalculated as is the case of measures, which should improve the performance of DAX formulas considerably. 

Currently, one can copy a formula between visuals and if the formula contains fields not available in the targeted visual, they are added as well. Conversely, it's possible to build such a visual, copy it and then replace the dimension on which the analysis is made (e.g. Customer with Product), without being needed to make further changes. Unfortunately, there are also downsides: (1) the calculations are visible only within the visual in which were defined; (2) currently, the visual's data can't be exported if a visual calculation is added; (3) no formatting is supported, etc.

Ranking and Differences

I started to build a solution based on publicly available sales data, which offers a good basis for testing the use of visual calculations. Based on a Power BI visual table made of [Customer Name], [Sales Amount], [Revenue] and [Total Discount], I've added several calculations:

-- percentages
Sales % = 100*DIVIDE([Sales Amount], COLLAPSE([Sales Amount], ROWS))
Revenue % = 100*DIVIDE([Revenue],[Sales Amount])
Discount % = 100*DIVIDE([Total  Discount], [Total  Discount]+[Sales Amount])

-- rankings 
Rank Sales = Rank(DENSE, ORDERBY([Sales Amount], DESC))
Rank Revenue = Rank(DENSE, ORDERBY([Revenue], DESC))

-- differences between consecutive values
Diff. to Prev. Sales = IF([Rank Sales]>1, INDEX([Rank Sales]-1, , ORDERBY([Sales Amount], DESC)) - [Sales Amount] , BLANK())
Diff. to Prev. Rev. = IF([Rank Revenue]>1, INDEX([Rank Revenue]-1, , ORDERBY([Revenue], DESC)) - [Revenue] , BLANK())

Here's the output considered only for the first 10 records sorted by [Sales Amount]:

Customer Name Sales Amount Sales % Revenue Revenue % Total Discount Discount % Rank Sales Diff. to Prev. Sales. Rank Rev. Diff. to Prev. Rev.
Medline 1058923.78 3.76 307761.99 3.75 126601.02 10.68 1 1
Ei 707663.21 2.51 229866.98 2.8 95124.09 11.85 2 351260.57 2 77895.01
Elorac, Corp 702911.91 2.49 209078.76 2.55 83192.39 10.58 3 4751.3 6 20788.22
Sundial 694918.98 2.47 213362.1 2.6 78401.72 10.14 4 7992.93 4 -4283.34
OUR Ltd 691687.4 2.45 196396.26 2.4 78732.2 10.22 5 3231.58 10 16965.84
Eminence Corp 681612.78 2.42 213002.78 2.6 86904.03 11.31 6 10074.62 5 -16606.52
Apotheca, Ltd 667283.99 2.37 157435.56 1.92 101453.91 13.2 7 14328.79 31 55567.22
Rochester Ltd 662943.9 2.35 224918.2 2.74 81158.11 10.91 8 4340.09 3 -67482.64
ETUDE Ltd 658370.48 2.34 205432.79 2.51 89322.72 11.95 9 4573.42 9 19485.41
Llorens Ltd 646779.31 2.29 206567.4 2.52 82897.59 11.36 10 11591.17 8 -1134.61

Comments:
1) One could use [Total Amount] = [Total  Discount]+[Sales Amount] as a separate column.
2) The [Rank Sales] is different from the [Rank Rev.] because of the discount applied.
3) In the last two formulas a blank was considered for the first item from the ranking.
4) It's not possible to control when the totals should be displayed, however one can change the color for the not needed total to match the background.

Visualizing Differences 

Once the formulas are added, one can hide the basis columns and visualize the data as needed. To obtain the below chart I copied the visual and changed the column as follows:

Diff. to Prev. Rev. = IF([Rank Revenue]>1, [Revenue]- INDEX([Rank Revenue]-1, , ORDERBY([Revenue], DESC)) , [Revenue]) -- modified column

Differences Revenue between Customers

Comments:
1) Instead of showing the full revenue, the chart shows only the differences from the highest revenue, where the column in green is the highest revenue, while the columns in red are the differences of the current customer's revenue to the previous customer, as the data are sorted by the highest revenue. At least in this case it results in a lower data-ink ratio (see Tufte).
2) The values are sorted by the [Revenue] descending. 
3) Unfortunately, it's not possible to change the names from the legend.

Simple Moving Averages (SMAs)

Based on the [Sales Amount], [Revenue] and [Month] one can add the following DAX formulas to the table for calculating the SMA:

Sales Amount (SMA) = MOVINGAVERAGE([Sales Amount],6)
Revenue (SMA) = MOVINGAVERAGE([Revenue],6)

The chart becomes:


Comments:
1) Unfortunately, the formula can't project the values into the feature, at least not without the proper dates.
2) "Show items with not data" feature seems to be disabled when visual calculations are used.
3) The SMA was created via a template formula. Similarly, calculating a running sum is reduced to applying a formula:
Running Sales Amount = RUNNINGSUM([Sales Amount])

Wrap Up

It's easier to start with a table for the visual, construct the needed formulas and then use the proper visual while eliminating the not needed fields. 

The feature is still in public preview and changes can still occur. Unfortunately, there's still no information available on the general availability date. From the first tests, it provides considerable power with a minimum of effort, which is great! I don't want to think how long I would have needed to obtain the same results without it!

Happy coding!

Previous Post <<||>> Next Post

References
[1] Microsoft Learn (2024) Power BI: Using visual calculations [preview] (link)
[2] SSBI Central (2024) Visual Calculations - Making DAX easier, with Jeroen ter Heerdt (link)

⚡️Power BI: Preparatory Steps for Creating a Power BI Report

When creating a Power BI report consider the following steps when starting the actual work. The first five steps can be saved to a "template" that can be reused as starting point for each report.

Step 0: Check Power BI Desktop's version

Check whether you have the latest version, otherwise you can download it from the Microsoft website.
Given that most of the documentation, books and other resources are in English, it might be a good idea to install the English version.

Step 1: Enable the recommended options

File >> Options and settings >> Options >> Global >> Data Load:
.>> Time intelligence >> Auto date/time for new files >> (uncheck)
.>> Regional settings >> Application language >> set to English (United States)
.>> Regional settings >> Model language >> set to English (United States)

You can consider upon case also the following options (e.g. when the relationships are more complex than the feature can handle):
File >> Options and settings >> Options >> Current >> Data load:
.>> Relationship >> Import relationships from data sources on first load >> (uncheck)
.>> Relationship >> Autodetect new relationships after data is loaded >> (uncheck)

Step 2: Enable the options needed by the report

For example, you can enable visual calculations:
File >> Options and settings >> Options >> Preview features >> Visual calculations >> (check)

Comment:
Given that not all preview features are stable enough, instead of activating several features at once, it might be a good idea to do it individually and test first whether they work as expected. 

Step 3: Add a table for managing the measures

Add a new table (e.g. "dummy" with one column "OK"):

Results = ROW("dummy", "OK")

Add a dummy measure that could be deleted later when there's at least one other measure:
Test = ""

Hide the "OK" column and with this the table is moved to the top. The measures can be further organized within folders for easier maintenance. 

Step 4: Add the Calendar if time analysis is needed

Add a new table (e.g. "Calendar" with a "Date" column):

Calendar = Calendar(Date(Year(Today()-3*365),1,1),Date(Year(Today()+1*365),12,31))

Add the columns:

Year = Year('Calendar'[Date])
YearQuarter = 'Calendar'[Year] & "-Q" & 'Calendar'[Quarter]
Quarter = Quarter('Calendar'[Date])
QuarterName = "Q" & Quarter('Calendar'[Date])
Month = Month('Calendar'[Date])
MonthName = FORMAT('Calendar'[Date], "mmm")

Even if errors appear (as the columns aren't listed in the order of their dependencies), create first all the columns. Format the Date in a standard format (e.g. dd-mmm-yy) including for Date/Time for which the Time is not needed.

To get the values in the visual sorted by the MonthName:
Table view >> (select MonthName) >> Column tools >> Sort by column >> (select Month)

To get the values in the visual sorted by the QuarterName:
Table view >> (select QuarterName) >> Column tools >> Sort by column >> (select Quarter)

With these changes the filter could look like this:


Step 5: Add the corporate/personal theme

Consider using a corporate/personal theme at this stage. Without this the volume of work that needs to be done later can increase considerably. 

There are also themes generators, e.g. see powerbitips.com, a tool that simplifies the process of creating complex theme files. The tool is free however, users can save their theme files via a subscription service.

Set canvas settings (e.g. 1080 x 1920 pixels).

Step 6: Get the data

Consider the appropriate connectors for getting the data into the report. 

Step 7: Set/Validate the relationships

Check whether the relationships between tables set by default are correct, respectively set the relationships accordingly.

Step 8: Optimize the data model

Look for ways to optimize the data model.

Step 9: Apply the formatting

Format numeric values to represent their precision accordingly.
Format the dates in a standard format (e.g. "dd-mmm-yy") including for Date/Time for which the Time is not needed.

The formatting needs to be considered for the fields, measures and metrics added later as well. 

Step 10: Define the filters

Identify the filters that will be used more likely in pages and use the Sync slicers to synchronize the filters between pages, when appropriate:
View >> Sync slicers >> (select Page name) >> (check Synch) >> (check Visible)

Step 11: Add the visuals

At least for report's validation, consider using a visual that holds the detail data as represented in the other visuals on the page. Besides the fact that it allows users to validate the report, it also provides transparence, which facilitates report's adoption. 

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.