Power BI Data Profiling – distinct vs unique

“Another month, another Power BI Desktop update”, as the Guy(s) in a Cube like to say.
The Power BI Desktop October 2018 update brings quite a few new features, and my favorite is Data Profiling, something I always check manually when working with new data. For more info and how to enable it, visit Microsoft’s official Power BI blog post or the detailed post from Matt Allington on https://exceleratorbi.com.au.

Ok, I don’t want to make it a long post, so here’s the thing: one of the Data Profiling features is “Column distribution”.

Column Distribution allows you to get a sense for the overall distribution of values within a column in your data previews, including the count of distinct values (total number of different values found in a given column) and unique values (total number of values that only appear once in a given column).

So, distinct and/or unique. For anyone else these 2 words may seem similar, but for Data Guys like us, these are 2 different things (just like empty and null 🙂 ). Still confused? Here’s a very simple example to clear things up:

Let’s say we have a small company with 10 employees using 10 laptops from different Manufacturers: HP, Dell, Apple and Lenovo.
Manufacturers in Power BI's Column distribution
Distinct: We have laptops from 4 different Manufacturers (HP, Dell, Apple and Lenovo) aka “total number of different values”, regardless of how many of each we have.
Unique: We have only 2 laptops that nobody else has in our company (one from Apple and another one from Lenovo) aka “total number of values that only appear once”.

Thinking legacy, here’s the same thing in Excel, using countif or countifs.
Excel CountIF formula
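
For anyone who prefers to see the two counts in code, here is a minimal Python sketch of the same laptop example. The post only says that there are 10 laptops and that Apple and Lenovo appear once each, so the 5 HP / 3 Dell split is my assumption for illustration.

```python
from collections import Counter

# Assumed split: the post only fixes 10 laptops total, with Apple and Lenovo appearing once.
laptops = ["HP"] * 5 + ["Dell"] * 3 + ["Apple", "Lenovo"]

# How many times each manufacturer appears in the column.
counts = Counter(laptops)

# Distinct: number of different values, regardless of how often each appears.
distinct = len(counts)

# Unique: number of values that appear exactly once.
unique = sum(1 for n in counts.values() if n == 1)

print(f"Distinct: {distinct}, Unique: {unique}")
```

With this data the sketch reports 4 distinct and 2 unique values, matching what Column distribution shows.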

Install-WindowsFeature Web-Net-Ext failed. Source files could not be found.

Error:

Trying to install .NET Framework 3.5 Features on Windows Server 2016, either through the GUI or PowerShell, returns the error below (truncated):

“The request to add or remove features on the specified server failed.
Installation of one or more roles, role services, or features failed.
The source files could not be found.”

Cause:

The .NET Framework 3.5 source files are not pre-installed, so we need to explicitly tell the installer where it can get them from.

Solution:

As per Microsoft’s official docs, there are 2 main solutions to make it work.

  1. Through GUI (Server Manager), when asked during the wizard, you need to specify the path to “Media:\Sources\SxS” folder.
  2. Through PowerShell, Install-WindowsFeature -Name Web-Net-Ext -source D:\Sources\SxS

Still, Microsoft says that even if the defined source is not found, the installer should go to Windows Update to download the required files. Let’s set aside the many reasons why Windows Update might not be accessible (no internet connection, GPOs disabling something etc.).

Anyway, since neither of the above methods worked for me, I kept reading posts from the second and later pages of Google results. This led me to a not-so-old post from Michael Niehaus: https://blogs.technet.microsoft.com/mniehaus/2015/08/31/adding-features-including-net-3-5-to-windows-10/

Not sure what’s different about calling DISM directly (since Server Manager and PowerShell probably do the same), but the solution below worked for me.

Run this from PowerShell (including the tilde characters), where D: is the drive with my mounted Server 2016 ISO:

DISM.exe /Online /Add-Capability /CapabilityName:NetFx3~~~~ /Source:D:\Sources\SxS

When the above command is done, run this:

Add-WindowsCapability -Online -Name NetFx3~~~~ -Source D:\Sources\SxS

What and when: DateAdd vs ParallelPeriod vs SamePeriodLastYear

Reza Rad over at http://radacad.com/dateadd-vs-parallelperiod-vs-sameperiodlastyear-dax-time-intelligence-question explains the differences and provides easy usage scenarios for DateAdd, ParallelPeriod and SamePeriodLastYear DAX functions.

Read his blog post to see how he reached the conclusions below:

  • DateAdd and SamePeriodLastYear both work based on the DYNAMIC period in the filter context.
  • ParallelPeriod is working STATICALLY based on the interval selected in the parameter.
  • ParallelPeriod and DateAdd can go more than one interval back and forward, while SamePeriodLastYear only goes one year back.
  • DateAdd works on the interval of DAY, as well as month, quarter and year, but ParallelPeriod only works on month, quarter, and year.
  • Depending on the filter context, you may get a different result from these functions. If you get the same result in a year-level context, it doesn’t mean that all these functions are the same! Look more into the detailed context.
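
To make the dynamic vs static point more concrete, here is a rough Python emulation. This is not DAX and not taken from Reza’s post: the function names are hypothetical, it handles only the month interval, and it ignores DAX edge cases (for example, how DateAdd treats a selection that covers a whole month). It only assumes that a DateAdd-style function shifts the dates currently selected, while a ParallelPeriod-style function returns the whole parallel month.

```python
import calendar
from datetime import date, timedelta


def shift_months(d, months):
    """Move a date by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def dateadd_like(selected, months):
    """Dynamic: shift exactly the dates that are in the current selection."""
    return [shift_months(d, months) for d in selected]


def parallelperiod_like(selected, months):
    """Static: return every day of the month(s) parallel to the selection."""
    first = shift_months(min(selected), months).replace(day=1)
    end = shift_months(max(selected), months)
    last = end.replace(day=calendar.monthrange(end.year, end.month)[1])
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


# Hypothetical filter context: 10 May 2018 to 20 May 2018.
selected = [date(2018, 5, day) for day in range(10, 21)]

shifted = dateadd_like(selected, -1)
parallel = parallelperiod_like(selected, -1)

print(shifted[0], shifted[-1], len(shifted))
print(parallel[0], parallel[-1], len(parallel))
```

For this selection the DateAdd-style result is the 11 days from 10 April to 20 April, while the ParallelPeriod-style result is all 30 days of April. At year level both would cover the same dates, which is why the results can look identical there.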

Power BI: The key didn’t match any rows in the table

Error:

Refreshing a Power BI report generates the “The key didn’t match any rows in the table” error.

Cause:

To find the cause, click the “Edit Queries” button. If you do not see the error message, click the “Refresh Preview” button. Once you have the error message, click “Go To Error“.

The error message will remain, but you should now have an “Edit Settings” button; click it.

On the next screen, Navigation, you should see exactly what Power BI is trying to access but cannot.

In my example, I have an Excel file as my source. When I first connected to this data source, my Table name was Table1. Yesterday I changed where my Excel file gets its data from and this, in turn, changed the Excel Table name to “report“.

Obviously, when Power BI tries to refresh the data, it cannot find the Table1 table anymore.

Solution:

In the same Navigation screen, selecting my new Table name, “report”, fixes the issue, assuming all columns in the Excel Table are still the same.

The same is true for any data source, not only Excel files. Follow the same steps to identify what’s causing the error and then fix it as needed.

How to get and view WordPress Statistics in Power BI

Soheil Bakhshi from http://biinsight.com has a super good article on how to get your stats from a WordPress site/blog and analyze them in Power BI.

For lazy guys out there, he already built a Power BI template file that you can use straight away. The only thing you’ll need is WordPress API key but don’t worry, he explains how to get that one too!

Check it out: http://biinsight.com/analyse-your-wordpress-blog-stats-in-power-bi/.

Expressions that yield variant data-type cannot be used to define calculated columns.

Error:

Expressions that yield variant data-type cannot be used to define calculated columns.

Cause:

An IF-based calculated column in Power BI or Power Pivot is not allowed to return 2 different data types (for example, a number in one branch and text in the other).

Solution:

Format one data type as needed using FORMAT() function.

[code language=”sql”]

Age = IF (
    TableName[Birthday] < TODAY(),
    FORMAT (
        YEAR ( TODAY() ) - YEAR ( TableName[Birthday] ),
        "General Number"
    ),
    "Invalid birth date"
)

[/code]

FORMAT Function (DAX): https://msdn.microsoft.com/en-us/query-bi/dax/format-function-dax

Pre-Defined Numeric Formats for the FORMAT function: https://msdn.microsoft.com/query-bi/dax/pre-defined-numeric-formats-for-the-format-function

Creating an e-book. Part 3.

Continuing the series about how to create an e-book. Today we’ll have a much shorter post. Specifically, we’ll look at how to create the Title page.

So, what is a Title Page? According to dictionary.com, Title Page is: “the page at the beginning of a volume that indicates the title, author’s or editor’s name, and the publication information, usually the publisher and the place and date of publication.”

If you search Google Images for “title page”, you’ll see lots of examples of what a Title Page looks like. And it seems to be the easiest page to create. Technically speaking, it’s a bunch of paragraphs and that’s it.

Ok, following dictionary.com’s description, let’s create our Title Page. We will need the following:

  • Book title: I will use “My Book’s Title”
  • Author name: this is simple, “Vitalie Ciobanu”
  • Publisher info: I do not have one so I will use “Lorem Ipsum Publishing Inc.”
  • Place and date of publication: “Madagascar 2018”. I still want to visit this island…

Open the mybook.html page in Notepad and add the above details right before your first chapter (which should be an <h2> tag), each detail on its own line and inside a paragraph element (<p></p>). Here is my <body> element now (trimmed):

[code language=”html”]
<body>
<p>My Book’s Title</p>
<p>Vitalie Ciobanu</p>
<p>Lorem Ipsum Publishing. All rights reserved.</p>
<p>Madagascar 2018</p>

<h2>Lorem Ipsum</h2>
<h3>What is Lorem Ipsum?</h3>
<p class="first">Lorem ipsum dolor sit….</p>
<p class="quote">Nullam tortor orci…..</p>
<p>Curabitur ligula tortor, ullamcorper….</p>

</body>
[/code]

And here’s my page viewed in a browser:

Because we have a p selector defined in our stylesheet file, you can see above that some formatting is applied already. Let’s add more selectors to the stylesheet to make the Title Page information look different, with separate formatting for the book title, author, publisher and date of publication. Here’s some basic CSS added to my epub.css file:

[code language=”css”]
p.booktitle {
    text-indent:0em;
    text-align:center;
    margin-top:5em;
    font-weight:bold;
    font-size:150%;
}

p.bookauthor {
    text-indent:0em;
    text-align:center;
    margin-top:1em;
    font-weight:bold;
    font-size:130%;
}

p.bookpublisher {
    text-indent:0em;
    text-align:center;
    margin-top:3em;
    font-weight:bold;
    font-size:100%;
}

p.bookpublished {
    text-indent:0em;
    text-align:center;
    margin-top:1em;
    font-weight:bold;
    font-size:100%;
}
[/code]

For each selector I wanted no indentation, so I added “text-indent:0em”. Also, everything is centered and bold. The font size differs, with the Book Title being the biggest one at 150%. Lastly, I added different top margins to visibly separate each line. Afterwards, I added the matching class to each of my newly added paragraphs in the html file.

[code language=”html”]
<body>
<p class="booktitle">My Book’s Title</p>
<p class="bookauthor">Vitalie Ciobanu</p>
<p class="bookpublisher">Lorem Ipsum Publishing. All rights reserved.</p>
<p class="bookpublished">Madagascar 2018</p>

<h2>Lorem Ipsum</h2>
<h3>What is Lorem Ipsum?</h3>
<p class="first">Lorem ipsum dolor sit….</p>
<p class="quote">Nullam tortor orci…..</p>

<p>Curabitur ligula tortor, ullamcorper….</p>

</body>
[/code]

Here’s the updated view in browser.
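
If you would rather generate these four paragraphs than type them by hand, here is a small Python sketch. It is only an illustration and not part of the workflow in this series: the function name is hypothetical, and the class names are the ones defined in epub.css above.

```python
from html import escape

# (CSS class, text) pairs for the Title Page, in display order.
title_page = [
    ("booktitle", "My Book’s Title"),
    ("bookauthor", "Vitalie Ciobanu"),
    ("bookpublisher", "Lorem Ipsum Publishing. All rights reserved."),
    ("bookpublished", "Madagascar 2018"),
]


def render_title_page(items):
    """Build one <p> element per detail, escaping characters such as & and <."""
    lines = []
    for css_class, text in items:
        lines.append(f'<p class="{css_class}">{escape(text)}</p>')
    return "\n".join(lines)


html_fragment = render_title_page(title_page)
print(html_fragment)
```

The printed fragment can be pasted right after the opening <body> tag in mybook.html.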

How to automatically create and update Visio diagrams from Excel

To successfully create Visio diagrams from Excel, Visio Pro for Office 365 is required. Here you can see the prices or download trial.

  1. Open Excel (I have latest release of Excel from Office 365 suite)
  2. In the search box, type “process” and hit Enter. Among other search results, 2 Excel templates should appear with Excel icons in upper left corner and Visio diagrams in lower right corner. Select the one you want/need. I will continue with Process Map for Cross-Functional Flowchart.

  3. A nicely formatted Excel file will open with pretty much all info you need to know about Data Visualizer; the solution that allows us to create diagrams from Excel.

  4. Head over to Process Map worksheet.
  5. You’ll notice here we have the “ProcessMapData” Table populated with 2 rows. This is the Excel Table that we need to work with. Everything else can be deleted/ignored.

  6. Remove the 2 rows added as examples and add your process’ field mappings. For this example, I will use Software Development process found on www.smartdraw.com
  7. I will modify my Excel Table based on this process. Here is my final Table:
Process Step ID | Process Step Description | Next Step ID | Connector Label | Shape Type | Function | Phase
P01 | Start | P02 | | Start | Project Management | Planning
P02 | Gather requirements | P03 | | Document | Project Management | Planning
P03 | Develop specifications | P04 | | Process | Software Design | Design
P04 | Develop internal design | P05 | | Process | Software Design | Design
P05 | Develop external design | P06 | | Process | Software Design | Design
P06 | Change required? | P05,P07 | Yes,No | Decision | Software Design | Design
P07 | Define Development team | P08 | | Subprocess | Software Development | Coding
P08 | Application development | P09 | | Process | Software Development | Coding
P09 | User Acceptance Testing | P10 | | Process | Software Development | Coding
P10 | Quality Assurance | P11 | | Process | Software Development | Coding
P11 | Defects | P08,P12 | Yes,No | Decision | Software Development | Coding
P12 | Release to production | P13 | | Subprocess | Release Management | Release
P13 | Stop | | | End | Project Management | Release

Make sure that wherever you have more than one value in Next Step ID, you have the same number of labels in the Connector Label column, and that each label matches its corresponding step.

For example, P06 Step ID is a decision point. If changes are required (Yes), process goes back to P05 Step ID. If no changes are required (No), process proceeds to P07 Step ID.

Lastly, ensure your last process step has nothing in the Next Step ID field: it’s the last one, so there is nothing else afterwards and the process ends here. Save your Excel file.
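
The rules above are easy to check by eye on 13 rows, but for a bigger process a quick script can help. Here is a minimal Python sketch, assuming the three relevant columns of my table are typed in as tuples. The function name is hypothetical, and the check that every Next Step ID points to an existing step is my own addition, not something Visio documents.

```python
# (Process Step ID, Next Step ID, Connector Label) taken from the table above.
rows = [
    ("P01", "P02", ""), ("P02", "P03", ""), ("P03", "P04", ""),
    ("P04", "P05", ""), ("P05", "P06", ""), ("P06", "P05,P07", "Yes,No"),
    ("P07", "P08", ""), ("P08", "P09", ""), ("P09", "P10", ""),
    ("P10", "P11", ""), ("P11", "P08,P12", "Yes,No"), ("P12", "P13", ""),
    ("P13", "", ""),
]


def check_process_map(rows):
    """Return a list of problems found in the process map rows (empty if none)."""
    known_ids = {step_id for step_id, _, _ in rows}
    problems = []
    for step_id, next_ids, labels in rows:
        targets = [t.strip() for t in next_ids.split(",") if t.strip()]
        names = [n.strip() for n in labels.split(",") if n.strip()]
        # More than one next step needs one connector label per next step.
        if len(targets) > 1 and len(targets) != len(names):
            problems.append(f"{step_id}: {len(targets)} next steps but {len(names)} labels")
        # Extra check (my assumption): every next step should exist in the table.
        for target in targets:
            if target not in known_ids:
                problems.append(f"{step_id}: unknown next step {target}")
    # The last process step must not point anywhere.
    if rows and rows[-1][1].strip():
        problems.append(f"{rows[-1][0]}: last step should have an empty Next Step ID")
    return problems


print(check_process_map(rows))
```

An empty list means the table follows the rules; otherwise each entry names the step that needs fixing.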

  1. Open Visio.
  2. In the search box, type “data visualizer” and hit Enter. You should get the Basic and Cross-Functional templates. Select the same one you used for creating the Excel file. Cross-Functional Flowchart in my example.

    Just in case you started it wrongly or you want to get Excel template again, click Excel data template. Select your local Unit and click Create.

  3. Visio will open Create Diagram from Data wizard.
  4. Using first drop-down box, select the correct diagram type to use.
  5. In the second field, browse to the Excel file you created and saved in Step 7.
  6. In the last field, Visio should automatically recognize your Excel Table name, ProcessMapData in my example. Click Next.

  7. If you did not modify the headers in the Excel Table, the next page should automatically recognize and map the Function and Phase columns. Expand More Options if you want to modify the order of column values. Assuming you created your process in chronological order, you’d probably want to retain the same order in Visio.

  8. The next wizard page works the same way: if you left all the default column headers, Visio will automatically recognize and map the needed columns. Otherwise, drag and drop the columns from the left.

    Short note here: you probably noticed your Excel Table has blue and green headers. Blue headers are required for these mappings and for building your final diagram, whereas green columns are nice to have. All those columns may hold valuable info for the diagram, but I think Microsoft still has room for improvement here, especially with Alt Description, because it may contain too much text. Add something to this column and you will see what I’m referring to.

  9. Next page is about choosing your required Visio shapes. I did not have anything to modify or comment here. It seems like Microsoft is limiting the user to use predefined shapes but, anyway, default shapes are kind of standard so should be sufficient.

  10. The last wizard step is the one we need to modify to suit our needs. Make sure Next Step ID is selected for Specify Column Name, Relationship is set to Next Step, Delimiter is set to comma and Connector Label is mapped to its column with same name. Click Finish.

  11. You should now have a fully functional Visio diagram. Note that Visio will zoom it in such a way that it fits your page; if needed, zoom it in or out to see the details. Also, for some reason, Visio creates a lot of unused space between each shape. Feel free to rearrange everything to save space.

One nice thing about having this Excel file linked to Visio is that after any update you make to the process in Excel, you can refresh the Visio diagram to pick up those changes. Just select your Visio diagram and, on the Data tab, in the Create from Data group, select Refresh.

To read the official Microsoft article about this, please take a look at the “Create a Data Visualizer diagram” post. It contains more valuable info, including a description of each Excel column that interacts with the Visio flowchart components.

Similarly, here’s a video made by Microsoft.

July 20, 2018: Microsoft just posted another, shorter, video on YouTube marketing same thing.