Struggling to digest IoT data in your BI tool?

By Pilgrim - June 27, 2019

 

If you're a data analyst or really any kind of person trying to run a business, you'll probably be familiar such as Tableau, Looker, Qlik, and Power BI. With a user interface somewhat reminiscent of Excel, these SaaS tools enable you to ingest data and then build up reports and dashboards from it - turning data into useful information. 

Of course, they're far more powerful than Excel because they run in the cloud - but that doesn't mean they're good at everything

More and more business data is now coming from IoT devices, so it's natural to try to ingest that data into your BI tool of choice to try to turn it into useful information. However, if you've actually tried to do this then you've probably hit some snags.

Relational data

Relational query

At the heart of any BI tool is relational query - the world of SQL where you select values from tables, join them to values in other tables, and do the math on them. In principle, you can do anything with SQL... but in practice, the relational model underlying BI tools is a poor fit for most of the data that IoT devices produce. 

IoT data

For one thing, IoT devices produce telemetry - time-series data, and lots of it. When a BI tool talks about time-series, it's usually talking about something like "sales numbers over the past year, by month" and that's twelve time bins. But even a modest estate of IoT devices produces millions of data points a day - temperatures, heartbeats, button processes.... you name it.

Now sure, you could pile all of this data into a table and then bin it by each month, and that's fine if you need to answer a question like "What is the average temperature across all my devices in each of the past 12 months?" but that kind of answer is rarely useful. 

Judgements on individual data points 

Often, it's much more useful to make a judgement down at the level of the individual data points and then  aggregate up that judgement across devices and across time, to answer questions like - 

  • Which users have pressed a button more than three times an hour? (Perhaps a sign of frustration which you need to react to) 
  • Have any pumps been running for more than an hour? (Perhaps a sign that your tank is overflowing) 
  • Have any fridge doors been open for more than five minutes? (Perhaps a sign that the contents may be expiring) 
  • Have any devices been out of touch for more than 15 minutes? (Perhaps a sign of a communications failure)
  • Have any parking spaces been occupied for more than 24 hours? (Perhaps time to go and issue a parking ticket!)

Do note that we can't bin the data first - so, for example, if we're interested in smart fridge doors being open for five minutes, we can't bin the data by day or month because we'd lose all the information needed to make that judgement. Since binning is one of the core ways to reduce data to manageable volumes in a database and align it into correlatable chunks, not being able to bin early can stop BI tools in their tracks

Also note that IoT devices report asynchronously from each other and tend to report data only when it changes - they don't report everything all the time, for reasons of power and communications efficiency, so the resulting database rows are very sparse. In most rows, most columns are empty. Again, the SQL-type queries supported by BI tools just can't cope with this, e.g.to find out if a specific smart fridge door is open at any particular moment in time, you might have to look back through many rows to discover the last its state was reported. 

Aggregation

There are then many ways that we might want to aggregate up these individual judgements. For example, we might want to know how many times (or how much of the time) any individual smart fridge has had its door open for more than five minutes, over the past week.

Then we might want to aggregate further, to say "if having a smart fridge door open for more than five minutes represents a problem, then how often have we had a problem across our entire fleet of smart fridges over the past week?". That's a lot of words, but it's a well-defined and simple concept, as well as exactly the kind of KPI metrics that a business using IoT might have to report to its customers to prove its meeting its Service Level Agreement (SLA). 

An IoT query 2x final

Only then can we bin the results, for example, by day, in order to get, say, a bar chart. And speaking of bar charts, try to avoid confusing different tools based solely on the output they produce. IoT tools, BI tools, and even Excel can produce identical-looking charts and dashboards... but that doesn't mean they have the same kind of analytical enging under the hood.

Can DevicePilot help?

In short - yes. Of course!

As one goes higher and higher up this chain of analysis, the volume of data gets smaller and smaller, and more and more regular, until it's digested down to quite a small table of results, at which point the relational world of SQL and BI tools is excellent for taking it further. That's why DevicePilot is such a good complement for companies that already use BI tools are now getting into IoT. 

Further reading

DevicePilot vs Tableau

How to get SCADA working with IoT

The architecture of a modern IoT system

transparent inbox img

Want to see more great IoT content?

Sign up to our (spam-free!) newsletter for industry news, updates, and advice on how to get the most out of your project:

Comments

We promise that we won't SPAM you.