How do the smiley faces work?

Why the sad face?

The EDI-Net dashboard system (and its predecessor, smartspaces) aims to increase transparency around building energy and water consumption and to develop a common language for discussing energy performance between energy professionals, building users and other stakeholders. So it's really important that we understand what the smiley faces mean.

In this post I will try to explain the calculations driving the smiley faces so we can dig a bit deeper into what they mean and how they can be interpreted. I have tried to keep things simple, but the process is fairly technical, so some technical language is hard to avoid.

Simple visualisations

The EDI-Net dashboard presents simple, user-friendly reports which can be understood by users with any level of experience with energy/water management. The information we present is in the form of smiley/neutral/sad faces.

Try the example above by moving the slider. You can see that the faces can present any value from happy to sad. It is very easy to interpret what the face is communicating. We can all very quickly understand that the happy face is telling us that something is good and the sad face is telling us that something is bad. But what exactly does it mean?

A fair comparison

Energy management systems often compare buildings to each other. This approach is very useful for top-down decision making in terms of energy management investment and budgeting. But what we are trying to achieve is a more detailed, operational approach. So rather than comparing buildings to each other, we compare each building to itself in the past. This encourages continuous improvement and prevents a given building from being repeatedly and unfairly compared with buildings that are not really comparable.

Predicting usage

So we have designed the system to show a happy face when consumption is lower than predicted, a neutral face when consumption matches the prediction and a sad face when consumption is higher than predicted. The key to this approach is to establish a decent value for expected usage and to compare this to the actual usage. In EDI-Net we do this for every half hour period. Our expectation is generated by fitting a consumption model to historical data. We typically use 12 months of historical data for this.

This can be seen in the more detailed reports in the dashboard. The prediction is expressed as three coloured zones representing ‘normal’ consumption (the yellow zone), low consumption (the green zone) and high consumption (the red zone).

We can see in this case that the actual consumption (the black line) is floating around in the yellow zone, indicating that consumption is in line with expectation. Note also that on Saturday the 17th of March (last weekend) there was a period where consumption was in the red zone. This indicates a level of consumption that can be expected, but only rarely. In this case it was an open day at the university. If the same consumption levels were seen on a Sunday then it would be considered extreme.

The model we developed for this captures the variation in usage with both outside air temperature and with time of week. The basic model is illustrated in the figure below as solid lines, the raw data points are the dots. Each colour represents a different time slot.

Looking at one “time of week” we can see that as temperatures fall, usage tends to increase. The model captures these relationships and can be used to make a prediction. If we know the outside air temperature at a given time of week then we can read off the predicted consumption.
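To make this concrete, here is a minimal sketch of that kind of baseline model: a straight-line fit of consumption against outside air temperature, one line per "time of week" slot. The actual EDI-Net model is more sophisticated than this, and the data below is invented purely for illustration.

```python
# Sketch: per-time-slot least-squares fit of consumption vs outside temperature.
# Illustrative only -- not the actual EDI-Net model.
from collections import defaultdict

def fit_baseline(records):
    """records: list of (time_slot, outside_temp, consumption) tuples.
    Returns {time_slot: (slope, intercept)} from an ordinary least-squares fit."""
    by_slot = defaultdict(list)
    for slot, temp, kwh in records:
        by_slot[slot].append((temp, kwh))
    model = {}
    for slot, points in by_slot.items():
        n = len(points)
        sx = sum(t for t, _ in points)
        sy = sum(k for _, k in points)
        sxx = sum(t * t for t, _ in points)
        sxy = sum(t * k for t, k in points)
        slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
        intercept = (sy - slope * sx) / n
        model[slot] = (slope, intercept)
    return model

def predict(model, slot, temp):
    """Read off the expected consumption for a time slot at a given temperature."""
    slope, intercept = model[slot]
    return slope * temp + intercept

# Invented data: consumption rises as temperature falls (a heating load).
history = [("Mon 09:00", 0, 120), ("Mon 09:00", 10, 90), ("Mon 09:00", 20, 60)]
model = fit_baseline(history)
expected = predict(model, "Mon 09:00", 5)  # prediction at 5 degrees
```

Given the fitted model and the outside air temperature at a given time of week, the prediction is simply read off the line, exactly as described above.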

How much is too much?

The core model provides us with a prediction. Since we continuously collect consumption and outside air temperature data we can generate predictions continuously and compare them directly with actual consumption data. However, we still need to understand how much we can deviate from the prediction before it should be considered significant. We do this by observing the scatter in the baseline model.

For each data point in the baseline period we calculate how far it lies from the prediction. Then we look at the distribution of this scatter.

The figure above shows our method explicitly. The histogram shows the distribution of data around the baseline model. As new data are collected, the difference from the baseline model is calculated (imagine a vertical line crossing the x-axis). The baseline distribution is then used to convert this into a percentile score (imagine a horizontal line on the cumulative frequency axis). This score is what drives the smiley face. Low values are in the green, good zone. High values are in the red, bad zone.
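The percentile step can be sketched very simply with an empirical cumulative distribution: rank the new deviation against the baseline deviations. The residual values below are invented for illustration.

```python
# Sketch: convert a new deviation from the baseline model into a percentile
# score using the empirical distribution of baseline residuals.
from bisect import bisect_right

def percentile_score(baseline_residuals, new_residual):
    """Return a 0-100 score: the percentage of baseline residuals
    less than or equal to the new residual."""
    ordered = sorted(baseline_residuals)
    rank = bisect_right(ordered, new_residual)
    return 100.0 * rank / len(ordered)

residuals = [-5, -3, -1, 0, 1, 2, 4, 6]  # invented baseline scatter
score = percentile_score(residuals, 3)   # a new deviation of +3 units
```

A new reading well below prediction lands near 0 (green), one well above lands near 100 (red), and typical readings sit in the middle.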

The figure below shows how these zones relate to the model and the raw data more directly. New points in the green zone or below translate into green smiley faces. New points in the red zone or above are presented as red, sad faces.
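The final mapping from score to face is then a simple thresholding. The 30/70 cut-offs below are illustrative assumptions for the sketch, not the dashboard's actual zone boundaries.

```python
def face(score, green_below=30, red_above=70):
    """Map a 0-100 percentile score to a face.
    The 30/70 thresholds are illustrative assumptions, not EDI-Net's actual cut-offs."""
    if score < green_below:
        return "happy"
    if score > red_above:
        return "sad"
    return "neutral"
```

Scores in the green zone give happy faces, scores in the red zone give sad faces, and everything in between reads as neutral.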

In this way we can ensure that buildings are only compared to their own consumption. The method ensures that, if no change to the underlying pattern occurs, a building will see the full range of smiley faces. If consumption patterns shift, the smiley faces will move towards showing happy or sad faces more often.

Aggregating the data

In the actual dashboard we display daily and weekly averages rather than the raw half-hourly data. This avoids an extremely erratic dashboard while still capturing the overall patterns nicely.
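As a sketch of this aggregation step (assuming the simplest possible approach, a plain mean of the half-hourly scores), each day's face can be driven by the average score for that day:

```python
# Sketch: collapse half-hourly percentile scores into one score per day.
# The averaging scheme is an assumption for illustration.
from collections import defaultdict
from datetime import datetime

def daily_average(scores):
    """scores: list of (timestamp, percentile_score) pairs.
    Returns {date: mean score}, so one face can summarise each day."""
    by_day = defaultdict(list)
    for ts, score in scores:
        by_day[ts.date()].append(score)
    return {day: sum(vals) / len(vals) for day, vals in by_day.items()}

half_hourly = [
    (datetime(2018, 3, 17, 10, 0), 80.0),
    (datetime(2018, 3, 17, 10, 30), 60.0),
    (datetime(2018, 3, 18, 10, 0), 40.0),
]
daily = daily_average(half_hourly)
```

Weekly averages work the same way, just grouping by week instead of by calendar day.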

If you have any questions or comments on the methodology, please feel free to comment below. I will update the post as necessary for clarity and to remove ambiguity and mistakes.