How predicting streaming loads can help broadcasters and operators slash costs

Using advanced AI modelling, streamers can predict the future demands on streaming infrastructure, allowing them to save significant operational costs.

Summary

  • The deployment of AI models enables media companies to accurately predict CPU loads and cloud capacity requirements, optimizing resource allocation and minimizing guesswork.
  • Early tests indicate that the AI system can help operators utilize 30% less infrastructure, leading to significant cost savings and a reduced carbon footprint.
  • The system leverages cloud TV platform metrics and viewer consumption patterns, providing insights into daily and weekly patterns to forecast demand more effectively.
  • By anticipating both regular demand and potential outlier events (like major sports matches or film releases), the AI system allows companies to right-size their cloud resources, enhancing economic and environmental efficiency.

Meeting peak demand

One of the major expenses media companies face when streaming video is provisioning the cloud capacity needed to meet peak demand. The flexibility of the cloud does, of course, allow them to spin up servers as viewer numbers rise, but this is not an instant process.

Faced with the problems a capacity shortfall can cause (buffering, dropped connections, and the reputational damage that follows), many companies choose either to run at 100% capacity all the time or to rely on rough rules of thumb when estimating future demand. If there is a big soccer match at the weekend, they simply book more nodes.

That only takes them so far. We believe AI can make this process far more sophisticated and take the guesswork out of it. We are currently field-testing an AI model (in fact a collection of AI models working in combination) that can scan the coming two weeks of the program guide and predict the CPU loads and cloud capacity needed to handle all the processing across the catalogue. It can then give an operations team either a comprehensive guide to what they can expect or automatically provision the capacity needed.

It’s a powerful system: the data we are seeing so far from our tests with a Tier One operator indicates that they are able to run on 30% less infrastructure as a result of its deployment. That represents both a significant cost saving and an important reduction in their carbon footprint.

Analyzing the data

There are two main sources of data that can be used to build a system like this, and VO works with both of them.

First, you need the system metrics from the cloud TV platform and the ability to analyze performance data from Kubernetes clusters and the like. Once you can track processing load, memory load, network load, and more with fine-grained metrics that refresh every 30 seconds to one minute, you can start building a deep understanding of how the system performs.
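The post does not name the tooling involved, but as an illustration of what collecting these fine-grained metrics can look like, here is a minimal Python sketch that assumes a Prometheus server scraping the Kubernetes clusters. The endpoint and the PromQL queries are illustrative assumptions, not VO's actual setup.

```python
# Minimal sketch: pulling fine-grained cluster metrics at a 30-second step.
# The Prometheus URL and the metric queries below are illustrative only.
import datetime as dt
import requests

PROMETHEUS_URL = "http://prometheus.example.internal:9090"  # hypothetical endpoint

def query_range(promql: str, hours: int = 24, step: str = "30s") -> list:
    """Fetch a time series for the last `hours` hours at the given resolution."""
    end = dt.datetime.now(dt.timezone.utc)
    start = end - dt.timedelta(hours=hours)
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query_range",
        params={
            "query": promql,
            "start": start.timestamp(),
            "end": end.timestamp(),
            "step": step,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["data"]["result"]

# Illustrative queries for processing, memory, and network load per namespace.
cpu = query_range('sum(rate(container_cpu_usage_seconds_total[2m])) by (namespace)')
mem = query_range('sum(container_memory_working_set_bytes) by (namespace)')
net = query_range('sum(rate(container_network_receive_bytes_total[2m])) by (namespace)')
```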

Machine learning is adept at detecting daily and weekly patterns. Some of these are obvious, such as greater demand at weekends and at peak times; others are harder to tease out and only start to make sense when analyzed alongside consumption patterns (see below). We have found that regular ML is already good at predicting ahead of time what the platform load will be, but what surprised us is that Deep Learning techniques based on Large Language Models are even better. These provide consistently superior results despite being far removed from their original purpose.
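To make the "regular ML" baseline concrete, the sketch below builds a simple day-of-week and hour-of-day profile from historical CPU load and uses it as a forecast. The column names and DataFrame layout are assumed for illustration; the real models are considerably more sophisticated.

```python
# Minimal sketch of a seasonal baseline: average CPU load per
# (day-of-week, hour-of-day) bucket, then reuse that profile as the forecast.
# The "timestamp" and "cpu_load" columns are assumed for illustration.
import pandas as pd

def seasonal_baseline(history: pd.DataFrame) -> pd.Series:
    """Mean CPU load per (day-of-week, hour-of-day) bucket."""
    ts = pd.to_datetime(history["timestamp"])
    return (
        history.assign(dow=ts.dt.dayofweek, hour=ts.dt.hour)
        .groupby(["dow", "hour"])["cpu_load"]
        .mean()
    )

def seasonal_forecast(profile: pd.Series,
                      future_index: pd.DatetimeIndex) -> pd.Series:
    """Look up the learned profile for each future timestamp."""
    keys = list(zip(future_index.dayofweek, future_index.hour))
    return pd.Series(profile.reindex(keys).to_numpy(), index=future_index)
```

A baseline like this captures the obvious weekly rhythm; it is the harder, consumption-driven effects where the deep learning forecasters described above earn their keep.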

As mentioned, this data set should be combined with consumption patterns to produce accurate results and the sort of holistic overview that delivers real insight. Because of our pioneering work with analyzing viewer data, we know a great deal about this already. We know how much video end customers are consuming, we know what they are watching, we know what live events they are interested in, we have already used AI to extrapolate viewer profiles from first-party data, and we can perform semantic analysis on all this information.

Building the system

So, we have our data from two different sources. The next step was to join several AI models together so that we can combine system metrics and viewership data and understand how viewership impacts the platform. We can then look at the impact that different types of content, delivered at different times to different audiences, have on the backend systems. From there we trained the system to ingest the EPG up to two weeks ahead and predict what the levels of demand will be, as sketched below.
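The internals of the combined models are not published, but as a rough sketch of the idea, the example below joins hypothetical metrics, viewership, and EPG tables and trains a single gradient-boosting regressor to map EPG-derived features to CPU load. Every column and feature name is an assumption, and the production system combines several models, including deep learning forecasters, rather than one regressor.

```python
# Minimal sketch: join platform metrics with viewership, derive simple EPG
# features, and fit a regressor that predicts CPU load up to two weeks ahead.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical feature set derived from viewership history and the EPG.
FEATURES = ["hour", "dow", "expected_viewers", "is_live_sport",
            "is_premiere", "concurrent_channels"]
TARGET = "cpu_load"

def build_training_set(metrics: pd.DataFrame,
                       viewership: pd.DataFrame,
                       epg: pd.DataFrame) -> pd.DataFrame:
    """Join the three sources on a shared hourly timestamp column."""
    df = metrics.merge(viewership, on="timestamp").merge(epg, on="timestamp")
    ts = pd.to_datetime(df["timestamp"])
    return df.assign(hour=ts.dt.hour, dow=ts.dt.dayofweek)

def train_load_model(df: pd.DataFrame) -> GradientBoostingRegressor:
    model = GradientBoostingRegressor()
    model.fit(df[FEATURES], df[TARGET])
    return model

def predict_two_weeks(model: GradientBoostingRegressor,
                      future_epg: pd.DataFrame) -> pd.Series:
    """future_epg carries the same engineered features for the next 14 days."""
    return pd.Series(model.predict(future_epg[FEATURES]),
                     index=pd.to_datetime(future_epg["timestamp"]))
```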

Crucially, the system delivers a level of insight that does not just predict the pattern but can also predict what the outliers might be. It can look at potentially disruptive scheduling choices — news events, new movie releases, series finales, sports matches, and more — and accurately predict how these will affect the established trends.

This allows media companies to allocate exactly the infrastructure their future programming requires and no more. They are continually right-sized, whatever the peaks and troughs. We already know that 60-70% of the platform load will typically occur in primetime, yet even companies running flexible, cloud-native infrastructure tend to keep their Kubernetes clusters allocated at maximum capacity to cover the highest load. This is both economically and environmentally inefficient.

Our system allows them to plan ahead and ensure that cloud capacity can be matched to a projection that is both more accurate and more dynamic than if the operations team simply predicted their cloud workloads without the AI.
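How the automatic provisioning mentioned above is wired up internally is not disclosed, but one plausible sketch, using the official Kubernetes Python client, is to convert each predicted load figure into a replica count with some headroom and apply it ahead of the forecast peak. The deployment name, namespace, per-replica capacity, and 20% headroom below are illustrative assumptions.

```python
# Minimal sketch: translate a predicted load figure into a replica count and
# apply it ahead of time with the official Kubernetes Python client.
# Deployment name, namespace, capacity, and headroom are assumptions.
from kubernetes import client, config

CAPACITY_PER_REPLICA = 400.0   # assumed load units one pod can absorb
HEADROOM = 1.2                 # keep a 20% safety margin above the forecast

def replicas_for(predicted_load: float, min_replicas: int = 2) -> int:
    """Round up to the number of replicas needed to cover the forecast."""
    needed = int(predicted_load * HEADROOM / CAPACITY_PER_REPLICA) + 1
    return max(needed, min_replicas)

def scale_ahead(predicted_load: float,
                deployment: str = "packager",      # hypothetical workload
                namespace: str = "streaming") -> None:
    """Patch the deployment's replica count before the predicted peak arrives."""
    config.load_kube_config()                       # or load_incluster_config()
    apps = client.AppsV1Api()
    apps.patch_namespaced_deployment_scale(
        name=deployment,
        namespace=namespace,
        body={"spec": {"replicas": replicas_for(predicted_load)}},
    )
```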

User feedback

When we demoed this system at IBC2024, some customers pointed out that they already knew an upcoming soccer match would need more resources. That is a valid point, but it covers only one narrow area of the business. Our system will not only flag that match but also all the other potential peaks and troughs over the coming two weeks. Public cloud infrastructure does, of course, already offer autoscaling capabilities, but these are reactive and rarely quick enough to catch up with demand. Our approach allows broadcasters and operators to get ahead of the curve and flex their systems appropriately.

We have used a combination of machine learning and deep learning techniques in developing this, and have also paid close attention to keeping the resulting solution cost-effective. Trade-offs between accuracy and cost need to be balanced carefully, especially when the latest genAI techniques are running on the latest GPUs. But given genAI's successes, especially in time series forecasting, we look forward to using it more as costs come down.

Currently this functionality only works with VO delivery platforms, but we are looking at integrating it with other systems so that we can draw in the raw datasets covering metrics and consumption patterns for those too. This will allow more companies to take advantage of the cost-saving capabilities the system offers.

 


 

Romary Dupuis

With more than 20 years of experience in engineering, management, and strategic innovation for the telecommunications industry, Romary Dupuis brings a solid understanding of big data and AI to Viaccess-Orca (VO). Prior to being the chief data officer at VO, Dupuis led state-of-the-art strategic programs for Orange in Europe and Asia, and founded several startups in big data and AI in Japan. Throughout his career, Dupuis has helped identify breakthrough innovations, successfully moving them past the innovation stage to commercialization.