How to Build Unbreakable Live Streams at Scale

What are the best workflows to develop for delivering reliable live streams at scale? Adam Miller, CEO, Nomad Technologies, David Hassoun, Chief Technologist, Dolby Cloud Media Solutions, Dolby.io, Peter Wharton, Chief Strategy & Cloud Officer, TAG Video Systems, and Corey Smith, Sr. Director, Advanced Production Technology, CBS Sports Digital, Paramount, discuss how their organizations deliver seamless live streaming at scale in this panel from Streaming Media West 2022.

“Live event streams can break down anywhere along the line,” says the moderator, Eric Schumacher-Rasmussen, Chair, Streaming Media Conferences, and CMO, id3as. He asks Adam Miller of Nomad Technologies to kick off the discussion. “What do you see as the key touchpoints to focus on when building workflows for reliability?”

Miller emphasizes that a large-scale live stream has countless possible points of failure. He says that at Nomad, they try to break these possibilities down into an organized chart that narrows the categories to three parts. “What is the cost to make this redundancy better?” he says. “What is the effort level to make it better? [And] where do you put your budget and time and energy? And when you do that, you're going to find that out of those hundred breakage points, probably about six really become most prominent. That's going to be where you can put your energy to help improve reliability, versus somebody saying, ‘Hey, I need multiple encoders.’ Well, put the energy towards where it's going to matter the most. So that's what we normally do when we start looking at redundancy.”
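
The triage Miller describes can be pictured as a simple scoring exercise. The sketch below is a hypothetical illustration, not Nomad's actual chart: the `FailurePoint` fields, the 1-to-5 scales, the scoring formula, and the sample inventory are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailurePoint:
    """One place the live workflow can break (all fields are illustrative)."""
    name: str
    impact: int  # 1-5: how badly a failure here hurts the stream
    cost: int    # 1-5: money needed to add redundancy
    effort: int  # 1-5: engineering time needed to add redundancy


def priority(point: FailurePoint) -> float:
    """Assumed score: impact protected per unit of cost and effort spent."""
    return point.impact / (point.cost + point.effort)


def top_priorities(points: list[FailurePoint], limit: int = 6) -> list[FailurePoint]:
    """Return the `limit` points where redundancy spending pays off most."""
    return sorted(points, key=priority, reverse=True)[:limit]


# Hypothetical inventory; a real one might list a hundred entries.
inventory = [
    FailurePoint("venue uplink", impact=5, cost=2, effort=1),
    FailurePoint("contribution encoder", impact=5, cost=3, effort=2),
    FailurePoint("origin server", impact=3, cost=2, effort=2),
    FailurePoint("graphics overlay", impact=1, cost=3, effort=3),
]

for point in top_priorities(inventory, limit=3):
    print(f"{point.name}: {priority(point):.2f}")
```

Dividing impact by the sum of cost and effort is only one possible weighting; the point is that an explicit ranking makes it easier to justify spending on a handful of breakage points rather than on whichever one was raised most recently.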

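Peter Wharton of TAG Video Systems notes that most people outside the industry have little idea of the sheer complexity of these live streaming systems. “You need to monitor the whole thing, especially when you're doing an ad hoc system build or you're standing up a live system on demand,” he says. “Because then you have all these moving parts and now, you're just building something instantaneously. So you have to make sure all those moving parts are working. It's not something you've had running for months, and you know how it works. But at the same time, every point in the workflow has a different value in the workflow.”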

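Wharton says that it’s a matter of adjusting for priorities. “As you go further down that food chain, every touchpoint that can impact a viewer is impacting fewer and fewer viewers,” he says. “So, therefore, you also have to make sure that the way you monitor actually scales the cost of monitoring to the value of the content at each point in the workflow. And that's a challenge to get all that right but still make sure it works everywhere. Because you can't spend the same amount of money on some edge of a CDN that's only going to affect a single area.”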
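
Wharton's point about matching monitoring cost to value can be sketched as a proportional split. This is a minimal illustration under stated assumptions: it uses the number of viewers a failure would affect as a stand-in for "value," and the function name, workflow points, viewer counts, and budget are all hypothetical.

```python
def allocate_monitoring_budget(
    total_budget: float, viewers_affected: dict[str, int]
) -> dict[str, float]:
    """Split a monitoring budget in proportion to the viewers at risk at each point."""
    total_viewers = sum(viewers_affected.values())
    if total_viewers <= 0:
        raise ValueError("at least one point must affect some viewers")
    return {
        point: total_budget * viewers / total_viewers
        for point, viewers in viewers_affected.items()
    }


# Hypothetical event: an upstream failure hits everyone, a single edge hits few.
viewers_affected = {
    "contribution feed": 1_000_000,
    "origin": 1_000_000,
    "regional CDN": 250_000,
    "single CDN edge": 10_000,
}

budget = allocate_monitoring_budget(100_000.0, viewers_affected)
for point, amount in budget.items():
    print(f"{point}: {amount:,.0f}")
```

A real allocation would likely add a minimum spend per point so that nothing goes entirely unmonitored, which matches Wharton's caveat that the system still has to work everywhere.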

David Hassoun of Dolby.io highlights the crucial need for validation testing of systems. “Every one of those key breakpoints, you really need to be able to hammer that, and you're going to have assumptions of how it can break, and those are the ones that you can at least test,” he says. “But also try to go beyond what you expect and understand what the different failure scenarios are, how you're going to transition, and make sure that that is smooth. It needs to be really worked through. It becomes muscle memory for all the teams, and that's going to shape what happens in those situations.”

Miller cites what he calls the “golden rule”: leave what is working alone and don't over-tinker. “When it comes close to time to deploy, don't touch anything!” he says. “People forget that, and they think, ‘Oh, let's swap out this encoder at the last minute.’ If you're trying to reliably distribute something, touch as little as possible. And if you're going to do it ten times, don't touch it at all. Build it once, and then just leave it and reuse it ten times. I find a lot of people forget that golden rule of ‘Don't change things two minutes before.’”

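Corey Smith of CBS Sports and Paramount emphasizes, “There's a real art to figuring out the telemetry coming from the outside, [and] also how the customers experience the event.” He notes that CDNs are not always the most accommodating when it comes to stress testing every possible point of failure.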

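“When I was at Xbox,” he says, “we were doing a lot of things to scale large customer events, whether it was an E3 keynote or something else, or whatever game was on the console that day. But we took a lot of pride in actually testing to failure. But when you go to an Akamai or you go to a Limelight or some other CDN provider and say, ‘Hey, I'm going to stress test my network out. Can you help support 2.5 to 3 terabits per second of traffic, because I want to scale to two and a half million concurrents?’ How do you tell a CDN provider that's where you want to test to? They're going to laugh you out the door and say we can't absorb that on our network. But a lot of it is knowing where your traffic's going, being multi-CDN agnostic or being agnostic to a single CDN provider, but also being able to get feedback and telemetry from the customers so that you can make intelligent traffic decisions in near real time based on what the customer is actually seeing. So if you are pushing a bit rate of 10 megabits plus, so you're doing 1080p Plus video at high quality, you don't exactly own all of the edge ecosystem that you're actually deploying to. You just own the video player tech, so to speak. You kind of want to get that feedback so you can say, ‘Hey, the provider in this city and this particular region isn't doing so well, and this is the CDN provider we need to start shifting traffic away from.’ You have to build those telemetry systems into the actual foundation of the application, because you can't turn it off and on again during a live event. You have to be able to let your traffic ebb and flow across the global internet and be able to do it seamlessly.”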
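
The kind of telemetry-driven steering Smith describes might look something like the following. This is a rough sketch under stated assumptions, not CBS Sports' or Xbox's system: the `PlayerSample` fields, the rebuffering threshold, the penalty factor, and the CDN and region names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class PlayerSample:
    """One telemetry report from a video player (illustrative fields)."""
    cdn: str
    region: str
    rebuffer_ratio: float  # fraction of watch time spent rebuffering


def steer_traffic(
    samples: list[PlayerSample],
    region: str,
    max_rebuffer_ratio: float = 0.02,  # assumed health threshold
    penalty: float = 0.25,             # assumed weight for an unhealthy CDN
) -> dict[str, float]:
    """Return each CDN's share of new sessions for one region."""
    by_cdn: dict[str, list[float]] = defaultdict(list)
    for sample in samples:
        if sample.region == region:
            by_cdn[sample.cdn].append(sample.rebuffer_ratio)
    if not by_cdn:
        return {}  # no telemetry yet, so leave the current split alone
    # Healthy CDNs keep full weight; struggling ones are dialed down, not cut off.
    raw = {
        cdn: 1.0 if mean(ratios) <= max_rebuffer_ratio else penalty
        for cdn, ratios in by_cdn.items()
    }
    total = sum(raw.values())
    return {cdn: weight / total for cdn, weight in raw.items()}


samples = [
    PlayerSample("cdn-a", "chicago", 0.010),
    PlayerSample("cdn-a", "chicago", 0.015),
    PlayerSample("cdn-b", "chicago", 0.080),
    PlayerSample("cdn-b", "chicago", 0.060),
    PlayerSample("cdn-b", "denver", 0.005),
]

print(steer_traffic(samples, "chicago"))
```

Reducing a struggling CDN's share instead of dropping it to zero keeps some traffic, and therefore some telemetry, flowing through it, so the split can recover once the provider does. That fits the idea of letting traffic ebb and flow during the event rather than switching anything off.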

