The SRE Low Frequency Reporting Platform

By Grant Mitchell and Ashley Bullock


Hello, this is the first in a series of blog entries we intend to write about all things Site Reliability Engineering at Paddy Power Betfair. As this is our first ever blog post (so please be kind!), we thought we’d write about a straightforward project we ran recently: our low frequency reporting platform. We’ll discuss what we built, why we built it and some of the outcomes we saw. This project is quite simple and likely not 100% transferable to your estate. However, we hope our thoughts on the project will be useful, even if the detailed “how we did it” is not :). The framework was delivered over a few sprints, and the services running on it are continually being developed.

What is a Low Frequency Reporting Platform?

During our day-to-day work, we have discovered the need to collect and process lots of small datasets at a relatively low frequency, such as daily or hourly collections. These include reports such as hardware health, app performance or information from ServiceNow. The typical way to tackle this in our environment would be to create a service or app, internally known as a TLA (we generally name them with a three-letter acronym). All TLAs are deployed with an A or B appended to their hostname. When a TLA running as “A” is updated and re-deployed, the new deployment becomes “B”. The next deployment then becomes A, then B, and so on. Rather than develop an individual TLA for each little project, we decided to create a framework to host “small” services.
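As a rough illustration of the A/B naming described above, here is a minimal Python sketch. The hostname format (a name with the letter as its final character, such as `lfrA`) and the function names are our assumptions for the example, not the actual deployment tooling.

```python
def next_deployment_suffix(current: str) -> str:
    """Return the suffix the next deployment would use (A -> B, B -> A)."""
    if current not in ("A", "B"):
        raise ValueError(f"unexpected deployment suffix: {current!r}")
    return "B" if current == "A" else "A"


def next_hostname(hostname: str) -> str:
    """Swap the trailing A/B on a hostname.

    Assumes (hypothetically) that the suffix is the last character,
    e.g. 'lfrA' -> 'lfrB'. Real hostnames may be structured differently.
    """
    if not hostname:
        raise ValueError("hostname must not be empty")
    return hostname[:-1] + next_deployment_suffix(hostname[-1])


if __name__ == "__main__":
    host = "lfrA"  # made-up TLA name
    for _ in range(3):
        host = next_hostname(host)
        print(host)
```

Every small project that gets its own TLA also gets its own A/B pair to deploy and track, which is part of the overhead a shared framework avoids.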


An Agile Journey

So an amazing day comes to an end …

I am now sitting on my own, at my desk in the office.

The floor is empty, everybody has gone home, enjoying the sunshine and probably watching the World Cup match that is on now (Colombia v England). And here on my own, I look at my day in Retrospectives.


Balance And Exposure the Future

The Betfair exchange is a pioneering, market-leading online betting exchange. It is the biggest and most mature betting exchange around.

What is a betting exchange?

A betting exchange allows customers to bet against each other rather than against a bookmaker. This differentiates it from traditional betting shops and bookmakers, as the betting exchange allows the user to act as the bookie (by setting the odds for an event) or as the customer (who bets using the odds set by the other user).

Whereas traditional bookmakers accept the risk of going head-to-head in various bets with customers, the business of a betting exchange does not involve any risk. The exchange simply provides the technology that pairs customers together in order for bets to take place and takes a commission from the net winnings that result.
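To make the commission model concrete, here is a minimal Python sketch of settling a single matched bet between a backer and a layer. It is a simplification under stated assumptions: the function name, the per-bet settlement and the 5% rate in the usage lines are illustrative only, and a real exchange calculates commission on a customer's net winnings rather than bet by bet.

```python
from decimal import Decimal


def settle_matched_bet(stake, odds, selection_won, commission_rate):
    """Settle one matched bet; all amounts are Decimal.

    stake: the backer's stake.
    odds: decimal odds agreed between the two customers.
    commission_rate: hypothetical rate charged on the winner's profit.
    """
    # The backer wins stake * (odds - 1) if the selection wins, else loses the stake.
    backer_gross = stake * (odds - 1) if selection_won else -stake
    # The layer (the customer acting as bookie) takes the opposite side.
    layer_gross = -backer_gross
    # The exchange carries no position: it only charges the winning side.
    commission = max(backer_gross, layer_gross) * commission_rate
    backer_net = backer_gross - commission if backer_gross > 0 else backer_gross
    layer_net = layer_gross - commission if layer_gross > 0 else layer_gross
    return {"backer_net": backer_net, "layer_net": layer_net, "commission": commission}


if __name__ == "__main__":
    result = settle_matched_bet(
        stake=Decimal("10"),
        odds=Decimal("3.0"),
        selection_won=True,
        commission_rate=Decimal("0.05"),  # assumed rate, for illustration only
    )
    print(result)
```

Whichever way the event goes, the two customers' gross results cancel out and the exchange's income is the commission alone, which is why it carries no risk on the outcome.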


Our journey into planet hackathon…

I want to take you back about twelve months, so that I can set the scene for you. My team is still dealing with the fallout from a merger that forced together two very different tech stacks and teams that don’t know each other. On top of that we are still in the middle of what turns out to be a two-year mega project that has sucked time and energy from almost everyone.


Exchange Market Data Streaming with Kafka

At Betfair our read services are hit with billions of requests per day, and those requests are not evenly distributed. They arrive in huge spikes of traffic during key sporting events, putting our customer-facing services under huge pressure for sustained periods throughout the day. We develop our systems to cater for this demand, keeping true to our latency SLAs, all the while operating without downtime. Unlike comparable trading platforms used in the financial world, we don’t have the option of closing trading at 5pm: sporting events occur around the clock, every day.

When we talk about read services, we are referring to anything that is presented in real time to customers, either through the API or via our online channels. Most notably, this means our price read services, which were the first to move to the streaming model. If you are not familiar with financial trading, price read services present ‘ticks’ on a market to our customers, billions of them. Ticks are price/volume pairings for a given selection on a market. See below.

[Figure: Trades]
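For readers who prefer code, a tick as described here could be modelled along these lines. This is a hedged sketch only: the `Tick` class, its field names and the `latest_volume_by_price` helper are our own illustrative assumptions, not the exchange's actual data model or streaming API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tick:
    """A price/volume pairing for one selection on one market (hypothetical shape)."""
    market_id: str
    selection_id: int
    price: float
    volume: float


def latest_volume_by_price(ticks):
    """Fold ticks, in arrival order, into the latest volume at each price.

    Returns a dict keyed by (market_id, selection_id), each value being a
    {price: volume} mapping. A later tick at the same price overwrites an
    earlier one, which is an assumption about how updates are applied.
    """
    book = defaultdict(dict)
    for tick in ticks:
        book[(tick.market_id, tick.selection_id)][tick.price] = tick.volume
    return dict(book)


if __name__ == "__main__":
    ticks = [
        Tick("market-1", 1, 2.5, 100.0),
        Tick("market-1", 1, 2.6, 10.0),
        Tick("market-1", 1, 2.5, 80.0),  # newer volume at an existing price
    ]
    print(latest_volume_by_price(ticks))
```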
