All computers have a limit to the amount of information they can handle at a given time. An increase in workload will inevitably lead to a decrease in performance if the system is not designed to handle that increase. We’ve all experienced it countless times: new, cutting-edge software comes out, runs sluggishly on aging hardware, and the call to upgrade follows.
When dealing with multiplayer games, or games that run online, the problem is the same but it’s solved differently. These games run by communicating with a server somewhere else, and the server relays the data the user’s computer needs to display the game in real time. That server infrastructure is what we call the backend. When the backend is overloaded with work from more and more users trying to connect, performance drops. To put it starkly, the blessing of attracting higher numbers of users can quickly turn into a real headache for a developer whose backend infrastructure can’t handle the new volume, leaving those users frustrated, dropping the game and even rating it poorly. Scalability is the forward-looking solution.
Scalability is a backend system’s ability to handle increasing numbers of users without a decrease in performance. Plenty of developers have made the mistake of overestimating their backend’s ability to handle traffic, and their games have borne the brunt. SimCity (2013) is a case in point: launched with the then relatively novel “always-online” requirement, the game’s servers couldn’t handle the influx of players, who experienced long wait times and severe connectivity issues; the damage was such that the popular SimCity franchise has not seen another title to this day. Newer games don’t get away with it either. A more recent example is Anthem (2019), which was heavily criticized for very long waiting times, as long as five minutes.
To build a backend infrastructure with robust scalability, developers and their service providers must take several factors into consideration, such as the performance of the hardware and software components, the efficiency of the database, and the system’s overall design. Optimizing these components, adding more resources and/or redesigning the system may be required to achieve a more efficient backend infrastructure. The following are the seven most common ways to build scalability into a backend system.
Caching
A fairly common technique among all kinds of software developers and website designers is caching: storing frequently accessed data in memory to shorten the time it takes to retrieve it. Despite the minor annoyance of asking users for permission to store cached data on their devices, this technique improves a backend system’s performance and scalability, essentially because it reduces the load imposed on the system and the number of requests it has to handle. Backend systems can benefit from three types of caching (a minimal sketch follows the list below):
- In-memory caching, which stores data in the server’s memory. It is very fast, but has the drawback that the data is lost whenever the server is restarted or shut down.
- Disk-based caching, which stores data on disk rather than in the server’s memory. It’s slower than in-memory caching, but the data persists across server restarts and shutdowns. This is the main reason this type of caching is preferred for videogames in which the persistence of players’ save files is crucial.
- Content Delivery Network (CDN) caching uses a network of servers to cache static content such as images, videos, and CSS files. This type of caching improves a backend system’s performance by offloading the delivery of static content to the CDN, thereby reducing the load on the system.
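To make the idea concrete, here is a minimal sketch of an in-memory cache with time-to-live (TTL) expiry, written in TypeScript. The names (`TtlCache`, `loadPlayerProfile`) and the one-minute TTL are illustrative assumptions, not a prescribed design; the point is simply that repeated requests can be served from memory instead of hitting the backend’s source of truth every time.

```typescript
// A minimal in-memory TTL cache sketch. loadPlayerProfile is a hypothetical
// stand-in for a slow database query or remote call.
interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();

  constructor(private ttlMs: number) {}

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt < Date.now()) {
      this.entries.delete(key);
      return undefined; // miss: caller must fetch from the source of truth
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

async function loadPlayerProfile(id: string): Promise<{ name: string }> {
  // Hypothetical slow call, e.g. a database query.
  return new Promise((resolve) =>
    setTimeout(() => resolve({ name: `player-${id}` }), 100)
  );
}

const profileCache = new TtlCache<{ name: string }>(60_000); // 1-minute TTL

async function getPlayerProfile(id: string) {
  const cached = profileCache.get(id);
  if (cached) return cached; // served from memory, no extra backend load

  const fresh = await loadPlayerProfile(id);
  profileCache.set(id, fresh);
  return fresh;
}
```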
Load balancing
Distributing incoming requests across multiple servers or instances in a backend system is a technique called load balancing. It aims to improve a system’s performance and scalability by spreading the workload evenly across available resources. It prevents individual servers or instances from being overloaded, and as such greatly increases scalability. Load balancing can be implemented through the following (a minimal sketch follows the list):
- Hardware load balancers. Specialized hardware devices that sit between the client and the servers and distribute incoming requests using criteria such as availability or the number of connections each server is handling.
- Software load balancers. Software programs that run on servers or instances, and distribute incoming requests to the backend servers. They’re very common in cloud environments and offer more flexibility than hardware load balancers.
- Reverse proxies. A server that acts as an intermediary between the client and the servers and distributes incoming requests to the backend servers. Reverse proxies can also perform other tasks, such as caching and SSL termination.
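The sketch below shows a very small software load balancer using Node’s built-in http module and a simple round-robin strategy. The backend addresses and port are placeholder assumptions; production setups would normally rely on a dedicated load balancer or reverse proxy rather than hand-rolled code, but the core idea is the same.

```typescript
// A minimal round-robin software load balancer sketch.
// The backend hosts below are hypothetical placeholders.
import * as http from "node:http";

const backends = [
  { host: "10.0.0.1", port: 8080 },
  { host: "10.0.0.2", port: 8080 },
  { host: "10.0.0.3", port: 8080 },
];
let next = 0;

const server = http.createServer((clientReq, clientRes) => {
  // Pick the next backend in round-robin order.
  const target = backends[next];
  next = (next + 1) % backends.length;

  // Forward the incoming request to the chosen backend.
  const proxyReq = http.request(
    {
      host: target.host,
      port: target.port,
      path: clientReq.url,
      method: clientReq.method,
      headers: clientReq.headers,
    },
    (proxyRes) => {
      clientRes.writeHead(proxyRes.statusCode ?? 502, proxyRes.headers);
      proxyRes.pipe(clientRes);
    }
  );

  proxyReq.on("error", () => {
    clientRes.writeHead(502);
    clientRes.end("Bad gateway");
  });

  clientReq.pipe(proxyReq);
});

server.listen(3000);
```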
Microservices
Sometimes an application’s sheer size makes it move slowly, like a giant tanker traversing the vastness of the oceans without the maneuverability to quickly change direction. Microservices is an architectural style that addresses this: large applications are divided into smaller services that can be developed and deployed independently, which has several benefits for scalability, such as:
- Independent scaling: allowing each part of a system to scale based on its specific needs, which is far more efficient than scaling the entire system as a whole.
- Resilience: the ability to isolate a failing microservice to fix it without affecting the rest of the system.
- Flexibility: the ability to develop and deploy microservices in different languages and technologies, unlike a monolithic system built on a single technology stack.
- Continuous delivery: the independent development and deployment of microservices enables a more agile development process and allows new features to be released more quickly.
Implementing microservices must be a carefully weighed decision, though, because of the management overhead and inter-service communication complexity they introduce.
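As a rough illustration, here is what one such small, self-contained service might look like. The service name, port and in-memory store are illustrative assumptions; in a real deployment each service would own its own data store and be deployed, and scaled, independently of the others.

```typescript
// A minimal sketch of a single microservice: a self-contained leaderboard
// service with its own HTTP endpoint and its own (here in-memory) state.
import * as http from "node:http";

const scores = new Map<string, number>([
  ["PlayerOne", 1200],
  ["PlayerTwo", 950],
]);

const server = http.createServer((req, res) => {
  if (req.method === "GET" && req.url === "/leaderboard") {
    const top = [...scores.entries()]
      .sort((a, b) => b[1] - a[1])
      .map(([player, score]) => ({ player, score }));
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify(top));
    return;
  }
  res.writeHead(404);
  res.end();
});

// Only this service needs more instances when leaderboard traffic spikes;
// matchmaking, chat and other services scale on their own schedules.
server.listen(4001);
```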
Asynchronous processing
Asynchronous processing is a programming model in which requests are made and responses are returned at a later time, allowing backend systems to keep handling requests without waiting for each response, which improves the system’s scalability. Implementations vary:
- Callback functions are passed as arguments to other functions and executed when those functions complete, which allows the calling code to continue executing while it waits for a response.
- Promises are objects representing the eventual result of an asynchronous operation. They offer a more structured way to handle its response.
- Async/await is syntax introduced in JavaScript that allows asynchronous code to be written in a synchronous-looking style, which makes it easier to write and read.
- Reactive programming is a paradigm that involves building programs by composing asynchronous streams of data, and it can be used to implement asynchronous processing in a backend system.
Although asynchronous processing can greatly increase a backend system’s scalability, it also introduces complexity that makes the system harder to debug and maintain.
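A short async/await sketch makes the idea tangible. The two fetch functions below are hypothetical stand-ins for slow I/O such as database queries; the key point is that they run concurrently and the handler is not blocked while it waits for them.

```typescript
// A minimal async/await sketch. fetchProfile and fetchInventory are
// hypothetical stand-ins for slow I/O such as database queries.
async function fetchProfile(playerId: string): Promise<{ name: string }> {
  return new Promise((resolve) =>
    setTimeout(() => resolve({ name: `player-${playerId}` }), 100)
  );
}

async function fetchInventory(playerId: string): Promise<string[]> {
  return new Promise((resolve) =>
    setTimeout(() => resolve(["sword", "potion"]), 150)
  );
}

async function loadPlayer(playerId: string) {
  // Both requests run concurrently; the caller is free to serve other
  // work while it awaits the results.
  const [profile, inventory] = await Promise.all([
    fetchProfile(playerId),
    fetchInventory(playerId),
  ]);
  return { ...profile, inventory };
}

loadPlayer("42").then((player) => console.log(player));
```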
Horizontal scaling
A sudden burst of traffic is a serious challenge any developer must predict and address before it happens. Horizontal scaling helps with that by adding more servers to a backend system, in contrast to vertical scaling, which involves upgrading the hardware or software of a single server. Some of its benefits (sketched after the list below) are:
- Elasticity: the system’s ability to scale up or down as needed, a good strategy for environments with highly variable workloads. A newly released game expected to draw heavy traffic is a typical example.
- Fault tolerance: it allows the backend system to tolerate individual server failures, such as those caused by power outages, by distributing the workload to the servers still in operation.
- Cost-effectiveness: it allows the system to scale up or down without the need to upgrade hardware or software.
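The elasticity idea can be boiled down to a small policy function. The sketch below is an assumption-laden illustration: the thresholds are arbitrary, and the actual provisioning of servers would be done by your cloud provider or orchestrator, not by this code.

```typescript
// A minimal sketch of an elasticity policy for horizontal scaling.
// Thresholds and limits are hypothetical example values.
interface ScalingPolicy {
  minServers: number;
  maxServers: number;
  scaleUpAt: number;   // average load (0..1) that triggers adding a server
  scaleDownAt: number; // average load that triggers removing one
}

function desiredServerCount(
  current: number,
  averageLoad: number,
  policy: ScalingPolicy
): number {
  if (averageLoad > policy.scaleUpAt) {
    return Math.min(current + 1, policy.maxServers);
  }
  if (averageLoad < policy.scaleDownAt) {
    return Math.max(current - 1, policy.minServers);
  }
  return current;
}

// Example: at 85% average load with 4 servers, the policy asks for a 5th.
const next = desiredServerCount(4, 0.85, {
  minServers: 2,
  maxServers: 10,
  scaleUpAt: 0.75,
  scaleDownAt: 0.25,
});
console.log(next); // 5
```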
Database optimization
Database design, configuration and usage have a direct impact on the performance and scalability of a backend system. A well-ordered bookshelf lets a student quickly find the tome she needs, whereas a disorganized one makes her lose time skimming through countless titles. A backend system searching for the data it needs works no differently, and there are several ways to optimize this work (a small sketch follows the list):
- Indexing: creating special data structures that speed up searches and improve the performance of queries.
- Partitioning: dividing a large table into smaller, more manageable pieces in a way that reduces the amount of data that needs to be searched for a given query.
- Normalization: organizing databases into tables and columns in a way that reduces redundancy and dependency, which in turn reduces the amount of data that needs to be stored and retrieved.
- Query optimization: improving the way queries are written and executed, typically through a systematic approach that draws on all of the above.
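To show why indexing matters, the sketch below uses an in-application analogy: a Map keyed by player id plays the same role for an array that a database index plays for a table, turning a linear scan into a near-constant-time lookup. The data and field names are hypothetical.

```typescript
// An in-application analogue of database indexing: trade a little extra
// storage for much faster lookups.
interface SaveFile {
  playerId: string;
  level: number;
}

const saves: SaveFile[] = [
  { playerId: "a1", level: 12 },
  { playerId: "b2", level: 7 },
  // ...potentially millions of rows
];

// Without an index: every lookup scans the whole collection, O(n).
function findWithoutIndex(playerId: string): SaveFile | undefined {
  return saves.find((s) => s.playerId === playerId);
}

// With an index: build the structure once, then each lookup is O(1).
const byPlayerId = new Map<string, SaveFile>();
for (const save of saves) {
  byPlayerId.set(save.playerId, save);
}

function findWithIndex(playerId: string): SaveFile | undefined {
  return byPlayerId.get(playerId);
}
```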
Application Programming Interface (API)
Just as countries have built a complex network of diplomatic institutions and a shared language, programs need to be able to exchange data in order to work properly. An API is a set of protocols, routines and tools for building software applications that facilitates exactly that. The scalability of backend systems can benefit from APIs through the following (a minimal sketch follows the list):
- Decoupling: APIs help decouple the backend from the frontend, freeing it to focus on core functionality while enabling the frontend to scale on its own.
- Reusability: APIs can also expose the backend system’s functionality to other developers, allowing it to be used in a wider range of applications and improving its overall value.
- Security: APIs can be implemented so that they require authentication and authorization, improving the overall security of the backend system.
- Versioning: APIs can coexist in different versions, making it easier to update the backend without breaking existing integrations.
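Here is a minimal sketch of a versioned HTTP API. It assumes the Express framework purely for brevity; the routes, payload shapes and port are hypothetical examples, not a prescribed design. The point is that a v2 contract can evolve while v1 stays alive for existing clients.

```typescript
// A minimal sketch of a versioned HTTP API using Express (an assumption;
// any HTTP framework would do).
import express from "express";

const app = express();

// v1: the original contract, kept alive for existing integrations.
app.get("/v1/players/:id", (req, res) => {
  res.json({ id: req.params.id, name: "PlayerOne" });
});

// v2: a newer contract can evolve without breaking v1 clients.
app.get("/v2/players/:id", (req, res) => {
  res.json({ id: req.params.id, displayName: "PlayerOne", region: "eu-west" });
});

app.listen(3000);
```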
All seven strategies for implementing scalability have pros and cons that should be weighed before setting up the backend infrastructure for apps or videogames, especially around launch day, when, with luck, traffic will exceed the developer’s and publisher’s expectations. In the case of esports events, nothing harms a game’s reputation more than the lack of a robust, scalable backend system, with awkward delays playing out while countless spectators watch on Twitch.
Overall, scalability strategies can introduce a high level of implementation complexity that presents a development team with challenges of its own. These can come with additional costs that may or may not be worth incurring, depending on the project’s size and ambition, that is, the volume of expected users.