Arkadiusz Krysik

Scalability, Allocation, and Processing of Data for Multitenancy

When developing a SaaS application and expanding business, companies are certain to eventually encounter large amounts of raw data which can lead to various technical difficulties.

Data allocation is especially important in applications based on multi-tenant solutions.

As we’ve mentioned in the previous post Introduction to Multi-Tenant Architecture, they, with only one instance of the software, serve multiple customers.

There are numerous aspects of data processing in multi-tenant applications, such as scalability, optimization, and performance.

Today we are going to go a little more in-depth into this topic.

In this article, we are going to answer the following questions:

  • How do types of data differ from one another?
  • Why characteristics of analyzed data are so important in multitenancy?
  • How do different types of multi-tenant databases influence system performance?

Building a new application or extending your development team?

🚀 We're here to assist you in accelerating and scaling your business. Send us your inquiry, and we'll schedule a free estimation call.

Estimate your project

How to allocate, scale and process data in a multi-tenant architecture

In our previous article from the Multi-Tenant Architecture series, we explained what multitenancy is using the shepherd metaphor.

We followed the adventures of an ambitious shepherd who’s decided to set up their own sheep-keeping business.

Now the shepherd is facing a tough decision. What is the best approach to labeling and storing all the sheep from various owners on the pasture?

metaphor shepherd for multitenancy

This, of course, is a broad analogy to different types of databases in a multi-tenant architecture in a SaaS application.

It shows that there are multiple approaches to data processing where multiple tenants are connected to a shared environment.

But how to decide which approach will work best in your organization? In order to make the most informed decision, it is advisable to look at the data that you are anticipating to receive.

There are a lot of specific characteristics of various datasets that need to be taken into consideration by developers working on big data multi-tenant solutions.

Explore this topic with us! Follow up reading: What is Multitenancy? Definition of Multi-Tenant Architecture.

Multi-tenant architecture — characteristics of the datasets

First, we should start with the basics.

A database is a systematic collection of data. It supports electronic storage and manipulation of stored information.

In the following parts, we are going to focus on relational databases.

Relational databases (RDB) consist of multiple arrays of data that can work together. Relational databases have internal programming languages (SQL), which can be used by software  developers to create custom menus and various advanced data handling functions. In such databases, all data values are based on the simple data types and are represented in the form of two-dimensional tables.

Scalability, allocation, and processing

Ok, if we have all the basics covered, let’s move on further.

If we look at the raw data stored in the databases themselves, we can point out 3 main characteristics of these datasets: scalability, allocation, and processing.

When starting to develop a data allocation solution for a system in which a single software instance can save multiple users, software architects have to consider the above characteristics. Only then any serious resources can be dedicated to the development process.

Each data characteristic is equally important, particularly from an economical standpoint. Understanding them can vastly improve eventual vertical and horizontal scaling in the future, as the system will be built with the specific type of client data in mind.

1. Data scalability

Data scalability is one of the most essential aspects to take into consideration when making a data warehouse for a system future-proof. It enables easy and seamless horizontal scaling.

It is even more critical when developing a system in multi-tenancy architecture. The reason why is because a single instance of software serves multiple customers and users can influence each other’s performance.

The size of datasets

The first thing to consider when designing a data storage system for a client or an internal project is the expected number of tenants and overall raw data volume.

Planning and developing a highly scalable data storage solution might turn out to be a waste of time if you:

  • have a relatively small number of tenants (under 10),
  • store a small number of entities for each tenant.

In database administration, an entity can be a single thing, person, place, or object.

On the other hand, when you have, say, around 60 tenants, or if one of your strategic goals is to reach that level in a feasible timeframe, building a scalable solution is becoming a key point of consideration in cloud computing.

This is probably one of the main reasons why companies are pursuing multi-tenant architectures in their big data solutions. They are dividing users’ data into multiple servers just because they are unable to store them on one machine.

As your business grows, the benefits of having a clear data storage strategy, and applying automation solutions to database management, will become more prevalent.

Data growth

The next characteristic is the future growth of the sheer volume of stored data. Under some circumstances, we have data whose volume is relatively stable with time.

Take, for instance, a database of states in the USA. The last change happened in 1959 when the state of Hawaii joined the Union.

There has been no indication that this may change in the foreseeable future.

If we are managing this kind of database, we can be sure that it won’t change in size in any meaningful way.

But on the other hand, there are modern SaaS applications or services like social media sites that can grow incredibly quickly, providing that they have a smart business plan and strategy. In this case, each new client and user means more memory requirements for the company’s servers.

That is another aspect that influences companies’ decisions about moving into multi-tenant architecture.
Multitenant architecture allows developers to relatively quickly add new shards to the database in order to serve different tenants and maintain the overall system performance.

2. Data allocation

When planning a data storage solution for a project in which a single application instance serves multiple users, you need to decide on an approach for sharing or isolating your tenants’ data.

Data is often considered to be one of the most valuable parts of a SaaS application since it provides invaluable business information for management teams.

It is worth mentioning that some data is more closely connected to specific tenants than others. Because of that, we can divide all the available user data into two main categories based on this characteristic.

The degree of tenant and data association

We can point out two approaches to differentiating databases when it comes to the degree of data association between them, and their owner.

1. type: a strong data association with tenants

This type of data is highly associated with its owner, which means that it is specific to only one tenant.

In reference to our shepherd metaphor from the introductory article, let’s take the owner of the herd of sheep who is a Tenant. The Sheep from a given herd would always be associated with this specific tenant. The degree of association between the Tenant and Sheep is strong.

2. type: optional association of data with tenants

On the other hand, there is a type of data that is only optionally associated with a specific tenant.

Imagine a brave shepherd dog that guards the sheep. The dog can be owned either by a specific tenant, and it is supposed to look only after their sheep, or by the shepherd – the owner of the farm. In the second case, the dog is not linked to the specific tenant, but it guards all the sheep from various tenants.

It means that the association of the dog with a given tenant is only optional.

Why are all of these things so important?

It all comes down to the choice of the multi-tenancy type that we’ve covered in the previous blog post.

three types of multi tenant database

For instance, if we have an application that deals with large volumes of data, a much better option would be to choose type 3 (One database per tenant). However, if this data is mostly non-tenant-specific, and we are usually processing it together, a much better option would be type 1 (Shared database).

It is up to the developers and software architects to weigh all of these aspects and make the most informed decision. It will influence the future prospects of horizontal and vertical scaling.

Shared data allocation

Modern data allocation algorithms can utilize the knowledge of user moving patterns for the proper allocation of shared data in a computing system.

By employing these algorithms in practice, the need for costly remote accesses can be minimized. Also, the overall performance of a mobile computing system is thus improved.

The aspect of shared data allocation is one of the key factors that should influence the decision of the type of multi-tenant architecture.

3. Data processing

Last, but not least, we have to cover data processing aspects of multi-tenant architectures.

The main problem to solve in this field is trying to answer the question of how to perform operations on multi-tenant databases faster?

A tenant’s data processing

Let’s say that we want to perform a fairly straightforward operation that will sum up all the entries of a certain type in the entire database.

Which multi-tenant database type will perform this operation the fastest? That is a question worth considering.

If you need to consult what type of multi-tenant database is best for your business, do not hesitate to reach us. We are a team of high-performance software experts experienced in introducing a multitenancy solutions to SaaS applications. Feel free to reach out!

Shared and separate database processing – what to choose

The first type is shared database multitenant architecture. In this type, performing such an operation is fairly quick.

On the other hand though, if we perform the same operation but only refer to a single user instead of the entire database, our overall performance may be slightly hurt.

In type two (shared database with separate schemas), the performance in these two situations will be pretty similar.

In the third type (separate databases) though, clear differences in the performance will be visible. Counting the entries specific to a single tenant will be the fastest solely because of the fact that this data can be physically separated on different server machines.

This database will have dedicated CPU threads and memory modules, which will increase the performance and improve latency.

The third type may encounter problems when trying to sum up all of the entries across multiple databases. The speed of this operation will depend on how much of the computing resources will be dedicated to the database management process.

How to perform operations on data fast and efficiently

Problems begin to arise when we add more complexity to our instructions.

If we, for instance, want to select a specific range of entries with a maximum value, from the entire database, the type three of multi-tenancy will lead to serious performance hurdles.

Nevertheless, server machines have a limited amount of processing cores and cache. Therefore, advantages gained by selecting the first type of multitenant architecture can be diminished in the long run by problems with optimization and latency.

The third type can be much more easily scaled vertically, which is especially important for large enterprises focused on rapid development and big volumes of processed data.

This example clearly shows how planning ahead is a crucial part of the development process when working on multi-tenant solutions.

Are you considering implementing a multi-tenant architecture to your SaaS application? 

Contact us

Summary

Understanding the characteristics of data in multi-tenant applications is crucial at the very beginning of the whole development cycle.
This issue is especially significant to smaller companies and start-ups, which cannot be entirely sure what volume of data they may need to process as the business grows.

Being prepared for the future expansion enables software developers to better allocate hardware resources, which are typically substantially limited.

Did you find this article useful? Don’t forget to share it!

Building a new application or extending your development team?

🚀 We're here to assist you in accelerating and scaling your business. Send us your inquiry, and we'll schedule a free estimation call.

Estimate your project

Testimonials

The developed software product was built from scratch with solid quality. We have had a long-term engagement with Stratoflow for nearly 10 years. We look at them as partners, rather than contractors. I'm impressed by their team culture and cross-team support.

Nathan Pesin

CTO, Legerity Financials

Stratoflow was a great partner, challenging as well as supporting our customer projects for the best outcome. They have a great pool of talent within the business - all very capability technologists, as well as being business-savvy and suitable for consultancy engagements.

Chris Goodall

Managing Consultant, CG Consultancy (UK) Limited

The bespoke metal exchange platform works great, it is easily accessible and richly functional. Stratoflow managed deadlines capably, meticulously documented their progress, and delivered a complex project at an affordable cost.

Bartlomiej Knichnicki

Vice Chairman, Supervisory Board

We are very pleased with our partnership with Stratoflow and, as we continue to grow, we expect to increase the numbers of developers that work with us on our projects. They have proven to be very skilled and flexible. They're extremely reliable, and they have a very good company culture of their own, which gives them a real edge compared to other providers that serve more as production shops rather than thought partners and creative problem solvers.

Andrew Kennedy

Founder & Managing Director, Tier 2 Consulting

Stratoflow successfully customized the system according to the specific functionalities and without bugs reported. The team was commended for their adaptability in the work process and for their responsiveness.

Joshua Blavins

Tech PM, Digital Agency

The features implemented have received overwhelmingly positive feedback from end-users. Stratoflow has an incredible technical expertise and a high degree of flexibility when it comes to changing project requirements.

Adam Hill

Chief Technology Officer, Legerity

They have impressively good knowledge of AI issues. Very responsive to any amendments and findings. Very good communication. We received a finished project which could be implemented into production shortly after testing.

CO-Founder & CTO

Circular Fashion Company

They provided superb service with seamless communication and a highly professional, technical approach. The team displays impressive technical expertise and are willing to share information and engage in constructive feedback.

Filip Stachnik

Operations Manager, Otwarte Klatki (part of Anima International)

They're very skilled technically and are also able to see the bigger picture. Stratoflow can actually think about solutions, not just the technical task at hand, which they've been assigned.

Arnd Jan Prause

Chief Operating Officer, musQueteer

Stratoflow delivered the website successfully within the timeframe and budget. They assured that the output met the set requirements. Overall, the team's performance was excellent and recommended for their exceptional technical business expertise. They've been able to deliver all of their work on time and within budget, which has been very impressive.

Lars Andersen

Founder & CEO, My Nametags

Travel sector rebound after the pandemic is complete. We have fantastic global coverage of travel data distribution due to mutual agreements and data exchange between aggregators. Competition for the best price of limited resources degradates margins.

How to win? Provide personalized experience and build your own products in the front-office. The missing bits: a traveller golden record collecting past activities and a AI/ML recommendation technology.

MichaƂ GƂomba

CEO at Stratoflow