What a modern data stack looks like in 2023

Are you looking for a single must-have tool to capitalize on your data in 2023?

You may be disappointed as it is difficult to see clearly through the vast number of new data solutions that appear on the market every day. Each of these solutions has a specific promise and aims to improve a certain section of the data journey.

At first glance, it may seem impossible to find your way through this shifting ecosystem. In reality, the abundance of highly verticalized solutions is an opportunity: by combining the most powerful and best-fit tools, you can create your customized data arsenal. This composite arsenal is what we call a "data stack". What will yours look like in 2023?  

What is a data stack?

Raw data alone doesn't do much good. Today, companies have access to huge amounts of data, but that doesn't automatically make them more efficient. For that to happen, you need an efficient processing system. The first challenge is to identify the richest data sources and extract the data from them. The extracted data must then be transformed, prepared and sent to a storage location. Only after this step can data scientists step in and carry out advanced and predictive analyses using machine-learning algorithms. But data scientists are not the only ones who use data: business users need clear, actionable insights that allow them to make good decisions and work more efficiently every day. To achieve this, data must be presented in ergonomic dashboards, or transformed and injected into their business tools, so that they can make decisions and monitor their operational actions.

Each of these crucial steps requires professionals with specific expertise. As these operations become more complex and sophisticated, new roles are emerging: for example, Machine Learning Ops specialists and Analytics Engineers, roles that did not exist a few years ago and whose expertise lies between that of a Data Scientist and a Data Engineer. Each of these specialists needs tools designed for them, which fit easily into the company's global data ecosystem. This arsenal of tools, which takes you from raw data to data that the business can use, is what we call a data stack. In 2023, however, more and more companies are moving away from traditional data stacks, which are static, expensive and hosted on local servers, toward modern data stacks, which are more agile and modular.

What is a modern data stack and how to build yours?

A cloud-based data stack

The difference between traditional and modern data stacks comes down to the central innovation of recent years: the migration to the cloud. A few years ago, most companies chose to store their data and run their operations on local servers that they owned (on-premise). Today, most have moved to the cloud. In the cloud, they can store their data inexpensively, run workloads faster, and scale their computations and deployments elastically. All this without having to worry about the technical maintenance of the servers, which are continuously optimized by the providers. Above all, the cloud gives them access to the most innovative data storage and management solution available today: the cloud data warehouse.

 
From an ETL model to an ELT one

Cloud data warehouses like Snowflake, BigQuery or Redshift are not only more scalable and less expensive analytical databases. They also let you organize your data in a way that makes it easy to find and use, and they stand out for their computing power. Before cloud data warehouses, raw data had to undergo complex transformations (sorting, cleaning, de-duplicating, organizing...) before it could be centralized in a storage space. Now, there is no need to do this work upstream. All you have to do is set up extraction pipelines that bring the data from the various sources to the data warehouse. Once the data has reached its destination, it can be transformed using the computing power of the data warehouse itself. This simplifies the entire data journey. To put it in expert terms, we are moving from an E-T-L model (Extract - Transform - Load) to an E-L-T model (Extract - Load - Transform).
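To make the E-L-T order concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a production pipeline: an in-memory SQLite database stands in for the cloud data warehouse, and the sample rows, table names and function names are all hypothetical. The raw rows are loaded untouched, duplicates and inconsistent casing included, and the cleaning happens afterwards in SQL, inside the "warehouse".

```python
import sqlite3

# Hypothetical raw records, as an extraction pipeline might deliver them:
# a duplicate row, inconsistent casing and stray whitespace are left as-is.
RAW_ORDERS = [
    (1, " Alice ", "2023-01-05", 120.0),
    (1, " Alice ", "2023-01-05", 120.0),  # exact duplicate
    (2, "BOB", "2023-01-06", 80.0),
    (3, "alice", "2023-01-07", 50.0),
]


def extract_and_load(conn, rows):
    """E + L: copy the raw rows into the warehouse without transforming them."""
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id INTEGER, customer TEXT, order_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)


def transform(conn):
    """T: de-duplicate, clean and aggregate with SQL, inside the warehouse."""
    conn.execute(
        """
        CREATE TABLE customer_revenue AS
        SELECT customer, SUM(amount) AS revenue
        FROM (
            SELECT DISTINCT order_id,
                   LOWER(TRIM(customer)) AS customer,
                   amount
            FROM raw_orders
        )
        GROUP BY customer
        """
    )


def run_elt():
    # SQLite in memory is only a stand-in for Snowflake, BigQuery or Redshift.
    conn = sqlite3.connect(":memory:")
    extract_and_load(conn, RAW_ORDERS)
    transform(conn)
    return conn.execute(
        "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(run_elt())
```

In an E-T-L setup, the de-duplication and cleaning in `transform` would have to run before the load, on separate infrastructure. Here the raw table stays available in the warehouse, so the transformation can be changed and re-run without extracting the data again.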

The place of data warehouses in the modern data stack

Thanks to this revolutionary change, data warehouses are taking on a central role in the new data stacks. They are the beating heart of a growing number of modern data stacks. Another distinctive feature of data warehouses is connectivity. They can be connected to other data tools that participate in the data journey upstream (extraction, routing) or downstream (analytics, data visualization, data storytelling).

A modular and composite data stack

The rise of the cloud has led to a proliferation of solutions, each of which acts on a specific segment of the data journey. Each company can choose, from among the tools available on the market, the ones it would like to integrate into its data arsenal. The modern data stack is above all modular and composite: it is made up of a selection of tools that handle the extraction, transformation, analysis, presentation or activation of data.

Selecting the right tools for your modern data stack can seem daunting. Fortunately, you can draw inspiration from the stacks that other companies, large or small, have put in place to build your own. The modern data stack site, for example, aims to become a directory of the data stacks that companies have set up. It also offers an inventory of the data tools available on the market, organized by category: data warehouses, BI tools, data streaming, workflow monitoring...

A data stack adapted to your company

The main quality of a modern data stack is its modularity. There is no one-size-fits-all model to follow: you simply have to find the formula that best suits your company, your resources and your teams of experts. The tailor-made aspect of the modern data stack changes the game. It allows you to avoid fixed costs and ensures maximum flexibility.

If you're a startup with a few dozen employees, you won't have the same needs as a medium-sized company and will have to make do with a modest data stack. It is still in your best interest to set up easy-to-access reporting tools to get information to all members of your team without having to resort to spreadsheets.

Medium-sized companies often have a more structured data team. They will need a wider variety of tools to transform data before use, explore it with sophisticated BI tools, and push the data back into business tools, a practice known as Reverse ETL.

The most mature companies will use machine learning models to perform predictive analysis and better anticipate changes in their market. They will use a complete data stack that goes from data extraction and preparation to user-friendly visualization tools that will allow them to present the results obtained through machine learning to business users.

A data stack to make data accessible to all

The data analytics process is becoming more and more complex, it's true. This complexity allows for more advanced analyses and more powerful calculations. The major challenge is to make sure that the results of these analyses and calculations, however complex, remain accessible to all users, even those without technical expertise. This is our credo at Toucan Toco: advanced analyses and studies are useless if they do not have a concrete application and if they do not allow businesses to be more efficient daily.

A modern data stack is a data stack that is accessible to everyone, without any technical expertise barrier. This is not self-evident when it is composed of tools as complex as data warehouses or data mining software. This is why you need to include products that put accessibility and design at the center of your data tools arsenal. That's what data storytelling does, turning dry numbers and analysis into stories that everyone can understand.

We have made sure that Toucan Toco is a data storytelling tool that can be integrated with any modern data stack to make it more accessible. Our numerous connectors allow us to integrate with various tools. Thanks to Toucan, business users can make code-free queries by connecting to their data warehouses (Snowflake, Redshift, BigQuery) from our interface in just a few clicks.

Thanks to Toucan's no-code interface, there is no need to learn SQL. Everything is done visually: you select the sources directly in Toucan, and our interface takes care of the rest. Business questions configured without code are translated into code and sent directly to the data warehouse. Everything is done to capitalize on the computing power of the data warehouse without being slowed down by technical barriers.
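As a rough illustration of the general idea, a business question captured through a visual form can be represented as a small configuration object and turned into SQL. The Python sketch below is hypothetical and does not describe how Toucan works internally: the `Question` fields, the `to_sql` function and the list of allowed aggregations are all assumptions made for the example.

```python
from dataclasses import dataclass

# Aggregations the hypothetical form lets a business user pick from.
ALLOWED_AGGREGATIONS = {"SUM", "AVG", "MIN", "MAX", "COUNT"}


@dataclass(frozen=True)
class Question:
    """A business question as a visual form might capture it (hypothetical)."""
    table: str
    group_by: str
    measure: str
    aggregation: str = "SUM"


def _identifier(name):
    # Crude safety check: only accept plain identifiers, so that values
    # coming from the form cannot inject arbitrary SQL.
    if not name.isidentifier():
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def to_sql(question):
    """Translate the configured question into a SQL query string."""
    aggregation = question.aggregation.upper()
    if aggregation not in ALLOWED_AGGREGATIONS:
        raise ValueError(f"unsupported aggregation: {question.aggregation!r}")
    table = _identifier(question.table)
    group = _identifier(question.group_by)
    measure = _identifier(question.measure)
    alias = f"{measure}_{aggregation.lower()}"
    return (
        f"SELECT {group}, {aggregation}({measure}) AS {alias} "
        f"FROM {table} GROUP BY {group} ORDER BY {group}"
    )


if __name__ == "__main__":
    # "What is the total order amount per region?"
    print(to_sql(Question(table="orders", group_by="region", measure="amount")))
```

The generated query is then executed by the data warehouse, which does the heavy computation. The user only ever sees the form and the result.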

A cost-optimized data stack

All this, at what cost? Here again, there is no standard answer: everything depends on the configuration of your data stack. It is possible to get a data stack for 500 euros, but the budget can quickly reach several tens of thousands depending on the tools you choose and how you use them. As we said, cloud solutions are flexible: you pay for what you use, no more, no less. That's why it's important to use them well: unnecessary and repeated requests quickly show up on your bill. This is a serious risk if you put your data tools in the hands of a large number of your employees.

So what should you do? You need to equip yourself with solutions that let you control the use of the most expensive tools. At Toucan, we have set up a cache system to avoid sending the same request several times. Let's say that two of your employees make the same calculation request via our interface. Instead of paying twice for this calculation, the second request is served from the cache, without a new request to the data warehouse. Practical, isn't it?
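The caching idea can be sketched in a few lines of Python. This is an illustration only, not Toucan's actual implementation: the `QueryCache` class, its `warehouse_calls` counter and the `fake_warehouse` function are hypothetical names. The sketch also leaves out expiry, which a real cache needs so that results are refreshed when the underlying data changes.

```python
class QueryCache:
    """Keeps query results so identical requests reach the warehouse once."""

    def __init__(self, run_query):
        self._run_query = run_query  # function that really queries the warehouse
        self._results = {}           # query text -> cached result
        self.warehouse_calls = 0     # number of billable warehouse requests

    def execute(self, sql):
        # Identical query text (ignoring surrounding whitespace) shares a key.
        key = sql.strip()
        if key not in self._results:
            self.warehouse_calls += 1
            self._results[key] = self._run_query(sql)
        return self._results[key]


def fake_warehouse(sql):
    """Stand-in for a real, billable warehouse call (hypothetical data)."""
    return [("2023-01", 1500)]


if __name__ == "__main__":
    cache = QueryCache(fake_warehouse)
    query = "SELECT month, SUM(amount) FROM orders GROUP BY month"
    cache.execute(query)  # first employee: the warehouse is queried
    cache.execute(query)  # second employee: served from the cache
    print(cache.warehouse_calls)
```

Two identical requests trigger a single warehouse call, so the calculation is billed only once.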

But that's not all: we offer dashboards that allow you to monitor your data warehouse usage at a glance, pinpointing key expense items and optimizing your costs.

So which data stack for 2023?

Whatever your data maturity, the important thing is to build a data stack that lets you quickly share data with your teams. You must not neglect the presentation stage in favor of the earlier steps: data extraction, transformation and exploration remain useless if the data is not activated. At Toucan, the democratization of data is our top priority. We can help you make your modern data stack more accessible to everyone. Let's talk about it!

 

 
