3 ways that spreadsheets fall short of corporate standards when seeking Everyday AI



By Jad Khalife, Director of Sales Engineering – Middle East & Turkey, Dataiku

Spreadsheets. Some find them confusing. Others find them useful. Some even find them fun. Excel has been around since 1985 and has been a standard part of the UAE office tech suite since, shall we say, the mid-nineties? That is two generations of service, and we must salute the spreadsheet for its undeniable contribution to office automation.

But times are changing. AI has arrived. Today, executives often cite managing the volume and growth rate of their data as two of the biggest problems they face when they try to monetize it. Factors such as human error, accuracy, and the need for compliance have come to feature heavily in the argument that spreadsheets, while once golden nuggets of productivity and efficiency, now stand in the way of progress.

In our recent whitepaper, “AI Laggards vs. AI Leaders”, Dataiku explores the spreadsheet as a stumbling block on the road to Everyday AI — the cultural idyll in which everyone from the board on down thinks in terms of data and business intelligence. We discuss how information and operational silos are the natural byproducts of the spreadsheet. And we show why organizations that do not move on from yesterday’s hero will be left behind in the AI revolution. Here are three ways spreadsheets fall short in fulfilling enterprise data science requirements.

  1. They are inaccurate and opaque 

So many stakeholders must come to trust in the data held by an organization. Users must trust it to plan operations, serve customers, pay bills, chase debt, meet deadlines, and a hundred other things. Copying and pasting from different files and relying on macros written before many employees were born flies in the face of the modern zeitgeist. Such practices should make any compliance officer flinch, as they all but guarantee the absence of audit trails.

Governance goes out the window, down the street and over the mountaintop. The UAE government and other international regulators now play an integral role in the life of the nation’s businesses. The spreadsheet cannot hope to appease auditors. Centralized data platforms are not so much the future as they are the present. Their implementation allows for the maintenance of data lineage and record of ownership, as well as an audit trail of every read-write event and every use in AI projects.

  2. They are inefficient

AI projects don’t work by starter pistol, especially when using spreadsheets. Data must first be identified as relevant, then amalgamated into workflow artefacts, possibly reformatted, and likely transferred to another resource location. With spreadsheets, this must be done with every single project. This has a cost associated with it.

If the average data project spans eight months and 80% of that time is spent on data preparation, that means 6.4 months of data prep, which is roughly 53% of a year. By one estimate, accounting for all experience levels, the average annual salary of a UAE data analyst is just under US$32,000. If we assume a minimum of two analysts working on any one project, we get a combined annual salary of US$64,000, and 53% of this is US$33,920. This shows us that the labour equivalent of more than one data analyst’s entire annual salary is being spent cleaning data for a single project.

This mounts up. If we assume the typical organization has 10 projects in its pipeline, that is an unnecessary US$339,200. Scale to 100 projects and stakeholders are looking at US$3,392,000, and so on. These are the costs that can be saved by retiring the spreadsheet.
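For readers who want to rerun the arithmetic with their own numbers, the estimate can be sketched in a few lines of Python. The input figures are the ones quoted above. The function names and the choice to round the share of the year to a whole percentage (as the text does) are assumptions made for illustration.

```python
# Inputs quoted in the article; swap in your own organization's figures.
PROJECT_MONTHS = 8            # average length of a data project
PREP_SHARE_PCT = 80           # share of project time spent on data preparation
ANALYST_SALARY_USD = 32_000   # approximate annual salary of a UAE data analyst
ANALYSTS_PER_PROJECT = 2      # assumed minimum team size


def prep_share_of_year_pct():
    """Data-prep time as a whole-number percentage of a 12-month year."""
    prep_months = PROJECT_MONTHS * PREP_SHARE_PCT / 100   # 6.4 months
    return round(prep_months / 12 * 100)                  # 53.3... rounds to 53


def prep_cost_per_project():
    """Salary cost (USD) attributable to data prep on one project."""
    combined_salary = ANALYSTS_PER_PROJECT * ANALYST_SALARY_USD
    return combined_salary * prep_share_of_year_pct() // 100


def pipeline_cost(projects):
    """Data-prep salary cost (USD) across a pipeline of projects."""
    return projects * prep_cost_per_project()


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"{n:>3} project(s): US${pipeline_cost(n):,}")
```

Without the rounding to 53%, the per-project figure comes out slightly higher, at roughly US$34,133, so the totals quoted here are, if anything, on the conservative side.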

  3. They antiquate the entire technology stack

An alarming number of organizations — more than half, according to a recent Dataiku survey — perform data preparation in a separate system from the one in which they build their machine-learning models. This often occurs because data is held in spreadsheets. Moving beyond this antiquated setup can enliven the entire technology ecosystem. Spreadsheets tend to be files that belong to a single business unit or individual because their design is not easily changeable or interchangeable. The data and the view of the data are inextricably linked in the same file. Finance wants one view, the warehouse wants another, HR another, and on it goes.

In leaving the spreadsheet behind, the organization eliminates these silos. The alternative — the centralized single-truth — clears the way to Everyday AI. A unified platform is more than just a means to eliminate the time-wasting in data prep or to enable full audit trails and transparency of use. The platform approach introduces a new way of thinking about data. Cross-discipline collaboration allows better and more comprehensive models to be built than would have been possible under the spreadsheet-only regimen. Also, complexity is reduced. Spreadsheets, innovative though they were three decades ago, are not imbued with an abundance of flexibility. They do one thing at a time. AI platforms allow parameter tweaking in a fraction of the time and often with far more powerful results. 

And finally, data management in the platform approach allows policy rules on access to travel. What this means is that since access to any one piece of data is governed at the global level, data artefacts constructed for modelling will carry these rules with them across multiple projects. This level of reuse is problematic with spreadsheets because each file must be assessed by compliance teams before use in another project if teams want to ensure the satisfaction of regulators.
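To make the idea of rules that "travel" more concrete, here is a minimal sketch in Python. It is not Dataiku's API or any real platform's implementation. The `GLOBAL_POLICY` registry, the `Artefact` class, and the role and column names are all hypothetical, invented purely to illustrate the principle that access is decided once, globally, and then applies to every artefact derived from the data.

```python
from dataclasses import dataclass

# Hypothetical global policy: column name -> roles allowed to read it.
# Defined once, at the platform level, not per file or per project.
GLOBAL_POLICY = {
    "salary": {"hr", "finance"},
    "customer_email": {"sales"},
    "order_total": {"finance", "sales", "warehouse"},
}


@dataclass(frozen=True)
class Artefact:
    """A named dataset whose columns are governed by GLOBAL_POLICY."""
    name: str
    columns: tuple

    def derive(self, name, columns):
        """Build a new artefact from a subset of this one's columns."""
        missing = set(columns) - set(self.columns)
        if missing:
            raise KeyError(f"unknown columns: {sorted(missing)}")
        return Artefact(name, tuple(columns))

    def readable_by(self, role):
        """Columns this role may read; looked up in the global policy,
        so the answer is the same in every project that uses the data."""
        return [c for c in self.columns if role in GLOBAL_POLICY.get(c, set())]


if __name__ == "__main__":
    source = Artefact("orders", ("salary", "customer_email", "order_total"))
    features = source.derive("model_features", ["salary", "order_total"])
    print(features.readable_by("finance"))
    print(features.readable_by("warehouse"))
```

Because the derived artefact consults the same global rules as its source, nothing has to be re-assessed when it is reused in a second or third project. With spreadsheets, each copied file starts with no such rules attached.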

They are gone

Daily, enterprise-wide collaboration on powerful, reusable ML projects, where prep and modelling are done in a single location and compliance is guaranteed by global governance. Thus, we summarize the benefits of eliminating the spreadsheet in favour of a single, centralized data and AI platform. Thus, we summarize Everyday AI.

