The FDI angle:  

  • Software firm Databricks is at the coalface of putting artificial intelligence into action at companies across sectors from retail and finance to healthcare. 
  • The company has raised $3.5bn in funding since 2013 and has ambitious plans to grow across Europe and the Middle East. The company also has operations in the Americas and Asia-Pacific. 
  • Why it matters: AI companies are emerging as active international investors, opening offices and employing thousands of people the world over. In the words of Databricks' CEO, Ali Ghodsi: "In the next few years, in every industry, the winners will be data and AI companies."

In 2009, Ali Ghodsi was at the heart of a new era of artificial intelligence (AI). As a computer scientist at the University of California (UC) Berkeley’s AMPLab, he undertook research funded by tech heavyweights like Facebook, Apple, Netflix and Google. 


These so-called Fang companies were already “extremely savvy” in applying large-scale machine learning on massive datasets, Mr Ghodsi says, in “stark contrast” to the vast majority of companies. “This was different from the AI that was happening before that,” recalls the 45-year-old Iranian-Swedish computer scientist.

Along with six other researchers at UC Berkeley, Mr Ghodsi set out to democratise access to Silicon Valley’s quickly evolving big data, analytics and AI capabilities. With roots in academia and the open-source software community, they co-founded Databricks in 2013 as a cloud-based platform for enterprises to store, process and analyse large quantities of data.

Today, the San Francisco-based enterprise software provider has raised $3.5bn of funding and is at the coalface of putting AI into action at companies across sectors from retail and finance to healthcare. “In the next few years, in every industry, the winners will be data and AI companies,” says Mr Ghodsi, who is Databricks’ CEO and still an adjunct professor at Berkeley. 


As the popularity of its services grows, Databricks has expanded its presence in Europe, the Middle East and Africa (EMEA) to better serve enterprise customers. It expects to surpass 1,000 employees in the EMEA region this year, adding to its 15 offices in the Americas and seven locations in Asia-Pacific.


Its recent acquisitions, including a $1.3bn deal to buy MosaicML, a platform that helps companies build their own large language models (LLMs), are part of its strategy to be the go-to provider in the still uncertain progression of enterprise AI and data services. 

AI ‘awareness revolution’

A surge of interest in generative pre-trained transformer (GPT) AI models has opened the eyes of corporate executives to the power of generative AI tools, which can produce content such as text and images when prompted. 

Since OpenAI released ChatGPT in November 2022, Mr Ghodsi says there has been an “awareness revolution” about the potential of data and AI to revolutionise what businesses are able to do: “I’ve had more CEO conversations than ever before.”

Organisations today have to grapple with an ever-increasing volume of data from numerous sources, leading them to seek out tools to handle and analyse this data. While there are concerns about the potential of AI to disrupt and replace jobs, businesses are already exploring ways in which AI can uncover new insights from data, automate routine tasks and improve efficiency.

“As these AI tools become easier to use and understand, they’re being viewed not just as cutting-edge technology, but as valuable tools that can drive business success,” says Brad Yeoman, a senior consultant at NWorld, a tech and data consulting group, which uses database platforms like Databricks and Snowflake, a competitor. 

 


Databricks’ Lakehouse platform, which enterprises can use to unify all their data storage, processing and analysis, is underpinned by the leading open-source projects Apache Spark, MLflow and Delta Lake, which were developed by its co-founders. It has been used by more than 9,000 organisations, such as US telecoms giant AT&T, France’s state-owned railway company SNCF and UK retailer M&S. 

While Mr Ghodsi lauds the sudden interest in AI as “fantastic”, he underlines some major misconceptions, principally that there is too much focus on LLMs, the type of AI used by tools like ChatGPT that can understand and generate human-like language. 

“There’s a broad spectrum of things you can do in AI,” says Mr Ghodsi, noting that predictions such as estimated times of arrival and cost for journeys on ride-hailing apps like Uber use different types of AI models to LLMs. 

EMEA expansion

Samuel Bonamigo, who formerly worked for Oracle, Salesforce and Google, is leading Databricks’ expansion across EMEA. In the 2022 fiscal year, Databricks increased its headcount in the region by 75% and has recently opened new offices in Tel Aviv, Stockholm and Zurich.

A driving force behind this expansion has been to “be much closer to our customers”, says Mr Bonamigo, who adds that this requires the company to have bases in important markets. 

“This is all about data and AI, which is a market that is really accelerating. We want to take advantage of that acceleration,” says Mr Bonamigo. Another critical factor in opening new offices is attracting talent: the offices are meant to give “a sense of belonging” to Databricks staff, who can come to meet, talk and share ways in which they can better collaborate.

Since 2022, Databricks has been the second most active global cross-border AI investor behind UK-based assurance company Qualitest, according to fDi Markets, a greenfield investment monitor. Alongside offices in Munich, Frankfurt, London, Paris and Berlin, Databricks has a key engineering, research and development hub located in Amsterdam. 

Due to the fast-moving nature of AI, Mr Ghodsi says it is critically important to have a strong research arm to be able to stay abreast of the latest developments. He cautions there is still a “very shallow understanding” of how LLMs and other AI models work.

“The world is better off if we understand how [AI] works,” says Mr Ghodsi, who argues that open-source software and research are the best way to understand the dangers of AI and control them. In March 2023, Databricks released Dolly, an open-source LLM that can be used and modified for both research and commercial purposes. 

Spreading their bets

Databricks aims to cover all forms of AI – “from soup to nuts” – targeted at many different use cases. Because the use of generative AI remains in its infancy, Mr Ghodsi says it is “unclear” which techniques used to develop LLMs will be the most important for enterprises in the future.

Databricks has decided to make a bet on the four options that enterprises have to use AI for their own data strategy. “It’s too important of a bet to forego any of them, so that’s why we decided we want to do all of them,” says Mr Ghodsi.

The options are “pre-training”, which means building an AI model from scratch; “fine-tuning”, where an existing pre-trained model is trained further on additional data; and “in-context learning”, a technique used to improve an LLM’s performance by giving it more context and better prompts at the point of use, rather than by retraining it. 
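In-context learning can be illustrated with a few-shot prompt, in which worked examples are placed in the text sent to an already trained model and the model itself is left unchanged. The Python sketch below only assembles such a prompt. The function name, the review-and-sentiment format and the example data are all hypothetical, chosen for illustration, and are not taken from Databricks or any particular LLM provider.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a prompt that carries labelled examples as context.

    Hypothetical helper for illustration: no model is called and
    no model weights are changed.
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final item is left unlabelled for the model to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


# Made-up example data.
EXAMPLES = [
    ("The delivery arrived a day early.", "positive"),
    ("The app crashed during checkout.", "negative"),
]

prompt = build_few_shot_prompt(
    "Classify the sentiment of each customer review.",
    EXAMPLES,
    "Support resolved my issue within minutes.",
)
```

The resulting string would then be sent to whichever LLM an enterprise uses. Changing the examples changes the model's behaviour on the task without any retraining.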

Finally, there is “vector search”, which uses machine learning to capture the meaning and context of complex data types like text, images and voice. Despite a lack of clarity around the future and concerns about “hallucinations” by AI models, Mr Ghodsi stresses that these challenges are well recognised and will be overcome: “Everything just ends up being a data problem.”
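The idea behind vector search can be sketched in a few lines of Python: each item is turned into a vector, and a query is matched to the items whose vectors lie closest to its own. The sketch below makes a loud simplifying assumption. Its `embed` function is a toy word-count stand-in for the learned embedding model a real system would use, so it captures word overlap rather than meaning. All names and documents are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned embedding model: a word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_search(query, documents, top_k=1):
    """Return the top_k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


# Made-up documents.
DOCS = [
    "train delayed between paris and lyon",
    "quarterly retail sales rose in london",
    "network outage reported by telecoms customers",
]

best = vector_search("retail sales in london", DOCS)
```

In practice the count vectors would be replaced by embeddings from a trained model, and the linear scan by an index built for nearest-neighbour lookup, but the ranking-by-similarity step is the same.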

This article first appeared in the August/September 2023 print edition of fDi Intelligence