Building agentic AI for Arabic dialects

Why it's so difficult, the benefits of offering LLM flexibility, and how AI is changing the enterprise pricing model.

Hi, friends 👋

Despite the hype, last year, funding for AI startups in the MENA region accounted for just 2.8% of total capital deployed, spread across 36 startups.

Compared with the 46.4% of US VC funding that went to AI startups in 2024, it’s a cavernous gap.

No doubt, we’ll see (and are already seeing) this chasm begin to close in 2025, but that doesn’t mean last year lacked standout regional AI startups securing backing.

Take, for example, DXwand – a digital agents platform offering off-the-shelf AI agents for specific use cases, along with a builder platform for creating custom solutions.

Founded in 2017 by Ahmed Mahmoud, who was later joined by Mahmoud Gomaa, DXwand raised $4 million in Series A funding, led by UAE-based Shorooq and Cairo-based Algebra Ventures, with participation from existing investor Dubai Future District Fund.

I recently had the pleasure of sitting down with Ahmed, CEO and co-founder of DXwand, for a wide-ranging, insightful conversation on all things enterprise AI.

In our interview, we covered:

  • ๐ŸŒ The unique technical challenges of supporting Arabic dialects in AI

  • โš–๏ธ The horizontal vs. vertical specialisation debate in SaaS

  • ๐Ÿ”„ The benefits of LLM agnosticism and offering flexibility to clients

  • ๐Ÿ’ก How to show tangible value early in the enterprise sales process

  • ๐Ÿ’ฌ The shift from per-seat pricing to a per-conversation model in the age of AI

  • ๐Ÿ“Š Demonstrating ROI with metrics beyond automation

  • ๐Ÿ“ก Partnering with telecom companies and technology providers to scale effectively

Actionable insights 🧠 🛠️

If you only have a few minutes to spare, here's what investors, operators, and founders should know about DXwand and building agentic AI for Arabic dialects and enterprises.

Okay, let’s dig into it 👇

Ahmed Mahmoud and Mahmoud Gomaa

How did you first notice a gap in conversational AI tools for Arabic dialects in the MENA region?

Around 2017, conversational AI was just beginning to emerge as a commercial offering. While the technology was still in its infancy, there was already significant demand from clients. However, support for spoken Arabic dialects – and even standard Arabic – lagged far behind the extensive support available for English. This gap wasn't unique to Microsoft Middle East and Africa, where I had an engineering and enterprise sales background; it was a challenge across all major Silicon Valley companies. As a result, clients often had to manage Arabic translations themselves and develop their own Arabic language models to meet the growing demand.

The initial vision was to develop a digital sales agent and fulfilment system tailored for small businesses, particularly solo entrepreneurs using Facebook and other social platforms to sell their products. The tool enabled users to respond to customer inquiries and manage sales seamlessly through CRM integration, supporting both Arabic and English. No technical expertise was required: users simply installed the app on their Shopify store, and it automatically synced with their product collections. For a straightforward monthly fee, the system handled customer queries effortlessly.

At the time, WhatsApp integration was not yet available in the Middle East, so our solution primarily focused on Facebook and websites. However, once the WhatsApp API was introduced, we quickly integrated it into the system.

What made Arabic dialects particularly challenging from a technical perspective?

The main challenge – one that still persists today – is the limited amount of digital Arabic content available online. Even if you create a massive language model with billions of parameters for Arabic, it would be like putting a high-performance engine into a basic car body. While the model would be powerful, the data fuelling it would be too limited for optimal performance.

The first challenge we faced was figuring out how to build effective models despite this data scarcity. Our approach was to acquire data directly from clients and develop highly specialised models tailored to specific use cases and industries. Creating a broad, general-purpose Arabic model would have been ineffective. Instead, we focused on modelling common use cases, such as customer complaints, sales inquiries, and branch locations.

By concentrating on these key areas, we achieved strong results across multiple Arabic dialects, supporting around seven dialects at the time. It’s worth noting that even within Egypt alone, there are nearly 10 unique dialects, making the challenge even more complex.

While synthetic data can help, the challenge lies in sourcing it and in the potential biases it introduces. Raw data is always preferred for training high-quality models. I’m not suggesting that Arabic content is unavailable, but it’s significantly less abundant than English content.

Back in 2017 and 2018, generating synthetic data wasn’t even a viable option. We had to acquire data from clients and purchase datasets where possible. Now, generating synthetic data has become more feasible, and in fact, our platform currently uses LLMs capable of generating Arabic data to train models that lack Arabic support. This approach has significantly improved model performance while still ensuring data quality and representation.

What drove the shift from SMB tools to a focus on enterprise conversational AI and knowledge mining?

The remainder of this newsletter is for premium members only.
