Case Studies

5 min read

How a GCC Telco Built an Arabic Speech-to-Text Dataset from Call Archives

Enterprise AI

Author

Khalid Ghiboub

Table of Contents

1. The Challenge

2. The Data Pipeline Problem


Key Takeaways

Arabic Voice AI models trained on public MSA data will underperform on real customer speech in Gulf markets; the dialect gap is real and measurable.

Most telcos already hold the right raw material in their call archives. The missing piece is the infrastructure to process it at scale.

A high-accuracy Arabic STT layer combined with a specialized Arabic annotation capability can convert that archive from a storage cost into a strategic AI training asset.

The pipeline is repeatable, meaning the dataset grows as the business does, without starting from scratch each time.

A GCC telecom operator transformed 10,000 archived customer calls into a high-quality Arabic speech-to-text training dataset using Munsit STT and expert annotation. The resulting Gulf dialect dataset improved intent-classification accuracy and created a scalable foundation for future AI model development.

The Challenge

Telcos building AI for customer-facing applications (intent classification, sentiment analysis, churn prediction, and virtual agents) need training data. For Arabic-language models, that data is hard to find. Publicly available Arabic speech datasets are mostly Modern Standard Arabic (MSA) sourced from news broadcasts. They don't reflect how customers actually speak when calling a telco in the Gulf.

A GCC telco's data science team had a clear use case: fine-tune an intent classification model covering billing, technical support, plan changes, complaints, and service requests. The training data had to reflect real customer language: Gulf dialect, code-switching, and the specific product vocabulary their customers used every day.

They already had hundreds of thousands of call recordings in their archive. In theory, this was exactly the training source they needed. In practice, the recordings were unusable for training without transcription, diarization, and labeling.


The Data Pipeline Problem

Transcribing and labeling call recordings at scale is expensive and slow. The team had received a quote from a general transcription provider for 10,000 calls. The cost was high, the turnaround was measured in weeks, and the provider had no specific capability in Gulf Arabic, meaning every transcript would need native-speaker review before it could be used for training.

What the team needed was a pipeline that could handle three things:

Transcribe Arabic calls at scale with Gulf dialect accuracy
Segment transcripts by speaker so customer utterances could be extracted separately from agent responses
Pass the resulting text to a labeling workflow where customer intent could be classified


The Approach

CNTXT AI addressed both stages of the pipeline directly. Munsit STT processed 10,000 call recordings from the telco's archive via the API in batch mode. Each call was returned as a speaker-diarized transcript, with customer utterances automatically extracted and separated from agent turns.
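To make the extraction step concrete, here is a minimal sketch of how customer utterances might be pulled out of a speaker-diarized transcript. The article does not show the Munsit response schema, so the `Segment` fields, the `"customer"` and `"agent"` role labels, and the `max_gap` threshold are all assumptions, and the sample utterances are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One diarized segment; the field names are hypothetical."""
    speaker: str   # assumed role label: "agent" or "customer"
    start: float   # seconds from the start of the call
    end: float
    text: str


def customer_utterances(segments, max_gap=1.0):
    """Merge consecutive customer segments into utterances.

    Two customer segments are joined when no agent turn falls between
    them and the pause separating them is at most max_gap seconds.
    """
    utterances = []
    current = None
    for seg in sorted(segments, key=lambda s: s.start):
        if seg.speaker != "customer":
            current = None  # an agent turn closes the open utterance
            continue
        if current is not None and seg.start - current.end <= max_gap:
            current = Segment("customer", current.start, seg.end,
                              f"{current.text} {seg.text}")
            utterances[-1] = current
        else:
            current = seg
            utterances.append(current)
    return utterances


# Invented sample call, used only to exercise the function.
call = [
    Segment("agent", 0.0, 2.1, "مرحبا، كيف أقدر أساعدك؟"),
    Segment("customer", 2.4, 4.0, "أبغى أغير الباقة"),
    Segment("customer", 4.3, 6.2, "لأن الفاتورة غالية"),
    Segment("agent", 6.5, 8.0, "أكيد، لحظة من فضلك"),
    Segment("customer", 8.4, 9.0, "شكرا"),
]
result = customer_utterances(call)
```

Merging adjacent customer segments keeps one request together as a single labeling unit, which tends to matter for intent classification because diarization often splits a sentence at short pauses.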

Those customer utterances were then passed to CNTXT AI's Arabic data annotation team for intent labeling. Annotators classified each utterance against a 28-category taxonomy built jointly with the telco's data science team, covering intent categories specific to their service context, not generic call center categories. Quality control included double annotation on 15% of utterances, with inter-annotator agreement measured and disagreements resolved by a senior reviewer.
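The quality-control step can be sketched as follows. The article does not say how the 15% sample was drawn or which agreement metric was used, so the random sampling and the choice of Cohen's kappa here are assumptions, and the label names are illustrative rather than the actual 28-category taxonomy.

```python
import random
from collections import Counter


def pick_double_annotation_sample(utterance_ids, fraction=0.15, seed=0):
    """Choose the utterances that a second annotator also labels."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    ids = list(utterance_ids)
    k = round(len(ids) * fraction)
    return set(rng.sample(ids, k))


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's own label distribution.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def disagreements(ids, labels_a, labels_b):
    """Items to route to the senior reviewer for resolution."""
    return [i for i, a, b in zip(ids, labels_a, labels_b) if a != b]


# Illustrative labels only.
first = ["billing", "billing", "complaint", "plan_change"]
second = ["billing", "complaint", "complaint", "plan_change"]
kappa = cohen_kappa(first, second)
to_review = disagreements(["u1", "u2", "u3", "u4"], first, second)
sample = pick_double_annotation_sample([f"u{i}" for i in range(100)])
```

Kappa corrects raw agreement for the agreement expected by chance, which is useful when a few intents dominate the call volume and two annotators would often match simply by picking the common label.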

The final output was a labeled Arabic speech dataset specific to the telco's customer interaction domain and formatted for direct use in the team's fine-tuning workflow.
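The article does not specify the delivery format, so the following is only a sketch of one common choice: JSON Lines with one labeled utterance per line, plus a deterministic train/eval split keyed on the call. The field names (`call_id`, `text`, `intent`), the 10% evaluation fraction, and the sample records are all hypothetical.

```python
import hashlib
import io
import json


def write_jsonl(records, stream):
    """Write one JSON object per line, keeping Arabic text unescaped."""
    for rec in records:
        stream.write(json.dumps(rec, ensure_ascii=False) + "\n")


def read_jsonl(stream):
    """Read records back, skipping blank lines."""
    return [json.loads(line) for line in stream if line.strip()]


def split_for(call_id, eval_fraction=0.1):
    """Assign a whole call to 'train' or 'eval' from a hash of its ID."""
    digest = hashlib.sha256(call_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return "eval" if bucket < eval_fraction else "train"


# Invented records, used only to exercise the functions.
records = [
    {"call_id": "c-0001", "text": "أبغى أغير الباقة", "intent": "plan_change"},
    {"call_id": "c-0002", "text": "النت عندي مقطوع", "intent": "technical_support"},
]
buffer = io.StringIO()
write_jsonl(records, buffer)
buffer.seek(0)
round_trip = read_jsonl(buffer)
```

Passing `ensure_ascii=False` keeps the Arabic readable in the file instead of `\uXXXX` escapes, and splitting by call rather than by utterance avoids placing utterances from the same conversation on both sides of the train/eval boundary.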


Results

The data science team had a usable training dataset within six weeks of project start. The manual route would have taken months.

On the telco's internal evaluation set, the intent classification model fine-tuned on this dataset outperformed the version trained on public Arabic data. The improvement was most visible in two areas:

Gulf dialect inputs, where public MSA training data consistently fell short
Product-specific terminology: vocabulary that appeared frequently in the telco's calls but was absent from broadcast Arabic datasets
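A comparison like this is usually made by scoring both models on tagged slices of the evaluation set. The sketch below shows one way to do that; the slice tags, field names, and predictions are made-up toy data to exercise the function, not the telco's results.

```python
from collections import defaultdict


def accuracy_by_slice(examples, predictions):
    """Return {slice_name: accuracy}; an example may carry several slices."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex, pred in zip(examples, predictions):
        for name in ex["slices"]:
            total[name] += 1
            correct[name] += int(pred == ex["intent"])
    return {name: correct[name] / total[name] for name in total}


# Toy evaluation set with hypothetical slice tags.
eval_set = [
    {"intent": "billing", "slices": ["gulf_dialect"]},
    {"intent": "plan_change", "slices": ["gulf_dialect", "product_terms"]},
    {"intent": "complaint", "slices": ["msa"]},
    {"intent": "technical_support", "slices": ["product_terms"]},
]
baseline_preds = ["billing", "billing", "complaint", "billing"]
finetuned_preds = ["billing", "plan_change", "complaint", "technical_support"]

baseline = accuracy_by_slice(eval_set, baseline_preds)
finetuned = accuracy_by_slice(eval_set, finetuned_preds)
```

Reporting accuracy per slice rather than as a single overall number shows where a model gains or loses, which an aggregate score can hide when one slice dominates the evaluation set.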

Beyond the initial model improvement, the team now has a repeatable pipeline for expanding training data as the product portfolio grows and new intent categories emerge. That same pipeline is currently being used to build a sentiment analysis training set from a separate call sample.





Last update: June 24, 2026


Author

Sarra Turki

Khalid Ghiboub




