Last updated:

June 18, 2026

How a Gulf Government Authority Cut Call Center Escalations with Arabic Speech Recognition

Case Studies

Enterprise AI

Author

سارة تركي

Rym Bachouche

5 min read

Table of Contents

1. The Challenge

2. Why Generic Arabic ASR Was Not Enough

Key Points

A Gulf government authority reduced call center escalations and shortened compliance response times from days to hours by deploying Munsit’s Gulf dialect Arabic speech recognition. This case study explores how purpose-built Arabic STT improved call understanding, compliance monitoring, and operational efficiency beyond generic ASR solutions.

The Challenge

Government contact centres across the Gulf process millions of citizen interactions annually. For one authority managing a high-volume citizen services operation, the call center had become a structural bottleneck. Citizens calling to resolve queries, update records, or understand service eligibility were waiting longer than acceptable, and the internal QA team could review only a fraction of recorded calls manually.

The underlying problem ran deeper than staffing. The existing transcription and logging system had been built around English-language ASR infrastructure. Arabic calls, which represented the majority of inbound volume, were either left untranscribed or routed through a generic Arabic model that struggled with Gulf dialect. Error rates were high enough that the output was unusable for compliance or QA purposes.

Senior operations staff were manually reviewing calls flagged by supervisors, a process that was expensive, slow, and inconsistent. Reviewers had to listen to full recordings to identify issues, which meant patterns across thousands of calls remained invisible. Leadership had no reliable way to understand what citizens were asking about at scale or where agent responses were falling short.

Why Generic Arabic ASR Was Not Enough

The authority had tested two commercially available Arabic speech recognition systems before engaging CNTXT AI. Both produced transcriptions that required heavy manual correction before they could be used. The core issue: both systems were trained predominantly on Modern Standard Arabic broadcast data, while actual citizen calls featured a mix of Gulf dialect, Arabic-English code-switching for technical terminology, and significant audio quality variation due to mobile network conditions.

Word-level accuracy on these calls hovered between 55% and 65%, meaning at least one in three words required correction. At that error rate:

Automated intent classification downstream became unreliable
The compliance team could not use the transcripts as an evidentiary record
The vendor suggestion to "clean up" call audio before processing was not operationally viable
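
To make the accuracy figures above concrete, here is a minimal sketch of how word-level error is commonly measured: word error rate (WER), the word-level edit distance between a reference transcript and the ASR output, divided by the reference length. Treating "accuracy" as 1 minus WER is an assumption here; the case study does not state how its figures were calculated, and the sample sentences are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] is the edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative (invented) example: one substitution and one deletion in six words
wer = word_error_rate(
    "please update my national id record",
    "please update the national record",
)
print(f"WER: {wer:.2f}")  # WER: 0.33
```

Under that assumption, a WER of 0.33 corresponds to roughly 67% accuracy, which sits just above the range reported for the generic systems.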

This is a gap that purpose-built Gulf dialect ASR is specifically designed to close. Generic models trained on broadcast Arabic simply do not reflect the way citizens speak.

The Approach

CNTXT AI deployed Munsit STT as the transcription layer within the authority's existing call recording pipeline, with no changes required to telephony infrastructure. Recorded calls were routed through the Munsit API and returned as structured transcripts with speaker-turn segmentation, allowing analysts to follow conversation structure without replaying audio.

The initial deployment covered a single inbound service line handling approximately 2,000 calls per week. Compared with the earlier systems, Munsit's Arabic-first models, which were trained on spoken Gulf dialect data, handled dialect diversity and Arabic-English code-switching significantly better. Transcripts were organised by speaker turn, returned almost instantly, and carried a confidence score for each utterance.
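
As a rough illustration of what a speaker-turn transcript with per-utterance confidence might look like to a downstream consumer, the sketch below models one call as a list of utterances and picks out the low-confidence ones for human review. The `Utterance` class, its field names, the 0.0 to 1.0 confidence scale, and the 0.80 threshold are all assumptions made for this example; they are not taken from the Munsit API, whose response format the case study does not describe.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str       # hypothetical labels such as "agent" or "citizen"
    text: str
    confidence: float  # assumed to range from 0.0 to 1.0


def needs_review(utterances: list[Utterance], threshold: float = 0.80) -> list[Utterance]:
    """Return the utterances whose confidence falls below the threshold."""
    return [u for u in utterances if u.confidence < threshold]


# Invented sample call, organised by speaker turn
call = [
    Utterance("agent", "How can I help you today?", 0.97),
    Utterance("citizen", "I want to check my permit renewal status", 0.91),
    Utterance("citizen", "the reference number is on the SMS", 0.62),
]

for u in needs_review(call):
    print(f"[{u.speaker}] {u.text} (confidence {u.confidence:.2f})")
```

Routing only low-confidence turns to a reviewer is one plausible way to use such a score; the threshold would need tuning against real call data.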

The authority's QA team incorporated the transcripts directly into their existing workflow tooling. From that point, they could search by keyword, filter by topic, and flag calls for review without listening to recordings. The compliance team used the same output to build a verifiable audit trail for calls involving sensitive interactions.
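
The keyword search and flagging described above can be sketched in a few lines. This is a simplified illustration, not the authority's tooling: the call IDs, transcripts, and keyword list are invented, and matching is plain case-insensitive substring search. Production Arabic search would typically also normalise diacritics and letter variants before matching.

```python
def flag_calls(transcripts: dict[str, str], keywords: list[str]) -> dict[str, list[str]]:
    """Map each call ID to the keywords found in its transcript."""
    flagged: dict[str, list[str]] = {}
    for call_id, text in transcripts.items():
        lowered = text.casefold()
        hits = [kw for kw in keywords if kw.casefold() in lowered]
        if hits:
            flagged[call_id] = hits
    return flagged


# Invented transcripts, including Arabic-English code-switching
transcripts = {
    "call-001": "أريد تقديم شكوى about my permit renewal",
    "call-002": "thank you, the record was updated",
    "call-003": "I need to escalate this, the fee was charged twice",
}
keywords = ["شكوى", "escalate", "refund"]  # "شكوى" means "complaint"

flagged = flag_calls(transcripts, keywords)
for call_id, hits in flagged.items():
    print(call_id, hits)
```

Because the search runs over text rather than audio, every call can be checked instead of a manually reviewed sample.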

What Changed

Over the first eight weeks of the pilot service line, three areas saw operationally significant outcomes.

1. QA coverage and efficiency: Instead of depending only on random sampling, supervisors could search the full call record for particular terms, complaint types, or agent responses. Calls with escalated complaints were surfaced immediately through keyword matching, so agents no longer had to self-report issues.

2. Agent training: Using aggregated transcript data, operations leadership identified the most frequently asked citizen queries that agents were answering inconsistently. Structured training updates replaced the team's prior reliance on anecdotal supervisor feedback.

3. Compliance response time: With an accurate Arabic transcript on record for every call, the time needed to reply to audit requests fell from days to hours. The team was surprised by how quickly the compliance use case gained practical significance.
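
The aggregation behind the agent-training outcome can be sketched as follows, assuming each call has already been labelled with a topic and with the answer the agent gave. Both labels, and the sample records, are hypothetical: the case study does not say how topics or answers were derived from the transcripts.

```python
from collections import Counter, defaultdict


def summarize_topics(records):
    """Summarise (topic, answer_label) pairs, one pair per call.

    Returns (topic, call_count, distinct_answer_count) tuples,
    ordered from the most to the least frequent topic.
    """
    volume = Counter()
    answers = defaultdict(set)
    for topic, answer in records:
        volume[topic] += 1
        answers[topic].add(answer)
    return [(topic, count, len(answers[topic])) for topic, count in volume.most_common()]


# Invented records: topic of the citizen query and a label for the agent's answer
records = [
    ("permit renewal", "answer A"),
    ("permit renewal", "answer B"),
    ("permit renewal", "answer A"),
    ("fee inquiry", "answer C"),
    ("fee inquiry", "answer C"),
    ("address update", "answer D"),
]

for topic, count, distinct in summarize_topics(records):
    print(f"{topic}: {count} calls, {distinct} distinct answers")
```

A high-volume topic with more than one distinct answer is a candidate for a training update, which is the kind of pattern that stays invisible when reviewers can only listen to individual recordings.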

The authority is currently evaluating an expansion to cover all inbound lines, as well as automated categorization of call intent to improve routing before calls reach agents.

Key Takeaways

Government contact centers processing Arabic-language calls face a specific infrastructure gap. Dominant commercial ASR systems were not built for the Gulf dialect or for the code-switching patterns common in GCC citizen interactions. The consequences are measurable: low transcription accuracy blocks the downstream QA, compliance, and service improvement workflows that modern contact center operations depend on.

A purpose-built Arabic STT model by Munsit produces meaningfully better output, and the integration sits behind existing telephony infrastructure with no changes to call routing or agent tooling. The path from pilot to production is straightforward once the transcription layer is accurate enough to trust.

See how Munsit STT integrates into your call centre; try it for free today!
