Urdu alphabet

Article Id: WHEBN0006688293

Title: Urdu alphabet  
Author: World Heritage Encyclopedia
Language: English
Subject: Bharati Braille, Arabic script, Urdu, Urdu Braille
Collection: Arabic Alphabets, Hindustani Orthography, Urdu, Urdu Alphabets
Publisher: World Heritage Encyclopedia

Urdu alphabet
اردو تہجی
Example of writing in the Urdu alphabet: Urdu
Languages: Urdu, Balti, Burushaski, others
Unicode range:
U+0600 to U+06FF
U+0750 to U+077F
U+FB50 to U+FDFF
U+FE70 to U+FEFF
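
The four code-point ranges listed above can be used to test whether a piece of text is written with characters from these ranges. The following is a minimal sketch, not a full script detector; the names `LISTED_RANGES`, `in_listed_ranges` and `uses_listed_ranges` are illustrative assumptions, and whitespace is ignored by choice.

```python
# Inclusive code-point bounds, copied from the ranges listed above.
LISTED_RANGES = [
    (0x0600, 0x06FF),
    (0x0750, 0x077F),
    (0xFB50, 0xFDFF),
    (0xFE70, 0xFEFF),
]


def in_listed_ranges(ch: str) -> bool:
    """Return True if the single character ch lies in one of the listed ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in LISTED_RANGES)


def uses_listed_ranges(text: str) -> bool:
    """Return True if text has at least one non-space character and
    every non-space character lies in one of the listed ranges."""
    chars = [c for c in text if not c.isspace()]
    return bool(chars) and all(in_listed_ranges(c) for c in chars)
```

For example, `uses_listed_ranges("\u0627\u0631\u062F\u0648")` (the word اردو) is true, while any string containing Latin letters fails the check.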

The Urdu alphabet is the right-to-left alphabet used for the Urdu language. It is a modification of the Persian alphabet, which is itself a derivative of the Arabic alphabet. The Urdu alphabet has 38 letters and no distinct letter cases, and it is typically written in the calligraphic Nastaʿlīq script, whereas Arabic is more commonly written in the Naskh style. Bare transliterations of Urdu into Roman letters (called Roman Urdu) usually omit many phonemic elements that have no equivalent in English or other languages commonly written in the Latin script. The National Language Authority of Pakistan has developed a number of systems with specific notations to signify non-English sounds, but these can only be read properly by someone already familiar with the loan letters.


  • History
    • Countries where Urdu language has been spoken
  • Nastaʿlīq
  • Alphabet
  • Vowels
    • Vowel chart
    • Alif
    • Wāʾo
    • Ye
  • Diacritics
    • Short vowels
    • Long vowels
    • Additional Diacritics
  • Special forms
    • Nūn ghunnah
    • Kāf or Gāf with Alif or Lām
    • Lām-alif
  • Use of specific letters
    • Retroflex letters
    • Do-cashmī he
  • Iẓāfat
  • Romanization standards and systems
  • See also
  • References
  • Sources
  • External links


History

The Urdu language emerged as a distinct register of Hindustani well before the Partition of India. It is distinguished chiefly by its extensive Persian influences (Persian having been the official language of the Mughal government and the most prominent lingua franca of the Indian subcontinent for several centuries before the solidification of British colonial rule during the 19th century). The standard Urdu script is a modified version of the Perso-Arabic script, expanded to accommodate the phonology of Hindustani.

Despite the invention of the Urdu typewriter in 1911, Urdu newspapers continued to publish prints of handwritten scripts by calligraphers known as katibs or khush-navees until the late 1980s. The Pakistani national newspaper Daily Jang was the first Urdu newspaper to use Nastaʿlīq computer-based composition. There are efforts under way to develop more sophisticated and user-friendly Urdu support on computers and the internet. Nowadays, nearly all Urdu newspapers, magazines, journals, and periodicals are composed on computers with Urdu software programs.

Apart from differing in how heavily they draw on Persian, Urdu and Hindi are mutually intelligible.

Countries where Urdu language has been spoken

Afghanistan, Bahrain, Bangladesh, Botswana, Burma, Canada, France, Fiji, Germany, Guyana, India, Kenya, Malaysia, Malawi, Mauritius, Norway, Oman, Pakistan, Qatar, Saudi Arabia, Singapore, South Africa, Thailand, Tajikistan, the UAE, the USA, the UK, Uganda, Uzbekistan, and Zambia.[1]


Nastaʿlīq

The Nastaʿlīq calligraphic writing style began as a Persian blend of the Naskh and Taʿlīq scripts. After the Mughal conquest, Nastaʿlīq became the preferred writing style for Urdu. It is the dominant style in Pakistan, and many Urdu writers elsewhere in the world use it. Nastaʿlīq is more cursive and flowing than its Naskh counterpart.


Alphabet

The letters of the Urdu alphabet and their pronunciation are listed below. Urdu retains many historical spellings from Arabic and Persian and therefore has many irregularities. The Arabic letters yāʾ and hāʾ each have two variants in Urdu: one of the yāʾ variants is used at the ends of words for the sound [eː], and one of the hāʾ variants is used to indicate the aspirated consonants. Letters for the retroflex consonants had to be added as well; this was accomplished by placing a small t̤oʾe ـ﯀ـ above the corresponding dental consonants. Several letters that represent distinct consonants in Arabic are conflated in Persian, and this has carried over to Urdu. The list gives the consonant pronunciation of each letter; some of these letters also represent vowel sounds.

No. | Name[2] | ALA-LC[3] | Hunterian[4] | IPA | Isolated glyph
1 | الف alif | ā, ʾ, – | ā, ʾ, – | /ɑː, ʔ, ∅/ | ا
2 | بے be | b | b | /b/ | ب
3 | پے pe | p | p | /p/ | پ
4 | تے te | t | t | /t̪/ | ت
5 | ٹے ṭe | ṭ | t | /ʈ/ | ٹ
6 | ثے s̱e | s̱ | s | /s/ | ث
7 | جیم jīm | j | j | /d͡ʒ/ | ج
8 | چے ce | c | ch | /t͡ʃ/ | چ
9 | بڑی حے baṛī ḥe | ḥ | h | /h, ɦ/ | ح
10 | خے khe | kh | kh | /x/ | خ
11 | دال dāl | d | d | /d̪/ | د
12 | ڈال ḍāl | ḍ | d | /ɖ/ | ڈ
13 | ذال ẕāl | ẕ | z | /z/ | ذ
14 | رے re | r | r | /r/ | ر
15 | ڑے ṛe | ṛ | r | /ɽ/ | ڑ
16 | زے ze | z | z | /z/ | ز
17 | ژے zhe | zh | zh | /ʒ/ | ژ
18 | سین sīn | s | s | /s/ | س
19 | شین shīn | sh | sh | /ʃ/ | ش
20 | صواد ṣwād | ṣ | s | /s/ | ص
21 | ضواد ẓwād | ẓ | z | /z/ | ض
22 | طوئے t̤oʾe | t̤ | t | /t̪/ | ط
23 | ظوئے z̤oʾe | z̤ | z | /z/ | ظ
24 | عین ʿain | ā, o, e, ʿ, – | ā, o, e, ʿ, – | /ɑː, oː, eː, ʔ, ʕ, ∅/ | ع
25 | غین ghain | gh | gh | /ɣ/ | غ
26 | فے fe | f | f | /f/ | ف
27 | قاف qāf | q | q | /q/ | ق
28 | کاف kāf | k | k | /k/ | ک
29 | گاف gāf | g | g | /ɡ/ | گ
30 | لام lām | l | l | /l/ | ل
31 | میم mīm | m | m | /m/ | م
32a | نون nūn | n | n | /n, ɲ, ɳ, ŋ/ | ن
32b | نون غنہ nūn ghunnah | ṉ | n | /◌̃/ | ں
33 | واؤ wāʾo | v, ū, o, au | w, ū, o, au | /ʋ, uː, oː, ɔː/ | و
34 | چھوٹی ہے choṭī he | h | h | /h, ɦ/ or /∅/ | ہ
35 | دو چشمی ہے do-cashmī he | h | h | /ʰ/ or /ʱ/ | ھ
36 | ہمزہ hamzah | ʾ, – | ʾ, – | /ʔ/, /∅/ | ء
37 | چھوٹی یے choṭī ye | y, ī, á | y, ī, á | /j, iː, ɑː/ | ی
38 | بڑی یے baṛī ye | ai, e | ai, e | /ɛː, eː/ | ے


Vowels

Vowels in Urdu are represented by letters that are also considered consonants. A single letter can represent several vowel sounds. Ambiguity can arise, but context is usually enough to determine the correct sound.

Vowel chart

Urdu does not have standalone vowel letters. Vowels are represented either by diacritics on the preceding consonant (often omitted unless ambiguity can arise) or by the consonants y and w (with or without diacritics) used as long vowels. The letter alif acts as a placeholder for vowels beginning a syllable and as the long vowel ā within and at the end of a syllable. This is a list of Urdu vowels:

Romanization Pronunciation
a /ə/
ā /aː/
i /ɪ/
ī /iː/
u /ʊ/
ū /uː/
e /eː/
ai /ɛː/
o /oː/
au /ɔː/
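
The vowel chart above can be read as a lookup table from romanized vowel symbols to their pronunciation. The sketch below is one possible way to apply it to a romanized word, matching the two-letter symbols "ai" and "au" before single letters. The names `ROMAN_TO_IPA` and `vowels_to_ipa` are illustrative assumptions, and the greedy matching will misread any word in which "a" and "i" (or "a" and "u") belong to separate syllables.

```python
import unicodedata

# Romanization -> pronunciation, copied from the vowel chart above.
_CHART = {
    "a": "ə", "ā": "aː", "i": "ɪ", "ī": "iː", "u": "ʊ",
    "ū": "uː", "e": "eː", "ai": "ɛː", "o": "oː", "au": "ɔː",
}
# Normalize keys so precomposed and decomposed macron letters compare equal.
ROMAN_TO_IPA = {unicodedata.normalize("NFC", k): v for k, v in _CHART.items()}


def vowels_to_ipa(word: str) -> list[str]:
    """Return the chart pronunciation of each vowel symbol found in word.

    Characters that are not in the chart (consonants, hyphens, other
    diacritic letters) are skipped.
    """
    word = unicodedata.normalize("NFC", word.lower())
    found = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in ROMAN_TO_IPA:          # try the digraphs "ai" and "au" first
            found.append(ROMAN_TO_IPA[pair])
            i += 2
        elif word[i] in ROMAN_TO_IPA:     # then single-letter vowels
            found.append(ROMAN_TO_IPA[word[i]])
            i += 1
        else:
            i += 1
    return found
```

Applied to "fauran", the function finds "au" followed by "a"; applied to "Urdū", it finds "u" followed by "ū".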


Alif

Alif is the first letter of the Urdu alphabet, and it is used exclusively as a vowel. At the beginning of a word, alif can represent any of the short vowels: اب ab, اسم ism, اردو Urdū. For long ā at the beginning of a word, alif-mad is used: آپ āp. In the middle and at the end of a word, a plain alif is used: بھاگنا bhāgnā.


Wāʾo

Wāʾo is used to render the vowels "ū", "o", "u" and "au" ([uː], [oː], [ʊ] and [ɔː] respectively), and it is also used to render the labiodental approximant, [ʋ].


Ye

Ye is divided into two variants: choṭī ye ("little ye") and baṛī ye ("big ye").

Choṭī ye (ی) is written in all forms exactly as in Persian. It is used for the long vowel "ī" and the consonant "y".

Baṛī ye (ے) is used to render the vowels "e" and "ai" (/eː/ and /ɛː/ respectively). Baṛī ye is distinguishable in writing from choṭī ye only when it comes at the end of a word or ligature. Unlike choṭī ye, baṛī ye is never used to begin a word or ligature.


Diacritics

Urdu has several diacritics, used mainly to mark vowels. In practice, however, the diacritics for the short vowels are often omitted.

Short vowels

Short vowels ("a", "i", "u") are represented by marks above and below a consonant.

Vowel | Name | Example | Transcription | IPA
اَ | zabar | بَن ban | a | /ə/
اِ | zer | بِن bin | i | /ɪ/
اُ | pesh | بُن bun | u | /ʊ/
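
Because the short-vowel marks are often omitted, software that compares Urdu words may want to strip them first. The sketch below assumes the three marks are encoded as the Unicode combining characters U+064E, U+0650 and U+064F; the names `SHORT_VOWEL_MARKS`, `strip_short_vowels` and `short_vowel_names` are illustrative.

```python
# Combining marks for the three short vowels (assumed code points).
SHORT_VOWEL_MARKS = {
    "\u064E": "zabar",  # short a
    "\u0650": "zer",    # short i
    "\u064F": "pesh",   # short u
}


def strip_short_vowels(text: str) -> str:
    """Remove zabar, zer and pesh, as is common in everyday writing."""
    return "".join(ch for ch in text if ch not in SHORT_VOWEL_MARKS)


def short_vowel_names(text: str) -> list[str]:
    """List the names of the short-vowel marks in text, in order."""
    return [SHORT_VOWEL_MARKS[ch] for ch in text if ch in SHORT_VOWEL_MARKS]


# The three example words differ only in the mark on the first letter.
BAN = "\u0628\u064E\u0646"
BIN = "\u0628\u0650\u0646"
BUN = "\u0628\u064F\u0646"
```

Stripping the marks maps `BAN`, `BIN` and `BUN` to the same two-letter string, which illustrates the ambiguity that readers resolve from context.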

Long vowels

Long vowels ("ā"/"á", "ī", "ū") can also be represented by the diacritical marks madd or khaṛā zabar, khaṛī zer, and ulṭā pesh, respectively.

Vowel | Name | Example | Transcription | IPA
آ | madd | قرآن Qurʾān | ā | /aː/
یٰ | khaṛā zabar | رحمٰن Raḥmān | ā | /aː/
یٰ | khaṛā zabar | موسیٰ Mūsá | á | /aː/
یٖ | khaṛī zer | بہٖ bihī | ī | /iː/
وٗ | ulṭā pesh | لہٗ lahū | ū | /uː/

Additional Diacritics

These other diacritics are seen less often in day-to-day Urdu.

Symbol | Name | Example | Purpose
ّ | tashdīd | اچّھا acchā | Doubles the consonant that the sign is over. In this example, the چ is doubled.
ْ | jazm / sukūn | ارْدو Urdū | Indicates that no vowel separates the consonants. Similar to the halant character in Hindi.
اً | tanwīn | فوراً fauran | Occurs mostly in Arabic loanwords (adverbs). Gives an "-an" sound at the end of a word.

Special forms

Nūn ghunnah

Nūn ghunnah ("nasal nūn") is used to indicate nasalization in words. It is almost identical to nūn but lacks the dot in the center. The difference is apparent only at the end of a word. In medial form it looks the same as nūn, though a small diacritical nūn ghunnah is sometimes added to distinguish it from a regular medial nūn.


Urdu Transcription
میں maiṉ
کنواں kuṉwāṉ

Kāf or Gāf with Alif or Lām

Both kāf (ک) and gāf (گ) take different forms when they combine with alif (ا) or lām (ل).

Letters | Conjunct
ک + ا | کا
گ + ا | گا
ک + ل | کل
گ + ل | گل


Lām-alif

As in Arabic and Persian, lām and alif form a special conjunct when lām precedes alif.

ل + ا = لا
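
In Unicode text the lām-alif conjunct is normally stored as the two letters lām (U+0644) and alif (U+0627) and drawn as a ligature by the font. Older data sometimes contains a precomposed presentation-form character from the U+FE70 to U+FEFF range instead. The sketch below shows how Python's standard `unicodedata.normalize` with the NFKC form maps such a character back to the two-letter sequence; the constant and function names are illustrative assumptions.

```python
import unicodedata

LAM = "\u0644"
ALIF = "\u0627"
LAM_ALIF_ISOLATED = "\uFEFB"  # precomposed presentation form of the conjunct


def expand_presentation_forms(text: str) -> str:
    """Replace presentation-form characters with their ordinary letters.

    NFKC compatibility normalization decomposes the lām-alif ligature
    character into lām followed by alif; plain letters are left unchanged.
    """
    return unicodedata.normalize("NFKC", text)
```

NFKC also folds other compatibility characters, so this is best applied only where that broader normalization is acceptable.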

Use of specific letters

Retroflex letters

The Persian alphabet had no letters for retroflex consonants, so these had to be created specifically for Urdu. This was accomplished by placing a superscript t̤oʾe ـ﯀ـ above the corresponding dental consonants.

Perso-Arabic dental consonant | Derived retroflex consonant | Name | IPA
ت | ٹ | ṭe | [ʈ]
د | ڈ | ḍāl | [ɖ]
ر | ڑ | ṛe | [ɽ]

Do-cashmī he

The letter do-cashmī he (ھ), "two-eyed he", is used in native Hindustani words, for aspiration of certain consonants. The aspirated consonants are sometimes classified as separate letters, although it takes two characters to represent them.

Digraph[3] Transcription[3] IPA
بھ bh [bʱ]
پھ ph [pʰ]
تھ th [t̪ʰ]
ٹھ ṭh [ʈʰ]
جھ jh [d͡ʒʱ]
چھ ch [t͡ʃʰ]
دھ dh [d̪ʱ]
ڈھ ḍh [ɖʱ]
ڑھ ṛh [ɽʱ]
کھ kh [kʰ]
گھ gh [ɡʱ]
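
Since each aspirated consonant takes two characters, code that counts or indexes the sounds of a word may want to treat a consonant plus do-cashmī he as one unit. The sketch below is a minimal illustration under two assumptions: do-cashmī he is encoded as U+06BE, and any combining mark (such as tashdīd) belongs to the preceding letter. The function name `split_units` is illustrative, and the code does not check that the preceding consonant is one of those listed above.

```python
import unicodedata

DO_CASHMI_HE = "\u06BE"  # ھ


def split_units(word: str) -> list[str]:
    """Split word into letter units, keeping aspirates together.

    Do-cashmī he and combining diacritics are attached to the
    preceding unit instead of starting a new one.
    """
    units = []
    for ch in word:
        attaches = ch == DO_CASHMI_HE or unicodedata.combining(ch) != 0
        if attaches and units:
            units[-1] += ch
        else:
            units.append(ch)
    return units
```

For بھاگنا (bhāgnā) this yields five units, the first being the digraph بھ, although the word is stored as six characters.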



Iẓāfat

Iẓāfat is a syntactical construction of two nouns, where the first component is a determined noun and the second is a determiner. This construction was borrowed from Persian. A short vowel "i" is used to connect the two words. It may be written as zer (ــِ) at the end of the first word, but usually it is not written at all. If the first word ends in choṭī he (ہ) or ye (ی), then hamzah (ء) is written above the last letter (ۂ or ئ). If the first word ends in a long vowel, then baṛī ye (ے) with hamzah on top (ئے) is written.[5]

Forms | Example | Transliteration | Meaning
ــِ | شیرِ پنجاب | sher-i Panjāb | the lion of Punjab
ۂ | قطرۂ آب | qat̤rah-yi āb | (a) drop of water
ئ | ولئ کامل | walī-yi kāmil | perfect saint
ئے | روئے زمین | rū-yi zamīn | surface of the Earth
ئے | صدائے بلند | ṣadā-yi buland | a high voice
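
The spelling rules for iẓāfat described above can be sketched as a small function that marks the first word of the construction. This is an illustration only: the function name `add_izafat` and the `show_zer` flag are assumptions, the code points are the ones commonly used for these Urdu letters, and a word-final alif or wāʾo is assumed to be a long vowel, which will be wrong for words where final wāʾo is a consonant.

```python
ZER = "\u0650"             # short-vowel mark written below the letter
CHOTI_HE = "\u06C1"        # ہ
CHOTI_HE_HAMZA = "\u06C2"  # ۂ
CHOTI_YE = "\u06CC"        # ی
YE_HAMZA = "\u0626"        # ئ
BARI_YE = "\u06D2"         # ے
ALIF = "\u0627"            # ا
WAO = "\u0648"             # و


def add_izafat(first_word: str, show_zer: bool = False) -> str:
    """Return first_word with its iẓāfat marking.

    - ends in choṭī he or choṭī ye: hamzah goes above the last letter
    - ends in a long vowel (assumed: alif or wāʾo): append ئے
    - otherwise: append zer only if show_zer is True, since the
      mark is usually left unwritten
    """
    if not first_word:
        return first_word
    stem, last = first_word[:-1], first_word[-1]
    if last == CHOTI_HE:
        return stem + CHOTI_HE_HAMZA
    if last == CHOTI_YE:
        return stem + YE_HAMZA
    if last in (ALIF, WAO):
        return first_word + YE_HAMZA + BARI_YE
    return first_word + ZER if show_zer else first_word
```

With these rules, صدا (ṣadā) becomes صدائے and ولی (walī) becomes ولئ, matching the examples in the table.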

Romanization standards and systems

There are several Romanization standards for writing Urdu. Among them, the most prominent are Uddin and Begum Urdu-Hindustani Romanization, ALA-LC romanization and ArabTeX.

See also


References

  1. "Urdu".
  2. Delacy 2003, pp. XV–XVI.
  3. "Urdu romanization" (PDF). The Library of Congress.
  4. Geographical Names Romanization in Pakistan. UNGEGN, 18th Session. Geneva, 12–23 August 1996. Working Papers No. 85 and No. 85 Add. 1.
  5. Delacy 2003, pp. 99–100.


Sources

  • Delacy, Richard (2003). Beginner's Urdu Script. McGraw-Hill.
  • Delacy, Richard (2010). Read and write Urdu script. McGraw-Hill.
  • "Urdu romanization" (PDF). The Library of Congress.
  • Ishida, Richard. "Urdu script notes".

External links

  • Urdu alphabet
  • Urdu alphabet with Devanagari equivalents
  • Hugo's Urdu Alphabet Page
  •, a resource for Urdu calligraphy and script
  • Urdu Script Introduction from Columbia University
  • National Council for Promotion of Urdu Language
This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply.