TY - JOUR
T1 - Framework of diacritic segmentation for Arabic handwritten document
AU - Sheikh, Ahmed Abdalla
AU - Azmi, Mohd Sanusi
AU - Aziz, Maslita Abd
AU - Al-Mhiqani, Mohammed Nasser
AU - Bafjaish, Salem Saleh
N1 - Funding Information:
The authors thank the Ministry of Education for funding this study through the following grant: FRGS/1/2017/ICT02/FTMK-CACT/F00345. Gratitude is also due to Universiti Teknikal Malaysia Melaka and Faculty of Information Technology and Communication for providing excellent research facilities.
Publisher Copyright:
© This is an open access article under the CC BY-SA license.
PY - 2021/11/1
Y1 - 2021/11/1
N2 - In recent standard Arabic and dialectal Arabic texts, diacritics and short vowels are absent. Exceptions are made for scripts aimed at beginner learners of Arabic, religious texts, and significant political texts. Text without diacritics is considered ambiguous because numerous words that differ only in their diacritic marks appear identical. In this paper, we present a framework for segmenting diacritics from Arabic handwritten documents using a region-based segmentation technique. Because Arabic handwriting and the Mushaf Al-Quran contain many diacritical marks, the diacritics must be properly extracted from Arabic handwritten documents to avoid losing useful features. The proposed framework is devised specifically to segment diacritics from Arabic handwritten images, so it includes no feature extraction, feature selection, or classification processes. We present the methodology used to fulfil the objectives of this paper and explain the preprocessing phases, concentrating on the segmentation phase in which the diacritics are segmented. Lastly, we identify the proposed region-based segmentation technique to facilitate our development throughout the experimental process.
AB - In recent standard Arabic and dialectal Arabic texts, diacritics and short vowels are absent. Exceptions are made for scripts aimed at beginner learners of Arabic, religious texts, and significant political texts. Text without diacritics is considered ambiguous because numerous words that differ only in their diacritic marks appear identical. In this paper, we present a framework for segmenting diacritics from Arabic handwritten documents using a region-based segmentation technique. Because Arabic handwriting and the Mushaf Al-Quran contain many diacritical marks, the diacritics must be properly extracted from Arabic handwritten documents to avoid losing useful features. The proposed framework is devised specifically to segment diacritics from Arabic handwritten images, so it includes no feature extraction, feature selection, or classification processes. We present the methodology used to fulfil the objectives of this paper and explain the preprocessing phases, concentrating on the segmentation phase in which the diacritics are segmented. Lastly, we identify the proposed region-based segmentation technique to facilitate our development throughout the experimental process.
KW - Arabic handwritten segmentation
KW - Diacritics segmentation
KW - Image segmentation phase
KW - Pre-processing phase
KW - Region-based
UR - http://www.scopus.com/inward/record.url?scp=85118923956&partnerID=8YFLogxK
U2 - 10.11591/ijeecs.v24.i2.pp1001-1008
DO - 10.11591/ijeecs.v24.i2.pp1001-1008
M3 - Article
AN - SCOPUS:85118923956
VL - 24
SP - 1001
EP - 1008
JO - Indonesian Journal of Electrical Engineering and Computer Science
JF - Indonesian Journal of Electrical Engineering and Computer Science
SN - 2502-4752
IS - 2
ER -