Article Information  
A Hybrid Approach for NER System for Scarce Resourced Language-URDU: Integrating n-gram with Rules and Gazetteers

Keywords: Entity Recognition, Named Entities, N-Gram Model, Gazetteer Lists

Mehran University Research Journal of Engineering & Technology

Volume 34, Issue 4

Saeeda Naz, Arif Iqbal Umar, Imran Razzak

References
1. Huang, F., “Multilingual Named Entity Extraction and Translation from Text and Speech”, Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, USA, 2005.
2. Bouzoubaa, N., and Lachemi, M., “Self-Compacting Concrete Incorporating High Volumes of Class F Fly Ash: Preliminary Results”, Cement & Concrete Research, Volume 31, pp. 413–420, 2001.
3. Naz, S., Umar, A.I., Shirazi, S.H., and Khan, S.A., “Challenges of Urdu Named Entity Recognition: A Scarce Resourced Language”, Research Journal of Applied Science Engineering & Technology, Volume 8, No. 10, pp. 1272–1278, 2014.
4. “Ethnologue: Statistical Summaries” (Last Visited: February, 2015).
5. Bikel, D.M., Miller, S., Schwartz, R., and Weischedel, R., “Nymble: A High Performance Learning Name-Finder”, Proceedings of 5th International Conference on Applied Natural Language Processing, pp. 194-201, 1997.
6. Nadeau, D., and Sekine, S., "A Survey of Named Entity Recognition and Classification", Lingvisticae Investigationes, Volume 30, No. 1, pp. 3-26, 2007.
7. Borthwick, A., "A Maximum Entropy Approach to Named Entity Recognition", Ph.D. Thesis, Department of Computer Science, New York University, USA, 1999.
8. Li, W., and McCallum, A., "Rapid Development of Hindi Named Entity Recognition using Conditional Random Fields and Feature Induction", ACM Transactions on Asian Language Information Processing, Volume 2, No. 3, pp. 290-294, 2003.
9. Becker, K.R., Bennett, B., Davis, E., and Panton, D., "Named Entity Recognition in Urdu: A Progress Report", Proceedings of International Conference on Internet Computing, 2002.
10. http://mirror.aclweb.org/ijcnlp08/index.html
11. "Workshop on NER for South and South East Asian Languages", International Joint Conference on Natural Language Processing, 2008. [Online]. Available: http://ltrc.iiit.ac.in/ner-ssea-08/index.cgi?topic=5 (Visited: 2011).
12. Chatterji, S.S., Dandapat, S., Sarkar, S., and Mitra, P., "A Hybrid Approach for Named Entity Recognition in Indian Languages", Proceedings of International Joint Conference on Natural Language Processing, pp. 17-24, Hyderabad, India, 2008.
13. Gali, H., Surana, A., Vaidya, P., Shishtla, and Sharma, D.M., "Aggregating Machine Learning and Rule Based Heuristic for Named Entity Recognition", Proceedings of International Joint Conference on Natural Language Processing, pp. 25-32, Hyderabad, India, 2008.
14. Ekbal, A., Haque, R., Das, A., Poka, V., and Bandyopadhyay, S., "Language Independent Named Entity Recognition in Indian Languages", Proceedings of International Joint Conference on Natural Language Processing, pp. 33-40, Hyderabad, India, 2008.
15. Kumar, P., and Kiran, R., "A Hybrid Named Entity Recognition System for South Asian Languages", Proceedings of International Joint Conference on Natural Language Processing, pp. 83-88, Hyderabad, India, 2008.
16. Mukund, S., and Srihari, R.K., "NE Tagging for Urdu Based on Bootstrap POS Learning", Proceedings of 3rd International Workshop on Cross Lingual Information, 2009.
17. Mukund, S., "An Information-Extraction System for Urdu-A Resource-Poor Language", ACM Transactions on Asian Language Information Processing, Volume 9, No. 4, 2010.
18. Riaz, K., "Rule-Based Named Entity Recognition in Urdu", Proceedings of ACL Named Entities Workshop, pp. 126-135, 2010.
19. Becker, B., and Riaz, K., "A Study in Urdu Corpus Construction", Proceedings of 3rd Workshop on Asian Language Resources and International Standardization at the International Conference on Computational Linguistics, Taipei, Taiwan, 2002.
20. Singh, U., Goyal, V., and Lehal, G.S., "Named Entity Recognition System for Urdu", Proceedings of International Conference on Computational Linguistics, Bombay, India, 2012.
21. Jahangir, F., Anwar, W., Bajwa, U.I., and Wang, X., "N Gram and Gazetteer List Based Named Entity Recognition for Urdu: A Scarce Resourced Language", Proceedings of International Conference on Computational Linguistics, Bombay, India, 2012.
22. "ACL NE Corpus" [Online]. Available: http://crl.nmsu.edu/Resources/lang_res/urdu.html (Visited: 2010-2011).