Review implementation of linguistic approach in schema matching

Galih Hendro Martono, Azhari SN


Research on schema matching has been conducted since the last decade. Several approaches have been proposed, using methods such as neural networks, feature selection, constraint-based, instance-based, and linguistic techniques. Schema matching serves as a basic model in fields such as e-commerce, e-business, and data warehousing. The linguistic approach itself has long been applied to a variety of problems, such as calculating entity similarity values across two or more schemas. The purpose of this paper is to provide an overview of previous studies on the implementation of the linguistic approach in schema matching and to identify gaps for the development of existing methods. Furthermore, this paper focuses on similarity measurement in the linguistic approach to schema matching.


Schema Matching; Linguistic Approach; Natural Language Processing; Similarity Measure
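
To make the idea of "entity similarity values" between schemas concrete, the following is a minimal sketch of one common name-based linguistic measure: normalized edit (Levenshtein) similarity over tokenized element names. The abstract does not say which measures the paper reviews in detail, so the choice of measure, the function names (`tokenize`, `levenshtein`, `name_similarity`, `match_schemas`), the example element names, and the 0.6 threshold are all assumptions made for illustration only.

```python
import re


def tokenize(name):
    """Split an element name into lowercase tokens (camelCase, underscores, punctuation)."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def levenshtein(a, b):
    """Edit distance between two strings, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def name_similarity(a, b):
    """Normalized edit similarity in [0, 1] over the tokenized, lowercased names."""
    x, y = " ".join(tokenize(a)), " ".join(tokenize(b))
    longest = max(len(x), len(y))
    return 1.0 if longest == 0 else 1.0 - levenshtein(x, y) / longest


def match_schemas(source, target, threshold=0.6):
    """Map each source element to its best-scoring target element, if it reaches the threshold.

    The threshold is an arbitrary illustrative value, not one taken from the paper.
    """
    matches = {}
    for s in source:
        best = max(target, key=lambda t: name_similarity(s, t))
        score = name_similarity(s, best)
        if score >= threshold:
            matches[s] = (best, round(score, 2))
    return matches


if __name__ == "__main__":
    # Hypothetical element names from two schemas.
    print(match_schemas(["custName", "zip"], ["customer_name", "postal_code"]))
```

In this sketch `custName` is paired with `customer_name`, while `zip` finds no partner because a purely string-based measure cannot see that `zip` and `postal_code` are synonyms. Covering such cases usually requires auxiliary linguistic resources such as a thesaurus or WordNet.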



Copyright (c) 2017 International Journal of Advances in Intelligent Informatics

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Advances in Intelligent Informatics
(pISSN: 2442-6571 | eISSN: 2548-3161 )
Organized by Informatics Department - Universitas Ahmad Dahlan , and UTM Big Data Centre - Universiti Teknologi Malaysia
Published by Universitas Ahmad Dahlan