Review Implementation of Linguistic Approach in Schema Matching

Galih Hendro Martono

Abstract


Research on schema matching has been conducted since 1980. Several approaches have been proposed, using methods such as neural networks, feature selection, constraint-based, instance-based, and linguistic techniques. Fields such as e-commerce, e-business, and data warehousing use schema matching as a basic model. This paper explores the implementation of the linguistic approach in schema matching. The linguistic approach has long been applied to a range of problems, such as calculating similarity values between entities in two or more schemas.
This paper is a literature summary describing previous research on implementing schema matching with the linguistic approach. Its purpose is to provide an overview of previous studies on the linguistic approach in schema matching and to identify opportunities for developing existing methods.
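
As a rough illustration of what calculating entity similarity values between schemas can look like, the sketch below compares element names by splitting them into word tokens and taking the Jaccard overlap of the token sets. This is only one simple name-based measure, chosen here for brevity; it is not a method taken from the surveyed papers. The function names, the example element names, and the 0.5 threshold are all assumptions made for illustration.

```python
import re


def tokenize(name):
    """Split a schema element name into a set of lowercase word tokens."""
    # Insert a space at camelCase boundaries, then split on non-alphanumerics.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return {t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t}


def name_similarity(a, b):
    """Jaccard similarity of the token sets of two element names (0.0 to 1.0)."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match_schemas(source, target, threshold=0.5):
    """Return (source_name, target_name, score) pairs at or above the threshold.

    The threshold value is an illustrative assumption, not a recommended setting.
    """
    pairs = []
    for s in source:
        for t in target:
            score = name_similarity(s, t)
            if score >= threshold:
                pairs.append((s, t, score))
    # Highest-scoring candidate matches first.
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical element names from two schemas.
schema_a = ["customerName", "OrderDate"]
schema_b = ["customer_name", "ship_date"]
print(match_schemas(schema_a, schema_b))
```

A purely token-based measure like this misses synonyms and abbreviations (for example `client` versus `customer`), which is where dictionary or thesaurus lookups would typically come in.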

Keywords


Schema Matching; Linguistic Approach; Natural Language Processing; Similarity Measure


DOI: http://dx.doi.org/10.12928/ijain.v3i1.75

Copyright (c) 2017 International Journal of Advances in Intelligent Informatics

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

_________________________________________________________
International Journal of Advances in Intelligent Informatics
(pISSN: 2442-6571 | eISSN: 2548-3161 )
Organized by Informatics Department - Universitas Ahmad Dahlan, and
                    UTM Big Data Centre - Universiti Teknologi Malaysia
Published by Universitas Ahmad Dahlan
W : http://ijain.org
E : info@ijain.org, andri.pranolo@tif.uad.ac.id, andri.pranolo.id@ieee.org