博碩士論文 etd-0718101-105738 詳細資訊
Title page for etd-0718101-105738
論文名稱
Title
XML文件與關聯式資料庫之間轉換方法之設計與實作
Design and Implementation of a Mapping Technique between XML Documents and Relational Databases
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
170
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2001-07-10
繳交日期
Date of Submission
2001-07-18
關鍵字
Keywords
關聯式資料庫、可延伸標示語言
Relational Database, XML
統計
Statistics
本論文已被瀏覽 5699 次,被下載 42
The thesis/dissertation has been browsed 5699 times, has been downloaded 42 times.
中文摘要
近年來,許多人利用全球資訊網(World Wide Web)和網際網路(Internet)去尋找他們所想要的資訊。超文件標示語言(HTML)是一種被用來發行超文件的文件標示語言。HTML是一種對於世界上的內容開發者而言的目標格式。基本上,超文件標示語言最主要的貢獻在於描述如何去展示一個資料的項目。因此,HTML文件很難去找到有用的資訊。那是因為HTML文件將所有的內容與展示的標籤混合在一起。可延伸標示語言(eXtensible Markup Language) 則是另一種在網際網路作為資料交換和內部企業應用的資料格式。為了能夠促進資料交換,企業的夥伴定義了共同XML文件的文件型別定義(Document Type Definitions)來作為他們的應用間的文件交換。而且,很受歡迎的WWW/EDI、電子商務和許許多多的商業資料都使用XML在WWW上作資料的交換。基本上,XML可以自行描述自己資料的意義,而且XML文件的內容本身是和展示的格式是分開的,所以很容易從中找到有意義的資訊並且進而分析它。然而,當商業資料大量的存在時,我們必須將XML文件轉換到關聯式資料庫內。為了要在不同的應用間交換資料,我們必須將在資料庫的資料重新產生出一XML的文件。在這一篇論文中,我們設計了在XML文件和關聯式資料庫之間的轉換方法和提出實作的轉換工具。然而,XML文件和一般的關聯式的資料不太相同。XML文件是具階層性的,並且元素(element)可以是巢狀的及重複多次的。因此,我們無法很輕易正確地從XML文件轉換到關聯式資料庫。我們的方法裡,必須解決上述的問題,我們設計和實作了在XML文件與關聯式資料庫間可自動轉換任何XML的文件及任何種類的商業性關聯式資料庫的轉換方法。我們實作的工具是用Visual Basic與SQL server 2000來實作。從我們的經驗之中,我們展示出我們的轉換技術可以有效率地適用在任何XML文件與任何種類的關聯式資料庫,而且資料庫將不需要任何額外的需求及改變。
Abstract
In recent years, many people have used the World Wide Web and the Internet to find the information they want. HTML is a document markup language for publishing hypertext on the WWW, and it has been the target format for content developers around the world. HTML tags serve the primary purpose of describing how to display a data item. As a result, it is difficult to find useful information in HTML documents, because they mix content with display tags. XML, on the other hand, is another data format, used for data exchange between inter-enterprise applications on the Internet. To facilitate data exchange, industry groups define public Document Type Definitions (DTDs) that specify the format of the XML documents to be exchanged between their applications. Moreover, WWW/EDI and electronic commerce are very popular, and a lot of business data is exchanged as XML on the World Wide Web. XML tags describe the data itself, and the content (meaning) of an XML document is separated from its display format, so it is easy to find meaningful information in XML documents and to analyze it. However, when a large volume of business data (XML documents) exists, we must transform the XML documents into relational databases. Conversely, to exchange business data between applications, we must construct XML documents from the relational database. In this thesis, we design a mapping technique between XML documents and relational databases and present the implementation of the mapping tools. XML documents are fundamentally different from relational data: they are hierarchical, and their elements can be nested and repeated many times (i.e., set-valued and recursive). Therefore, we cannot map XML documents to relational databases straightforwardly. Our mapping technique must resolve these problems.
We design and implement a mapping technique between XML documents and relational databases such that the mapping can be done automatically for any kind of XML document and any kind of commercial relational database. The tools are implemented in Visual Basic and SQL Server 2000. From our experience, we show that our efficient mapping technique can be applied to any kind of relational database without any extra requirements or changes to the database.
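To make the nesting and repetition problem described in the abstract concrete, the following is a minimal sketch of one generic way to store a hierarchical XML document in flat relations and read it back. It is not the thesis's own technique, which this record does not detail. It assumes a simple edge-table layout (one row per element, pointing at its parent), uses Python with SQLite instead of Visual Basic and SQL Server 2000, and ignores mixed content. The table names `edge` and `attr`, the functions `shred` and `rebuild`, and the sample document are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample: <item> repeats and is nested inside <order>.
SAMPLE_XML = """
<orders>
  <order id="1">
    <item>pen</item>
    <item>ink</item>
  </order>
  <order id="2">
    <item>paper</item>
  </order>
</orders>
"""

def shred(xml_text, conn):
    """Store each element as one row: (id, parent id, sibling order, tag, text)."""
    conn.execute(
        "CREATE TABLE edge (id INTEGER PRIMARY KEY, parent INTEGER, "
        "ordinal INTEGER, tag TEXT, text TEXT)"
    )
    conn.execute("CREATE TABLE attr (node INTEGER, name TEXT, value TEXT)")
    next_id = 0

    def visit(elem, parent, ordinal):
        nonlocal next_id
        next_id += 1
        node = next_id
        text = (elem.text or "").strip() or None  # whitespace-only text is dropped
        conn.execute("INSERT INTO edge VALUES (?, ?, ?, ?, ?)",
                     (node, parent, ordinal, elem.tag, text))
        for name, value in elem.attrib.items():
            conn.execute("INSERT INTO attr VALUES (?, ?, ?)", (node, name, value))
        for i, child in enumerate(elem):
            visit(child, node, i)

    visit(ET.fromstring(xml_text), None, 0)

def rebuild(conn):
    """Reassemble the element tree from the edge and attr tables."""
    def build(node, tag, text):
        elem = ET.Element(tag)
        elem.text = text
        for name, value in conn.execute(
                "SELECT name, value FROM attr WHERE node = ?", (node,)):
            elem.set(name, value)
        children = conn.execute(
            "SELECT id, tag, text FROM edge WHERE parent = ? ORDER BY ordinal",
            (node,)).fetchall()
        for child in children:
            elem.append(build(*child))
        return elem

    root = conn.execute(
        "SELECT id, tag, text FROM edge WHERE parent IS NULL").fetchone()
    return build(*root)

conn = sqlite3.connect(":memory:")
shred(SAMPLE_XML, conn)

# Repeated <item> elements become ordinary rows joined to their parent <order>.
items = conn.execute(
    "SELECT o.id, i.text FROM edge o JOIN edge i ON i.parent = o.id "
    "WHERE o.tag = 'order' AND i.tag = 'item' ORDER BY o.id, i.ordinal"
).fetchall()
rebuilt = ET.tostring(rebuild(conn), encoding="unicode")
```

In this layout the database needs no schema changes per document type, at the cost of a self-join for every level of nesting that a query traverses.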
目次 Table of Contents
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . iv
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . xi
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 HTML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Motivations . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Organization of the Thesis . . . . . . . . . . . . . . . . . . 11
2. A Survey . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Mapping from XML documents to Relational Databases . . . . . . 13
2.1.1 Mapping from XML to Relational DB with DTDs . . . . . . . . . 13
2.1.1.1 Simplifying DTDs . . . . . . . . . . . . . . . . . . . . . 14
2.1.1.2 Special Schema Conversion Techniques . . . . . . . . . . . 15
2.1.1.3 The Basic Inlining Technique . . . . . . . . . . . . . . . 15
2.1.1.4 The Shared Inlining Technique . . . . . . . . . . . . . . . 18
2.1.1.5 The Hybrid Inlining Technique . . . . . . . . . . . . . . . 22
2.1.2 Mapping from XML Document to Relational DB without DTDs . . 23
2.1.2.1 Mapping Edges . . . . . . . . . . . . . . . . . . . . . . . 25
2.1.2.2 Mapping Values . . . . . . . . . . . . . . . . . . . . . . 27
2.2 Mapping from Relational Databases to XML Documents . . . . . . 31
2.2.1 A SQL-Based Language Specification . . . . . . . . . . . . . 31
2.2.2 Implementation Alternatives . . . . . . . . . . . . . . . . . 36
2.2.3 Early Tagging, Early Structuring . . . . . . . . . . . . . . 37
2.2.3.1 The Stored Procedure Approach . . . . . . . . . . . . . . . 37
2.2.3.2 The Correlated CLOB Approach . . . . . . . . . . . . . . . 38
2.2.3.3 The De-Correlated CLOB Approach . . . . . . . . . . . . . . 38
2.2.4 Late Tagging, Late Structuring . . . . . . . . . . . . . . . 39
2.2.4.1 Content Creation: Redundant Relation Approach . . . . . . . 40
2.2.4.2 Content Creation: Unsorted Outer Union Approach . . . . . . 40
2.2.4.3 Structuring/Tagging: Hash-based Tagger . . . . . . . . . . 42
2.2.5 Late Tagging, Early Structuring . . . . . . . . . . . . . . . 43
2.2.5.1 Structured Content Creation: Sorted Outer Union Approach . 43
2.2.5.2 Tagging Sorting Data: Constant Space Tagger . . . . . . . . 44
3. Mapping From Relational Databases to XML Documents . . . . . . . 45
3.1 The Mapping Technique . . . . . . . . . . . . . . . . . . . . . 45
4. Mapping From Valid XML Documents to Relational Databases . . . . 65
4.1 The Mapping Technique . . . . . . . . . . . . . . . . . . . . . 65
5. Mapping From Well-Formed XML Documents to Relational Databases . 81
5.1 The Mapping Technique . . . . . . . . . . . . . . . . . . . . . 81
6. Inserting the XML Data into the Related Relation . . . . . . . 102
6.1 The Inserting Technique . . . . . . . . . . . . . . . . . . . 102
7. A Comparison . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.1 From the Relational Databases to the XML Documents . . . . . . 113
7.2 From the DTD to the Relational Database Schemas . . . . . . . 123
7.3 From the XML Documents to the Relational Database Tuples . . . 124
8. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.2 Further Research Work . . . . . . . . . . . . . . . . . . . . 127
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
A. Early Tagging, Early Structuring . . . . . . . . . . . . . . . 131
A.1 The Stored Procedure Approach . . . . . . . . . . . . . . . . 131
A.2 De-Correlated CLOB Approach . . . . . . . . . . . . . . . . . 135
B. Late Tagging, Late Structuring . . . . . . . . . . . . . . . . 148
B.1 Redundant Relation Approach . . . . . . . . . . . . . . . . . 148
B.2 Unsorted Outer Union Approach . . . . . . . . . . . . . . . . 153
C. Late Tagging, Early Structuring . . . . . . . . . . . . . . . . 164
C.1 Sorted Outer Union Approach . . . . . . . . . . . . . . . . . 164
D. Parts of the Trace of the Sample Example. . . . . . . . . . . . 166
參考文獻 References
[1] V. Apparao, S. Byrne, M. Champion, S. Isaacs, I. Jacobs, A. Le Hors, G. Nicol, J. Robie, R. Sutor, C. Wilson, and L. Wood, "Document Object Model (DOM) Level 1 Specification Version 1.0," W3C Recommendation, October 1998. Http://www.w3.org/TR/REC-DOM-Level-1-199981001.

[2] J. Bosak, T. Bray, D. Connolly, E. Maler, G. Nicol, C. M. Sperberg-McQueen, L. Wood, and J. Clark, "W3C XML Specification DTD," http://www.w3.org/XML/1998/06/xmlspec-report.htm.

[3] T. Bray, J. Paoli, and C. M. Sperberg-McQueen, "Extensible Markup Language (XML) 1.0," http://www.w3.org/TR/REC-xml.

[4] T. Bray, J. Paoli, and C. M. Sperberg-McQueen, "Document Content Description for XML," Http://www.w3.org/TR/NOTE-dcd.

[5] P. Buneman, "Semistructured Data," Proc. of ACM Symposium on Principles of Database Systems, pp. 117-121, 1997.

[6] S. S. Chawathe, "Describing and Manipulating XML Data," IEEE Data Engineering Bulletin, Vol. 22, No. 3, pp. 3-9, 1999.

[7] R. Cover, "The SGML/XML Web Page," http://www.oasis-open.org/cover/xml.html.

[8] A. Deutsch, M. Fernandez, and D. Suciu, "Storing Semistructured Data with STORED," Proc. of ACM SIGMOD Conference, pp. 431-442, Philadelphia, Pennsylvania, May 1999.

[9] R. Fagin, "Multi-Valued Dependencies and a New Normal Form for Relational Databases," ACM Transactions on Database Systems, Vol. 2, No. 3, pp. 128-146, 1977.

[10] M. Fernandez, W. Tan, and D. Suciu, "SilkRoute: Trading Between Relations and XML," In 9th International WWW Conference, pp. 723-745, May 2000.

[11] D. Florescu and D. Kossmann, "Storing and Querying XML Data Using an RDBMS," IEEE Data Engineering Bulletin, Vol. 22, No. 3, pp. 27-34, 1999.

[12] R. Goldman, J. McHugh, and J. Widom, "From Semistructured Data to XML: Migrating the Lore Data Model and Query Language," WebDB (Informal Proceedings), pp. 25-30, 1999.

[13] M. J. McHugh, S. Abiteboul, R. Goldman, D. Quass, and J. Widom, "Lore: A Database Management System for Semistructured Data," SIGMOD Record, Vol. 26, No. 3, pp. 54-56, September 1997.

[14] Oracle Corporation, "XML Support in Oracle 8 and beyond," Technical white paper, http://www.oracle.com/xml/documents.

[15] K. Ramasamy, J. F. Naughton, and D. Maier, "Storage Representations for Set-Valued Attributes," Working Paper, Department of Computer Sciences, University of Wisconsin-Madison.

[16] M. Scholl, S. Abiteboul, F. Bancilhon, N. Bidoit, S. Gamerman, D. Plateau, P. Richard, and A. Verroust, "VERSO: A Database Machine Based On Nested Relations," Nested Relations and Complex Objects, pp. 27-49, Germany, April 1987.

[17] Microsoft Corporation, "XML Schema," http://www.microsoft.com/xml/schema/reference/start.asp.

[18] J. Shanmugasundaram, K. Tufte, G. He, C. Zhang, D. DeWitt, and J. Naughton, "Relational Databases for Querying XML Documents: Limitations and Opportunities," Proc. of VLDB, pp. 302-314, Edinburgh, Scotland, 1999.

[19] J. Shanmugasundaram, E. Shekita, R. Barr, M. Carey, B. Lindsay, H. Pirahesh, and B. Reinwald, "Efficiently Publishing Relational Data as XML Documents," Proc. of VLDB, pp. 65-76, Cairo, Egypt, 2000.

[20] L. D. Shapiro, "Join Processing in Database Systems with Large Main Memories," ACM Transactions on Database Systems (TODS), Vol. 11, No. 3, pp. 239-264, 1986.

[21] J. Widom, "Data Management for XML: Research Directions," IEEE Data Engineering Bulletin, Vol. 22, No. 3, pp. 44-52, 1999.

[22] R. V. Zwol, P. Apers, and A. Wilschut, "Modeling and Querying Semistructured Data with MOA," Workshop on Query Processing for Semistructured Data and Non-standard Data Format, 1999.
電子全文 Fulltext
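This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.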
論文使用權限 Thesis access permission:校內公開,校外永不公開 restricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus:永不公開 not available


紙本論文 Printed copies
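Public-access information for printed theses is relatively complete from ROC academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.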
開放時間 available 已公開 available
