Title page for etd-0813101-155034
Title: A Skeleton Supporting Group Collaboration, Load Distribution, and Fault Tolerance for Internet-based Computing (Chinese title: 在網際網路上支援協同合作、負載分散及容錯機制之計算架構)
Department:
Year, semester:
Language:
Degree:
Number of pages: 97
Author:
Advisor:
Convenor:
Advisory Committee:
Date of Exam: 2001-07-16
Date of Submission: 2001-08-13
Keywords: Group Collaboration, Load Distribution, Internet, Fault Tolerance
Statistics: This thesis/dissertation has been viewed 5762 times and downloaded 2691 times.
Abstract (translated from Chinese)
This dissertation explores the design of the dual connection skeleton (DCS), a framework that facilitates the efficient development of group-collaboration and high-performance computing applications in the Internet environment. Its chief difference from conventional architectures is a logical ring organized from multiple brokers. Group collaboration, load distribution, and fault tolerance are the three key issues the dual connection skeleton addresses.

As the Internet has become mainstream, group collaboration has become a central concern in building computer-supported cooperative work applications on it. On the basis of the dual connection skeleton, we therefore propose a concurrency-control mechanism that guarantees fairness and consistency when group members access shared resources. For load distribution, DCS adopts the adaptive highest response ratio next (AHRRN) algorithm for job scheduling. We evaluate AHRRN by comparing it with the shortest job first, highest response ratio next, and first come, first served algorithms; AHRRN not only performs better than these but also avoids job starvation during scheduling. Furthermore, in parallel-processing applications a job may be decomposed into several tasks that can be assigned to different processing elements. DCS therefore employs a method named dynamic grouping scheduling (DGS) to schedule these tasks and improve system performance. DGS differs from existing scheduling algorithms in two respects: first, it uses a dynamic grouping strategy to determine the computational cost of tasks; second, at each scheduling step it estimates each processing element's suitability for the still-unscheduled tasks in order to reach a better assignment decision. The performance of DGS is evaluated by comparing schedule lengths with other algorithms, and the experimental results confirm its superiority for Internet computing. In addition, for fault tolerance, DCS uses the dual connection strategy to recover from system failures, and classic checkpointing is also employed by DCS for job relocation. We examine five ways of establishing dual connections, and the experiments show how these five approaches perform under different degrees of system heterogeneity. By integrating the mechanisms above for group collaboration, load distribution, and fault tolerance, DCS forms a complete and unified framework for Internet computing.
Abstract
This dissertation is intended to explore the design of a dual connection skeleton (DCS), which facilitates effective and efficient exploitation of Internet-centric collaborative workgroup and high performance metacomputing applications. The predominant difference between DCS and conventional frameworks is that DCS administers a network of brokers that are grouped into a logical ring. New mechanisms for group collaboration, load distribution, and fault tolerance, which are three crucial issues in Internet-based computing, are proposed and integrated into the dual connection skeleton.

Collaborative workgroups become a significant issue when we attempt to develop wide-area applications supporting computer-supported cooperative work (CSCW). For group collaboration, DCS therefore offers a concurrency-control strategy that ensures the consistency of shared resources: multiple users in a collaborative group can simultaneously access shared data without violating its consistency. With respect to load distribution, DCS applies an adaptive highest response ratio next (AHRRN) algorithm to job scheduling. Performance evaluations against competing algorithms, such as shortest job first (SJF), highest response ratio next (HRRN), and first come, first served (FCFS), are conducted. Simulation results demonstrate that AHRRN is not only efficient but also prevents the well-known job-starvation problem. In a parallel computational application, a composite job can be further decomposed into constituent tasks that are assigned to different processing elements (PEs) for concurrent execution. The dual connection skeleton therefore uses the proposed dynamic grouping scheduling (DGS) algorithm to undertake task scheduling for performance improvement. DGS employs a task-grouping strategy to determine the computational costs of tasks, and it re-prioritizes unscheduled tasks at each scheduling step to find an appropriate allocation decision. In terms of schedule length, the performance of DGS has been evaluated against existing algorithms such as Heavy Node First (HNF), the Critical Path Method (CPM), Weight Length (WL), Dynamic Level Scheduling (DLS), and Dynamic Priority Scheduling (DPS). Simulation results show that DGS outperforms these competing algorithms. Moreover, for fault tolerance, DCS utilizes a dual connection mechanism to enhance computational reliability.
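The response-ratio rule underlying HRRN, the baseline that AHRRN adapts, can be sketched as follows. The dissertation's actual adaptive rule is not reproduced here; the `aging_weight` parameter is a purely hypothetical stand-in for whatever AHRRN adjusts, so this illustrates the starvation-avoidance idea rather than the algorithm itself:

```python
# Sketch of highest-response-ratio-next (HRRN) job selection.
# Classic HRRN priority = (waiting_time + service_time) / service_time:
# short jobs are favored, but a long job's ratio grows as it waits,
# which is what prevents starvation. The `aging_weight` knob below is
# an illustrative stand-in for AHRRN's adaptive adjustment, not the
# dissertation's actual rule.

def response_ratio(waiting_time, service_time, aging_weight=1.0):
    return (aging_weight * waiting_time + service_time) / service_time

def pick_next_job(jobs, now, aging_weight=1.0):
    """jobs: list of (name, arrival_time, service_time) tuples.
    Returns the name of the job with the highest response ratio."""
    def ratio(job):
        _, arrival, service = job
        return response_ratio(now - arrival, service, aging_weight)
    return max(jobs, key=ratio)[0]

# Early on, the short job wins; later, a freshly arrived short job
# no longer preempts the long job that has been waiting.
print(pick_next_job([("long", 0, 10), ("short", 0, 2)], now=1))    # → short
print(pick_next_job([("long", 0, 10), ("short", 39, 2)], now=40))  # → long
```

Under plain FCFS the short job would wait behind the long one, and under plain SJF a steady stream of short arrivals could starve the long job; the response ratio mediates between the two.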
To construct dual connections, we examine five approaches: RANDOM, NEXT, ROTARY, MINNUM, and WEIGHT. Any of these approaches can be incorporated into DCS-based wide-area metacomputing systems, and performance simulation shows that WEIGHT benefits the dual connection the most. A DCS-based scientific computational application, motion correction, is used to demonstrate the fault-tolerance capability of DCS. Bringing the group collaboration, load distribution, and fault tolerance mechanisms together, the dual connection skeleton forms a seamless, integrated framework for Internet-centric computing.
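The flavor of these selection policies can be conveyed with a small sketch. The five policy names come from the dissertation, but their precise definitions appear in Chapter 6; the broker fields and the WEIGHT formula below (capacity divided by current load) are illustrative assumptions, and the stateful round-robin ROTARY policy is omitted from this stateless sketch:

```python
# Sketch of backup-broker ("dual connection") selection over a logical
# ring of brokers. Policy names follow the dissertation; the load and
# capacity fields and the WEIGHT score are illustrative assumptions.
import random

def select_dual(brokers, me, policy="WEIGHT"):
    """brokers: dict mapping broker name -> {"load": int, "capacity": float}.
    Returns the name of the broker chosen as `me`'s dual connection."""
    others = [b for b in brokers if b != me]
    if policy == "RANDOM":   # any other broker, chosen uniformly
        return random.choice(others)
    if policy == "NEXT":     # successor on the logical ring
        ring = sorted(brokers)
        return ring[(ring.index(me) + 1) % len(ring)]
    if policy == "MINNUM":   # broker currently carrying the least load
        return min(others, key=lambda b: brokers[b]["load"])
    if policy == "WEIGHT":   # best capacity-to-load score (assumed formula)
        return max(others, key=lambda b: brokers[b]["capacity"] / (1 + brokers[b]["load"]))
    raise ValueError(f"unknown policy: {policy}")

brokers = {"A": {"load": 3, "capacity": 1.0},
           "B": {"load": 0, "capacity": 2.0},
           "C": {"load": 1, "capacity": 5.0}}
print(select_dual(brokers, "A", "NEXT"))    # → B
print(select_dual(brokers, "A", "WEIGHT"))  # → C
```

A WEIGHT-style score is the only policy here that sees both how busy and how capable a candidate is, which is consistent with the simulation result that WEIGHT benefits the dual connection the most.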
目次 Table of Contents
1. Introduction 1-1
1.1 Motivations and Objectives 1-1
1.2 Paradigms of Internet-centric Metacomputing 1-4
1.3 Organization of this Dissertation 1-7
2. The Dual Connection Skeleton 2-1
2.1 Concepts 2-1
2.2 Design Models 2-5
2.3 Summary 2-13
3. Group Collaboration 3-1
3.1 Background 3-1
3.2 Strategy for Concurrency Control 3-3
3.3 Summary 3-8
4. Job Scheduling 4-1
4.1 Background 4-2
4.2 Definitions 4-4
4.3 The Adaptive Highest Response Ratio Next Algorithm 4-5
4.4 Performance Evaluation 4-9
4.5 Summary 4-17
5. Task Scheduling 5-1
5.1 Background 5-2
5.2 Preliminary Assumptions and Definitions 5-4
5.3 The Dynamic Grouping Scheduling Algorithm 5-10
5.4 Some Heuristics for Task Scheduling 5-14
5.4.1 The Heavy Node First Algorithm 5-14
5.4.2 The Critical Path Method Algorithm 5-14
5.4.3 The Weight Length Algorithm 5-15
5.4.4 The Dynamic Level Scheduling Algorithm 5-15
5.4.5 The Dynamic Priority Scheduling Algorithm 5-16
5.5 Performance Evaluation 5-17
5.6 Summary 5-22
6. Fault Tolerance 6-1
6.1 Fault-Tolerant Mechanisms 6-1
6.2 Selection Policies for Dual Connection 6-2
6.3 Simulation 6-5
6.4 A Motion-correction Application 6-7
6.5 Summary 6-9
7. Conclusions and Future Work 7-1
References Reference-1
References

[1] L. Smarr and C. E. Catlett, “Metacomputing,” Comm. of the ACM, Vol. 35, No. 6, pp. 44-52, June 1992.
[2] A. D. Alexandrov, M. Ibel, K. E. Schauser, and C. J. Scheiman, “SuperWeb: Research Issues in Java-Based Global Computing,” Concurrency: Practice and Experience, Vol. 9, No. 6, pp.535-553, June 1997.
[3] A. Baratloo, M. Karaul, Z. M. Kedem, and P. Wyckoff, “Charlotte: Metacomputing on the Web,” Proc. of the 9th International Conference on Parallel and Distributed Computing Systems, pp. 181-188, September 1996.
[4] J. Gosling, B. Joy and G. Steele, The Java Language Specification, Addison-Wesley, Reading, MA, 1996.
[5] P. Ciancarini, R. Tolksdorf, F. Vitali, D. Rossi, and A. Knoche, “Coordinating Multiagent Applications on the WWW: A Reference Architecture,” IEEE Trans. Software Eng., Vol. 24, No. 5, pp. 362-375, May 1998.
[6] V. Anupam and C. Bajaj, “Shastra: An Architecture for Development of Collaborative Applications,” International Journal of Intelligent and Cooperative Information Systems, Vol. 3, No. 2, pp. 155-166, February 1994.
[7] C. E. Chronaki, D. G. Katehakis, X. C. Zabulis, M. Tsiknakis, and S. C. Orphanoudakis, “WebOnColl: Medical Collaboration in Regional Healthcare Networks,” IEEE Trans. Information Technology in Biomedicine, Vol. 1, No. 4, pp. 257-269, December 1997.
[8] J. Bai, Y. Zhang, and B. Dai, “Design and Development of an Interactive Medical Teleconsultation System over the World Wide Web,” IEEE Trans. Information Technology in Biomedicine, Vol. 2, No. 2, pp. 74-79, June 1997.
[9] CCF Project Team, “CCF: A Framework for Collaborative Computing,” IEEE Internet Computing, Vol. 4, No. 1, pp. 16-24, January/February 2000.
[10] C. Lee, C. Chiang, and M. Horng, “Collaborative Web Computing Environment: An Infrastructure for Scientific Computation,” IEEE Internet Computing, Vol. 4, No. 2, pp. 27-35, March/April 2000.
[11] I. Foster and C. Kesselman, “Globus: A Metacomputing Infrastructure Toolkit,” Int. Journal of Supercomputer Applications, Vol. 11, No. 2, pp. 115-128, Summer 1997.
[12] M. Philippsen and M. Zenger, “JavaParty-Transparent Remote Objects in Java,” Concurrency: Practice and Experience, Vol. 9, No. 11, pp. 1225-1242, November 1997.
[13] P. Cappello, B. Christiansen, M. F. Ionescu, M. O. Neary, K. E. Schauser, and D. Wu, “Javelin: Internet-Based Parallel Computing Using Java,” Concurrency: Practice and Experience, Vol. 9, No. 11, pp. 1139-1160, November 1997.
[14] P. Gray and V. S. Sunderam, “IceT: Distributed Computing and Java,” Concurrency: Practice and Experience, Vol. 9, No. 11, pp. 1161-1167, November 1997.
[15] Z. Chen, K. Maly, P. Mehrotra, P. Vangala, and M. Zubair, “Web-based Framework for Distributed Computing,” Concurrency: Practice and Experience, Vol. 9, No. 11, pp. 1175-1180, November 1997.
[16] W. Yu and A. Cox, “Java/DSM: A Platform for Heterogeneous Computing,” Concurrency: Practice and Experience, Vol. 9, No. 11, pp. 1213-1224, November 1997.
[17] R. Buyya, High Performance Cluster Computing, Prentice Hall, New Jersey, 1999.
[18] B. Wims and C. Xu, “Traveler: A Mobile Agent Infrastructure for Wide Area Parallel Computing,” Tech. Report, WSU, January 1999.
[19] J. Waldo, “Remote Procedure Calls and Java Remote Method Invocation,” IEEE Concurrency, Vol. 6, No. 3, pp. 5-7, July-September 1998.
[20] E. Evans and D. Rogers, “Using Java Applets and CORBA for Multi-User Distributed Applications,” IEEE Internet Computing, Vol. 1, No. 3, pp. 43-55, May/June 1997.
[21] A. Wollrath, J. Waldo, and R. Riggs, “Java-Centric Distributed Computing,” IEEE Micro, Vol. 17, No. 3, pp. 44-53, May/June 1997.
[22] N. M. Karnik and A. R. Tripathi, “Design Issues in Mobile-Agent Programming Systems,” IEEE Concurrency, Vol. 6, No. 3, pp. 52-61, July-September 1998.
[23] C. J. Beckmann, D. D. McManus, and G. Cybenko, “Horizons in Scientific and Distributed Computing,” Computing in Science and Engineering, Vol. 1, No. 1, pp. 23-30, January/February, 1999.
[24] G. Cabri, L. Leonardi, and F. Zambonelli, “Mobile-Agent Coordination Models for Internet Applications,” Computer, Vol. 33, No. 2, pp. 82-89, February 2000.
[25] T. Sandholm and Q. Huai, “Nomad: Mobile Agent System for an Internet-Based Auction House,” IEEE Internet Computing, Vol. 4, No. 2, pp. 80-86, March/April 2000.
[26] H. Gomaa, Designing Concurrent, Distributed, and Real-Time Applications with UML, Addison Wesley, Tokyo, 2000.
[27] T. L. Casavant and J. G. Kuhl, “A Taxonomy of Scheduling in General-Purpose Distributed Computing Systems,” IEEE Trans. Software Eng., Vol. 14, No. 2, pp. 141-154, February 1988.
[28] W. H. Kohler and K. Steiglitz, “Characterization and Theoretical Comparison of Branch-and-Bound Algorithms for Permutation Problems,” J. ACM, Vol. 21, No. 1, pp. 140-156, January 1974.
[29] H. El-Rewini, T. Lewis, and H. Ali, Task Scheduling in Parallel and Distributed Systems, Prentice Hall, Englewood Cliffs, N. J., 1994.
[30] T. C. Hu, “Parallel Sequencing and Assembly Line Problems,” Oper. Research, Vol. 9, No. 6, pp. 841-848, 1961.
[31] R. Sethi, “Scheduling Graphs on Two Processors,” SIAM J. Computing, Vol. 5, No. 1, pp. 73-82, March 1976.
[32] S. H. Bokhari, “A Shortest Tree Algorithm for Optimal Assignments Across Space and Time in Distributed Processor Systems,” IEEE Trans. Software Eng., Vol. 7, No. 6, pp. 335-341, November 1981.
[33] M. J. Gonzalez, “Deterministic Processor Scheduling,” ACM Computing Surveys, Vol. 9, No. 3, pp. 173-204, September 1977.
[34] C. Larman, Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design, Prentice Hall, New Jersey, 1999.
[35] C. Lee, C. C. Hsu, S. B. Yai, W. C. Huang and W. C. Lin, “Distributed Robust Image Mosaics,” Proc. of the 14th International Conference on Pattern Recognition, Vol. 2, pp. 1213-1215, August 1998.
[36] D. L. Eager, E. D. Lazowska, and J. Zahorjan, “Adaptive Load Sharing in Homogeneous Distributed Systems,” IEEE Trans. Software Eng., Vol. SE-12, No. 5, pp. 662-675, May 1986.
[37] S. Dandamudi, “Performance Impact of Scheduling Discipline on Adaptive Load Sharing in Homogeneous Distributed Systems,” Proc. of the International Conference on Distributed Computing Systems, pp. 484-492, 1995.
[38] N. G. Shivaratri, P. Krueger, and M. Singhal, “Load Distributing for Locally Distributed Systems,” Computer, Vol. 25, No. 12, pp. 33-44, December 1992.
[39] W. Stallings, Operating Systems Internals and Design Principles, Prentice-Hall International, 1998.
[40] T. L. Adam, K. Chandy, and J. Dickson, “A Comparison of List Scheduling for Parallel Processing Systems,” Comm. ACM, Vol. 17, No. 12, pp. 685-690, December 1974.
[41] T. Yang and A. Gerasoulis, “List Scheduling with and without Communication Delays,” Parallel Computing, Vol. 19, No. 12, pp. 1321-1344, December 1993.
[42] H. El-Rewini and T. Lewis, “Scheduling Parallel Programs onto Arbitrary Target Machines,” J. Parallel and Distributed Computing, Vol. 9, No. 2, pp. 138-153, June 1990.
[43] B. Shirazi, M. Wang, and G. Pathak, “Analysis and Evaluation of Heuristic Methods for Static Scheduling,” J. Parallel and Distributed Computing, Vol. 10, No. 3, pp. 222-232, March 1990.
[44] M. Y. Wu and D. D. Gajski, “Hypertool: A Programming Aid for Message-Passing Systems,” IEEE Trans. Parallel and Distributed Systems, Vol. 1, No. 3, pp. 330-343, July 1990.
[45] J. Y. Colin and P. Chretienne, “C.P.M. Scheduling with Small Communication Delays and Task Duplication,” Oper. Research, Vol. 39, No. 4, pp. 680-684, July 1991.
[46] E. S. H. Hou, N. Ansari, and H. Ren, “A Genetic Algorithm for Multiprocessor Scheduling,” IEEE Trans. Parallel and Distributed Systems, Vol. 5, No. 2, pp. 113-120, February 1994.
[47] T. Yang and A. Gerasoulis, “DSC: Scheduling Parallel Tasks on an Unbounded Number of Processors,” IEEE Trans. Parallel and Distributed Systems, Vol. 5, No. 9, pp. 951-967, September 1994.
[48] M. Al-Mouhamed and A. Al-Mouhamed, “Performance Evaluation of Scheduling Precedence-Constrained Computations on Message-Passing Systems,” IEEE Trans. Parallel and Distributed Systems, Vol. 5, No. 12, pp. 1317-1322, December 1994.
[49] H. El-Rewini, H. Ali, and T. Lewis, “Task Scheduling in Multiprocessing Systems,” Computer, Vol. 28, No. 12, pp. 27-37, December 1995.
[50] M. A. Palis, J. Liou, and D. S. L. Wei, “Task Clustering and Scheduling for Distributed Memory Parallel Architectures,” IEEE Trans. Parallel and Distributed Systems, Vol. 7, No. 1, pp. 46-55, January 1996.
[51] Y. K. Kwok and I. Ahmad, “Dynamic Critical-Path Scheduling: An Effective Technique for Allocating Task Graphs to Multiprocessors,” IEEE Trans. Parallel and Distributed Systems, Vol. 7, No. 5, pp. 506-521, May 1996.
[52] G. C. Sih and E. A. Lee, “A Compile-Time Scheduling Heuristic for Interconnection-Constrained Heterogeneous Processor Architectures,” IEEE Trans. Parallel and Distributed Systems, Vol. 4, No. 2, pp. 175-187, February 1993.
[53] I. Ahmad, M. K. Dhodhi, and R. Ul-Mustafa, “DPS: Dynamic Priority Scheduling Heuristic for Heterogeneous Computing Systems,” IEE Proc-Comput. Digit. Tech., Vol. 145, No. 6, pp. 411-418, November 1998.
[54] J. A. Mariani and T. Rodden, “Cooperative Information Sharing: Developing a Shared Object Service,” The Computer Journal, Vol. 39, No. 6, pp. 455-470, June 1996.
[55] C. Ellis, S. Gibbs, and G. Rein, “Groupware: Some Issues and Experiences,” Comm. of the ACM, Vol. 34, No. 1, pp. 38-58, January 1991.
[56] S. Greenberg and D. Marwood, “Real Time Groupware as a distributed System: Concurrency Control and its Effect on the Interface,” Proceedings of the Conference on Computer Supported Cooperative Work, pp. 207-217, 1994.
[57] M. Raynal and M. Singhal, “Logical Time: Capturing Causality in Distributed Systems,” Computer, Vol. 29, No. 2, pp. 49-56, February 1996.
[58] C. Fidge, “Fundamentals of Distributed System Observation,” IEEE Software, Vol. 13, No. 6, pp. 77-83, November 1996.

Fulltext
This electronic full text is licensed only for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please observe the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid infringement.
Thesis access permission: available on campus immediately; off-campus access withheld for one year.
Available:
Campus: available
Off-campus: available


Printed copies
Availability information for printed theses is relatively complete from academic year 102 (2013) onward. To inquire about the availability of printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available
