Title page for thesis/dissertation etd-0612116-093823
論文名稱
Title
資料科學產業職能與職位分類之研究
The Study of Competence and Job Classification in Data Science Industry
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
61
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2016-06-29
繳交日期
Date of Submission
2016-07-14
關鍵字
Keywords
主題模型、職位分類、資料科學、LDA、職能
LDA, Topic Model, Job Classification, Competency, Data Science
統計
Statistics
本論文已被瀏覽 5774 次,被下載 28
This thesis/dissertation has been viewed 5774 times and downloaded 28 times.
中文摘要
近年來隨著大數據的興起,以及資訊科技的持續進步,資料科學逐漸在產官學各界中成為顯學,而人才市場中對於資料科學家等資料科學人才的需求亦呈現爆炸式的成長。
然而資料科學人才的培育難度極高,導致產業中出現人才供給不足的問題。對於上述問題的有效解決方式之一,即為將資料科學工作進行更專業的細分,以此降低各職位人才培訓的難度。
本研究利用資料科學職位的招募資訊中所提及的職能,以LDA分析技術建立主題模型,分析資料科學產業中的職位類型及各類職位所須具備的職能。並透過上述研究結果針對不同知識背景的潛在工作者提出成為資料科學產業人才的建議,並供潛在工作者及培訓機構作為進修及訓練的參考。
最終,本研究結果將資料科學產業的職位共分為七類,包括:商業應用分析師、資料架構師、機器學習專家、資料工程師、資料專案經理、大數據工程師及資料分析師等。
Abstract
In recent years, with the rise of big data and continued advances in information technology, data science has become a prominent field across industry, government, and academia. At the same time, demand in the talent market for data scientists and other data science professionals has grown explosively.
However, data science talent is extremely difficult to develop, which has led to a shortage of talent supply in the industry. One effective way to address this problem is to divide data science work into more specialized roles, thereby reducing the difficulty of training talent for each position.
This study takes the competencies mentioned in recruitment postings for data science positions and builds a topic model with LDA to analyze the types of jobs in the data science industry and the competencies each type requires. Based on these results, it offers suggestions to potential workers from different knowledge backgrounds on entering the data science industry, which may also serve as a reference for potential workers and training institutions.
The study ultimately divides jobs in the data science industry into seven categories: “Business Analyst”, “Data Architect”, “Machine Learning Expert”, “Data Engineer”, “Data Project Manager”, “Big Data Engineer”, and “Data Analyst”.
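The approach described in the abstract, fitting an LDA topic model to the competencies listed in job postings and reading each topic as a job category, can be sketched roughly as follows. This is a minimal illustration only: the record does not state which software, hyperparameters, or preprocessing the thesis used, so the function `fit_lda`, the collapsed Gibbs sampler, the values of `alpha` and `beta`, and the toy `postings` data are all assumptions made for this example.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Hypothetical helper: collapsed Gibbs sampling for LDA on tokenised docs."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


# Toy data (invented): each posting is the list of competencies it mentions.
postings = [
    ["sql", "excel", "reporting", "visualization"],
    ["hadoop", "spark", "etl", "kafka"],
    ["sql", "reporting", "visualization", "statistics"],
    ["spark", "hadoop", "etl", "pipeline"],
]

doc_topic, topic_word = fit_lda(postings, num_topics=2)

# Assign each posting to the topic that covers most of its competencies.
job_class = [max(range(2), key=lambda t: counts[t]) for counts in doc_topic]

for t in range(2):
    top = [w for w, c in topic_word[t].most_common(3) if c > 0]
    print(f"topic {t}: {top}")
print("job class per posting:", job_class)
```

In this sketch each topic's most frequent competencies stand in for the profile of a job category, and a posting is classified by its dominant topic. A real analysis would need a far larger corpus and a principled choice of the number of topics.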
目次 Table of Contents
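[Thesis Approval Form i]
[Acknowledgments ii]
[Chinese Abstract iii]
[English Abstract iv]
[Table of Contents v]
[List of Figures vi]
[List of Tables vii]
[Chapter 1 Introduction 1]
[Section 1 Research Background and Motivation 1]
[Section 2 Literature Review 2]
[Section 3 Research Objectives 2]
[Section 4 Research Process 3]
[Chapter 2 Literature Review 4]
[Section 1 Data Science 4]
[Section 2 Competency 11]
[Section 3 Topic Models and LDA 13]
[Chapter 3 Research Methods 18]
[Section 1 Data Collection 18]
[Section 2 Data Cleaning 19]
[Section 3 Model Building 22]
[Chapter 4 Research Results 25]
[Section 1 Competencies in the Data Science Industry 25]
[Section 2 Classification of Data Science Jobs 29]
[Chapter 5 Conclusions and Recommendations 42]
[Section 1 Research Conclusions 42]
[Section 2 Research Contributions and Recommendations 47]
[Section 3 Research Limitations 48]
[References 50]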
參考文獻 References
I. English-language literature
Ariker, M., McGuire, T., & Perry, J. (2013). Five Roles You Need on Your Big Data Team. Harvard Business Review.
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993-1022.
Cira, D. J., & Benjamin, E. R. (1998). Competency-based pay: A concept in evolution. Compensation & Benefits Review, 30(5), 21-28.
Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L., & Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. In Advances in neural information processing systems (pp. 288-296).
Cleveland, W. S. (2001). Data science: an action plan for expanding the technical areas of the field of statistics. International statistical review, 69(1), 21-26.
Dubois, D. D. (1993). Competency-Based Performance Improvement: A Strategy for Organizational Change. HRD Press, Inc., 22 Amherst Road, Amherst, MA 01002.
Ennis, M. R. (2008). Competency models: a review of the literature and the role of the employment and training administration (ETA) (pp. 1-25). Office of Policy Development and Research, Employment and Training Administration, US Department of Labor.
Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996). From data mining to knowledge discovery in databases. AI magazine, 17(3), 37.
Gantz, J., & Reinsel, D. (2012). The digital universe in 2020: Big data, bigger digital shadows, and biggest growth in the far east. IDC iView: IDC Analyze the future, 2007, 1-16.
Lucia, A. D., & Lepsinger, R. (1999). Art & Science of Competency Models. San Francisco, CA: Jossey-Bass.
Mansfield, R. S. (1996). Building competency models: Approaches for HR professionals. Human Resource Management (1986-1998), 35(1), 7.
McClelland, D. C. (1973). Testing for competence rather than for "intelligence." American psychologist, 28(1), 1.
Mirabile, R. J. (1997). Everything you wanted to know about competency modeling. Training & Development, 51(8), 73-78.
Naur, P. (1974). Concise survey of computer methods.
Davenport, T. H., & Patil, D. J. (2012). Data Scientist: The Sexiest Job of the 21st Century. Harvard Business Review.
Rothwell, W., & Wellins, R. (2004). Putting new competencies to work for you. T AND D, 58(5), 94-101.
Shearer, C. (2000). The CRISP-DM model: the new blueprint for data mining. Journal of data warehousing, 5(4), 13-22.
Tukey, J. W. (1962). The future of data analysis. The Annals of Mathematical Statistics, 33(1), 1-67.
Turner, V., Gantz, J. F., Reinsel, D., & Minton, S. (2014). The digital universe of opportunities: rich data and the increasing value of the internet of things. IDC Analyze the Future.
Wallach, H. M., Mimno, D. M., & McCallum, A. (2009). Rethinking LDA: Why priors matter. In Advances in neural information processing systems (pp. 1973-1981).

II. Online sources
靳志輝 (Jin, Z.). (2013). LDA數學八卦 [LDA Math Gossip].
Blitzstein, J., Pfister, H., & Kaynig-Fittkau, V. (2015). CS109 Data Science. Harvard University. http://cs109.github.io/2014/
Blumenstock, J. (2013). INFX 573: Introduction to Data Science. University of Washington School of Information. http://www.jblumenstock.com/teaching/course=infx573
DataCamp. (2015). The Data Science Industry: Who Does What. https://www.datacamp.com/community/tutorials/data-science-industry-infographic
Drew Conway. (2010). The Data Science Venn Diagram. http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Data Science Association. Data Science Code of Professional Conduct. http://www.datascienceassn.org/code-of-conduct.html
Hayes, B. (2015). Statistics: Is This Big Data’s Biggest Hurdle. http://businessoverbroadway.com
Journal of Data Science. About JDS. http://www.jds-online.com/about
KDnuggets. (2014). What Main Methodology Are You Using For Your Analytics, Data Mining, Or Data Science Projects? http://www.kdnuggets.com/polls/2014/analytics-data-mining-data-science-methodology.html
Ledell, E. (2015). Intro to Data Science. H2O World. http://www.slideshare.net/0xdata/h2o-world-intro-to-data-science-with-erin-ledell
Lee, C. H. (2014). 3 Data Careers Decoded and What It Means for You. http://blog.udacity.com/2014/12/data-analyst-vs-data-scientist-vs-data-engineer.html
RJMetrics. (2015). The State of Data Science.
Saraswat, M. (2015). Job Comparison – Data Scientist vs Data Engineer vs Statistician. http://www.analyticsvidhya.com/blog/2015/10/job-comparison-data-scientist-data-engineer-statistician/
Wikibooks. Data Science: An Introduction/A Mash-up of Disciplines. https://en.wikibooks.org/wiki/Data_Science:_An_Introduction/A_Mash-up_of_Disciplines
電子全文 Fulltext
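This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.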
論文使用權限 Thesis access permission: 自定論文開放時間 user-defined release date
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
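Public-availability information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To look up public-availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services (圖資處). We apologize for any inconvenience.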
開放時間 Available: 已公開 available
