論文使用權限 Thesis access permission: 自定論文開放時間 user-defined
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available
論文名稱 Title |
基於食材食譜的網路和矩陣分解方法發現料理風格 Cuisine Discovery based on Recipe-Ingredient Network and Matrix Factorization |
||
系所名稱 Department |
|||
畢業學年期 Year, semester |
語文別 Language |
||
學位類別 Degree |
頁數 Number of pages |
43 |
|
研究生 Author |
|||
指導教授 Advisor |
|||
召集委員 Convenor |
|||
口試委員 Advisory Committee |
|||
口試日期 Date of Exam |
2017-09-05 |
繳交日期 Date of Submission |
2017-09-13 |
關鍵字 Keywords |
矩陣分解、分群、推薦、網路、文字探勘 clustering, network, recommendation, matrix factorization, text mining |
||
統計 Statistics |
本論文已被瀏覽 5983 次,被下載 140 次 The thesis/dissertation has been browsed 5983 times and downloaded 140 times. |
中文摘要 |
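The main purpose of this research is to discover cuisines from recipes, ingredients, and the methods used to prepare dishes. Cuisines can be distinguished by the culture a dish comes from, its ingredients, and its processing steps or actions. We therefore analyze the data in three ways: nsNMF, nsNMF with processing sequence constraint, and network analysis. In text mining, nsNMF is mostly used for topic modeling, which analyzes the topics that documents and words can form; in this research it is instead applied to ingredients and recipes to find potential latent cuisines. Another dimension of cuisine, the processing action, is added to the nsNMF algorithm, yielding the proposed nsNMF with processing sequence constraint. We use a network analysis algorithm, greedy-community, to handle the relationships among ingredients; this method detects how many clusters the ingredients can be divided into. Finally, we compare the results of matrix factorization and network analysis and identify the differences between them. |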
Abstract |
This research proposes an approach for discovering cuisines, that is, types of dishes, from recipes, ingredients, and the methods used to prepare dishes. We believe that cuisines can be distinguished by the culture, the ingredients, and the processing actions of a dish. We therefore applied three methods to recipe data: nsNMF, regularized nsNMF, and network analysis. nsNMF is mostly used for topic modeling in text mining, but we used it for cuisine modeling through the correlations between recipes and ingredients. In addition, another dimension of cuisine, the processing action, was introduced into the model to produce the nsNMF with constraint. Network analysis was used to process the relationships among ingredients: we employed the greedy-community algorithm to detect how many clusters there are among the ingredients. Finally, we compared the results of the matrix factorization and the network analysis and examined the differences between them. |
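The factorization step named in the abstract can be illustrated with a small sketch. It is not the thesis code: it assumes the smoothing matrix S = (1 - θ)I + (θ/k)11ᵀ from the nsNMF formulation of Pascual-Montano et al. (2006), and for brevity it swaps in Frobenius-norm multiplicative updates rather than the divergence-based updates of the original method. The toy recipe-ingredient matrix `v`, the function names, and the parameter defaults are all hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def smoothing_matrix(k, theta):
    """S = (1 - theta) * I + (theta / k) * ones(k, k); theta = 0 gives plain NMF."""
    return [[(1.0 - theta) * (i == j) + theta / k for j in range(k)] for i in range(k)]

def nsnmf(v, k, theta=0.5, iters=200, seed=0, eps=1e-9):
    """Approximate v (recipes x ingredients) by W S H.

    Assumption: Frobenius-norm multiplicative updates, a simplification
    chosen for this sketch, not necessarily what the thesis used.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    s = smoothing_matrix(k, theta)
    for _ in range(iters):
        # Update H while the smoothed basis W S is held fixed.
        ws = matmul(w, s)
        ws_t = transpose(ws)
        num = matmul(ws_t, v)
        den = matmul(matmul(ws_t, ws), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # Update W while the smoothed encoding S H is held fixed.
        sh = matmul(s, h)
        sh_t = transpose(sh)
        num = matmul(v, sh_t)
        den = matmul(matmul(w, sh), sh_t)
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, s, h

def frobenius_error(v, w, s, h):
    """Frobenius norm of v - W S H."""
    approx = matmul(matmul(w, s), h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra)) ** 0.5

# Hypothetical toy data: rows are recipes, columns are ingredients,
# with two obvious ingredient groups standing in for two "cuisines".
v = [[1.0, 1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0],
     [0.0, 0.0, 1.0, 1.0]]
w, s, h = nsnmf(v, k=2)
```

In this sketch each row of `h` plays the role of a latent cuisine described by ingredient weights, and each row of `w` gives a recipe's loading on those cuisines. Raising `theta` toward 1 increases the smoothing applied between the two factors, which in the nsNMF formulation is what pushes them toward sparser solutions.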
目次 Table of Contents |
Thesis Approval Form i
Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
Contents v
Tables vii
Figures viii
1 Introduction 1
2 Background and Related Works 3
3 Methodology 7
3.1 The CoreNLP 7
3.2 The TF-IDF 8
3.3 The NMF 9
3.4 The nsNMF 10
3.5 The Longest Common Subsequence (LCS) 11
3.6 The nsNMF with Action Similarity Constraint 12
3.7 The nsNMF with Processing Sequence Constraint 13
4 Experimental Result 15
4.1 Raw Data 15
4.2 The Recipe-Ingredient Matrix 15
4.3 TF-IDF 16
4.4 Mutual Information 17
4.5 Cuisine Modeling 18
4.6 The nsNMF with Action Similarity Constraint 19
4.7 The nsNMF with Processing Sequence Constraint 21
4.8 Recipe Recommendation 25
4.9 Network Analysis 26
5 Conclusion 29
6 Reference 31 |
電子全文 Fulltext |
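This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law. |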
紙本論文 Printed copies |
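Public information on printed theses is relatively complete from academic year 102 onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. 開放時間 Available: 已公開 available |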