Technology Assessment via Web Metrics, Text Mining and Temporal Trend Detection

Ilan Sasson, Gilad Ravid, Nava Pliskin
Industrial Engineering and Management, Ben-Gurion University of the Negev

In today’s hyper-competitive business environment, in which information technology (IT) innovations occur at increasing speed and with shorter life cycles, companies engage in technology assessment (TA) prior to IT investments. Assessing a specific technology, however, is difficult for decision makers, because humans cannot manually process the abundance of data available on the Internet in the form of unstructured text. This research responds to the TA challenge by modeling a decision support tool for knowledge discovery from a diverse corpus of unstructured, date-tagged textual data available on the Internet about a given technology.

The decision support tool modeled in this research automatically generates a knowledge map from which TA propositions can be derived, leveraging a synergy of several well-established research fields. Initially, a conventional concept map (co-occurrence network) is generated via co-word analysis, drawing upon information extraction (IE) via text mining (TM) based on natural language processing (NLP), to yield named-entity concepts. Then, the concept map is improved in two novel ways by assessing the contextual and temporal distance between linked concepts on the map. First, relatedness proximity is measured through a series of web-based bibliometric indicators, amplifying silent information and reducing noisy information. Second, pair-wise temporal analysis is conducted via emerging trend detection, applying the Vector Space Model (VSM), the Cosine Similarity Measure (CSM) and quantitative temporal operators, to distinguish temporally between co-occurring hot concepts and co-occurring emerging concepts. Combining relatedness proximity measurement with pair-wise temporal analysis yields a knowledge map that is more accurate and richer than the initial concept map.
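As a rough illustration of two of the building blocks named above, the following Python sketch counts concept co-occurrences across documents and compares the temporal frequency profiles of two concepts with cosine similarity. It is a minimal sketch under stated assumptions, not the tool described in this work: the function names, the toy `dated_docs` corpus and the per-year binning are all hypothetical, and the web-based indicators and quantitative temporal operators are not modeled.

```python
from collections import Counter
from itertools import combinations
import math


def cooccurrence_counts(docs):
    """Count how often each unordered concept pair shares a document."""
    counts = Counter()
    for concepts in docs:
        # sorted(set(...)) gives each pair one canonical key per document
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts


def temporal_vector(dated_docs, concept, periods):
    """Number of documents mentioning `concept` in each period."""
    return [
        sum(1 for period, concepts in dated_docs
            if period == p and concept in concepts)
        for p in periods
    ]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus: (year, concepts extracted from one document)
dated_docs = [
    (2008, ["cloud computing", "virtualization"]),
    (2009, ["cloud computing", "virtualization", "saas"]),
    (2010, ["cloud computing", "saas"]),
    (2010, ["cloud computing", "saas"]),
]
periods = [2008, 2009, 2010]

pairs = cooccurrence_counts(concepts for _, concepts in dated_docs)
cloud = temporal_vector(dated_docs, "cloud computing", periods)
saas = temporal_vector(dated_docs, "saas", periods)
virt = temporal_vector(dated_docs, "virtualization", periods)

sim_cloud_saas = cosine_similarity(cloud, saas)
sim_cloud_virt = cosine_similarity(cloud, virt)
```

In this toy data, "cloud computing" and "saas" rise together over time, so their temporal profiles are more similar than those of "cloud computing" and "virtualization", even though both pairs co-occur.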
The proposed TA framework is demonstrated and validated in this work for five fundamental IT domains. The results show that the computed relatedness proximity measurement is highly correlated with experts' subjective ratings (n = 136): r = 0.91 to 0.98. High inter-rater reliability was also found, with an Intraclass Correlation Coefficient (ICC) of 0.92 to 0.94. In addition, when the detection of co-occurring hot concepts by pair-wise temporal analysis was compared with ratings by the same experts, the Fleiss' kappa reliability of agreement was above 0.72 for each of the five domains, and the average predictive validity was above 85%. Finally, TA propositions derived from the knowledge maps for the five domains were compatible with the respective five TA reports by a leading IT consulting firm, complemented with scholarly assessment studies, especially for emerging technologies.
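For readers unfamiliar with the agreement statistic reported above, the following Python sketch computes Fleiss' kappa from its standard definition. It is an illustration only: the function name and the toy rating tables are hypothetical, and no data from this study is reproduced. Each row is one rated item, each column one category, and each cell the number of raters who chose that category; every item is assumed to have the same number of raters.

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a table of per-item category counts."""
    n_items = len(table)
    n_raters = sum(table[0])  # assumed identical for every row
    n_categories = len(table[0])

    # Share of all assignments that fell into each category
    p_cat = [
        sum(row[j] for row in table) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item, then averaged over items
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_bar = sum(p_items) / n_items
    # Agreement expected by chance
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical example: 3 raters, 2 categories (e.g. "hot" vs. "not hot")
perfect = [[3, 0], [0, 3], [3, 0], [0, 3]]
split = [[2, 1], [1, 2]]

kappa_perfect = fleiss_kappa(perfect)
kappa_split = fleiss_kappa(split)
```

A value of 1 indicates complete agreement among raters, while values at or below 0 indicate agreement no better than chance.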







