Abstract: Software maintenance is challenging because developers need to comprehend a large codebase in a short time to fix bugs, improve the performance of existing features, or upgrade existing applications. More often than not, proper documentation (requirements, design, code), version change traces, and knowledge transfer from earlier developers are unavailable or inadequate. Hence, the developer is forced into the repetitive and menial task of running the application with various test cases to identify execution patterns, or of painstakingly examining the source code to understand the design and comprehend the program. This process is monotonous, error-prone, time-intensive, and often unmanageable for developers, leading to low productivity and a considerable drop in the quality of output. Surveys by Ogheneovo and Hajrahmi et al. have shown that, to ensure software quality and operability, maintenance cost has increased significantly over the last three decades, to as much as 70% to 80% of the total software life-cycle cost. To reduce this substantial overhead, we intend to incorporate an appropriate degree of smartness into the process of knowledge transfer to help developers comprehend a program effectively and to increase the efficiency of software maintenance tasks.
The research so far attempts to analyze and comprehend only specific aspects of an application. Researchers have proposed approaches to increase code readability, locate relevant code fragments, extract design models, detect bugs, analyze the nondeterministic behavior of parallel programs, predict algorithmic patterns, and the like. No existing framework provides a near-complete, multifaceted understanding of a program from various perspectives. To address this challenge, we propose a search framework called SmartKT (Smart Knowledge Transfer) that assists program comprehension by responding to queries on various aspects of an application, including design, implementation, run-time behavior, bug detection, performance profiling, and the like. SmartKT may be compared to a key person on a software project who is aware of every aspect of the project and can answer any query. We integrate SmartKT with the Eclipse IDE for ease of use.
The SmartKT query engine sits on a semantic graph built from various knowledge sources. As design and code documentation is mostly noisy and not machine readable, we explore alternative sources such as source code, run-time traces, source code annotations, and program comments to extract knowledge and concepts. From the source code, using static instrumentation, we create a custom syntax tree with information on type, scope, and definition-use for all program symbols. We implement the DCUBE-ML analyzer, which discovers design models from applications by applying machine learning to run-time traces collected through dynamic instrumentation. We are working on a metadata analyzer called COMMENT-MINE, which extracts program and problem domain concepts from user comments and correlates them to program symbols by using NLP techniques. Finally, we intend to collate these concepts to build a semantic graph using the Resource Description Framework and build an intelligent query system over it.
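As a rough illustration of that final step, the sketch below collates facts from the three kinds of sources (static analysis, run-time traces, comments) into subject-predicate-object triples and answers simple pattern queries over them. It is a minimal stand-in under stated assumptions, not the SmartKT implementation: the `TripleStore` class, the `describe` helper, the predicate names (`hasType`, `definedIn`, `calledBy`, `relatesToConcept`), and the symbols `parse_config` and `load_plugins` are all hypothetical, and a plain in-memory set is used in place of a real RDF store.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """Tiny in-memory stand-in for an RDF graph (illustrative only)."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return all triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


def describe(store: TripleStore, symbol: str) -> Dict[str, List[str]]:
    """Group everything known about one program symbol by predicate."""
    facts: Dict[str, List[str]] = {}
    for _, predicate, obj in store.match(subject=symbol):
        facts.setdefault(predicate, []).append(obj)
    return facts


graph = TripleStore()
# Hypothetical facts of the kind a static analyzer might emit.
graph.add("parse_config", "hasType", "function")
graph.add("parse_config", "definedIn", "config.c")
# Hypothetical facts of the kind a run-time trace analyzer might emit.
graph.add("parse_config", "calledBy", "main")
graph.add("load_plugins", "calledBy", "main")
# Hypothetical fact of the kind a comment analyzer might emit.
graph.add("parse_config", "relatesToConcept", "configuration file")

if __name__ == "__main__":
    print(describe(graph, "parse_config"))
    print(graph.match(predicate="calledBy", obj="main"))
```

A query such as "what is known about `parse_config`?" then reduces to a pattern match with the subject fixed, while "what does `main` call?" fixes the predicate and object instead. A real RDF store would add namespaces, typed literals, and a query language on top of the same triple model.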