Design Pattern Detection Using Machine Learning
DOI:
https://doi.org/10.64751/ijdim.2026.v5.n2.pp145-150
Keywords:
Design Pattern Detection; Random Forest; Machine Learning; Software Architecture; Static Code Analysis; Multi-Language Parsing; Flask Web Application; Feature Extraction; Object-Oriented Design; Software Engineering; Explainable AI; Code Classification
Abstract
The identification of software design patterns within source codebases is a task of substantial practical importance for both academic assessment and industrial code comprehension, yet it remains overwhelmingly dependent on manual expert inspection, a process that is slow, subjective, and resistant to scaling across large or multi-language repositories. This paper presents an automated, machine learning-driven Design Pattern Detection System that classifies four canonical Gang-of-Four patterns (Singleton, Factory, Observer, and Strategy) directly from static source code characteristics, without requiring rigid rule templates or language-specific toolchains. The system extracts a nine-dimensional structural feature vector per project, capturing class count, method count, inheritance depth, static instance presence, object creation frequency, interface count, method call diversity, static method count, and delegation indicators. A Random Forest ensemble of 100 decision trees, trained on a manually curated dataset with an 80/20 stratified split and Gini impurity as the splitting criterion, achieves prediction confidence exceeding 85% across all four pattern classes on held-out test samples. The trained classifier is serialised via Joblib and loaded into a Flask RESTful web application, providing five core capabilities: (1) single-file and ZIP project upload with automated multi-language routing to language-specific parsers for Python, Java, JavaScript, and C/C++; (2) dominant pattern prediction with per-class probability distribution; (3) human-readable explanation generation linking feature values to architectural reasoning; (4) pairwise project comparison computing common patterns, unique patterns, and a Jaccard-derived similarity score; and (5) persistent SQLite-backed analysis history with dashboard analytics. Inference latency remains below 50 milliseconds per project on standard development hardware, confirming suitability for interactive academic use.
The proposed system is the first published design pattern detector to simultaneously address multi-language support, ML-based adaptive classification, human-readable explanation, pairwise project comparison, and historical analytics within a unified web platform, directly resolving six research gaps identified through systematic literature review.
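The pairwise project comparison mentioned in the abstract can be sketched as follows, assuming the "Jaccard-derived" score is the plain Jaccard index over the sets of patterns detected in each project. The function name `compare_projects`, the result keys, and the convention of scoring two empty pattern sets as 1.0 are hypothetical choices for illustration; the abstract does not state how the system derives its score.

```python
def compare_projects(patterns_a, patterns_b):
    """Compare two projects by the sets of design patterns detected in each.

    Hypothetical helper; assumes similarity = |A ∩ B| / |A ∪ B|.
    """
    a, b = set(patterns_a), set(patterns_b)
    common = a & b
    union = a | b
    # Assumed convention: two projects with no detected patterns are identical.
    similarity = len(common) / len(union) if union else 1.0
    return {
        "common": sorted(common),
        "unique_a": sorted(a - b),
        "unique_b": sorted(b - a),
        "similarity": similarity,
    }


result = compare_projects(["Singleton", "Factory"], ["Factory", "Observer"])
print(result["common"], round(result["similarity"], 3))
```

Here the two projects share one pattern out of three distinct patterns overall, so the similarity is one third.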
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.






