Context-Aware Decision Intelligence for Defence Clearance Using Ensemble Learning

I.V Prakash; Sameer; Yousaf

doi:10.64751/ijdim.2026.v5.n2(3).1062

Keywords:

Security Clearance Adjudication, Decision Support Systems, Natural Language Processing (NLP), Ensemble Learning, Stacked Classifier.

Abstract

The rapid expansion of digital documentation in legal and defence domains has generated a critical need for intelligent systems capable of extracting meaningful insights from large volumes of unstructured data. Security clearance adjudication processes involve complex case narratives, categorical records, and appeal decisions that require accurate, consistent, and unbiased analysis. Traditional manual approaches are labour-intensive and prone to inconsistency, while standard machine learning techniques often struggle with class imbalance and fail to capture contextual meaning in textual data. This study addresses these limitations by combining natural language processing with data balancing and ensemble learning methods to improve predictive performance in defence clearance decision-making. The research focuses on two binary classification problems: determining whether a decision is favourable (clearance granted rather than denied) and whether an initial decision is upheld or overturned upon appeal. Baseline models, including K-Nearest Neighbors, Logistic Regression, and Multinomial Naive Bayes, were evaluated but showed limitations in handling imbalanced datasets and complex textual patterns. To overcome these challenges, the proposed framework integrates NLP-based preprocessing with the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance, and employs a stacked ensemble model. In this architecture, Random Forest acts as the base learner, while Logistic Regression serves as the meta-classifier. This combination uses the feature learning strengths of Random Forest together with the decision boundary optimization capability of Logistic Regression, resulting in enhanced classification accuracy and balanced predictions across both target classes. Model performance is evaluated using confusion matrices, ROC curves, and accuracy metrics, and the results show that the stacked ensemble approach outperforms the individual baseline models.
Although developed for defence adjudication, the proposed methodology is broadly applicable to other domains such as finance, healthcare, insurance, and regulatory compliance, where unstructured text and imbalanced data influence binary decision-making processes. This work emphasizes the importance of integrating NLP with ensemble learning to build scalable, reliable, and interpretable systems for critical decision support.
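
The class-balancing step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority-class neighbours. The function name `smote_oversample`, the parameter defaults, and the toy feature vectors are assumptions made for illustration; the abstract does not state the authors' features, neighbour count, or implementation.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours.
    Hypothetical helper: a sketch of the SMOTE idea, not the paper's code."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy numeric features standing in for vectorised case text (made-up values).
minority = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [1.1, 1.3]]
majority = [[5.0, 5.0], [5.5, 4.5], [4.8, 5.2], [5.1, 4.9],
            [4.9, 5.3], [5.2, 5.1], [4.7, 4.8], [5.3, 4.6]]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
```

Oversampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic records. In practice a maintained implementation would likely be used, for example `SMOTE` from imbalanced-learn together with scikit-learn's `StackingClassifier` for the Random Forest and Logistic Regression stack; whether the authors used these libraries is not stated in the abstract.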