A Machine Learning-Based Predictive Framework for Patent Infringement Detection: Enhancing Intellectual Property Protection Through Hybrid Ensemble Models
Keywords:
Patent Infringement Prediction, Machine Learning, Hybrid Ensemble Algorithm, Intellectual Property Management, Feature Selection, Data Balancing
Abstract
Patent infringement poses significant risks to innovation and economic growth. Traditional intellectual property (IP) protection methods are often reactive, expensive, and inefficient for large-scale patent management. This study introduces an optimized machine learning framework designed to predict patent infringement proactively. It evaluates Random Forest, Support Vector Machine (SVM), and Logistic Regression classifiers on a curated dataset enriched with patent citations, legal status, and family size. The Synthetic Minority Oversampling Technique (SMOTE) is applied to address class imbalance, and Recursive Feature Elimination (RFE) is used for feature selection. A novel hybrid ensemble model integrating Random Forest and SVM achieves 75% precision, 95% recall, and an F1-score of 84%, outperforming the baseline models. The findings contribute to IP management by offering a scalable predictive framework that helps reduce litigation costs and supports proactive infringement detection.
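As a quick consistency check on the reported metrics, the F1-score can be recomputed as the harmonic mean of precision and recall. The sketch below is illustrative only: the function name `f1_score` and the variable names are assumptions, not taken from the paper, and the inputs are simply the precision and recall values stated in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the hybrid ensemble model.
reported_precision = 0.75
reported_recall = 0.95

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.838"
```

Rounded to the nearest percent this gives 84%, which agrees with the F1-score stated in the abstract. The large gap between recall and precision suggests the model favors catching likely infringements at the cost of more false alarms.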
License
Copyright (c) 2025 International Journal of Sciences: Basic and Applied Research (IJSBAR)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.