A Machine Learning-Based Predictive Framework for Patent Infringement Detection: Enhancing Intellectual Property Protection Through Hybrid Ensemble Models

Authors

  • Kang Jin Gang, Malaysia University of Science and Technology (MUST), Block B, Encorp Strand Garden Office, No. 12, Jalan PJU 5/5, Kota Damansara, 47810 Petaling Jaya, Selangor, Malaysia
  • Ang Ling Weay, Malaysia University of Science and Technology (MUST), Block B, Encorp Strand Garden Office, No. 12, Jalan PJU 5/5, Kota Damansara, 47810 Petaling Jaya, Selangor, Malaysia

Keywords:

Patent Infringement Prediction, Machine Learning, Hybrid Ensemble Algorithm, Intellectual Property Management, Feature Selection, Data Balancing

Abstract

Patent infringement poses significant risks to innovation and economic growth. Traditional intellectual property (IP) protection methods are often reactive, expensive, and inefficient for large-scale patent management. This study introduces an optimized machine learning framework that predicts patent infringement proactively. It evaluates Random Forest, Support Vector Machine (SVM), and Logistic Regression classifiers on a curated dataset enriched with patent citations, legal status, and patent family size. The Synthetic Minority Oversampling Technique (SMOTE) is applied to address class imbalance, and Recursive Feature Elimination (RFE) is used for feature selection. A novel hybrid ensemble model integrating Random Forest and SVM achieves 75% precision, 95% recall, and an F1-score of 84%, outperforming the baseline models. The findings contribute to IP management by offering a scalable predictive framework that can reduce litigation costs and support proactive infringement detection.
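
As a quick reading aid for the reported metrics, the sketch below recomputes the F1-score from the stated precision and recall and derives what 75% precision implies about false alarms. It is an illustration only, not the authors' evaluation code; the function names `f1_score` and `false_positives_per_true_positive` are hypothetical helpers introduced here, and the only inputs taken from the abstract are the two percentages.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def false_positives_per_true_positive(precision: float) -> float:
    """False alarms raised for each correctly flagged case.

    From precision = TP / (TP + FP) it follows that
    FP / TP = (1 - precision) / precision.
    """
    return (1 - precision) / precision


# Figures reported in the abstract for the hybrid ensemble.
precision, recall = 0.75, 0.95

f1 = f1_score(precision, recall)
fp_ratio = false_positives_per_true_positive(precision)

print(f"F1 = {f1:.3f}")                                    # F1 = 0.838
print(f"False positives per true positive = {fp_ratio:.3f}")  # 0.333
```

The recomputed value of about 0.838 rounds to the 84% stated in the abstract, so the three figures are mutually consistent. Read together, they describe a model tuned toward recall: roughly 95 of every 100 actual infringements are flagged, at the cost of about one false alarm for every three correct flags.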

Published

2025-05-05

How to Cite

Kang Jin Gang, & Ang Ling Weay. (2025). A Machine Learning-Based Predictive Framework for Patent Infringement Detection: Enhancing Intellectual Property Protection Through Hybrid Ensemble Models. International Journal of Sciences: Basic and Applied Research (IJSBAR), 76(1), 187–195. Retrieved from https://gssrr.org/index.php/JournalOfBasicAndApplied/article/view/17413
