Model inversion attack
A model inversion attack is a type of adversarial machine learning attack in which an attacker tries to reconstruct or infer sensitive information about a model's training data by analyzing the outputs of the trained model.[1][2] Rather than accessing the underlying dataset directly, the attacker queries the model (usually through an API or prediction interface) and uses patterns in its responses to infer properties of the original inputs.[1] These attacks exploit the fact that machine learning models encode statistical information about their training data in their parameters and outputs, which can unintentionally leak private or proprietary information.[3]
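One simple form of this idea is attribute inference by enumeration: an attacker who knows a victim's non-sensitive features and has observed the model's output tries every candidate value of the hidden attribute and keeps the one that best reproduces that output. The following is a minimal sketch under stated assumptions, not a description of any specific published attack. The model `risk_model`, its coefficients, the feature names, and the helper `infer_sensitive` are all hypothetical and chosen only for illustration.

```python
import math


def risk_model(age, smoker, sensitive_marker):
    """Hypothetical deployed model: confidence that a patient is high-risk.

    The coefficients are invented for this sketch; a real model would be
    learned from training data and exposed through a prediction API.
    """
    z = 0.04 * age + 0.8 * smoker + 1.5 * sensitive_marker - 3.0
    return 1.0 / (1.0 + math.exp(-z))


def infer_sensitive(query_fn, known, observed_confidence, candidates):
    """Return the candidate value whose model output is closest to the observed one."""
    return min(
        candidates,
        key=lambda v: abs(query_fn(sensitive_marker=v, **known) - observed_confidence),
    )


# The victim's full record is private; the attacker never sees sensitive_marker.
victim = {"age": 52, "smoker": 1, "sensitive_marker": 1}
observed = risk_model(**victim)  # the confidence score the attacker observed

# The attacker knows only the non-sensitive features and the observed score.
known = {"age": 52, "smoker": 1}
guess = infer_sensitive(risk_model, known, observed, candidates=[0, 1])
print("inferred sensitive_marker:", guess)
```

The sketch works because each candidate value yields a distinct confidence score here. When several candidates produce similar outputs, or when the attacker's knowledge of the other features is noisy, the inference becomes correspondingly less reliable.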
Depending on the level of access to the target model, model inversion attacks can be carried out in either black-box or white-box settings.[2] In a generic attack, an adversary issues repeated queries to the model and uses the responses (e.g., confidence scores or predicted labels) to train a surrogate or inversion model that approximates the inverse mapping from outputs to inputs.[1][4] This process may enable the reconstruction of sensitive attributes, such as facial features, medical data, or user behavior patterns, from models trained on such data. The technique has been demonstrated against several kinds of models, including deep neural networks and other classification systems, and it poses significant privacy risks in areas such as healthcare, finance, and biometric identification. Mitigation strategies include restricting model access, reducing output granularity, using differential privacy, and monitoring for anomalous query patterns.[5]
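The black-box procedure and one of the mitigations can be illustrated with a toy example. The sketch below assumes a one-dimensional input and a hypothetical `target_model` with invented coefficients. In place of a trained neural inversion network it uses a nearest-neighbour lookup over the attacker's own (confidence, input) query results, which is a deliberate simplification. It then repeats the attack against the same model with confidence scores rounded to one decimal place, to show how reducing output granularity can degrade the reconstruction in this setting.

```python
import math


def target_model(x, decimals=None):
    """Hypothetical deployed model: confidence score for an input x in [0, 1].

    If `decimals` is given, the score is rounded, mimicking a service that
    reduces output granularity as a countermeasure.
    """
    score = 1.0 / (1.0 + math.exp(-(6.0 * x - 3.0)))
    return score if decimals is None else round(score, decimals)


def coarse_model(x):
    """Same model, but exposing only one decimal place of confidence."""
    return target_model(x, decimals=1)


def build_inversion_table(query_fn, n_queries=1001):
    """Query the model on attacker-chosen inputs; record (confidence, input) pairs."""
    xs = [i / (n_queries - 1) for i in range(n_queries)]
    return [(query_fn(x), x) for x in xs]


def invert(table, observed_confidence):
    """Nearest-neighbour stand-in for a learned inversion model."""
    return min(table, key=lambda pair: abs(pair[0] - observed_confidence))[1]


secret_x = 0.7314  # private input; the attacker only sees the model's output for it

# Attack against full-precision confidence scores.
table_full = build_inversion_table(target_model)
recovered_full = invert(table_full, target_model(secret_x))
err_full = abs(recovered_full - secret_x)

# Same attack against rounded confidence scores.
table_coarse = build_inversion_table(coarse_model)
recovered_coarse = invert(table_coarse, coarse_model(secret_x))
err_coarse = abs(recovered_coarse - secret_x)

print(f"reconstruction error, full precision: {err_full:.4f}")
print(f"reconstruction error, rounded output: {err_coarse:.4f}")
```

With full-precision scores, the recovered input lands within one query-grid step of the secret value. With rounded scores, many inputs share the same output and the attacker can only narrow the secret down to a range. This is a property of the toy setup and should not be read as a general guarantee: rounding limits what each query reveals but does not provide the formal protection that differential privacy aims for.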
References
[1] Fredrikson, Matt; Jha, Somesh; Ristenpart, Thomas (2015). "Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures". Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS). pp. 1322–1333. doi:10.1145/2810103.2813677.
[2] Zhang, Jiaqi; Chen, Kai (2021). "Model Inversion Attacks: A Survey". IEEE Transactions on Knowledge and Data Engineering. doi:10.1109/TKDE.2021.3065936.
[3] Shokri, Reza; Stronati, Marco; Song, Congzheng; Shmatikov, Vitaly (2017). "Membership Inference Attacks Against Machine Learning Models". IEEE Symposium on Security and Privacy. pp. 3–18. doi:10.1109/SP.2017.41.
[4] Yang, Zhengxue; Zhang, Jian; Chang, Eugene; Liang, Yingyu (2019). "Adversarial Model Inversion for Deep Neural Networks". Advances in Neural Information Processing Systems (NeurIPS).
[5] Dwork, Cynthia; Roth, Aaron (2014). The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science. doi:10.1561/0400000042.