{"id":3847,"date":"2022-01-17T09:46:00","date_gmt":"2022-01-17T09:46:00","guid":{"rendered":"https:\/\/techpolicy.org.il\/?p=3847"},"modified":"2022-01-18T07:49:47","modified_gmt":"2022-01-18T07:49:47","slug":"adversarial-machine-learning-research-developments-dangers-and-implications","status":"publish","type":"post","link":"https:\/\/techpolicy.org.il\/he\/blog\/adversarial-machine-learning-research-developments-dangers-and-implications\/","title":{"rendered":"An Introduction to Adversarial Machine Learning \u2013 Developments in Research, Risks and Other Implications"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large is-style-default\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"682\" src=\"https:\/\/techpolicy.org.il\/wp-content\/uploads\/2021\/12\/web-4033235_1280-1024x682.jpg\" alt=\"artificial intelligence facial app\" class=\"wp-image-3852\" srcset=\"https:\/\/techpolicy.org.il\/wp-content\/uploads\/2021\/12\/web-4033235_1280-1024x682.jpg 1024w, https:\/\/techpolicy.org.il\/wp-content\/uploads\/2021\/12\/web-4033235_1280-300x200.jpg 300w, https:\/\/techpolicy.org.il\/wp-content\/uploads\/2021\/12\/web-4033235_1280-768x512.jpg 768w, https:\/\/techpolicy.org.il\/wp-content\/uploads\/2021\/12\/web-4033235_1280-18x12.jpg 18w, https:\/\/techpolicy.org.il\/wp-content\/uploads\/2021\/12\/web-4033235_1280.jpg 1280w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>While there is an extensive ongoing public debate about the ethical difficulties of relying on artificial intelligence-based models, little attention is being paid to the ability of malicious actors to disrupt and deceive such models. This article introduces the evolving field of \u2018adversarial machine learning\u2019, its risks and various implications.<\/p>\n\n\n\n<p>In 2016, Microsoft launched a Twitter chatbot in the form of an American teenager called Tay. 
The chatbot was designed to communicate with the platform&#8217;s users for entertainment purposes but, first and foremost, to learn from those interactions and improve the model on which it was built. Less than a day after its launch, and after it had posted about 96,000 &#8220;Tweets&#8221;, Microsoft was forced to issue an apology and to announce the suspension of the chatbot. Tay, as it turned out, had learned to imitate the abusive behavior of Twitter users, who quickly realized that they could adversely affect its learning \u2013 and acted accordingly \u2013 until the chatbot itself began posting racist and antisemitic tweets. This incident was among the first to demonstrate, in such a tangible and disturbing manner, how feasible it is to manipulate and disrupt artificial intelligence (AI) models, even without full access to the model and its parameters.<\/p>\n\n\n\n<p>While there is an extensive ongoing public debate concerning the ethical issues of relying on algorithms for decision-making and the various human biases that models&#8217; creators might embed in them \u2013 little attention is being paid to the power of external, often malicious, actors to disrupt these models. 
This article introduces the evolving field of Adversarial Machine Learning (AML) and identifies its various implications.<\/p>\n\n\n\n<div class=\"wp-block-cover has-background-dim\"><img loading=\"lazy\" decoding=\"async\" width=\"554\" height=\"277\" class=\"wp-block-cover__image-background wp-image-4049\" alt=\"\" src=\"https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture1.jpg\" data-object-fit=\"cover\" srcset=\"https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture1.jpg 554w, https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture1-300x150.jpg 300w, https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture1-18x9.jpg 18w\" sizes=\"auto, (max-width: 554px) 100vw, 554px\" \/><div class=\"wp-block-cover__inner-container is-layout-flow wp-block-cover-is-layout-flow\">\n<p class=\"has-large-font-size\"><\/p>\n<\/div><\/div>\n\n\n\n<p class=\"has-small-font-size\">1. <em>One of the &#8220;Tweets&#8221; published by Microsoft&#8217;s chatbot in response to a user&#8217;s question: <a href=\"https:\/\/fortune.com\/2016\/03\/24\/chat-bot-racism\/\">https:\/\/fortune.com\/2016\/03\/24\/chat-bot-racism\/<\/a><\/em><\/p>\n\n\n\n<p><strong><u>What is Adversarial Machine Learning, and How Can Models Be Fooled?<\/u><\/strong><\/p>\n\n\n\n<p>AML is a machine learning technique for exploiting, disrupting, or fooling AI models, with or without access to the model itself and to the data on which it is trained and operates. To understand how models can be fooled, it is useful to first get a basic grasp of how machine learning models work.<\/p>\n\n\n\n<p>Machine learning (ML), a subset of AI, is the general term for the field in which algorithms learn and improve autonomously \u2013 that is, without being explicitly programmed to do so \u2013 to perform complex tasks, by imitating human capabilities such as learning by example and by analogy. 
ML is performed in two main ways. The first is Supervised Learning, in which a human moderator collects data, feeds it to the machine and labels it, until the machine learns to perform this classification process by itself. The second is Unsupervised Learning, in which the machine is fed with data but performs the labeling and classification processes completely autonomously, dividing the data into groups and discovering patterns according to certain characteristics. <a href=\"https:\/\/medium.com\/neuralmagic\/2012-a-breakthrough-year-for-deep-learning-2a31a6796e73\" target=\"_blank\" rel=\"noreferrer noopener\">In 2012, the field of ML experienced significant acceleration<\/a> with advances in the development of Deep Learning \u2013 an ML capability based on numerous layers of artificial neural networks (ANN) that simulate the behavior of the human brain and are able to perform complex calculations with a very high level of accuracy. Today, we make extensive use of such models, from Siri and Alexa, through the Netflix recommendation system \u2013 to autonomous vehicles. In many cases, deep learning models are considered a &#8216;black box&#8217;, since the inner workings of their computation process are non-transparent and ambiguous, and consequently \u2013 unexplainable and uninterpretable to those applying them, to the AI-system subjects (who may seek to legally challenge AI-related harm without being able to base their claims on the model&#8217;s computation process), and sometimes even to the model developers themselves. <strong>But despite the accuracy, complexity and opacity of ML models (or, perhaps, because of them), it turns out that these models can be rather easily tricked and fooled<\/strong>.<\/p>\n\n\n\n<p>A model&#8217;s disruption or fooling can be attempted by feeding the model with misleading information during its training process or during its implementation. 
The disruption can be either targeted, causing the model to classify input X as if it were Y, or untargeted, simply causing the model not to classify input X as X. Since the first type is considered more complex and expensive in terms of time and resources, the second method is more commonly used. The methods applied to disrupt models can also be classified according to the level of access the attacker has to the model:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>Black box attacks<\/strong> \u2013 In these attacks, the attacker has only partial information about the model and its parameters. One way to fool models under this constraint is, for example, to feed the model with a large number of inputs and observe the outputs, allowing the attacker to study the model and then produce a competing model to fool the original one. Another way is to feed the model with adversarial input designed to mislead it, and thus disrupt its performance.<\/li><\/ul>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>White box attacks<\/strong> \u2013 In these attacks, the attacker has full knowledge of the model, its parameters, and the data on which it is trained and operates, and can therefore conduct various manipulations during either the data collection process or the classification process.<\/li><\/ul>\n\n\n\n<p>AML attacks are typically divided into the following main categories:<\/p>\n\n\n\n<ol class=\"wp-block-list\" type=\"1\"><li><strong>Data Poisoning Attacks<\/strong> \u2013 In these attacks, the attacker maliciously affects the data that feeds the model, or the labels through which it learns to classify the data, in order to corrupt the model and undermine its integrity \u2013 for example by &#8216;injecting&#8217; false data used to train the model, thus disrupting its performance. 
Such an attack can be performed either during the model&#8217;s training phase or after its implementation (as was the case with the chatbot Tay).<\/li><li><a href=\"https:\/\/arxiv.org\/abs\/2106.08299\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Evasion Attacks<\/strong><\/a> \u2013 In these attacks, the attacker manipulates the input in a way that misleads the model \u2013 even after it has been deployed \u2013 and causes it to perform an incorrect classification, so that the output obtained is incorrect. A basic and well-known example is evading email spam filters, which are based on word identification, by linking words that are not labeled as spam to other words that, had they appeared on their own, would have been labeled as such.<\/li><li><strong>Model Extraction<\/strong> \u2013 These are black-box attacks in which the attacker learns enough about the original model to produce a surrogate model, or to extract the data used to train it. Such an attack can be employed to steal the model itself; alternatively, the attacker can use the surrogate model to attack the original one.<\/li><\/ol>\n\n\n\n<p><strong><u>Illustrating the Potential of Adversarial Machine Learning<\/u><\/strong><\/p>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/1312.6199.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">The groundbreaking 2014 research<\/a> by researchers from Google, New York University and the University of Montreal illustrated, for the first time, the power of AML. The researchers found that only a slight alteration of the input, such as adding &#8216;noise&#8217; invisible to the human eye or slightly rotating the image, may cause the model to misclassify the data. <a href=\"https:\/\/users.cs.northwestern.edu\/~srutib\/papers\/face-rec-ccs16.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Other researchers<\/a> have demonstrated how AML can affect the physical world as well. 
They showed that wearing glasses frames printed on colored paper can disrupt sophisticated facial recognition systems, causing them to identify the wrong person or enabling the wearer to impersonate someone else. Similarly, <a href=\"https:\/\/arxiv.org\/pdf\/1707.08945.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">researchers were able to fool the vision model of an autonomous vehicle<\/a> by affixing stickers to a stop sign. While humans would still recognize the stop sign despite the stickers, the AI model mistakenly identified it as a speed limit sign \u2013 a misreading that could lead a vehicle to keep driving rather than stop.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture2.png\" alt=\"\" class=\"wp-image-4050\" width=\"840\" height=\"406\" srcset=\"https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture2.png 554w, https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture2-300x145.png 300w, https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture2-18x9.png 18w\" sizes=\"auto, (max-width: 840px) 100vw, 840px\" \/><\/figure>\n\n\n\n<p class=\"has-small-font-size\"><em>2. Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition<\/em><\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture3.jpg\" alt=\"\" class=\"wp-image-4051\" width=\"839\" height=\"370\" srcset=\"https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture3.jpg 554w, https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture3-300x132.jpg 300w, https:\/\/techpolicy.org.il\/wp-content\/uploads\/2022\/01\/Picture3-18x8.jpg 18w\" sizes=\"auto, (max-width: 839px) 100vw, 839px\" \/><\/figure>\n\n\n\n<p class=\"has-small-font-size\">3. 
Adding &#8216;noise&#8217; that is invisible to the human eye to an image causes the model to misclassify it.<br><em>Goodfellow, I. J., Shlens, J., &amp; Szegedy, C. (2014). Explaining and harnessing adversarial examples.<\/em><\/p>\n\n\n\n<p>Notably, to date, concerns with respect to AML, as well as the examination of its implications for our lives, remain mainly confined to academic research. It is also reasonable to assume that even as the field of AML gains momentum, the capacity to carry out complex attacks will not be shared by the general public.<\/p>\n\n\n\n<p>However, we can offer a few common examples of AML attacks that each of us is already capable of performing \u2013 even if to a limited extent. For example, <a href=\"https:\/\/www.independent.co.uk\/news\/business\/news\/uber-drivers-work-together-price-surge-go-offline-charge-customers-more-game-app-supply-demand-algorithm-a7872871.html\" target=\"_blank\" rel=\"noreferrer noopener\">it was discovered<\/a> that Uber drivers coordinated disconnecting from the company&#8217;s application in order to raise the price per ride upon reconnecting, since the company&#8217;s pricing algorithm is based on supply and demand. Another example relates to rating and review algorithms: it turns out that feeding such algorithms with false data can cause them to display incorrect ratings and information. 
Such was the case of British journalist Oobah Butler, who in 2017 <a href=\"https:\/\/www.washingtonpost.com\/news\/food\/wp\/2017\/12\/08\/it-was-londons-top-rated-restaurant-just-one-problem-it-didnt-exist\/\" target=\"_blank\" rel=\"noreferrer noopener\">managed to fool TripAdvisor&#8217;s algorithm<\/a> by promoting a restaurant that never existed to the top of the rankings.<\/p>\n\n\n\n<p>An appreciation of the significance of individuals&#8217; ability to influence AI models, even without actual access to them, currently finds expression in the field of art, particularly as a form of protest and resistance. For example, a number of artists have subversively begun displaying <a href=\"https:\/\/adversarialfashion.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">unique clothing items<\/a> and bizarre <a href=\"https:\/\/cvdazzle.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">hairstyles and makeup<\/a> as a way of avoiding facial recognition systems. Although basic and limited, such examples of model fooling may indicate a counter-reaction to governmental and law enforcement agencies&#8217; growing reliance on AI models to collect data and analyze individuals&#8217; behavioral patterns.<\/p>\n\n\n\n<p>AML, therefore, poses significant threats to all sectors relying on AI; most notably \u2013 the security, <a href=\"https:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7657648\/\" target=\"_blank\" rel=\"noreferrer noopener\">health<\/a>, law and welfare sectors. In light of this, it would not be unreasonable to expect that in the not-too-distant future, malicious attackers will exploit AI models for illegal purposes and in their self-interest, in a way that may violate human rights and, in some cases, even risk human lives. 
For example, certain adversarial assaults may infringe the right to privacy in cases where the attacker <a href=\"https:\/\/www.usenix.org\/system\/files\/sec21-carlini-extracting.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">gains access to personally identifiable information<\/a> about individuals or groups \u2013 who might be completely unaware that their personal data are used to train any kind of model \u2013 and uses it illegally. In an age when our personal information is constantly collected and traded, adversarial threats emphasize the importance of privacy and data protection regulation. Moreover, as the use of AI systems in government and public services increases, so does the ability to disrupt the performance of algorithmic decision-making systems (e.g., for loan approval or for determining entitlement to social welfare payments). The operation of such systems affects the lives of many individuals, and their susceptibility to adversarial attacks therefore poses a tangible threat to basic human rights and interests. As AML expands and begins to have an actual impact on our daily lives, it may undermine human trust in AI systems and, not least, erode public trust in those who introduce them into various aspects of our lives, namely government bodies. Such a situation may breed skepticism and evoke strong feelings of injustice among citizens. What can be done, then, in view of such threats?<\/p>\n\n\n\n<p><strong><u>Suggested Solutions<\/u><\/strong><\/p>\n\n\n\n<p>AML is considered a relatively new phenomenon that is mainly confined to the scientific field. 
Nevertheless, there has recently been growing interest in securing AI systems (including against AML attacks) in cyber security contexts, with several initiatives already underway.<\/p>\n\n\n\n<p>Academia and private-sector cyber security companies are searching for ways to deal with AML&#8217;s potential threats, leading to the development of novel, creative technological approaches to adversarial attacks. One of the main challenges in defending against these attacks is the difficulty of detecting possible disruption in the model \u2013 from the early stages of data collection, through the classification and learning phase, to well after the model has been trained and implemented. One defensive approach, called &#8216;adversarial training&#8217;, is based on an equivalent retaliation logic (&#8220;tit for tat&#8221;): a model is trained to identify adversarial attacks and detect weaknesses in the original model, thus making it more resilient. Another approach is to create multiple models that act as a &#8216;moving target&#8217;, making it impossible for the attacker to know which model is actually being used.<\/p>\n\n\n\n<p>As far as regulation and policy are concerned, governments and law enforcement bodies, in Israel and worldwide, have yet to establish significant regulatory measures for dealing with AML. <a href=\"https:\/\/digital-strategy.ec.europa.eu\/en\/policies\/regulatory-framework-ai\" target=\"_blank\" rel=\"noreferrer noopener\">The European Union&#8217;s proposed AI regulation<\/a>, for example, stresses the importance of accurate and resilient systems \u2013 including the ability to withstand adversarial attacks. 
Similarly, references to the need to defend AI systems against adversarial threats can be found in both <a href=\"https:\/\/www.gov.il\/BlobFolder\/news\/international_strategy\/en\/Israel%20International%20Cyber%20Strategy.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Israel&#8217;s International Cyber Strategy<\/a> and <a href=\"https:\/\/assets.publishing.service.gov.uk\/government\/uploads\/system\/uploads\/attachment_data\/file\/1020402\/National_AI_Strategy_-_PDF_version.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">the UK&#8217;s National AI Strategy<\/a>.<\/p>\n\n\n\n<p>However, despite the development of technical solutions and regulatory advancements, there is currently no comprehensive solution to adversarial threats. It is therefore important to employ other defense mechanisms, such as strengthening information security, validating the information used to train the model, and routinely examining the model&#8217;s performance and outputs.<\/p>\n\n\n\n<p class=\"has-text-align-left\">In conclusion, the more AI systems come to dominate various aspects of our lives, and the more dependent on them we become, the greater the threats these systems face \u2013 and, consequently, the threats to the rights and interests of humans (AI-system subjects). This requires developing capabilities and appropriate regulations adapted to the current reality. 
Technology companies like <a href=\"https:\/\/ai.googleblog.com\/2018\/09\/introducing-unrestricted-adversarial.html\" target=\"_blank\" rel=\"noreferrer noopener\">Google<\/a>, <a href=\"https:\/\/www.microsoft.com\/security\/blog\/2020\/10\/22\/cyberattacks-against-machine-learning-systems-are-more-common-than-you-think\/\" target=\"_blank\" rel=\"noreferrer noopener\">Microsoft<\/a> and <a href=\"https:\/\/researcher.watson.ibm.com\/researcher\/view_group.php?id=9571\" target=\"_blank\" rel=\"noreferrer noopener\">IBM<\/a> have already begun to invest considerable resources in developing defensive tools against adversarial threats, acknowledging their profound implications. Correspondingly, as government and public agencies rely more on AI models in the provision of governmental and administrative services, they too must be equipped with an appropriate strategy to confront the potential threats and to implement resilient, reliable, and secure models.<\/p>\n\n\n\n<p>Guest author: Yael Ram [PhD candidate, the Hebrew University of Jerusalem]<\/p>\n\n\n\n<p>Edited by: Dr. 
Sivan Tamir<\/p>\n","protected":false},"excerpt":{"rendered":"<p>While there is an extensive ongoing public debate about the ethical difficulties of relying on artificial intelligence-based models, little attention is being paid to the ability of malicious actors to [&hellip;]<\/p>\n","protected":false},"author":15,"featured_media":3852,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_seopress_robots_primary_cat":"none","_seopress_titles_title":"%%post_title%% %%sitetitle%%","_seopress_titles_desc":"","_seopress_robots_index":"","inline_featured_image":false,"footnotes":""},"categories":[9],"tags":[77,56,57],"publication_type":[],"class_list":["post-3847","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog","tag-adversarial-machine-learning","tag-artificial-intelligence","tag-machine-learning"],"acf":[],"_links":{"self":[{"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/posts\/3847","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/users\/15"}],"replies":[{"embeddable":true,"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/comments?post=3847"}],"version-history":[{"count":14,"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/posts\/3847\/revisions"}],"predecessor-version":[{"id":4052,"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/posts\/3847\/revisions\/4052"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/media\/3852"}],"wp:attachment":[{"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/media?parent=3847"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/c
ategories?post=3847"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/tags?post=3847"},{"taxonomy":"publication_type","embeddable":true,"href":"https:\/\/techpolicy.org.il\/he\/wp-json\/wp\/v2\/publication_type?post=3847"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}