On the 8th of August, 2024 during the Black Hat security conference, the co-founder and chief technology officer of Zenity Michael Bargury managed to display how hackers can exploit Microsoft’s Copilot to extract data and information out of your device without leaving logs, and lead victims into phishing sites without them even opening their emails or clicking any links.
What’s Wrong with Copilot?
During the session, Michael Bargury used one of his tools to demonstrate how Copilot is vulnerable to prompt injections that attackers use to evade the security controls of the AI.
He also demonstrated how developers could unknowingly create chatbots capable of infiltrating data and bypassing security and data-loss prevention control through the chatbot creation tool made by Microsoft, Copilot studio.
Moreover, Michael Bargury showed how all this could be done using one of the tools he released to GitHub, the tool LOLCopilot module, which is part of his PowerPwn. LOLCopilot is an offensive security tool designed for Copilot, Copilot Studio, and finally Power Platform. The tool according to Bargury is a red-team hacking tool to show how to change the behavior of a bo through prompt injection.
How LOLCopilot works
Copilot works by sending one of 2 types of prompt injections, either a direct prompt injection, where you can directly interact and alter the LLM prompt, or an indirect prompt in which the attacker can manipulate the data sources the model can access.
Using the tool, Baguary managed to jailbreak Copilot and modify its parameters and instructions within the model, in which he could edit emails, or manipulate the commands given to the bot and alter them without altering the model itself or replacing any of the existing reference information. “I can do this with everyone you have ever spoken to, and I can send hundreds of emails on your behalf,” and he demonstrated that by showing how the AI can copy your writing style and word choice, and even your emoji use, to make a perfect replica of your emails and potentially send ones with malicious links to unsuspecting victims.
Furthermore, the tool can do all that and still go undetected, according to Baguary “There is no indication here that this comes from a different source,” he also stated “This is still pointing to valid information that this victim actually created, and so this thread looks trustworthy. You don’t see any indication of a prompt injection.”
Microsoft’s stance on this
Microsoft stated to have taken safety measures against this with plenty of security filters and is working hard on finding counter-measures for this issue, according to Mark Russinovich, Microsoft Azure’s CTO, “The idea here is that we’re looking for signs that there are instructions embedded in the context, either the direct user context or the context that is being fed in through the RAG [retrieval-augmented generation], that could cause the model to misbehave,”.
Recently they addressed the issue and stated that they have been working on releasing “prompt shields, and recently they have indeed launched them; however, according to Baguary, this isn’t enough
He states that this case requires more filters for “promptware” which is defined as hidden instructions and untrusted data. “I’m not aware of anything you can use out of the box today [for detection],” he claims
“Microsoft Defender and Purview don’t have those capabilities today,” he adds. “They have some user behavior analytics, which is helpful. If they find the copilot endpoint having multiple conversations, that could be an indication that they’re trying to do prompt injection. But actually, something like this is very surgical, where somebody has a payload, they send you the payload, and [the defenses] aren’t going to spot it.”
Bargury says he regularly communicates with Microsoft’s Red Team, furthermore, he believes Microsoft is working well to address the issue with AI and with Copilot, as he claims, “They are working really hard,”
“I can tell you that in this research, we have found 10 different security mechanisms that Microsoft’s put in place inside of Microsoft Copilot. These are mechanisms that scan everything that goes into Copilot, everything that goes out of Copilot, and a lot of steps in the middle.”
and as always we’ll be sure to update you on any news regarding the matter, so stay tuned for more news here at Tech Exposed.