Factbox-What We Know About US Stress Tests of Google, xAI and Microsoft AI Models

By Courtney Rozen and Jody Godoy

WASHINGTON, May 5 (Reuters) – The Trump administration on Tuesday announced it had expanded a program that gives U.S. government scientists access to unreleased artificial intelligence models for risk assessments, adding Google's DeepMind, xAI and Microsoft.

ChatGPT maker OpenAI and Claude owner Anthropic had already been voluntarily working with the U.S. Center for AI Standards and Innovation, the team of U.S. government scientists, to test unreleased models for vulnerabilities, according to the companies.

Here is what we know about the reviews:

WHAT RISKS ARE THE U.S. FOCUSED ON?

U.S. government scientists are focused on “demonstrable risks,” such as the risk that advanced models can be used to launch cyberattacks on American infrastructure, according to the CAISI website. They want to limit opportunities for U.S. adversaries to use AI to develop chemical or biological weapons, or corrupt the data used to train American AI models.

WHAT WILL COMPANIES HAND OVER?

OpenAI is working with the group to test GPT-5.5-Cyber, said Chris Lehane, head of global affairs at OpenAI, in a LinkedIn post on Tuesday. GPT-5.5-Cyber is a variant of its latest model designed for defensive cybersecurity work.

Microsoft will work with the scientists to build shared datasets and workflows to assess advanced AI models, the company said in a statement. Microsoft did not specify which models.

Anthropic gave CAISI access to both publicly available and unreleased models, allowing researchers to probe for vulnerabilities in a process known as “red-teaming,” or simulating the behavior of malicious actors, the company said in September. The company also gave CAISI detailed documentation on known vulnerabilities and safety mechanisms.

Google DeepMind, Alphabet’s AI research arm, will provide access to its “proprietary models” and data, a spokesperson said.

xAI did not immediately respond to a request for comment from Reuters.

WHAT HAS THE U.S. FOUND SO FAR?

Anthropic’s work with CAISI revealed that tricks such as claiming that human review had occurred, or substituting characters, could get around safety mechanisms, the company said, adding that it had patched the vulnerabilities.

OpenAI said in September that it worked with CAISI to probe vulnerabilities in its ChatGPT Agent that could have allowed sophisticated actors to bypass OpenAI’s cybersecurity measures. The exploit would have allowed the attacker to “remotely control the computer systems the agent could access for that session and successfully impersonate the user for other websites they’d logged into,” the company said.

The companies, along with Meta, Amazon and Inflection AI, agreed in 2023 to allow independent experts to check their models for biosecurity and cybersecurity risks.

The U.S. government scientists, organized under a different name during former U.S. President Joe Biden’s tenure, also released voluntary guidelines to protect against the risk of AI models leaking private health information or producing incorrect answers.

The scientists are now working on guidelines for critical infrastructure providers, such as the communications and emergency services sectors, to test their AI systems, according to their website.

(Reporting by Courtney Rozen; Editing by Stephen Coates)

Copyright 2026 Thomson Reuters.