Publications
News
September 17, 2025
Research
August 7, 2025
As frontier AI models grow more powerful, the stakes for secure deployment rise. Irregular was brought in to put GPT-5 to the test. Our work is referenced in the GPT-5 Model Card, and we’re excited to share more about what we found.
Research
July 15, 2025
Irregular and Anthropic have completed an extensive cybersecurity evaluation of Claude Sonnet 4 and Claude Opus 4, which show significant improvements over previous generations. Using our evaluation suite, described in detail in Anthropic's model card, the models were tested across 48 challenges covering web exploitation, cryptography, binary exploitation, reverse engineering, and network attacks.
Research
June 19, 2025
We're happy to announce the publication of our collaborative whitepaper with Anthropic on Confidential Inference Systems, an approach that uses Confidential Computing technologies to enhance the security of AI model weights and the privacy of user data processed by the model.
News
May 5, 2025
Research
April 17, 2025
As frontier models become increasingly capable, the need to ensure they are deployed securely grows. Irregular is proud to have played a significant role in assessing the cybersecurity capabilities of OpenAI's o3 and o4-mini through a comprehensive evaluation, as referenced in the models’ System Card.
News
April 7, 2025
Research
March 31, 2025
In today's rapidly evolving AI landscape, understanding and precisely evaluating the capabilities of advanced AI systems has become a critical security concern. Although new benchmarks are constantly being developed and published, a significant challenge lies in converting raw evaluation results into meaningful capability levels as part of a broader risk evaluation process. This blog post presents a framework for translating evaluation results into capability levels that enable the assessment of risk for AI models.
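As a rough illustration of the general idea only, and not the framework from the post itself, a minimal sketch might map per-category challenge solve rates to coarse capability levels. The category weights, thresholds, and level names below are entirely hypothetical.

```python
# Hypothetical illustration only: the weights, thresholds, and level names
# below are invented for this sketch and are not Irregular's framework.

def capability_level(solve_rates: dict[str, float]) -> str:
    """Map per-category solve rates (0.0-1.0) to a coarse capability level."""
    # Weight harder categories more heavily (assumed weights).
    weights = {"web": 1.0, "crypto": 1.5, "pwn": 2.0, "reversing": 2.0}
    score = sum(weights[c] * solve_rates.get(c, 0.0) for c in weights) / sum(weights.values())
    if score < 0.25:
        return "Level 1: limited assistance"
    if score < 0.5:
        return "Level 2: meaningful uplift"
    if score < 0.75:
        return "Level 3: advanced capability"
    return "Level 4: expert-equivalent"

print(capability_level({"web": 0.8, "crypto": 0.4, "pwn": 0.2, "reversing": 0.3}))
```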
Research
March 27, 2025
Following our series of blog posts on “Best Practices for Evaluations and Evaluation Suites”, this blog post introduces our state-of-the-art Evaluation Platform. The platform is already actively deployed and is helping multiple top frontier labs measure the risks associated with AI systems through cutting-edge empirical testing. In this post, we highlight our security evaluations, one facet of the platform.
Research
February 25, 2025
Irregular participated in the security evaluation of Anthropic's Claude 3.7 Sonnet model using our state-of-the-art cyber evaluation suite, together with the SOLVE scoring system we recently introduced. Our real-world attack simulations tested capabilities across the entire cyber kill chain, supporting the responsible development of this frontier AI model.
Research
February 19, 2025
Modern AI systems possess significant capabilities across various domains. In cybersecurity, these systems can perform complex tasks such as vulnerability research, log analysis, and security architecture design. Many of these capabilities are inherently dual-use: they can be employed both defensively to protect systems and offensively to cause harm. This dual-use nature creates a significant challenge for AI system providers and policy makers.
News
February 11, 2025
Research
February 10, 2025
We introduce a new scoring system for assessing the difficulty of vulnerability discovery and exploit development challenges. The system provides a framework for judging how complicated it is to discover vulnerabilities and develop working exploits for them within an end-to-end challenge.
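To illustrate the shape of such a rubric, here is a minimal sketch that combines several difficulty dimensions into a single rating. The dimensions, scales, and arithmetic are invented for this sketch and are not the SOLVE system described in the post.

```python
# Hypothetical illustration only: the axes, scales, and weighting here are
# invented for this sketch and do not reproduce the SOLVE scoring system.

from dataclasses import dataclass

@dataclass
class ChallengeDifficulty:
    vulnerability_subtlety: int   # 0 (obvious) .. 3 (deeply hidden)
    exploit_complexity: int       # 0 (trivial PoC) .. 3 (multi-stage exploit)
    mitigations_present: int      # 0 (none) .. 3 (ASLR, CFI, sandboxing, ...)
    codebase_scale: int           # 0 (single file) .. 3 (large real-world project)

    def score(self) -> float:
        """Combine the sub-scores into a single 0-10 difficulty rating."""
        raw = (self.vulnerability_subtlety + self.exploit_complexity
               + self.mitigations_present + self.codebase_scale)
        return round(10 * raw / 12, 1)  # 12 = maximum possible raw score

print(ChallengeDifficulty(2, 3, 1, 2).score())  # 6.7
```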
News
October 12, 2024
Research
October 2, 2024
This is the third and final part in our series outlining the best practices for the design and creation of evaluations and evaluation suites.
Research
October 2, 2024
This is the second part in our series outlining the best practices for the design and creation of evaluations and evaluation suites.
Research
October 2, 2024
We believe that quality evaluation suites are crucial to the policymaking ability of labs and governments, in both the short and long term. While considerable academic research has been done on evaluating AI models, especially since the breakthrough in LLMs, we have seen comparatively little written about assessing the evaluations themselves.
Research
October 2, 2024
At Irregular, we’ve been focusing some of our efforts on evaluating the cybersecurity capabilities of frontier models. To do so, one of the first questions we tackled was how to define these capabilities in a meaningful and useful way. The following describes the taxonomy we are currently using internally; while it is still evolving, we believe it is mature enough to be useful to others as well.
News
July 22, 2024
Research
May 21, 2024
Yoni Rozenshein's BlueHat IL 2024 talk covers our philosophy for evaluating dangerous AI cyber capabilities, how we actually do it (let's make an LLM play CTF!), and who cares about it (governments and frontier AI labs).