Publications
News
September 17, 2025
Research
August 7, 2025
As frontier AI models grow more powerful, the stakes for secure deployment rise. Irregular was brought in to put GPT-5 to the test. Our work is referenced in the GPT-5 Model Card, and we’re excited to share more about what we found.
Research
July 15, 2025
Irregular and Anthropic have completed an extensive cybersecurity evaluation of Claude Sonnet 4 and Claude Opus 4, which show significant improvements over previous generations. Using our evaluation suite, described in detail in Anthropic's model card, the models were tested across 48 challenges covering web exploitation, cryptography, binary exploitation, reverse engineering, and network attacks.
Research
June 19, 2025
We're happy to announce the publication of our collaborative whitepaper with Anthropic on Confidential Inference Systems, an approach that uses Confidential Computing technologies to enhance the security of AI model weights and the privacy of user data processed by the model.
News
May 5, 2025
Research
April 17, 2025
As frontier models become increasingly capable, the need to ensure they are deployed securely grows. Irregular is proud to have played a significant role in assessing the cybersecurity capabilities of OpenAI's o3 and o4-mini through a comprehensive evaluation, as referenced in the models’ System Card.
News
April 7, 2025
Research
March 31, 2025
In today's rapidly evolving AI landscape, understanding and precisely evaluating the capabilities of advanced AI systems has become a critical security concern. Although new benchmarks are constantly being developed and published, a significant challenge lies in converting raw evaluation results into meaningful capability levels as part of a broader risk evaluation process. This blog post presents a framework for translating evaluation results into capability levels that enable the assessment of risk for AI models.
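As a rough illustration of the general idea only, and not the framework from the post itself, a minimal sketch might map per-category challenge solve rates to coarse capability levels. The category weights, thresholds, and level names below are entirely hypothetical.

```python
# Hypothetical illustration only: the weights, thresholds, and level names
# below are invented for this sketch and are not Irregular's framework.

def capability_level(solve_rates: dict[str, float]) -> str:
    """Map per-category solve rates (0.0-1.0) to a coarse capability level."""
    # Weight harder categories more heavily (assumed weights).
    weights = {"web": 1.0, "crypto": 1.5, "pwn": 2.0, "reversing": 2.0}
    score = sum(weights[c] * solve_rates.get(c, 0.0) for c in weights) / sum(weights.values())
    if score < 0.25:
        return "Level 1: limited assistance"
    if score < 0.5:
        return "Level 2: meaningful uplift"
    if score < 0.75:
        return "Level 3: advanced capability"
    return "Level 4: expert-equivalent"

print(capability_level({"web": 0.8, "crypto": 0.4, "pwn": 0.2, "reversing": 0.3}))
```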
Research
March 27, 2025
Following our series of blog posts on “Best Practices for Evaluations and Evaluation Suites”, this blog post introduces our state-of-the-art Evaluation Platform. The platform is already actively deployed and is helping multiple top frontier labs measure the risks associated with AI systems through cutting-edge empirical testing. In this post, we highlight our security evaluations, one facet of the platform.
Research
February 25, 2025
Irregular participated in the security evaluation of Anthropic's Claude 3.7 Sonnet model using our state-of-the-art cyber evaluation suite, together with the SOLVE scoring system we recently introduced. Our real-world attack simulations tested capabilities across the entire cyber kill chain, supporting the responsible development of this frontier AI model.
Research
February 19, 2025
Modern AI systems possess significant capabilities across various domains. In cybersecurity, these systems can perform complex tasks such as vulnerability research, log analysis, and security architecture design. Many of these capabilities are inherently dual-use: they can be employed both defensively to protect systems and offensively to cause harm. This dual-use nature creates a significant challenge for AI system providers and policy makers.
News
February 11, 2025
Research
February 10, 2025
We introduce a new scoring system for assessing the difficulty of vulnerability discovery and exploit development challenges. The system provides a framework for judging how complicated it is to discover vulnerabilities and develop working exploits for them within an end-to-end challenge.
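To illustrate the shape of such a rubric, here is a minimal sketch that combines several difficulty dimensions into a single rating. The dimensions, scales, and arithmetic are invented for this sketch and are not the SOLVE system described in the post.

```python
# Hypothetical illustration only: the axes, scales, and weighting here are
# invented for this sketch and do not reproduce the SOLVE scoring system.

from dataclasses import dataclass

@dataclass
class ChallengeDifficulty:
    vulnerability_subtlety: int   # 0 (obvious) .. 3 (deeply hidden)
    exploit_complexity: int       # 0 (trivial PoC) .. 3 (multi-stage exploit)
    mitigations_present: int      # 0 (none) .. 3 (ASLR, CFI, sandboxing, ...)
    codebase_scale: int           # 0 (single file) .. 3 (large real-world project)

    def score(self) -> float:
        """Combine the sub-scores into a single 0-10 difficulty rating."""
        raw = (self.vulnerability_subtlety + self.exploit_complexity
               + self.mitigations_present + self.codebase_scale)
        return round(10 * raw / 12, 1)  # 12 = maximum possible raw score

print(ChallengeDifficulty(2, 3, 1, 2).score())  # 6.7
```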
News
October 12, 2024
Research
October 2, 2024
This is the third and final part in our series outlining the best practices for the design and creation of evaluations and evaluation suites.
Research
October 2, 2024
This is the second part in our series outlining the best practices for the design and creation of evaluations and evaluation suites.
Research
October 2, 2024
We believe that quality evaluation suites are crucial to the policymaking ability of labs and governments, in both the short and long term. While considerable academic research has been done on evaluating AI models, especially since the breakthrough in LLMs, we have seen comparatively little written about assessing the evaluations themselves.
Research
October 2, 2024
At Irregular, we’ve been focusing some of our efforts on evaluating the cybersecurity capabilities of frontier models. To do so, one of the first questions we tackled was how to define these capabilities in a meaningful and useful way. The following describes the taxonomy we are currently using internally; while it is still evolving, we believe it is mature enough to be useful to others as well.
News
July 22, 2024
Research
May 21, 2024
Yoni Rozenshein's BlueHat IL 2024 talk covers our philosophy for evaluating dangerous AI cyber capabilities, how we actually do it (let's make an LLM play CTF!), and who cares about it (governments and frontier AI labs).