AIM Intelligence: Inside ACL 2025's "Triple Threat" to Unsafe AI - A Global Alliance of Stanford, AWS, UMich, SNU, Yonsei, KAIST & UOS

June 06, 2025 11:52 AM EDT | By News File Corp

Seoul, South Korea--(Newsfile Corp. - June 6, 2025) - AIM Intelligence is pleased to announce that, in one of the most high-profile security spotlights of ACL 2025, a global research alliance—with collaboration from Stanford University, Amazon AWS, the University of Michigan, Seoul National University, Yonsei University, KAIST, and the University of Seoul—has unveiled three papers that redefine the frontiers of LLM red teaming, representation-level alignment, and agentic system defense.

Two of the papers were accepted to the ACL 2025 Main Conference, while a third was selected for the ACL Industry Track, underscoring not just academic rigor but also real-world relevance.

"This isn't speculative. These are attack blueprints we've seen succeed in multimodal agents—inside real systems, with real risks," said Sangyoon Yu, CEO of AIM Intelligence.

1. One-Shot Jailbreaking (ACL 2025 Main Conference)

"One-Shot is Enough: Consolidating Multi-Turn Attacks into Efficient Single-Turn Prompts for LLMs"

The first paper shows that single-turn prompts can achieve what once required multi-turn dialogues: jailbreaking even the most advanced LLMs. The M2S (multi-turn-to-single-turn) framework compresses complex, staged attacks into highly effective one-liners that are faster, stealthier, and harder to detect.
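As a rough illustration only (not the paper's exact templates), the sketch below shows the basic idea behind M2S-style consolidation: the user turns of a staged, multi-turn exchange are flattened into one structured single-turn prompt. The function name and template wording here are illustrative assumptions, not the authors' formulation.

# Illustrative sketch of multi-turn-to-single-turn (M2S) consolidation.
# The templates used in the paper differ; this only demonstrates the idea of
# flattening a staged, multi-turn request into one structured prompt.

def consolidate_turns(turns: list[str]) -> str:
    """Flatten a sequence of user turns into a single numbered prompt.

    `turns` is the list of user messages that would otherwise be sent one
    at a time across a multi-turn conversation.
    """
    numbered = "\n".join(f"{i + 1}. {turn}" for i, turn in enumerate(turns))
    return (
        "Please address each of the following instructions in order, "
        "treating them as one combined request:\n"
        f"{numbered}"
    )


if __name__ == "__main__":
    # A benign example: three conversational turns collapsed into one prompt.
    turns = [
        "Summarize the plot of Hamlet.",
        "Now rewrite that summary as a limerick.",
        "Finally, translate the limerick into French.",
    ]
    print(consolidate_turns(turns))

The point the sketch makes is that the consolidation is a cheap, deterministic rewrite of the conversation rather than a new model capability, which is what makes the resulting single-turn attacks fast and hard to filter.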

"You don't need a conversation to subvert a model anymore. One shot is enough," said Junwoo Ha, Product Engineer of AIM Intelligence.

The research was led by Junwoo Ha (University of Seoul) and Hyunjun Kim (KAIST) as part of AIM Intelligence's red-teaming internship program.

[Main figure: https://images.newsfilecorp.com/files/7768/254234_4d72868bb4c34ba7_001.jpg]

[Photo: AIM Intelligence joint research team. From left: Sangyoon Yoo (Seoul National University), Junwoo Ha (University of Seoul), Hyunjun Kim (Korea Advanced Institute of Science and Technology), Haon Park (CTO of AIM Intelligence). https://images.newsfilecorp.com/files/7768/254234_4d72868bb4c34ba7_002.jpg]

2. Representation Bending (ACL 2025 Main Conference)

"Representation Bending for Large Language Model Safety"

The second paper, REPBEND, tackles the problem not at the prompt level but deep inside the model's latent space (where the model "thinks"). Developed in collaboration with Amazon AWS, Stanford, Seoul National University, Yonsei University, and the University of Michigan, the method bends unsafe internal representations toward safety without sacrificing performance.

Unlike reactive filters, this approach re-engineers harmful behavior before it appears, setting a new standard for inherent alignment.
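REPBEND's actual training objective is specified in the paper; purely to illustrate what operating on internal representations means, the sketch below shows a generic activation-steering step that dampens a hypothetical "unsafe" direction estimated from contrastive activations. The direction estimate, layer choice, and scaling are assumptions for illustration, not the published method.

# Illustrative sketch of representation-level steering ("bending"), NOT the
# published REPBEND objective. It nudges hidden states away from a
# hypothetical "unsafe" direction estimated from contrastive examples.

import torch

def unsafe_direction(harmful_acts: torch.Tensor, benign_acts: torch.Tensor) -> torch.Tensor:
    """Estimate a unit-norm 'unsafe' direction as the difference of mean activations.

    Both inputs are (num_examples, hidden_dim) activation matrices collected
    at some chosen transformer layer.
    """
    direction = harmful_acts.mean(dim=0) - benign_acts.mean(dim=0)
    return direction / direction.norm()


def bend(hidden: torch.Tensor, direction: torch.Tensor, strength: float = 1.0) -> torch.Tensor:
    """Remove (or dampen) the component of `hidden` along the unsafe direction."""
    projection = (hidden @ direction).unsqueeze(-1) * direction
    return hidden - strength * projection


if __name__ == "__main__":
    torch.manual_seed(0)
    harmful = torch.randn(32, 64) + 0.5   # toy stand-ins for layer activations
    benign = torch.randn(32, 64) - 0.5
    d = unsafe_direction(harmful, benign)
    h = torch.randn(4, 64)
    print("component along unsafe direction before:", (h @ d).abs().mean().item())
    print("component along unsafe direction after: ", (bend(h, d) @ d).abs().mean().item())

What the toy example illustrates is the design choice the paper emphasizes: the intervention acts on hidden states rather than on the text an output filter would see.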

Led by Ashkan Yousefpour (AIM Intelligence, Seoul National University, Yonsei University) and Taeheon Kim (Seoul National University), the work highlights how alignment can be achieved not just through surface-level prompting, but by transforming a model's internal logic itself.

"We're aligning the model at its 'brain' level—where it forms its thoughts—not just a filter on the words it speaks," said Ashkan Yousefpor, Chief Scientist of AIM Intelligence.

[Main figure: https://images.newsfilecorp.com/files/7768/254234_4d72868bb4c34ba7_003.jpg]

3. Agentic Jailbreaking (ACL 2025 Industry Track)

"sudo rm -rf agentic_security"

The third study debuts SUDO, a real-world attack framework targeting computer-use LLM agents. Using a detox-to-retox approach (DETOX2TOX), it bypasses refusal filters, rewrites toxic requests into harmless-looking plans, and executes them via VLM-LLM integration.

In live desktop and web environments, SUDO succeeded in tasks like adding bomb-making ingredients to shopping carts and generating sexually explicit images using a vision language model.

Led by Sejin Lee and Jian Kim (Yonsei University) and Haon Park (CTO of AIM Intelligence), all affiliated with AIM Intelligence, the paper reveals a future in which LLM agents can act, and attack, autonomously.

"They clicked. They executed. No human needed," said Haon Park, CTO of AIM Intelligence.

[Main figure: https://images.newsfilecorp.com/files/7768/254234_4d72868bb4c34ba7_004.jpg]

[Photo: AIM Intelligence joint research team. From left: Haon Park (Seoul National University), Sejin Lee (Yonsei University), Jian Kim (Yonsei University). https://images.newsfilecorp.com/files/7768/254234_4d72868bb4c34ba7_005.jpg]

Why It Matters

These findings paint a chilling picture: today's LLM safety protocols can be bypassed not only with clever prompts, but with subtle, representation-level manipulations—and when embedded in agentic systems, these models can become operational attack surfaces.

The implications span:

  • Computer-use AI agents that can carry out real illegal and dangerous actions
  • Multimodal exploits across images, text, and software interfaces
  • Novel attack pathways that compromise models through both hyper-efficient prompts and deep internal subversion

"This is a new era of AI red teaming. Our work exposes complex, real-world dangers—threats far beyond text—already impacting new systems and industries," said AIM Intelligence CEO Sangyoon Yu.

Open Tools for the Community

AIM Intelligence has publicly released both REPBEND and SUDO to support open research and real-world defense.

  • REPBEND (GitHub) offers training and evaluation tools for representation-level alignment using LoRA fine-tuning (a brief loading sketch follows this list).
  • SUDO (GitHub) includes a 50-task agentic attack dataset, the DETOX2TOX framework, and an evaluation suite for testing desktop/web-based AI agents.
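Because the REPBEND release is described as LoRA-based, here is a minimal sketch of how such an adapter might be loaded with Hugging Face PEFT. The base-model name and adapter path are placeholders, and the repository's actual checkpoints and entry points may differ; consult the REPBEND GitHub for the supported workflow.

# Hypothetical sketch of loading a LoRA safety adapter with Hugging Face PEFT.
# The base model name and adapter path below are placeholders, not the
# repository's actual artifacts.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "meta-llama/Llama-2-7b-hf"        # placeholder base model
ADAPTER_PATH = "path/to/repbend-lora-adapter"  # placeholder adapter location

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
model = PeftModel.from_pretrained(model, ADAPTER_PATH)  # attach LoRA weights
model.eval()

prompt = "Explain why strong passwords matter."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))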

These tools help turn frontier AI vulnerabilities into testable, fixable problems—available now for red-teamers, developers, and researchers worldwide.

###

About AIM Intelligence

Founded in 2024, AIM Intelligence is a deep-tech AI safety company developing red-teaming methodologies and scalable defenses for large-scale language, vision, and agentic models. Its research spans adversarial benchmarking, LLM alignment, multimodal jailbreaks, and agentic system security.

Media Contact

Sangyoon Yu | Co-Founder & CEO, AIM Intelligence

Email: [email protected]

Website: https://aim-intelligence.com/en

Contact Form: https://aim-intelligence.com/en/contact


To view the source version of this press release, please visit https://www.newsfilecorp.com/release/254234

