Guanyu Hou

E-mail / CV (English) / CV (Chinese) / GitHub / Google Scholar

Hello, I'm Guanyu Hou. I am currently a Master of Science in Artificial Intelligence student at The University of Manchester. My research focuses on AI safety and security. To date, I have seven publications. I have also served as a reviewer for AAAI 2026 and am eager to explore this field further.


Research

I am interested in trustworthy and safe AI. I have participated in research on the security of large models, including large language models (LLMs), vision-language models (VLMs), large audio-language models (LALMs), and text-to-image (T2I) models.


  • EMNLP
    Evaluating Robustness of Large Audio Language Models to Audio Injection: An Empirical Study
    Large Audio-Language Models (LALMs) are increasingly deployed in real-world applications, yet their robustness against malicious audio injection attacks remains underexplored. This study systematically evaluates five leading LALMs across four attack scenarios......

  • MM Asia
    When Hallucinated Concepts Cross Modals: Unveiling Backdoor Vulnerability in Multi-modal In-context Learning
    Due to the remarkable multi-modal capabilities of multi-modal large language models (MLLMs), multi-modal in-context learning (M-ICL) has garnered widespread attention for quickly adapting MLLMs to downstream tasks. However, the vulnerability of M-ICL to attacks remains largely unexplored......

  • Watch Out for Your Guidance on Generation! Exploring Conditional Backdoor Attacks against Large Language Models
    To enhance the stealthiness of backdoor activation, we present a new poisoning paradigm against LLMs that is triggered by specified generation conditions, which are strategies commonly adopted by users during model inference. The poisoned model behaves normally under normal/other generation conditions, but becomes harmful under the target generation conditions. To achieve this objective, we introduce BrieFool, an efficient attack framework......

  • PRESS: Defending Privacy in Retrieval-Augmented Generation via Embedding Space Shifting
    Retrieval-augmented generation (RAG) systems are exposed to substantial privacy risks during the information retrieval process, leading to potential leakage of private information. In this work, we present Privacy-preserving Retrieval-augmented generation via Embedding Space Shifting (PRESS), systematically exploring how to protect privacy in RAG systems......

  • ICME
    Weaponizing Tokens: Backdooring Text-to-Image Generation via Token Remapping
    In this work, we first investigate backdoor attacks against text-to-image generation by manipulating the text tokenizer. Our backdoor attack exploits the semantic conditioning role of the text tokenizer in text-to-image generation. We propose an Automatized Remapping Framework with Optimized Tokens (AROT) for finding the best target tokens to which the trigger token is remapped in the mapping space......

  • Data Stealing Attacks against Large Language Models via Backdooring
    Large language models (LLMs) have gained immense attention and are being increasingly applied in various domains. However, this technological leap forward poses serious security and privacy concerns. This paper explores a novel approach to data stealing attacks by introducing an adaptive method to extract private training data from pre-trained LLMs via backdooring......

  • Embedding Based Sensitive Element Injection against Text-to-Image Generative Models
    Text-to-image techniques are becoming increasingly popular among researchers and the public. Unfortunately, we found that these techniques carry certain security issues. In our work, we explore a novel attack paradigm for text-to-image scenarios: our attack uses target embeddings to manipulate user embeddings so as to generate malicious images......