Guanyu Hou

E-mail / CV / GitHub / Google Scholar

Hello, I'm Guanyu Hou, an undergraduate student majoring in Software Engineering at Oxford Brookes College, Chengdu University of Technology, and I expect to graduate in July 2025. I have been involved in AI-related research since 2024 and have so far completed three works in the field of AI safety and security. I hope to pursue further research in this area in the future.


Research

I am interested in trustworthy and safe AI. Under the guidance of several scholars, including Dr. Rang Zhou, I have participated in research on backdoor attacks against large language models and element injection attacks against text-to-image models, and have gained some results and experience along the way.


  • Watch Out for Your Guidance on Generation! Exploring Conditional Backdoor Attacks against Large Language Models
    To enhance the stealthiness of backdoor activation, we present a new poisoning paradigm against LLMs that is triggered by specified generation conditions, which are strategies commonly adopted by users during model inference. The poisoned model behaves normally under ordinary generation conditions but becomes harmful under the target generation conditions. To achieve this objective, we introduce BrieFool, an efficient attack framework...

  • PRESS: Defending Privacy in Retrieval-Augmented Generation via Embedding Space Shifting
    Retrieval-augmented generation (RAG) systems are exposed to substantial privacy risks during the information retrieval process, which can lead to leakage of private information. In this work, we present PRESS (Privacy-preserving Retrieval-augmented generation via Embedding Space Shifting) and systematically explore how to protect privacy in RAG systems...

  • Weaponizing Tokens: Backdooring Text-to-Image Generation via Token Remapping (ICME)
    In this work, we investigate backdoor attacks against text-to-image generation that manipulate the text tokenizer. Our backdoor attack exploits the semantic conditioning role of the text tokenizer in text-to-image generation. We propose an Automatized Remapping Framework with Optimized Tokens (AROT) for finding the best target tokens to which the trigger token is remapped in the mapping space...

  • Data Stealing Attacks against Large Language Models via Backdooring
    Large language models (LLMs) have gained immense attention and are increasingly applied across various domains. However, this technological leap forward poses serious security and privacy concerns. This paper explores a novel approach to data stealing attacks, introducing an adaptive method to extract private training data from pre-trained LLMs via backdooring...

  • Embedding Based Sensitive Element Injection against Text-to-Image Generative Models
    Text-to-image generation is becoming increasingly popular among researchers and the public. Unfortunately, we find that this technique has certain security issues. In our work, we explore a novel attack paradigm for text-to-image scenarios: our attack uses target embeddings to manipulate the user's embeddings so that malicious images are generated... (a conceptual sketch of this kind of embedding manipulation appears below)
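
For readers unfamiliar with embedding-space manipulation in text-to-image models, the Python sketch below is a minimal, purely illustrative example of the general idea: a user's prompt embedding is blended with an attacker-chosen target embedding before it conditions generation. It is not the method from any of the papers above; the model id, the prompts, and the mixing weight alpha are placeholder assumptions, and it assumes the Hugging Face diffusers library and a GPU.

# Purely illustrative sketch (not any paper's actual method): blend a user's
# prompt embedding with an attacker-chosen target embedding before generation.
# Model id, prompts, and alpha are placeholder assumptions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

def encode(prompt):
    # Encode a prompt into CLIP text-encoder hidden states of shape (1, 77, dim).
    ids = pipe.tokenizer(
        prompt,
        padding="max_length",
        max_length=pipe.tokenizer.model_max_length,
        truncation=True,
        return_tensors="pt",
    ).input_ids.to(pipe.device)
    with torch.no_grad():
        return pipe.text_encoder(ids)[0]

user_emb = encode("a photo of a cat on a sofa")    # what the user asked for
target_emb = encode("a photo of a dog on a sofa")  # attacker-chosen content
alpha = 0.5                                        # hypothetical mixing weight
mixed_emb = (1 - alpha) * user_emb + alpha * target_emb

# Generation is conditioned on the shifted embedding instead of the raw prompt.
image = pipe(prompt_embeds=mixed_emb).images[0]
image.save("mixed.png")

With alpha = 0 this reduces to ordinary generation from the user's prompt; larger values shift the output toward the attacker-chosen content while the user's original prompt text never changes.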