profile

Guanyu Hou

E-mail / CV  / GitHub / Google Scholar

Hello, I'm Guanyu Hou, an undergraduate student majoring in Software Engineering at Oxford Brookes College, Chengdu University of Technology and expected to graduate in July 2025. I have been involved in AI-related research since 2024. So far, I have completed three works in the field of AI safety and security. In the future, I hope to be exposed to more research in this field.


Research

I am interested in the field of trustworthy and safe AI. Under the guidance of various scholars, including Dr. Rang Zhou, I have participated in some research on backdoor attacks on large language models, and element injection attacks on Text-to-image models, and have gained some achievements and experience.


  • data_stealing
    Data Stealing Attacks against Large Language Models via Backdooring
    Large language models (LLMs) have gained immense attention and are being increasingly applied in various domains. However, this technological leap forward poses serious security and privacy concerns. This paper explores a novel approach to data stealing attacks by introducing an adaptive method to extract private training data from pre-trained LLMs via backdooring......
    Click to read more

  • element_injection
    Embedding Based Sensitive Element Injection against Text-to-Image Generative Models
    Text-to-image technique is becoming increasingly popular among researchers and the public. Unfortunately, we found that text-to-image technique has certain security issues. In our work, we explore a novel attack paradigm for the text-to-image scenarios. By our attack, we will use target embeddings to manipulate the user embeddings to generate malicious images......
    Click to read more

  • talk_too_much
    Watch Out for Your Guidance on Generation! Exploring Conditional Backdoor Attacks against Large Language Models
    To enhance the stealthiness of backdoor activation, we present a new poisoning paradigm against LLMs triggered by specifying generation conditions, which are commonly adopted strategies by users during model inference. The poisoned model performs normally for output under normal/other generation conditions, while becomes harmful for output under target generation conditions. To achieve this objective, we introduce BrieFool, an efficient attack framework......
    Click to read more