Building an LLM Agent to empower Malware

Type: MA
Status: Open
Published: 31 March 2026
Supervisors: Francisco Enguix, Alberto Huertas
Email: enguix@ifi.uzh.ch, alberto.huertas@um.es

In autonomous reinforcement learning (RL), high-level decisions are often not sufficient on their own. Once a promising direction has been identified, procedures may need to be adapted to the current context, previous outcomes, and constraints of the environment.

This thesis investigates how a large language model (LLM) agent can support context-aware adaptation of procedures (RL actions) in a Cybersecurity Offensive AI system. The goal is to study how procedural variants of scripts can be generated or refined in a structured way so that they better match the current experimental situation. The thesis is part of a broader Cybersecurity Offensive AI research line at the intersection of intelligent agents, multi-agent systems, reinforcement learning, large language models, and controlled cyber experimentation. Strong results may contribute to a scientific publication.
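To make the idea of context-aware procedure refinement concrete, the sketch below shows one possible shape for it. Everything here is an illustrative assumption, not part of the thesis specification: the `Context` record, the `BASE_SCAN` template, and the `refine_procedure` function are hypothetical names, and a small rule-based stub stands in for a real LLM call so the example stays self-contained.

```python
from dataclasses import dataclass

@dataclass
class Context:
    """Experimental context observed before refining a procedure."""
    target_os: str
    last_outcome: str  # e.g. "success", "timeout", "blocked"

# Hypothetical base procedure: an RL action realised as a script template.
BASE_SCAN = "nmap -sS {host}"

def refine_procedure(template: str, ctx: Context, llm=None) -> str:
    """Ask an LLM (stubbed here) for a context-aware variant of a procedure.

    `llm` is any callable mapping a prompt string to a refined procedure
    string; the default stub mimics the kind of adaptation an LLM agent
    might propose, keyed on the previous outcome.
    """
    if llm is None:
        def llm(prompt: str) -> str:
            if "blocked" in prompt:
                # Evade detection: slower timing, fragmented packets.
                return template.replace("-sS", "-sS -T2 -f")
            if "timeout" in prompt:
                # Fail faster on unresponsive hosts.
                return template.replace("-sS", "-sS --max-retries 1")
            return template  # nothing to adapt
    prompt = (
        f"Procedure: {template}\n"
        f"Context: os={ctx.target_os}, last_outcome={ctx.last_outcome}\n"
        "Propose an adapted variant of the procedure."
    )
    return llm(prompt)

variant = refine_procedure(BASE_SCAN, Context("linux", "blocked"))
print(variant)  # → nmap -sS -T2 -f {host}
```

In a full system the stub would be replaced by a real model call, and the refined procedure would be validated in a controlled environment before execution; the structured prompt (procedure plus context) is the piece the thesis would study.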

Sources:

[1]    X. Shen et al., ‘PentestAgent: Incorporating LLM Agents to Automated Penetration Testing’, in Proceedings of the 20th ACM Asia Conference on Computer and Communications Security, 2025, pp. 375–391.
[2]    J. Palanca, A. Terrasa, V. Julian, and C. Carrascosa, ‘SPADE 3: Supporting the New Generation of Multi-Agent Systems’, IEEE Access, vol. 8, pp. 182537–182549, 2020.
[3]    H. van Hasselt, A. Guez, and D. Silver, ‘Deep reinforcement learning with double Q-Learning’, in Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016, pp. 2094–2100.

Prerequisites

  • Good Python programming skills.
  • Prior coursework or experience in machine learning and NLP.
  • Familiarity with large language model (LLM) workflows.
  • Basic understanding of reinforcement learning (RL).
  • Comfort with experimentation and result interpretation.