- When LLMs Meet Cybersecurity: A Systematic Literature Review
- 🌈 Introduction
- 🚩Features
- 🌟 Literatures
- 📖BibTeX
We are excited to present "When LLMs Meet Cybersecurity: A Systematic Literature Review," a comprehensive overview of LLM applications in cybersecurity.
We seek to address three key questions:
- RQ1: How can cybersecurity-oriented domain LLMs be constructed?
- RQ2: What are the potential applications of LLMs in cybersecurity?
- RQ3: What are the existing challenges and future research directions for applying LLMs in cybersecurity?
Our study analyzes over 180 works, spanning 25 LLMs and more than 10 downstream scenarios.
-
CyberMetric: A Benchmark Dataset for Evaluating Large Language Models Knowledge in Cybersecurity [paper] 2024.02.12
-
SecEval: A Comprehensive Benchmark for Evaluating Cybersecurity Knowledge of Foundation Models [paper] 2023
-
SecQA: A Concise Question-Answering Dataset for Evaluating Large Language Models in Computer Security [paper](https://arxiv.org/abs/2312.15838v1) 2023.12.26
-
SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques [paper] 2022.11.09
-
Can LLMs Patch Security Issues? [paper] 2024.02.19
-
DebugBench: Evaluating Debugging Capability of Large Language Models [paper] 2024.01.11
-
An Empirical Study of NetOps Capability of Pre-Trained Large Language Models [paper] 2023.09.19
-
OpsEval: A Comprehensive IT Operations Benchmark Suite for Large Language Models [paper] 2024.02.16
-
Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models [paper] 2023.12.07
-
LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations [paper] 2023.03.16
-
Finetuning Large Language Models for Vulnerability Detection [paper] 2024.02.29
-
SecureFalcon: The Next Cyber Reasoning System for Cyber Security [paper] 2023.07.13
-
Large Language Models for Test-Free Fault Localization [paper] 2023.10.03
-
RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair [paper] 2024.03.11
-
Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding [paper] 2023.10.06
-
Instruction Tuning for Secure Code Generation [paper] 2024.02.14
-
Nova+: Generative Language Models for Binaries [paper] 2023.11.27
-
Owl: A Large Language Model for IT Operations [paper] 2023.09.17
-
HackMentor: Fine-tuning Large Language Models for Cybersecurity [paper] 2023.09
-
LOCALINTEL: Generating Organizational Threat Intelligence from Global and Local Cyber Knowledge [paper] 2024.01.18
-
AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation [paper] 2023.10.04
-
On the Uses of Large Language Models to Interpret Ambiguous Cyberattack Descriptions [paper] 2023.08.22
-
Advancing TTP Analysis: Harnessing the Power of Encoder-Only and Decoder-Only Language Models with Retrieval Augmented Generation [paper] 2024.01.12
-
An Empirical Study on Using Large Language Models to Analyze Software Supply Chain Security Failures [paper] 2023.08.09
-
ChatGPT, Llama, can you write my report? An experiment on assisted digital forensics reports written using (Local) Large Language Models [paper] 2023.12.22
-
Time for aCTIon: Automated Analysis of Cyber Threat Intelligence in the Wild [paper] 2023.07.14
-
Cupid: Leveraging ChatGPT for More Accurate Duplicate Bug Report Detection [paper] 2023.08.27
-
HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion [paper] 2023.12.21
-
Cyber Sentinel: Exploring Conversational Agents in Streamlining Security Tasks with GPT-4 [paper] 2023.09.28
-
Evaluation of LLM Chatbots for OSINT-based Cyber Threat Awareness [paper] 2024.03.13
-
Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models [paper] 2024.03.01
-
Augmenting Greybox Fuzzing with Generative AI [paper] 2023.06.11
-
How well does LLM generate security tests? [paper] 2023.10.03
-
Fuzz4All: Universal Fuzzing with Large Language Models [paper] 2024.01.15
-
CODAMOSA: Escaping Coverage Plateaus in Test Generation with Pre-trained Large Language Models [paper] 2023.07.26
-
Understanding Large Language Model Based Fuzz Driver Generation [paper] 2023.07.24
-
Large Language Models Are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models [paper] 2023.06.07
-
Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT [paper] 2023.04.04
-
Large Language Model Guided Protocol Fuzzing [paper] 2024.02.26
-
Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing [paper] 2024.03.06
-
Evaluation of ChatGPT Model for Vulnerability Detection [paper] 2023.04.12
-
Detecting software vulnerabilities using Language Models [paper] 2023.02.23
-
Software Vulnerability Detection using Large Language Models [paper] 2023.09.02
-
Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities [paper] 2023.11.16
-
Software Vulnerability and Functionality Assessment using LLMs [paper] 2024.03.13
-
Finetuning Large Language Models for Vulnerability Detection [paper] 2024.03.01
-
The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models [paper] 2023.11.15
-
DefectHunter: A Novel LLM-Driven Boosted-Conformer-based Code Vulnerability Detection Mechanism [paper] 2023.09.27
-
Prompt-Enhanced Software Vulnerability Detection Using ChatGPT [paper] 2023.08.24
-
Using ChatGPT as a Static Application Security Testing Tool [paper] 2023.08.28
-
LLbezpeky: Leveraging Large Language Models for Vulnerability Detection [paper] 2024.01.13
-
Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives [paper] 2023.10.16
-
Software Vulnerability Detection with GPT and In-Context Learning [paper] 2024.01.08
-
GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis [paper] 2023.12.25
-
VulLibGen: Identifying Vulnerable Third-Party Libraries via Generative Pre-Trained Model [paper] 2023.08.09
-
LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning [paper] 2024.01.29
-
Large Language Models for Test-Free Fault Localization [paper] 2023.10.03
-
Multi-role Consensus through LLMs Discussions for Vulnerability Detection [paper] 2024.03.21
-
How ChatGPT is Solving Vulnerability Management Problem [paper] 2023.11.11
-
DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection [paper] 2023.08.09
-
The FormAI Dataset: Generative AI in Software Security through the Lens of Formal Verification [paper] 2023.09.02
-
How Far Have We Gone in Vulnerability Detection Using Large Language Models [paper] 2023.12.22
-
Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants [paper] 2023.02.27
-
Bugs in Large Language Models Generated Code [paper] 2024.03.18
-
Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions [paper] 2021.12.16
-
The Effectiveness of Large Language Models (ChatGPT and CodeBERT) for Security-Oriented Code Analysis [paper] 2023.08.29
-
No Need to Lift a Finger Anymore? Assessing the Quality of Code Generation by ChatGPT [paper] 2023.08.09
-
Generate and Pray: Using SALLMS to Evaluate the Security of LLM Generated Code [paper] 2023.11.01
-
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation [paper] 2023.10.30
-
Can Large Language Models Identify And Reason About Security Vulnerabilities? Not Yet [paper] 2023.12.19
-
A Comparative Study of Code Generation using ChatGPT 3.5 across 10 Programming Languages [paper] 2023.08.08
-
How Secure is Code Generated by ChatGPT? [paper] 2023.04.19
-
Large Language Models for Code: Security Hardening and Adversarial Testing [paper] 2023.09.29
-
Pop Quiz! Can a Large Language Model Help With Reverse Engineering? [paper] 2022.02.02
-
LLM4Decompile: Decompiling Binary Code with Large Language Models [paper] 2024.03.08
-
Large Language Models for Code Analysis: Do LLMs Really Do Their Job? [paper] 2024.03.05
-
Understanding Programs by Exploiting (Fuzzing) Test Cases [paper] 2023.01.12
-
Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [paper] 2023.08.07
-
Prompt Engineering-assisted Malware Dynamic Analysis Using GPT-4 [paper] 2023.12.13
-
Using ChatGPT to Analyze Ransomware Messages and to Predict Ransomware Threats [paper] 2023.11.21
-
Shifting the Lens: Detecting Malware in npm Ecosystem with Large Language Models [paper] 2024.03.18
-
DebugBench: Evaluating Debugging Capability of Large Language Models [paper] 2024.01.11
-
Make LLM a Testing Expert: Bringing Human-like Interaction to Mobile GUI Testing via Functionality-aware Decisions [paper] 2023.10.24
-
FLAG: Finding Line Anomalies (in code) with Generative AI [paper] 2023.07.22
-
Automatic Program Repair with OpenAI's Codex: Evaluating QuixBugs [paper] 2023.11.06
-
An Analysis of the Automatic Bug Fixing Performance of ChatGPT [paper] 2023.01.20
-
AI-powered patching: the future of automated vulnerability fixes [paper] 2024.01.31
-
Practical Program Repair in the Era of Large Pre-trained Language Models [paper] 2022.10.25
-
Security Code Review by LLMs: A Deep Dive into Responses [paper] 2024.01.29
-
Examining Zero-Shot Vulnerability Repair with Large Language Models [paper] 2022.08.15
-
How Effective Are Neural Networks for Fixing Security Vulnerabilities [paper] 2023.05.29
-
Can LLMs Patch Security Issues? [paper] 2024.02.19
-
InferFix: End-to-End Program Repair with LLMs [paper] 2023.03.13
-
ZeroLeak: Using LLMs for Scalable and Cost Effective Side-Channel Patching [paper] 2023.08.24
-
DIVAS: An LLM-based End-to-End Framework for SoC Security Analysis and Policy-based Protection [paper] 2023.08.14
-
Fixing Hardware Security Bugs with Large Language Models [paper] 2023.02.02
-
A Study of Vulnerability Repair in JavaScript Programs with Large Language Models [paper] 2023.03.19
-
Enhanced Automated Code Vulnerability Repair using Large Language Models [paper] 2024.01.08
-
Teaching Large Language Models to Self-Debug [paper] 2023.10.05
-
Better Patching Using LLM Prompting, via Self-Consistency [paper] 2023.08.16
-
Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair [paper] 2023.11.08
-
LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward [paper] 2024.02.22
-
ContrastRepair: Enhancing Conversation-Based Automated Program Repair via Contrastive Test Case Pairs [paper] 2024.03.07
-
When Large Language Models Confront Repository-Level Automatic Program Repair: How Well They Done? [paper] 2023.03.01
-
Benchmarking Large Language Models for Log Analysis, Security, and Interpretation [paper] 2023.11.24
-
Log-based Anomaly Detection based on EVT Theory with feedback [paper] 2023.09.30
-
LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection [paper] 2023.09.14
-
LogGPT: Log Anomaly Detection via GPT [paper] 2023.12.11
-
Interpretable Online Log Analysis Using Large Language Models with Prompt Strategies [paper] 2024.01.26
-
Lemur: Log Parsing with Entropy Sampling and Chain-of-Thought Merging [paper] 2024.03.02
-
Web Content Filtering through knowledge distillation of Large Language Models [paper] 2023.05.10
-
Application of Large Language Models to DDoS Attack Detection [paper] 2024.02.05
-
An Improved Transformer-based Model for Detecting Phishing, Spam, and Ham: A Large Language Model Approach [paper] 2023.11.12
-
Evaluating the Performance of ChatGPT for Spam Email Detection [paper] 2024.02.23
-
Prompted Contextual Vectors for Spear-Phishing Detection [paper] 2024.02.14
-
Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models [paper] 2023.11.30
-
Explaining Tree Model Decisions in Natural Language for Network Intrusion Detection [paper] 2023.10.30
-
Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices [paper] 2024.02.08
-
HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs) [paper] 2023.09.27
-
ChatGPT for digital forensic investigation: The good, the bad, and the unknown [paper] 2023.07.10
-
Identifying and Mitigating the Security Risks of Generative AI [paper] 2023.12.29
-
Impact of Big Data Analytics and ChatGPT on Cybersecurity [paper] 2023.05.22
-
From ChatGPT to ThreatGPT: Impact of Generative AI in Cybersecurity and Privacy [paper] 2023.07.03
-
LLMs Killed the Script Kiddie: How Agents Supported by Large Language Models Change the Landscape of Network Threat Testing [paper] 2023.10.10
-
Malla: Demystifying Real-world Large Language Model Integrated Malicious Services [paper] 2024.01.06
-
Evaluating LLMs for Privilege-Escalation Scenarios [paper] 2023.10.23
-
Using Large Language Models for Cybersecurity Capture-The-Flag Challenges and Certification Questions [paper] 2023.08.21
-
Exploring the Dark Side of AI: Advanced Phishing Attack Design and Deployment Using ChatGPT [paper] 2023.09.19
-
From Chatbots to PhishBots? - Preventing Phishing scams created using ChatGPT, Google Bard and Claude [paper] 2024.03.10
-
From Text to MITRE Techniques: Exploring the Malicious Use of Large Language Models for Generating Cyber Attack Payloads [paper] 2023.05.24
-
PentestGPT: An LLM-empowered Automatic Penetration Testing Tool [paper] 2023.08.13
-
AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks [paper] 2024.03.02
-
RatGPT: Turning online LLMs into Proxies for Malware Attacks [paper] 2023.09.07
-
Getting pwn’d by AI: Penetration Testing with Large Language Models [paper] 2023.08.17
-
An LLM-based Framework for Fingerprinting Internet-connected Devices [paper] 2023.10.24
-
Anatomy of an AI-powered malicious social botnet [paper] 2023.07.30
-
Just-in-Time Security Patch Detection -- LLM At the Rescue for Data Augmentation [paper] 2023.12.12
-
LLM for SoC Security: A Paradigm Shift [paper] 2023.10.09
-
Harnessing the Power of LLM to Support Binary Taint Analysis [paper] 2023.10.12
-
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations [paper] 2023.12.07
-
LLM in the Shell: Generative Honeypots [paper] 2024.02.09
-
Employing LLMs for Incident Response Planning and Review [paper] 2024.03.02
-
Enhancing Network Management Using Code Generated by Large Language Models [paper](https://arxiv.org/abs/2308.06261) 2023.08.11
-
Prompting Is All You Need: Automated Android Bug Replay with Large Language Models [paper] 2023.07.18
-
Is Stack Overflow Obsolete? An Empirical Study of the Characteristics of ChatGPT Answers to Stack Overflow Questions [paper] 2024.02.07
-
Cybersecurity Issues and Challenges [paper] 2022.08
-
A unified cybersecurity framework for complex environments [paper] 2018.09.26
-
LLMind: Orchestrating AI and IoT with LLM for Complex Task Execution [paper] 2024.02.20
-
Out of the Cage: How Stochastic Parrots Win in Cyber Security Environments [paper] 2023.08.28
-
LLM Agents Can Autonomously Hack Websites [paper] 2024.02.16
-
Nissist: An Incident Mitigation Copilot based on Troubleshooting Guides [paper] 2024.02.27
-
TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage [paper] 2023.11.07
-
The Rise and Potential of Large Language Model Based Agents: A Survey [paper] 2023.09.19
-
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [paper] 2023.10.03
-
From Summary to Action: Enhancing Large Language Models for Complex Tasks with Open World APIs [paper] 2024.02.28
-
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [paper] 2024.01.08
-
TaskWeaver: A Code-First Agent Framework [paper] 2023.12.01
-
Large Language Models for Networking: Applications, Enabling Techniques, and Challenges [paper] 2023.11.29
-
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents [paper] 2024.02.18
-
WIPI: A New Web Threat for LLM-Driven Web Agents [paper] 2024.02.26
-
InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents [paper] 2024.03.25
@misc{zhang2024llms,
title={When LLMs Meet Cybersecurity: A Systematic Literature Review},
author={Jie Zhang and Haoyu Bu and Hui Wen and Yu Chen and Lun Li and Hongsong Zhu},
year={2024},
eprint={2405.03644},
archivePrefix={arXiv},
primaryClass={cs.CR}
}