LLMs for CyberSecurity¶
LLMs for CyberSecurity Users and Use Cases¶
Image from Generative AI and Large Language Models for Cyber Security: All Insights You Need.
Empathy Map¶
See Original Post.
Tip
See also MITRE’s Innovation Toolkit (https://itk.mitre.org/toolkit/tools-at-a-glance/), a collection of proven and repeatable problem-solving methods to help you and your team do something different that makes a difference.
Targeted PreMortem for Trustworthy AI¶
In general, it is good practice to start with the end in mind, à la the "Destination Postcard" from the book Switch by Chip and Dan Heath, which describes the aspirational positive outcome you are working toward.
The same thinking is useful for premortems, which proactively identify potential failures so they can be avoided and the positive outcome achieved.
Quote
The Targeted Premortem (TPM) is a variant of Klein's Premortem Technique, which uses prospective hindsight to proactively identify failures. This variant targets brainstorming on reasons for losing trust in AI in the context of the sociotechnical system into which it is integrated. That is, the prompts are targeted to specific evidence-based focus areas where trust has been lost in AI. This tool comes with instructions, brainstorming prompts, and additional guidance on how to analyze the outcomes of a TPM session with users, developers, and other stakeholders.
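The MITRE tool provides the official instructions, brainstorming prompts, and analysis guidance. Purely as an illustration of how a facilitator might capture the output of a TPM session, the Python sketch below records brainstormed failure modes and groups them by focus area; the focus areas, failure modes, and system named in it are hypothetical placeholders, not content from the actual tool.

```python
# Illustrative sketch only: the focus areas, failure modes, and system below are
# hypothetical placeholders, not the actual content of MITRE's Targeted Premortem tool.
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PremortemItem:
    focus_area: str   # evidence-based area where trust in AI can be lost (hypothetical label)
    failure: str      # "imagine the deployment failed: what went wrong?"
    raised_by: str    # stakeholder role: user, developer, operator, ...


@dataclass
class TPMSession:
    system: str       # the sociotechnical system the AI is integrated into
    items: list = field(default_factory=list)

    def record(self, focus_area: str, failure: str, raised_by: str) -> None:
        self.items.append(PremortemItem(focus_area, failure, raised_by))

    def by_focus_area(self) -> dict:
        grouped = defaultdict(list)
        for item in self.items:
            grouped[item.focus_area].append(item)
        return dict(grouped)


# Hypothetical usage for an LLM-based SOC triage assistant.
session = TPMSession(system="LLM triage assistant in the SOC")
session.record("explainability", "analysts could not trace why alerts were dismissed", "user")
session.record("data quality", "the model was tuned on stale threat intel", "developer")
for area, items in session.by_focus_area().items():
    print(f"{area}: {len(items)} failure mode(s) identified")
```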
References¶
LLMs for CyberSecurity References¶
- Generative AI and Large Language Models for Cyber Security: All Insights You Need, May 2024
- A Comprehensive Review of Large Language Models in Cyber Security, September 2024
- Large Language Models in Cybersecurity: State-of-the-Art, January 2024
- How Large Language Models Are Reshaping the Cybersecurity Landscape | Global AI Symposium talk, September 2024
- Large Language Models for Cyber Security: A Systematic Literature Review, July 2024
- Using AI for Offensive Security, June 2024
Agents for CyberSecurity References¶
- Blueprint for AI Agents in Cybersecurity - Leveraging AI Agents to Evolve Cybersecurity Practices
- Building AI Agents: Lessons Learned over the past Year
Comparing LLMs¶
There are several sites that allow comparison of LLMs, e.g.:
- https://winston-bosan.github.io/llm-pareto-frontier/
- LLM Arena Pareto Frontier: Performance vs Cost
- https://artificialanalysis.ai/
- Independent analysis of AI models and API providers. Understand the AI landscape to choose the best model and provider for your use-case
- https://llmpricecheck.com/
- Compare and calculate the latest prices for LLM (Large Language Model) APIs from leading providers such as OpenAI GPT-4, Anthropic Claude, Google Gemini, Meta Llama 3, and more
- https://openrouter.ai/rankings?view=day
- Compare models used via OpenRouter
- https://github.com/vectara/hallucination-leaderboard
- LLM Hallucination Rate leaderboard
- https://lmarena.ai/?leaderboard
- Chatbot Arena is an open platform for crowdsourced AI benchmarking
- https://aider.chat/docs/leaderboards/
- Benchmark to evaluate an LLM’s ability to follow instructions and edit code successfully without human intervention
- https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro
- Benchmark to evaluate language understanding models across broader and more challenging tasks
See also Economics of LLMs: Evaluations vs Pricing, which looks at which model to use for which task.
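The sites above are convenient for browsing, but when cost drives the model choice it can help to pull pricing programmatically. The Python sketch below queries OpenRouter's public model-listing endpoint and ranks models by a rough blended price per million tokens; the endpoint URL and response field names (`data`, `pricing`, `prompt`, `completion`) are assumptions based on OpenRouter's current API and may change, so check their documentation before relying on it.

```python
# Minimal sketch: list models from OpenRouter and rank them by a blended price
# per million tokens. Field names reflect the /api/v1/models response as assumed
# at the time of writing; verify against the current OpenRouter API docs.
import requests

resp = requests.get("https://openrouter.ai/api/v1/models", timeout=30)
resp.raise_for_status()
models = resp.json().get("data", [])

rows = []
for m in models:
    pricing = m.get("pricing", {})
    try:
        # Prices are USD-per-token strings; convert to USD per million tokens.
        prompt = float(pricing.get("prompt", "0")) * 1_000_000
        completion = float(pricing.get("completion", "0")) * 1_000_000
    except (TypeError, ValueError):
        continue
    # Assume a 3:1 input:output token mix for a rough blended cost.
    blended = (3 * prompt + completion) / 4
    rows.append((blended, m.get("id", "unknown"), prompt, completion))

for blended, model_id, prompt, completion in sorted(rows)[:15]:
    print(f"{model_id:45s} in ${prompt:7.2f}/M  out ${completion:7.2f}/M  blended ${blended:7.2f}/M")
```

Combining such a price list with a capability leaderboard (e.g. LM Arena or MMLU-Pro scores) gives a quick Pareto-style view of cost versus quality for your own shortlist.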
Books¶
- Build a Large Language Model (from Scratch) by Sebastian Raschka, PhD
- LLM Engineer's Handbook by Paul Iusztin and Maxime Labonne
- AI Engineering by Chip Huyen
- Hands-On Large Language Models: Language Understanding and Generation by Jay Alammar and Maarten Grootendorst, October 2024
- Building LLMs for Production: Enhancing LLM Abilities and Reliability with Prompting, Fine-Tuning, and RAG by Louis-Francois Bouchard and Louie Peters, October 2024
- LLMs in Production: From Language Models to Successful Products by Christopher Brousseau and Matthew Sharp, December 2024
- Fundamentals of Secure AI Systems with Personal Data by Enrico Glerean, June 2025