A simple local LLM security project that simulates prompt injection attacks and evaluates how well a defense system can block them.
Built using local models via Ollama — no external APIs required.
This system:
- Simulates prompt injection attacks (e.g., "ignore previous instructions")
- Sends them to a local LLM
- Applies a defense filter
- Measures how many attacks are blocked
👉 In short: Attack → Defense → Result → Evaluation
- Python
- FastAPI (backend)
- Streamlit (dashboard)
- Ollama (local LLM runtime)
Download and install Ollama.
Then run:
ollama pull llama3
ollama serveKeep this running.
git clone https://github.com/Sahojit/Prompt-Injection-Attack-Simulator.git
cd Prompt-Injection-Attack-Simulatorpython -m venv .venv
source .venv/bin/activate # Mac/Linux
# Windows:
.venv\Scripts\activatepip install -r requirements_simple.txtuvicorn src.api.main:app --reloadstreamlit run dashboard/app.py- API Docs → http://localhost:8000/docs
- Dashboard → http://localhost:8501
👉 Open the dashboard in browser to see:
- attacks
- blocked vs successful
- evaluation results
Each teammate should:
-
Install Ollama
-
Pull model:
ollama pull llama3
-
Clone repo
-
Install requirements
-
Run backend + dashboard
- Ollama must be running locally
- Ports 8000 and 8501 should be free
- System sends attack prompt
- Defense checks it
- If safe → goes to LLM
- If malicious → blocked
- Results are logged and shown on dashboard
This project demonstrates:
- How prompt injection attacks work
- How simple defenses can reduce risk
- How to measure LLM security
- Uses simple rule-based defense
- Not fully secure (for learning/demo purposes)
- Can be extended with ML-based detection
- Add ML-based classifier
- Improve detection rules
- Add more attack types
Sahojit Karmakar