Build an AI Phishing Detector

By RecOsint | Dec 6, 2025

Humans Are Too Slow

You receive 100 emails a day. You can't check every link manually.
– The Goal: Build a Machine Learning model that looks at a URL and instantly says: Safe or Phishing.
– The Tool: We will use Python and a library called scikit-learn.

1) Get the Data

AI needs examples to learn. We need a dataset containing thousands of labeled URLs:
– Legit URLs: (google.com, amazon.com)
– Phishing URLs: (secure-login-bank.xyz)
– Source: Download a free dataset from Kaggle or PhishTank.
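As a rough sketch of what loading such a dataset might look like, the snippet below parses a CSV with two columns, `url` and `label`. The column names, the label values, the `load_dataset` helper and the three inline sample rows are assumptions for illustration; a real Kaggle or PhishTank export will have its own layout.

```python
import csv
import io

# Stand-in for a downloaded file; a real dataset holds thousands of rows.
SAMPLE_CSV = """url,label
google.com,legit
amazon.com,legit
secure-login-bank.xyz,phishing
"""

def load_dataset(handle):
    """Return (urls, labels); each label is 1 for phishing and 0 for legit."""
    urls, labels = [], []
    for row in csv.DictReader(handle):
        urls.append(row["url"].strip())
        labels.append(1 if row["label"].strip().lower() == "phishing" else 0)
    return urls, labels

urls, labels = load_dataset(io.StringIO(SAMPLE_CSV))
print(urls)
print(labels)
```

To read an actual file, pass `open("your_file.csv", newline="")` instead of the `io.StringIO` wrapper and adjust the column names to match.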

2) Extract "Features"

The computer can't read text like us. We must convert the URL into numbers (Features). Key things the AI looks for:
– Length: Is the URL suspiciously long? (>54 chars)
– Special Chars: Does it have too many @ or - symbols?
– "https": Is the token missing?
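One possible way to turn these checks into numbers is sketched below. The function name `extract_features` and the exact order of the features are assumptions made for this example; only the checks themselves (length, the 54-character cutoff, @ and - counts, and the presence of "https") come from the list above.

```python
def extract_features(url):
    """Turn a URL string into a list of numbers a model can consume."""
    return [
        len(url),                                # total length
        1 if len(url) > 54 else 0,               # 1 when longer than 54 chars
        url.count("@"),                          # number of @ symbols
        url.count("-"),                          # number of hyphens
        1 if url.startswith("https://") else 0,  # 1 when the URL uses https
    ]

print(extract_features("https://www.google.com"))
print(extract_features("http://secure-login-bank.xyz"))
```

Every URL maps to a list of the same length, which is what lets many URLs be stacked into one table of features for training.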

Visualizing the Trap

– Phishing: paypal-secure-login-update.com (3 hyphens, 1 "login" keyword).
– Legit: paypal.com (0 hyphens).
The AI learns this pattern: More Hyphens + "Login" keyword = High Danger.
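The two signals in this comparison can be computed directly from the domain string. This is a minimal sketch; the helper name `trap_signals` is made up for the example.

```python
def trap_signals(domain):
    """Count hyphens and check for the "login" keyword in a domain name."""
    return {
        "hyphens": domain.count("-"),
        "has_login": "login" in domain.lower(),
    }

for domain in ("paypal-secure-login-update.com", "paypal.com"):
    print(domain, trap_signals(domain))
```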

3) Train the Model

We feed these numbers into an algorithm like Random Forest.
– The Code Logic: model.fit(features, labels)
(Translation: "Hey Computer, look at these patterns and learn the difference.")
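Here is a small sketch of that `fit` call using scikit-learn's `RandomForestClassifier`. The six feature rows are made-up values chosen only to show the shape of the input (one row per URL, one column per feature); a real model would be trained on thousands of rows. The choice of 50 trees and the fixed `random_state` are also assumptions.

```python
from sklearn.ensemble import RandomForestClassifier

# Each row: [url_length, hyphen_count, has_login_keyword]. Made-up values.
features = [
    [10, 0, 0],
    [10, 0, 0],
    [14, 0, 0],
    [30, 3, 1],
    [21, 2, 1],
    [26, 2, 0],
]
labels = [0, 0, 0, 1, 1, 1]  # 0 = legit, 1 = phishing

# random_state fixes the randomness so repeated runs build the same forest.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(features, labels)

print(model.classes_)                 # the label values the model learned
print(model.score(features, labels))  # accuracy on the training rows themselves
```

Accuracy measured on the training rows says little about how the model handles new URLs; that is what a separate test step is for.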

4) Test It

Now, give it a URL it has never seen before.
– Input: netflix-payment-verify.net
– AI Prediction: PHISHING (98% Confidence)
– Success: You just built a cyber defense tool.
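The sketch below shows how such a prediction and its confidence could be produced with scikit-learn's `predict` and `predict_proba`. It trains on only the five example domains mentioned in this article, using a hypothetical `featurize` helper with two features (length and hyphen count), so the probability it prints is a toy number and should not be expected to match the figure above.

```python
from sklearn.ensemble import RandomForestClassifier

def featurize(url):
    """Hypothetical helper: [length, hyphen count]."""
    return [len(url), url.count("-")]

# Toy training set built from the example domains in this article.
train_urls = [
    "google.com",
    "amazon.com",
    "paypal.com",
    "secure-login-bank.xyz",
    "paypal-secure-login-update.com",
]
train_labels = [0, 0, 0, 1, 1]  # 0 = legit, 1 = phishing

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit([featurize(u) for u in train_urls], train_labels)

unseen = "netflix-payment-verify.net"  # not part of the training set
prediction = model.predict([featurize(unseen)])[0]
# predict_proba columns follow model.classes_, so index 1 is the phishing class.
probabilities = model.predict_proba([featurize(unseen)])[0]
phishing_probability = probabilities[1]

print(unseen, "->", "PHISHING" if prediction == 1 else "SAFE")
print(f"phishing probability: {phishing_probability:.2f}")
```

For a Random Forest, this probability is the average of the individual trees' predictions, so it reflects agreement among trees rather than a calibrated real-world likelihood.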

Next Level

This is just the beginning. Real-world tools also check:
– Domain Age: (Created yesterday? Suspicious).
– Hosting Country: (Hosted in a high-risk region?).
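As an illustration of the domain-age idea, the sketch below turns a registration date into an extra feature. It assumes the creation date has already been obtained elsewhere (for instance from a WHOIS lookup, which is not shown), and the 30-day cutoff and both function names are arbitrary choices for the example.

```python
from datetime import date

def domain_age_days(created_on, today):
    """Days between a domain's registration date and `today`."""
    return (today - created_on).days

def is_suspiciously_new(created_on, today, min_age_days=30):
    """Flag domains younger than `min_age_days` (the threshold is an assumption)."""
    return domain_age_days(created_on, today) < min_age_days

today = date(2025, 12, 6)
print(is_suspiciously_new(date(2025, 12, 5), today))  # registered the day before
print(is_suspiciously_new(date(2010, 1, 1), today))   # long-established domain
```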