How to Design a Spam Filter for Gmail
This is one of the trending topics in interview questions.
I have read a few blogs but did not find much useful information for answering it. In reality, many lengthy algorithms run behind such a filter, and they are hard to explain in an interview. However, the interviewer does not expect the candidate to know those algorithms either.
Here I try to work out how a developer might approach the problem, using information gathered from a couple of blogs.
Types:
— Phishing (online fraud that tricks the victim into revealing sensitive details by posing as a trustworthy party)
— Spam
— Hijack
— Clickjacking (hidden hyperlinks placed beneath legitimate clickable content)
Filters:
Text Filters
Blacklisted domains/senders
Spoofed email addresses: for example, the Greek character “Σ” used in place of the Latin character “E”
Unconfirmed sender
Messages already marked as spam and by how many people
Message content is empty
User already tried to unsubscribe
Community feedback
Language difference
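To make the list above easier to talk through in an interview, here is a minimal sketch of how several of these signals could be combined into one score. Everything in it is an assumption for illustration: the `Message` fields, the blacklist and phrase lists, the weights, and the threshold are all made up, and this is not how Gmail actually works. The spoofing check is a rough heuristic that flags a sender address mixing letters from more than one script (for example Latin and Greek). Community feedback and language difference are left out to keep it short.

```python
import unicodedata
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical message model; field names are illustrative only.
    sender: str
    body: str
    spam_reports: int = 0          # how many users already marked it as spam
    sender_confirmed: bool = True  # sender passed some verification step
    unsubscribed: bool = False     # recipient already tried to unsubscribe


# Made-up sample data; a real system would load these from a maintained store.
BLACKLISTED_DOMAINS = {"spam.example"}
SPAM_PHRASES = ("free money", "click here", "winner")
SPAM_THRESHOLD = 5  # arbitrary cut-off chosen for this sketch


def scripts_used(text: str) -> set:
    """Approximate the scripts in text via the first word of each letter's Unicode name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])  # e.g. "LATIN", "GREEK"
    return scripts


def looks_spoofed(sender: str) -> bool:
    """Heuristic: an address that mixes scripts may be imitating another one."""
    return len(scripts_used(sender)) > 1


def spam_score(msg: Message) -> int:
    """Add up arbitrary weights for each signal that fires."""
    score = 0
    domain = msg.sender.rsplit("@", 1)[-1].lower()
    if domain in BLACKLISTED_DOMAINS:
        score += 5
    if looks_spoofed(msg.sender):
        score += 4
    if not msg.sender_confirmed:
        score += 1
    if msg.spam_reports >= 10:
        score += 3
    if not msg.body.strip():
        score += 2
    if msg.unsubscribed:
        score += 2
    body = msg.body.lower()
    score += sum(1 for phrase in SPAM_PHRASES if phrase in body)  # text filter
    return score


def is_spam(msg: Message) -> bool:
    return spam_score(msg) >= SPAM_THRESHOLD


if __name__ == "__main__":
    for m in (
        Message("alice@example.com", "Lunch tomorrow?"),
        Message("promo@spam.example", "Click here for FREE MONEY"),
        Message("support@googlΣ.com", "Please verify your account"),
    ):
        print(m.sender, spam_score(m), is_spam(m))
```

A simple weighted sum like this is easy to explain and extend with one rule at a time. In practice, the hand-picked weights would more likely be replaced by a model trained on user feedback.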
There are many other points that could be added here, so please feel free to leave your comments. I am going to update this post with the HLD and LLD of this spam filter. Please share any useful links on the same topic.