What the Worst Passwords Really Say
Spoiler: a new ranking of the worst passwords, and always the same ones at the top of the list. Do users never learn? What if it was all just buzz? In fact, these lists are not representative at all, but since they confirm our prejudices and serve the marketing narrative, why deprive yourself of publishing them…
As at every year end, specialized sites publish their “Top of the worst passwords of the year”. I know, it’s the end of November, we’ve only used up 90% of 2020, nothing is decided yet. But if they are slow to publish, a competitor will do it first, and that is not acceptable. And it’s cyber month, so anything goes.
An example? Here is the top 5 of the top 200 published by nordpass.com (after analysis of 275 million passwords).
| Rank | Password | Occurrences | Time to crack |
|---|---|---|---|
| 1 🔼 | 123456 | 2543285 | < 1s |
| 2 🔼 | 123456789 | 961435 | < 1s |
| 3 🆕 | picture1 | 371612 | 3 hours |
| 4 🔼 | password | 360467 | < 1s |
| 5 🔼 | 12345678 | 322187 | < 1s |
There we find the invincible 123456 (the first two rows of the numeric keypad) and its brother 123456789 (one more row for a one-third security bonus), the humorous password (the interface tells me “enter your password”, I obey) and the little picture1 (for which I have no explanation).
I might as well keep commenting on the list (yes, I know, it’s a chart)… Among the stars, iloveyou, historically number 1 in the early days of the Internet, keeps declining: it has fallen to 17th position, while its buddy falls to 171st place. qwertyuiop, the top row of the keyboard used by many of our neighbors, is 25th, well ahead of azertyuiop, which is not even ranked.
You get the idea… We could occupy our little world for a while with comments like this without ever really telling you anything useful or truly interesting (except when I am commenting, because I’m always interesting).
So, behind these comments of questionable utility, what does the publication of these lists really say?
Shame, Schadenfreude and virality
At the first level of reading, these lists shame those who still use these passwords.
“It seems that many of us are still reluctant to use strong, hard-to-crack passwords. Instead, we choose options like ‘football’, ‘iloveyou’, ‘letmein’ and ‘pokemon’.”
ZDNet.com, in its article of November 21, 2020.
These rankings are regularly and widely published, and yet they do not seem to change (we will see why later). So we can reasonably wonder why all these people don’t change their habits; after all, it’s not that complicated…
Are they that stupid? Of course not (we will see why later too).
In fact, the feeling that dominates and justifies the publication and sharing of these lists is rather Schadenfreude…
Schadenfreude: the experience of pleasure, joy, or self-satisfaction that comes from learning of or witnessing the troubles, failures, or humiliation of another.
Seeing these lists, we actually experience a small intimate happiness in noting that our passwords do not appear there. Unlike so many people (150 million according to the figures) who are there.
And that’s why these publications get shared. Schadenfreude is a positive emotion, and this kind of emotion generates engagement. So it’s easy to share, just like any other cyber news-in-brief item.
Scratch the surface a little (really not much) and we also realize that these publications are Trojan horses, pretexts for advertising. Some for their product, some for their services, some for their expertise and some for their personal branding…
Except at the arsouyes, we are not like that 🙄.
To show you how marketing-driven it is, let’s go back to the source: the 2020 nordpass page (because it has improved a bit since then). It shows a table, but if you try to copy the data to analyze it, you will struggle:
- The page does not contain a real table, but an HTML hack (div elements) which, by CSS magic, is displayed as if it were a table. If you copy this content and paste it into a text editor or a spreadsheet, you get a single rotten column.
- To make matters worse, the “rows” don’t all have the same number of “columns”, so even with a script (or a formula) you can’t split them up properly and you have to do the job by hand.
Everything is done to discourage reposting the data and to push you to share the original page or a screenshot instead.
For those who are interested, we have recovered the data, stored it all properly in a CSV, and offer it to you: top200_nordpass.csv (without even asking you to register for a mailing list…).
Hence the second trap: just below the table, we see a big, highly visible button marked “Free Download”. We tell ourselves that, after all, it’s nice of them to provide the database, and we click. Except that when it asks us where to put the file, we realize that it is an…
Rather than the data you wanted, they offer you a download of their tool, which is supposed to prevent your passwords from appearing in their list… A classic among white hats: after scaring you, they sell you a product that magically solves your problems.
And it’s all the more brazen since their tool will not prevent it: it does not secure the passwords where they are stolen, on the application servers.
It’s even worse, since by providing your passwords to a third party you expose them to theft from that third party, which can itself be hacked (e.g. LastPass).
To prevent these passwords from ending up in their list, they must have been stored securely by the developers.
What about developers?
What a great transition!?
Now that we’re done mocking victims and messengers, we can turn our gaze to the culprits: the developers.
After all, how can these passwords, which are supposed to be super-protected by the apps that store them, end up in cleartext on the internet? Yet they had promised us, in their privacy policy, that they would guard them like the apple of their eye.
A database leak is one of those things that happen. It’s ugly, and the system and network administrators might have been able to do something about it (although if the leak comes from an insider or a bug, there is not much they can do).
When we store users’ passwords, we therefore assume that they will end up on the net, which avoids unpleasant surprises. It’s called doing defense in depth and we even wrote an article about it (storing user passwords):
- We hash passwords to prevent them from being readable,
- We add a salt to them to avoid dictionary attacks,
- We use specific functions to slow down brute force.
It’s very simple to set up since web languages offer functions made for that (e.g. password_verify() in PHP).
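To make these three steps concrete, here is a minimal sketch in Python (rather than PHP), using only the standard library. The names hash_password, verify_password and ITERATIONS are ours, and PBKDF2 with this iteration count is just one possible choice of slow function, picked for illustration; it is not what any particular site does.

```python
import hashlib
import hmac
import os

# Assumed work factor, for illustration only: tune it to your own hardware.
ITERATIONS = 600_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) to store for a new password."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, digest)
```

With this in place, two users who both pick 123456 get different salts and therefore different digests, so a leaked table no longer shows at a glance who shares a password.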
But if it’s so easy to secure, we can start to have doubts about these lists…
What about data?
So let’s take it a step further and take a closer look at what these lists really contain…
Spoiler: we’ll find out it’s bogus…
No reference. The authors tell us that they have analyzed 275 million passwords, from leaks in 2019 and 2020, but do not provide any reference to these famous leaks or to their content. They do not publish all the passwords found, but only an extract of the 200 most frequent.
A minor detail perhaps, but my legal expert side gets irritated when analyses are not transparent… Hiding things up your sleeve is generally not a good sign.
This top 200 represents 4% of the passwords analyzed. In other words, a drop of water in the ocean of possibilities. Moreover, they do not say that they have cracked them all, only that they have analyzed them…
Rather than thinking that a lot of people are still using weak passwords, we can also think that 96% of passwords have remained unbroken and therefore, overall, people are doing very well.
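To put figures on those percentages, here is a small back-of-the-envelope calculation in Python. It only uses numbers quoted in this article (275 million passwords, 4%, and the five counts from the table above); the variable names are ours, and the 4% is taken at face value.

```python
TOTAL_ANALYZED = 275_000_000  # passwords analyzed, according to the publication
TOP200_SHARE = 0.04           # share of the total covered by the top 200

# Occurrences of the five most frequent passwords (table above).
top5 = {
    "123456": 2_543_285,
    "123456789": 961_435,
    "picture1": 371_612,
    "password": 360_467,
    "12345678": 322_187,
}

top200_count = round(TOTAL_ANALYZED * TOP200_SHARE)
top5_count = sum(top5.values())

print(f"top 200: about {top200_count:,} passwords")
print(f"top 5: {top5_count:,} passwords, {top5_count / TOTAL_ANALYZED:.2%} of the total")
print(f"outside the top 200: about {TOTAL_ANALYZED - top200_count:,} passwords")
```

So the whole top 200 covers about 11 million passwords, the top 5 alone account for roughly 1.66% of everything analyzed, and some 264 million passwords are simply not in the chart.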
Lots of old passwords. It is normal for old first names to resurface regularly in the statistics (fashion is an eternal cycle), but passwords? letmein, or even myspace1 (80th with 26363 occurrences)… There is reason to wonder about the origin of the databases in question…
If the analysis looks at user bases from 1990 that leaked before 2020, does that really say anything about passwords created in 2020?
Only simple passwords. What strikes me about this list is that no password is really complicated. Aside from a UFO that contains random letters and numbers (which we’ll talk about later), all of them are amazingly simple.
I would expect to see more constructs that are simple but involve exotic characters, like qwerty123&1, that pass site-imposed complexity constraints while still being simple to remember. But no, not a single hash sign or exclamation mark.
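For those who want to check this kind of claim themselves, here is a small Python sketch that sorts passwords by the character classes they use. The function name char_classes and the sample list are ours: the samples are only passwords quoted in this article, plus the hypothetical qwerty123&1 (which is not in the chart). To run it on the full top 200 you would load the CSV first; we make no assumption here about its column layout.

```python
import string


def char_classes(password: str) -> set[str]:
    """Return the character classes (digit, lower, upper, special) used."""
    classes = set()
    for ch in password:
        if ch in string.digits:
            classes.add("digit")
        elif ch in string.ascii_lowercase:
            classes.add("lower")
        elif ch in string.ascii_uppercase:
            classes.add("upper")
        else:
            classes.add("special")
    return classes


samples = [
    "123456", "picture1", "password", "iloveyou",
    "qwertyuiop", "x4ivygA51F", "qwerty123&1",
]

# Passwords that would satisfy a "must contain a special character" rule.
with_special = [p for p in samples if "special" in char_classes(p)]

for p in samples:
    print(p, sorted(char_classes(p)))
print("with special characters:", with_special)
```

Among these samples, only the hypothetical qwerty123&1 contains a special character, and x4ivygA51F is the only one mixing lowercase, uppercase and digits.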
The answer could come from the “difficulty” column, which gives the time needed to crack each password and suggests that the analysis was done using a dictionary of already known passwords.
If the analysis only covers passwords already seen last year, this may be why the rankings do not change from one year to the next.
Bots? If we look at the numbers, we find other strange things…
20100728, 37th. Surely a lot of interesting things happened that day (it was a Wednesday), but why would 45000 accounts use this date when no other date appears in the list?
x4ivygA51F (the UFO we talked about). It is found in 148th position with 18267 occurrences. That it is the only one of this complexity is statistically interesting.
If we dismiss the obvious explanations (the aliens who install 5G so that Bill Gates vaccinates us against the reptilians who want to exterminate us with COVID19, or the reverse, I have a doubt), my bet is rather on bot accounts created by their master.
If the base contains lots of robots, are the passwords representative of humans (and reptilians)?
Empty accounts. By dint of being asked to create an account for anything and everything, just for the pleasure of inflating customer databases (when we know very well that we will never come back to those sites), we put anything in there. Disposable email addresses and, of course, bogus passwords…
It’s a shame because not only do you end up with a completely false base of “prospects”, but creating an account is actually a barrier to conversion. The proof? Removing this step brought in 300 million euros.
When using a password as weak as 123456 (and all its variants, down to the very simple 0, 15th place, seen 123000 times), we know that we are showing neither originality nor caution. We do it knowingly because the account we are setting up does not interest us.
The users of these passwords are not so clueless after all…
And now?
Scientifically, we should therefore have titled “The ranking of the worst passwords usually used for fake accounts and robots since 1980 and containing only letters and numbers representing 4% of the total”. I know, it’s a long title.
To simplify, we forget all these little useless and insignificant methodological details to retain only the essential, “the ranking of the worst passwords of 2020” (because we publish in 2020 after all).
Those who don’t use these passwords will be reassured and their Schadenfreude will push them to share, giving the publisher a little bit of publicity along the way. With luck, others might buy the magic product.
In the end, as always, when information comes from a company to serve its purpose, it’s not information, it’s communication. By the way, it works the same for pesticides or climate change.
Except at the arsouyes (not for lack of repeating it to you).