Alexander Fishkov, Ph.D. student in Computer Science

In this post, we return to the collection of 10 million passwords. This time we are interested in uncovering the common character patterns in passwords. To strengthen a password, people often add digits or special characters, and we can explore which combinations of these they typically use.

To encode character patterns we use the following notation: “a” denotes one or more Latin letters, “0” denotes one or more digits, and “\$” denotes special characters not belonging to the previous groups. According to this scheme, “John314@” will be encoded as “a0\$”.
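
The encoding can be sketched in a few lines of Python. This is only an illustration of the notation, not the code used for the analysis: the names `encode_pattern`, `_GROUPS` and `_SYMBOL` are made up here, "Latin letters" is assumed to mean ASCII A-Z and a-z, and a run of consecutive special characters is assumed to collapse into a single symbol, just like letters and digits.

```python
import re

# Assumption: three character groups, each matched as a maximal run.
#   a -> ASCII Latin letters, d -> ASCII digits, s -> anything else
_GROUPS = re.compile(r"(?P<a>[A-Za-z]+)|(?P<d>[0-9]+)|(?P<s>[^A-Za-z0-9]+)")
_SYMBOL = {"a": "a", "d": "0", "s": "$"}


def encode_pattern(text: str) -> str:
    """Replace every run of same-group characters with one symbol."""
    return "".join(_SYMBOL[m.lastgroup] for m in _GROUPS.finditer(text))


print(encode_pattern("John314@"))  # a0$
```

Because every group is reduced to a single character, the length of the encoded string equals the number of character groups in the original password or login.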

As we can see, most people use only letters and digits, gradually increasing the number of character groups. The top 10 patterns contain no special characters. The longest pattern in passwords is “0a0a0a0a0a0a0a0a0a0a0a0a0a0a”, which was used twice.

If we look into logins, the situation is only slightly different. Special characters are more common. This includes “firstname.lastname” and “word-number” patterns. The longest pattern in logins is “a\$a\$a\$a\$a\$a\$a\$a\$a\$a\$a\$a\$a\$a\$a”, containing 29 character groups. The surprising thing is that this pattern was used not just once, but three times.
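
Rankings like "top 10 patterns" and "longest pattern" for passwords or logins could be computed roughly as follows. This is a minimal sketch under the same assumptions as the notation (ASCII letters and digits, runs collapsed into one symbol); `pattern_stats` is a hypothetical helper and the sample list is invented for the example, not taken from the dataset.

```python
import re
from collections import Counter

# Same assumed grouping as in the notation: letters, digits, everything else.
_GROUPS = re.compile(r"(?P<a>[A-Za-z]+)|(?P<d>[0-9]+)|(?P<s>[^A-Za-z0-9]+)")
_SYMBOL = {"a": "a", "d": "0", "s": "$"}


def encode_pattern(text):
    """Replace every run of same-group characters with one symbol."""
    return "".join(_SYMBOL[m.lastgroup] for m in _GROUPS.finditer(text))


def pattern_stats(strings, top_n=10):
    """Return the most frequent patterns, the longest pattern, and its count."""
    counts = Counter(encode_pattern(s) for s in strings)
    top = counts.most_common(top_n)
    # One symbol per group, so string length == number of character groups.
    longest = max(counts, key=len) if counts else ""
    return top, longest, counts[longest]


# Invented sample data, for illustration only.
sample = ["password", "qwerty", "abc123", "john.smith", "1a2b3c", "letmein1"]
top, longest, uses = pattern_stats(sample, top_n=3)
print(top)
print(longest, uses)
```

The same function can be run separately on the password column and on the login column to compare the two rankings.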

In one of our previous posts, we analyzed the use of whole words in the password dataset. Here we combine that analysis with our pattern-searching technique to see how people try to make their passwords stronger when they use words. For these patterns we use a more readable notation: character group titles are separated by a dash.
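
To make the dash-separated notation concrete, here is a small sketch of how a password could be labeled once words are taken into account. Everything specific in it is an assumption: the group titles (`word`, `letters`, `digits`, `special`), the tiny `WORDS` set standing in for a real dictionary, and the greedy longest-match rule for finding words are all illustrative choices, not necessarily what was used for the analysis.

```python
import re

# Hypothetical stand-in for a real dictionary of English words.
WORDS = {"dragon", "love", "monkey", "sun", "shine"}

_RUNS = re.compile(
    r"(?P<letters>[A-Za-z]+)|(?P<digits>[0-9]+)|(?P<special>[^A-Za-z0-9]+)"
)


def split_letters(run):
    """Greedily split a letter run into 'word' and leftover 'letters' labels."""
    labels, i, low = [], 0, run.lower()
    while i < len(low):
        # Longest dictionary word starting at position i, if there is one.
        match = next(
            (low[i:j] for j in range(len(low), i, -1) if low[i:j] in WORDS),
            None,
        )
        if match:
            labels.append("word")
            i += len(match)
        else:
            # Merge consecutive non-word letters into a single 'letters' group.
            if not labels or labels[-1] != "letters":
                labels.append("letters")
            i += 1
    return labels


def word_pattern(text):
    """Encode text as dash-separated group titles."""
    labels = []
    for m in _RUNS.finditer(text):
        if m.lastgroup == "letters":
            labels.extend(split_letters(m.group()))
        else:
            labels.append(m.lastgroup)
    return "-".join(labels)


print(word_pattern("dragon123"))  # word-digits
print(word_pattern("xxlove1"))    # letters-word-digits
print(word_pattern("sunshine"))   # word-word
```

A greedy match is the simplest option but can mislabel strings where a shorter word followed by another word would fit better; a real analysis would likely need a larger dictionary and a minimum word length.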

Obfuscating words with numbers and letters is more popular than using special characters. In fact, no pattern with special characters made it to the top (here we only considered patterns with words). We can also note that digits are usually added after words, while letters can sometimes appear before. Surprisingly, a simple pattern of a few words in a row is rarely used: only one “two words” pattern made it to the top 25.