rockyou2024 – it’s a mess

Hacking

A collection of passwords with no context; it’s not even a decent rainbow table

Much hype has surrounded the “rockyou2024.txt” file, which contains nearly 10 billion lines of records (the unpacked file is 152 GB in size). There appear to be 9’948’575’739 lines of text in the file, which is 51’424’261 short of 10 billion.
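For anyone who wants to check the line count themselves, here is a minimal sketch that counts newlines in fixed-size chunks, so a 152 GB file never has to fit in memory. The function names (count_newlines, shortfall) and the chunk size are my own choices for illustration, not part of any published tool.

```python
import io


def count_newlines(stream, chunk_size=1024 * 1024):
    """Count newline bytes in a binary stream, one chunk at a time."""
    total = 0
    # iter() with a sentinel keeps calling read() until it returns b"" (end of file).
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        total += chunk.count(b"\n")
    return total


def shortfall(line_count, target=10_000_000_000):
    """How many lines short of the target (10 billion by default) a count is."""
    return target - line_count


# Small in-memory demonstration; for the real file, pass
# open("rockyou2024.txt", "rb") instead (the path is an assumption).
demo = io.BytesIO(b"password\n123456\n!monkeypoo\n")
print(count_newlines(demo))
print(shortfall(9_948_575_739))
```

A final line without a trailing newline is not counted, which is the same convention a plain newline count uses.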

It’s relatively easy to acquire but not so easy to open and assess. So cue some scripting and some basic cat-ting of the file…!

Using a simple Python script, I searched the file for the word “monkey”. It occurred many times, and the file does not appear to have been de-duplicated: !monkeypoo occurs more than once, for example.
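A search like this can be sketched in a few lines of Python. This is not the exact script I ran, just a minimal version under some assumptions: the match is case-insensitive, the file is read as UTF-8 with undecodable bytes replaced, and the names count_matches and duplicates are made up for the example.

```python
from collections import Counter


def count_matches(lines, needle="monkey"):
    """Tally every line that contains the needle (case-insensitive)."""
    needle = needle.lower()
    hits = Counter()
    for raw in lines:
        line = raw.rstrip("\r\n")
        if needle in line.lower():
            hits[line] += 1
    return hits


def duplicates(hits):
    """Keep only the matching lines that occur more than once."""
    return {line: n for line, n in hits.items() if n > 1}


# Tiny stand-in for the real file. To stream the real thing line by line:
#   with open("rockyou2024.txt", encoding="utf-8", errors="replace") as f:
#       hits = count_matches(f)
sample = ["!monkeypoo\n", "letmein\n", "Monkey123\n", "!monkeypoo\n"]
hits = count_matches(sample)
print(sum(hits.values()), duplicates(hits))
```

Only the matching lines are held in memory, so this stays manageable for a rare word; a very common search term would need the tally written to disk instead.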

As far as it’s possible to tell, it’s simply a group of password files joined together. That might be useful to somebody as a reference, but I think it is really unlikely to feature in people’s minds as something against which they should check their passwords (after all, 99.9% of the 10 billion password owners would never dream of changing their password, using random passwords from a password manager and so on).

But it gets worse – the quality of the dataset is very mixed and, I dare say, relatively low. It’s very newsworthy, but I don’t know what a bad guy might do with the passwords in the file.

Judging by the output from the rockyou2024.txt file, it appears to be a collection of passwords (hashed and clear text).

The values in the image to the right are unlikely to be anything but hashed (and perhaps salted) values that have probably been dumped from a database. They are all the same length and could not be used in any login interface (especially if salted), because a login form expects the original password rather than its hash.
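One rough way to separate hash-looking lines from clear-text passwords is to flag lines that are pure hexadecimal and exactly as long as a common digest. The sketch below assumes unsalted hex digests of MD5, SHA-1 or SHA-256 length (32, 40 or 64 characters); I have not confirmed that the values in the image use any of these formats, and the names classify and summarize are hypothetical.

```python
import hashlib
import re
from collections import Counter

# Hex-digest lengths of three common hash functions (an assumption about
# what might be in the file, not a confirmed finding).
HEX_DIGEST_LENGTHS = {32: "md5-length", 40: "sha1-length", 64: "sha256-length"}
HEX_RE = re.compile(r"[0-9a-fA-F]+")


def classify(line):
    """Label a line by digest length if it is pure hex, otherwise 'other'."""
    text = line.strip()
    if HEX_RE.fullmatch(text) and len(text) in HEX_DIGEST_LENGTHS:
        return HEX_DIGEST_LENGTHS[len(text)]
    return "other"


def summarize(lines):
    """Count how many lines fall into each label."""
    return Counter(classify(line) for line in lines)


# Sample data: two digests generated on the spot plus two clear-text entries.
sample = [
    hashlib.md5(b"monkey").hexdigest(),
    hashlib.sha1(b"monkey").hexdigest(),
    "!monkeypoo",
    "letmein",
]
print(summarize(sample))
```

This is only a heuristic: a real password can happen to be 32 hex characters, and salted or encoded formats such as bcrypt strings would land in "other".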

This suggests to me that rockyou2024.txt is simply a dump that could be interesting to a few people. I can imagine that the non-hashed values are being added to browsers like Chrome and password managers galore to bring a list of “known” passwords that should be avoided.

The problem is that the people who need to listen will not listen; they are mostly deaf to this password-reuse problem.

It’s been a great news cycle and made it into the regular media, with reporters – who probably did not get the file or did not know how to open it – reporting that Internet security is over. It’s not; it’s just business as usual.