To date, the storage capacity limit of the knowledge base corresponds to the maximum number of entries that can be inserted into a “sorted set” object: 2^32 – 1 elements, or more than 4 billion unique entries (4,294,967,295). This capacity can be extended by using a dynamic number of “sorted sets.”
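One possible way to spread entries over several sorted sets is to derive the key from a hash of the password. The sketch below is only an illustration of that idea, not the scheme used by the tool: the key prefix `passwords`, the helper names `shard_key` and `total_capacity`, and the choice of CRC32 are all assumptions.

```python
import zlib

# Maximum number of members in a single sorted set, as stated above: 2^32 - 1.
MAX_SORTED_SET_ENTRIES = 2**32 - 1


def shard_key(password: str, shard_count: int, prefix: str = "passwords") -> str:
    """Return the (hypothetical) sorted-set key that should hold this password."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = zlib.crc32(password.encode("utf-8"))
    # The same password always maps to the same shard for a given shard_count.
    return f"{prefix}:{digest % shard_count}"


def total_capacity(shard_count: int) -> int:
    """Upper bound on unique entries when shard_count sorted sets are used."""
    return shard_count * MAX_SORTED_SET_ENTRIES
```

With this kind of fixed modulo sharding, changing `shard_count` later would require redistributing existing entries, so a truly dynamic scheme would need a different key rule.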
To speed up the analysis of an input file, Redis offers a “pipeline” mechanism that avoids paying the RTT (Round Trip Time) for every network request: the pipeline allows multiple queries to be sent to the database without waiting for the individual responses, and then reads all the responses at once.
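As a rough illustration of the pipeline approach, the sketch below batches insertions before flushing them. It assumes a client that behaves like a redis-py `redis.Redis` instance (version 3.0 or later, where `zincrby` takes `name, amount, value`). The function name `store_passwords`, the key `passwords`, the batch size, and the use of ZINCRBY to count occurrences are assumptions, not details taken from the tool.

```python
from typing import Iterable


def store_passwords(client, passwords: Iterable[str],
                    key: str = "passwords", batch_size: int = 10_000) -> int:
    """Queue one ZINCRBY per password and flush every batch_size commands.

    `client` is expected to expose the redis-py `pipeline()` method.
    Returns the number of commands sent.
    """
    pipe = client.pipeline(transaction=False)  # plain batching, no MULTI/EXEC
    queued = 0
    sent = 0
    for password in passwords:
        pipe.zincrby(key, 1, password)  # buffered client-side, not sent yet
        queued += 1
        if queued == batch_size:
            pipe.execute()  # the whole batch is sent and all replies are read
            sent += queued
            queued = 0
    if queued:
        pipe.execute()  # flush the final, partial batch
        sent += queued
    return sent
```

A call could look like `store_passwords(redis.Redis(), open("rockyou.txt", errors="ignore"))`, where the file name is only an example. Flushing in batches keeps the client-side buffer bounded on very large files.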
Another approach consists of distributing the analysis execution between several processes through the use of the Python “multiprocessing” module.
In the case of a “local” execution, where the same machine performs both the analysis and the storage of data, using several processes has proven to be more effective than using a pipeline. This solution offers us a gain of more than three minutes on the analysis of a “RockYou” type of file.
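The multi-process approach can be sketched as follows: the input is cut into chunks, each chunk is handled by a worker process, and the partial results are merged. This is a minimal sketch under assumptions, not the tool's implementation: the helper names, the chunk size, and the choice to merge counts in the parent process (rather than having each worker write to Redis) are all hypothetical.

```python
from collections import Counter
from itertools import islice
from multiprocessing import Pool
from typing import Iterable, Iterator, List


def chunked(lines: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` lines."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def count_chunk(chunk: List[str]) -> Counter:
    """Worker: count occurrences of each non-empty entry in one chunk."""
    counts = Counter()
    for line in chunk:
        entry = line.rstrip("\r\n")
        if entry:
            counts[entry] += 1
    return counts


def count_parallel(lines: Iterable[str], processes: int = 4,
                   chunk_size: int = 100_000) -> Counter:
    """Distribute chunks over a pool of processes and merge the partial counts."""
    total = Counter()
    with Pool(processes=processes) as pool:
        for partial in pool.imap_unordered(count_chunk, chunked(lines, chunk_size)):
            total.update(partial)
    return total
```

On platforms that start workers with the “spawn” method, `count_parallel` should be called from inside an `if __name__ == "__main__":` block.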
The global statistics obtained are as follows:
[+] Global statistics
**********************
[-] Total entries: 14,344,390
[-] Analyzed password entries: 14,331,467
[-] Excluded password entries: 12,923
[-] Unique passwords: 14,330,630
The analyzed “RockYou” file contained 14,344,390 entries, of which 12,923 were excluded by the various filters enabled during the execution (default configuration of the script). Among the 14,331,467 non-excluded entries, 14,330,630 were unique. The 837 redundant entries are due to backslash clean-up operations.
The distribution of unique entries in terms of length is as follows:
[+] Password lengths (for unique entries)
*****************************************
[-] 1: 46 password(s) [ 00.0003 % ]
[-] 2: 339 password(s) [ 00.0024 % ]
[-] 3: 2,472 password(s) [ 00.0172 % ]
[-] 4: 18,099 password(s) [ 00.1263 % ]
[-] 5: 259,533 password(s) [ 01.8110 % ]
[-] 6: 1,948,796 password(s) [ 13.5988 % ]
[-] 7: 2,507,212 password(s) [ 17.4955 % ]
[-] 8: 2,966,487 password(s) [ 20.7003 % ]
[-] 9: 2,190,663 password(s) [ 15.2866 % ]
[-] 10: 2,012,917 password(s) [ 14.0463 % ]
...
...
We found that the vast majority of unique passwords (approximately 81%) have a length between 6 and 10 characters.
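The share of passwords between 6 and 10 characters can be recomputed from the figures reported above. The short check below uses only the counts listed for lengths 6 to 10 and the number of unique passwords from the global statistics; the variable names are illustrative.

```python
# Counts reported in the length distribution for lengths 6 to 10.
reported_counts = {
    6: 1_948_796,
    7: 2_507_212,
    8: 2_966_487,
    9: 2_190_663,
    10: 2_012_917,
}

# Number of unique passwords from the global statistics.
unique_passwords = 14_330_630

in_range = sum(reported_counts.values())
share = 100 * in_range / unique_passwords

print(f"{in_range:,} of {unique_passwords:,} unique passwords ({share:.2f} %)")
```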
The following statistics highlight the composition of various identified passwords and the most frequently encountered entries during the analysis.
[*] Top 10 passwords
---------------------------------------------------
[-] \ 11 appearance(s)
[-] asdfghjkl;' 05 appearance(s)
[-] 1234567890-=\ 04 appearance(s)
[-] ojkiyd0y' 04 appearance(s)
[-] iydotgfHdF'j 04 appearance(s)
[-] J'ADENKHYA 04 appearance(s)
[-] iyd0y' 04 appearance(s)
[-] iydgmv0y' 04 appearance(s)
[-] ohv's,k 04 appearance(s)
[-] zhane' 03 appearance(s)
[*] Charset analysis (for unique entries)
------------------------------------------
[-] lowercase / numbers:
6,082,774 password(s) [ 42.4460 % ]
[-] lowercase:
3,771,685 password(s) [ 26.3190 % ]
[-] numbers:
2,347,074 password(s) [ 16.3780 % ]
[-] lowercase / ascii_special / numbers:
415,202 password(s) [ 02.8973 % ]
...
...
The analysis of these results indicates that about 42% of the unique passwords analyzed consisted of lowercase letters and numbers, 26% consisted of only lowercase letters, and 16% consisted of only numbers.
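A classification of this kind can be sketched by testing which character classes occur in each password. The example below is an assumption about how such labels might be produced, not the tool's code: the class names and their ordering are inferred from the labels shown in the output, `ascii_special` is assumed to mean `string.punctuation`, and the `uppercase`, `other` and `empty` labels are hypothetical.

```python
import string

# Assumed character classes, in the order assumed for building labels.
CHARSETS = (
    ("lowercase", set(string.ascii_lowercase)),
    ("uppercase", set(string.ascii_uppercase)),
    ("ascii_special", set(string.punctuation)),
    ("numbers", set(string.digits)),
)


def charset_label(password: str) -> str:
    """Return a label such as 'lowercase / numbers' for the given password."""
    remaining = set(password)
    labels = []
    for name, chars in CHARSETS:
        if remaining & chars:
            labels.append(name)
            remaining -= chars
    if remaining:
        # Characters outside the four classes above (spaces, non-ASCII, ...).
        labels.append("other")
    return " / ".join(labels) if labels else "empty"
```

Counting the labels over the set of unique passwords, for example with `collections.Counter`, would then give a distribution comparable in shape to the one above.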
Our PassMAID tool is available free of charge on request at tools@talsion.com, for anyone with a legitimate reason to use and possess it in compliance with Article 323-3-1 of the French criminal code.