Anth's Computer Cave Tutorials

Python Password Analyzer

In this series we create a new open-source password-cracking program for penetration testing, and for clowns who've locked themselves out of their files.

First we build scripts to analyze passwords and find the average number of guesses for different password lengths and guessing methods, then we use the results to design our program.

You can download all of the code for this series here.


Password Analyzer three: Brute-force 2

Today we'll look at the finished version of our brute-force password analyzer we started in the last article.

The full code is in the article_three_brute_force.py file in the download folder.

Changes

There are some new options in today's code, and some of the existing options have changed.

The password length is now auto-ranging. You can choose a starting length and when all combinations have been tried it will automatically increase the length and start again.

There are new character array options that include upper and lower case, just lower case, and a numeric array.

New characters array

Adding the upper-case letters to the characters array has really slowed the guessing process. The number of guesses required has increased roughly ten-fold.

For this reason I've kept the option of using just lower-case letters. After all, not everybody uses upper-case letters.

You select which array to use by changing the mode variable.


# Choose which characters to use in password guesses
mode = "string_full"

# Character lists to create passwords based on mode
# Upper and lower case letters with numbers
if mode == "string_full":
    characters = ["s", "a", "t", "c", "b", "d", "e", "f", "w", "g", "h", "i", \
                  "l", "p", "r", "m", "u", "n", "o", "j", "k", "x", "v", "y", \
                  "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "q", "z", \
                  "S", "A", "T", "C", "B", "D", "E", "F", "W", "G", "H", "I", \
                  "L", "P", "R", "M", "U", "N", "O", "J", "K", "X", "V", "Y", \
                  "Q", "Z"]
# Lower case letters with numbers
elif mode == "string_lower":
    characters = ["s", "a", "t", "c", "b", "d", "e", "f", "w", "g", "h", "i", \
                  "l", "p", "r", "m", "u", "n", "o", "j", "k", "x", "v", "y", \
                  "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "q", "z"]
# Numeric characters only for PIN-type passwords
else:
    characters = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]

There is also a numeric array for PIN-type passwords. This finds a numeric password much faster than the alpha-numeric modes find a password of the same length, so it may be a good first option before moving on to an alpha-numeric quest.
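To get a feel for how much each mode widens the search, here is a rough sketch that counts the possible passwords at a given length. It assumes the character counts of the three lists above (62, 36 and 10) and that every combination is tried once. The names mode_sizes and search_space are made up for this example and are not part of the downloadable code.

```python
# Character counts taken from the three lists above
mode_sizes = {"string_full": 62, "string_lower": 36, "numeric": 10}


def search_space(char_count, length):
    """Number of distinct passwords of exactly this length."""
    return char_count ** length


for mode_name, char_count in mode_sizes.items():
    for length in (4, 6):
        total = search_space(char_count, length)
        print(f"{mode_name:13} length {length}: {total:,}")
```

At four characters that is 10,000 numeric combinations against roughly 1.7 million for string_lower and 14.8 million for string_full.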

I've moved the letters around in the array to get some of the more common starting letters to the front. The left-hand characters of the password are the real decider of how many guesses are required.

elif mode == "string_lower":
    characters = ["s", "a", "t", "c", "b", "d", "e", "f", "w", "g", "h", "i", \
                  "l", "p", "r", "m", "u", "n", "o", "j", "k", "x", "v", "y", \
                  "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "q", "z"]

I've based this on a word list I've been creating for the next article. I've made a text-grabber program which stores words in alphabetical sections and prints the number of words starting with each letter. The results were from a sample size of 8000 words.
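As a rough illustration of that counting step, the sketch below tallies first letters with collections.Counter and orders the alphabet by frequency. The sample words and the names first_letter_counts and order_by_frequency are made up for this example. The real word list and text-grabber arrive in the next article.

```python
from collections import Counter
import string


def first_letter_counts(words):
    """Count how many words start with each lower-case letter."""
    return Counter(word[0].lower() for word in words
                   if word and word[0].lower() in string.ascii_lowercase)


def order_by_frequency(counts):
    """Return a-z with the most common starting letters first."""
    # sorted() is stable, so letters with equal counts stay alphabetical
    return sorted(string.ascii_lowercase, key=lambda letter: -counts[letter])


# Made-up sample words, just to show the shape of the output
sample_words = ["password", "secret", "summer", "tiger", "shadow", "apple"]
counts = first_letter_counts(sample_words)
print(counts.most_common(3))
print(order_by_frequency(counts)[:4])
```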

I've left the numbers towards the end of the array on the assumption that people are more likely to put numbers at the end of their password, and we're trying to get the letters at the beginning.
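To see why the order of the array matters so much, you can estimate where a password falls in the guessing sequence with a little arithmetic. This sketch assumes the guesser works like an odometer, with the right-most character changing fastest, and counts only guesses of the password's own length. guess_number is a hypothetical helper, not part of the downloadable code.

```python
characters = ["s", "a", "t", "c", "b", "d", "e", "f", "w", "g", "h", "i",
              "l", "p", "r", "m", "u", "n", "o", "j", "k", "x", "v", "y",
              "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "q", "z"]


def guess_number(password, characters):
    """Estimate which guess (counting from 1) would match the password.

    Assumes odometer-style guessing: the right-most character changes
    fastest and only guesses of the password's own length are counted.
    """
    base = len(characters)
    position = 0
    for char in password:
        # Each step to the right is worth one power of base less
        position = position * base + characters.index(char)
    return position + 1


for word in ("test", "tesz", "zest"):
    print(word, guess_number(word, characters))
```

Under those assumptions, swapping the last letter of 'test' for 'z' adds only 33 guesses, while swapping the first letter for 'z' multiplies the count roughly sixteen-fold.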

I haven't decided where to position the upper-case letters yet. For now they're just tacked onto the end of the array in the same order as the lower-case letters. I have a feeling they may be better set alongside their lower-case equivalents, as I'm assuming people would be more likely to use an upper-case letter at the start of their password out of habit from general writing norms.
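If you want to experiment with that idea, one possible layout is sketched below. It pairs each letter with its upper-case form and leaves the digits where they are. Putting lower case before upper case within each pair is an arbitrary choice, and interleave_cases is a hypothetical helper rather than anything in today's code.

```python
lower_characters = ["s", "a", "t", "c", "b", "d", "e", "f", "w", "g", "h", "i",
                    "l", "p", "r", "m", "u", "n", "o", "j", "k", "x", "v", "y",
                    "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "q", "z"]


def interleave_cases(characters):
    """Place each letter's upper-case form directly after it."""
    result = []
    for char in characters:
        result.append(char)
        if char.isalpha():
            # Digits have no upper-case form, so only letters are paired
            result.append(char.upper())
    return result


paired = interleave_cases(lower_characters)
print(paired[:6])
print(len(paired))
```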

Password length

The program now starts guessing at your selected starting length and increases the length automatically once all combinations have been used.

Set the password_length_to_start variable to your desired starting length.

The configuration below will begin guessing at two characters.


# Length to start guessing, will increment as all combinations are tried
password_length_to_start = 2
# Current guessing length
password_length = password_length_to_start
# Length of target passwords to create for bulk runs
password_length_to_generate = 4

To test multiple random passwords, you'll also need to set the password_length_to_generate variable to the length of the random target passwords you would like to generate. You can ignore this for single runs.
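The auto-ranging idea can be sketched with itertools.product, which yields every combination of a given length. This is an illustration rather than the code in article_three_brute_force.py. The generate_guesses and count_guesses names and the max_length safety limit are assumptions for the example.

```python
from itertools import count, product


def generate_guesses(characters, start_length, max_length=None):
    """Yield every guess of start_length, then start_length + 1, and so on."""
    for length in count(start_length):
        if max_length is not None and length > max_length:
            return
        for combo in product(characters, repeat=length):
            yield "".join(combo)


def count_guesses(target, characters, start_length, max_length=None):
    """Return how many guesses it takes to hit target, or None if not found.

    Without max_length this never returns for a target that cannot be
    reached, for example one shorter than start_length.
    """
    guesses = generate_guesses(characters, start_length, max_length)
    for number, guess in enumerate(guesses, start=1):
        if guess == target:
            return number
    return None


digits = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
print(count_guesses("1234", digits, start_length=2, max_length=4))
```

Starting at two characters, the sketch works through 100 two-digit and 1000 three-digit guesses before it reaches the four-digit ones, so '1234' turns up on guess 2335.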

Running the program

We'll test a single password called 'test'. Change the reps[1] entry to 1 to perform a single run.

# Number of passwords to crack
reps = [0, 1]

Save and run the program, and enter the test password when prompted.

Guessing a four-character password

I've set password_length_to_start to begin guessing two-character passwords, and you can see it has reached 'zz' then moved on to three-character guesses. When those combinations were exhausted it moved on to four characters and soon found my password, 'test'.

To run a bulk test against multiple random passwords, change the reps[1] entry to the number of passwords you want to break.

Guessing five four-character passwords.

These were the results from running five four-character password tests with guesses starting at three characters.
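A bulk run boils down to generating random targets, cracking each one, and averaging the counts. The sketch below shows one way to do that for short numeric targets. It uses the random module, and random_password, guesses_needed and average_guesses are hypothetical names rather than the functions in the download. It also assumes the target length is at least the starting length.

```python
import random
from itertools import product


def random_password(characters, length, rng=random):
    """Build a random target password from the character list."""
    return "".join(rng.choice(characters) for _ in range(length))


def guesses_needed(target, characters, start_length):
    """Brute-force the target, auto-ranging upwards from start_length."""
    guesses = 0
    for length in range(start_length, len(target) + 1):
        for combo in product(characters, repeat=length):
            guesses += 1
            if "".join(combo) == target:
                return guesses
    return None  # target is shorter than start_length or uses other characters


def average_guesses(characters, target_length, start_length, reps, rng=random):
    """Crack reps random targets and return the mean number of guesses."""
    totals = [guesses_needed(random_password(characters, target_length, rng),
                             characters, start_length)
              for _ in range(reps)]
    return sum(totals) / reps


digits = list("0123456789")
print(average_guesses(digits, target_length=4, start_length=3, reps=5))
```

The average changes from run to run because the targets are random, which is why a larger reps value gives a steadier figure.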

Mode comparisons

Let's compare the number of guesses required for the different modes.

Here is a test using upper and lower-case letters by setting the mode variable to 'string_full'.

Guessing four-character password.

That's half a million guesses for a four-character password. Now I'll change the mode variable to 'string_lower' and try again with the same password.

Guessing four-character password.

At 100,000 guesses, there is a huge difference. Now for the numeric test.

Guessing four-character password.

With the mode variable set to 'numeric' it took just 3714 guesses for a four-character password. This is under four per cent of the lower-case figure and less than one per cent of the upper and lower-case figure.

I chose a number that used the same indexes in the numeric characters list as 'test' does in the string characters list.

Unless I definitely know there are letters in a target password, I think I would start with the numeric mode and let it zap through to four or six characters. After that I would use either the string_lower or string_full mode and get on with life for a while.

Once again, unless I was sure there were upper-case letters in the password I would take a gamble and use the faster string_lower method.

If the maximum number of guesses with lower-case letters is twenty per cent of the upper-case mode, the most you are losing if lower-case mode fails is twenty per cent of the time it could take to run upper-case mode.
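To put a number on that gamble, the sketch below compares the worst case for each mode. It assumes 36 and 62 characters as in the lists above, and that every combination is tried from one character up to a chosen maximum length. worst_case_guesses is a hypothetical helper.

```python
def worst_case_guesses(char_count, start_length, max_length):
    """Total guesses needed to exhaust every length in the range."""
    return sum(char_count ** length
               for length in range(start_length, max_length + 1))


for max_length in (4, 6, 8):
    lower = worst_case_guesses(36, 1, max_length)
    full = worst_case_guesses(62, 1, max_length)
    print(f"up to {max_length} characters: a failed lower-case pass adds "
          f"{lower / full:.1%} to the full-mode worst case")
```

On these worst-case numbers the penalty is under twelve per cent at four characters and shrinks as the maximum length grows.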

Once we start using this for real tasks you'll see the agonizing time that every guess takes, and you may need some luck to crack anything over four or five characters using brute-force methods alone.

Dictionary to the rescue

In the next article we start on our dictionary-based attack method. This is the only real hope of guessing long passwords in your lifetime. Using whole words, the password length is irrelevant. Each word is one guess, regardless of length.

The downside, of course, is that the password you are guessing must be a whole word.

First we need to make a word list. We'll write a quick script to scrape text from files and itemize the words alphabetically in a JSON file. I've scraped 8000 unique words so far. While that's enough for initial testing, we'll need to add another hundred-odd thousand to use in the finished program.

Once we have our word list we can build the function to use the words. This should only require a few basic modifications to our existing code.

Cheers

Anth


Previous: Brute-force password analyzer_1

Next: Create a word list
