Anth's Computer Cave Tutorials

Python Password Analyzer

In this series we create a new open-source password-cracking program for penetration testing, and for clowns who've locked themselves out of their files.

First we'll build scripts to analyze passwords and find the average number of guesses for different password lengths and guessing methods, then we use the results to design our program.


You can download all of the code for this series here.

Password Analyzer five: Dictionary-attack

In the previous article we created a script to scrape words from text files and started building our word lists.

Today we'll build a dictionary-attack method into our password-analyzer and put our word lists to use.

We're using the script from the download folder.

We've tidied up the existing code and added the new dictionary-attack capabilities.

Once again I'll go through the code and explain how it works. It's a large code file now, so I'll only cover the parts that have changed since the brute-force password analyzer article.


You select dictionary-attack by changing the mode variable to 'dictionary'. The program will load your JSON word list.

elif mode == "dictionary": # Single run only, no bulk passwords
    # Try words starting with each letter in this order. These are all sections in the word list
    word_order = ["common", "s", "a", "t", "c", "b", "d", "e", "f", "w", "g", "h", "i",
                  "l", "p", "r", "m", "u", "n", "o", "j", "k", "x", "v", "y", "q", "z"]
    # Load the word file into words_list array
    with open("words.txt") as word_holder:
        words_list = json.load(word_holder)[0]
    # Currently trying words starting with this letter in word list    
    current_letter = word_order[0]
    # Index of current word within its letter's section
    current_letter_index = 0

This block also declares a list called word_order, which sets the order of starting letters the program follows as it works through the word list. By default the first item is 'common', meaning the program will try all the words in the common section first. It then moves to words beginning with 's', then 'a', and so on. You can change the order by moving the letters in the word_order list.

The current_letter variable and the current_letter_index variable are used to keep track of where the program is within the word list.
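
The loading code implies a particular layout for the word file: a JSON array whose first element is an object keyed by section name ('common', then single letters), each holding a list of words. That layout is inferred from json.load(word_holder)[0] and the words_list[current_letter][current_letter_index] lookup, not taken from the real file, so treat this as a rough sketch. The sample words, the shortened word_order and the words_in_order() helper are all made up for illustration.

```python
import json

# Hypothetical sample data in the shape the loader appears to expect:
# a JSON array whose first element maps section names to word lists.
sample_json = '[{"common": ["password", "qwerty"], "s": ["secret"], "a": ["apple"], "t": ["test"]}]'

words_list = json.loads(sample_json)[0]

# Shortened version of word_order, just for this sample
word_order = ["common", "s", "a", "t"]

def words_in_order(words_list, word_order):
    """Return every word in the order the sections would be tried."""
    ordered = []
    for section in word_order:
        # .get() tolerates a section that is missing from the file
        ordered.extend(words_list.get(section, []))
    return ordered

print(words_in_order(words_list, word_order))
```

Moving a letter earlier in word_order moves its whole section earlier in the guessing sequence.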


We've renamed the existing generate() function to generate_brute_force(), but the inner workings are exactly the same.

There is a new function called generate_dictionary_word() that is the dictionary equivalent of the original function.

First it chooses a word from your list according to the current_letter variable and the current_letter_index variable.

# Select a password guess from the word list
def generate_dictionary_word():
    global target_password, current_guess, status, current_letter, current_letter_index, guesses, all_guesses
    # Retrieve current word from list
    if len(words_list[current_letter]) > 0:
        current_guess = words_list[current_letter][current_letter_index]
    # Prepare for next guess
    # Move to next index in the current letter array if not at end
    if current_letter_index < (len(words_list[current_letter]) - 1):
        current_letter_index += 1
    else:
        # Change to the next letter if any remaining
        if word_order.index(current_letter) < (len(word_order) - 1):
            current_letter = word_order[word_order.index(current_letter) + 1]
            # Start from beginning of new letter section
            current_letter_index = 0
        else:
            # All words from all letters tried, bad luck
            status = "not_found"
            print("DONE: " + current_letter)
            all_guesses += (guesses + 1)
            guesses = 0
            return True

Like the brute-force function, it then increments the indexing variables to prepare for the next guess.

When current_letter is the last item in the word_order list and current_letter_index is the last index for the words in that letter section, the function will exit with a 'not_found' status.
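
For comparison, the same traversal can be sketched without global state by using a generator. This is not the article's code, just an alternative under the same assumed word-list layout (a dict of sections, each a list of words). The names dictionary_guesses() and count_guesses(), and the sample data, are hypothetical.

```python
def dictionary_guesses(words_list, word_order):
    """Yield each candidate word, section by section, in word_order."""
    for section in word_order:
        # Missing or empty sections are simply skipped
        for word in words_list.get(section, []):
            yield word

def count_guesses(target_password, words_list, word_order):
    """Return the number of guesses needed, or None if the list is exhausted."""
    guesses = dictionary_guesses(words_list, word_order)
    for guess_number, guess in enumerate(guesses, start=1):
        if guess == target_password:
            return guess_number
    return None

# Made-up sample data for illustration only
sample_words = {"common": ["password", "qwerty"], "s": ["secret"], "t": ["test"]}
sample_order = ["common", "s", "a", "t"]

print(count_guesses("test", sample_words, sample_order))   # 4
print(count_guesses("zebra", sample_words, sample_order))  # None
```

Returning None here plays the same role as the 'not_found' status in the stateful version.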

Main loop

The main loop will run until it has cracked the number of passwords specified in the reps list.

If running bulk tests (reps[1] set above 1), the program will reset the status and indexing counters back to default and create a new random target password between successful password guesses.

# Get the starting time to compare to end time    
start_time = time.time()

# Until the quota of passwords have been cracked
while reps[0] < reps[1]:
    status = "ongoing"

    # Brute-force
    if mode != "dictionary":
        # Reset the digits array to default
        for pos in digits:
            digits[pos] = 0

        # If multiple runs, create random target_password for this run      
        if reps[1] != 1:
            # Generate target password to guess
            target_password = ""
            # Reset current password length to default starting length
            password_length = password_length_to_start
            # Number of characters generated so far
            chars = 0
            # Create a string from randomly chosen characters
            while chars < password_length_to_generate:
                # Append a random index from the characters list to target password
                target_password += random.choice(characters)
                chars += 1
    # Dictionary, single run, program has already prompted for target password
    else:
        current_letter = word_order[0]
        current_letter_index = 0

For single runs it will instead use the target password entered at the initial prompt.
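
The random target generation used for bulk runs can also be written more compactly. The sketch below assumes the characters list holds lowercase letters, which may not match the character sets in the actual script, and random_target() is a hypothetical helper name.

```python
import random
import string

# Assumed character set; the script's own characters list may differ
characters = list(string.ascii_lowercase)

def random_target(length, characters):
    """Build a random target password from randomly chosen characters."""
    return "".join(random.choice(characters) for _ in range(length))

target_password = random_target(4, characters)
print(target_password)
```

This does the same job as the while loop that appends one random character at a time.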

It then begins guessing.

It calls one of the generate functions based on the mode variable to load the next guess.

    # Start guessing
    while status == "ongoing":
        # Generate next password guess
        if mode != "dictionary":
            generate_brute_force()
        else:
            generate_dictionary_word()
        # Compare the guess to the target password
        if target_password == current_guess:
            status = "Cracked"
            print(status + ": " + current_guess + " in " + str(guesses + 1) + " guesses")
            all_guesses += (guesses + 1)
            guesses = 0
        else:
            guesses += 1
    if status != "Cracked":
        print(str(all_guesses / reps[1]) + " guesses")
    reps[0] += 1

It compares the guess to the target password and if they match, prints the details and changes the status variable to stop the loop.

If the guess doesn't match it increments the number of guesses and calls the generate function again to get the next guess.

Once either the password is found or all combinations exhausted, it prints the results.

average_time = (time.time() - start_time) / reps[1]
print("Mode: " + mode)
if reps[1] != 1:
    print(str(all_guesses / reps[1]) + " guesses per password")
    print(str(average_time) + " seconds per password")
else:
    print(str(average_time) + " seconds")

For a bulk password test it will display the average number of guesses and the average elapsed time for all successful attempts.

For single tests it will display the time for that attempt.
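
The averaging can be tried on its own with a small timing helper. This sketch swaps in time.perf_counter(), a standard-library clock intended for measuring short intervals, in place of the script's time.time(). The average_seconds() helper and the sample task are hypothetical.

```python
import time

def average_seconds(task, runs):
    """Run task the given number of times and return the mean seconds per run."""
    start_time = time.perf_counter()
    for _ in range(runs):
        task()
    return (time.perf_counter() - start_time) / runs

# Stand-in workload; in the analyzer this would be one full cracking attempt
def sample_task():
    return sum(range(10000))

print(str(average_seconds(sample_task, 5)) + " seconds per run")
```

As in the script, the total elapsed time is divided by the number of runs, so a single run simply reports its own time.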

Dictionary in action

Let's take our new attack-method for a spin.

I'll try it with our test password from the brute-force articles, 'test'.

The Python password analyzer guessing the password 'test'

That's 13,900 guesses compared to 100,000 guesses using our string_lower brute-force attack and 500,000 for our string_full attack.

You can see it has gone through the common section, then the 's' and 'a' sections, before starting on 't' and finding our target password, 'test'.

I'll try it with one created by swiping a finger along the keys, 'qwerty'.

The Python password analyzer guessing the password 'qwerty'

It found this one in six guesses, a record so far, because it was one of the 2000 words in the 'common' section of the word list.

This is where our 'common' section really shines, because it contains many inline keystrokes like this. 'qwerty', starting with an uncommon letter, would have taken the brute-force methods millions of attempts.
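
To get a feel for the scale, here is a rough calculation of where 'qwerty' would fall in a brute-force sweep. It assumes a simple alphabetical sweep over lowercase letters at a fixed length of six, which is not necessarily how the brute-force modes in this series order their guesses. The brute_force_position() helper is hypothetical.

```python
import string

def brute_force_position(password, alphabet=string.ascii_lowercase):
    """1-based position of password in an alphabetical sweep of all
    strings of the same length built from alphabet."""
    base = len(alphabet)
    index = 0
    for char in password:
        # Treat the password as a number written in base len(alphabet)
        index = index * base + alphabet.index(char)
    return index + 1

print(brute_force_position("qwerty"))        # guesses needed under these assumptions
print(len(string.ascii_lowercase) ** 6)      # total six-letter lowercase combinations
```

A sweep that starts at shorter lengths or uses a larger character set would need even more guesses before reaching it.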


In the next password analyzer series we'll move the operation to Linux to take advantage of the more powerful network and file access tools, and add some attack capability to our program.

Then we'll try it out against some real passwords. We'll start with locked RAR files, for several reasons.

Firstly I've had a few people over the years ask if I could open their password-protected RAR documents. Before programs like LastPass came along this was a common way to securely store lists of logins and password details, but it's not much use when you forget the RAR password.

Secondly, opening RAR files is quicker on a per-guess basis than some of the network-based barriers we will cover. This is because there seems to be no deliberate glue pot in place to limit password attempts.

So get your Linux computer ready, and check back soon for the next article.





