Anth's Computer Cave Tutorials

Python Password Analyzer

In this series we create a new open-source password-cracking program for penetration testing, and for clowns who've locked themselves out of their files.

First we'll build scripts to analyze passwords and find the average number of guesses for different password lengths and guessing methods, then we'll use the results to design our program.

You can download all of the code for this series here.

Password Analyzer five: Dictionary-attack

In the previous article we created a script to scrape words from text files and started building our word lists.

Today we'll build a dictionary-attack method into our password-analyzer and put our word lists to use.

We're using the script from the download folder.

We've tidied up the existing code and added the new dictionary-attack capabilities.

Once again I'll go through the code and explain how it works. It's a large code file now, so I'll only cover the parts that have changed since the brute-force password analyzer article.

Dictionary Mode

You select dictionary-attack by changing the mode variable to 'dictionary'.

The program will then declare a list called word_order, used to determine the order of starting letters for parsing the word list. By default the first item is 'common', meaning the program will try all the words in the common section first. It then moves to words beginning with 's', then 'a', and so on. You can change the order by moving the letters in the word_order list.

elif mode == "dictionary": # Single run only, no bulk passwords
    # Try words starting with each letter in this order. These are all sections in the word list
    word_order = ["common", "s", "a", "t", "c", "b", "d", "e", "f", "w", "g", "h", "i",
                  "l", "p", "r", "m", "u", "n", "o", "j", "k", "x", "v", "y", "q", "z"]
    # Load the word file into the words_list array
    with open("words.txt") as word_holder:
        words_list = json.load(word_holder)[0]
    # Currently trying words starting with this letter in the word list
    current_letter = word_order[0]
    # Index of the current word within its letter's section
    current_letter_index = 0

The program then loads your JSON word list from file.

The current_letter variable and the current_letter_index variable are used to keep track of where the program is within the word list.
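To make the file layout more concrete, here is a minimal sketch of a word list in the shape the loading code appears to expect: a JSON list whose first item maps each section name to a list of words. The sample words, the file name sample_words.json and the two helper functions are made up for this illustration and are not part of the downloadable script.

```python
import json
import os
import tempfile

# Hypothetical miniature word list. The real words.txt is assumed to be a
# JSON list whose first item maps section names to lists of words.
sample_words = [{
    "common": ["password", "qwerty", "letmein"],
    "s": ["secret", "summer"],
    "a": ["apple"],
    "t": ["test", "tiger"],
}]

def save_word_list(path, data):
    """Write the word list structure to a JSON file."""
    with open(path, "w") as word_holder:
        json.dump(data, word_holder)

def load_word_list(path):
    """Load the file and return the section dictionary (the first list item)."""
    with open(path) as word_holder:
        return json.load(word_holder)[0]

path = os.path.join(tempfile.gettempdir(), "sample_words.json")
save_word_list(path, sample_words)
words_list = load_word_list(path)

print(sorted(words_list.keys()))
print(words_list["common"][1])
```

With this layout, words_list[current_letter][current_letter_index] picks out a single word, which is how the two tracking variables are used.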


We've renamed the existing generate() function to generate_brute_force(), but the inner workings are exactly the same.

There is a new function called generate_dictionary_word() that is the dictionary equivalent of the original function.

First it chooses a word from your list according to the current_letter variable and the current_letter_index variable.

# Select a password guess from the word list
def generate_dictionary_word():
    global target_password, current_guess, status, current_letter, current_letter_index, guesses, all_guesses
    # Retrieve current word from word list array
    if len(words_list[current_letter]) > 0:
        current_guess = words_list[current_letter][current_letter_index]
    # Prepare for next guess
    # Move to next word in the current letter if not at end
    if current_letter_index < (len(words_list[current_letter]) - 1):
        current_letter_index += 1
    else:
        # All words from that letter were tried
        # Change to the next letter if any remaining
        if word_order.index(current_letter) < (len(word_order) - 1):
            current_letter = word_order[word_order.index(current_letter) + 1]
            # Start from beginning of new letter section
            current_letter_index = 0
        else:
            # All words from all letters tried, bad luck
            status = "not_found"
            print("DONE: " + current_letter)
            all_guesses += (guesses + 1)
            guesses = 0

Like the brute-force function, it then increments the indexing variables to prepare for the next guess.

When current_letter is the last item in the word_order list and current_letter_index is the last index for the words in that letter section, the function will exit with a 'not_found' status.
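If the index bookkeeping is hard to follow, the same section-by-section traversal can be sketched with a Python generator. This is an alternative illustration, not the function from the script: the names iter_dictionary_guesses and dictionary_attack and the tiny demo word list are assumptions made for the example.

```python
def iter_dictionary_guesses(words_list, word_order):
    """Yield every word, one section at a time, in the order given by word_order."""
    for section in word_order:
        # A section missing from the word list is treated as empty
        for word in words_list.get(section, []):
            yield word

def dictionary_attack(target_password, words_list, word_order):
    """Return (status, guesses) for a single target password."""
    guesses = 0
    for guess in iter_dictionary_guesses(words_list, word_order):
        guesses += 1
        if guess == target_password:
            return "cracked", guesses
    # Every word in every section was tried without a match
    return "not_found", guesses

# Hypothetical demo data
demo_words = {"common": ["password", "qwerty"], "s": ["secret"], "t": ["test"]}
demo_order = ["common", "s", "a", "t"]

print(dictionary_attack("test", demo_words, demo_order))
print(dictionary_attack("zebra", demo_words, demo_order))
```

The generator keeps track of the current section and position itself, so no global counters are needed. The script's version uses globals instead so it can share the main loop with the brute-force function.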

Main loop

For bulk runs the main loop will run until it has cracked the number of passwords specified in the reps list.

For single runs it will keep prompting for another password to crack until the user enters an empty password to exit the program.

Between each run the program will reset the status to ongoing, then set the indexing counters back to zero.

# Get the starting time to compare to end time after all runs
start_time = time.time()
total_time = 0.0

# Until the quota of passwords have been cracked (bulk runs)
## or the user enters an empty password (single runs)
while reps[0] < reps[1]:
    status = "ongoing"

    if mode == "dictionary":
        # Reset the indexes for the word list to zero
        current_letter = word_order[0]
        current_letter_index = 0
    else:
        # Reset the digits array indexes to zero
        for pos in digits:
            digits[pos] = 0
        password_length = password_length_to_start
    # If multiple runs, create random target_password for this run
    if reps[1] > 1:
        # Create a string from randomly chosen characters
        target_password = ""
        while len(target_password) < password_length_to_generate:
            # Append a randomly chosen character from the characters list to target password
            target_password += random.choice(characters)
        reps[0] += 1
    # For single runs, prompt for target password each time instead
    else:
        target_password = str(input("Enter a password to test\n"))
        # Leaving empty will exit main loop
        if target_password == "":
            reps[0] += 1
            status = "stopped"

For bulk runs it will create a new random target password after each successful password guess.

For single runs it will instead prompt the user for a target password.
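The random target for bulk runs can also be built in a single expression. The sketch below assumes a lowercase-only characters list and uses a made-up helper name, random_target. The script's own characters list depends on the brute-force mode you selected.

```python
import random
import string

def random_target(length, characters):
    """Build a password of the given length from randomly chosen characters."""
    return "".join(random.choice(characters) for _ in range(length))

# Assumed character set for this example only
characters = list(string.ascii_lowercase)
target_password = random_target(4, characters)
print(target_password)
```

This does the same job as the while loop in the listing above, just without growing the string one character at a time.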

The program then logs the start time for this run and begins guessing.

It calls one of the generate functions based on the mode variable to load the next guess.

    # Get the starting time for this run
    this_time = time.time()
    guesses = 0
    # Start guessing until correct
    while status == "ongoing":
        # Generate a new guess
        guesses += 1
        if mode == "dictionary":
            generate_dictionary_word()
        else:
            generate_brute_force()
        if current_guess == target_password:
            elapsed = time.time() - this_time
            status = "Cracked"
            print(status + ": " + current_guess)
            print(str(guesses) + " guesses, " + str(elapsed) + " seconds.\n___________\n")
            all_guesses += guesses
            total_time += elapsed

It compares the guess to the target password and if they match, changes the status variable to stop the loop.

If the guess doesn't match it instead increments the number of guesses and calls the generate function again to get the next guess.

Once either the password is found or all combinations exhausted, it prints the results for that password.
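The guess-compare-count cycle described above can be sketched as a small self-contained function that works with any source of guesses. The names timed_run and brute_force_guesses are hypothetical, and the brute-force source here uses itertools.product rather than the digits array from the script.

```python
import itertools
import time

def brute_force_guesses(characters, max_length):
    """Yield every string over characters, shortest first."""
    for length in range(1, max_length + 1):
        for combo in itertools.product(characters, repeat=length):
            yield "".join(combo)

def timed_run(target_password, guess_source):
    """Try guesses until one matches or the source runs out."""
    start = time.perf_counter()
    guesses = 0
    status = "not_found"
    for guess in guess_source:
        guesses += 1
        if guess == target_password:
            status = "cracked"
            break
    elapsed = time.perf_counter() - start
    return status, guesses, elapsed

# Tiny three-character alphabet so the example finishes instantly
status, guesses, elapsed = timed_run("cab", brute_force_guesses("abc", 3))
print(status + ": " + str(guesses) + " guesses, " + str(elapsed) + " seconds")
```

Because timed_run only needs something it can loop over, a dictionary word source could be passed in the same way.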

Once the main loop finishes it will display the average number of guesses and average elapsed time for all successful attempts.

average_time = (time.time() - start_time) / reps[1]
print("Mode: " + mode)
if reps[1] != 1:
    print(str(all_guesses / reps[1]) + " guesses per password")
    print(str(average_time) + " seconds per password")
else:
    print(str(average_time) + " seconds")
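As a small worked example of the averaging step, the sketch below collects one (guesses, seconds) pair per cracked password and averages them at the end. The summarize helper and the sample figures are invented for illustration. The script itself keeps running totals in all_guesses and total_time instead.

```python
def summarize(results):
    """Average a list of (guesses, seconds) pairs, one per cracked password."""
    if not results:
        return None
    total_guesses = sum(g for g, _ in results)
    total_seconds = sum(s for _, s in results)
    return total_guesses / len(results), total_seconds / len(results)

# Made-up figures for three runs
runs = [(120, 0.5), (80, 0.25), (100, 0.75)]
avg_guesses, avg_seconds = summarize(runs)
print(str(avg_guesses) + " guesses per password")
print(str(avg_seconds) + " seconds per password")
```

Averaging the per-run times rather than the wall-clock time of the whole program leaves out any time spent waiting at the password prompt.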

Dictionary in action

Let's take our new attack method for a spin.

I'll try it with our test password from the brute-force articles, 'test'.

The Python password analyzer guessing the password 'test'

That's 13,900 guesses, compared to 100,000 guesses using our string_lower brute-force attack and 500,000 for our string_full attack.

You can see it has gone through the common section, then the 's' and 'a' sections, before starting on 't' and finding our target password, 'test'.

I'll try it with one created by swiping a finger along the keys, 'qwerty'.

The Python password analyzer guessing the password 'qwerty'

It found this one in six guesses, a record so far, because it was one of the 2000 words in the 'common' section of the word list.

This is where our 'common' section really shines, because it contains many inline keystrokes like this. 'qwerty', starting with an uncommon letter, would have taken the brute-force methods millions of attempts.
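To get a rough feel for the brute-force side of that comparison, the sketch below works out which guess number a password would be if we tried every lowercase string, shortest first, in plain alphabetical order. That ordering is an assumption made for this example. The brute-force modes in this series use their own character order, so their actual counts will differ.

```python
import string

def brute_force_position(password, characters=string.ascii_lowercase):
    """Return the 1-based guess number of password when every string is
    tried shortest first, each length in the order of characters."""
    base = len(characters)
    # Every shorter string is tried before any string of this length
    shorter = sum(base ** n for n in range(1, len(password)))
    # Read the password as a number written in base len(characters)
    index = 0
    for ch in password:
        index = index * base + characters.index(ch)
    return shorter + index + 1

print(brute_force_position("qwerty"))
```

Under these assumptions 'qwerty' lands well past the million mark, while the dictionary attack found it in six guesses.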


In the next password analyzer series we'll move the operation to Linux to take advantage of the more powerful network and file access tools, and add some attack capability to our program.

Then we'll try it out against some real passwords. We'll start with locked RAR files, for several reasons.

Firstly, I've had a few people over the years ask if I could open their password-protected RAR documents. Before programs like LastPass came along this was a common way to securely store lists of logins and password details, but it's not much use when you forget the RAR password.

Secondly, opening RAR files is quicker on a per-guess basis than some of the network-based barriers we will cover. This is because there seems to be no deliberate 'glue pot' (a built-in delay between attempts) in place to limit password attempts.

So get your Linux computer ready, and check back soon for the next article.



Previous: Build a word list

Next: Coming soon


