Anth's Computer Cave Tutorials

Python Password Analyzer

In this series we'll create a new open-source password-cracking program for penetration testing, and for clowns who've locked themselves out of their files.

First we'll build scripts to analyze passwords and find the average number of guesses for different password lengths and guessing methods, then we'll use the results to design our program.

This is an ongoing series, and we'll add to it every week. Use the links below to read each article.


Password Analyzer five: Dictionary-attack

In the previous article we created a script to scrape words from text files and started building our word lists.

Today we'll build a dictionary-attack method into our password-analyzer and put our word lists to use.

If you haven't downloaded the password analyzer code yet, click here to download the code, the word grabber script and the basic 9500-word word file.

We've tidied up the existing code and added the new dictionary-attack capabilities.

Once again I'll go through the code and explain how it works. It's a large code file now, so I'll only cover the parts that have changed since the brute-force password analyzer article.

Mode

You select dictionary-attack by changing the mode variable to 'dictionary'. The program will load your JSON word list.

elif mode == "dictionary":
    # Load the word file into words_list array
    with open("words.txt") as word_holder:
        words_list = json.load(word_holder)[0]
        print(len(words_list))
    # Try words starting with each letter in this order
    word_order = ["common", "s", "a", "t", "c", "b", "d", "e", "f", "w", "g", "h", "i",
                  "l", "p", "r", "m", "u", "n", "o", "j", "k", "x", "v", "y", "q", "z"]
    # Words starting with this letter are currently being tried
    current_letter = word_order[0]
    # Index of the current word within its letter array
    current_letter_index = 0

It will then declare a list called word_order, used to determine the order of starting letters for parsing the word list. By default the first item is 'common', meaning the program will try all the words in the common section first. It then moves to words beginning with 's', then 'a', etc. You can change the order by moving the letters in the word_order list.
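
The loading code implies a particular file layout: json.load(word_holder)[0] is indexed by section name, so the file appears to hold a JSON array whose first item is an object keyed by 'common' and by each starting letter. The sketch below builds a tiny word list in that assumed layout and walks it in order. The sample words, the shortened sample_order list and the words_in_order() helper are made up for illustration and are not part of the download.

```python
import json

# Assumed layout, inferred from json.load(word_holder)[0] in the analyzer:
# a JSON array whose first item maps each section name to a list of words.
sample_json = json.dumps([{
    "common": ["password", "qwerty"],
    "s": ["secret", "summer"],
    "a": ["apple"],
    "t": ["test"],
}])

# Same indexing step the analyzer applies after opening words.txt
words_list = json.loads(sample_json)[0]

# Shortened, hypothetical version of word_order for this small sample
sample_order = ["common", "s", "a", "t"]

def words_in_order(words, order):
    """Return every word, section by section, in the order given."""
    result = []
    for section in order:
        # .get() tolerates a section that is missing from the file
        result.extend(words.get(section, []))
    return result

ordered_words = words_in_order(words_list, sample_order)
print(ordered_words)
```

Moving an entry in sample_order changes which section is tried first, which is the effect of rearranging word_order in the analyzer.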

The current_letter variable and the current_letter_index variable are used to keep track of where the program is within the word list.

Functions

We've renamed the existing generate() function to generate_brute_force(), but the inner workings are exactly the same.

There is a new function called generate_dictionary_word() that is the dictionary equivalent of the original function.

First it chooses a word from your list according to the current_letter variable and the current_letter_index variable.

# Select a password guess from the word list
def generate_dictionary_word():
    global target_password, current_guess, status, current_letter, current_letter_index, guesses, all_guesses
    # Retrieve current word from list
    if len(words_list[current_letter]) > 0:
        current_guess = words_list[current_letter][current_letter_index]
    # Prepare for next guess
    # Move to next index in the current letter array if not at end
    if current_letter_index < (len(words_list[current_letter]) - 1):
        current_letter_index += 1
        
    else:
        # Change to the next letter if any remaining
        if word_order.index(current_letter) < (len(word_order) - 1):
            current_letter = word_order[word_order.index(current_letter) + 1]            
            current_letter_index = 0
            print(current_letter)
            
        else:
            #All words tried, bad luck
            status = "not_found"
            print("DONE: " + current_letter)
            all_guesses += (guesses + 1)
            guesses = 0
            return True

Like the brute-force function, it then increments the indexing variables to prepare for the next guess.

When current_letter is the last item in the word_order list and current_letter_index is the last index for the words in that letter section, the function will exit with a 'not_found' status.
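
The same traversal can also be written as a Python generator, which keeps the position bookkeeping inside the function rather than in global variables. This is only a sketch of an alternative, not the code in the download. The names dictionary_guesses() and count_guesses() and the sample data are hypothetical.

```python
def dictionary_guesses(words, order):
    """Yield each word once, section by section, following order."""
    for section in order:
        # Sections missing from the word list are simply skipped
        for word in words.get(section, []):
            yield word

def count_guesses(target, words, order):
    """Return the number of guesses needed to reach target, or None if the list runs out."""
    for number, guess in enumerate(dictionary_guesses(words, order), start=1):
        if guess == target:
            return number
    return None  # Plays the role of the 'not_found' status

# Made-up sample data for illustration
sample_words = {"common": ["password", "qwerty"], "s": ["secret"], "t": ["tea", "test"]}
sample_order = ["common", "s", "a", "t"]

print(count_guesses("test", sample_words, sample_order))   # 5
print(count_guesses("zebra", sample_words, sample_order))  # None
```

When the generator runs out of words the for loop simply ends, so there is no separate end-of-list check to maintain.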

As before, the main loop checks the password guesses against the target password and calls the generate function again and again until either the password is found, or the list of words or characters is exhausted.

If running bulk tests (reps[1] set above 1), the program will reset the status and indexing counters back to default between successful password guesses.

start_time = time.time()
while reps[0] < reps[1]:
    status = "ongoing"
    if mode != "dictionary": # Brute-force
        # Reset the digits array to default
        for pos in digits:
            digits[pos] = 0
            
        if reps[1] != 1: # Multiple runs
            # Generate target password to guess
            target_password = ""
            password_length = password_length_to_start
            chars = 0
            # Create a string from randomly chosen characters
            while chars < password_length_to_generate:
                # Append a random index from the characters list to target password
                target_password += random.choice(characters)
                chars += 1
    else:
        print(current_letter)
        current_letter = word_order[0]
        current_letter_index = 0                

It then begins guessing.

It calls one of the generate functions based on the mode variable to load the next guess.

    # Start guessing
    while status == "ongoing":
        # Generate next password guess
        if mode != "dictionary":
            generate_brute_force() 
        else: # Dictionary 
            generate_dictionary_word()
        if target_password == current_guess:
            status = "Cracked"
            print(status + ": " + current_guess + " in " + str(guesses + 1) + " guesses")
            all_guesses += (guesses + 1)
            guesses = 0
        else:
            guesses += 1
    if status != "Cracked":
        print(status)
        print(str(all_guesses / reps[1]) + " guesses")
    reps[0] += 1

Next it compares the guess to the target password and if they match, prints the details and changes the status variable to stop the loop.

If the guess doesn't match it increments the number of guesses and calls the generate function again to get the next guess.

Once either the password is found or all combinations exhausted, it prints the results.

average_time = (time.time() - start_time) / reps[1]
print("Mode: " + mode)
if reps[1] != 1:
    print(str(all_guesses / reps[1]) + " guesses per password")
    print(str(average_time) + " seconds per password")
else:
    print(str(average_time) + " seconds")

For a bulk password test it will display the average number of guesses and average elapsed time for all successful attempts.

For single tests it will display the time for that attempt.
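
The averaging itself is plain arithmetic: the total guesses and the total elapsed time are each divided by the number of passwords tested. Here is a minimal sketch of that calculation in isolation. It assumes a hypothetical run_trials() helper and a stand-in sample_crack() function that returns a guess count; neither is part of the analyzer.

```python
import time

def run_trials(targets, crack):
    """Run crack(target) for each target; return (average guesses, average seconds)."""
    start_time = time.perf_counter()
    total_guesses = 0
    for target in targets:
        total_guesses += crack(target)
    elapsed = time.perf_counter() - start_time
    return total_guesses / len(targets), elapsed / len(targets)

# Stand-in cracker for illustration: the guess count is the word's position in a list
sample_list = ["password", "qwerty", "secret", "test"]

def sample_crack(target):
    return sample_list.index(target) + 1

average_guesses, average_seconds = run_trials(["qwerty", "test"], sample_crack)
print(str(average_guesses) + " guesses per password")
print(str(average_seconds) + " seconds per password")
```

The sketch uses time.perf_counter() rather than time.time() because it is intended for measuring short intervals; either works for runs that last several seconds.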

Dictionary in action

Let's take our new attack method for a spin.

I'll try it with our test password from the brute-force articles, 'test'.

The Python password analyzer guessing the password 'test'

That's 13,900 guesses compared to 100,000 guesses using our string_lower brute-force attack and 500,000 for our string_full attack.

You can see it has gone through the common section, then the 's' and 'a' sections, before starting on 't' and finding our target password, 'test'.

I'll try it with one created by swiping a finger along the keys, 'qwerty'.

The Python password analyzer guessing the password 'qwerty'

It found this one in six guesses, a record so far, because it was one of the 2000 words in the 'common' section of the word list.

This is where our 'common' section really shines, because it contains many inline keystrokes like this. 'qwerty', starting with an uncommon letter, would have taken the brute-force methods millions of attempts.
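
To put a rough number on that: a brute-force attack limited to lowercase letters has 26 choices per position, so there are 26 to the power of 6 possible six-letter strings, plus all the shorter ones if it works up from length one. The sketch below does that arithmetic. It assumes a plain 26-letter lowercase alphabet, and the function names are hypothetical. The actual guess count for 'qwerty' depends on the order in which the brute-force mode tries characters, which this does not model.

```python
import string

def keyspace(alphabet_size, length):
    """Number of distinct strings of exactly this length."""
    return alphabet_size ** length

def keyspace_up_to(alphabet_size, max_length):
    """Number of distinct strings of length 1 through max_length."""
    return sum(keyspace(alphabet_size, n) for n in range(1, max_length + 1))

lowercase = len(string.ascii_lowercase)  # 26 letters

six_letter_total = keyspace(lowercase, 6)
up_to_six_total = keyspace_up_to(lowercase, 6)
print(six_letter_total)  # 308915776
print(up_to_six_total)   # 321272406
```

Against a space of roughly 309 million six-letter candidates, a hit within the first few entries of the 'common' section is a very large saving.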

Next

In the next article we'll move the operation to Linux to take advantage of the more powerful network and file access tools, and add some attack capability to our program.

Then we'll try it out against some real passwords. We'll start with locked RAR files, for several reasons.

Firstly I've had a few people over the years ask if I could open their password-protected RAR documents. Before programs like LastPass came along this was a common way to securely store lists of logins and password details, but it's not much use when you forget the RAR password.

Secondly, opening RAR files is quicker on a per-guess basis than some of the network-based barriers we will cover. This is because there seems to be no deliberate glue pot in place to limit password attempts.

So get your Linux computer ready, and check back soon for the next article.

Cheers

Anth


