Automating Android Games with Python & Pytesseract - Sudoku

Introduction

I wrote a Python script to automate a Sudoku game on Android after watching Engineer Man's YouTube videos, which do the same for other games.

The script can be divided into 5 parts:

1. Connecting to an Android device using ADB, and getting the screenshot of the game from it
2. Using Pillow to process the screenshot for pytesseract
3. Using pytesseract to extract the Sudoku game grid to a 2D list in Python
4. Solving the Sudoku game
5. Sending the solved input to your Android device using Python

Of the five, I will focus mostly on parts 2, 3 and 5, since parts 1 and 4 have been covered extensively elsewhere.

Link to the game I automated: https://play.google.com/store/apps/details?id=com.quarzo.sudoku

The complete code is available on the following repository:

Github: haideralipunjabi/sudoku_automate

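Libraries Used

- pure-python-adb (talking to the device over ADB)
- Pillow (processing the screenshot)
- pytesseract (OCR)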

Tutorial

1 (a). Using ADB to Connect to your Device

Most tutorials on the internet use wired ADB, which puts many people off this method. I will be using wireless ADB, which isn't very difficult to set up.

1. On your phone, go to Settings > System > Developer Options. (The path varies between phones, so look it up for your model if it differs.)
2. Turn on Android Debugging and ADB over Network.
3. Note the IP address and port shown under ADB over Network.
4. Install ADB on your computer.
5. Open a command line / command prompt and enter `adb connect <ip-address>:<port>`, using the IP address and port from step 3.
6. When connecting for the first time, you will need to authorize the connection on your phone.

Your device should now be connected to your PC over WiFi.
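
If you would rather trigger the connection from Python than from the command line, the same `adb connect` call can be wrapped with `subprocess`. This is only a sketch: it assumes the `adb` executable is on your PATH, and `build_connect_command` and `connect_over_wifi` are hypothetical helper names, not part of the script.

```python
import subprocess


def build_connect_command(ip_address, port):
    """Build the argument list for `adb connect <ip-address>:<port>`."""
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return ["adb", "connect", f"{ip_address}:{port}"]


def connect_over_wifi(ip_address, port):
    """Run `adb connect` and return whatever adb printed.

    Assumes the adb executable is installed and on PATH.
    """
    result = subprocess.run(
        build_connect_command(ip_address, port),
        capture_output=True, text=True, check=False,
    )
    return (result.stdout + result.stderr).strip()
```

Calling `connect_over_wifi("<ip-address>", "<port>")` with the values from step 3 should give the same message you would see in the terminal.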

1 (b). Using ADB with Python (pure-python-adb)

You can define the following function to connect to the first ADB device connected to your computer using Python

```python
from ppadb.client import Client


def connect_device():
    # Talk to the ADB server running on this computer
    adb = Client(host='127.0.0.1', port=5037)
    devices = adb.devices()

    if len(devices) == 0:
        print("No Devices Attached")
        quit()
    # Use the first connected device
    return devices[0]
```

We will be using this function later to return an instance of ppadb.device.Device which will be used to take a screenshot, and send input to your device.

1 (c). Taking a Screenshot and saving it

pure-python-adb makes it very easy to capture a screenshot of your device. The screencap function is all you need to get the screenshot. Use Python's file I/O to save it to `screen.png`.

```python
def take_screenshot(device):
    image = device.screencap()
    with open('screen.png', 'wb') as f:
        f.write(image)
```

Screenshot of Sudoku

2. Processing the screenshot with Pillow

OCR accuracy on the raw screenshot is very low. To improve it, I used Pillow to process the screenshot so that it shows only the numbers, in black on a white background.

To do that, we first convert the image to grayscale (a single channel) using image.convert('L'). This converts the colors to shades of grey (0-255).

Grayscale Screenshot of Sudoku

After this, we need the numbers (which are the darkest, or very near to black) in black color, and the rest in white. For this, we use image.point() so that all the greys > 50 become white (255) and the rest (numbers) become 0. I also increased the Contrast and Sharpness a bit to be on the safer side.

Processed Screenshot of Sudoku

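```python
from PIL import ImageEnhance


def process_image(image):
    image = image.convert('L')      # Grayscale, one value (0-255) per pixel
    # Anything lighter than 50 becomes white, the rest (the digits) black
    image = image.point(lambda x: 255 if x > 50 else 0, mode='L')
    image = ImageEnhance.Contrast(image).enhance(10)
    image = ImageEnhance.Sharpness(image).enhance(2)
    return image
```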
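
To see what the grayscale and threshold steps do to a single pixel, here is a pure-Python sketch that doesn't need Pillow. The weights are the ones Pillow documents for the 'L' conversion (its internal rounding may differ by one), and the three sample colors are made up for illustration, not taken from the game.

```python
def to_gray(rgb):
    """Approximate Pillow's 'L' conversion: L = R*299/1000 + G*587/1000 + B*114/1000."""
    r, g, b = rgb
    return (r * 299 + g * 587 + b * 114) // 1000


def threshold(gray, cutoff=50):
    """Same rule as the lambda passed to image.point(): light pixels -> 255, dark -> 0."""
    return 255 if gray > cutoff else 0


# Hypothetical sample colors, for illustration only
samples = {
    "digit": (20, 20, 20),
    "grid line": (180, 180, 180),
    "background": (255, 255, 255),
}

for name, rgb in samples.items():
    gray = to_gray(rgb)
    print(f"{name}: gray={gray}, after threshold={threshold(gray)}")
```

Only the near-black digit pixel stays black; everything lighter than the cutoff is pushed to white. If the digits in your game are drawn in a lighter color, the cutoff of 50 would need to be raised.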

3. Extracting the numbers from the image using pytesseract

Using pytesseract on the whole image might give us the numbers, but it won't tell us in which box the number was present. So, I use Pillow to crop each box and then use pytesseract on the cropped images. Before using pytesseract, I defined some functions to give me the coordinates of each box and to give me a cropped image of each box.
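
Those two helpers aren't shown here. A minimal sketch of how they could look, assuming the grid is made of equal square cells: `GRID_LEFT`, `GRID_TOP`, `CELL_SIZE` and `MARGIN` are hypothetical values that you would have to measure on your own screenshot, and the function bodies are my guess at the idea, not the code from the repository.

```python
# Hypothetical layout values: measure these on your own screenshot.
GRID_LEFT = 0       # x of the grid's left edge, in pixels
GRID_TOP = 300      # y of the grid's top edge, in pixels
CELL_SIZE = 120     # width and height of one box, in pixels
MARGIN = 10         # trimmed from each side so grid lines are not sent to the OCR


def get_coords(row, col):
    """Return (x1, y1, x2, y2) of the box at the given row and column (0-8)."""
    x1 = GRID_LEFT + col * CELL_SIZE + MARGIN
    y1 = GRID_TOP + row * CELL_SIZE + MARGIN
    x2 = GRID_LEFT + (col + 1) * CELL_SIZE - MARGIN
    y2 = GRID_TOP + (row + 1) * CELL_SIZE - MARGIN
    return x1, y1, x2, y2


def get_box(image, row, col):
    """Crop one box out of the screenshot.

    `image` is expected to be a Pillow image; its crop() method takes a
    (left, upper, right, lower) tuple.
    """
    return image.crop(get_coords(row, col))
```

The margin is a design choice: leaving a few pixels out on each side keeps the grid lines from being read as characters.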

Since Sudoku has a 9x9 grid, I use two for loops from 0 to 8 to loop over each box. pytesseract wasn't accurate enough with its default configuration, so I had to pass the config `--psm 10 --oem 0`.

The `--psm` argument sets the Page Segmentation Mode. 10 stands for "Treat the image as a single character". This seemed most appropriate since I am passing cropped images of each box.
The `--oem` argument sets the OCR Engine Mode. 0 stands for "Legacy engine only".
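
Even in single-character mode, the OCR result for a box is a string that may be empty, carry trailing whitespace, or contain something that isn't a digit. A small sketch of how that string could be turned into a grid value; `parse_digit` is a hypothetical helper name, and treating anything unexpected as a blank box (0) is an assumption on my part.

```python
def parse_digit(raw):
    """Turn the OCR output for one box into an int, using 0 for a blank box."""
    text = raw.strip()          # drop newlines, spaces and form feeds
    if len(text) == 1 and text in "123456789":
        return int(text)
    return 0                    # empty, multi-character or non-digit output


print(parse_digit("5\n"))   # a recognized digit
print(parse_digit(""))      # an empty box
print(parse_digit("S"))     # a misread character
```

Returning 0 for a misread box lets the solver fill it in, but it also means a misread given digit is silently treated as blank.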

The following function will extract the numbers from the passed image and return a 9x9 2D List with the numbers.

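```python
import pytesseract
from progress.bar import Bar    # Assumption: Bar comes from the `progress` package


def get_grid_from_image(image):
    grid = []
    bar = Bar("Processing: ", max=81)
    for i in range(9):
        row = []
        for j in range(9):
            # strip() because the OCR output may end with a newline or form feed
            digit = pytesseract.image_to_string(
                get_box(image, i, j), config='--psm 10 --oem 0').strip()
            if digit.isdigit():     # If pytesseract returned a digit
                row.append(int(digit))
            else:
                row.append(0)       # Blank box
            bar.next()
        grid.append(row)
    return grid
```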

4. Solving the Sudoku Game

Now that we have the 9x9 Sudoku grid, we need to solve it. Solving Sudoku is a topic that has been covered a lot, so I copied the code below from geeksforgeeks.org.

Here's the GeeksforGeeks article on Sudoku: https://www.geeksforgeeks.org/sudoku-backtracking-7/

```python
# Code from https://www.geeksforgeeks.org/sudoku-backtracking-7/
# A Backtracking program in Python to solve Sudoku problem


# A Utility Function to print the Grid
# Modified it to suit my code
def print_grid(arr):
    for i in range(9):
        print(' '.join([str(x) for x in arr[i]]))


# Function to Find the entry in the Grid that is still not used
# Searches the grid to find an entry that is still unassigned. If
# found, the reference parameters row, col will be set the location
# that is unassigned, and true is returned. If no unassigned entries
# remains, false is returned.
# 'l' is a list variable that has been passed from the solve_sudoku function
# to keep track of incrementation of Rows and Columns
def find_empty_location(arr, l):
    for row in range(9):
        for col in range(9):
            if arr[row][col] == 0:
                l[0] = row
                l[1] = col
                return True
    return False


# Returns a boolean which indicates whether any assigned entry
# in the specified row matches the given number.
def used_in_row(arr, row, num):
    for i in range(9):
        if arr[row][i] == num:
            return True
    return False


# Returns a boolean which indicates whether any assigned entry
# in the specified column matches the given number.
def used_in_col(arr, col, num):
    for i in range(9):
        if arr[i][col] == num:
            return True
    return False


# Returns a boolean which indicates whether any assigned entry
# within the specified 3x3 box matches the given number
def used_in_box(arr, row, col, num):
    for i in range(3):
        for j in range(3):
            if arr[i + row][j + col] == num:
                return True
    return False


# Checks whether it will be legal to assign num to the given row, col
# Returns a boolean which indicates whether it will be legal to assign
# num to the given row, col location.
def check_location_is_safe(arr, row, col, num):

    # Check if 'num' is not already placed in current row,
    # current column and current 3x3 box
    return (not used_in_row(arr, row, num)
            and not used_in_col(arr, col, num)
            and not used_in_box(arr, row - row % 3, col - col % 3, num))


# Takes a partially filled-in grid and attempts to assign values to
# all unassigned locations in such a way to meet the requirements
# for Sudoku solution (non-duplication across rows, columns, and boxes)
def solve_sudoku(arr):

    # 'l' is a list variable that keeps the record of row and col in find_empty_location Function
    l = [0, 0]

    # If there is no unassigned location, we are done
    if not find_empty_location(arr, l):
        return True

    # Assigning list values to row and col that we got from the above Function
    row = l[0]
    col = l[1]

    # consider digits 1 to 9
    for num in range(1, 10):

        # if looks promising
        if check_location_is_safe(arr, row, col, num):

            # make tentative assignment
            arr[row][col] = num

            # return, if success, ya !
            if solve_sudoku(arr):
                return True

            # failure, unmake & try again
            arr[row][col] = 0

    # this triggers backtracking
    return False


# Driver main function to test above functions
if __name__ == "__main__":

    # creating a 2D array for the grid
    grid = [[0 for x in range(9)] for y in range(9)]

    # assigning values to the grid
    grid = [[3, 0, 6, 5, 0, 8, 4, 0, 0],
            [5, 2, 0, 0, 0, 0, 0, 0, 0],
            [0, 8, 7, 0, 0, 0, 0, 3, 1],
            [0, 0, 3, 0, 1, 0, 0, 8, 0],
            [9, 0, 0, 8, 6, 3, 0, 0, 5],
            [0, 5, 0, 0, 9, 0, 6, 0, 0],
            [1, 3, 0, 0, 0, 0, 2, 5, 0],
            [0, 0, 0, 0, 0, 0, 0, 7, 4],
            [0, 0, 5, 2, 0, 6, 3, 0, 0]]

    # if success print the grid
    if solve_sudoku(grid):
        print_grid(grid)
    else:
        print("No solution exists")

# The above code has been contributed by Harshit Sidhwa.
```
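
Because the grid comes from OCR, a single misread digit can hand the solver a puzzle that already breaks the rules, in which case backtracking would search for a long time before giving up. A minimal sketch of a check that could run before solving; `find_conflicts` is a hypothetical helper that is not part of the GeeksforGeeks code or the script.

```python
def find_conflicts(grid):
    """Return the (row, col) of every filled cell whose digit repeats
    in its row, its column or its 3x3 box."""
    conflicts = []
    for row in range(9):
        for col in range(9):
            num = grid[row][col]
            if num == 0:
                continue    # blank cells cannot conflict
            # Gather every other cell in the same row, column and box
            peers = [grid[row][c] for c in range(9) if c != col]
            peers += [grid[r][col] for r in range(9) if r != row]
            box_row, box_col = row - row % 3, col - col % 3
            peers += [
                grid[r][c]
                for r in range(box_row, box_row + 3)
                for c in range(box_col, box_col + 3)
                if (r, c) != (row, col)
            ]
            if num in peers:
                conflicts.append((row, col))
    return conflicts
```

An empty list means the scanned grid is at least consistent; a non-empty list points at the boxes worth checking in the processed screenshot before calling `solve_sudoku`.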

5. Sending the solved input to your Android Device using Python

To send the input, I first filtered the solved Sudoku grid so that only the values that were missing in the original are sent. I used the get_coords function from earlier to get the coordinates of each box and calculated its centre. I then sent a touch at that centre using ADB, followed by the solution digit.

```python
def automate_game(org_grid, solved_grid):
    # Relies on the global `device` that is set in the main block
    for i in range(9):
        for j in range(9):
            if org_grid[i][j] == 0:     # If the box was blank in the game
                x1, y1, x2, y2 = get_coords(i, j)
                center = (x1 + (x2 - x1)/2, y1 + (y2-y1)/2)     # Calculating the center of the box (to select it)
                solution = solved_grid[i][j]
                # A 5 ms swipe that starts and ends at the same point acts as a tap
                device.shell(
                    f'input touchscreen swipe {center[0]} {center[1]} {center[0]} {center[1]} 5')
                device.shell(f'input text {solution}')
```
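
The filtering step can also be separated from the ADB calls, which makes it possible to inspect the moves before anything is sent to the phone. A sketch under that assumption: `plan_moves` is a hypothetical helper, and it takes the coordinate function as a parameter so it can be tried without a device.

```python
def plan_moves(org_grid, solved_grid, get_coords):
    """Return (x, y, digit) for every box that was blank in the original grid.

    get_coords(row, col) must return (x1, y1, x2, y2) for a box.
    """
    moves = []
    for i in range(9):
        for j in range(9):
            if org_grid[i][j] == 0:
                x1, y1, x2, y2 = get_coords(i, j)
                # Integer centre of the box, where the tap will land
                moves.append(((x1 + x2) // 2, (y1 + y2) // 2, solved_grid[i][j]))
    return moves
```

Each tuple can then be turned into the same two shell commands as above, for example `device.shell(f'input touchscreen swipe {x} {y} {x} {y} 5')` followed by `device.shell(f'input text {digit}')`.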

Running the code

All the code that I wrote is in functions and they are called one by one. Note that the grid that I get in step 3 isn't passed directly to step 4. I use deepcopy to create a copy of it, so that I can compare the solved grid with the unsolved/original one in step 5.

```python
from copy import deepcopy

from PIL import Image

import adb  # Assumption: a local module holding connect_device() and take_screenshot()

if __name__ == "__main__":
    # Connect the device using ADB
    device = adb.connect_device()
    # Take Screenshot of the screen and save it in screen.png
    adb.take_screenshot(device)
    image = Image.open('screen.png')
    image = process_image(image)        # Process the image for OCR
    org_grid = get_grid_from_image(image)      # Convert the Image to 2D list using OCR / Pytesseract
    solved_grid = deepcopy(org_grid)        # Deepcopy is used to prevent the function from modifying the original sudoku game
    solve_sudoku(solved_grid)
    automate_game(org_grid, solved_grid)        # Input the solved game into your device
```
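
The reason for deepcopy rather than a plain copy is that the grid is a list of lists. A shallow copy creates a new outer list but keeps the same row objects, so the solver would still overwrite the original. A tiny illustration with a 2x2 stand-in for the grid (the values are made up):

```python
from copy import copy, deepcopy

org_grid = [[0, 2], [3, 0]]     # tiny stand-in for the 9x9 grid

shallow = copy(org_grid)        # new outer list, same inner row lists
deep = deepcopy(org_grid)       # new outer list and new row lists

shallow[0][0] = 9               # also changes org_grid[0][0]
deep[1][1] = 7                  # org_grid is not affected

print(org_grid)     # [[9, 2], [3, 0]]
print(deep)         # [[0, 2], [3, 7]]
```

With a shallow copy, the `org_grid[i][j] == 0` test in automate_game would never be true after solving, and nothing would be sent to the device.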