Friday, July 14, 2023

Automate a Game Using Hand Gestures || Hand Gesture Recognition 🖐️

Introduction

In the dynamic realm of gaming, the quest for immersive experiences and innovative controls leads us to the exciting intersection of hand gesture recognition and the popular game Hill Climb. Traditional controllers are now giving way to a new era where the movements of your hands in the air become the steering wheel, accelerator, and brake, offering an unparalleled level of control and engagement.

Imagine navigating the treacherous terrains of Hill Climb not by pressing buttons, but by the intuitive gestures of your hands. The fusion of gaming and hand gesture recognition opens doors to a world of immersion where your movements seamlessly translate into on-screen actions. The responsiveness of this technology creates an unbroken flow between the virtual and physical, making every gameplay session an adventure.

Hand gesture recognition isn't just a technological marvel; it's a driving force behind innovation in the gaming industry. It introduces a tactile and intuitive dimension to the gaming experience, erasing the boundaries between the player and the game world. As we delve into the integration of this technology with Hill Climb, we uncover the potential for an entirely new paradigm in interactive entertainment.

As we embark on this exploration, the promise of hand gesture recognition in gaming becomes a beacon illuminating the future of how we play. The merging of physical gestures and virtual landscapes is a testament to the relentless pursuit of pushing gaming boundaries. The adventure of Hill Climb becomes not just a game but a canvas where gestures paint the path ahead.

In this article, we journey into the fascinating fusion of hand gesture recognition and Hill Climb, unraveling the layers of innovation that promise to reshape our gaming experiences. As we navigate this frontier, the question isn't just about controlling a game; it's about redefining how we connect with the digital worlds we love.

Unlocking the Potential of Human Pose Estimation with Mediapipe in Python

Mediapipe, a powerful Python library developed by Google, revolutionizes the world of computer vision by offering a comprehensive suite of tools for various applications. At its core, Mediapipe specializes in solving complex problems related to human pose estimation, facial recognition, and hand tracking. This versatile package empowers developers and researchers to integrate advanced computer vision capabilities seamlessly into their Python projects.

Mediapipe shines with its sophisticated features, with human pose estimation being a standout capability. By leveraging advanced machine learning models, Mediapipe can accurately detect key points on the human body, enabling applications that require precise understanding of body movements and postures.
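As a quick illustration, here is a minimal sketch of body pose estimation on a single image (person.jpg is a placeholder file name, not part of the project built later in this article):
Code
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose
with mp_pose.Pose(static_image_mode=True) as pose:
    img = cv2.imread("person.jpg")  # Placeholder image file
    results = pose.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # The model expects RGB input
    if results.pose_landmarks:  # None when no person is detected
        for index, landmark in enumerate(results.pose_landmarks.landmark):
            print(index, landmark.x, landmark.y, landmark.z)  # Normalized coordinates of each body key point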

Beyond body pose estimation, Mediapipe extends its prowess to facial recognition. It can identify facial landmarks, track facial expressions, and even estimate head poses. This functionality proves invaluable for applications ranging from augmented reality filters to emotion analysis.
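For example, a minimal sketch of facial landmark detection with the Face Mesh solution might look like this (face.jpg is a placeholder):
Code
import cv2
import mediapipe as mp

mp_face = mp.solutions.face_mesh
with mp_face.FaceMesh(static_image_mode=True, max_num_faces=1) as face_mesh:
    img = cv2.imread("face.jpg")  # Placeholder image file
    results = face_mesh.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # The model expects RGB input
    if results.multi_face_landmarks:  # None when no face is detected
        print(len(results.multi_face_landmarks[0].landmark), "facial landmarks detected")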

Mediapipe further impresses with its ability to perform robust hand tracking. The library excels in recognizing hand gestures, tracking fingers, and estimating hand poses with remarkable accuracy. This makes it a go-to solution for creating interactive interfaces, virtual reality applications, and more.
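A minimal sketch of hand tracking on a single image looks like this (hand.jpg is a placeholder; the full webcam-based script used for the game is shown in Step 2):
Code
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
    img = cv2.imread("hand.jpg")  # Placeholder image file
    results = hands.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # The model expects RGB input
    if results.multi_hand_landmarks:  # None when no hand is detected
        tip = results.multi_hand_landmarks[0].landmark[8]  # Landmark 8 is the index fingertip
        print("Index fingertip at", tip.x, tip.y)  # Normalized coordinates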

One of the noteworthy aspects of Mediapipe is its user-friendly design and ease of integration. Developers, irrespective of their expertise level, can swiftly incorporate its capabilities into Python projects. The library's documentation provides clear guidelines, making it accessible for beginners while offering depth for advanced users.

From gaming and animation to healthcare and robotics, Mediapipe finds applications in diverse fields. Its accuracy and versatility open doors for creating immersive experiences, enhancing accessibility, and solving real-world problems through computer vision.

Live Demonstration

Discover how to build your own tool that automates the Hill Climb Racing game using Python! Watch our easy-to-follow video tutorial and download the source code today.


Source code: Hand Gesture Recognition

Prerequisites

Before we dive into the practical implementation, there are a few prerequisites we need to address:
1. Python: Ensure you have Python installed on your system. You can download Python from the official website and follow the installation instructions; note that the older package versions pinned below may require an older Python release, such as 3.7 or 3.8.

Step 1: Creating a Virtual Environment

Creating a virtual environment helps to reduce errors, so let's create one in an empty folder.
If you have no idea about virtual environments, you can refer to this blog.
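Here is a minimal sketch of the commands, assuming a Windows machine and a virtual environment named venv:
Code
python -m venv venv
venv\Scripts\activate
On macOS/Linux, activate it with 'source venv/bin/activate' instead.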
After successfully creating a Python file in the virtual environment, download the following packages, which are required for running the script.

Download only the specified versions of the packages mentioned below. If a specified version is removed in the future, install the nearest available version.

1. cv2: cv2 is the Python module of the OpenCV library, which provides computer vision and machine learning software. Download the specific version of cv2 by typing the following command in the terminal:
Code
pip install opencv-python==4.4.0.42

2. Mediapipe: MediaPipe can be used in Python to build machine learning pipelines for processing time-series data like video, audio, etc. Download the specific version by typing the following command in the terminal:
Code
pip install mediapipe==0.8.3.1

3. Autopy: AutoPy includes functions for controlling the keyboard and mouse, finding colors and bitmaps on-screen, and displaying alerts. Download the specific version by typing the following command in the terminal:
Code
pip install autopy==0.4.0

4. numpy: NumPy provides a high-performance multidimensional array object and tools for working with these arrays. Download the specific version by typing the following command in the terminal:
Code
pip install numpy==1.21.6

5. pydirectinput: PyDirectInput is a library that aims to replicate the functionality of the PyAutoGUI mouse and keyboard inputs. Install it by typing the following command in the terminal:
Code
pip install PyDirectInput

6. protobuf: Protocol Buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. Download the specific version by typing the following command in the terminal:
Code
pip install protobuf==3.20.0

Step 2: Creating the Script

Create a new Python file as mentioned in Step 1, copy-paste the following code into it, and save the file.
Code
import cv2
import mediapipe
import numpy
import autopy
import pydirectinput as p1

cap = cv2.VideoCapture(1)  # Camera index; make this 0 if you get an assertion error (0 is usually the default webcam)

initHand = mediapipe.solutions.hands  # Initializing mediapipe
# Object of mediapipe with "arguments for the hands module"
mainHand = initHand.Hands(min_detection_confidence=0.8, min_tracking_confidence=0.8)
draw = mediapipe.solutions.drawing_utils  # Object to draw the connections between each finger index
wScr, hScr = autopy.screen.size()  # Outputs the width and height of the screen (e.g., 1920 x 1080)
pX, pY = 0, 0  # Previous x and y location
cX, cY = 0, 0  # Current x and y location

def handLandmarks(colorImg):
    landmarkList = []  # Default values if no landmarks are tracked
    landmarkPositions = mainHand.process(colorImg)  # Processes the RGB frame through the hands model
    landmarkCheck = landmarkPositions.multi_hand_landmarks  # Detected hand landmarks (None when no hands are found)
    if landmarkCheck:  # Checks if landmarks are tracked
        for hand in landmarkCheck:  # Landmarks for each hand
            for index, landmark in enumerate(hand.landmark):  # Loops through the 21 landmark indexes and their coordinates (x, y, z)
                draw.draw_landmarks(img, hand, initHand.HAND_CONNECTIONS)  # Draws each landmark on the hand with its connections
                h, w, c = img.shape  # Height, width and channels of the image
                centerX, centerY = int(landmark.x * w), int(landmark.y * h)  # Converts the normalized coordinates to pixel coordinates
                landmarkList.append([index, centerX, centerY])  # Adds the index and its coordinates to the list
    return landmarkList

def fingers(landmarks):
    fingerTips = []  # Stores a 1 (up) or 0 (down) for each of the five fingers
    tipIds = [4, 8, 12, 16, 20]  # Indexes for the tips of each finger
    # Check if the thumb is up (compares x coordinates, since the thumb extends sideways)
    if landmarks[tipIds[0]][1] > landmarks[tipIds[0] - 1][1]:
        fingerTips.append(1)
    else:
        fingerTips.append(0)
    # Check if fingers are up except the thumb
    for id in range(1, 5):
        if landmarks[tipIds[id]][2] < landmarks[tipIds[id] - 3][2]:  # Tip is above the lower knuckle (smaller y is higher in the image)
            fingerTips.append(1)
        else:
            fingerTips.append(0)
    return fingerTips


while True:
    check, img = cap.read()  # Reads frames from the camera
    if not check:  # Stops the loop if no frame could be read
        break
    imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Converts the frame from BGR to RGB for Mediapipe
    lmList = handLandmarks(imgRGB)


    if len(lmList) != 0:
        x1, y1 = lmList[8][1:]  # x and y of the index fingertip (landmark 8); [1:] skips the stored index value
        x2, y2 = lmList[12][1:]  # x and y of the middle fingertip (landmark 12)
        finger = fingers(lmList)  # Calls the fingers function to check which fingers are up

        if finger[1] == 1 and finger[2] == 0 and finger[4] == 0:  # Index finger up, middle and pinky down: move the mouse
            x3 = numpy.interp(x1, (75, 640 - 75), (0, wScr))  # Maps the camera frame width to the screen width
            y3 = numpy.interp(y1, (75, 480 - 75), (0, hScr))  # Maps the camera frame height to the screen height

            cX = pX + (x3 - pX) / 7  # Smooths the x movement relative to the previous location
            cY = pY + (y3 - pY) / 7  # Smooths the y movement relative to the previous location

            autopy.mouse.move(wScr - cX, cY)  # Moves the mouse (wScr - cX mirrors the x direction)
            pX, pY = cX, cY  # Stores the current location as the previous location for the next loop

        if finger[1] == 0 and finger[0] == 1:  # Index finger down and thumb up: left click
            p1.click(button='left')

        if sum(finger) == 5:  # All five fingers up: accelerate (hold the right arrow key)
            p1.keyDown("right")
            p1.keyUp("left")
        elif sum(finger) == 0:  # Fist, all fingers down: brake (hold the left arrow key)
            p1.keyDown("left")
            p1.keyUp("right")
        elif finger[1] == 1 and finger[2] == 1 and finger[3] == 1:  # Index, middle and ring up: press space
            p1.press("space")
        elif finger[1] == 1:  # Only the index finger up: release both keys
            p1.keyUp("right")
            p1.keyUp("left")

    cv2.imshow("Webcam", img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()  # Releases the webcam
cv2.destroyAllWindows()  # Closes the display window

Step 3: Ensuring Camera Connection

Ensure a camera is connected to your desktop/laptop, as it is necessary for detecting hand gestures.

Step 4: Running the Script

Before running the script, open the Hill Climb Racing game and minimise it for a while.
After that, open the terminal, make sure it is running inside the virtual environment, and type the command to run the script, which is 'python file_name.py' (a reference session is sketched after the screenshot below). If everything goes well, the program window will appear in front of you in this manner:

[Image: hand gesture recognition window]
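For reference, the terminal session looks roughly like this, assuming a Windows machine, a virtual environment named venv, and a script saved as file_name.py:
Code
venv\Scripts\activate
python file_name.py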

Step 5: Enjoy the Game

Just maximize the Hill Climb Racing game window and enjoy the game the way I demonstrated in the video!

Conclusion

In wrapping up our exploration, Mediapipe emerges as a force to be reckoned with in the expansive field of computer vision. Its prowess in human pose estimation, facial recognition, and hand tracking showcases a tool that transcends boundaries. The library's robust features, seamless integration, and wide-ranging applications underscore its pivotal role in reshaping the landscape of computer vision technologies.

Mediapipe's strength lies not only in its accuracy but also in its adaptability. The ease with which it integrates into various domains, from gaming to healthcare, highlights its versatility. It stands as a testament to the transformative potential residing within open-source libraries, emphasizing the collaborative power of the developer community in propelling the capabilities of computer vision to new heights. As we navigate the dynamic realm of technology, Mediapipe stands as a beacon, guiding us towards a future where computer vision becomes an even more integral part of our daily lives.
  
Stay up-to-date with our latest content by subscribing to our channel! Don't miss out on our next video - make sure to subscribe today.


