Camera control in Unity using MediaPipe

To create an in-game language that lets MM (player 1), who can hear but sees blurry, communicate with NN (player 2), who can see but hears muffled, I thought a hand-tracking system implemented in Unity could work.

MediaPipe Hands

MediaPipe Hands is a high-fidelity hand and finger tracking solution. It uses machine learning (ML) to infer 21 3D landmarks of a hand from a single frame.

https://mediapipe.dev/images/mobile/hand_landmarks.png
Hand landmarks
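
As a quick reference for the 21 landmarks in the diagram above, here is a small Python sketch of the index-to-name mapping. The names follow MediaPipe's published landmark diagram; the `HandLandmark` enum below is a standalone illustration written for this post, not something imported from the library.

```python
from enum import IntEnum


class HandLandmark(IntEnum):
    """Standalone copy of the 21 hand landmark indices (illustrative only)."""
    WRIST = 0
    THUMB_CMC = 1
    THUMB_MCP = 2
    THUMB_IP = 3
    THUMB_TIP = 4
    INDEX_FINGER_MCP = 5
    INDEX_FINGER_PIP = 6
    INDEX_FINGER_DIP = 7
    INDEX_FINGER_TIP = 8
    MIDDLE_FINGER_MCP = 9
    MIDDLE_FINGER_PIP = 10
    MIDDLE_FINGER_DIP = 11
    MIDDLE_FINGER_TIP = 12
    RING_FINGER_MCP = 13
    RING_FINGER_PIP = 14
    RING_FINGER_DIP = 15
    RING_FINGER_TIP = 16
    PINKY_MCP = 17
    PINKY_PIP = 18
    PINKY_DIP = 19
    PINKY_TIP = 20


# The two thumb landmarks used later for the forward/backward gesture.
THUMB_PAIR = (HandLandmark.THUMB_IP, HandLandmark.THUMB_TIP)
```

Each landmark carries normalized x and y image coordinates plus a relative depth z, so the index is all that is needed to pick a joint out of a detection result.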

MediaPipe Hands uses an ML pipeline consisting of multiple models working together: a palm detection model that operates on the full image and returns an oriented hand bounding box, and a hand landmark model that operates on the cropped image region defined by the palm detector and returns high-fidelity 3D hand keypoints.

Installing

The recommended way to set it up is to install Python and pip: https://www.python.org/downloads/ . In some cases the Conda package manager may also need to be installed. OpenCV and MediaPipe then need to be installed from the terminal; this process may vary depending on the computer's processor. For a Mac with an M4 chip, the process was as follows:

OpenCV-Mediapipe installation
MediaPipe in Conda environment
Environment activation
MediaPipe installation in Conda
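
Once the steps above are done, a short Python check can confirm that both packages are visible from the active environment. This is only a sketch: it assumes the packages were installed under their usual import names, `cv2` (from the `opencv-python` package) and `mediapipe`, and the helper name `check_modules` is made up for this example.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: bool} saying whether each module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Assumed import names for OpenCV and MediaPipe.
for name, found in check_modules(["cv2", "mediapipe"]).items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If either line reports `missing`, the most likely cause is that the script is running outside the Conda environment that was activated during installation.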

Configuration

The first step is to write a Python script for hand gesture recognition and network communication using the MediaPipe library. The script may vary depending on what it is expected to do, but the one used in the first trial was this:

Hand tracking recognition and network communication Python code

Here, hand landmarks 3 and 4 represent the thumb: the player moves forward when the thumb is up and backward when it is down. The code also captures video from the default camera, processes the hand landmarks, and sends the data via a socket, in this case to Unity.
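
Since the original script survives only as a screenshot, here is a minimal sketch of the same idea rather than the exact code used in the trial. It assumes MediaPipe's legacy `mp.solutions.hands` interface, a Unity server listening on `127.0.0.1:5052`, and plain-text commands ending in a newline; the port, the `DEAD_ZONE` tolerance, and the command strings are all assumptions made for illustration.

```python
import socket

THUMB_IP, THUMB_TIP = 3, 4      # landmark indices named in the text
HOST, PORT = "127.0.0.1", 5052  # assumed address of the Unity socket server
DEAD_ZONE = 0.05                # assumed tolerance, in normalized image units


def classify_thumb(ip_y, tip_y, dead_zone=DEAD_ZONE):
    """Map thumb orientation to a movement command.

    Image y grows downward, so a thumb tip above the IP joint has the smaller y.
    """
    delta = ip_y - tip_y
    if delta > dead_zone:
        return "forward"
    if delta < -dead_zone:
        return "backward"
    return "stop"


def run(host=HOST, port=PORT):
    """Capture webcam frames, classify the thumb, and send command changes."""
    # Imported here so classify_thumb can be used without these packages.
    import cv2
    import mediapipe as mp

    cap = cv2.VideoCapture(0)  # default camera
    hands = mp.solutions.hands.Hands(max_num_hands=1)
    last_command = None
    try:
        with socket.create_connection((host, port)) as sock:
            while cap.isOpened():
                ok, frame = cap.read()
                if not ok:
                    break
                # MediaPipe expects RGB; OpenCV delivers BGR.
                results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                command = "stop"  # no hand in view means no movement
                if results.multi_hand_landmarks:
                    lm = results.multi_hand_landmarks[0].landmark
                    command = classify_thumb(lm[THUMB_IP].y, lm[THUMB_TIP].y)
                if command != last_command:  # only send when the gesture changes
                    sock.sendall((command + "\n").encode("utf-8"))
                    last_command = command
    finally:
        cap.release()
        hands.close()
```

Calling `run()` starts the capture loop. Keeping `classify_thumb` separate from the camera and socket code means the gesture rule can be tried with hand-typed coordinates before a webcam or Unity is involved.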

Unity connection

For the Unity implementation, we need to create a C# script that starts the socket server.

Socket server to Unity C# code

This script is attached to the First Person Controller inside the Unity environment. It will need to be modified if it does not match or use the same variables as the First Person Controller script.
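
To try the message flow without opening Unity, a small Python listener can stand in for the socket server. This is not the C# script used in the project, only a hedged test aid: it assumes the same newline-terminated text commands as a hypothetical protocol, and the function names `serve_once` and `demo` are invented for this example.

```python
import socket
import threading


def serve_once(server_sock, received):
    """Accept one client and collect each newline-terminated command."""
    conn, _ = server_sock.accept()
    with conn:
        buffer = b""
        while True:
            chunk = conn.recv(1024)
            if not chunk:  # client closed the connection
                break
            buffer += chunk
            # TCP is a byte stream, so split complete lines out of the buffer.
            while b"\n" in buffer:
                line, buffer = buffer.split(b"\n", 1)
                received.append(line.decode("utf-8"))


def demo():
    """Start the stand-in server, send three commands, return what arrived."""
    received = []
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server, received))
    worker.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"forward\nbackward\nstop\n")
    worker.join(timeout=5)
    server.close()
    return received
```

The buffering loop is the part worth carrying over to any real server: one `recv` call may deliver several commands at once or only part of one, so commands should be split on the delimiter rather than read one call at a time.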

FPS Socket Server script

Running the Python file

The Unity game must already be running before the .py file is run. Once it is, run the Python file from the terminal using its path.

Py file running

Testing controls in Unity

Forward
Backward
Stop
