[OpenCV] MediaPipe Pose Detection


외손잡이 2024. 4. 11. 21:13

Author	장원준
Date	Thu, April 11, 2024, 18:00 ~ 21:00
Location	복지관 (Welfare Center), Room B128-1
Participants	임혜진, 이재영, 성창민, 김명원, 장원준

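MediaPipe is an AI framework from Google that makes a range of vision AI features for video data easy to use in pipeline form.

It offers a variety of features and models, including Object Detection and Image Classification.

Because the models come already developed and trained on various datasets, you can build vision AI features by installing mediapipe and then simply calling its functions, much as you would use any other library.

Among these models, we will use the Pose detection model to detect a person who is climbing.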

Since it is an AI model that Google provides as open source, the example notebook is published on GitHub at the following link:

https://github.com/googlesamples/mediapipe/blob/main/examples/pose_landmarker/python/%5BMediaPipe_Python_Tasks%5D_Pose_Landmarker.ipynb

!pip install -q mediapipe


!wget -O pose_landmarker.task -q https://storage.googleapis.com/mediapipe-models/pose_landmarker/pose_landmarker_heavy/float16/1/pose_landmarker_heavy.task

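The first command installs mediapipe.

The second command downloads the pose landmarker model and saves it in the current working directory as pose_landmarker.task.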
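The `!pip` and `!wget` lines only work inside a notebook such as Colab. As a rough sketch, the same download could be done in plain Python when running elsewhere. The helper name `ensure_model` is hypothetical and not part of MediaPipe; the URL is the one used above.

```python
import os
import urllib.request

MODEL_URL = (
    "https://storage.googleapis.com/mediapipe-models/pose_landmarker/"
    "pose_landmarker_heavy/float16/1/pose_landmarker_heavy.task"
)


def ensure_model(path="pose_landmarker.task", url=MODEL_URL):
    """Download the model to `path` unless a non-empty file is already there.

    Hypothetical helper for illustration; not a MediaPipe API.
    """
    if os.path.isfile(path) and os.path.getsize(path) > 0:
        return path  # already downloaded, skip the network call
    urllib.request.urlretrieve(url, path)
    return path
```

Calling `ensure_model()` once before creating the detector should leave `pose_landmarker.task` in the working directory, which is the path the later code expects.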

from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
import numpy as np


def draw_landmarks_on_image(rgb_image, detection_result):
  pose_landmarks_list = detection_result.pose_landmarks
  annotated_image = np.copy(rgb_image)

  # Loop through the detected poses to visualize.
  for idx in range(len(pose_landmarks_list)):
    pose_landmarks = pose_landmarks_list[idx]

    # Draw the pose landmarks.
    pose_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
    pose_landmarks_proto.landmark.extend([
      landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z) for landmark in pose_landmarks
    ])
    solutions.drawing_utils.draw_landmarks(
      annotated_image,
      pose_landmarks_proto,
      solutions.pose.POSE_CONNECTIONS,
      solutions.drawing_styles.get_default_pose_landmarks_style())
  return annotated_image

This function uses MediaPipe to draw the pose landmarks of every detected pose onto a copy of the input image, and returns that annotated copy.
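The landmark `x` and `y` values passed to the drawing utility are normalized by image width and height rather than given in pixels. If you want to draw or measure things yourself, a minimal conversion might look like the sketch below. The function name `to_pixel` is made up for this example, and the clamping is an assumption on my part to handle landmarks estimated slightly outside the frame.

```python
def to_pixel(x, y, width, height):
    """Convert a normalized (x, y) landmark to integer pixel coordinates."""
    # Clamp because a landmark estimated just outside the frame can fall
    # slightly below 0.0 or above 1.0.
    px = min(max(int(x * width), 0), width - 1)
    py = min(max(int(y * height), 0), height - 1)
    return px, py


print(to_pixel(0.5, 0.25, 640, 480))  # (320, 120)
```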

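from google.colab import files

# Open Colab's upload dialog and save each uploaded file to the working directory.
uploaded = files.upload()

for filename in uploaded:
  content = uploaded[filename]
  with open(filename, 'wb') as f:
    f.write(content)

# Remember the name of the first uploaded file for later use.
if len(uploaded.keys()):
  IMAGE_FILE = next(iter(uploaded))
  print('Uploaded file:', IMAGE_FILE)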

This code lets you upload an image of your own choice.

# STEP 1: Import the necessary modules.
import cv2
import mediapipe as mp
from google.colab.patches import cv2_imshow  # Colab-only display helper
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

# STEP 2: Create a PoseLandmarker object.
base_options = python.BaseOptions(model_asset_path='pose_landmarker.task')
options = vision.PoseLandmarkerOptions(
    base_options=base_options,
    output_segmentation_masks=True)
detector = vision.PoseLandmarker.create_from_options(options)

# STEP 3: Load the input image.
# "image.jpg" must exist in the working directory; pass IMAGE_FILE instead
# to use the image uploaded above.
image = mp.Image.create_from_file("image.jpg")

# STEP 4: Detect pose landmarks from the input image.
detection_result = detector.detect(image)

# STEP 5: Process the detection result. In this case, visualize it.
annotated_image = draw_landmarks_on_image(image.numpy_view(), detection_result)
cv2_imshow(cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR))

This code runs pose detection on the image and visualizes the result.
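Besides drawing, the detection result can be read directly: `detection_result.pose_landmarks` holds one list of 33 landmarks per detected person. The sketch below picks out a few joints by name. The helper `named_landmarks` and the `LANDMARK_INDEX` table are names I made up for illustration; the index numbers follow the 33-point MediaPipe pose model. The `SimpleNamespace` objects are stand-ins so the example runs without a model, not real MediaPipe output.

```python
from types import SimpleNamespace

# Indices from the 33-point MediaPipe pose landmark model.
LANDMARK_INDEX = {
    "nose": 0,
    "left_shoulder": 11, "right_shoulder": 12,
    "left_elbow": 13, "right_elbow": 14,
    "left_wrist": 15, "right_wrist": 16,
    "left_hip": 23, "right_hip": 24,
    "left_knee": 25, "right_knee": 26,
    "left_ankle": 27, "right_ankle": 28,
}


def named_landmarks(detection_result, pose_index=0):
    """Return {name: (x, y, z)} for one detected pose, or {} if none was found."""
    poses = detection_result.pose_landmarks
    if pose_index >= len(poses):
        return {}
    pose = poses[pose_index]
    return {
        name: (pose[i].x, pose[i].y, pose[i].z)
        for name, i in LANDMARK_INDEX.items()
    }


# Stand-in result with 33 dummy landmarks, only to show the call shape.
fake_pose = [SimpleNamespace(x=i / 100, y=0.5, z=0.0) for i in range(33)]
fake_result = SimpleNamespace(pose_landmarks=[fake_pose])
print(named_landmarks(fake_result)["left_wrist"])  # (0.15, 0.5, 0.0)
```

With a real result you would call `named_landmarks(detection_result)` after STEP 4; an empty dict means no person was detected in the image.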

segmentation_mask = detection_result.segmentation_masks[0].numpy_view()
visualized_mask = np.repeat(segmentation_mask[:, :, np.newaxis], 3, axis=2) * 255
cv2_imshow(visualized_mask)

This code extracts the segmentation mask and visualizes it.
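The mask can also be used to separate the person from the background instead of just displaying it. The sketch below assumes the mask is a float array of per-pixel confidences in [0, 1] with the same height and width as the image. The function name `cut_out_person` and the 0.5 threshold are my own choices for illustration, not values taken from the notebook.

```python
import numpy as np


def cut_out_person(rgb_image, segmentation_mask, threshold=0.5):
    """Keep pixels where the mask exceeds `threshold`; paint the rest black."""
    person = segmentation_mask > threshold  # (H, W) boolean array
    # Add a channel axis so the (H, W) mask broadcasts over the RGB channels.
    cut = np.where(person[:, :, np.newaxis], rgb_image, 0)
    return cut.astype(rgb_image.dtype)


# Tiny synthetic example: 2x2 white image, person only in the top-left pixel.
image = np.full((2, 2, 3), 255, dtype=np.uint8)
mask = np.array([[0.9, 0.1], [0.2, 0.4]], dtype=np.float32)
result = cut_out_person(image, mask)
print(result[:, :, 0])
```

With the real data this would be called as `cut_out_person(image.numpy_view(), segmentation_mask)`, assuming the image is a 3-channel RGB array.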

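Building a climbing-specific algorithm on top of the code above should lead to a more efficient, better-performing model!
There also turned out to be more open-source, pre-trained models available than I expected, so using them in future projects should drastically cut the time spent on training.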
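As one possible building block for such a climbing algorithm, joint angles (for example at the elbow or knee) can be computed from three landmarks. This is only a sketch of one idea and not something we have implemented. The function name `joint_angle` is hypothetical and the sample points are made-up pixel coordinates. Pixel coordinates are used instead of normalized ones because normalized values would distort angles on non-square images.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b, between segments b->a and b->c (2D points)."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return None  # degenerate case: two of the points coincide
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp against floating-point drift before calling acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


# Shoulder, elbow, wrist in pixel coordinates (made-up values): a right angle.
print(joint_angle((0, 0), (100, 0), (100, 100)))
```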
