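<aside> 💡 Goal

Estimate the distance to each object in an image, and make that information available to a language model so it can answer questions about it.

Required implementation

Derive the distance of each element in the image from depth information
Match each specific element to its derived distance
Feed all elements and their distance information to the language model as input and obtain its response </aside>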
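
A minimal sketch of the three steps above, in plain Python. It assumes a metric depth map (in meters, as a list of rows) and bounding boxes from some detector are already available. The names `box_distance` and `build_prompt`, the `(x1, y1, x2, y2)` box format, and the prompt wording are all hypothetical choices for illustration, not something fixed by these notes.

```python
from statistics import median


def box_distance(depth_map, box):
    """Median depth inside a box (x1, y1, x2, y2); pixels with depth <= 0 are treated as invalid."""
    x1, y1, x2, y2 = box
    values = [d for row in depth_map[y1:y2] for d in row[x1:x2] if d > 0]
    return median(values) if values else None


def build_prompt(detections, depth_map, question):
    """Match each detected label with its distance and format everything as LLM input text."""
    lines = []
    for label, box in detections:
        dist = box_distance(depth_map, box)
        if dist is not None:
            lines.append(f"- {label}: {dist:.1f} m")
    return "Objects and distances:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


# Toy 4x4 depth map (meters); the 0.0 entry stands in for a missing depth reading.
depth_map = [
    [2.0, 2.0, 5.0, 5.0],
    [2.0, 2.0, 5.0, 5.0],
    [2.0, 0.0, 5.0, 6.0],
    [2.0, 2.0, 5.0, 6.0],
]
detections = [("chair", (0, 0, 2, 4)), ("table", (2, 0, 4, 4))]
print(build_prompt(detections, depth_map, "Which object is closest?"))
```

The median is used instead of the mean so that background pixels inside the box and missing depth readings skew the result less. The resulting text would then be passed to whichever language model is chosen.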

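Come to think of it, we don't necessarily have to extract distance from a depth image. It should also be fine to run object detection on a regular image first and then estimate distance from the detections.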

Related papers

Absolute distance prediction based on deep learning object detection and monocular depth estimation models: https://arxiv.org/pdf/2111.01715.pdf
Dist-YOLO: Fast Object Detection with Distance Estimation: https://www.mdpi.com/2076-3417/12/3/1354

Blog

Measure Distance in Photos and Videos Using Computer Vision: https://blog.roboflow.com/computer-vision-measure-distance/

YouTube

https://m.youtube.com/watch?v=FcRCwTgYXJw

Dataset

KITTI distance estimation

GitHub - harshilpatel312/KITTI-distance-estimation: Estimating distance to objects in the scene using detection information

Summary of techniques

GitHub - Asadullah-Dal17/Distance_measurement_using_single_camera: using single camera to measure the distance opencv python,
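
A common single-camera technique is triangle similarity: calibrate a focal length (in pixels) from one reference image of an object with known width at a known distance, then reuse it for new frames. The sketch below illustrates that idea. Whether the repository above does exactly this is an assumption, and the function names and numbers are made up for the example.

```python
def focal_length_px(known_distance, real_width, width_in_ref_image_px):
    """Calibrate the focal length in pixels from a single reference image."""
    return (width_in_ref_image_px * known_distance) / real_width


def estimate_distance(focal_px, real_width, width_in_frame_px):
    """Estimate distance from the object's apparent width in a new frame."""
    return (real_width * focal_px) / width_in_frame_px


# Reference shot: a 0.5 m wide object at 2.0 m appears 200 px wide.
focal = focal_length_px(known_distance=2.0, real_width=0.5, width_in_ref_image_px=200)

# New frame: the same object now appears 100 px wide, so it is twice as far away.
print(estimate_distance(focal, real_width=0.5, width_in_frame_px=100))  # 4.0
```

This only works when the real-world width of the object is known and the object roughly faces the camera, which is why it pairs naturally with a detector that supplies the class and pixel width.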

If we go with an object detection model and Dist-YOLO doesn't work out, my personal view is that customizing YOLOv8 by attaching just a distance-estimation module should be fine.

Hardware (if we deploy this)