feat: look at person (#166)
* refactor: use correct naming conventions.

* feat/refactor: introduce generic object detection skills, modularise from this.

* refactor: let the generic perception skills acquire their own sensor input (image/pcl).

* refactor/chore: imports and add LookForSeats stub.

* feat: always publish markers for debugging, when doing 3D point estimation.

* WIP: look for place to seat people.

* chore: default sample rate.

* chore: setup receptionist demo points.

* fix: sleep.

* feat: LookToPoint skill.

* chore: reorganise receptionist Rasa data and pipeline.

* feat: additional motions.

* chore: update receptionist demo.

* feat: look for chair and seat guest.

* WIP: get guest name.

* feat: be more informative.

* fix: set unknown.

* feat: announce name as well as drink.

* refactor: ignore point.

* feat: be more explicit; look at the chair and semantically describe its location.

* WIP.

* feat: add option to specify device to use for whisper.

* feat: simple transcribe microphone service.

* refactor: make transcribe and parse service use the new 'simple' transcribe audio service.

* feat: terminate audio interface.

* chore: update CMakeLists.

* refactor: acquire microphone when initialising service.

* chore: update launch file.

* fix: remove call to redundant method.

* fix: error handling.

* feat/fix: pre-load whisper model by prompting it, adjust for ambient noise at the beginning.

* chore: configuration required for deploying docs

* docs: update container documentation

* ci: first draft of documentation build through actions

* ci: remove submodules

* ci: use ubuntu 20.04

* ci: try using ros:noetic container image

* ci: catkin tools isn't available in container

* ci: remove extraneous source line

* ci: try the one-liner ROS installer

* ci: use the official ROS noetic installation guide

* ci: install catkin tools

* ci: try sourcing before commands

* ci: deploy to github pages (no stage check yet)

* ci: use deploy key

* ci: only deploy if on main

* chore: update receptionist data and pipeline.

* fix: explicitly set spacy NLP model.

* fix: drop age for now, since spacy is broken(?).

* feat: point and raise_torso motions.

* feat: point to chair.

* feat: additional motions.

* fix: reset motions.

* fix: handle empty transcriptions.

* fix: enable start state.

* fix: properly reset motions

* fix(ci): use github_token for gh pages deploy.

* revert: fix(ci): use github_token for gh pages deploy.

* feat: save microphone output to audio files

* chore: update gitignore with python setup files

* fix: filename typo and add to cmake

* testtesttest

* Test

* Test

* Message Change

* fixed message writing and reading

* fixed image crop

* Updated the feature extraction node and changed the messages.

* fixed that the torso and head frames were inverted.

* Changed colour format from BGR to RGB within the detection process.

* keep the saved model

* keep the model file

* keep the saved model

* Cancel saving the images (but still cannot use cv2.imshow)

* Runnable demo

* added the hair colour distribution matching method

* retrained model is very robust so changed the threshold

* Moving the head to meet the person.

* xyz axis readable.

* (Hopefully) Runnable with 3d input.

* Speak normally.

* Try to move the head.

* ah

* At least the head moves, not looking at me though.

* Cleaned the file to have only one model appear.

* Replace the old model with the new one.

* correct the lost module.

* info update

* fixed issues in the service

* fixed a stupid typo

* runnable version for full demo

* Recover the state machine for demo.

* Added a simple loop to refresh the frame taken, should work fine.

* adding learn_face

* adding detect faces

* Add detect faces state

* editing detect phases

* Implement state for learning faces

* adding in extra states

* any changes

* adjustments when testing

* push lol

* finished merging?

* new state machine

* merging from rexy laptop

* fixed detect faces

* getting rid of lasr_speech

* swapping from base controllers to skills

* fixing state machine after merge with main

* making changes

* TEST: .idea diff

* feat: get name and drink skill & prior data yaml

* fix: missing keys for get name & drink constructor

* feat: stubs for get attribute state

* feat: wrap head manager and lowering base into GoToLocation.

* feat: state machine setup.

* test: Github tracking fixes

* feat: allow AskAndListen to take static text, a format string, or read from userdata.

* refactor: being picky about spaces.

* fix: import states in dunder init.

* feat: get name and drink, and stub for attributes.

* fix: typehint for userdata

* feat: setup host.

* fix: typehint.

* fix: refactor ParseNameAndDrink state to convert possible names and drinks to lowercase

* refactor: receptionist -> sm_merge (conflicts with python package...)

* refactor: remove redundant __init__.py

* fix: container consistency.

* fix: initialise data correctly.

* fix: lower.

* fix: uncomment.

* feat: take seating area pose and polygon as input to sm.

* feat: guide guest to seating area.

* feat: add introduce state

* feat: add introduction to state machine

* feat(WIP): find empty seat.

* fix: sneaky parenthesis stops boolifying.

* fix: correctly access guest attribute data

* fix: get introduce working.

* feat(WIP, still): find an empty seat.

* feat/refactor: SeatGuest is working and rename.

* feat: update init.py

* refactor: static motions.

* feat: seat guest in state machine.

* feat: send head back to default.

* refactor: center -> centre.

* feat: logic for second guest and pass guest id through constructors.

* fix: transition.

* refactor: drop unused method.

* Remove print statement and add TODO.

* fix: namespace.

* fix: transition and assume PlayMotion succeeded.

* test: yaml setup for testing.

* fix: transitions.

* refactor: remove redundant comment.

* fix: dumb intersection bug.

* feat: clip skill for VQA

* feat: attempt of clip for attributes

* fix: missing imports

* feat: tidied clip attributes

* refactor: cleanup.

* refactor: cleanup again.

* refactor: massive cleanup.

* refactor: remove hair colour images.

* refactor: cleanup cv2_img.

* refactor: move Vqa service message into lasr_vision_msgs.

* fix: import.

* cleanup: remove vision changes.

* refactor: cleanup clip vqa skill.

* refactor: cleanup describe people skill.

* refactor: cleanup vision msgs.

* refactor: remove shebang and unused import.

* refactor: remove unnecessary message.

* refactor: make QueryImage a ServiceState.

* refactor: make QueryImage a ServiceState.

* build: CMake and requirements.

* feat: fix the msgs for bodypix

* feat: test initial segment person

* feat: filter the poses like the masks

* feat: filter the poses like the masks

* chore: add pcl to img msg and keep the same timestamp

* chore: change the look-to-point skill to use PointStamped instead

* chore: add the 3d image skill

* feat: working look at person

* chore: cleanup for PR

* fix: address PR comments + add head manager disable/enable

* feat: wip

* feat: working wip

* feat: look at multiple people working

* chore: prep for PR

* chore: reformat agaaaain

---------

Co-authored-by: Jared Swift <[email protected]>
Co-authored-by: Paul Makles <[email protected]>
Co-authored-by: m-barker <[email protected]>
Co-authored-by: Benteng Ma <[email protected]>
Co-authored-by: MBT <[email protected]>
Co-authored-by: tiago <[email protected]>
Co-authored-by: Zoe <[email protected]>
Co-authored-by: Aaliyah Merchant <[email protected]>
Co-authored-by: Haiwei Luo <[email protected]>
10 people authored May 24, 2024
1 parent 953d35c commit d7c3ca9
Showing 7 changed files with 309 additions and 13 deletions.
11 changes: 11 additions & 0 deletions common/helpers/cv2_pcl/src/cv2_pcl/__init__.py
@@ -2,12 +2,23 @@
 from sensor_msgs.msg import PointCloud2
 import ros_numpy as rnp
 import cv2
+from cv2_img import cv2_img_to_msg

 from typing import Tuple, Union

 Mat = np.ndarray


+def pcl_to_img_msg(pcl: PointCloud2) -> Mat:
+    """
+    Convert a given PointCloud2 message to img_msg
+    """
+    # keep the same timestamp
+    cv2 = pcl_to_cv2(pcl)
+
+    return cv2_img_to_msg(cv2, pcl.header.stamp)
+
+
 def pcl_to_cv2(
     pcl: PointCloud2, height: Union[int, None] = None, width: Union[int, None] = None
 ) -> Mat:
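For orientation, a minimal usage sketch of the new helper (assumes a live ROS node and an organised point cloud; the topic name here is illustrative):

import rospy
from sensor_msgs.msg import PointCloud2
from cv2_pcl import pcl_to_img_msg

rospy.init_node("pcl_to_img_msg_example")
pcl = rospy.wait_for_message("/xtion/depth_registered/points", PointCloud2)
img_msg = pcl_to_img_msg(pcl)  # an Image message stamped with the cloud's timestamp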
@@ -114,7 +114,6 @@ def detect(
        keypoints_msg = []

        for i, keypoint in pose.keypoints.items():
-
            if camel_to_snake(keypoint.part) in request.masks[0].parts:
                keypoint_msg = BodyPixKeypoint()
                keypoint_msg.xy = [int(keypoint.position.x), int(keypoint.position.y)]
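The camel_to_snake helper used in this check is assumed to behave like the sketch below (a hypothetical reimplementation for illustration; the package's own version may differ), mapping BodyPix part names such as "leftEye" onto the snake_case names used in mask requests:

import re

def camel_to_snake(name: str) -> str:
    # insert an underscore before each interior capital letter, then lowercase:
    # "leftEye" -> "left_eye", "rightEye" -> "right_eye"
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()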
1 change: 0 additions & 1 deletion common/vision/lasr_vision_msgs/msg/BodyPixPose.msg
@@ -1,2 +1 @@
-# keypoints
 BodyPixKeypoint[] keypoints
19 changes: 14 additions & 5 deletions skills/src/lasr_skills/detect_faces.py
@@ -3,21 +3,30 @@

 from lasr_vision_msgs.srv import DetectFaces as DetectFacesSrv
 from sensor_msgs.msg import Image
+from cv2_pcl import pcl_to_img_msg


 class DetectFaces(smach.State):

-    def __init__(self, image_topic: str = "/xtion/rgb/image_raw"):
+    def __init__(
+        self,
+        image_topic: str = "/xtion/rgb/image_raw",
+    ):
         smach.State.__init__(
-            self, outcomes=["succeeded", "failed"], output_keys=["detections"]
+            self,
+            outcomes=["succeeded", "failed"],
+            input_keys=["pcl_msg"],
+            output_keys=["detections"],
         )
+        self.img_msg = None
         self._image_topic = image_topic
         self._detect_faces = rospy.ServiceProxy("/detect_faces", DetectFacesSrv)

     def execute(self, userdata):
-        img_msg = rospy.wait_for_message(self._image_topic, Image)
+        self.img_msg = pcl_to_img_msg(userdata.pcl_msg)
+        if not self.img_msg:
+            self.img_msg = rospy.wait_for_message(self._image_topic, Image)
         try:
-            result = self._detect_faces(img_msg)
+            result = self._detect_faces(self.img_msg)
             userdata.detections = result
             return "succeeded"
         except rospy.ServiceException as e:
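A rough sketch of driving the reworked state from a container (illustrative only; it assumes the /detect_faces service is running and a cloud is available on the depth topic):

import rospy
import smach
from sensor_msgs.msg import PointCloud2
from lasr_skills import DetectFaces

rospy.init_node("detect_faces_example")
sm = smach.StateMachine(outcomes=["succeeded", "failed"])
# supply the point cloud the state now expects via userdata
sm.userdata.pcl_msg = rospy.wait_for_message(
    "/xtion/depth_registered/points", PointCloud2
)
with sm:
    smach.StateMachine.add(
        "DETECT_FACES",
        DetectFaces(),
        transitions={"succeeded": "succeeded", "failed": "failed"},
    )
sm.execute()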
282 changes: 282 additions & 0 deletions skills/src/lasr_skills/look_at_person.py
@@ -0,0 +1,282 @@
import smach_ros
from geometry_msgs.msg import PointStamped
import smach
from vision import GetPointCloud
from lasr_vision_msgs.srv import BodyPixDetection, BodyPixDetectionRequest
from lasr_vision_msgs.msg import BodyPixMaskRequest
from lasr_skills import LookToPoint, DetectFaces
import cv2
import rospy
from sensor_msgs.msg import Image
from cv2_img import cv2_img_to_msg, msg_to_cv2_img
from markers import create_and_publish_marker
from visualization_msgs.msg import Marker
from cv2_pcl import pcl_to_img_msg
import ros_numpy as rnp
import rosservice
from smach import CBState
from std_msgs.msg import String
import actionlib
from control_msgs.msg import PointHeadAction, PointHeadGoal
from geometry_msgs.msg import Point

PUBLIC_CONTAINER = False

try:
from pal_startup_msgs.srv import (
StartupStart,
StartupStop,
StartupStartRequest,
StartupStopRequest,
)
except ModuleNotFoundError:
PUBLIC_CONTAINER = True


class LookAtPerson(smach.StateMachine):
class CheckEyes(smach.State):
def __init__(self, debug=True):
smach.State.__init__(
self,
outcomes=["succeeded", "failed", "no_detection"],
input_keys=["bbox_eyes", "pcl_msg", "masks", "detections"],
output_keys=["pointstamped"],
)
self.DEBUG = debug
self.img_pub = rospy.Publisher("/debug/image", Image, queue_size=1)
self.marker_pub = rospy.Publisher("eyes", Marker, queue_size=1)
self.look_at_pub = actionlib.SimpleActionClient(
"/head_controller/point_head_action", PointHeadAction
)

def execute(self, userdata):
rospy.sleep(1)
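# No eye boxes left to process: if faces were detected at all, we are done;
# if there were no detections either, report that nothing was found.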
if len(userdata.bbox_eyes) < 1 and len(userdata.detections.detections) > 0:
return "succeeded"
elif (
len(userdata.bbox_eyes) < 1 and len(userdata.detections.detections) < 1
):
return "no_detection"

for det in userdata.bbox_eyes:
left_eye = det["left_eye"]
right_eye = det["right_eye"]
eye_point = (left_eye[0] + right_eye[0]) / 2, (
left_eye[1] + right_eye[1]
) / 2

if self.DEBUG:
img = msg_to_cv2_img(pcl_to_img_msg(userdata.pcl_msg))
cv2.circle(img, (left_eye[0], left_eye[1]), 5, (0, 255, 0), -1)
cv2.circle(img, (right_eye[0], right_eye[1]), 5, (0, 255, 0), -1)
img_msg1 = cv2_img_to_msg(img)
self.img_pub.publish(img_msg1)

pcl_xyz = rnp.point_cloud2.pointcloud2_to_xyz_array(
userdata.pcl_msg, remove_nans=False
)
eye_point_pcl = pcl_xyz[int(eye_point[1]), int(eye_point[0])]
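# i != i is True only for NaN, so if the eye-midpoint lookup gave an
# invalid reading, fall back to the right eye's point instead.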
if any([True for i in eye_point_pcl if i != i]):
eye_point_pcl = pcl_xyz[int(right_eye[1]), int(right_eye[0])]

look_at = PointStamped()
look_at.header = userdata.pcl_msg.header
look_at.point.x = eye_point_pcl[0]
look_at.point.y = eye_point_pcl[1]
look_at.point.z = eye_point_pcl[2]

if self.DEBUG:
create_and_publish_marker(self.marker_pub, look_at, r=0, g=1, b=0)

userdata.bbox_eyes.remove(det)
userdata.pointstamped = look_at

self.look_at_pub.wait_for_server()
goal = PointHeadGoal()
goal.pointing_frame = "head_2_link"
goal.pointing_axis = Point(1.0, 0.0, 0.0)
goal.max_velocity = 1.0
goal.target = look_at
self.look_at_pub.send_goal(goal)

return "succeeded"

return "failed"

@smach.cb_interface(input_keys=["poses", "detections"], output_keys=["bbox_eyes"])
def match_poses_and_detections(ud):
bbox_eyes = []
for pose in ud.poses:
for detection in ud.detections.detections:
temp = {
"bbox": detection.xywh,
}
for keypoint in pose.keypoints:
if (
keypoint.part == "leftEye"
and detection.xywh[0]
< keypoint.xy[0]
< detection.xywh[0] + detection.xywh[2]
and detection.xywh[1]
< keypoint.xy[1]
< detection.xywh[1] + detection.xywh[3]
):
temp["left_eye"] = keypoint.xy
if (
keypoint.part == "rightEye"
and detection.xywh[0]
< keypoint.xy[0]
< detection.xywh[0] + detection.xywh[2]
and detection.xywh[1]
< keypoint.xy[1]
< detection.xywh[1] + detection.xywh[3]
):
temp["right_eye"] = keypoint.xy

if "left_eye" in temp and "right_eye" in temp:
bbox_eyes.append(temp)

ud.bbox_eyes = bbox_eyes

return "succeeded"

def __init__(self):
super(LookAtPerson, self).__init__(
outcomes=["succeeded", "failed"],
input_keys=[],
output_keys=["masks", "poses", "pointstamped"],
)
self.DEBUG = rospy.get_param("/debug", True)
IS_SIMULATION = (
"/pal_startup_control/start" not in rosservice.get_service_list()
)

with self:
if not IS_SIMULATION:
if PUBLIC_CONTAINER:
rospy.logwarn(
"You are using a public container. The head manager will not be stopped during navigation."
)
else:
smach.StateMachine.add(
"DISABLE_HEAD_MANAGER",
smach_ros.ServiceState(
"/pal_startup_control/stop",
StartupStop,
request=StartupStopRequest("head_manager"),
),
transitions={
"succeeded": "GET_IMAGE",
"aborted": "failed",
"preempted": "failed",
},
)
smach.StateMachine.add(
"GET_IMAGE",
GetPointCloud("/xtion/depth_registered/points"),
transitions={
"succeeded": "SEGMENT_PERSON",
},
remapping={"pcl_msg": "pcl_msg"},
)

eyes = BodyPixMaskRequest()
eyes.parts = ["left_eye", "right_eye"]
masks = [eyes]

smach.StateMachine.add(
"SEGMENT_PERSON",
smach_ros.ServiceState(
"/bodypix/detect",
BodyPixDetection,
request_cb=lambda ud, _: BodyPixDetectionRequest(
pcl_to_img_msg(ud.pcl_msg), "resnet50", 0.7, masks
),
response_slots=["masks", "poses"],
input_keys=["pcl_msg"],
output_keys=["masks", "poses"],
),
transitions={
"succeeded": "DETECT_FACES",
"aborted": "failed",
"preempted": "failed",
},
)
smach.StateMachine.add(
"DETECT_FACES",
DetectFaces(),
transitions={
"succeeded": "MATCH_POSES_AND_DETECTIONS",
"failed": "failed",
},
remapping={"pcl_msg": "pcl_msg", "detections": "detections"},
)

smach.StateMachine.add(
"MATCH_POSES_AND_DETECTIONS",
CBState(
self.match_poses_and_detections,
input_keys=["poses", "detections"],
output_keys=["poses"],
outcomes=["succeeded", "failed"],
),
transitions={"succeeded": "CHECK_EYES", "failed": "failed"},
remapping={"bbox_eyes": "bbox_eyes"},
)
smach.StateMachine.add(
"CHECK_EYES",
self.CheckEyes(self.DEBUG),
transitions={
"succeeded": "LOOK_TO_POINT",
"failed": "failed",
"no_detection": "succeeded",
},
remapping={
"pcl_msg": "pcl_msg",
"bbox_eyes": "bbox_eyes",
"pointstamped": "pointstamped",
},
)

smach.StateMachine.add(
"LOOK_TO_POINT",
LookToPoint(),
transitions={
"succeeded": "LOOP",
"aborted": "failed",
"preempted": "failed",
},
remapping={"pointstamped": "pointstamped"},
)
smach.StateMachine.add(
"LOOP",
smach.CBState(
lambda ud: "succeeded" if len(ud.bbox_eyes) > 0 else "finish",
input_keys=["bbox_eyes"],
output_keys=["bbox_eyes"],
outcomes=["succeeded", "finish"],
),
transitions={
"succeeded": "CHECK_EYES",
"finish": "ENABLE_HEAD_MANAGER",
},
)
if not IS_SIMULATION:
if PUBLIC_CONTAINER:
rospy.logwarn(
"You are using a public container. The head manager will not be start following navigation."
)
else:
smach.StateMachine.add(
"ENABLE_HEAD_MANAGER",
smach_ros.ServiceState(
"/pal_startup_control/start",
StartupStart,
request=StartupStartRequest("head_manager", ""),
),
transitions={
"succeeded": "succeeded",
"preempted": "failed",
"aborted": "failed",
},
)
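A minimal sketch of running the new skill (assumes a robot or simulation publishing the xtion topics; on the real robot the pal_startup_control services are also expected):

import rospy
from lasr_skills.look_at_person import LookAtPerson

rospy.init_node("look_at_person_example")
sm = LookAtPerson()
outcome = sm.execute()  # "succeeded" after looking at each matched pair of eyes
rospy.loginfo(outcome)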
7 changes: 2 additions & 5 deletions skills/src/lasr_skills/look_to_point.py
@@ -13,10 +13,7 @@ def __init__(self):
                 pointing_frame="head_2_link",
                 pointing_axis=Point(1.0, 0.0, 0.0),
                 max_velocity=1.0,
-                target=PointStamped(
-                    header=Header(frame_id="map"),
-                    point=ud.point,
-                ),
+                target=ud.pointstamped,
             ),
-            input_keys=["point"],
+            input_keys=["pointstamped"],
         )
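Since the target is now read from userdata as a full PointStamped, callers supply the frame together with the point, e.g. (illustrative):

import rospy
import smach
from geometry_msgs.msg import PointStamped
from lasr_skills import LookToPoint

rospy.init_node("look_to_point_example")
sm = smach.StateMachine(outcomes=["succeeded", "aborted", "preempted"])
# the frame travels with the point, so no "map" default is assumed any more
sm.userdata.pointstamped = PointStamped()
sm.userdata.pointstamped.header.frame_id = "map"
sm.userdata.pointstamped.point.x = 1.0
with sm:
    smach.StateMachine.add(
        "LOOK_TO_POINT",
        LookToPoint(),
        transitions={
            "succeeded": "succeeded",
            "aborted": "aborted",
            "preempted": "preempted",
        },
    )
sm.execute()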
1 change: 0 additions & 1 deletion tasks/receptionist/launch/setup.launch
@@ -19,5 +19,4 @@
   <node pkg="lasr_vision_feature_extraction" type="service" name="torch_service" output="screen"/>
   <node pkg="lasr_vision_bodypix" type="service" name="bodypix_service" output="screen"/>

-
 </launch>
