[core]: Added ofi_get_realtime interfaces
Added new interfaces to support pthread_cond_timedwait,
which uses CLOCK_REALTIME. In general, CLOCK_MONOTONIC is
preferred and is in use by several providers; the new
interfaces therefore cover the special case of the pthread APIs.

[prov/sockets]: Use ofi_get_realtime_ms instead of
ofi_gettime_ms before ofi_wait_cond.

Signed-off-by: Nikhil Nanal <[email protected]>
nikhilnanal authored and j-xiong committed Jun 12, 2024
1 parent 3180bab commit 1a6447a
Showing 6 changed files with 28 additions and 5 deletions.
4 changes: 4 additions & 0 deletions include/ofi.h
@@ -377,6 +377,10 @@ uint64_t ofi_gettime_ns(void);
uint64_t ofi_gettime_us(void);
uint64_t ofi_gettime_ms(void);

+uint64_t ofi_get_realtime_ns(void);
+uint64_t ofi_get_realtime_ms(void);
+uint64_t ofi_get_realtime_us(void);
+
static inline uint64_t ofi_timeout_time(int timeout)
{
return (timeout >= 0) ? ofi_gettime_ms() + timeout : 0;
1 change: 1 addition & 0 deletions include/windows/osd.h
@@ -1008,6 +1008,7 @@ size_t ofi_ifaddr_get_speed(struct ifaddrs *ifa);

#define file2unix_time 10000000i64
#define win2unix_epoch 116444736000000000i64
+#define CLOCK_REALTIME 0
#define CLOCK_MONOTONIC 1

/* Own implementation of clock_gettime*/
4 changes: 2 additions & 2 deletions prov/sockets/src/sock_cntr.c
@@ -325,7 +325,7 @@ static int sock_cntr_wait(struct fid_cntr *fid_cntr, uint64_t threshold,
ofi_atomic_inc32(&cntr->num_waiting);

if (timeout >= 0) {
-start_ms = ofi_gettime_ms();
+start_ms = ofi_get_realtime_ms();
end_ms = start_ms + timeout;
}

@@ -341,7 +341,7 @@ static int sock_cntr_wait(struct fid_cntr *fid_cntr, uint64_t threshold,
ret = ofi_wait_cond(&cntr->cond, &cntr->mut, (int) remaining_ms);
}

-uint64_t curr_ms = ofi_gettime_ms();
+uint64_t curr_ms = ofi_get_realtime_ms();
if (timeout >= 0) {
if (curr_ms >= end_ms) {
ret = -FI_ETIMEDOUT;
4 changes: 2 additions & 2 deletions prov/sockets/src/sock_wait.c
@@ -127,7 +127,7 @@ static int sock_wait_wait(struct fid_wait *wait_fid, int timeout)

wait = container_of(wait_fid, struct sock_wait, wait_fid);
if (timeout > 0)
-start_ms = ofi_gettime_ms();
+start_ms = ofi_get_realtime_ms();

head = &wait->fid_list;
for (p = head->next; p != head; p = p->next) {
@@ -154,7 +154,7 @@ static int sock_wait_wait(struct fid_wait *wait_fid, int timeout)
}
}
if (timeout > 0) {
-end_ms = ofi_gettime_ms();
+end_ms = ofi_get_realtime_ms();
timeout -= (int) (end_ms - start_ms);
timeout = timeout < 0 ? 0 : timeout;
}
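The sock_wait.c hunk above shrinks the caller's timeout by the wall-clock time that has elapsed. Because CLOCK_REALTIME can be stepped by a time adjustment, the second reading may be earlier than the first, in which case the subtraction no longer measures elapsed time. The following is a minimal sketch of a guarded version, not the commit's code; `remaining_timeout_ms` is a hypothetical helper name introduced here for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of libfabric): compute how much of a
 * millisecond timeout is left, given two readings of the same clock.
 */
static int remaining_timeout_ms(int timeout_ms, uint64_t start_ms,
				uint64_t now_ms)
{
	uint64_t elapsed;

	/* Negative means "wait forever"; zero means nothing is left. */
	if (timeout_ms <= 0)
		return timeout_ms;

	/* No time elapsed, or the clock was stepped backward. */
	if (now_ms <= start_ms)
		return timeout_ms;

	elapsed = now_ms - start_ms;
	if (elapsed >= (uint64_t) timeout_ms)
		return 0;

	return timeout_ms - (int) elapsed;
}
```

For example, `remaining_timeout_ms(100, 1000, 1030)` yields 70, and a backward step such as `remaining_timeout_ms(100, 1000, 900)` leaves the timeout at 100 instead of growing it.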
18 changes: 18 additions & 0 deletions src/common.c
@@ -304,6 +304,24 @@ uint32_t ofi_generate_seed(void)
return rand_seed;
}

+uint64_t ofi_get_realtime_ns(void)
+{
+struct timespec now;
+
+clock_gettime(CLOCK_REALTIME, &now);
+return now.tv_sec * 1000000000 + now.tv_nsec;
+}
+
+uint64_t ofi_get_realtime_us(void)
+{
+return ofi_get_realtime_ns() / 1000;
+}
+
+uint64_t ofi_get_realtime_ms(void)
+{
+return ofi_get_realtime_ns() / 1000000;
+}
+
uint64_t ofi_gettime_ns(void)
{
struct timespec now;
2 changes: 1 addition & 1 deletion src/unix/osd.c
@@ -116,7 +116,7 @@ int ofi_wait_cond(pthread_cond_t *cond, pthread_mutex_t *mut, int timeout_ms)
if (timeout_ms < 0)
return pthread_cond_wait(cond, mut);

-t = ofi_gettime_ms() + timeout_ms;
+t = ofi_get_realtime_ms() + timeout_ms;
ts.tv_sec = t / 1000;
ts.tv_nsec = (t % 1000) * 1000000;
return pthread_cond_timedwait(cond, mut, &ts);

3 comments on commit 1a6447a

@nahkbce2 commented on 1a6447a Jun 14, 2024

@nikhilnanal @j-xiong I am reaching out to ask whether the warning in the following check in unix/osd.c has any detrimental consequences:
if (shm->shared_fd < 0) {
	FI_WARN(&core_prov, FI_LOG_CORE, "shm_open failed\n");
	ret = -FI_EINVAL;
	goto failed;
}
When we enable FI_LOG_LEVEL=Debug, we see the following in our log:
"libfabric:458240:1718387425::psm3:av:psmx3_av_open():995 housky-n-cp503a35.americas.shell.com:rank39: FI_AV_MAP asked, but force FI_AV_TABLE for shared AV
libfabric:458255:1718387425::core:core:ofi_shm_map():173 shm_open failed
libfabric:458255:1718387425::psm3:av:psmx3_av_open():1050 housky-n-cp503a35.americas.shell.com:rank37: failed to map shared AV: FI_NAMED_AV_0
"
Please help me understand this.
Thanks in advance.

@j-xiong (Contributor)

@nahkbce2 I don't think the FI_WARN statement has any side effects other than providing the feedback to users. If you are talking about the entire if block, that's the necessary error check. The log indicates that the shm object named "FI_NAMED_AV_0" for the shared AV can't be opened. Do you see that name under /dev/shm at the time psmx3_av_open is called?

@nahkbce2 commented on 1a6447a Jun 15, 2024

@j-xiong Thank you for responding. I am a newbie and don't know how to check /dev/shm. Could you please give me some instructions on how to do that?
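One way to check is to list the directory: on Linux, POSIX shared memory objects usually appear as files under /dev/shm, so `ls -l /dev/shm` from a shell on the affected node shows what exists. The same check can be sketched in C as below. This is an illustration under those assumptions, not part of libfabric; `count_matching_entries` is a hypothetical helper, and the name to search for, FI_NAMED_AV_0, is taken from the log above.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: print and count the entries of dir_path whose
 * names contain needle. Returns -1 if the directory cannot be opened.
 */
static int count_matching_entries(const char *dir_path, const char *needle)
{
	DIR *dir = opendir(dir_path);
	struct dirent *entry;
	int count = 0;

	if (!dir)
		return -1;

	while ((entry = readdir(dir)) != NULL) {
		if (strstr(entry->d_name, needle)) {
			printf("%s/%s\n", dir_path, entry->d_name);
			count++;
		}
	}

	closedir(dir);
	return count;
}
```

Calling `count_matching_entries("/dev/shm", "FI_NAMED_AV_0")` while the job is running would report whether an object with that name is present. A result of 0 means no matching entry was found, and -1 means /dev/shm itself could not be opened.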