FPGA #43

Open
iavssw opened this issue Apr 8, 2024 · 1 comment

Comments

iavssw commented Apr 8, 2024

When running more than one job inside a pod, we cannot submit jobs reliably. If more than one job is submitted in succession, we get an input/output error. The problem can be mitigated by running xbutil reset from the host before the pod is spun up, but this is not a desirable workaround.

Any feedback would be appreciated.

user@mlcluster-interactive-example-jfdz2:~/FPGA_test$ ./host vadd_hw.xclbin 512 0 1 64

 Total Data of 512.000 Mbytes to be written to global memory from host

 Kernel is invoked 1 time and repeats itself 1 times

Found Platform
Platform Name: Xilinx
DEVICE xilinx_u55c_gen3x16_xdma_base_3
INFO: Reading vadd_hw.xclbin
Loading: 'vadd_hw.xclbin'
- host loop iteration #0 of 1 total iterations
kernel_time_in_sec = 0.0421578
Duration using events profiling: 42050286 ns
 match_count = 134217728 mismatch_count = 0 total_data_size = 134217728
Throughput Achieved = 12.7674 GB/s
TEST PASSED
user@mlcluster-interactive-example-jfdz2:~/FPGA_test$ ./host vadd_hw.xclbin 512 0 1 64

 Total Data of 512.000 Mbytes to be written to global memory from host

 Kernel is invoked 1 time and repeats itself 1 times

Found Platform
Platform Name: Xilinx
DEVICE xilinx_u55c_gen3x16_xdma_base_3
INFO: Reading vadd_hw.xclbin
Loading: 'vadd_hw.xclbin'
- host loop iteration #0 of 1 total iterations
XRT build version: 2.14.384
Build hash: 090bb050d570d2b668477c3bd0f979dc3a34b9db
Build date: 2022-12-09 00:55:08
Git branch: 2022.2
PID: 99
UID: 1006
[Mon Apr  8 15:10:45 2024 GMT]
HOST: mlcluster-interactive-example-jfdz2
EXE: /home/gregj/FPGA_test/host
[XRT] ERROR: unable to sync BO: Input/output error
terminate called after throwing an instance of 'xrt_xocl::error'
  what():  event 0 never submitted
Aborted (core dumped)
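
The log above shows the first run passing and the second one failing. A minimal sketch for checking how reliably that happens is to repeat the same invocation in a loop and stop at the first non-zero exit status. The helper name `run_repeat` is hypothetical and not part of XRT or the test program; only the `./host` command line is taken from the log.

```shell
# Hypothetical helper (not from the issue): run a command N times in a row
# and stop at the first non-zero exit status.
run_repeat() {
    runs="$1"
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        # Send the command's own output to stderr so stdout carries only the summary.
        if ! "$@" >&2; then
            echo "run $i failed"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $runs runs passed"
}

# Usage, with the invocation from the log above:
# run_repeat 5 ./host vadd_hw.xclbin 512 0 1 64
```

Running the same loop inside the pod and in a plain container makes the two environments directly comparable.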

@yuzhang66
Contributor

Hi @iavssw, this issue may be related to the XRT container solution. Could you try running this test in a pure container environment without k8s and see what happens? If it can be reproduced there, I suggest reaching out to the XRT team for further help.
