Comment by ycui7
6 hours ago
Problem is the more B70 you have, the slower the inference it gets(due to terrible software atm). A single B70 is almost barely faster than CPU inference. If you have 4 B70, you might as well run interference on CPU and be faster with cheaper DDR5 instead of GDDR6.
For what you say to be useful, please specify what sowftware you have used with B70, including its version.
Hardware-wise a B70 should be significantly faster than any of the available CPUs at ML inference. If it was not so in your tests, that must really be a software problem, so you must identify the software, for others to know what does not work.