TLDW: Tried huge open models and got 0.7 tokens/s. Seems clustering tools aren’t ready yet.
I’m new to this, but the common advice seems to be to use Vulkan to get things working. One day, in your spare time, look at ROCm.
I appreciate that! I’m just trying to recap the author’s results for people who might be interested in the bottom line.
It seems that the Beowulf architecture still has too much overhead.
I am really curious how the T/s scales with network speed; as said in the video, 5G is quite slow.