Using DirectML, so this might be a bit edge-casey. Trying to load this model into my ~10 GB of VRAM is not going well. Naturally, getting transformers to work with non-CUDA devices is abysmal. Windows 11, Radeon RX 6750 XT. Not really wanting to fall back to CPU speed for this. Should I give up?
Opened by ethrx; 2 comments; still under discussion.