optimum-neuron-cache / inference-cache-config
33.4 kB
dacorvo HF Staff
Add llama3 configurations with longer sequences
6d9930a verified