nohup: ignoring input
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
/workspace/hanrui/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext`
  warnings.warn(
Set TORCH_CUDA_ARCH_LIST to 9.0
/workspace/hanrui/junquan/SpecForge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend.
  warnings.warn(
/workspace/hanrui/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext`
  warnings.warn(
Set TORCH_CUDA_ARCH_LIST to 9.0
/workspace/hanrui/junquan/SpecForge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend.
  warnings.warn(
/workspace/hanrui/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext`
  warnings.warn(
Set TORCH_CUDA_ARCH_LIST to 9.0
/workspace/hanrui/junquan/SpecForge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend.
  warnings.warn(
/workspace/hanrui/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext`
  warnings.warn(
Set TORCH_CUDA_ARCH_LIST to 9.0
/workspace/hanrui/junquan/SpecForge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend.
  warnings.warn(
`torch_dtype` is deprecated! Use `dtype` instead!
Loading checkpoint shards:   0%|          | 0/5 [00:00