Untitled
[ ] https://www.reddit.com/r/LocalLLaMA/comments/14rluww/triton_vs_llamacpp/
[ ] https://github.com/LargeWorldModel/LWM
[ ] https://github.com/ggerganov/llama.cpp (mainly for CPU usage)
[ ] https://github.com/IST-DASLab/gptq
[ ] https://github.com/flashinfer-ai/flashinfer/
[ ] https://hamel.dev/notes/llm/inference/03_inference.html
[ ] https://flann.super.site/talks
[ ] https://srush.github.io/raspy/
[ ] https://www.youtube.com/watch?v=t5LjgczaS80