Add NVIDIA API (GLM-4.7) as primary inference method with thinking mode 369743c saneowl committed on May 3