Instructions to use Writer/Palmyra-Fin-70B-32K with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Writer/Palmyra-Fin-70B-32K with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="Writer/Palmyra-Fin-70B-32K")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("Writer/Palmyra-Fin-70B-32K")
    model = AutoModelForCausalLM.from_pretrained("Writer/Palmyra-Fin-70B-32K")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Writer/Palmyra-Fin-70B-32K with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "Writer/Palmyra-Fin-70B-32K"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Writer/Palmyra-Fin-70B-32K",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
    docker model run hf.co/Writer/Palmyra-Fin-70B-32K
- SGLang
How to use Writer/Palmyra-Fin-70B-32K with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "Writer/Palmyra-Fin-70B-32K" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Writer/Palmyra-Fin-70B-32K",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "Writer/Palmyra-Fin-70B-32K" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Writer/Palmyra-Fin-70B-32K",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Docker Model Runner
How to use Writer/Palmyra-Fin-70B-32K with Docker Model Runner:
    docker model run hf.co/Writer/Palmyra-Fin-70B-32K
Reproduce the CFA Level III results
Hi @wassemgtk, we are interested in reproducing the results. Could you specify which provider and which 88 questions were used (ideally with their identifiers)? We would like to replicate your exact testing setup, and we will contact the provider to obtain the rights. Thanks again!
We used two tests, from Q1 2021 and Q1 2023. Please make sure you use the system prompt below:
Take a moment to calm your mind and focus. You are preparing for the CFA® Level III exam, which demands precision and deep understanding. Carefully read the scenario provided and the subsequent question. Your task is to analyze the scenario and select the most appropriate answer from the options given.
Scenario:
[Insert the specific scenario file input here]
Question:
Regarding the immediate tactical asset allocation demand, the behavioral bias most likely exhibited by the client is:
Answer Choices:
A. Loss aversion
B. Home bias
C. Representativeness bias
Please provide:
Your Answer: Choose A, B, or C.
Explanation: Offer a concise explanation supporting your answer, demonstrating your understanding of the behavioral biases and their implications in asset allocation.