OpenOrca Dataset 'Fail to generate dataset'
I've been trying to download the OpenOrca dataset to fine-tune some other models. There seems to be something wrong with the recently committed parquet conversion functions. Downloading with load_dataset("Open-Orca/OpenOrca") raises an error while it builds the training dataset. The same error occurs when I clone the repo manually.
ArrowInvalid Traceback (most recent call last)
File ~/Documents/Repos/llama2-4int/.venv/lib/python3.10/site-packages/datasets/builder.py:1879, in ArrowBasedBuilder._prepare_split_single(self, gen_kwargs, fpath, file_format, max_shard_size, job_id)
1878 _time = time.time()
-> 1879 for _, table in generator:
1880 if max_shard_size is not None and writer._num_bytes > max_shard_size:
File ~/Documents/Repos/llama2-4int/.venv/lib/python3.10/site-packages/datasets/packaged_modules/parquet/parquet.py:73, in Parquet._generate_tables(self, files)
72 with open(file, "rb") as f:
---> 73 parquet_file = pq.ParquetFile(f)
74 try:
File ~/Documents/Repos/llama2-4int/.venv/lib/python3.10/site-packages/pyarrow/parquet/core.py:341, in ParquetFile.__init__(self, source, metadata, common_metadata, read_dictionary, memory_map, buffer_size, pre_buffer, coerce_int96_timestamp_unit, decryption_properties, thrift_string_size_limit, thrift_container_size_limit, filesystem)
340 self.reader = ParquetReader()
--> 341 self.reader.open(
342 source, use_memory_map=memory_map,
343 buffer_size=buffer_size, pre_buffer=pre_buffer,
344 read_dictionary=read_dictionary, metadata=metadata,
345 coerce_int96_timestamp_unit=coerce_int96_timestamp_unit,
346 decryption_properties=decryption_properties,
347 thrift_string_size_limit=thrift_string_size_limit,
348 thrift_container_size_limit=thrift_container_size_limit,
349 )
350 self.common_metadata = common_metadata
...
1911 e = e.__context__
-> 1912 raise DatasetGenerationError("An error occurred while generating the dataset") from e
1914 yield job_id, True, (total_num_examples, total_num_bytes, writer._features, num_shards, shard_lengths)
DatasetGenerationError: An error occurred while generating the dataset
Anyone else having the same issue? Any solution?
More details after debugging in IDE
Exception has occurred: DatasetGenerationError
An error occurred while generating the dataset
pyarrow.lib.ArrowInvalid: Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.
The above exception was the direct cause of the following exception:
File "/home/mtman/Documents/Repos/llama2-4int/data.py", line 6, in
dataset = load_dataset("./data/OpenOrca/")
datasets.builder.DatasetGenerationError: An error occurred while generating the dataset
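The "magic bytes not found in footer" message means the reader did not find the 4-byte marker PAR1 at the end of the file. A minimal sketch of how one might check local files for this before calling load_dataset is below. The function names and the three labels are hypothetical, and the Git LFS pointer check is only an assumption about one way a manual clone can end up with non-parquet files; nothing in this thread confirms that was the cause here.

```python
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # 4-byte marker at the start and end of a Parquet file
LFS_POINTER_PREFIX = b"version https://git-lfs"  # how a Git LFS pointer file begins


def inspect_parquet_file(path):
    """Return 'ok', 'lfs-pointer', or 'invalid' based on the first and last bytes.

    This only looks at the markers; it does not validate the footer metadata.
    """
    data_path = Path(path)
    size = data_path.stat().st_size
    with data_path.open("rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
        tail = b""
        if size >= 4:
            f.seek(-4, 2)  # 2 = seek relative to the end of the file
            tail = f.read(4)
    if head.startswith(LFS_POINTER_PREFIX):
        return "lfs-pointer"
    # Smallest possible layout: magic + 4-byte footer length + magic = 12 bytes.
    if size >= 12 and head[:4] == PARQUET_MAGIC and tail == PARQUET_MAGIC:
        return "ok"
    return "invalid"


def scan_directory(root):
    """Map every *.parquet file under root to its inspection label."""
    return {
        str(p): inspect_parquet_file(p)
        for p in sorted(Path(root).rglob("*.parquet"))
    }


if __name__ == "__main__":
    # "./data/OpenOrca/" is the local path used in the traceback above.
    for name, status in scan_directory("./data/OpenOrca/").items():
        print(status, name)
```

If every file reports "ok", the files themselves are probably intact and the problem is more likely in the library version reading them.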
UPDATE: Upgrading datasets to v2.14.5 solved the problem. I was using 2.13.
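For anyone who wants to check their environment before downloading, here is a small sketch that compares the installed datasets version against 2.14.5. The helper names are hypothetical, and using 2.14.5 as the threshold is an assumption: the thread only reports that 2.13 failed and 2.14.5 worked, not which release first contained the fix.

```python
from importlib import metadata

KNOWN_GOOD = "2.14.5"  # version reported as working in this thread


def parse_version(text):
    """Turn '2.14.5' into (2, 14, 5), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed, required=KNOWN_GOOD):
    """True if the installed version string is >= the required one."""
    return parse_version(installed) >= parse_version(required)


def installed_datasets_version():
    """Return the installed datasets version string, or None if it is missing."""
    try:
        return metadata.version("datasets")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_datasets_version()
    if version is None:
        print("datasets is not installed")
    elif is_at_least(version):
        print(f"datasets {version} is at or above {KNOWN_GOOD}")
    else:
        print(f"datasets {version} is older than {KNOWN_GOOD}; consider upgrading")
```

The comparison is deliberately simple and ignores pre-release suffixes; the packaging library offers a stricter parser if that matters.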
@mattma1970 FYI, I recommend using the SlimOrca subset of our data. It contains only verified answers and is smaller. It will cost hundreds to train on all 4.5M entries, and to be frank, some have little to no learning value. https://huggingface.co/Open-Orca/SlimOrca/
Thanks for the feedback. I'll check it out.
Cheers
Matt