Commit
[OCI] Enable SkyServe for OCI (#4338)
* [OCI] Enable SkyServe for OCI
* enable open_ports
* fix
* Add example serve-qwen-7b.yaml
* fix
* format
* Skip check the source CIDR so that user can control the security by manually.
* Update sky/provision/oci/query_utils.py
  Co-authored-by: Tian Xia <cblmemo@gmail.com>
* Update sky/provision/oci/query_utils.py
  Co-authored-by: Tian Xia <cblmemo@gmail.com>
* Update sky/provision/oci/query_utils.py
  Co-authored-by: Tian Xia <cblmemo@gmail.com>
* nit
* Implement open_ports/cleanup_ports per cluster
* Address review comments
* naming
* debug info
* remove unneccessary logic
* detach the nsg before instance termination
* typo
* Add example
* same file already exists in examples/serve folder
* Add example for serve cpu resource task.
* nit
* Address review comments: mainly eliminate the port overlap issue.
* Add a smoke test
* nit
* OCI now supports open_port

---------

Co-authored-by: Tian Xia <cblmemo@gmail.com>
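The commit message mentions eliminating a port overlap issue but does not say how. As a rough illustration only, and not the code from this commit, overlapping port specifications could be collapsed into disjoint ranges before any security rules are created. The helper names `parse_port_spec` and `merge_port_ranges` are hypothetical, and the `'8080'` / `'8000-8010'` string format is an assumption made for this sketch.

```python
from typing import List, Tuple


def parse_port_spec(spec: str) -> Tuple[int, int]:
    """Turn '8080' or '8000-8010' into an inclusive (low, high) pair."""
    low, _, high = spec.partition('-')
    start = int(low)
    end = int(high) if high else start
    if not 1 <= start <= end <= 65535:
        raise ValueError(f'invalid port spec: {spec!r}')
    return start, end


def merge_port_ranges(specs: List[str]) -> List[Tuple[int, int]]:
    """Collapse overlapping or adjacent port specs into disjoint ranges."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(parse_port_spec(s) for s in specs):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


if __name__ == '__main__':
    print(merge_port_ranges(['8080', '8000-8010', '8005-8020', '8081']))
```

Merging adjacent ranges as well as overlapping ones keeps the number of resulting rules small; whether that is desirable depends on how the rules are later cleaned up.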
Showing 9 changed files with 291 additions and 26 deletions.
    @@ -0,0 +1,11 @@
    service:
      readiness_probe: /
      replicas: 2

    resources:
      cloud: oci
      region: us-sanjose-1
      ports: 8080
      cpus: 2+

    run: python -m http.server 8080
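The `readiness_probe: /` field names the HTTP path that is checked on each replica's port (8080 here) to decide whether the replica can receive traffic. The sketch below imitates that kind of check locally with the standard library. It is not SkyServe's implementation: `probe_ready` and `start_demo_server` are hypothetical helpers, and the demo server simply stands in for the `python -m http.server` replica.

```python
import http.server
import threading
import urllib.error
import urllib.request


def probe_ready(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a 4xx/5xx response: not ready.
        return False


def start_demo_server() -> http.server.ThreadingHTTPServer:
    """Serve the current directory on an OS-chosen free port."""
    server = http.server.ThreadingHTTPServer(
        ('127.0.0.1', 0), http.server.SimpleHTTPRequestHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == '__main__':
    server = start_demo_server()
    port = server.server_address[1]
    print('ready:', probe_ready(f'http://127.0.0.1:{port}/'))
    server.shutdown()
    server.server_close()
```

Because the probe only looks at the status code, any path that answers successfully once the application is up can serve as a readiness path.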
    @@ -0,0 +1,25 @@
    # service.yaml
    service:
      readiness_probe: /v1/models
      replicas: 2

    # Fields below describe each replica.
    resources:
      cloud: oci
      region: us-sanjose-1
      ports: 8080
      accelerators: {A10:1}

    setup: |
      conda create -n vllm python=3.12 -y
      conda activate vllm
      pip install vllm
      pip install vllm-flash-attn
    run: |
      conda activate vllm
      python -u -m vllm.entrypoints.openai.api_server \
        --host 0.0.0.0 --port 8080 \
        --model Qwen/Qwen2-7B-Instruct \
        --served-model-name Qwen2-7B-Instruct \
        --device=cuda --dtype auto --max-model-len=2048
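Once the replicas pass the `/v1/models` probe, the service speaks vLLM's OpenAI-compatible HTTP API, so a client can post to `/v1/chat/completions` using the name given by `--served-model-name`. The following is a minimal client-side sketch under stated assumptions: `build_chat_request` is a hypothetical helper, and the endpoint address in the example is a placeholder, not a value from this commit.

```python
import json
import urllib.request


def build_chat_request(endpoint: str,
                       prompt: str,
                       model: str = 'Qwen2-7B-Instruct',
                       max_tokens: int = 64) -> urllib.request.Request:
    """Build (but do not send) a chat completion request.

    `endpoint` is 'host:port' of the service; the value used below is a
    placeholder for illustration only.
    """
    body = {
        'model': model,  # must match --served-model-name
        'messages': [{'role': 'user', 'content': prompt}],
        'max_tokens': max_tokens,
    }
    return urllib.request.Request(
        url=f'http://{endpoint}/v1/chat/completions',
        data=json.dumps(body).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


if __name__ == '__main__':
    req = build_chat_request('203.0.113.10:30001', 'Hello')
    print(req.get_method(), req.full_url)
    # To actually send it once the service is up:
    #   with urllib.request.urlopen(req) as resp:
    #       print(json.load(resp))
```

Keeping `max_tokens` small matters here because the server is started with `--max-model-len=2048`, which bounds prompt and completion length together.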