-
Notifications
You must be signed in to change notification settings - Fork 7
Add pytest utils and smoke test for vllm-detector-adapter [using cpu vllm version] #72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add pytest utils and smoke test for vllm-detector-adapter [using cpu vllm version] #72
Conversation
Signed-off-by: Shonda-Adena-Witherspoon <shonda.adena.witherspoon@ibm.com>
Signed-off-by: Shonda-Adena-Witherspoon <shonda.adena.witherspoon@ibm.com>
evaline-ju
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the test! A few questions
| git clone --branch v0.8.4 \ | ||
| https://github.com/vllm-project/vllm.git {envtmpdir}/vllm_source |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few concerns here:
(1) This adds another source of vllm dependency truth vs. the pyproject.toml which is confusing...for example the 0.8.4 branch cannot run on Mac (as written in a comment), so now if I run tox -e py on my machine, I get errors. Can we leverage the pyproject.toml info?
(2) When changing versions here, I seem to hit some edge cases when I run this locally - the new version will seem to install, but the tests are run on the old version. For example, if I run this as is, then change the tag to v0.8.5, I would expect API breakages, but I noticed the tests pass. Conversely if I start over completely [i.e. delete the tox folder with all dependencies], and run with 0.8.5, I'll get import errors as expected, but if then I switch to 0.8.3 I still get the same errors. It would be good to allow developers to run tests locally dependably across multiple versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: but isn't the build and test with tox test just for the git workflow? If a user wanted to install vllm and test the pytest I thought they would install this project and whichever vllm version they wish and then just run the pytest test/test_http_server for example? They should be able to run locally on whichever version then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tox is used for the git workflow but its main goal is to provide standardized testing env, dependency management etc. via a virtual env - developers can then also just run tox -e py etc. with less worry on different dependencies. It'd be good to have developers easily able to reproduce errors that may be seen in the git workflow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, then a bigger issue in the tox case is that a prerequisite of installing system dependencies such as the compilers (as done with the change I made to the git workflow is a prerequisite that will vary from machine to machine, and it can't go in the tox.ini for that reason, but works fine for the git workflow since it's addressed before running tox -e py.
example: for my RHEL8 dev vm, I had to install the compilers like the following:
yum install gcc gcc-c++ numactl-devel python3-devel
May need more discussion here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this installation method for vLLM, we are somewhat missing the goal of where we want to go with this story.
Essentially, we want to be able to quickly test if newer version of vLLM would work or not. We already have work going on in PR #67 to automatically figure out next update for vLLM version. Next step, after 67 PR is to automatically update vLLM version and test if that works.
This work, in PR 72, is aimed to make that kind of testing feasible.
So, if we are installing vLLM dependency with hardcoded version here, then it would add another task of update vLLM in one more place, and since tox is totally different file format, it would add more complexity to the system.
Additionally, even out of the context of the use-case I mentioned above, having multiple places for managing vLLM workflow would make it harder to use such a test for more than the current version support, since we would need to change vLLM in multiple places everytime, which becomes error prone.
So we need to figure out new approach for how we can install vLLM in CPU mode using as much standard installation package definition in pyproject.toml as possible
Signed-off-by: Shonda-Adena-Witherspoon <shonda.adena.witherspoon@ibm.com>
| - uses: actions/checkout@v3 | ||
| - name: Install system dependencies | ||
| run: | | ||
| sudo apt-get update -y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: a comment here giving more context around why we are installing gcc would be helpful
| - name: Install system dependencies | ||
| run: | | ||
| sudo apt-get update -y | ||
| sudo apt-get install -y gcc-12 g++-12 libnuma-dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also it seems we have split the vLLM installation in 2 parts with gcc being installed here and rest of it getting installed in tox.
Any reason for this split ?
What this PR does
This PR adds a small API server boot test for the vllm-detector-adapter server (installed using vllm from source to enable CPU vllm). The smoke tests makes a call to the health endpoint to verify no issues with the server starting up. Additionally, utilities and fixtures are added here to make it easy to expand tests scope in the future, while the tox and github workflow build yaml are modified to correctly install the cpu version of vllm.
Closes 53
How this was tested
This was tested two ways: