Adding Offload Test into DXC build pipeline#8584
Conversation
2959f34 to
6c3d8b4
Compare
6c3d8b4 to
fad2cb9
Compare
| variables: | ||
| LAVAPIPE_VERSION: '26.1.2' | ||
| WARP_VERSION: '1.0.19' | ||
| VS_DEV: 'C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvars64.bat' |
There was a problem hiding this comment.
We shouldn't hardcode this string. We should be able to reuse some of the logic in the other scripts to find this.
| - task: BatchScript@1 | ||
| displayName: 'Set up MSVC environment' | ||
| inputs: | ||
| filename: '$(VS_DEV)' |
There was a problem hiding this comment.
Does hctstart.cmd actually do this for you? Might be a way to avoid hard coding the script path?
There was a problem hiding this comment.
That would require cloning DXC repo in the offload-tests worker which I was avoiding doing... I've modified the script to find MSVC variables instead.
| displayName: 'DXIL Compat Suite Tests (1.8 point release)' | ||
| condition: succeededOrFailed() | ||
|
|
||
| - script: | |
There was a problem hiding this comment.
For PR builds, I'm wondering if we should only do this if the PR has a particular label on it? I don't think we'll want to gate all PRs on this since it'll add significant time to the builds.
| $dst = "$(Build.ArtifactStagingDirectory)\dxc" | ||
| New-Item -ItemType Directory -Force -Path $dst | Out-Null | ||
| foreach ($f in @("dxc.exe","dxv.exe","dxcompiler.dll","dxil.dll")) { | ||
| if (Test-Path "$bin\$f") { Copy-Item "$bin\$f" $dst } else { Write-Warning "missing $f" } |
There was a problem hiding this comment.
Instead of logging a warning, should we fail if one of the files doesn't exist?
| condition: succeededOrFailed() | ||
| variables: | ||
| LAVAPIPE_VERSION: '26.1.2' | ||
| WARP_VERSION: '1.0.19' |
There was a problem hiding this comment.
Are hard coded versions intentional? I worry about needing to consistently bump these versions.
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
|
||
| - pwsh: | | ||
| $ver = "$(LAVAPIPE_VERSION)" | ||
| $ver = (Invoke-RestMethod "https://api.github.com/repos/pal1000/mesa-dist-win/releases/latest" -Headers @{ "User-Agent" = "azure-pipelines" }).tag_name |
There was a problem hiding this comment.
Actually, I think picking the version of WARP / lava pipe is something we want to be able to do manually so we can choose the correct last known good combination of these tools - and also get reproducible builds.
damyanp
left a comment
There was a problem hiding this comment.
I think this is basically good to go. It's unfortunate that we need to jump through hoops to deal with ADO pipeline integration not being as feature rich as github actions.
| - task: UsePythonVersion@0 | ||
| inputs: | ||
| versionSpec: '3.11' | ||
| - script: choco install ninja -y |
There was a problem hiding this comment.
We install ninja using chocolatey as opposed to winget?
There was a problem hiding this comment.
In previous jobs, we used hctstart, which is only available inside DXC repo. I am avoiding cloning that in this stage of the pipeline. Not sure how hctstart gets ninja installed
There was a problem hiding this comment.
Is Chocolatey a standard piece of software that's available on all the Windows VMs?
I was thinking WinGet would be more preferrable than Chocolatey, since Chocolatey is a third-party piece of software whereas WinGet is included with Windows.
There was a problem hiding this comment.
Chocolatey is available as part of the 1ES VM images, which we use in this build
alsepkow
left a comment
There was a problem hiding this comment.
Took a look with @inbelic @Icohedron and added a few comments.
| testResultsFiles: '**/testresults.xunit.xml' | ||
| condition: succeededOrFailed() | ||
|
|
||
| - stage: OffloadGate |
There was a problem hiding this comment.
Is this stage neccessary? Seems wasteful to spin up a VM to check labels. I think we could just do that at the end of the build job instead? Or is there a reason you chose to do this in its own stage.
There was a problem hiding this comment.
ADO doesn't have access to github labels, the way I came up to do that was to use github api. This is required to block the offload-test execution unless a PR opts in to it by explicitly setting the run-offload-tests in GitHub. This happen before the tests and in parallel with build stage, It must be before testing in order to block it
There was a problem hiding this comment.
Why is it a requirement that the label checking is done in parallel with the build stage instead of as part of the build stage? The concern alex raised is about spinning an entire VM just to check labels and nothing else, when it could be merged with the build stage.
There was a problem hiding this comment.
Adding the check to the previous stage would make the check run for all the builds we do. Which is not ideal, since it would hit the GitHub API multiple times. I personally prefer to spawn a new VM for such a check; in my tests, this didn't took long to do compared to being rate limited by github
| dependsOn: | ||
| - Build | ||
| - OffloadGate | ||
| condition: and(succeededOrFailed(), eq(dependencies.OffloadGate.outputs['gate.decide.run'], 'true')) |
There was a problem hiding this comment.
Was the succeedOrFailed() intentional? Wondering if it would make more sense to have this as succeeded.
Is there value in running the tests if the build failed?
There was a problem hiding this comment.
We only need one stage in the build to pass and emit binaries to run the offload-tests. Adding succeededOrFailed() makes sure tests are run even if an unrelated stage fails.
There was a problem hiding this comment.
Do DXC PR builds often fail due to "unrelated" things? I know the offload test suite often has unrelated fails because of constant updates to llvm and the machines' drivers and such, but what about DXC?
There was a problem hiding this comment.
"unrelated" in that scenario is a failure in a build that doesn't produce a binary, currently, that is the case for all of them except one. So if any build that doesn't produce a binary fails, we will not test this. This pipeline will be part of VK Release, therefore I don't want to block a VK release because a MacOS or Linux build is failling.
| strategy: | ||
| matrix: | ||
| spirv_release: | ||
| artifact: 'dxc_spirv_release' |
There was a problem hiding this comment.
Whats the reasoning for this? If we just default this to true all the time then we could remove the branhcing logic and just do that every run instead?
There was a problem hiding this comment.
Not all emitted binaries support SPIR-V codegen; in those cases, we don't want to test on lavapipe.
There was a problem hiding this comment.
This matrix only has one element: spirv_release. Is it planned to add more configurations to this matrix in the future? If not, we could fold that runLavapipe check into the steps instead of as part of the matrix.
There was a problem hiding this comment.
We don't have current plans to add more configurations, although I personally think we should. I personally think that leaving the matrix with one element helps clarify that this is a testing job and that we could add more binaries to be tests easily to that stage
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
damyanp
left a comment
There was a problem hiding this comment.
This LGTM, but please make sure you've addressed everyone else's feedback before merging.
This patch adds a new stage into DXC build pipeline, it runs the offload-test agains the built binaries, it uses WARP for DXIL and lava-pipe for SPIRV builds.
Part of: microsoft/hlsl-specs#886