v2.7.0

abstractmj
Antvirf
clcc2019
DSFans2014
FouoF
fyp711
Kevinz857
litaixun
mayooot
miaobyte

🚀 Major features

๐Ÿ› Major bug fixesโ€‹

  • fix: Before executing MIG partitioning, suppress NVML usage in o… by (@Goend) in #1095
  • Fix golint-CI by (@archlitchi) in #1127
  • fix: override node score failure for kunlun #1137 by (@ouyangluwei163) in #1138
  • fix: Multi-node scoring nodes are inaccurate by (@ouyangluwei163) in #1147
  • fix: An error occurred while create Iluvatar pod by (@ouyangluwei163) in #1149
  • Fix e2e CI by (@archlitchi) in #1165
  • fix: Add option for overwrite schedulerName by (@Shouren) in #1163
  • fix: using go-safecast to fix incorrect conversion of numbers by (@Shouren) in #1183
  • fix: deal with security issues reported by Trivy in image by (@Shouren) in #1189
  • fix: wrong Pod's UID and empty Pod's name in log of webhook.go by (@Shouren) in #1092
  • fix: concurrent map writes error in scheduler.calcScore #1269 by (@Shouren) in #1270
  • fix: release dangling node lock by (@peachest) in #1271
  • fix: fix err which retrieved incorrect NUMA node information issue #1275 by (@abstractmj) in #1276
  • fix(security): resolve issues reported by Code scanning in Security by (@Shouren) in #1280
  • fix: fix golangci-lint error by (@DSFans2014) in #1319
  • Fix: device allocation missing containers with no device request by (@FouoF) in #1299
  • fix: update int8Slice to uint8Slice for better type clarity and consistency by (@yxxhero) in #1357
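
One of the fixes above concerns a "concurrent map writes" error during node scoring. The Go runtime raises this fatal error when several goroutines write to the same map without synchronization. The sketch below shows one common remedy, guarding the map with a `sync.Mutex`. It is a general illustration only and not the change made in #1270; `scoreBoard`, `scoreNodes`, and the node names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// scoreBoard is a hypothetical stand-in for a map shared between goroutines.
// It is not a type from HAMi.
type scoreBoard struct {
	mu     sync.Mutex
	scores map[string]float64
}

func newScoreBoard() *scoreBoard {
	return &scoreBoard{scores: make(map[string]float64)}
}

// set stores a score while holding the lock, so concurrent callers
// never write to the underlying map at the same time.
func (b *scoreBoard) set(node string, score float64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.scores[node] = score
}

// get reads a score under the same lock.
func (b *scoreBoard) get(node string) (float64, bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	s, ok := b.scores[node]
	return s, ok
}

// size returns the number of stored scores.
func (b *scoreBoard) size() int {
	b.mu.Lock()
	defer b.mu.Unlock()
	return len(b.scores)
}

// scoreNodes assigns each node a placeholder score (its index) in a
// separate goroutine and waits for all of them to finish.
func scoreNodes(nodes []string) *scoreBoard {
	b := newScoreBoard()
	var wg sync.WaitGroup
	for i, n := range nodes {
		wg.Add(1)
		go func(i int, n string) {
			defer wg.Done()
			b.set(n, float64(i))
		}(i, n)
	}
	wg.Wait()
	return b
}

func main() {
	b := scoreNodes([]string{"node-a", "node-b", "node-c"})
	fmt.Println(b.size())
}
```

Running a sketch like this with `go run -race` is a quick way to confirm that no unsynchronized map access remains.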

๐Ÿ“ What's Changedโ€‹

📚 Documentation

  • documentation: add Known Issues for dynamic mig support by (@Goend) in #1122
  • docs: fix broken link by (@lixd) in #1125
  • clearly list supported devices doc references at README by (@FouoF) in #1155
  • docs: update ascend910b-support docs by (@DSFans2014) in #1321

🔨 Other Changes

  • Optimize Fit-in-device logic to make it device-specific by (@archlitchi) in #1097
  • feat(scheduler): make node lock timeout configurable by (@Kevinz857) in #1117
  • feature: mig mode-change #1116 by (@ouyangluwei163) in #1124
  • feat: Add new labels in .github/release.yml by (@Shouren) in #1066
  • feat(scheduler-role): use a scoped-down role for scheduler by (@Antvirf) in #1152
  • feat(helm): optionally disable admission webhook by (@Antvirf) in #1145
  • remove redundant metrics for vgpu allocation by (@FouoF) in #1169
  • refactor: clean up code and improve maintainability by (@Wangmin362) in #1195
  • refactor: Ranging over SplitSeq is more efficient by (@Shouren) in #1239
  • feat:NodeLockTimeout set from env by (@miaobyte) in #1244
  • refactor: move watchAndFeedback function to feedback.go by (@miaobyte) in #1248
  • feat: add informer-based pod cache to reduce API server load by (@miaobyte) in #1250
  • feat: Add option to disable device plugin at values.yaml. by (@FouoF) in #1274
  • refactor(util/nodelock): replace manual polling with k8s.io/client-go/util/retry by (@mayooot) in #1252
  • refactor: Remove annotation in Devices interfaces by (@Shouren) in #1343
  • feat: update the Ascend910 scheduling policy by (@DSFans2014) in #1344
  • feat(nvidia): default gpucores=100 when memory is exclusive and cores… by (@xrwang8) in #1354
  • Prerelease-v2.6 by (@archlitchi) in #1108
  • add new reviewers Shouren and ouyangluwei163 by (@wawa0210) in #1131
  • Support topology-awareness for Kunlunxin device by (@archlitchi) in #1121
  • Support Metax sGPU Qos Policy by (@Kyrie336) in #1123
  • add global image for chart by (@calvin0327) in #1133
  • fix: Skip admission webhook when Pod's scheduler is already assigned. by (@ghostloda) in #1041
  • Add node configs to docs by (@wylswz) in #1159
  • build(deps): upgrade golang to 1.24.4 by (@Shouren) in #1172
  • build(deps): Upgrade golang image in ci to 1.24.4 by (@Shouren) in #1176
  • build(deps): Upgrade controller-runtime to 0.21.0 by (@Shouren) in #1171
  • build(deps): Dump github.com/NVIDIA/nvidia-container-toolkit by (@Shouren) in #1170
  • Add unit tests for Fit Function for enflame,hygon, metax, mthreads, nvidia by (@Wangmin362) in #1199
  • [Misc] update hami-core version by (@chaunceyjiang) in #1201
  • Improve the impl of DevicePluginConfigs.Nodeconfig overwriting NvidiaConfig by (@FouoF) in #1158
  • Add unit tests for cambricon's Fit Function by (@Wangmin362) in #1198
  • Add unit tests for Ascend's Fit Function by (@Wangmin362) in #1197
  • Fix unnecessary repeated calculation when generating pod resource requests by (@litaixun) in #1215
  • Fix the log message wording when updating node annotations by (@litaixun) in #1214
  • If the mem applied for the Mig device is the same as the template value, it will result in CardNotFoundCustom Filter Rule. by (@zgqqiang) in #1179
  • updated dri section to combine text for better readability by (@mpetason) in #1216
  • feat: Add nvidia gpu topology scheduler by (@fyp711) in #1028
  • add issue translate robot by (@wawa0210) in #1232
  • add issue translate robot by (@wawa0210) in #1234
  • perf(util/nodelock): Use clientset Patch instead of Update. by (@mayooot) in #1192
  • Update hami-core and fix readme documents by (@archlitchi) in #1240
  • Update hami-core version to fix by (@archlitchi) in #1256
  • [Snyk] Security upgrade tensorflow/tensorflow from latest-gpu to 2.20.0rc0-gpu by (@wawa0210) in #1243
  • feat: Add an action of 'Close stale issue and PRs' in github workflow by (@Shouren) in #1083
  • Welcome fyp711 to become a HAMi member by (@wawa0210) in #1288
  • Add values readme by (@clcc2019) in #1267
  • Support Metax sGPU device health check by (@Kyrie336) in #1295
  • Optimize pkg/util.go and distribute logics to corresponding logics by (@archlitchi) in #1296
  • cleanup: Clear and correct ascend device name by (@FouoF) in #1315
  • bugfix: Nvidia card abnormal pod will still continue to schedule by (@zgqqiang) in #1336
  • Fix CI, add 910B4-1 template and fix vGPUmonitor metrics error by (@archlitchi) in #1345
  • add httpTargetPort to values.yaml by (@flpanbin) in #1356
  • Update kunlunxin documents by (@archlitchi) in #1366
  • update chart version and hami-core by (@archlitchi) in #1369

Committers: 🆕 New Contributors

Full Changelog: https://github.com/Project-HAMi/HAMi/compare/v2.6.1...v2.7.0

HAMi is a CNCF Sandbox project