v2.7.0

abstractmj
Antvirf
clcc2019
DSFans2014
FouoF
fyp711
Kevinz857
litaixun
mayooot
miaobyte

🚀 Major features

🐛 Major bug fixes

  • fix: Before executing MIG partitioning, suppress NVML usage in o…, by @Goend, PR #1095
  • Fix golint-CI, by @archlitchi, PR #1127
  • fix: override node score failure for kunlun #1137, by @ouyangluwei163, PR #1138
  • fix: Multi-node scoring nodes are inaccurate, by @ouyangluwei163, PR #1147
  • fix: An error occurred while create Iluvatar pod, by @ouyangluwei163, PR #1149
  • Fix e2e CI, by @archlitchi, PR #1165
  • fix: Add option for overwrite schedulerName, by @Shouren, PR #1163
  • fix: using go-safecast to fix incorrect conversion of numbers, by @Shouren, PR #1183
  • fix: deal with security issues reported by Trivy in image, by @Shouren, PR #1189
  • fix: wrong Pod's UID and empty Pod's name in log of webhook.go, by @Shouren, PR #1092
  • fix: concurrent map writes error in scheduler.calcScore #1269, by @Shouren, PR #1270
  • fix: release dangling node lock, by @peachest, PR #1271
  • fix: fix err which retrieved incorrect NUMA node information issue #1275, by @abstractmj, PR #1276
  • fix(security): resolve issues reported by Code scanning in Security, by @Shouren, PR #1280
  • fix: fix golangci-lint error, by @DSFans2014, PR #1319
  • Fix: device allocation missing containers with no device request, by @FouoF, PR #1299
  • fix: update int8Slice to uint8Slice for better type clarity and consistency, by @yxxhero, PR #1357

📝 Changes

📚 Documentation

  • documentation: add Known Issues for dynamic mig support, by @Goend, PR #1122
  • docs: fix broken link, by @lixd, PR #1125
  • clearly list supported devices doc references at README, by @FouoF, PR #1155
  • docs: update ascend910b-support docs, by @DSFans2014, PR #1321

🔨 Other changes

  • Optimize Fit-in-device logic to make it device-specific, by @archlitchi, PR #1097
  • feat(scheduler): make node lock timeout configurable, by @Kevinz857, PR #1117
  • feature: mig mode-change #1116, by @ouyangluwei163, PR #1124
  • feat: Add new labels in .github/release.yml, by @Shouren, PR #1066
  • feat(scheduler-role): use a scoped-down role for scheduler, by @Antvirf, PR #1152
  • feat(helm): optionally disable admission webhook, by @Antvirf, PR #1145
  • remove redundant metrics for vgpu allocation, by @FouoF, PR #1169
  • refactor: clean up code and improve maintainability, by @Wangmin362, PR #1195
  • refactor: Ranging over SplitSeq is more efficient, by @Shouren, PR #1239
  • feat: NodeLockTimeout set from env, by @miaobyte, PR #1244
  • refactor: move watchAndFeedback function to feedback.go, by @miaobyte, PR #1248
  • feat: add informer-based pod cache to reduce API server load, by @miaobyte, PR #1250
  • feat: Add option to disable device plugin at values.yaml., by @FouoF, PR #1274
  • refactor(util/nodelock): replace manual polling with k8s.io/client-go/util/retry, by @mayooot, PR #1252
  • refactor: Remove annotation in Devices interfaces, by @Shouren, PR #1343
  • feat: update the Ascend910 scheduling policy, by @DSFans2014, PR #1344
  • feat(nvidia): default gpucores=100 when memory is exclusive and cores…, by @xrwang8, PR #1354
  • Prerelease-v2.6, by @archlitchi, PR #1108
  • add new reviewers Shouren and ouyangluwei163, by @wawa0210, PR #1131
  • Support topology-awareness for Kunlunxin device, by @archlitchi, PR #1121
  • Support Metax sGPU Qos Policy, by @Kyrie336, PR #1123
  • add global image for chart, by @calvin0327, PR #1133
  • fix: Skip admission webhook when Pod's scheduler is already assigned., by @ghostloda, PR #1041
  • Add node configs to docs, by @wylswz, PR #1159
  • build(deps): upgrade golang to 1.24.4, by @Shouren, PR #1172
  • build(deps): Upgrade golang image in ci to 1.24.4, by @Shouren, PR #1176
  • build(deps): Upgrade controller-runtime to 0.21.0, by @Shouren, PR #1171
  • build(deps): Dump github.com/NVIDIA/nvidia-container-toolkit, by @Shouren, PR #1170
  • Add unit tests for Fit Function for enflame, hygon, metax, mthreads, nvidia, by @Wangmin362, PR #1199
  • [Misc] update hami-core version, by @chaunceyjiang, PR #1201
  • Improve the impl of DevicePluginConfigs.Nodeconfig overwriting NvidiaConfig, by @FouoF, PR #1158
  • Add unit tests for cambricon's Fit Function, by @Wangmin362, PR #1198
  • Add unit tests for Ascend's Fit Function, by @Wangmin362, PR #1197
  • Fix unnecessary repeated computation when generating Pod resource requests, by @litaixun, PR #1215
  • Fix the log message emitted when updating node annotations, by @litaixun, PR #1214
  • If the mem applied for the Mig device is the same as the template value, will result in CardNotFoundCustom Filter Rule., by @zgqqiang, PR #1179
  • updated dri section to combine text for better readability, by @mpetason, PR #1216
  • feat: Add nvidia gpu topology scheduler, by @fyp711, PR #1028
  • add issue translate robot, by @wawa0210, PR #1232
  • add issue translate robot, by @wawa0210, PR #1234
  • perf(util/nodelock): Use clientset Patch instead of Update., by @mayooot, PR #1192
  • Update hami-core and fix readme documents, by @archlitchi, PR #1240
  • Update hami-core version to fix, by @archlitchi, PR #1256
  • [Snyk] Security upgrade tensorflow/tensorflow from latest-gpu to 2.20.0rc0-gpu, by @wawa0210, PR #1243
  • feat: Add an action of 'Close stale issue and PRs' in github workflow, by @Shouren, PR #1083
  • Welcome fyp711 to become a HAMi member, by @wawa0210, PR #1288
  • Add values readme, by @clcc2019, PR #1267
  • Support Metax sGPU device health check, by @Kyrie336, PR #1295
  • Optimize pkg/util.go and distribute logics to corresponding logics, by @archlitchi, PR #1296
  • cleanup: Clear and correct ascend device name, by @FouoF, PR #1315
  • bugfix: Nvidia card abnormal pod will still continue to schedule, by @zgqqiang, PR #1336
  • Fix CI, add 910B4-1 template and fix vGPUmonitor metrics error, by @archlitchi, PR #1345
  • add httpTargetPort to values.yaml, by @flpanbin, PR #1356
  • Update kunlunxin documents, by @archlitchi, PR #1366
  • update chart version and hami-core, by @archlitchi, PR #1369

Contributors: 🆕 New contributors

Full changelog: https://github.com/Project-HAMi/HAMi/compare/v2.6.1...v2.7.0

HAMi is a CNCF Sandbox project