
Commit 788bea0

Merge: selftests/mm: relax test to fail after 100 migration failures
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5590
JIRA: https://issues.redhat.com/browse/RHEL-62703

This patch is a backport of the following upstream commit: 536ab83

From: Dev Jain <dev.jain@arm.com>
Date: Fri, 30 Aug 2024 10:46:09 +0530
Subject: [PATCH] selftests/mm: relax test to fail after 100 migration failures

```
It was recently observed at [1] that during the folio unmapping stage of
migration, when the PTEs are cleared, a racing thread faulting on that
folio may increase the refcount of the folio, sleep on the folio lock (the
migration path has the lock), and migration ultimately fails when
asserting the actual refcount against the expected. Thereby, the
migration selftest fails on shared-anon mappings. The above enforces the
fact that migration is a best-effort service, therefore, it is wrong to
fail the test for just a single failure; hence, fail the test after 100
consecutive failures (where 100 is still a subjective choice). Note that,
this has no effect on the execution time of the test since that is
controlled by a timeout.

[1] https://lore.kernel.org/all/20240801081657.1386743-1-dev.jain@arm.com/

Link: https://lkml.kernel.org/r/20240830051609.4037834-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <yang@os.amperecomputing.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
```

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>

Approved-by: Waiman Long <longman@redhat.com>
Approved-by: Nico Pache <npache@redhat.com>
Approved-by: Gavin Shan <gshan@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
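The policy described in the commit message (tolerate isolated failures, give up only after 100 in a row) can be sketched in isolation. This is a minimal illustration, not code from migration.c: `replay_attempts` and its boolean outcome array are hypothetical, and only `MAX_RETRIES` and the `-2` return value are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RETRIES 100	/* same threshold the patch introduces */

/*
 * Hypothetical helper, not part of migration.c: replay a recorded sequence
 * of attempt outcomes (true = page migrated, false = page left behind) and
 * apply the patch's policy. Returns 0 if the sequence is tolerated, or -2
 * once MAX_RETRIES consecutive attempts have failed.
 */
static int replay_attempts(const bool *migrated, size_t n)
{
	int failures = 0;

	assert(migrated != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!migrated[i]) {
			/* Migration is best effort; tolerate a bounded streak. */
			if (++failures < MAX_RETRIES)
				continue;
			return -2;
		}
		failures = 0;	/* a success ends the streak */
	}
	return 0;
}
```

Under this sketch, 99 failures followed by a success are tolerated because the counter resets, while 100 failures with no success in between return `-2`.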
2 parents cc47a26 + da79980 commit 788bea0

1 file changed: +11 -6 lines changed


tools/testing/selftests/mm/migration.c

Lines changed: 11 additions & 6 deletions
@@ -15,10 +15,10 @@
 #include <signal.h>
 #include <time.h>
 
-#define TWOMEG (2<<20)
-#define RUNTIME (20)
-
-#define ALIGN(x, a) (((x) + (a - 1)) & (~((a) - 1)))
+#define TWOMEG (2<<20)
+#define RUNTIME (20)
+#define MAX_RETRIES 100
+#define ALIGN(x, a) (((x) + (a - 1)) & (~((a) - 1)))
 
 FIXTURE(migration)
 {
@@ -65,6 +65,7 @@ int migrate(uint64_t *ptr, int n1, int n2)
 	int ret, tmp;
 	int status = 0;
 	struct timespec ts1, ts2;
+	int failures = 0;
 
 	if (clock_gettime(CLOCK_MONOTONIC, &ts1))
 		return -1;
@@ -79,13 +80,17 @@ int migrate(uint64_t *ptr, int n1, int n2)
 		ret = move_pages(0, 1, (void **) &ptr, &n2, &status,
 				MPOL_MF_MOVE_ALL);
 		if (ret) {
-			if (ret > 0)
+			if (ret > 0) {
+				/* Migration is best effort; try again */
+				if (++failures < MAX_RETRIES)
+					continue;
 				printf("Didn't migrate %d pages\n", ret);
+			}
 			else
 				perror("Couldn't migrate pages");
 			return -2;
 		}
-
+		failures = 0;
 		tmp = n2;
 		n2 = n1;
 		n1 = tmp;
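The hunks above do not show the top of the loop in `migrate()`. The sketch below assumes the loop re-reads `CLOCK_MONOTONIC` and checks the elapsed time at the start of every pass, which would be consistent with the commit message's note that retries do not change the test's execution time. `migrate_until`, `attempt_fn`, and the three stub callbacks are hypothetical stand-ins for the real `move_pages()` call, which would need two NUMA nodes to exercise.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

#define MAX_RETRIES 100

/* Hypothetical stand-in for one move_pages() call: 0 = migrated, >0 = pages left behind. */
typedef int (*attempt_fn)(void);

/*
 * Assumed loop shape, not copied from migration.c: run attempts until
 * runtime_sec seconds have elapsed, failing early only after MAX_RETRIES
 * consecutive failed attempts.
 */
static int migrate_until(int runtime_sec, attempt_fn attempt)
{
	struct timespec start, now;
	int failures = 0;

	if (clock_gettime(CLOCK_MONOTONIC, &start))
		return -1;

	while (1) {
		if (clock_gettime(CLOCK_MONOTONIC, &now))
			return -1;
		/* The deadline is re-checked on every pass, including retries. */
		if (now.tv_sec - start.tv_sec >= runtime_sec)
			return 0;

		if (attempt() > 0) {
			if (++failures < MAX_RETRIES)
				continue;	/* back to the deadline check */
			return -2;
		}
		failures = 0;
	}
}

/* Stub outcomes for illustration only. */
static int always_ok(void) { return 0; }
static int always_stuck(void) { return 1; }
/* Fails every other call, so the failure streak never exceeds one. */
static int flaky(void) { static int calls; return calls++ & 1; }
```

With these stubs, `always_stuck` reaches the retry limit long before any deadline, while `always_ok` and `flaky` both run until the deadline and return 0.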
