Commit d69eb02

Merge pull request #665 from iJobsYuYing/main
add topic servers-and-cloud-computing/glibc-with-lse
2 parents e7ab95c + 6f2330d commit d69eb02

7 files changed: +570 -0 lines changed

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
1+
---
2+
title: Use glibc with LSE to improve the performance of workloads on Arm servers
3+
4+
minutes_to_complete: 60
5+
6+
who_is_this_for: This is an advanced topic for software developers interested in learning how to improve the performance of their workloads on Arm servers.
7+
8+
learning_objectives:
9+
- Build and install glibc with LSE on an Arm server
10+
- Benchmark workload performance using glibc with LSE optimizations
11+
- Benchmark MongoDB using glibc with LSE optimizations
12+
13+
prerequisites:
14+
- An Arm-based instance from a cloud service provider.
15+
16+
author_primary: Ying Yu, Arm
17+
18+
### Tags
19+
skilllevels: Advanced
20+
subjects: Performance and Architecture
21+
22+
armips:
23+
- Neoverse
24+
25+
layout: learningpathall
26+
27+
learning_path_main_page: 'yes'
28+
29+
operatingsystems:
30+
- Linux
31+
32+
tools_software_languages:
33+
- Glibc
34+
- LSE
35+
- MongoDB
36+
37+
38+
### FIXED, DO NOT MODIFY
39+
# ================================================================================
40+
weight: 1 # _index.md always has weight of 1 to order correctly
41+
layout: "learningpathall" # All files under learning paths have this same wrapper
42+
learning_path_main_page: "yes" # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
43+
---
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
---
2+
# ================================================================================
3+
# Edit
4+
# ================================================================================
5+
6+
next_step_guidance: >
7+
You can continue learning about porting other cloud applications to the Arm architecture for increased performance and cost savings.
8+
# 1-3 sentence recommendation outlining how the reader can generally keep learning about these topics, and a specific explanation of why the next step is being recommended.
9+
10+
recommended_path: "/learning-paths/servers-and-cloud-computing/lse"
11+
# Link to the next learning path being recommended.
12+
13+
14+
# further_reading links to references related to this path. Can be:
15+
# Manuals for a tool / software mentioned (type: documentation)
16+
# Blog about related topics (type: blog)
17+
# General online references (type: website)
18+
19+
further_reading:
20+
- resource:
21+
title: Arm Architecture Reference Manual
22+
link: https://developer.arm.com/documentation/ddi0487/latest
23+
type: documentation
24+
- resource:
25+
title: Projects related to MongoDB
26+
link: https://github.com/mongodb
27+
type: website
28+
29+
# ================================================================================
30+
# FIXED, DO NOT MODIFY
31+
# ================================================================================
32+
weight: 9 # set to always be larger than the content in this path, and one more than 'review'
33+
title: "Next Steps" # Always the same
34+
layout: "learningpathall" # All files under learning paths have this same wrapper
35+
---
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
1+
---
2+
# ================================================================================
3+
# Edit
4+
# ================================================================================
5+
6+
# Always 3 questions. Should try to test the reader's knowledge, and reinforce the key points you want them to remember.
7+
# question: A one sentence question
8+
# answers: The correct answers (from 2-4 answer options only). Should be surrounded by quotes.
9+
# correct_answer: An integer indicating what answer is correct (index starts from 1)
10+
# explanation: A short (1-3 sentence) explanation of why the correct answer is correct. Can add additional context if desired
11+
12+
13+
review:
14+
- questions:
15+
question: >
16+
Can you use LSE on non-Arm servers?
17+
answers:
18+
- "Yes"
19+
- "No"
20+
correct_answer: 2
21+
explanation: >
22+
LSE is a feature specific to the Arm architecture.
23+
24+
- questions:
25+
question: >
26+
Do all Arm architectures support LSE?
27+
answers:
28+
- "Yes"
29+
- "No"
30+
correct_answer: 2
31+
explanation: >
32+
LSE was introduced in Armv8.1-A; older Arm architectures such as Armv7-A do not support it.
33+
34+
- questions:
35+
question: >
36+
Can LSE improve the performance of all applications?
37+
answers:
38+
- "Yes"
39+
- "No"
40+
correct_answer: 2
41+
explanation: >
42+
Only multi-threaded applications that rely heavily on atomic operations and synchronization primitives can potentially see a performance improvement.
43+
44+
- questions:
45+
question: >
46+
Does YCSB support only MongoDB?
47+
answers:
48+
- "Yes"
49+
- "No"
50+
correct_answer: 2
51+
explanation: >
52+
YCSB supports a variety of popular data-serving systems, including Apache Cassandra, MongoDB, Redis, HBase, Amazon DynamoDB, and more. It provides a set of workload scenarios that can be customized to simulate specific application patterns and data access patterns.
53+
54+
55+
56+
# ================================================================================
57+
# FIXED, DO NOT MODIFY
58+
# ================================================================================
59+
title: "Review" # Always the same title
60+
weight: 8 # Set to always be larger than the content in this path
61+
layout: "learningpathall" # All files under learning paths have this same wrapper
62+
---
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
1+
---
2+
# User change
3+
title: "Build Glibc with LSE"
4+
5+
weight: 2 # (intro is 1), 2 is first, 3 is second, etc.
6+
7+
# Do not modify these elements
8+
layout: "learningpathall"
9+
---
10+
11+
12+
## Before you begin
13+
"Glibc with LSE" refers to a build of [the GNU C Library (glibc)](https://www.gnu.org/software/libc/) that makes use of [LSE (Large System Extensions)](https://learn.arm.com/learning-paths/servers-and-cloud-computing/lse/). LSE is an extension to the Armv8-A architecture, introduced in Armv8.1-A, that provides enhanced atomic operations.
14+
15+
LSE introduces additional atomic instructions, such as Compare-and-Swap (CAS), Swap (SWP), and atomic read-modify-write operations like LDADD, each with optional acquire and release ordering. These instructions allow for more efficient synchronization and concurrent access to shared memory in multi-threaded applications running on Arm processors that implement LSE.
16+
17+
When glibc is compiled with LSE support, it can take advantage of these enhanced atomic operations provided by the LSE extension. This can potentially improve the performance of multi-threaded applications that heavily rely on atomic operations and synchronization primitives.
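
Because any gain depends on the CPU actually implementing LSE, it can help to check the target machine first. The following is a minimal sketch, assuming a Linux system where the kernel reports LSE as the `atomics` flag on the `Features` line of `/proc/cpuinfo`. The `has_lse` helper name is hypothetical and is not part of glibc or any other tool.

```bash
#!/usr/bin/env bash
# has_lse [FILE]: hypothetical helper. Succeeds when the first "Features"
# line of FILE (default: /proc/cpuinfo) lists the "atomics" flag, which is
# how the Linux kernel reports LSE on AArch64.
has_lse() {
    grep -m1 '^Features' "${1:-/proc/cpuinfo}" 2>/dev/null | grep -qw 'atomics'
}

if has_lse; then
    echo "LSE atomics reported by the kernel"
else
    echo "LSE atomics not reported"
fi
```

The optional file argument exists only so the check can be run against a saved copy of `/proc/cpuinfo` from another machine.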
18+
19+
20+
## Build and Install Glibc
21+
You can build glibc without installing it, or build it and install it to a specific directory.
22+
23+
```bash
# Fetch the glibc sources and switch to the 2.32 release tag
cd ~
git clone https://sourceware.org/git/glibc.git
cd glibc
git checkout glibc-2.32
# Use a separate build directory outside the source tree
build=~/glibc-2.32_build_install/build
mkdir -p "$build"
cd "$build"
```
32+
Before running the glibc `configure` script, install gawk and bison.
33+
Glibc 2.32 works well with gcc-10, and the version of ld (the GNU linker) should not be higher than 2.35.
34+
__Ubuntu 20.04 is therefore strongly recommended.__
35+
```bash
sudo apt update
sudo apt install -y gcc-10 g++-10 gawk bison make
```
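
The linker constraint mentioned above can be checked before configuring. This is a sketch only: `version_le` is a hypothetical helper, it relies on `sort -V` from GNU coreutils, and the parsing assumes the first line of `ld --version` contains a version such as `2.34`.

```bash
#!/usr/bin/env bash
# version_le A B: hypothetical helper. Succeeds when version A <= version B.
# Relies on "sort -V" (version sort) from GNU coreutils.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# Report whether the installed GNU linker is within the 2.35 limit.
if command -v ld >/dev/null 2>&1; then
    # Assumes the version is the first "major.minor" number in the banner.
    ld_version=$(ld --version 2>/dev/null | head -n1 | grep -oE '[0-9]+\.[0-9]+' | head -n1)
    if [ -z "$ld_version" ]; then
        echo "could not parse the ld version"
    elif version_le "$ld_version" 2.35; then
        echo "ld $ld_version is within the limit"
    else
        echo "ld $ld_version is newer than 2.35"
    fi
fi
```

Only the major and minor numbers are compared, so a point release such as 2.35.1 is treated as 2.35.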
39+
40+
- __Without installing__
41+
```bash
# Run from inside the build directory ($build): configure, then compile
sudo bash ~/glibc/configure --prefix=/usr --disable-werror CC=gcc-10 CXX=g++-10
sudo make -C $build -j$(expr $(nproc) - 1)
```
45+
46+
- __OR__
47+
48+
- __With installing to a specific directory__
49+
```bash
install=~/glibc-2.32_build_install/install
mkdir -p ${install}
sudo make -C $build -j$(expr $(nproc) - 1) install DESTDIR=${install}
sudo make -C $build -j$(expr $(nproc) - 1) localedata/install-locales DESTDIR=${install}
sudo make -C $build -j$(expr $(nproc) - 1) localedata/install-locale-files DESTDIR=${install}
```
56+
57+
58+
## With LSE
59+
To build glibc with LSE, pass `CFLAGS` and `CXXFLAGS` to `configure`. You can enable LSE implicitly with `-mcpu=native` on a machine whose CPU supports LSE, or explicitly with the `+lse` feature modifier.
60+
61+
```bash
sudo bash ~/glibc/configure --prefix=/usr --disable-werror CC=gcc-10 CXX=g++-10 CFLAGS="-mcpu=native -O3" CXXFLAGS="-mcpu=native -O3"
sudo make -C $build -j$(expr $(nproc) - 1)
```
65+
OR
66+
```bash
sudo bash ~/glibc/configure --prefix=/usr --disable-werror CC=gcc-10 CXX=g++-10 CFLAGS="-mcpu=neoverse-n2+lse -O3" CXXFLAGS="-mcpu=neoverse-n2+lse -O3"
sudo make -C $build -j$(expr $(nproc) - 1)
```
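
One way to see whether a build emitted LSE instructions is to disassemble the resulting library and look for LSE mnemonics. The sketch below is an illustration under stated assumptions: `count_lse_insns` is a hypothetical helper, the mnemonic list is a non-exhaustive sample, and the path `$build/libc.so` assumes the build directory layout used in the steps above.

```bash
#!/usr/bin/env bash
# count_lse_insns: hypothetical helper that reads "objdump -d" output on
# stdin and prints how many lines contain a common LSE mnemonic
# (cas*, swp*, ldadd*, ldclr*, ldeor*, ldset*). The list is not exhaustive.
count_lse_insns() {
    grep -cE '[[:space:]](cas|swp|ldadd|ldclr|ldeor|ldset)[a-z]*[[:space:]]' || true
}

# Example usage, assuming $build points at the glibc build directory:
#   objdump -d "$build/libc.so" | count_lse_insns
```

Comparing the count between two builds gives only a rough indication of how the library was compiled; it does not show which instructions execute at run time.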
70+
71+
72+
73+
## With NO-LSE
74+
75+
```bash
sudo bash ~/glibc/configure --prefix=/usr --disable-werror CC=gcc-10 CXX=g++-10
sudo make -C $build -j$(expr $(nproc) - 1)
```
79+
OR
80+
```bash
sudo bash ~/glibc/configure --prefix=/usr --disable-werror CC=gcc-10 CXX=g++-10 CFLAGS="-mcpu=neoverse-n2+nolse -O3" CXXFLAGS="-mcpu=neoverse-n2+nolse -O3"
sudo make -C $build -j$(expr $(nproc) - 1)
```
Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
1+
---
2+
# User change
3+
title: "Compare the results with LSE and NoLSE"
4+
weight: 6 # 1 is first, 2 is second, etc.
5+
6+
# Do not modify these elements
7+
layout: "learningpathall"
8+
---
9+
10+
Now you can run the MongoDB benchmark using glibc built with and without LSE, and compare the results.
11+
12+
## Result with No-LSE
13+
Launch MongoDB with the glibc built without LSE and collect the benchmark result.
14+
The overall throughput is __6662.1275371047195__ ops/sec with the No-LSE glibc.
15+
```console
[OVERALL], RunTime(ms), 750511
[OVERALL], Throughput(ops/sec), 6662.1275371047195
[TOTAL_GCS_G1_Young_Generation], Count, 3527
[TOTAL_GC_TIME_G1_Young_Generation], Time(ms), 27871
[TOTAL_GC_TIME_%_G1_Young_Generation], Time(%), 3.713603131732913
[TOTAL_GCS_G1_Old_Generation], Count, 0
[TOTAL_GC_TIME_G1_Old_Generation], Time(ms), 0
[TOTAL_GC_TIME_%_G1_Old_Generation], Time(%), 0.0
[TOTAL_GCs], Count, 3527
[TOTAL_GC_TIME], Time(ms), 27871
[TOTAL_GC_TIME_%], Time(%), 3.713603131732913
[READ], Operations, 1998800
[READ], AverageLatency(us), 3047.5279422653593
[READ], MinLatency(us), 225
[READ], MaxLatency(us), 219775
[READ], 95thPercentileLatency(us), 10495
[READ], 99thPercentileLatency(us), 23791
[READ], Return=OK, 1998800
[READ-MODIFY-WRITE], Operations, 999359
[READ-MODIFY-WRITE], AverageLatency(us), 5748.511211686691
[READ-MODIFY-WRITE], MinLatency(us), 482
[READ-MODIFY-WRITE], MaxLatency(us), 379135
[READ-MODIFY-WRITE], 95thPercentileLatency(us), 17455
[READ-MODIFY-WRITE], 99thPercentileLatency(us), 32271
[CLEANUP], Operations, 64
[CLEANUP], AverageLatency(us), 57.46875
[CLEANUP], MinLatency(us), 0
[CLEANUP], MaxLatency(us), 3519
[CLEANUP], 95thPercentileLatency(us), 3
[CLEANUP], 99thPercentileLatency(us), 38
[UPDATE], Operations, 2500755
[UPDATE], AverageLatency(us), 2888.259603599713
[UPDATE], MinLatency(us), 211
[UPDATE], MaxLatency(us), 378367
[UPDATE], 95thPercentileLatency(us), 9743
[UPDATE], 99thPercentileLatency(us), 22895
[UPDATE], Return=OK, 2500755
[SCAN], Operations, 1499804
[SCAN], AverageLatency(us), 22936.992162975963
[SCAN], MinLatency(us), 317
[SCAN], MaxLatency(us), 335615
[SCAN], 95thPercentileLatency(us), 51551
[SCAN], 99thPercentileLatency(us), 70143
[SCAN], Return=OK, 1499804
```
61+
62+
## Result with LSE
63+
Launch MongoDB with the glibc built with LSE and collect the benchmark result.
64+
The overall throughput is __6871.605426919102__ ops/sec with the LSE glibc.
65+
___In this run, glibc with LSE delivers about 3.14% higher throughput than glibc without LSE.___
66+
```console
[OVERALL], RunTime(ms), 727632
[OVERALL], Throughput(ops/sec), 6871.605426919102
[TOTAL_GCS_G1_Young_Generation], Count, 3397
[TOTAL_GC_TIME_G1_Young_Generation], Time(ms), 27090
[TOTAL_GC_TIME_%_G1_Young_Generation], Time(%), 3.7230358203047693
[TOTAL_GCS_G1_Old_Generation], Count, 0
[TOTAL_GC_TIME_G1_Old_Generation], Time(ms), 0
[TOTAL_GC_TIME_%_G1_Old_Generation], Time(%), 0.0
[TOTAL_GCs], Count, 3397
[TOTAL_GC_TIME], Time(ms), 27090
[TOTAL_GC_TIME_%], Time(%), 3.7230358203047693
[READ], Operations, 1998456
[READ], AverageLatency(us), 3045.181359009155
[READ], MinLatency(us), 230
[READ], MaxLatency(us), 346111
[READ], 95thPercentileLatency(us), 10535
[READ], 99thPercentileLatency(us), 23631
[READ], Return=OK, 1998456
[READ-MODIFY-WRITE], Operations, 999026
[READ-MODIFY-WRITE], AverageLatency(us), 5786.018821331977
[READ-MODIFY-WRITE], MinLatency(us), 467
[READ-MODIFY-WRITE], MaxLatency(us), 359167
[READ-MODIFY-WRITE], 95thPercentileLatency(us), 17599
[READ-MODIFY-WRITE], 99thPercentileLatency(us), 32463
[CLEANUP], Operations, 64
[CLEANUP], AverageLatency(us), 52.25
[CLEANUP], MinLatency(us), 1
[CLEANUP], MaxLatency(us), 3217
[CLEANUP], 95thPercentileLatency(us), 2
[CLEANUP], 99thPercentileLatency(us), 8
[UPDATE], Operations, 2500888
[UPDATE], AverageLatency(us), 2905.821863674023
[UPDATE], MinLatency(us), 207
[UPDATE], MaxLatency(us), 320767
[UPDATE], 95thPercentileLatency(us), 9903
[UPDATE], 99thPercentileLatency(us), 23007
[UPDATE], Return=OK, 2500888
[SCAN], Operations, 1499682
[SCAN], AverageLatency(us), 21999.80309425598
[SCAN], MinLatency(us), 298
[SCAN], MaxLatency(us), 525311
[SCAN], 95thPercentileLatency(us), 49727
[SCAN], 99thPercentileLatency(us), 67455
[SCAN], Return=OK, 1499682
```
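
The percentage gain can be reproduced from the two `[OVERALL], Throughput(ops/sec)` lines. The following is a minimal sketch, assuming each YCSB run was saved to a log file; the helper names `ycsb_throughput` and `throughput_gain` are hypothetical and are not part of YCSB.

```bash
#!/usr/bin/env bash
# ycsb_throughput FILE: hypothetical helper that prints the overall
# throughput (ops/sec) recorded in a saved YCSB log.
ycsb_throughput() {
    awk -F', ' '$1 == "[OVERALL]" && $2 == "Throughput(ops/sec)" { print $3 }' "$1"
}

# throughput_gain BASE NEW: percentage gain of NEW over BASE, two decimals.
throughput_gain() {
    awk -v base_tps="$1" -v new_tps="$2" \
        'BEGIN { printf "%.2f\n", (new_tps / base_tps - 1) * 100 }'
}

# Using the two throughput values reported above (No-LSE first, then LSE):
throughput_gain 6662.1275371047195 6871.605426919102   # prints 3.14
```

With saved logs, the two helpers can be combined, for example `throughput_gain "$(ycsb_throughput nolse.log)" "$(ycsb_throughput lse.log)"`, where both file names are placeholders.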
112+
