You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

7.8 KiB

The results of these benchmarks suggest that building this `bc`

with
optimization at `-O3`

with link-time optimization (`-flto`

) will result in the
best performance. However, using `-march=native`

can result in **WORSE**
performance.

*Note*: all benchmarks were run four times, and the fastest run is the one
shown. Also, `[bc]`

means whichever `bc`

was being run, and the assumed working
directory is the root directory of this repository. Also, this `bc`

was at
version `3.0.0`

while GNU `bc`

was at version `1.07.1`

, and all tests were
conducted on an `x86_64`

machine running Gentoo Linux with `clang`

`9.0.1`

as
the compiler.

These benchmarks were run with both `bc`

's compiled with the typical `-O2`

optimizations and no link-time optimization.

The command used was:

```
tests/script.sh bc add.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 2.54
user 1.21
sys 1.32
```

For this `bc`

:

```
real 0.88
user 0.85
sys 0.02
```

The command used was:

```
tests/script.sh bc subtract.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 2.51
user 1.05
sys 1.45
```

For this `bc`

:

```
real 0.91
user 0.85
sys 0.05
```

The command used was:

```
tests/script.sh bc multiply.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 7.15
user 4.69
sys 2.46
```

For this `bc`

:

```
real 2.20
user 2.10
sys 0.09
```

The command used was:

```
tests/script.sh bc divide.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 3.36
user 1.87
sys 1.48
```

For this `bc`

:

```
real 1.61
user 1.57
sys 0.03
```

The command used was:

```
printf '1234567890^100000; halt\n' | time -p [bc] -q > /dev/null
```

For GNU `bc`

:

```
real 11.30
user 11.30
sys 0.00
```

For this `bc`

:

```
real 0.73
user 0.72
sys 0.00
```

This file was downloaded, saved at `../timeconst.bc`

and the following
patch was applied:

```
--- ../timeconst.bc 2018-09-28 11:32:22.808669000 -0600
+++ ../timeconst.bc 2019-06-07 07:26:36.359913078 -0600
@@ -110,8 +110,10 @@
print "#endif /* KERNEL_TIMECONST_H */\n"
}
- halt
}
-hz = read();
-timeconst(hz)
+for (i = 0; i <= 50000; ++i) {
+ timeconst(i)
+}
+
+halt
```

The command used was:

```
time -p [bc] ../timeconst.bc > /dev/null
```

For GNU `bc`

:

```
real 16.71
user 16.06
sys 0.65
```

For this `bc`

:

```
real 13.16
user 13.15
sys 0.00
```

Because this `bc`

is faster when doing math, it might be a better comparison to
run a script that is not running any math. As such, I put the following into
`../test.bc`

:

```
for (i = 0; i < 100000000; ++i) {
y = i
}
i
y
halt
```

The command used was:

```
time -p [bc] ../test.bc > /dev/null
```

For GNU `bc`

:

```
real 16.60
user 16.59
sys 0.00
```

For this `bc`

:

```
real 22.76
user 22.75
sys 0.00
```

I also put the following into `../test2.bc`

:

```
i = 0
while (i < 100000000) {
i += 1
}
i
halt
```

The command used was:

```
time -p [bc] ../test2.bc > /dev/null
```

For GNU `bc`

:

```
real 17.32
user 17.30
sys 0.00
```

For this `bc`

:

```
real 16.98
user 16.96
sys 0.01
```

It seems that the improvements to the interpreter helped a lot in certain cases.

Also, I have no idea why GNU `bc`

did worse when it is technically doing less
work.

`2.7.0`

Note that, when running the benchmarks, the optimizations used are not the ones
I recommended for version `2.7.0`

, which are `-O3 -flto -march=native`

.

This `bc`

separates its code into modules that, when optimized at link time,
removes a lot of the inefficiency that comes from function overhead. This is
most keenly felt with one function: `bc_vec_item()`

, which should turn into just
one instruction (on `x86_64`

) when optimized at link time and inlined. There are
other functions that matter as well.

I also recommended `-march=native`

on the grounds that newer instructions would
increase performance on math-heavy code. We will see if that assumption was
correct. (Spoiler: **NO**.)

When compiling both `bc`

's with the optimizations I recommended for this `bc`

for version `2.7.0`

, the results are as follows.

The command used was:

```
tests/script.sh bc add.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 2.44
user 1.11
sys 1.32
```

For this `bc`

:

```
real 0.59
user 0.54
sys 0.05
```

The command used was:

```
tests/script.sh bc subtract.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 2.42
user 1.02
sys 1.40
```

For this `bc`

:

```
real 0.64
user 0.57
sys 0.06
```

The command used was:

```
tests/script.sh bc multiply.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 7.01
user 4.50
sys 2.50
```

For this `bc`

:

```
real 1.59
user 1.53
sys 0.05
```

The command used was:

```
tests/script.sh bc divide.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 3.26
user 1.82
sys 1.44
```

For this `bc`

:

```
real 1.24
user 1.20
sys 0.03
```

The command used was:

```
printf '1234567890^100000; halt\n' | time -p [bc] -q > /dev/null
```

For GNU `bc`

:

```
real 11.08
user 11.07
sys 0.00
```

For this `bc`

:

```
real 0.71
user 0.70
sys 0.00
```

The command for the `../timeconst.bc`

script was:

```
time -p [bc] ../timeconst.bc > /dev/null
```

For GNU `bc`

:

```
real 15.62
user 15.08
sys 0.53
```

For this `bc`

:

```
real 10.09
user 10.08
sys 0.01
```

The command for the next script, the `for`

loop script, was:

```
time -p [bc] ../test.bc > /dev/null
```

For GNU `bc`

:

```
real 14.76
user 14.75
sys 0.00
```

For this `bc`

:

```
real 17.95
user 17.94
sys 0.00
```

The command for the next script, the `while`

loop script, was:

```
time -p [bc] ../test2.bc > /dev/null
```

For GNU `bc`

:

```
real 14.84
user 14.83
sys 0.00
```

For this `bc`

:

```
real 13.53
user 13.52
sys 0.00
```

Just for kicks, let's see if `-march=native`

is even useful.

The optimizations I used for both `bc`

's were `-O3 -flto`

.

The command used was:

```
tests/script.sh bc add.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 2.41
user 1.05
sys 1.35
```

For this `bc`

:

```
real 0.58
user 0.52
sys 0.05
```

The command used was:

```
tests/script.sh bc subtract.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 2.39
user 1.10
sys 1.28
```

For this `bc`

:

```
real 0.65
user 0.57
sys 0.07
```

The command used was:

```
tests/script.sh bc multiply.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 6.82
user 4.30
sys 2.51
```

For this `bc`

:

```
real 1.57
user 1.49
sys 0.08
```

The command used was:

```
tests/script.sh bc divide.bc 1 0 1 1 [bc]
```

For GNU `bc`

:

```
real 3.25
user 1.81
sys 1.43
```

For this `bc`

:

```
real 1.27
user 1.23
sys 0.04
```

The command used was:

```
printf '1234567890^100000; halt\n' | time -p [bc] -q > /dev/null
```

For GNU `bc`

:

```
real 10.50
user 10.49
sys 0.00
```

For this `bc`

:

```
real 0.72
user 0.71
sys 0.00
```

The command for the `../timeconst.bc`

script was:

```
time -p [bc] ../timeconst.bc > /dev/null
```

For GNU `bc`

:

```
real 15.50
user 14.81
sys 0.68
```

For this `bc`

:

```
real 10.17
user 10.15
sys 0.01
```

The command for the next script, the `for`

loop script, was:

```
time -p [bc] ../test.bc > /dev/null
```

For GNU `bc`

:

```
real 14.99
user 14.99
sys 0.00
```

For this `bc`

:

```
real 16.85
user 16.84
sys 0.00
```

The command for the next script, the `while`

loop script, was:

```
time -p [bc] ../test2.bc > /dev/null
```

For GNU `bc`

:

```
real 14.92
user 14.91
sys 0.00
```

For this `bc`

:

```
real 12.75
user 12.75
sys 0.00
```

It turns out that `-march=native`

can be a problem. As such, I have removed the
recommendation to build with `-march=native`

.

When I ran these benchmarks with my `bc`

compiled under `clang`

vs. `gcc`

, it
performed much better under `clang`

. I recommend compiling this `bc`

with
`clang`

.