Age | Commit message | Author |
|
Fixes #221.
R=ken2
https://golang.org/cl/165086
|
|
R=r
https://golang.org/cl/165083
|
|
cuts working size for hello world from 6 MB to 1.2 MB.
still some work to be done, but diminishing returns.
R=r
https://golang.org/cl/165080
|
|
to provide functionality previously hacked in to
reflect and gob.
R=r
https://golang.org/cl/165076
|
|
R=r
https://golang.org/cl/165078
|
|
R=r, rsc
https://golang.org/cl/165068
|
|
the one-item case could be generalized easily with no cost. worth considering.
R=rsc
CC=golang-dev, cw
https://golang.org/cl/167044
|
|
nodes in the tree are nested with respect to one another.
a simple change to the Visitor interface makes it possible
to do this (for example to maintain a current node-depth, or a
knowledge of the name of the current function).
Visit(nil) is called at the end of a node's children;
this makes possible the channel-based interface below,
amongst other possibilities.
It is still just as simple to get the original behaviour - just
return the same Visitor from Visit.
Here are a couple of possible Visitor types.
// closure-based
type FVisitor func(n interface{}) FVisitor

func (f FVisitor) Visit(n interface{}) Visitor {
	return f(n)
}

// channel-based
type CVisitor chan Visit

type Visit struct {
	node  interface{}
	reply chan CVisitor
}

func (v CVisitor) Visit(n interface{}) Visitor {
	if n == nil {
		// end of this node's children
		close(v)
	} else {
		reply := make(chan CVisitor)
		v <- Visit{n, reply}
		r := <-reply
		if r == nil {
			// return an untyped nil, not a nil CVisitor wrapped in a Visitor
			return nil
		}
		return r
	}
	return nil
}
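As a rough illustration of the nesting described above, the sketch below keeps a current node depth. The Node type, the Walk function, and depthVisitor are hypothetical stand-ins invented for this example and are not the package's real tree or traversal; only the conventions taken from the message (Visit returns the Visitor for the children, and Visit(nil) marks the end of a node's children) are assumed.

```go
package main

import "fmt"

// Visitor follows the shape described in the message: Visit returns the
// Visitor to use for the node's children, or nil to skip them.
type Visitor interface {
	Visit(n interface{}) Visitor
}

// Node is a hypothetical tree node used only for this illustration.
type Node struct {
	Name     string
	Children []*Node
}

// Walk is a hypothetical traversal: it visits n, walks the children with the
// returned Visitor, then calls Visit(nil) on that Visitor to mark the end.
func Walk(v Visitor, n *Node) {
	w := v.Visit(n)
	if w == nil {
		return
	}
	for _, c := range n.Children {
		Walk(w, c)
	}
	w.Visit(nil)
}

// depthVisitor records each node name together with its depth.
type depthVisitor struct {
	depth int
	out   *[]string
}

func (d depthVisitor) Visit(n interface{}) Visitor {
	node, ok := n.(*Node)
	if !ok || node == nil {
		return nil // end-of-children marker; nothing to record
	}
	*d.out = append(*d.out, fmt.Sprintf("%d:%s", d.depth, node.Name))
	// A new Visitor one level deeper handles this node's children.
	return depthVisitor{d.depth + 1, d.out}
}

// Depths returns "depth:name" entries in visiting order.
func Depths(root *Node) []string {
	var out []string
	Walk(depthVisitor{0, &out}, root)
	return out
}

func main() {
	tree := &Node{"file", []*Node{{"func", []*Node{{"stmt", nil}}}, {"var", nil}}}
	fmt.Println(Depths(tree))
}
```

Because each level gets its own Visitor value, the depth needs no explicit stack; returning the same Visitor from Visit would give the original flat behaviour.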
R=gri
CC=rsc
https://golang.org/cl/166047
|
|
shootout.alioth.debian.org .
it's now there: http://shootout.alioth.debian.org/u32q/benchmark.php?test=chameneosredux&lang=all&box=1!
R=r, rsc
CC=golang-dev
https://golang.org/cl/167043
|
|
a couple of cleanups.
don't keep big buffers in the free list.
R=rsc
CC=golang-dev
https://golang.org/cl/166078
|
|
cleans up godoc's output for package fmt substantially.
R=rsc
CC=golang-dev
https://golang.org/cl/165070
|
|
Roughly 33% faster for simple cases, probably more for complex ones.
Before:
mallocs per Sprintf(""): 4
mallocs per Sprintf("xxx"): 6
mallocs per Sprintf("%x"): 10
mallocs per Sprintf("%x %x"): 12
Now:
mallocs per Sprintf(""): 2
mallocs per Sprintf("xxx"): 3
mallocs per Sprintf("%x"): 5
mallocs per Sprintf("%x %x"): 7
Speed improves because of avoiding mallocs and also by sharing a bytes.Buffer
between print.go and format.go rather than copying the data back after each
printed item.
Before:
fmt_test.BenchmarkSprintfEmpty 1000000 1346 ns/op
fmt_test.BenchmarkSprintfString 500000 3461 ns/op
fmt_test.BenchmarkSprintfInt 500000 3671 ns/op
Now:
fmt_test.BenchmarkSprintfEmpty 2000000 995 ns/op
fmt_test.BenchmarkSprintfString 1000000 2745 ns/op
fmt_test.BenchmarkSprintfInt 1000000 2391 ns/op
fmt_test.BenchmarkSprintfIntInt 500000 3751 ns/op
I believe there is more to get but this is a good milestone.
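For readers who want to reproduce malloc counts of this kind today, here is a minimal sketch. It assumes the modern testing.AllocsPerRun helper, which is not what this change used, and the helper name allocsPerSprintf is made up for the example. The counts it prints depend on the Go version and will not match the figures above.

```go
package main

import (
	"fmt"
	"testing"
)

// allocsPerSprintf reports the average number of heap allocations made by one
// fmt.Sprintf call with the given format and arguments.
func allocsPerSprintf(format string, args ...interface{}) float64 {
	return testing.AllocsPerRun(1000, func() {
		_ = fmt.Sprintf(format, args...)
	})
}

func main() {
	fmt.Printf("mallocs per Sprintf(%q): %v\n", "", allocsPerSprintf(""))
	fmt.Printf("mallocs per Sprintf(%q): %v\n", "xxx", allocsPerSprintf("xxx"))
	fmt.Printf("mallocs per Sprintf(%q): %v\n", "%x", allocsPerSprintf("%x", 7))
	fmt.Printf("mallocs per Sprintf(%q): %v\n", "%x %x", allocsPerSprintf("%x %x", 7, 112))
}
```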
R=rsc
CC=golang-dev, hong
https://golang.org/cl/166076
|
|
* broken by reflect, gob
TBR=r
https://golang.org/cl/166077
|
|
For 386 we use the [f]statfs64 system call, which takes three
parameters: the filename, the size of the statfs64 structure,
and a pointer to the structure itself.
R=rsc
https://golang.org/cl/166073
|
|
1.9s gcc reverse-complement.c
reverse-complement.go
4.5s / 3.5s original, with/without bounds checks
3.5s / 3.3s bounds check reduction
3.3s / 2.8s smarter garbage collector
2.6s / 2.3s assembler bytes.IndexByte
2.5s / 2.1s even smarter garbage collector
2.3s / 2.1s fix optimizer unnecessary spill bug
2.0s / 1.9s change loop to range (this CL)
R=r
https://golang.org/cl/166072
|
|
* inform garbage collector about memory with no pointers in it
1.9s gcc reverse-complement.c
reverse-complement.go
4.5s / 3.5s original, with/without bounds checks
3.5s / 3.3s bounds check reduction
3.3s / 2.8s smarter garbage collector
2.6s / 2.3s assembler bytes.IndexByte
2.5s / 2.1s even smarter garbage collector (this CL)
R=r
https://golang.org/cl/165064
|
|
R=ken2
https://golang.org/cl/166071
|
|
R=ken2
https://golang.org/cl/166070
|
|
R=r
https://golang.org/cl/165065
|
|
R=r
https://golang.org/cl/166067
|
|
i don't know why the timeout needs
to be so big.
R=r
https://golang.org/cl/165063
|
|
R=r
https://golang.org/cl/166068
|
|
R=r
https://golang.org/cl/166064
|
|
R=rsc
https://golang.org/cl/165062
|
|
Makes the code look cleaner, even if it's a little harder to figure
out from the sort invariants.
R=rsc
CC=golang-dev
https://golang.org/cl/165061
|
|
R=rsc
https://golang.org/cl/166058
|
|
On a microbenchmark that ping-pongs on lots of channels, this makes
the multithreaded case about 20% faster and the uniprocessor case
about 1% slower. (Due to cache effects, I expect.)
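A microbenchmark of the kind described might look roughly like the sketch below. The original benchmark is not shown in this log, so the structure, the function names pingPong and pingPongAll, and the pair and round counts are all assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pingPong bounces a value between two goroutines n times over a pair of
// unbuffered channels and returns the number of round trips completed.
func pingPong(n int) int {
	ping, pong := make(chan int), make(chan int)
	go func() {
		for v := range ping {
			pong <- v + 1
		}
		close(pong)
	}()
	count := 0
	for i := 0; i < n; i++ {
		ping <- i
		<-pong
		count++
	}
	close(ping)
	return count
}

// pingPongAll runs the exchange on many independent channel pairs at once
// and returns the total number of round trips.
func pingPongAll(pairs, n int) int {
	var wg sync.WaitGroup
	results := make([]int, pairs)
	for p := 0; p < pairs; p++ {
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			results[p] = pingPong(n)
		}(p)
	}
	wg.Wait()
	total := 0
	for _, r := range results {
		total += r
	}
	return total
}

func main() {
	start := time.Now()
	total := pingPongAll(100, 1000)
	fmt.Printf("%d round trips in %v\n", total, time.Since(start))
}
```

Running it with the GOMAXPROCS environment variable set to 1 and then to a larger value gives a crude uniprocessor versus multithreaded comparison.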
R=rsc, agl
CC=golang-dev
https://golang.org/cl/166043
|
|
PERFORMANCE DIFFERENCE
SUMMARY
                                                  amd64         386
2.2 GHz AMD Opteron 8214 HE (Linux)               3.0x faster   8.2x faster
3.60 GHz Intel Xeon (Linux)                       2.2x faster   6.2x faster
2.53 GHz Intel Core2 Duo E7200 (Linux)            1.5x faster   4.4x faster
2.66 GHz Intel Xeon 5150 (Mac Pro, OS X)          1.5x SLOWER   3.0x faster
2.33 GHz Intel Xeon E5435 (Linux)                 1.5x SLOWER   3.0x faster
2.33 GHz Intel Core2 T7600 (MacBook Pro, OS X)    1.4x SLOWER   3.0x faster
1.83 GHz Intel Core2 T5600 (Mac Mini, OS X)       none*         3.0x faster
* but yesterday I consistently saw 1.4x SLOWER.
DETAILS
2.2 GHz AMD Opteron 8214 HE (Linux)
amd64 (3x faster)
IndexByte4K 500000 3733 ns/op 1097.24 MB/s
IndexByte4M 500 4328042 ns/op 969.10 MB/s
IndexByte64M 50 67866160 ns/op 988.84 MB/s
IndexBytePortable4K 200000 11161 ns/op 366.99 MB/s
IndexBytePortable4M 100 11795880 ns/op 355.57 MB/s
IndexBytePortable64M 10 188675000 ns/op 355.68 MB/s
386 (8.2x faster)
IndexByte4K 500000 3734 ns/op 1096.95 MB/s
IndexByte4M 500 4209954 ns/op 996.28 MB/s
IndexByte64M 50 68031980 ns/op 986.43 MB/s
IndexBytePortable4K 50000 30670 ns/op 133.55 MB/s
IndexBytePortable4M 50 31868220 ns/op 131.61 MB/s
IndexBytePortable64M 2 508851500 ns/op 131.88 MB/s
3.60 GHz Intel Xeon (Linux)
amd64 (2.2x faster)
IndexByte4K 500000 4612 ns/op 888.12 MB/s
IndexByte4M 500 4835250 ns/op 867.44 MB/s
IndexByte64M 20 77388450 ns/op 867.17 MB/s
IndexBytePortable4K 200000 10306 ns/op 397.44 MB/s
IndexBytePortable4M 100 11201460 ns/op 374.44 MB/s
IndexBytePortable64M 10 179456800 ns/op 373.96 MB/s
386 (6.3x faster)
IndexByte4K 500000 4631 ns/op 884.47 MB/s
IndexByte4M 500 4846388 ns/op 865.45 MB/s
IndexByte64M 20 78691200 ns/op 852.81 MB/s
IndexBytePortable4K 100000 28989 ns/op 141.29 MB/s
IndexBytePortable4M 50 31183180 ns/op 134.51 MB/s
IndexBytePortable64M 5 498347200 ns/op 134.66 MB/s
2.53 GHz Intel Core2 Duo E7200 (Linux)
amd64 (1.5x faster)
IndexByte4K 500000 6502 ns/op 629.96 MB/s
IndexByte4M 500 6692208 ns/op 626.74 MB/s
IndexByte64M 10 107410400 ns/op 624.79 MB/s
IndexBytePortable4K 200000 9721 ns/op 421.36 MB/s
IndexBytePortable4M 100 10013680 ns/op 418.86 MB/s
IndexBytePortable64M 10 160460800 ns/op 418.23 MB/s
386 (4.4x faster)
IndexByte4K 500000 6505 ns/op 629.67 MB/s
IndexByte4M 500 6694078 ns/op 626.57 MB/s
IndexByte64M 10 107397600 ns/op 624.86 MB/s
IndexBytePortable4K 100000 28835 ns/op 142.05 MB/s
IndexBytePortable4M 50 29562680 ns/op 141.88 MB/s
IndexBytePortable64M 5 473221400 ns/op 141.81 MB/s
2.66 GHz Intel Xeon 5150 (Mac Pro, OS X)
amd64 (1.5x SLOWER)
IndexByte4K 200000 9290 ns/op 440.90 MB/s
IndexByte4M 200 9568925 ns/op 438.33 MB/s
IndexByte64M 10 154473600 ns/op 434.44 MB/s
IndexBytePortable4K 500000 6202 ns/op 660.43 MB/s
IndexBytePortable4M 500 6583614 ns/op 637.08 MB/s
IndexBytePortable64M 20 107166250 ns/op 626.21 MB/s
386 (3x faster)
IndexByte4K 200000 9301 ns/op 440.38 MB/s
IndexByte4M 200 9568025 ns/op 438.37 MB/s
IndexByte64M 10 154391000 ns/op 434.67 MB/s
IndexBytePortable4K 100000 27526 ns/op 148.80 MB/s
IndexBytePortable4M 100 28302490 ns/op 148.20 MB/s
IndexBytePortable64M 5 454170200 ns/op 147.76 MB/s
2.33 GHz Intel Xeon E5435 (Linux)
amd64 (1.5x SLOWER)
IndexByte4K 200000 10601 ns/op 386.38 MB/s
IndexByte4M 100 10827240 ns/op 387.38 MB/s
IndexByte64M 10 173175500 ns/op 387.52 MB/s
IndexBytePortable4K 500000 7082 ns/op 578.37 MB/s
IndexBytePortable4M 500 7391792 ns/op 567.43 MB/s
IndexBytePortable64M 20 122618550 ns/op 547.30 MB/s
386 (3x faster)
IndexByte4K 200000 11074 ns/op 369.88 MB/s
IndexByte4M 100 10902620 ns/op 384.71 MB/s
IndexByte64M 10 181292800 ns/op 370.17 MB/s
IndexBytePortable4K 50000 31725 ns/op 129.11 MB/s
IndexBytePortable4M 50 32564880 ns/op 128.80 MB/s
IndexBytePortable64M 2 545926000 ns/op 122.93 MB/s
2.33 GHz Intel Core2 T7600 (MacBook Pro, OS X)
amd64 (1.4x SLOWER)
IndexByte4K 200000 11120 ns/op 368.35 MB/s
IndexByte4M 100 11531950 ns/op 363.71 MB/s
IndexByte64M 10 184819000 ns/op 363.11 MB/s
IndexBytePortable4K 500000 7419 ns/op 552.10 MB/s
IndexBytePortable4M 200 8018710 ns/op 523.06 MB/s
IndexBytePortable64M 10 127614900 ns/op 525.87 MB/s
386 (3x faster)
IndexByte4K 200000 11114 ns/op 368.54 MB/s
IndexByte4M 100 11443530 ns/op 366.52 MB/s
IndexByte64M 10 185212000 ns/op 362.34 MB/s
IndexBytePortable4K 50000 32891 ns/op 124.53 MB/s
IndexBytePortable4M 50 33930580 ns/op 123.61 MB/s
IndexBytePortable64M 2 545400500 ns/op 123.05 MB/s
1.83 GHz Intel Core2 T5600 (Mac Mini, OS X)
amd64 (no difference)
IndexByte4K 200000 13497 ns/op 303.47 MB/s
IndexByte4M 100 13890650 ns/op 301.95 MB/s
IndexByte64M 5 222358000 ns/op 301.81 MB/s
IndexBytePortable4K 200000 13584 ns/op 301.53 MB/s
IndexBytePortable4M 100 13913280 ns/op 301.46 MB/s
IndexBytePortable64M 10 222572600 ns/op 301.51 MB/s
386 (3x faster)
IndexByte4K 200000 13565 ns/op 301.95 MB/s
IndexByte4M 100 13882640 ns/op 302.13 MB/s
IndexByte64M 5 221411600 ns/op 303.10 MB/s
IndexBytePortable4K 50000 39978 ns/op 102.46 MB/s
IndexBytePortable4M 50 41038160 ns/op 102.20 MB/s
IndexBytePortable64M 2 656362500 ns/op 102.24 MB/s
R=r
CC=golang-dev
https://golang.org/cl/166055
|
|
R=gri
CC=golang-dev
https://golang.org/cl/164088
|
|
add README explaining how to try the
web demos.
Fixes #339.
R=r
CC=barry.d.silverman, bss, vadim
https://golang.org/cl/165057
|
|
R=r
https://golang.org/cl/166060
|
|
R=rsc
https://golang.org/cl/165058
|
|
R=ken2
https://golang.org/cl/165059
|
|
Fixes #374.
R=r
https://golang.org/cl/166053
|
|
R=ken2
https://golang.org/cl/165055
|
|
R=rsc
https://golang.org/cl/166052
|
|
Modify iterFunc to take chan<- instead of just chan.
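The effect of narrowing the parameter to a send-only channel can be sketched as follows. The names iterate and Iter and the int element type are assumptions for illustration, not the signatures touched by this change.

```go
package main

import "fmt"

// iterate is a hypothetical producer in the spirit of iterFunc: its parameter
// is send-only, so the compiler rejects any attempt to receive from c here.
func iterate(items []int, c chan<- int) {
	for _, v := range items {
		c <- v
	}
	close(c) // closing is permitted on a send-only channel
}

// Iter returns a receive-only channel that yields the items in order.
func Iter(items []int) <-chan int {
	c := make(chan int)
	go iterate(items, c) // a bidirectional channel converts implicitly
	return c
}

func main() {
	for v := range Iter([]int{1, 2, 3}) {
		fmt.Println(v)
	}
}
```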
R=rsc, dsymonds1
CC=golang-dev, r
https://golang.org/cl/160064
|
|
Fixes bug 375.
R=rsc
https://golang.org/cl/165045
|
|
Fixes #176.
R=r
https://golang.org/cl/166044
|
|
R=jini, r
https://golang.org/cl/163092
|
|
R=r
https://golang.org/cl/164083
|
|
R=r
https://golang.org/cl/164086
|
|
* throw away dead code
* add mlookup counter
* add malloc counter
* set up for blocks with no pointers
Fixes #367.
R=r
https://golang.org/cl/165050
|
|
this package,
so make it a local method (_String()).
R=rsc
CC=golang-dev
https://golang.org/cl/165049
|
|
R=rsc
CC=golang-dev
https://golang.org/cl/165048
|
|
buffer allocation.
Use them in Copy and Copyn.
Speed up ReadFile by using ReadFrom and avoiding Copy altogether (a minor win).
R=rsc, gri
CC=golang-dev
https://golang.org/cl/166041
|
|
R=rsc
https://golang.org/cl/164095
|
|
- fixes a godoc issue (for instance, "godoc os EOF" now shows an entry)
R=r
CC=rsc
https://golang.org/cl/165042
|
|
Fixes #238.
R=ken2
https://golang.org/cl/163098
|
|
Fixes #245.
R=ken2
https://golang.org/cl/164094
|