Pavel Holoborodko

MaplePrimes Activity


These are replies submitted by Pavel Holoborodko

Thank you both for helpful comments.

Indeed, the issue here is that Maple uses a formula which is generically correct (except in some special cases). Perhaps Maple should allow the user to see these special cases, as Mathematica does:

http://mathematica.stackexchange.com/questions/121644/issue-with-symbolic-summation

Over the past year we have been updating our engine for dense matrices to use faster algorithms, multi-core optimizations, and anything else that contributes to high speed.

Recent comparison with Maple & Matlab on 500x500 dense matrices: http://goo.gl/f40c3n

We are up to 400 times faster (300 times on average).

Interestingly, Maple is not able to work with such large matrices in GUI mode - we had to use the CLI, otherwise it crashed. It also wasn't able to compute the eigenvalues of a complex 500x500 matrix at all (so only the real case is shown).
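
For reference, the timings are of the following kind - a minimal sketch using the toolbox's mp constructor; the matrix contents and the harness here are illustrative, not the actual benchmark script:

% Hedged sketch of a dense eigenvalue timing run (illustrative only)
mp.Digits(34);                       % quadruple precision
A = mp(rand(500));                   % real 500x500 test matrix
tic; e = eig(A); toc;                % real eigenvalue timing
Ac = mp(rand(500) + 1i*rand(500));   % complex 500x500 test matrix (assumes mp accepts complex input)
tic; ec = eig(Ac); toc;              % the case Maple could not complete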

The performance of the arbitrary-precision mode in our toolbox is now close to that of quadruple precision (at a comparable number of digits). Both are emulated in software, and the arbitrary-precision engine is much more optimized now.

@Axel Vogt 

Thank you for your attention to my open source project! 

As far as I know, Maple also uses MPFR ;). I think everything comes down to how it is used and how the algorithms on top of it are implemented.

We do not use Bailey's implementation since it is not compliant with IEEE 754-2008 and the code is more of academic value than production quality.

"So I would be interested in an interface for Maple to use it."

I would be more interested in Maplesoft purchasing a toolbox license and adapting it for their needs, rather than in developing such an interface myself.

Thank you,

Pavel.

@acer 

"memory used" or the result from kernelopts(bytesused) is the amount of memory that has been processed by Maple's garbage collector (memory manager). It is not the amount of allocated memory in use by the Maple kernel.

This is quite a relief - 4.65 GiB looked daunting.

 

"I see that your site at present shows results of a 100-digit comparison of sparse solvers for direct LU only."

That is because Maple has only one direct solver - LU. We have the whole set: LU, QR, and LDLT.
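
To make the terminology concrete, here is a hedged plain-MATLAB (double precision) sketch of the three direct factorizations in question; the toolbox's mp versions are assumed to follow analogous call signatures:

S = sprandsym(1000, 0.01) + 10*speye(1000);   % illustrative sparse symmetric test matrix
[L,U,P,Q] = lu(S);      % sparse LU with permutations, P*S*Q = L*U
R         = qr(S);      % sparse QR, R factor only
[L2,D,P2] = ldl(S);     % sparse LDL^T, P2'*S*P2 = L2*D*L2'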

"As you are no doubt aware there are often several options for such solvers (controlling degree of fill-in, or what have you). It can be tricky to find optimal values for such, on a problem by problem basis, which makes comparison amongst solvers tricky too."

Both solvers are compared as-is on the same matrix set - without any special tuning, just as a typical user would run them (the scripts are available for inspection).

As for fill-in reduction methods - they are a must-have for any sparse solver. We use the most common algorithm, COLAMD. It is hard to imagine Maple not using any such technique.
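
For readers unfamiliar with fill-in reduction, here is a hedged plain-MATLAB (double precision) illustration of the effect of a COLAMD column ordering; this is not the toolbox's internal code, just the general idea:

S = sprand(2000, 2000, 0.01) + speye(2000);   % illustrative sparse test matrix
p = colamd(S);                                % fill-reducing column permutation
[L1,U1] = lu(S);        % factor without reordering
[L2,U2] = lu(S(:,p));   % factor with COLAMD ordering
fprintf('nnz(L+U): %d without ordering, %d with COLAMD\n', nnz(L1)+nnz(U1), nnz(L2)+nnz(U2));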

Given Maple's long history, this part should, if anything, be better than in Advanpix.

Please also note that Maple was unable to solve the majority of the matrices in the test set - we show only those it managed to answer. It crashed (or ran for days) on the others.

"You have reported here the results just for dense solvers in the 34-digit case, where (of course) the dedicated quad-precision solver greatly outperforms the arbitrary precision solver. That should be no surprise, I think."

Here are the results for 100 digits:

mp.Digits(100);
A = mp(rand(256,256));
A = A*A';              % make A symmetric positive definite (needed for chol below)

tic; [L,U]=lu(A); toc;
Elapsed time is 2.61 seconds.

tic; R = qr(A); toc;
Elapsed time is 5.71 seconds.

tic; L = chol(A); toc;
Elapsed time is 1.31 seconds.

We are still about 7 times faster.

Note - my previous test used [Q,R] = qr(A). But actually we do not need to form Q explicitly to solve a linear system, so I omit it in this test, since we are comparing solvers, not full decompositions.
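
For reference, one standard way to solve a square system using only R is via the identity A'*A = R'*R; this is a hedged sketch of the idea, not necessarily what the toolbox does internally:

mp.Digits(100);
A = mp(rand(256,256)); A = A*A';
b = mp(rand(256,1));              % illustrative right-hand side
R = qr(A);                        % R factor only, as in the timing above; Q is never formed
x = R \ (R' \ (A'*b));            % two triangular solves, since R'*R = A'*A
norm(A*x - b)                     % residual check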

***

I will do an iterative-solver comparison in the near future.

Pavel


As promised, I have added a comparison of Advanpix vs. Maple direct sparse solvers for the case of 100 decimal digits:

http://www.advanpix.com/2013/10/03/advanpix-vs-maple-sparse-solvers-comparison/

First of all, my toolbox is designed to operate with any desired precision. However, when the user asks for 34 digits (quadruple precision), the toolbox switches to routines specifically optimized for that case.

Both engines - arbitrary and quadruple precision - are implemented in software (no hardware support exists for quadruple precision). The comparison is legitimate, as Maple also uses software floats.
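
In other words, the same user code runs against both engines; which one is used depends only on the precision setting. A minimal sketch, reusing the calls from the timings on this page:

mp.Digits(34);                    % quadruple precision: optimized code path
A = mp(rand(256,256)); A = A*A';
tic; L = chol(A); toc;
mp.Digits(100);                   % arbitrary precision: generic software engine
A = mp(rand(256,256)); A = A*A';
tic; L = chol(A); toc;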

 

@Carl Love

Thank you very much for the script. I have run it on my computer and compared it with the toolbox. Here are the quick results.

 

Maple. 

LU:      memory used=2.31GiB, alloc change=48.00MiB, cpu time=20.80s, real time=19.50s

QR:     memory used=4.65GiB, alloc change=0 bytes, cpu time=43.76s, real time=40.94s

CHOL:  memory used=1.19GiB, alloc change=0 bytes, cpu time=10.16s, real time=9.68s

 

Advanpix.  

The general linear solver in the Advanpix toolbox (the "\" operator) chooses the optimal decomposition for the supplied matrix automatically.
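
For example (a hedged sketch; the right-hand side here is illustrative):

mp.Digits(34);
A = mp(rand(256,256)); A = A*A';   % symmetric positive definite test matrix
b = mp(rand(256,1));               % illustrative right-hand side
x = A \ b;                         % "\" selects a suitable factorization internally
norm(A*x - b)                      % residual check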

To compare with Maple's solvers, I show timings for the individual decompositions - the core of the linear solver.

mp.Digits(34);

A = mp(rand(256,256));
A = A*A';              % make A symmetric positive definite (needed for chol below)

tic; [L,U]=lu(A); toc;
Elapsed time is 1.13 seconds.

tic; [Q,R]=qr(A); toc;
Elapsed time is 1.45 seconds.

tic; L = chol(A); toc;
Elapsed time is 0.19 seconds.

***

We still beat Maple even for dense matrices. What is most surprising to me is the amount of memory Maple used in each decomposition.

A 256x256 matrix of 128-bit elements needs about 1 MB of storage. It is difficult to imagine why 4.65 GiB would be needed for a QR decomposition.
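
The storage estimate is simple arithmetic (ignoring container overhead):

bytes = 256 * 256 * 16     % 16 bytes per 128-bit element -> 1,048,576 bytes, i.e. about 1 MB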

The Advanpix toolbox used around 4 MB of RAM for the QR decomposition of a matrix of the same size and precision.

 

I will do a more detailed comparison for dense matrices soon - if it is of any interest.

Pavel.

I am the creator of the Advanpix toolbox.

The goal of the comparison was to measure timings of direct solvers for sparse matrices (a comparison of iterative solvers will be added soon).

I have been an avid Maple user for 15+ years - and I found some very disappointing facts during the tests:

 

1. Maple has only one direct solver - LU (regardless of precision).

Other respectable systems (including Octave) usually also provide LDL^T and QR. We have all three as well.

 

2. Maple couldn't solve many of the test problems at all (see the list in the referenced post).

It ran for days without completing the task...

 

3. Previous versions of Maple (< 17) were not capable of solving even the reduced set of test problems (they crashed, quickly consumed RAM above the limit, etc.).

 

@acer, @Carl Love

Advanpix uses software 128-bit floats compliant with the IEEE 754-2008 standard (there is no hardware support for 128-bit floats).

Quadruple precision corresponds to 34 significant decimal digits - that is why we use Digits:=34 in Maple.
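
The 34-digit figure follows directly from the binary128 significand width:

% IEEE 754-2008 binary128 carries a 113-bit significand (112 stored bits + 1 implicit bit)
113 * log10(2)     % ~34.0 significant decimal digits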

We are preparing a comparison of arbitrary precision in Advanpix vs. arbitrary precision in Maple - Advanpix is still much faster and more stable (it is able to solve all test matrices, while Maple is not).

We will publish results in a few days. 

 

***

I would appreciate any further comments. 
