[GAP] Effect of using malloc for temporary allocations in GAP

Max Horn Max.Horn at math.uni-giessen.de
Tue Jan 3 14:45:44 GMT 2017


Hi Dima,

> On 03 Jan 2017, at 14:20, Dmitrii Pasechnik <dmitrii.pasechnik at cs.ox.ac.uk> wrote:
> 
> On Tue, Jan 03, 2017 at 10:36:08AM +0000, Christopher Jefferson wrote:
>> A couple of points (replying in my phone, sorry for bad formatting).
>> 
>> The use of things like sbrk is I believe just because this code is old, and originated before you could rely on a good mmap and friends.
>> 
>> Other processes shouldn't be a problem -- all modern OSes give every process a separate memory space. It could obviously be a problem for things like sage, which want to load into the same process as gap.
> 
> Aren't also the heap and a virtual memory region obtained by
> mmap separate even within a process?

This question is "not even wrong" -- it simply doesn't make sense. You seem
to presuppose that "the heap" is a single contiguous memory region, which it
is not. The memory manager (e.g. libc's malloc) may and will map additional,
possibly non-contiguous regions to satisfy allocation requests.

To illustrate this with a hypothetical example: suppose "the heap" so far
covers addresses 0x000000 to 0x0FFFFF, and now we mmap some data, which gets
mapped to addresses 0x100000 to 0x1FFFFF. Next, somebody tries to malloc()
some data which does not fit into the existing heap anymore. libc then
extends the heap, and quite possibly the new (additional) heap region ends up
right after our mmap'ed data. So we cannot grow the original heap region
contiguously anymore.
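
To make this concrete, here is a minimal C sketch (assuming a Linux-like
system where sbrk and anonymous mmap are available) that prints the current
program break next to the address of a fresh mapping. Where the mapping ends
up is entirely up to the kernel -- which is precisely the point:

    #define _DEFAULT_SOURCE
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/mman.h>

    int main(void) {
        /* Current end of the data segment, i.e. "the heap" in the
           classical sense. */
        void *brk_before = sbrk(0);

        /* Ask the kernel for an anonymous 1 MB mapping; we do not control
           where it lands. */
        void *map = mmap(NULL, 1 << 20, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (map == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        printf("program break:  %p\n", brk_before);
        printf("anonymous mmap: %p\n", map);

        /* If the mapping lands just above the break, the classical heap can
           no longer grow contiguously past it -- the scenario sketched above. */
        munmap(map, 1 << 20);
        return 0;
    }

(On top of that, a modern malloc will itself use mmap for large requests, so
"the heap" is in general a collection of regions, not one block.)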

You may want to read any introduction to how modern multi-process operating
systems handle memory management before asking further questions and/or
making suggestions on this matter.


>> With regards GASMAN. It is very fast, and I believe any replacement would be much slower. One of the biggest problems with HPC gap is that boehm is much slower.
>> 
>> Of course gasman has issues -- it can't support multithreading (hence can't be in hpcgap), is moving, requires continuous address space. But any replacement will, for gap's current usage, be I believe slower. Also much of gaps internals relies on features of gasman other allocators may not provide (like cheap access to the exact size of any allocation for example).
> 
> Why would GASMAN be problematic with multithreading? Surely, it will
> need a mechanism to pause all the other threads while collecting
> garbage, but apart from this, is there anything fundamental that
> makes it hard to port?

Pausing all threads to do GC is problematic on its own; for starters, all
memory allocations (which are currently blindingly fast with GASMAN, and GAP
code kind of relies on that) would need to take a lock, which is expensive.
But let's ignore that for the moment (perhaps there is a clever scheme to
avoid or reduce that overhead).
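
To illustrate the allocation-cost issue (this is only a sketch, not GASMAN's
actual code): a GASMAN-style allocation is essentially a pointer bump, and
the naive way to make that thread-safe puts a mutex around every single
allocation:

    #include <pthread.h>
    #include <stddef.h>

    /* Bump-pointer allocation, roughly why GASMAN allocations are so cheap.
       Assume alloc_ptr and alloc_end were set up when the workspace was
       reserved. */
    static char *alloc_ptr, *alloc_end;

    void *bump_alloc(size_t n)
    {
        if (alloc_ptr + n > alloc_end)
            return NULL;              /* would trigger a garbage collection */
        void *result = alloc_ptr;
        alloc_ptr += n;
        return result;
    }

    /* Naive thread-safe variant: every allocation now pays for lock traffic. */
    static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

    void *locked_alloc(size_t n)
    {
        pthread_mutex_lock(&alloc_lock);
        void *result = bump_alloc(n);
        pthread_mutex_unlock(&alloc_lock);
        return result;
    }

(Per-thread allocation buffers are the usual way to avoid this, but that
already changes the allocator design quite a bit.)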

Even then, we would still have to teach GASMAN to inspect the stacks and CPU
register contents (!) of all other running threads, as those could contain
references to objects. This is a fundamental and major change, and a
non-portable one, too; it makes up a considerable part of the complexity of
the Boehm GC, I think.
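
For the current thread, such a conservative scan is not too bad -- here is a
rough sketch of the idea (MarkBagIfValid is a hypothetical hook, not an
actual GASMAN function). The hard part is doing the same for all the *other*
threads, which means suspending them and getting at their register state
through platform-specific APIs:

    #include <setjmp.h>
    #include <stdint.h>

    /* Hypothetical collector hook: mark the object, if any, that 'candidate'
       points into. */
    extern void MarkBagIfValid(uintptr_t candidate);

    /* Conservatively scan the current thread's stack; 'stack_bottom' would
       have been recorded when the thread started. */
    void scan_own_stack(uintptr_t *stack_bottom)
    {
        /* setjmp forces callee-saved registers into a stack-allocated buffer,
           so the scan below also sees references that only live in registers. */
        jmp_buf regs;
        setjmp(regs);

        uintptr_t marker;                     /* approximate current stack top */
        uintptr_t *top = (uintptr_t *)&marker;
        uintptr_t *lo  = top < stack_bottom ? top : stack_bottom;
        uintptr_t *hi  = top < stack_bottom ? stack_bottom : top;

        for (uintptr_t *p = lo; p < hi; p++)
            MarkBagIfValid(*p);               /* every word is a possible reference */
    }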


Lots of people have invested, and still invest, tons of time researching
concurrent, multi-threaded garbage collectors. There are lots of them out
there by now (e.g. for Java, Go, Ruby, ...), and some are even quite good, I
think. But writing one, or "integrating" a new one, is still a very difficult
matter, and there is no perfect solution (and likely never will be). For
example, a GC that works very well for one particular workload may suck at
another. This is what Chris was getting at (I think) when he wrote "any
replacement would be much slower". While I don't actually agree with that
statement in this generality, it is certainly true that one cannot pick some
other GC and hope to just "slap it in": First off, it would very likely
require some serious adjustments, just as you cannot generally take the
engine of a car from brand X and transplant it into a car of brand Y without
some *major* engineering. Secondly, it would not be tuned to the GAP
workload, and hence would very likely be much slower than GASMAN without some
heavy tuning (and there is no guarantee it could ever match GASMAN). In fact,
GAP code in general keeps getting heavily optimized to work well with GASMAN,
so this is a feedback loop, and it will be very difficult to break.

That said, I still believe a multi-threaded concurrent GC that consistently
matches or even outperforms GASMAN on a multi-core system is possible. But
this would be a major project. And of course my belief might be wrong,
which you'd only know for sure after trying, at which point you may have
sunk months of work... And who is going to invest that work? It's not a
mathematical research project, after all. Perhaps somebody in computer
science can sell this as a research project and get a grant for it? But a
priori, there is nothing really "novel" here, "just" implementing and then
tuning existing technologies.  But a person or group doing that would need
to know a lot about GC in general, and about the needs and internals of GAP
(and GASMAN, to make it "compatible enough"). Finding and paying such a
person (without having a company like Google snatch them up first) is very
hard.

I guess e.g. Reimer would fit the bill, but he has a job already, and
AFAIK we have no money to pay him or anybody else to work on a new GC
anyway...


Bye,
Max
-- 
Prof. Dr. Max Horn
AG Algebra
Mathematisches Institut
Justus-Liebig-Universität Gießen
Arndtstraße 2
D-35392 Gießen
Tel: (+49) 641 99-32044
Fax: (+49) 641 99-32049
E-Mail: max.horn at math.uni-giessen.de



