[GAP] Effect of using malloc for temporary allocations in GAP

Dmitrii Pasechnik dmitrii.pasechnik at cs.ox.ac.uk
Tue Jan 3 16:49:27 GMT 2017


On Tue, Jan 03, 2017 at 03:45:44PM +0100, Max Horn wrote:
> > On 03 Jan 2017, at 14:20, Dmitrii Pasechnik <dmitrii.pasechnik at cs.ox.ac.uk> wrote:
> > 
> > On Tue, Jan 03, 2017 at 10:36:08AM +0000, Christopher Jefferson wrote:
> >> A couple of points (replying in my phone, sorry for bad formatting).
> >> 
> >> The use of things like sbrk is I believe just because this code is old, and originated before you could rely on a good mmap and friends.
> >> 
> >> Other processes shouldn't be a problem -- all modern OSes give every process a separate memory space. It could obviously be a problem for things like sage, which want to load into the same process as gap.
> > 
> > Aren't also the heap and a virtual memory region obtained by
> > mmap separate even within a process?
> 
> This question is "not even wrong" -- it simply doesn't make sense. You seem
> to presuppose that "the heap" is a single memory region, which it is not.
> The memory manager (e.g. libc's malloc) may and will allocate additional
> pages to satisfy allocation requests.
> 
> To illustrate this by a hypothetical example: Suppose "the heap" so far
> covers addresses 0x0 till 0x0FFFFF, and now we mmap some data, and it gets
> mapped to addresses 0x10000 till 0x1FFFFF. Now somebody tries to malloc()
> some data which does not fit in the existing hap anymore. Now libc extends
> the heap, and quite possibly the new (additional) heap region will end up
> right after our mmap'ed data. So we cannot grow that region contiguously
> anymore.

This explanation appears to contradict the modern picture of contiguous
virtual memory regions, which may well consist of non-contiguous
physical pages (that's what e.g. vmalloc gives one in the Linux kernel).
Although probably I don't understand something here.

> >> With regards GASMAN. It is very fast, and I believe any replacement would be much slower. One of the biggest problems with HPC gap is that boehm is much slower.
> >> 
> >> Of course gasman has issues -- it can't support multithreading (hence can't be in hpcgap), is moving, requires continuous address space. But any replacement will, for gap's current usage, be I believe slower. Also much of gaps internals relies on features of gasman other allocators may not provide (like cheap access to the exact size of any allocation for example).
> > 
> > Why would GASMAN be problematic with multithreading? Surely, it will
> > need a mechanism to pause all the other threads while collecting
> > garbage, but apart from this, is there anything fundamental that
> > makes it hard to port?
> 
> Pausing all threads to do GC is problematic on its own; for starters, all
> memory allocations (which currently are blindly fast with GASMAN, and GAP
> code kind of relies on that) would need to take a lock, which is expensive.

Why would allocations need a lock?
Allocations may be queued and processed one by one.

> But let's ignore that for the moment (as perhaps there is a clever scheme to
> avoid or reduce that overhead).
> 
> Then we still would have to teach GASMAN to inspect the stack and CPU
> register contents (!) of all other running threads, as those could contain
> references to objects. This is a very fundamental and major change. And
> non-portable, too.  It makes up a considerable part of the complexity of
> Boehm GC, I think.
>
Oh, right, registers, this might be a pain indeed. One begins to understand why
reference counting might be easier to deal with...

[...]

> I guess e.g. Reimer would fit the bill, but he has a job already, and
> AFAIK we have no money to pay him or anybody else to work on a new GC
> anyway...

Although DFG forked out a huge sum for computer algebra,
and some of it is reaching GAP-related German universities.
http://www.uni-kl.de/aktuelles/news/news/detail/News/grosser-erfolg-fuer-die-tu-kaiserslautern-neuer-dfg-sonderforschungsbereich-in-der-mathematik/

Best,
Dima



More information about the Gap mailing list