graphite-dev team mailing list archive

Re: [Question #191807]: what about using mmap?

 

Question #191807 on Graphite changed:
https://answers.launchpad.net/graphite/+question/191807

    Status: Answered => Open

Amos Shapira is still having a problem:
mmap works on virtual address space.

You can map the entire file into virtual memory. That doesn't mean you
need that much physical memory, only enough virtual ADDRESS SPACE (i.e.
the pointer size) to address that file size. On 32-bit systems this is
theoretically 4 GiB, but I think Linux limits user space to around 3 GiB
since it has to reserve part of the VIRTUAL address space for the kernel.
On 64-bit systems the theoretical limit is 2^64 bytes (16 EiB); I assume
one bit goes to kernel space again, so make it 2^63 (8 EiB).

If you want to be more economical then you can map only parts of the
file. I'll have to delve deeper into Whisper code in order to know how
relevant this is but you can, for instance, map just the first page (the
one with the file header) and the last page into memory.

Whichever way you go, the kernel only needs to allocate physical
memory pages for the pages you actually access (read or write).
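
To make the "map just the first and last page" idea concrete, here is a minimal Python sketch. It is not whisper code: the helper name map_first_and_last_page and the demo file are made up for illustration, and it assumes a non-empty file. The one hard constraint it respects is that mmap offsets must be a multiple of mmap.ALLOCATIONGRANULARITY.

```python
import mmap
import os
import tempfile


def map_first_and_last_page(path):
    """Return two read-only mappings: the first and the last page of a file.

    Hypothetical helper for illustration; assumes the file is not empty.
    """
    size = os.path.getsize(path)
    gran = mmap.ALLOCATIONGRANULARITY  # offsets must be a multiple of this
    with open(path, "rb") as f:
        first = mmap.mmap(f.fileno(), min(gran, size),
                          access=mmap.ACCESS_READ, offset=0)
        # Round the offset of the final byte down to a granularity boundary.
        last_offset = ((size - 1) // gran) * gran
        last = mmap.mmap(f.fileno(), size - last_offset,
                         access=mmap.ACCESS_READ, offset=last_offset)
    return first, last


# Demo on a sparse file with the same size as our .wsp files.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"HEAD")
    tmp.truncate(8242600)
    tmp.seek(8242600 - 4)
    tmp.write(b"TAIL")
    demo_path = tmp.name

first, last = map_first_and_last_page(demo_path)
head = first[:4]
tail = last[len(last) - 4:]
mapped_bytes = len(first) + len(last)
first.close()
last.close()
os.unlink(demo_path)
print(head, tail, mapped_bytes)
```

Only the two mapped windows count against the process's virtual address space here; the rest of the file is never mapped at all.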

I just did a rough back-of-the-envelope calculation of file sizes -
our current schema configuration is:

#5 second intervals for a week, then 1 min intervals for 13 months
retentions = 5:120960,60:565920

I assume that 120960+565920 = 686880 is the total number of datapoint entries in the file.
All our files have exactly the same size: 8242600 bytes.
So I assume that 8242600 / (120960+565920) = 12.00005... means that each entry takes 12 bytes (and the remaining 40 bytes are taken by the file header).
This means that a file holding 5-second data for a week plus 1-minute data for 13 months (I think this is a good example use case) takes just under 8 MiB.
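
The same arithmetic as a small Python check. It only uses the numbers quoted above; the split into "bytes per point" and "header bytes" is my inference from the file size, not something read from the whisper source.

```python
# (seconds_per_point, points) pairs from: retentions = 5:120960,60:565920
retentions = [(5, 120960), (60, 565920)]
file_size = 8242600  # bytes, identical for all our .wsp files

total_points = sum(points for _, points in retentions)

# Integer division gives the per-point size; the remainder is what is
# left over for the header (assuming a fixed-size record per datapoint).
bytes_per_point, header_bytes = divmod(file_size, total_points)

# How many days each archive covers.
coverage_days = [step * points / 86400 for step, points in retentions]

print(total_points, bytes_per_point, header_bytes, coverage_days)
```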
We currently have 15461 files on our (new) system, ~96 per server, let's round this to 100 files per server.

8 MB * 100 = 800 MB per server.
We track about 170 servers right now, so it's ~120 GB to map into VIRTUAL memory (using the real count of 15461 files rather than the rounded 100 per server). That's ~37 bits out of the 63 bits of available address space, i.e. you can fit about 2^(63-37) = 2^26 = ~67 million times this size into a 64-bit machine's virtual memory.
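
A quick Python check of the address-space figures, using the real file count. The 63 usable bits are my assumption from earlier in this message, not a measured value.

```python
file_size = 8242600   # bytes per .wsp file
file_count = 15461    # files currently on our system

total_bytes = file_size * file_count
total_gib = total_bytes / 2**30

# Smallest number of address bits that can cover total_bytes.
bits_needed = (total_bytes - 1).bit_length()

usable_bits = 63  # ASSUMPTION: one bit of a 64-bit pointer goes to the kernel
headroom = 2 ** (usable_bits - bits_needed)

print(round(total_gib, 1), bits_needed, headroom)
```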

This might sound like a lot but remember this is only VIRTUAL address space. The real memory you need depends on the access patterns.
Files are mapped into memory in PAGESIZE units ("getconf PAGESIZE" from the shell). On the x86_64 KVM guest I run Graphite on this is currently 4 KiB, so for instance if you need to access only the first and last pages of each file, this is cut down by a factor of about 1024 to 2 * 4 KiB = 8 KiB per file = ~120 MB for our case (170 servers, ~100 files per server).
THESE are the physical memory requirements.

Let's say that we want Graphite to use no more than 2 GB out of the 4 GB
of RAM we have on our current system - you can fit ~17 TIMES more data
into the physical memory of such a system (all assumptions considered).
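
The physical-memory estimate can be reproduced the same way. This sketch assumes exactly two resident pages per file (first and last), which is the optimistic case described above and not a measurement of what whisper really touches.

```python
import mmap

page_size = 4096       # what "getconf PAGESIZE" reports on our guest
local_page_size = mmap.PAGESIZE  # the value on whatever machine runs this
pages_per_file = 2     # ASSUMPTION: only the first and last page are touched
file_count = 15461

resident_bytes = page_size * pages_per_file * file_count
resident_mib = resident_bytes / 2**20

ram_budget = 2 * 2**30  # the 2 GB we are willing to give Graphite
headroom = ram_budget / resident_bytes

print(round(resident_mib, 1), round(headroom, 1))
```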

These calculations do NOT take into account what whisper actually does.
I'll have to look at the code to support/dispute them further, but I
hope this gives the general direction of where I'm going.

Additionally, since these pages are backed by the file on the file
system, the kernel doesn't have to page them out to swap space when a
page has to be reclaimed. It just flushes the page to disk (if this
hasn't happened already; it usually happens regularly, every 30 seconds
or so) so you save on that too, both in I/O and in swap space.
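
A small Python sketch of what "backed by the file" means in practice: a write through a shared mapping ends up in the file itself, and flush() asks for the dirty pages to be written back right away instead of waiting for the kernel's periodic writeback. The temporary file is only there to make the example runnable.

```python
import mmap
import os
import tempfile

# Create a one-page file to map.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * mmap.PAGESIZE)
    path = tmp.name

with open(path, "r+b") as f:
    # ACCESS_WRITE is a write-through mapping: changes go to the file,
    # not to anonymous memory that would need swap space.
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE)
    m[0:5] = b"hello"
    m.flush()  # write dirty pages back now rather than later
    m.close()

# Read the file back through ordinary I/O to confirm the change landed.
with open(path, "rb") as f:
    on_disk = f.read(5)
os.unlink(path)
print(on_disk)
```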

Besides, the read/write requests you make are completed by the kernel
by bringing the files' pages into kernel memory (the page cache) anyway,
so if you access more data than the page cache can hold then you'll have
I/O issues either way (and your application's read/write buffers take
additional memory).

I got permission from my workplace to spend a day on this on Monday, so I plan to:
1. Read the whisper code and see how it accesses files.
2. Use strace to see what access pattern the .wsp files get.
3. Try to demonstrate use of Python's mmap calls in the whisper library (again, I have never programmed in Python before so I might be slow with that).
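
As a starting point for step 3, here is a rough sketch of updating one datapoint in place through a mapping. Everything about the layout is an assumption on my side until I have read the whisper code: a 40-byte header (8242600 - 12 * 686880), followed by 12-byte records that I guess to be a 4-byte timestamp plus an 8-byte double in network byte order. The helper names write_point and read_point are made up.

```python
import mmap
import os
import struct
import tempfile

HEADER_BYTES = 40      # ASSUMPTION: inferred from the file size, not from whisper
POINT_FORMAT = "!Ld"   # ASSUMPTION: uint32 timestamp + float64 value
POINT_SIZE = struct.calcsize(POINT_FORMAT)  # 12 bytes


def write_point(m, index, timestamp, value):
    """Store one datapoint directly in the mapped file (hypothetical helper)."""
    struct.pack_into(POINT_FORMAT, m, HEADER_BYTES + index * POINT_SIZE,
                     timestamp, value)


def read_point(m, index):
    """Load one datapoint directly from the mapped file (hypothetical helper)."""
    return struct.unpack_from(POINT_FORMAT, m, HEADER_BYTES + index * POINT_SIZE)


# Demo on a sparse file shaped like one of our .wsp files.
total_points = 120960 + 565920
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.truncate(HEADER_BYTES + total_points * POINT_SIZE)
    demo_path = tmp.name

with open(demo_path, "r+b") as f:
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE)
    write_point(m, total_points - 1, 1332000000, 0.75)
    last_point = read_point(m, total_points - 1)
    mapped_size = len(m)
    m.close()
os.unlink(demo_path)
print(last_point, mapped_size)
```

Even though the whole file is mapped, only the page holding the last record is touched here, so that is the only page the kernel has to bring into physical memory.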

-- 
You received this question notification because you are a member of
graphite-dev, which is an answer contact for Graphite.