[gpfsug-discuss] pagepool shrink doesn't release all memory

Aaron Knister aaron.s.knister at nasa.gov
Sun Feb 25 16:45:10 GMT 2018


Hmm...interesting. It sure seems to try :)

The pmap command was this:

pmap $(pidof mmfsd) | sort -n -k3 | tail
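
For reference, here is a rough sketch of how one might snapshot the mappings around a pagepool change (illustrative only; the pagepool size and the sleep are arbitrary, not what was actually run):

PID=$(pidof mmfsd)
pmap $PID | sort -n -k3 | tail > /tmp/pmap.before    # largest mappings before the change
tschpool 32G                                         # example size, not a recommendation
sleep 5                                              # give mmfsd a moment to settle
pmap $PID | sort -n -k3 | tail > /tmp/pmap.after     # largest mappings after the change
diff /tmp/pmap.before /tmp/pmap.after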

-Aaron

On 2/23/18 9:35 AM, IBM Spectrum Scale wrote:
> AFAIK you can increase the pagepool size dynamically, but you cannot
> shrink it dynamically. To shrink it you must restart the GPFS daemon.
> Also, could you please provide the actual pmap commands you executed?
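> 
> In practice that would look something like the following (a rough sketch; the exact mmchconfig options should be checked against the documentation for your release):
> 
> mmchconfig pagepool=32G -i    # an increase can take effect immediately on the running daemon
> mmchconfig pagepool=4G        # a smaller value is recorded in the configuration,
> mmshutdown -N <node>          # but only takes effect once the daemon on that node
> mmstartup -N <node>           # has been restarted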
> 
> Regards, The Spectrum Scale (GPFS) team
> 
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of Spectrum
> Scale (GPFS), then please post it to the public IBM developerWorks Forum
> at
> https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.
> 
> 
> If your query concerns a potential software error in Spectrum Scale
> (GPFS) and you have an IBM software maintenance contract, please contact
> 1-800-237-5511 in the United States or your local IBM Service Center
> in other countries.
> 
> The forum is informally monitored as time permits and should not be used 
> for priority messages to the Spectrum Scale (GPFS) team.
> 
> 
> 
> From: Aaron Knister <aaron.s.knister at nasa.gov>
> To: <gpfsug-discuss at spectrumscale.org>
> Date: 02/22/2018 10:30 PM
> Subject: Re: [gpfsug-discuss] pagepool shrink doesn't release all memory
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------------------------------------------------
> 
> 
> 
> This is also interesting (although I don't know what it really means).
> Looking at pmap output from a run against mmfsd, I can see what happens after each step:
> 
> # baseline
> 00007fffe4639000  59164K      0K      0K      0K      0K ---p [anon]
> 00007fffd837e000  61960K      0K      0K      0K      0K ---p [anon]
> 0000020000000000 1048576K 1048576K 1048576K 1048576K      0K rwxp [anon]
> Total:           1613580K 1191020K 1189650K 1171836K      0K
> 
> # tschpool 64G
> 00007fffe4639000  59164K      0K      0K      0K      0K ---p [anon]
> 00007fffd837e000  61960K      0K      0K      0K      0K ---p [anon]
> 0000020000000000 67108864K 67108864K 67108864K 67108864K  0K rwxp [anon]
> Total:           67706636K 67284108K 67282625K 67264920K      0K
> 
> # tschpool 1G
> 00007fffe4639000  59164K      0K      0K      0K      0K ---p [anon]
> 00007fffd837e000  61960K      0K      0K      0K      0K ---p [anon]
> 0000020001400000 139264K 139264K 139264K 139264K      0K rwxp [anon]
> 0000020fc9400000 897024K 897024K 897024K 897024K      0K rwxp [anon]
> 0000020009c00000 66052096K      0K      0K      0K      0K rwxp [anon]
> Total:           67706636K 1223820K 1222451K 1204632K      0K
> 
> Even though mmfsd has that 64G chunk allocated, none of it is *used*. I
> wonder why Linux still seems to be accounting it as allocated.
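> 
> If anyone wants to poke at the same question, one way to look at that region directly is via /proc (a rough sketch; the address below is just the one from the pmap output above and will differ between runs):
> 
> PID=$(pidof mmfsd)
> grep -E 'VmSize|VmRSS' /proc/$PID/status           # virtual vs resident size as the kernel accounts it
> grep -A 15 '^0000020009c00000' /proc/$PID/smaps    # per-mapping Size/Rss counters for the big anonymous region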
> 
> -Aaron
> 
> On 2/22/18 10:17 PM, Aaron Knister wrote:
>  > I've been exploring the idea for a while of writing a SLURM SPANK plugin
>  > to allow users to dynamically change the pagepool size on a node. Every
>  > now and then we have some users who would benefit significantly from a
>  > much larger pagepool on compute nodes, but by default we keep it on the
>  > smaller side to make as much physical memory as possible available to batch work.
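>  >
>  > The plugin itself would be C against the SPANK API, but the node-side logic it drives would boil down to something like this (an illustrative prolog/epilog-style sketch, not actual plugin code; REQUESTED_PAGEPOOL is a made-up job flag):
>  >
>  > # prolog: grow the pagepool for jobs that ask for it
>  > [ "$REQUESTED_PAGEPOOL" = "large" ] && tschpool 64G
>  > # epilog: drop back to the small default so batch work gets the memory back
>  > tschpool 1G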
>  >
>  > In testing, though, it seems as though reducing the pagepool doesn't
>  > quite release all of the memory. I don't really understand it, because
>  > I've never before seen memory that was previously resident become
>  > non-resident yet still keep its virtual memory allocation.
>  >
>  > Here's what I mean. Let's take a node with 128G and a 1G pagepool.
>  >
>  > If I do the following to simulate what might happen as various jobs
>  > tweak the pagepool:
>  >
>  > - tschpool 64G
>  > - tschpool 1G
>  > - tschpool 32G
>  > - tschpool 1G
>  > - tschpool 32G
>  >
>  > I end up with this:
>  >
>  > mmfsd thinks there's 32G resident but 64G virt
>  > # ps -o vsz,rss,comm -p 24397
>  >     VSZ   RSS COMMAND
>  > 67589400 33723236 mmfsd
>  >
>  > However, Linux thinks there's ~100G used:
>  >
>  > # free -g
>  >              total       used       free     shared    buffers     cached
>  > Mem:           125        100         25          0          0          0
>  > -/+ buffers/cache:         98         26
>  > Swap:            7          0          7
>  >
>  > I can jump back and forth between 1G and 32G *after* allocating the 64G
>  > pagepool, and the overall amount of memory in use doesn't balloon, but I
>  > can't seem to shed that original 64G.
>  >
>  > I don't understand what's going on... :) Any ideas? This is with Scale
>  > 4.2.3.6.
>  >
>  > -Aaron
>  >
> 
> -- 
> Aaron Knister
> NASA Center for Climate Simulation (Code 606.2)
> Goddard Space Flight Center
> (301) 286-2776
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> 

-- 
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776


