[gpfsug-discuss] reserving memory for GPFS process

Brian Marshall mimarsh2 at vt.edu
Tue Dec 20 17:15:23 GMT 2016


Skyrim = Slurm.  Mobile autocorrect shenanigans.

Brian

On Dec 20, 2016 12:07 PM, "Brian Marshall" <mimarsh2 at vt.edu> wrote:

> We use Adaptive Computing's Moab / Torque right now but are thinking about
> going to Skyrim
>
> Brian
>
> On Dec 20, 2016 11:38 AM, "Buterbaugh, Kevin L" <
> Kevin.Buterbaugh at vanderbilt.edu> wrote:
>
>> Hi Brian,
>>
>> It would be helpful to know what scheduling software, if any, you use.
>>
>> We were a PBS / Moab shop for a number of years but switched to SLURM two
>> years ago.  With both you can configure the maximum amount of memory
>> available to all jobs on a node.  So we simply “reserve” however much we
>> need for GPFS and other “system” processes.
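>>
>> A minimal sketch of what that reservation might look like in slurm.conf
>> (node name and sizes are hypothetical, not actual values):
>>
>>     # slurm.conf (sketch): schedule memory, and hold back ~8 GB on a
>>     # 256 GB node for GPFS (pagepool, daemons) and the OS
>>     SelectType=select/cons_res
>>     SelectTypeParameters=CR_Core_Memory
>>     NodeName=compute001 CPUs=32 RealMemory=257000 MemSpecLimit=8192
>>     # RealMemory and MemSpecLimit are both in megabytes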
>>
>> I can tell you that SLURM is *much* more efficient than PBS / Moab ever
>> dreamed of being at killing processes as soon as they exceed the amount
>> of memory they’ve requested.
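>>
>> One common way to get that behavior is SLURM’s cgroup task plugin, which
>> has the kernel contain / OOM-kill a job the moment it goes over its
>> request.  A sketch (assuming task/cgroup is in use; not a tuning
>> recommendation):
>>
>>     # slurm.conf
>>     TaskPlugin=task/cgroup
>>
>>     # cgroup.conf
>>     ConstrainRAMSpace=yes
>>     ConstrainSwapSpace=yes
>>     AllowedRAMSpace=100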
>>
>> Kevin
>>
>> On Dec 20, 2016, at 10:27 AM, Skylar Thompson <skylar2 at u.washington.edu>
>> wrote:
>>
We're a Grid Engine shop, and use cgroups (m_mem_free) to control user
>> process memory usage. In the GE exec host configuration, we reserve 4GB
>> for the OS (including GPFS) so jobs are not able to consume all the
>> physical memory on the system.
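>>
>> As a rough illustration (host name and size are invented, and this assumes
>> Univa Grid Engine's cgroup integration), the exec host entry looks roughly
>> like:
>>
>>     # qconf -me node001   (e.g. a 64 GB node advertising only 60 GB)
>>     hostname          node001
>>     complex_values    m_mem_free=60G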
>>
>> On Tue, Dec 20, 2016 at 11:25:04AM -0500, Brian Marshall wrote:
>>
>> All,
>>
What is your favorite method for stopping a user process from eating up
>> all the system memory and saving 1 GB (or more) for the GPFS / system
>> processes?  We have always kicked around the idea of cgroups but never
>> moved on it.
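>>
>> (For concreteness, the cgroup idea would be something like the following
>> bare-bones cgroup v1 sketch; group name, sizes, and $JOB_PID are all
>> hypothetical:)
>>
>>     # cap everything placed in this group at (total RAM - 1 GB)
>>     mkdir /sys/fs/cgroup/memory/userjobs
>>     total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
>>     echo $(( (total_kb - 1048576) * 1024 )) \
>>         > /sys/fs/cgroup/memory/userjobs/memory.limit_in_bytes
>>     # every user job PID then gets attached to the group
>>     echo "$JOB_PID" > /sys/fs/cgroup/memory/userjobs/cgroup.procs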
>>
>> The problem:  A user launches a job which uses all the memory on a node,
>> which causes the node to be expelled, which causes brief filesystem
>> slowness everywhere.
>>
I bet this problem has already been solved and I am just googling the
>> wrong search terms.
>>
>>
>> Thanks,
>> Brian
>>
>>
>> _______________________________________________
>> gpfsug-discuss mailing list
>> gpfsug-discuss at spectrumscale.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>
>>
>>
>> --
>> -- Skylar Thompson (skylar2 at u.washington.edu)
>> -- Genome Sciences Department, System Administrator
>> -- Foege Building S046, (206) 685-7354
>> -- University of Washington School of Medicine
>>
>>
>>
>>
>> --
>> Kevin Buterbaugh - Senior System Administrator
>> Vanderbilt University - Advanced Computing Center for Research and Education
>> Kevin.Buterbaugh at vanderbilt.edu - (615) 875-9633
>>
>>
>>
>>
>>
>>

