From t.king at qmul.ac.uk  Tue Jul  1 17:39:38 2014
From: t.king at qmul.ac.uk (Tom King)
Date: Tue, 1 Jul 2014 17:39:38 +0100
Subject: [gpfsug-discuss] GPFS AFM experience
Message-ID: <53B2E44A.9040008@qmul.ac.uk>

Hi

I've recently adopted a small two node GPFS cluster which is likely to
be relocated to an off-site datacentre in the near future and I'm
concerned that latency times are going to be impacted for on-site users.

My reading of AFM is that an Independent Writer cache held on-site
could mitigate this and was wondering whether anybody had any
experience of using AFM caches in this manner.

Thanks

Tom King

--
Head of Research Infrastructure
IT Services
Queen Mary University of London


From farid.chabane at ymail.com  Thu Jul  3 14:54:40 2014
From: farid.chabane at ymail.com (FC)
Date: Thu, 3 Jul 2014 14:54:40 +0100
Subject: [gpfsug-discuss] Introducing myself
Message-ID: <1404395680.95829.YahooMailNeo@web171806.mail.ir2.yahoo.com>

Hello,

I'm happy to join this mailing list. My name is Farid Chabane and I'm an
HPC Engineer at Serviware, a Bull group company (a French computer
manufacturer and HPC solutions provider). Serviware installs and upgrades
clusters that can scale to hundreds of nodes, with GPFS as the main
parallel filesystem. Our main customers are French universities and
industrial companies, among them the University of Nice, GDF, Alstom,
IFPEN and Total.

Kind regards,
Farid CHABANE
Serviware
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From adean at ocf.co.uk  Fri Jul  4 14:49:42 2014
From: adean at ocf.co.uk (Andrew Dean)
Date: Fri, 4 Jul 2014 14:49:42 +0100
Subject: [gpfsug-discuss] GPFS AFM experience
In-Reply-To:
References: <53B2E44A.9040008@qmul.ac.uk>
Message-ID:

Hi Tom,

Absolutely, AFM could be used in this manner. A small GPFS instance on
the local site would be used to cache the remote filesystem in the
off-site data centre. There are sites already using AFM in this manner.
It would also be relatively straightforward to demonstrate AFM working by
using a WAN emulator to introduce latency/reduce bandwidth between the
'cache' and 'remote' GPFS clusters if you wanted to carry out a POC prior
to the move.

Hope that helps.

Regards,

Andrew Dean
HPC Business Development Manager
OCF plc

Tel: 0114 257 2200
Mob: 07508 033894
Fax: 0114 257 0022
Web: www.ocf.co.uk
Blog: http://blog.ocf.co.uk
Twitter: @ocfplc

OCF plc is a company registered in England and Wales. Registered number
4132533, VAT number GB 780 6803 14. Registered office address: OCF plc,
5 Rotunda Business Centre, Thorncliffe Park, Chapeltown, Sheffield,
S35 2PG.

This message is private and confidential. If you have received this
message in error, please notify us immediately and remove it from your
system.

>
>-----Original Message-----
>From: gpfsug-discuss-bounces at gpfsug.org
>[mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Tom King
>Sent: 01 July 2014 17:40
>To: gpfsug-discuss at gpfsug.org
>Subject: [gpfsug-discuss] GPFS AFM experience
>
>Hi
>
>I've recently adopted a small two node GPFS cluster which is likely to be
>relocated to an off-site datacentre in the near future and I'm concerned
>that latency times are going to be impacted for on-site users.
>
>My reading of AFM is that an Independent Writer cache held on-site could
>mitigate this and was wondering whether anybody had any experience of
>using AFM caches in this manner.
> >Thanks > >Tom King > > > >-- >Head of Research Infrastructure >IT Services >Queen Mary University of London >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at gpfsug.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bevans at pixitmedia.com Fri Jul 4 16:07:51 2014 From: bevans at pixitmedia.com (Barry Evans) Date: Fri, 04 Jul 2014 16:07:51 +0100 Subject: [gpfsug-discuss] GPFS AFM experience In-Reply-To: <53B2E44A.9040008@qmul.ac.uk> References: <53B2E44A.9040008@qmul.ac.uk> Message-ID: <53B6C347.5050008@pixitmedia.com> Hi Tom, Couple of quick ones: Do you know what the latency and line distance is likely to be? What's the rated bandwidth on the line? Is the workload at the 'local' site fairly mixed in terms of reads/writes? What is the kernel version on the current cluster? Regards, Barry > Tom King > 1 July 2014 17:39 > Hi > > I've recently adopted a small two node GPFS cluster which is likely to > be relocated to an off-site datacentre in the near future and I'm > concerned that latency times are going to be impacted for on-site users. > > My reading of AFM is that an Independent Writer cache held on-site > could mitigate this and was wondering whether anybody had any > experience of using AFM caches in this manner. > > Thanks > > Tom King > > > -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: compose-unknown-contact.jpg Type: image/jpeg Size: 770 bytes Desc: not available URL: From t.king at qmul.ac.uk Fri Jul 4 16:34:05 2014 From: t.king at qmul.ac.uk (Tom King) Date: Fri, 4 Jul 2014 16:34:05 +0100 Subject: [gpfsug-discuss] GPFS AFM experience In-Reply-To: <53B6C347.5050008@pixitmedia.com> References: <53B2E44A.9040008@qmul.ac.uk> <53B6C347.5050008@pixitmedia.com> Message-ID: <53B6C96D.7030007@qmul.ac.uk> Thanks both Barry and Andrew I'd love to be able to provide some numbers but a lot of this is hedge betting whilst there's budget available, I'm sure an environment we're all used to. I'll probably get 2*10Gb/s to the data centre for GPFS and all other traffic. I can perhaps convince our networks team to QoS this. At present, there's no latency figures though we're looking at a 30 mile point to point distance over a managed service, I won't go into the minutiae of higher education networking in the UK, we're hoping for <5ms. I expect a good mixture of read/write, I expect growth in IO will be higher on the local side than the DC. The GPFS nodes are running RHEL 6.5 so 2.6.32. It doesn't seem obvious to me how one should size the capacity of the local AFM cache as a proportion of the primary storage and whether it's self-purging. I expect that we'd be looking at ~ 1 PB at the DC within the lifetime of the existing hardware. 
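
In terms of what the cache-side setup would actually look like, my rough
understanding (pieced together from the AFM chapter of the Advanced
Administration Guide, so please correct me if I've got the details wrong -
the filesystem, fileset and export names below are just placeholders) is
something like this:

  # at the data centre (AFM "home"), prepare the path that will back the
  # cache and export it over NFS to the on-site cluster
  mmafmconfig enable /gpfs/gpfs0/research

  # on the on-site cluster, create an independent-writer fileset pointing
  # at that export and link it into the namespace
  mmcrfileset gpfscache research_iw --inode-space new \
      -p afmtarget=nfs://dc-nfs-server/gpfs/gpfs0/research -p afmmode=iw
  mmlinkfileset gpfscache research_iw -J /gpfs/gpfscache/research_iw

  # then keep an eye on the cache/queue state
  mmafmctl gpfscache getstate -j research_iw

If that's broadly the shape of it, Andrew's WAN emulator suggestion is
presumably just a case of sitting the emulator between the NFS export at
the DC and the on-site cache cluster for the POC.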
Thanks again for assistance, knowing that there are people out there doing this makes me feel that it's worth running up a demo. Tom On 04/07/2014 16:07, Barry Evans wrote: > Hi Tom, > > Couple of quick ones: > > Do you know what the latency and line distance is likely to be? What's > the rated bandwidth on the line? Is the workload at the 'local' site > fairly mixed in terms of reads/writes? What is the kernel version on the > current cluster? > > Regards, > Barry > > >> Tom King >> 1 July 2014 17:39 >> Hi >> >> I've recently adopted a small two node GPFS cluster which is likely to >> be relocated to an off-site datacentre in the near future and I'm >> concerned that latency times are going to be impacted for on-site users. >> >> My reading of AFM is that an Independent Writer cache held on-site >> could mitigate this and was wondering whether anybody had any >> experience of using AFM caches in this manner. >> >> Thanks >> >> Tom King >> >> >> > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other > person. Please notify the sender immediately and delete this email from > your computer system. Any opinions expressed are not necessarily those > of the company from which this email was sent and, whilst to the best of > our knowledge no viruses or defects exist, no responsibility can be > accepted for any loss or damage arising from its receipt or subsequent > use of this email. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Head of Research Infrastructure IT Services Queen Mary University of London From luke.raimbach at oerc.ox.ac.uk Mon Jul 14 08:26:11 2014 From: luke.raimbach at oerc.ox.ac.uk (Luke Raimbach) Date: Mon, 14 Jul 2014 07:26:11 +0000 Subject: [gpfsug-discuss] Multicluster UID Mapping Message-ID: Dear GPFS Experts, I have two clusters, A and B where cluster A owns file system GPFS and cluster B owns no file systems. Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID maps between Windows and Linux environment resulting in a very high ID range (typically both UID/GID starting at 850000000) Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. This is fine for preventing remote root access to file system GPFS. However, cluster B may have untrusted users who have root privileges on that cluster from time-to-time. Cluster B is "part-managed" by the admin on cluster A, who only provides tools for maintaining a consistent UID space with cluster A. In this scenario, what can be done to prevent untrusted root-privileged users on cluster B from creating local users with a UID matching one in cluster A and thus reading their data? Ideally, I want to remap all remote UIDs *except* a small subset which I might trust. Any thoughts? Cheers, Luke. -- Luke Raimbach IT Manager Oxford e-Research Centre 7 Keble Road, Oxford, OX1 3QG +44(0)1865 610639 From orlando.richards at ed.ac.uk Mon Jul 14 15:11:48 2014 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 14 Jul 2014 15:11:48 +0100 Subject: [gpfsug-discuss] Multicluster UID Mapping In-Reply-To: References: Message-ID: <53C3E524.9010406@ed.ac.uk> On 14/07/14 08:26, Luke Raimbach wrote: > Dear GPFS Experts, > > I have two clusters, A and B where cluster A owns file system GPFS and cluster B owns no file systems. 
> > Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID maps between Windows and Linux environment resulting in a very high ID range (typically both UID/GID starting at 850000000) > > Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. This is fine for preventing remote root access to file system GPFS. However, cluster B may have untrusted users who have root privileges on that cluster from time-to-time. Cluster B is "part-managed" by the admin on cluster A, who only provides tools for maintaining a consistent UID space with cluster A. > > In this scenario, what can be done to prevent untrusted root-privileged users on cluster B from creating local users with a UID matching one in cluster A and thus reading their data? > > Ideally, I want to remap all remote UIDs *except* a small subset which I might trust. Any thoughts? > I'm not aware of any easy way to accommodate this. GPFS has machine-based authentication and authorisation, but not user-based. A bit like NFSv3, but with "proper" machine auth at least. This has stopped us exporting GPFS file systems outside a management domain - except where the file system is built solely for that purpose. You could look at gpfs native encryption, which should allow you to share keys between the clusters for specific areas - but that'd be a heavyweight fix. Failing that - you could drop GPFS and use something else to cross export specific areas (NFS, etc). You could possibly look at pNFS to make that slightly less disappointing... > Cheers, > Luke. > > -- > > Luke Raimbach > IT Manager > Oxford e-Research Centre > 7 Keble Road, > Oxford, > OX1 3QG > > +44(0)1865 610639 > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Research Facilities (ECDF) Systems Leader Information Services IT Infrastructure Division Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From rltodd.ml1 at gmail.com Mon Jul 14 19:45:22 2014 From: rltodd.ml1 at gmail.com (Lindsay Todd) Date: Mon, 14 Jul 2014 14:45:22 -0400 Subject: [gpfsug-discuss] Multicluster UID Mapping In-Reply-To: References: Message-ID: Luke, Without fully knowing your use case... If your data partitions so that what cluster B users only need a subset of the file system, such that it doesn't matter if they read anything on it, and the remainder can be kept completely away from them, then a possibility is to have two file systems on cluster A, only one of which is exported to B. (For example, we have a general user file system going to all clusters, as well as a smaller file system of VM images restricted to hypervisors only.) The lack of user authentication (such as found in AFS) has handicapped our use of GPFS. With not completely trusted users (we provide general HPC compute services), someone with a privilege escalation exploit can own the file system, and GPFS provides no defense against this. I am hoping that maybe native encryption can be bent to provide better protection, but I haven't had opportunity to explore this yet. /Lindsay On Mon, Jul 14, 2014 at 3:26 AM, Luke Raimbach wrote: > Dear GPFS Experts, > > I have two clusters, A and B where cluster A owns file system GPFS and > cluster B owns no file systems. 
> > Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID > maps between Windows and Linux environment resulting in a very high ID > range (typically both UID/GID starting at 850000000) > > Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. > This is fine for preventing remote root access to file system GPFS. > However, cluster B may have untrusted users who have root privileges on > that cluster from time-to-time. Cluster B is "part-managed" by the admin on > cluster A, who only provides tools for maintaining a consistent UID space > with cluster A. > > In this scenario, what can be done to prevent untrusted root-privileged > users on cluster B from creating local users with a UID matching one in > cluster A and thus reading their data? > > Ideally, I want to remap all remote UIDs *except* a small subset which I > might trust. Any thoughts? > > Cheers, > Luke. > > -- > > Luke Raimbach > IT Manager > Oxford e-Research Centre > 7 Keble Road, > Oxford, > OX1 3QG > > +44(0)1865 610639 > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From L.A.Hurst at bham.ac.uk Wed Jul 16 15:21:31 2014 From: L.A.Hurst at bham.ac.uk (Laurence Alexander Hurst) Date: Wed, 16 Jul 2014 14:21:31 +0000 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Message-ID: Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. It appears that the GPFS-FPO product is intended to provide HDFS's performance benefits for highly distributed data-intensive workloads with the same ease of use of a traditional GPFS filesystem. One of the things I'm wondering is; can we link this with our existing GPFS cluster sanely? This would avoid having to have additional filesystem gateway servers for our users to import/export their data from outside the system and allow, as seemlessly as possible, a clear workflow from generating large datasets on the HPC facility to analysing them (e.g. with a MapReduce function) on the Hadoop facility. Looking at FPO it appears to require being setup as a separate 'shared-nothing' cluster, with additional FPO and (at least 3) server licensing costs attached. Presumably we could then use AFM to ingest(/copy/sync) data from a Hadoop-specific fileset on our existing GPFS cluster to the FPO cluster, removing the requirement for additional gateway/heads for user (data) access? At least, based on what I've read so far this would be the way we would have to do it but it seems convoluted and not ideal. Or am I completely barking up the wrong tree with FPO? Has anyone else run Hadoop alongside, or on top of, an existing san-based GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if you have? 
How does it (traditional GPFS or GPFS-FPO) compare to HDFS, especial regards performance (I know IBM have produced lots of pretty graphs showing how much more performant than HDFS GPFS-FPO is for particular use cases)? Many thanks, Laurence -- Laurence Hurst, IT Services, University of Birmingham, Edgbaston, B15 2TT From ewahl at osc.edu Wed Jul 16 16:29:49 2014 From: ewahl at osc.edu (Ed Wahl) Date: Wed, 16 Jul 2014 15:29:49 +0000 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) In-Reply-To: References: Message-ID: It seems to me that neither IBM nor Intel have done a good job with the marketing and pre-sales of their hadoop connectors. As my site hosts both GPFS and Lustre I've been paying attention to this. Soon enough I'll need some hadoop and I've been rather interested in who tells a convincing story. With IBM it's been like pulling teeth, so far, to get FPO info. (other than pricing) Intel has only been slightly better with EE. It was better with Panache, aka AFM, and there are now quite a few external folks doing all kinds of interesting things with it. From standard caching to trying local only burst buffers. I'm hopeful that we'll start to see the same with FPO and EE soon. I'll be very interested to hear more in this vein. Ed OSC ----- Reply message ----- From: "Laurence Alexander Hurst" To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Date: Wed, Jul 16, 2014 10:21 AM Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. It appears that the GPFS-FPO product is intended to provide HDFS's performance benefits for highly distributed data-intensive workloads with the same ease of use of a traditional GPFS filesystem. One of the things I'm wondering is; can we link this with our existing GPFS cluster sanely? This would avoid having to have additional filesystem gateway servers for our users to import/export their data from outside the system and allow, as seemlessly as possible, a clear workflow from generating large datasets on the HPC facility to analysing them (e.g. with a MapReduce function) on the Hadoop facility. Looking at FPO it appears to require being setup as a separate 'shared-nothing' cluster, with additional FPO and (at least 3) server licensing costs attached. Presumably we could then use AFM to ingest(/copy/sync) data from a Hadoop-specific fileset on our existing GPFS cluster to the FPO cluster, removing the requirement for additional gateway/heads for user (data) access? At least, based on what I've read so far this would be the way we would have to do it but it seems convoluted and not ideal. Or am I completely barking up the wrong tree with FPO? Has anyone else run Hadoop alongside, or on top of, an existing san-based GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if you have? 
How does it (traditional GPFS or GPFS-FPO) compare to HDFS, especial regards performance (I know IBM have produced lots of pretty graphs showing how much more performant than HDFS GPFS-FPO is for particular use cases)? Many thanks, Laurence -- Laurence Hurst, IT Services, University of Birmingham, Edgbaston, B15 2TT _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at us.ibm.com Thu Jul 17 03:14:23 2014 From: oehmes at us.ibm.com (Sven Oehme) Date: Wed, 16 Jul 2014 19:14:23 -0700 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) In-Reply-To: References: Message-ID: Laurence, Ed, the GPFS team is very well aware that there is a trend in moving towards analytics on primary data vs exporting and importing data somewhere else to run analytics on it, especially if the primary data architecture is already scalable (e.g. GPFS based). we also understand the need to use/support shared storage for analytics as it is in many areas economically as well as performance wise superior to shared nothing system, particular if you have mixed non-sequential workloads, significant write content, high utilization, etc. i assume you understand that we can't share future plans / capabilities on a mailing list, but if you are interested in how/when you can enable an existing GPFS Filesystem to be used with HDFS Hadoop, please either contact your IBM rep to contact me or send me a direct email and we set something up. ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com IBM Almaden Research Lab ------------------------------------------ From: Ed Wahl To: gpfsug main discussion list Date: 07/16/2014 08:31 AM Subject: Re: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Sent by: gpfsug-discuss-bounces at gpfsug.org It seems to me that neither IBM nor Intel have done a good job with the marketing and pre-sales of their hadoop connectors. As my site hosts both GPFS and Lustre I've been paying attention to this. Soon enough I'll need some hadoop and I've been rather interested in who tells a convincing story. With IBM it's been like pulling teeth, so far, to get FPO info. (other than pricing) Intel has only been slightly better with EE. It was better with Panache, aka AFM, and there are now quite a few external folks doing all kinds of interesting things with it. From standard caching to trying local only burst buffers. I'm hopeful that we'll start to see the same with FPO and EE soon. I'll be very interested to hear more in this vein. Ed OSC ----- Reply message ----- From: "Laurence Alexander Hurst" To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Date: Wed, Jul 16, 2014 10:21 AM Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). 
HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. It appears that the GPFS-FPO product is intended to provide HDFS's performance benefits for highly distributed data-intensive workloads with the same ease of use of a traditional GPFS filesystem. One of the things I'm wondering is; can we link this with our existing GPFS cluster sanely? This would avoid having to have additional filesystem gateway servers for our users to import/export their data from outside the system and allow, as seemlessly as possible, a clear workflow from generating large datasets on the HPC facility to analysing them (e.g. with a MapReduce function) on the Hadoop facility. Looking at FPO it appears to require being setup as a separate 'shared-nothing' cluster, with additional FPO and (at least 3) server licensing costs attached. Presumably we could then use AFM to ingest(/copy/sync) data from a Hadoop-specific fileset on our existing GPFS cluster to the FPO cluster, removing the requirement for additional gateway/heads for user (data) access? At least, based on what I've read so far this would be the way we would have to do it but it seems convoluted and not ideal. Or am I completely barking up the wrong tree with FPO? Has anyone else run Hadoop alongside, or on top of, an existing san-based GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if you have? How does it (traditional GPFS or GPFS-FPO) compare to HDFS, especial regards performance (I know IBM have produced lots of pretty graphs showing how much more performant than HDFS GPFS-FPO is for particular use cases)? Many thanks, Laurence -- Laurence Hurst, IT Services, University of Birmingham, Edgbaston, B15 2TT _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From seanlee at tw.ibm.com Thu Jul 17 10:27:27 2014 From: seanlee at tw.ibm.com (Sean S Lee) Date: Thu, 17 Jul 2014 17:27:27 +0800 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) In-Reply-To: References: Message-ID: Hello, > Looking at FPO it appears to require being setup as a separate > 'shared-nothing' cluster, with additional FPO and (at least 3) > server licensing costs attached. Presumably we could then use AFM > to ingest(/copy/sync) data from a Hadoop-specific fileset on our > existing GPFS cluster to the FPO cluster, removing the requirement > for additional gateway/heads for user (data) access? At least, > based on what I've read so far this would be the way we would have > to do it but it seems convoluted and not ideal. GPFS FPO nodes can become part of your existing cluster. Have you read this document? If not, take a look, it contains quite a lot of details on how it's done. http://public.dhe.ibm.com/common/ssi/ecm/en/dcw03051usen/DCW03051USEN.PDF Also take a look at the public GPFS FAQs which contain some recommendations related to GFPS FPO. > Has anyone else run Hadoop alongside, or on top of, an existing san- > based GPFS cluster (and wanted to use data stored on that cluster)? > Any tips, if you have? 
How does it (traditional GPFS or GPFS-FPO) > compare to HDFS, especial regards performance (I know IBM have > produced lots of pretty graphs showing how much more performant than > HDFS GPFS-FPO is for particular use cases)? Yes, there are GPFS users who run MapReduce workloads against multi-purpose GPFS clusters that contain both "classic" and FPO filesystems. Performance-wise, a lot depends on the workload. But also don't forget that by avoiding the back-and-forth copying and moving of your data isn't directly measured as better performance, although that too can make turnaround times faster. Regards Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From zander at ebi.ac.uk Wed Jul 30 11:27:35 2014 From: zander at ebi.ac.uk (Zander Mears) Date: Wed, 30 Jul 2014 11:27:35 +0100 Subject: [gpfsug-discuss] Hello! Message-ID: <53D8C897.9000902@ebi.ac.uk> Hi All, As suggested when joining here's a quick intro to me, who I work for and how we use GPFS. I'm currently a Storage System Administrator for EBI, the European Bioinformatics Institute. We've recently started using GPFS via a three GSS26 system storage cluster connecting to our general purpose compute farm. GPFS is new for EBI and very new to me as well as HPC in general coming from an ISP networking > General/Windows Sys admin > VMWare sys admin > NetApp sys admin > storage sys admin with a spattering of Linux spread throughout this time of around 17 years. We are using the GPFS system as a scratch area for our users to run IO intensive work loads on. We have been impressed with some of the figures we've seen but are still in the tweaking stages, for example trying to resolve excessive expulsions (looking resolved), frame overruns and bonding issues. We monitor a number of metrics via Zabbix which a colleague (the main admin) setup. We work closely with the compute team who manage the compute farm and the client GPFS cluster configured on it. We're a RHEL shop. I've joined mainly to lurk tbh to see if I can pick up any tips and tweaks. Hope that's enough! cheers Zander From chair at gpfsug.org Thu Jul 31 00:38:23 2014 From: chair at gpfsug.org (Jez Tucker (Chair)) Date: Thu, 31 Jul 2014 00:38:23 +0100 Subject: [gpfsug-discuss] Hello! In-Reply-To: <53D8C897.9000902@ebi.ac.uk> References: <53D8C897.9000902@ebi.ac.uk> Message-ID: <53D981EF.3020000@gpfsug.org> Hi Zander, We have a git repository. Would you be interested in adding any Zabbix custom metrics gathering to GPFS to it? https://github.com/gpfsug/gpfsug-tools Best, Jez On 30/07/14 11:27, Zander Mears wrote: > Hi All, > > As suggested when joining here's a quick intro to me, who I work for > and how we use GPFS. > > I'm currently a Storage System Administrator for EBI, the European > Bioinformatics Institute. We've recently started using GPFS via a > three GSS26 system storage cluster connecting to our general purpose > compute farm. > > GPFS is new for EBI and very new to me as well as HPC in general > coming from an ISP networking > General/Windows Sys admin > VMWare sys > admin > NetApp sys admin > storage sys admin with a spattering of > Linux spread throughout this time of around 17 years. > > We are using the GPFS system as a scratch area for our users to run IO > intensive work loads on. We have been impressed with some of the > figures we've seen but are still in the tweaking stages, for example > trying to resolve excessive expulsions (looking resolved), frame > overruns and bonding issues. 
>
> We monitor a number of metrics via Zabbix which a colleague (the main
> admin) setup. We work closely with the compute team who manage the
> compute farm and the client GPFS cluster configured on it. We're a
> RHEL shop.
>
> I've joined mainly to lurk tbh to see if I can pick up any tips and
> tweaks.
>
> Hope that's enough!
>
> cheers
>
> Zander
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss


From chair at gpfsug.org  Thu Jul 31 00:53:31 2014
From: chair at gpfsug.org (Jez Tucker (Chair))
Date: Thu, 31 Jul 2014 00:53:31 +0100
Subject: [gpfsug-discuss] GPFS UG 10 RFE Questionnaire Results available
Message-ID: <53D9857B.6030806@gpfsug.org>

Hello all

The GPFS UG 10 Survey results have been compiled and are now available
in a handy PDF.

Please see:
http://www.gpfsug.org/2014/07/31/gpfs-ug-10-rfe-questionnaire-results-available/

Each of the survey questions has been entered as an RFE on the IBM GPFS
Request for Enhancement site. (http://goo.gl/1K6LBa)

On the RFE site you can vote for any RFE you think has merit - or submit
additional RFEs yourself. If you do submit an additional RFE, let us know
on the mailing list so we can chip in accordingly.

We actively encourage you to discuss the RFEs on list.

Best regards to all

Jez (Chair) & on behalf of Claire (Secretary)
GPFS Clusters if you wanted to carry out at a POC prior to the move. Hope that helps. Regards, Andrew Dean HPC Business Development Manager OCF plc Tel: 0114 257 2200 Mob: 07508 033894 Fax: 0114 257 0022 Web: www.ocf.co.uk Blog: http://blog.ocf.co.uk Twitter: @ocfplc OCF plc is a company registered in England and Wales. Registered number 4132533, VAT number GB 780 6803 14. Registered office address: OCF plc, 5 Rotunda Business Centre, Thorncliffe Park, Chapeltown, Sheffield, S35 2PG. This message is private and confidential. If you have received this message in error, please notify us immediately and remove it from your system. > >-----Original Message----- >From: gpfsug-discuss-bounces at gpfsug.org >[mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Tom King >Sent: 01 July 2014 17:40 >To: gpfsug-discuss at gpfsug.org >Subject: [gpfsug-discuss] GPFS AFM experience > >Hi > >I've recently adopted a small two node GPFS cluster which is likely to be >relocated to an off-site datacentre in the near future and I'm concerned >that latency times are going to be impacted for on-site users. > >My reading of AFM is that an Independent Writer cache held on-site could >mitigate this and was wondering whether anybody had any experience of >using AFM caches in this manner. > >Thanks > >Tom King > > > >-- >Head of Research Infrastructure >IT Services >Queen Mary University of London >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at gpfsug.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bevans at pixitmedia.com Fri Jul 4 16:07:51 2014 From: bevans at pixitmedia.com (Barry Evans) Date: Fri, 04 Jul 2014 16:07:51 +0100 Subject: [gpfsug-discuss] GPFS AFM experience In-Reply-To: <53B2E44A.9040008@qmul.ac.uk> References: <53B2E44A.9040008@qmul.ac.uk> Message-ID: <53B6C347.5050008@pixitmedia.com> Hi Tom, Couple of quick ones: Do you know what the latency and line distance is likely to be? What's the rated bandwidth on the line? Is the workload at the 'local' site fairly mixed in terms of reads/writes? What is the kernel version on the current cluster? Regards, Barry > Tom King > 1 July 2014 17:39 > Hi > > I've recently adopted a small two node GPFS cluster which is likely to > be relocated to an off-site datacentre in the near future and I'm > concerned that latency times are going to be impacted for on-site users. > > My reading of AFM is that an Independent Writer cache held on-site > could mitigate this and was wondering whether anybody had any > experience of using AFM caches in this manner. > > Thanks > > Tom King > > > -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: compose-unknown-contact.jpg Type: image/jpeg Size: 770 bytes Desc: not available URL: From t.king at qmul.ac.uk Fri Jul 4 16:34:05 2014 From: t.king at qmul.ac.uk (Tom King) Date: Fri, 4 Jul 2014 16:34:05 +0100 Subject: [gpfsug-discuss] GPFS AFM experience In-Reply-To: <53B6C347.5050008@pixitmedia.com> References: <53B2E44A.9040008@qmul.ac.uk> <53B6C347.5050008@pixitmedia.com> Message-ID: <53B6C96D.7030007@qmul.ac.uk> Thanks both Barry and Andrew I'd love to be able to provide some numbers but a lot of this is hedge betting whilst there's budget available, I'm sure an environment we're all used to. I'll probably get 2*10Gb/s to the data centre for GPFS and all other traffic. I can perhaps convince our networks team to QoS this. At present, there's no latency figures though we're looking at a 30 mile point to point distance over a managed service, I won't go into the minutiae of higher education networking in the UK, we're hoping for <5ms. I expect a good mixture of read/write, I expect growth in IO will be higher on the local side than the DC. The GPFS nodes are running RHEL 6.5 so 2.6.32. It doesn't seem obvious to me how one should size the capacity of the local AFM cache as a proportion of the primary storage and whether it's self-purging. I expect that we'd be looking at ~ 1 PB at the DC within the lifetime of the existing hardware. Thanks again for assistance, knowing that there are people out there doing this makes me feel that it's worth running up a demo. Tom On 04/07/2014 16:07, Barry Evans wrote: > Hi Tom, > > Couple of quick ones: > > Do you know what the latency and line distance is likely to be? What's > the rated bandwidth on the line? Is the workload at the 'local' site > fairly mixed in terms of reads/writes? What is the kernel version on the > current cluster? > > Regards, > Barry > > >> Tom King >> 1 July 2014 17:39 >> Hi >> >> I've recently adopted a small two node GPFS cluster which is likely to >> be relocated to an off-site datacentre in the near future and I'm >> concerned that latency times are going to be impacted for on-site users. >> >> My reading of AFM is that an Independent Writer cache held on-site >> could mitigate this and was wondering whether anybody had any >> experience of using AFM caches in this manner. >> >> Thanks >> >> Tom King >> >> >> > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other > person. Please notify the sender immediately and delete this email from > your computer system. Any opinions expressed are not necessarily those > of the company from which this email was sent and, whilst to the best of > our knowledge no viruses or defects exist, no responsibility can be > accepted for any loss or damage arising from its receipt or subsequent > use of this email. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Head of Research Infrastructure IT Services Queen Mary University of London From luke.raimbach at oerc.ox.ac.uk Mon Jul 14 08:26:11 2014 From: luke.raimbach at oerc.ox.ac.uk (Luke Raimbach) Date: Mon, 14 Jul 2014 07:26:11 +0000 Subject: [gpfsug-discuss] Multicluster UID Mapping Message-ID: Dear GPFS Experts, I have two clusters, A and B where cluster A owns file system GPFS and cluster B owns no file systems. 
Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID maps between Windows and Linux environment resulting in a very high ID range (typically both UID/GID starting at 850000000) Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. This is fine for preventing remote root access to file system GPFS. However, cluster B may have untrusted users who have root privileges on that cluster from time-to-time. Cluster B is "part-managed" by the admin on cluster A, who only provides tools for maintaining a consistent UID space with cluster A. In this scenario, what can be done to prevent untrusted root-privileged users on cluster B from creating local users with a UID matching one in cluster A and thus reading their data? Ideally, I want to remap all remote UIDs *except* a small subset which I might trust. Any thoughts? Cheers, Luke. -- Luke Raimbach IT Manager Oxford e-Research Centre 7 Keble Road, Oxford, OX1 3QG +44(0)1865 610639 From orlando.richards at ed.ac.uk Mon Jul 14 15:11:48 2014 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 14 Jul 2014 15:11:48 +0100 Subject: [gpfsug-discuss] Multicluster UID Mapping In-Reply-To: References: Message-ID: <53C3E524.9010406@ed.ac.uk> On 14/07/14 08:26, Luke Raimbach wrote: > Dear GPFS Experts, > > I have two clusters, A and B where cluster A owns file system GPFS and cluster B owns no file systems. > > Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID maps between Windows and Linux environment resulting in a very high ID range (typically both UID/GID starting at 850000000) > > Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. This is fine for preventing remote root access to file system GPFS. However, cluster B may have untrusted users who have root privileges on that cluster from time-to-time. Cluster B is "part-managed" by the admin on cluster A, who only provides tools for maintaining a consistent UID space with cluster A. > > In this scenario, what can be done to prevent untrusted root-privileged users on cluster B from creating local users with a UID matching one in cluster A and thus reading their data? > > Ideally, I want to remap all remote UIDs *except* a small subset which I might trust. Any thoughts? > I'm not aware of any easy way to accommodate this. GPFS has machine-based authentication and authorisation, but not user-based. A bit like NFSv3, but with "proper" machine auth at least. This has stopped us exporting GPFS file systems outside a management domain - except where the file system is built solely for that purpose. You could look at gpfs native encryption, which should allow you to share keys between the clusters for specific areas - but that'd be a heavyweight fix. Failing that - you could drop GPFS and use something else to cross export specific areas (NFS, etc). You could possibly look at pNFS to make that slightly less disappointing... > Cheers, > Luke. > > -- > > Luke Raimbach > IT Manager > Oxford e-Research Centre > 7 Keble Road, > Oxford, > OX1 3QG > > +44(0)1865 610639 > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Research Facilities (ECDF) Systems Leader Information Services IT Infrastructure Division Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
From rltodd.ml1 at gmail.com Mon Jul 14 19:45:22 2014 From: rltodd.ml1 at gmail.com (Lindsay Todd) Date: Mon, 14 Jul 2014 14:45:22 -0400 Subject: [gpfsug-discuss] Multicluster UID Mapping In-Reply-To: References: Message-ID: Luke, Without fully knowing your use case... If your data partitions so that what cluster B users only need a subset of the file system, such that it doesn't matter if they read anything on it, and the remainder can be kept completely away from them, then a possibility is to have two file systems on cluster A, only one of which is exported to B. (For example, we have a general user file system going to all clusters, as well as a smaller file system of VM images restricted to hypervisors only.) The lack of user authentication (such as found in AFS) has handicapped our use of GPFS. With not completely trusted users (we provide general HPC compute services), someone with a privilege escalation exploit can own the file system, and GPFS provides no defense against this. I am hoping that maybe native encryption can be bent to provide better protection, but I haven't had opportunity to explore this yet. /Lindsay On Mon, Jul 14, 2014 at 3:26 AM, Luke Raimbach wrote: > Dear GPFS Experts, > > I have two clusters, A and B where cluster A owns file system GPFS and > cluster B owns no file systems. > > Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID > maps between Windows and Linux environment resulting in a very high ID > range (typically both UID/GID starting at 850000000) > > Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. > This is fine for preventing remote root access to file system GPFS. > However, cluster B may have untrusted users who have root privileges on > that cluster from time-to-time. Cluster B is "part-managed" by the admin on > cluster A, who only provides tools for maintaining a consistent UID space > with cluster A. > > In this scenario, what can be done to prevent untrusted root-privileged > users on cluster B from creating local users with a UID matching one in > cluster A and thus reading their data? > > Ideally, I want to remap all remote UIDs *except* a small subset which I > might trust. Any thoughts? > > Cheers, > Luke. > > -- > > Luke Raimbach > IT Manager > Oxford e-Research Centre > 7 Keble Road, > Oxford, > OX1 3QG > > +44(0)1865 610639 > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From L.A.Hurst at bham.ac.uk Wed Jul 16 15:21:31 2014 From: L.A.Hurst at bham.ac.uk (Laurence Alexander Hurst) Date: Wed, 16 Jul 2014 14:21:31 +0000 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Message-ID: Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. 
It appears that the GPFS-FPO product is intended to provide HDFS's performance benefits for highly distributed data-intensive workloads with the same ease of use of a traditional GPFS filesystem. One of the things I'm wondering is; can we link this with our existing GPFS cluster sanely? This would avoid having to have additional filesystem gateway servers for our users to import/export their data from outside the system and allow, as seemlessly as possible, a clear workflow from generating large datasets on the HPC facility to analysing them (e.g. with a MapReduce function) on the Hadoop facility. Looking at FPO it appears to require being setup as a separate 'shared-nothing' cluster, with additional FPO and (at least 3) server licensing costs attached. Presumably we could then use AFM to ingest(/copy/sync) data from a Hadoop-specific fileset on our existing GPFS cluster to the FPO cluster, removing the requirement for additional gateway/heads for user (data) access? At least, based on what I've read so far this would be the way we would have to do it but it seems convoluted and not ideal. Or am I completely barking up the wrong tree with FPO? Has anyone else run Hadoop alongside, or on top of, an existing san-based GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if you have? How does it (traditional GPFS or GPFS-FPO) compare to HDFS, especial regards performance (I know IBM have produced lots of pretty graphs showing how much more performant than HDFS GPFS-FPO is for particular use cases)? Many thanks, Laurence -- Laurence Hurst, IT Services, University of Birmingham, Edgbaston, B15 2TT From ewahl at osc.edu Wed Jul 16 16:29:49 2014 From: ewahl at osc.edu (Ed Wahl) Date: Wed, 16 Jul 2014 15:29:49 +0000 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) In-Reply-To: References: Message-ID: It seems to me that neither IBM nor Intel have done a good job with the marketing and pre-sales of their hadoop connectors. As my site hosts both GPFS and Lustre I've been paying attention to this. Soon enough I'll need some hadoop and I've been rather interested in who tells a convincing story. With IBM it's been like pulling teeth, so far, to get FPO info. (other than pricing) Intel has only been slightly better with EE. It was better with Panache, aka AFM, and there are now quite a few external folks doing all kinds of interesting things with it. From standard caching to trying local only burst buffers. I'm hopeful that we'll start to see the same with FPO and EE soon. I'll be very interested to hear more in this vein. Ed OSC ----- Reply message ----- From: "Laurence Alexander Hurst" To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Date: Wed, Jul 16, 2014 10:21 AM Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. 
It appears that the GPFS-FPO product is intended to provide HDFS's performance benefits for highly distributed data-intensive workloads with the same ease of use of a traditional GPFS filesystem. One of the things I'm wondering is; can we link this with our existing GPFS cluster sanely? This would avoid having to have additional filesystem gateway servers for our users to import/export their data from outside the system and allow, as seemlessly as possible, a clear workflow from generating large datasets on the HPC facility to analysing them (e.g. with a MapReduce function) on the Hadoop facility. Looking at FPO it appears to require being setup as a separate 'shared-nothing' cluster, with additional FPO and (at least 3) server licensing costs attached. Presumably we could then use AFM to ingest(/copy/sync) data from a Hadoop-specific fileset on our existing GPFS cluster to the FPO cluster, removing the requirement for additional gateway/heads for user (data) access? At least, based on what I've read so far this would be the way we would have to do it but it seems convoluted and not ideal. Or am I completely barking up the wrong tree with FPO? Has anyone else run Hadoop alongside, or on top of, an existing san-based GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if you have? How does it (traditional GPFS or GPFS-FPO) compare to HDFS, especial regards performance (I know IBM have produced lots of pretty graphs showing how much more performant than HDFS GPFS-FPO is for particular use cases)? Many thanks, Laurence -- Laurence Hurst, IT Services, University of Birmingham, Edgbaston, B15 2TT _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at us.ibm.com Thu Jul 17 03:14:23 2014 From: oehmes at us.ibm.com (Sven Oehme) Date: Wed, 16 Jul 2014 19:14:23 -0700 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) In-Reply-To: References: Message-ID: Laurence, Ed, the GPFS team is very well aware that there is a trend in moving towards analytics on primary data vs exporting and importing data somewhere else to run analytics on it, especially if the primary data architecture is already scalable (e.g. GPFS based). we also understand the need to use/support shared storage for analytics as it is in many areas economically as well as performance wise superior to shared nothing system, particular if you have mixed non-sequential workloads, significant write content, high utilization, etc. i assume you understand that we can't share future plans / capabilities on a mailing list, but if you are interested in how/when you can enable an existing GPFS Filesystem to be used with HDFS Hadoop, please either contact your IBM rep to contact me or send me a direct email and we set something up. ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com IBM Almaden Research Lab ------------------------------------------ From: Ed Wahl To: gpfsug main discussion list Date: 07/16/2014 08:31 AM Subject: Re: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Sent by: gpfsug-discuss-bounces at gpfsug.org It seems to me that neither IBM nor Intel have done a good job with the marketing and pre-sales of their hadoop connectors. 
As my site hosts both GPFS and Lustre I've been paying attention to this. Soon enough I'll need some hadoop and I've been rather interested in who tells a convincing story. With IBM it's been like pulling teeth, so far, to get FPO info. (other than pricing) Intel has only been slightly better with EE. It was better with Panache, aka AFM, and there are now quite a few external folks doing all kinds of interesting things with it. From standard caching to trying local only burst buffers. I'm hopeful that we'll start to see the same with FPO and EE soon. I'll be very interested to hear more in this vein. Ed OSC ----- Reply message ----- From: "Laurence Alexander Hurst" To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Date: Wed, Jul 16, 2014 10:21 AM Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. It appears that the GPFS-FPO product is intended to provide HDFS's performance benefits for highly distributed data-intensive workloads with the same ease of use of a traditional GPFS filesystem. One of the things I'm wondering is; can we link this with our existing GPFS cluster sanely? This would avoid having to have additional filesystem gateway servers for our users to import/export their data from outside the system and allow, as seemlessly as possible, a clear workflow from generating large datasets on the HPC facility to analysing them (e.g. with a MapReduce function) on the Hadoop facility. Looking at FPO it appears to require being setup as a separate 'shared-nothing' cluster, with additional FPO and (at least 3) server licensing costs attached. Presumably we could then use AFM to ingest(/copy/sync) data from a Hadoop-specific fileset on our existing GPFS cluster to the FPO cluster, removing the requirement for additional gateway/heads for user (data) access? At least, based on what I've read so far this would be the way we would have to do it but it seems convoluted and not ideal. Or am I completely barking up the wrong tree with FPO? Has anyone else run Hadoop alongside, or on top of, an existing san-based GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if you have? How does it (traditional GPFS or GPFS-FPO) compare to HDFS, especial regards performance (I know IBM have produced lots of pretty graphs showing how much more performant than HDFS GPFS-FPO is for particular use cases)? Many thanks, Laurence -- Laurence Hurst, IT Services, University of Birmingham, Edgbaston, B15 2TT _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
From seanlee at tw.ibm.com Thu Jul 17 10:27:27 2014
From: seanlee at tw.ibm.com (Sean S Lee)
Date: Thu, 17 Jul 2014 17:27:27 +0800
Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance)
In-Reply-To:
References:
Message-ID:

Hello,

> Looking at FPO, it appears to require being set up as a separate
> 'shared-nothing' cluster, with additional FPO and (at least 3)
> server licensing costs attached. Presumably we could then use AFM
> to ingest(/copy/sync) data from a Hadoop-specific fileset on our
> existing GPFS cluster to the FPO cluster, removing the requirement
> for additional gateways/heads for user (data) access? At least,
> based on what I've read so far, this would be the way we would have
> to do it, but it seems convoluted and not ideal.

GPFS FPO nodes can become part of your existing cluster. Have you read this document? If not, take a look; it contains quite a lot of detail on how it's done.

http://public.dhe.ibm.com/common/ssi/ecm/en/dcw03051usen/DCW03051USEN.PDF

Also take a look at the public GPFS FAQs, which contain some recommendations related to GPFS FPO.

> Has anyone else run Hadoop alongside, or on top of, an existing
> SAN-based GPFS cluster (and wanted to use data stored on that cluster)?
> Any tips, if you have? How does it (traditional GPFS or GPFS-FPO)
> compare to HDFS, especially as regards performance (I know IBM have
> produced lots of pretty graphs showing how much more performant
> GPFS-FPO is than HDFS for particular use cases)?

Yes, there are GPFS users who run MapReduce workloads against multi-purpose GPFS clusters that contain both "classic" and FPO filesystems.

Performance-wise, a lot depends on the workload. Also bear in mind that avoiding the back-and-forth copying and moving of your data doesn't show up directly as better benchmark performance, although it too can make turnaround times faster.

Regards
Sean
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
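To make the "FPO nodes can join your existing cluster" point a bit more concrete, below is a minimal sketch of the kind of NSD/pool stanza file used to bring up an FPO-style storage pool on the internal disks of such nodes. Every node name, device and tuning value here is an illustrative assumption, not a tested configuration; the paper Sean links is the authoritative source for the exact workflow (mmcrnsd with the stanza file, then mmcrfs or mmadddisk as appropriate):

    # fpo-disks.stanza -- illustrative sketch only; adapt names and values.
    # allowWriteAffinity/writeAffinityDepth keep the first data replica on
    # the writing node, and blockGroupFactor groups blocks into larger
    # chunks for MapReduce-style sequential scans.
    %pool:
      pool=fpodata
      blockSize=2M
      layoutMap=cluster
      allowWriteAffinity=yes
      writeAffinityDepth=1
      blockGroupFactor=128

    %nsd:
      nsd=fpo_node1_sdb
      device=/dev/sdb
      servers=hadoop-node1
      usage=dataOnly
      failureGroup=1,0,1
      pool=fpodata

The write-affinity and block-group settings are the knobs that give FPO its HDFS-like data locality; the comma-separated failure group is the topology-style notation FPO uses to describe rack/position/node placement.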
From zander at ebi.ac.uk Wed Jul 30 11:27:35 2014
From: zander at ebi.ac.uk (Zander Mears)
Date: Wed, 30 Jul 2014 11:27:35 +0100
Subject: [gpfsug-discuss] Hello!
Message-ID: <53D8C897.9000902@ebi.ac.uk>

Hi All,

As suggested when joining, here's a quick intro to me, who I work for and how we use GPFS.

I'm currently a Storage System Administrator for EBI, the European Bioinformatics Institute. We've recently started using GPFS via a storage cluster of three GSS26 systems connecting to our general-purpose compute farm.

GPFS is new for EBI and very new to me, as is HPC in general: my background over the last 17 or so years runs ISP networking > general/Windows sysadmin > VMware sysadmin > NetApp sysadmin > storage sysadmin, with a smattering of Linux throughout.

We are using the GPFS system as a scratch area for our users to run IO-intensive workloads on. We have been impressed with some of the figures we've seen but are still in the tweaking stages, for example trying to resolve excessive node expulsions (looking resolved), frame overruns and bonding issues.

We monitor a number of metrics via Zabbix, which a colleague (the main admin) set up. We work closely with the compute team, who manage the compute farm and the client GPFS cluster configured on it. We're a RHEL shop.

I've joined mainly to lurk, tbh, to see if I can pick up any tips and tweaks.

Hope that's enough!

cheers

Zander

From chair at gpfsug.org Thu Jul 31 00:38:23 2014
From: chair at gpfsug.org (Jez Tucker (Chair))
Date: Thu, 31 Jul 2014 00:38:23 +0100
Subject: [gpfsug-discuss] Hello!
In-Reply-To: <53D8C897.9000902@ebi.ac.uk>
References: <53D8C897.9000902@ebi.ac.uk>
Message-ID: <53D981EF.3020000@gpfsug.org>

Hi Zander,

We have a git repository. Would you be interested in adding any Zabbix custom metrics gathering for GPFS to it?

https://github.com/gpfsug/gpfsug-tools

Best,

Jez

On 30/07/14 11:27, Zander Mears wrote:
> Hi All,
>
> As suggested when joining, here's a quick intro to me, who I work for
> and how we use GPFS.
>
> I'm currently a Storage System Administrator for EBI, the European
> Bioinformatics Institute. We've recently started using GPFS via a
> storage cluster of three GSS26 systems connecting to our general-purpose
> compute farm.
>
> GPFS is new for EBI and very new to me, as is HPC in general: my
> background over the last 17 or so years runs ISP networking >
> general/Windows sysadmin > VMware sysadmin > NetApp sysadmin >
> storage sysadmin, with a smattering of Linux throughout.
>
> We are using the GPFS system as a scratch area for our users to run
> IO-intensive workloads on. We have been impressed with some of the
> figures we've seen but are still in the tweaking stages, for example
> trying to resolve excessive node expulsions (looking resolved), frame
> overruns and bonding issues.
>
> We monitor a number of metrics via Zabbix, which a colleague (the main
> admin) set up. We work closely with the compute team, who manage the
> compute farm and the client GPFS cluster configured on it. We're a
> RHEL shop.
>
> I've joined mainly to lurk, tbh, to see if I can pick up any tips and
> tweaks.
>
> Hope that's enough!
>
> cheers
>
> Zander
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
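On the Zabbix side of that exchange, anyone thinking of contributing GPFS metrics gathering to the gpfsug-tools repository could start from something as small as a Zabbix agent UserParameter wrapping mmpmon. The sketch below is purely illustrative (key names, parsing and paths are assumptions, not Zander's actual setup), and since mmpmon needs root you would in practice call it via sudo with a suitable sudoers entry, or have a root cron job dump the counters to a file the agent can read:

    # /etc/zabbix/zabbix_agentd.d/gpfs.conf -- illustrative sketch only.
    # mmpmon's parseable (-p) io_s output tags aggregate bytes read and
    # written since the counters were last reset as _br_ and _bw_.
    UserParameter=gpfs.bytes_read,echo io_s | sudo /usr/lpp/mmfs/bin/mmpmon -p -s | awk '{for(i=1;i<NF;i++) if($i=="_br_") print $(i+1)}'
    UserParameter=gpfs.bytes_written,echo io_s | sudo /usr/lpp/mmfs/bin/mmpmon -p -s | awk '{for(i=1;i<NF;i++) if($i=="_bw_") print $(i+1)}'

Testing the keys with zabbix_get before wiring up items, graphs and triggers is a quick way to confirm the agent side works.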
From chair at gpfsug.org Thu Jul 31 00:53:31 2014
From: chair at gpfsug.org (Jez Tucker (Chair))
Date: Thu, 31 Jul 2014 00:53:31 +0100
Subject: [gpfsug-discuss] GPFS UG 10 RFE Questionnaire Results available
Message-ID: <53D9857B.6030806@gpfsug.org>

Hello all

The GPFS UG 10 Survey results have been compiled and are now available in a handy PDF. Please see:

http://www.gpfsug.org/2014/07/31/gpfs-ug-10-rfe-questionnaire-results-available/

Each of the survey questions has been entered as an RFE on the IBM GPFS Request for Enhancement site (http://goo.gl/1K6LBa).

On the RFE site you can vote for any RFE you think has merit - or submit additional RFEs yourself. If you do submit an additional RFE, let us know on the mailing list so we can chip in accordingly. We actively encourage you to discuss the RFEs on list.

Best regards to all

Jez (Chair) & on behalf of Claire (Secretary)
Is the workload at the 'local' site fairly mixed in terms of reads/writes? What is the kernel version on the current cluster? Regards, Barry > Tom King > 1 July 2014 17:39 > Hi > > I've recently adopted a small two node GPFS cluster which is likely to > be relocated to an off-site datacentre in the near future and I'm > concerned that latency times are going to be impacted for on-site users. > > My reading of AFM is that an Independent Writer cache held on-site > could mitigate this and was wondering whether anybody had any > experience of using AFM caches in this manner. > > Thanks > > Tom King > > > -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: compose-unknown-contact.jpg Type: image/jpeg Size: 770 bytes Desc: not available URL: From t.king at qmul.ac.uk Fri Jul 4 16:34:05 2014 From: t.king at qmul.ac.uk (Tom King) Date: Fri, 4 Jul 2014 16:34:05 +0100 Subject: [gpfsug-discuss] GPFS AFM experience In-Reply-To: <53B6C347.5050008@pixitmedia.com> References: <53B2E44A.9040008@qmul.ac.uk> <53B6C347.5050008@pixitmedia.com> Message-ID: <53B6C96D.7030007@qmul.ac.uk> Thanks both Barry and Andrew I'd love to be able to provide some numbers but a lot of this is hedge betting whilst there's budget available, I'm sure an environment we're all used to. I'll probably get 2*10Gb/s to the data centre for GPFS and all other traffic. I can perhaps convince our networks team to QoS this. At present, there's no latency figures though we're looking at a 30 mile point to point distance over a managed service, I won't go into the minutiae of higher education networking in the UK, we're hoping for <5ms. I expect a good mixture of read/write, I expect growth in IO will be higher on the local side than the DC. The GPFS nodes are running RHEL 6.5 so 2.6.32. It doesn't seem obvious to me how one should size the capacity of the local AFM cache as a proportion of the primary storage and whether it's self-purging. I expect that we'd be looking at ~ 1 PB at the DC within the lifetime of the existing hardware. Thanks again for assistance, knowing that there are people out there doing this makes me feel that it's worth running up a demo. Tom On 04/07/2014 16:07, Barry Evans wrote: > Hi Tom, > > Couple of quick ones: > > Do you know what the latency and line distance is likely to be? What's > the rated bandwidth on the line? Is the workload at the 'local' site > fairly mixed in terms of reads/writes? What is the kernel version on the > current cluster? > > Regards, > Barry > > >> Tom King >> 1 July 2014 17:39 >> Hi >> >> I've recently adopted a small two node GPFS cluster which is likely to >> be relocated to an off-site datacentre in the near future and I'm >> concerned that latency times are going to be impacted for on-site users. 
>> >> My reading of AFM is that an Independent Writer cache held on-site >> could mitigate this and was wondering whether anybody had any >> experience of using AFM caches in this manner. >> >> Thanks >> >> Tom King >> >> >> > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other > person. Please notify the sender immediately and delete this email from > your computer system. Any opinions expressed are not necessarily those > of the company from which this email was sent and, whilst to the best of > our knowledge no viruses or defects exist, no responsibility can be > accepted for any loss or damage arising from its receipt or subsequent > use of this email. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Head of Research Infrastructure IT Services Queen Mary University of London From luke.raimbach at oerc.ox.ac.uk Mon Jul 14 08:26:11 2014 From: luke.raimbach at oerc.ox.ac.uk (Luke Raimbach) Date: Mon, 14 Jul 2014 07:26:11 +0000 Subject: [gpfsug-discuss] Multicluster UID Mapping Message-ID: Dear GPFS Experts, I have two clusters, A and B where cluster A owns file system GPFS and cluster B owns no file systems. Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID maps between Windows and Linux environment resulting in a very high ID range (typically both UID/GID starting at 850000000) Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. This is fine for preventing remote root access to file system GPFS. However, cluster B may have untrusted users who have root privileges on that cluster from time-to-time. Cluster B is "part-managed" by the admin on cluster A, who only provides tools for maintaining a consistent UID space with cluster A. In this scenario, what can be done to prevent untrusted root-privileged users on cluster B from creating local users with a UID matching one in cluster A and thus reading their data? Ideally, I want to remap all remote UIDs *except* a small subset which I might trust. Any thoughts? Cheers, Luke. -- Luke Raimbach IT Manager Oxford e-Research Centre 7 Keble Road, Oxford, OX1 3QG +44(0)1865 610639 From orlando.richards at ed.ac.uk Mon Jul 14 15:11:48 2014 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 14 Jul 2014 15:11:48 +0100 Subject: [gpfsug-discuss] Multicluster UID Mapping In-Reply-To: References: Message-ID: <53C3E524.9010406@ed.ac.uk> On 14/07/14 08:26, Luke Raimbach wrote: > Dear GPFS Experts, > > I have two clusters, A and B where cluster A owns file system GPFS and cluster B owns no file systems. > > Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID maps between Windows and Linux environment resulting in a very high ID range (typically both UID/GID starting at 850000000) > > Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. This is fine for preventing remote root access to file system GPFS. However, cluster B may have untrusted users who have root privileges on that cluster from time-to-time. Cluster B is "part-managed" by the admin on cluster A, who only provides tools for maintaining a consistent UID space with cluster A. 
> > In this scenario, what can be done to prevent untrusted root-privileged users on cluster B from creating local users with a UID matching one in cluster A and thus reading their data? > > Ideally, I want to remap all remote UIDs *except* a small subset which I might trust. Any thoughts? > I'm not aware of any easy way to accommodate this. GPFS has machine-based authentication and authorisation, but not user-based. A bit like NFSv3, but with "proper" machine auth at least. This has stopped us exporting GPFS file systems outside a management domain - except where the file system is built solely for that purpose. You could look at gpfs native encryption, which should allow you to share keys between the clusters for specific areas - but that'd be a heavyweight fix. Failing that - you could drop GPFS and use something else to cross export specific areas (NFS, etc). You could possibly look at pNFS to make that slightly less disappointing... > Cheers, > Luke. > > -- > > Luke Raimbach > IT Manager > Oxford e-Research Centre > 7 Keble Road, > Oxford, > OX1 3QG > > +44(0)1865 610639 > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Research Facilities (ECDF) Systems Leader Information Services IT Infrastructure Division Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From rltodd.ml1 at gmail.com Mon Jul 14 19:45:22 2014 From: rltodd.ml1 at gmail.com (Lindsay Todd) Date: Mon, 14 Jul 2014 14:45:22 -0400 Subject: [gpfsug-discuss] Multicluster UID Mapping In-Reply-To: References: Message-ID: Luke, Without fully knowing your use case... If your data partitions so that what cluster B users only need a subset of the file system, such that it doesn't matter if they read anything on it, and the remainder can be kept completely away from them, then a possibility is to have two file systems on cluster A, only one of which is exported to B. (For example, we have a general user file system going to all clusters, as well as a smaller file system of VM images restricted to hypervisors only.) The lack of user authentication (such as found in AFS) has handicapped our use of GPFS. With not completely trusted users (we provide general HPC compute services), someone with a privilege escalation exploit can own the file system, and GPFS provides no defense against this. I am hoping that maybe native encryption can be bent to provide better protection, but I haven't had opportunity to explore this yet. /Lindsay On Mon, Jul 14, 2014 at 3:26 AM, Luke Raimbach wrote: > Dear GPFS Experts, > > I have two clusters, A and B where cluster A owns file system GPFS and > cluster B owns no file systems. > > Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID > maps between Windows and Linux environment resulting in a very high ID > range (typically both UID/GID starting at 850000000) > > Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. > This is fine for preventing remote root access to file system GPFS. > However, cluster B may have untrusted users who have root privileges on > that cluster from time-to-time. Cluster B is "part-managed" by the admin on > cluster A, who only provides tools for maintaining a consistent UID space > with cluster A. 
> > In this scenario, what can be done to prevent untrusted root-privileged > users on cluster B from creating local users with a UID matching one in > cluster A and thus reading their data? > > Ideally, I want to remap all remote UIDs *except* a small subset which I > might trust. Any thoughts? > > Cheers, > Luke. > > -- > > Luke Raimbach > IT Manager > Oxford e-Research Centre > 7 Keble Road, > Oxford, > OX1 3QG > > +44(0)1865 610639 > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From L.A.Hurst at bham.ac.uk Wed Jul 16 15:21:31 2014 From: L.A.Hurst at bham.ac.uk (Laurence Alexander Hurst) Date: Wed, 16 Jul 2014 14:21:31 +0000 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Message-ID: Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. It appears that the GPFS-FPO product is intended to provide HDFS's performance benefits for highly distributed data-intensive workloads with the same ease of use of a traditional GPFS filesystem. One of the things I'm wondering is; can we link this with our existing GPFS cluster sanely? This would avoid having to have additional filesystem gateway servers for our users to import/export their data from outside the system and allow, as seemlessly as possible, a clear workflow from generating large datasets on the HPC facility to analysing them (e.g. with a MapReduce function) on the Hadoop facility. Looking at FPO it appears to require being setup as a separate 'shared-nothing' cluster, with additional FPO and (at least 3) server licensing costs attached. Presumably we could then use AFM to ingest(/copy/sync) data from a Hadoop-specific fileset on our existing GPFS cluster to the FPO cluster, removing the requirement for additional gateway/heads for user (data) access? At least, based on what I've read so far this would be the way we would have to do it but it seems convoluted and not ideal. Or am I completely barking up the wrong tree with FPO? Has anyone else run Hadoop alongside, or on top of, an existing san-based GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if you have? How does it (traditional GPFS or GPFS-FPO) compare to HDFS, especial regards performance (I know IBM have produced lots of pretty graphs showing how much more performant than HDFS GPFS-FPO is for particular use cases)? 
Many thanks, Laurence -- Laurence Hurst, IT Services, University of Birmingham, Edgbaston, B15 2TT From ewahl at osc.edu Wed Jul 16 16:29:49 2014 From: ewahl at osc.edu (Ed Wahl) Date: Wed, 16 Jul 2014 15:29:49 +0000 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) In-Reply-To: References: Message-ID: It seems to me that neither IBM nor Intel have done a good job with the marketing and pre-sales of their hadoop connectors. As my site hosts both GPFS and Lustre I've been paying attention to this. Soon enough I'll need some hadoop and I've been rather interested in who tells a convincing story. With IBM it's been like pulling teeth, so far, to get FPO info. (other than pricing) Intel has only been slightly better with EE. It was better with Panache, aka AFM, and there are now quite a few external folks doing all kinds of interesting things with it. From standard caching to trying local only burst buffers. I'm hopeful that we'll start to see the same with FPO and EE soon. I'll be very interested to hear more in this vein. Ed OSC ----- Reply message ----- From: "Laurence Alexander Hurst" To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Date: Wed, Jul 16, 2014 10:21 AM Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. It appears that the GPFS-FPO product is intended to provide HDFS's performance benefits for highly distributed data-intensive workloads with the same ease of use of a traditional GPFS filesystem. One of the things I'm wondering is; can we link this with our existing GPFS cluster sanely? This would avoid having to have additional filesystem gateway servers for our users to import/export their data from outside the system and allow, as seemlessly as possible, a clear workflow from generating large datasets on the HPC facility to analysing them (e.g. with a MapReduce function) on the Hadoop facility. Looking at FPO it appears to require being setup as a separate 'shared-nothing' cluster, with additional FPO and (at least 3) server licensing costs attached. Presumably we could then use AFM to ingest(/copy/sync) data from a Hadoop-specific fileset on our existing GPFS cluster to the FPO cluster, removing the requirement for additional gateway/heads for user (data) access? At least, based on what I've read so far this would be the way we would have to do it but it seems convoluted and not ideal. Or am I completely barking up the wrong tree with FPO? Has anyone else run Hadoop alongside, or on top of, an existing san-based GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if you have? How does it (traditional GPFS or GPFS-FPO) compare to HDFS, especial regards performance (I know IBM have produced lots of pretty graphs showing how much more performant than HDFS GPFS-FPO is for particular use cases)? 
Many thanks, Laurence -- Laurence Hurst, IT Services, University of Birmingham, Edgbaston, B15 2TT _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at us.ibm.com Thu Jul 17 03:14:23 2014 From: oehmes at us.ibm.com (Sven Oehme) Date: Wed, 16 Jul 2014 19:14:23 -0700 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) In-Reply-To: References: Message-ID: Laurence, Ed, the GPFS team is very well aware that there is a trend in moving towards analytics on primary data vs exporting and importing data somewhere else to run analytics on it, especially if the primary data architecture is already scalable (e.g. GPFS based). we also understand the need to use/support shared storage for analytics as it is in many areas economically as well as performance wise superior to shared nothing system, particular if you have mixed non-sequential workloads, significant write content, high utilization, etc. i assume you understand that we can't share future plans / capabilities on a mailing list, but if you are interested in how/when you can enable an existing GPFS Filesystem to be used with HDFS Hadoop, please either contact your IBM rep to contact me or send me a direct email and we set something up. ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com IBM Almaden Research Lab ------------------------------------------ From: Ed Wahl To: gpfsug main discussion list Date: 07/16/2014 08:31 AM Subject: Re: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Sent by: gpfsug-discuss-bounces at gpfsug.org It seems to me that neither IBM nor Intel have done a good job with the marketing and pre-sales of their hadoop connectors. As my site hosts both GPFS and Lustre I've been paying attention to this. Soon enough I'll need some hadoop and I've been rather interested in who tells a convincing story. With IBM it's been like pulling teeth, so far, to get FPO info. (other than pricing) Intel has only been slightly better with EE. It was better with Panache, aka AFM, and there are now quite a few external folks doing all kinds of interesting things with it. From standard caching to trying local only burst buffers. I'm hopeful that we'll start to see the same with FPO and EE soon. I'll be very interested to hear more in this vein. Ed OSC ----- Reply message ----- From: "Laurence Alexander Hurst" To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Date: Wed, Jul 16, 2014 10:21 AM Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. 
It appears that the GPFS-FPO product is intended to provide HDFS's performance benefits for highly distributed data-intensive workloads with the same ease of use of a traditional GPFS filesystem. One of the things I'm wondering is; can we link this with our existing GPFS cluster sanely? This would avoid having to have additional filesystem gateway servers for our users to import/export their data from outside the system and allow, as seemlessly as possible, a clear workflow from generating large datasets on the HPC facility to analysing them (e.g. with a MapReduce function) on the Hadoop facility. Looking at FPO it appears to require being setup as a separate 'shared-nothing' cluster, with additional FPO and (at least 3) server licensing costs attached. Presumably we could then use AFM to ingest(/copy/sync) data from a Hadoop-specific fileset on our existing GPFS cluster to the FPO cluster, removing the requirement for additional gateway/heads for user (data) access? At least, based on what I've read so far this would be the way we would have to do it but it seems convoluted and not ideal. Or am I completely barking up the wrong tree with FPO? Has anyone else run Hadoop alongside, or on top of, an existing san-based GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if you have? How does it (traditional GPFS or GPFS-FPO) compare to HDFS, especial regards performance (I know IBM have produced lots of pretty graphs showing how much more performant than HDFS GPFS-FPO is for particular use cases)? Many thanks, Laurence -- Laurence Hurst, IT Services, University of Birmingham, Edgbaston, B15 2TT _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From seanlee at tw.ibm.com Thu Jul 17 10:27:27 2014 From: seanlee at tw.ibm.com (Sean S Lee) Date: Thu, 17 Jul 2014 17:27:27 +0800 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) In-Reply-To: References: Message-ID: Hello, > Looking at FPO it appears to require being setup as a separate > 'shared-nothing' cluster, with additional FPO and (at least 3) > server licensing costs attached. Presumably we could then use AFM > to ingest(/copy/sync) data from a Hadoop-specific fileset on our > existing GPFS cluster to the FPO cluster, removing the requirement > for additional gateway/heads for user (data) access? At least, > based on what I've read so far this would be the way we would have > to do it but it seems convoluted and not ideal. GPFS FPO nodes can become part of your existing cluster. Have you read this document? If not, take a look, it contains quite a lot of details on how it's done. http://public.dhe.ibm.com/common/ssi/ecm/en/dcw03051usen/DCW03051USEN.PDF Also take a look at the public GPFS FAQs which contain some recommendations related to GFPS FPO. > Has anyone else run Hadoop alongside, or on top of, an existing san- > based GPFS cluster (and wanted to use data stored on that cluster)? > Any tips, if you have? How does it (traditional GPFS or GPFS-FPO) > compare to HDFS, especial regards performance (I know IBM have > produced lots of pretty graphs showing how much more performant than > HDFS GPFS-FPO is for particular use cases)? 
Yes, there are GPFS users who run MapReduce workloads against multi-purpose GPFS clusters that contain both "classic" and FPO filesystems. Performance-wise, a lot depends on the workload. But also don't forget that by avoiding the back-and-forth copying and moving of your data isn't directly measured as better performance, although that too can make turnaround times faster. Regards Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From zander at ebi.ac.uk Wed Jul 30 11:27:35 2014 From: zander at ebi.ac.uk (Zander Mears) Date: Wed, 30 Jul 2014 11:27:35 +0100 Subject: [gpfsug-discuss] Hello! Message-ID: <53D8C897.9000902@ebi.ac.uk> Hi All, As suggested when joining here's a quick intro to me, who I work for and how we use GPFS. I'm currently a Storage System Administrator for EBI, the European Bioinformatics Institute. We've recently started using GPFS via a three GSS26 system storage cluster connecting to our general purpose compute farm. GPFS is new for EBI and very new to me as well as HPC in general coming from an ISP networking > General/Windows Sys admin > VMWare sys admin > NetApp sys admin > storage sys admin with a spattering of Linux spread throughout this time of around 17 years. We are using the GPFS system as a scratch area for our users to run IO intensive work loads on. We have been impressed with some of the figures we've seen but are still in the tweaking stages, for example trying to resolve excessive expulsions (looking resolved), frame overruns and bonding issues. We monitor a number of metrics via Zabbix which a colleague (the main admin) setup. We work closely with the compute team who manage the compute farm and the client GPFS cluster configured on it. We're a RHEL shop. I've joined mainly to lurk tbh to see if I can pick up any tips and tweaks. Hope that's enough! cheers Zander From chair at gpfsug.org Thu Jul 31 00:38:23 2014 From: chair at gpfsug.org (Jez Tucker (Chair)) Date: Thu, 31 Jul 2014 00:38:23 +0100 Subject: [gpfsug-discuss] Hello! In-Reply-To: <53D8C897.9000902@ebi.ac.uk> References: <53D8C897.9000902@ebi.ac.uk> Message-ID: <53D981EF.3020000@gpfsug.org> Hi Zander, We have a git repository. Would you be interested in adding any Zabbix custom metrics gathering to GPFS to it? https://github.com/gpfsug/gpfsug-tools Best, Jez On 30/07/14 11:27, Zander Mears wrote: > Hi All, > > As suggested when joining here's a quick intro to me, who I work for > and how we use GPFS. > > I'm currently a Storage System Administrator for EBI, the European > Bioinformatics Institute. We've recently started using GPFS via a > three GSS26 system storage cluster connecting to our general purpose > compute farm. > > GPFS is new for EBI and very new to me as well as HPC in general > coming from an ISP networking > General/Windows Sys admin > VMWare sys > admin > NetApp sys admin > storage sys admin with a spattering of > Linux spread throughout this time of around 17 years. > > We are using the GPFS system as a scratch area for our users to run IO > intensive work loads on. We have been impressed with some of the > figures we've seen but are still in the tweaking stages, for example > trying to resolve excessive expulsions (looking resolved), frame > overruns and bonding issues. > > We monitor a number of metrics via Zabbix which a colleague (the main > admin) setup. We work closely with the compute team who manage the > compute farm and the client GPFS cluster configured on it. We're a > RHEL shop. 
> > I've joined mainly to lurk tbh to see if I can pick up any tips and > tweaks. > > Hope that's enough! > > cheers > > Zander > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chair at gpfsug.org Thu Jul 31 00:53:31 2014 From: chair at gpfsug.org (Jez Tucker (Chair)) Date: Thu, 31 Jul 2014 00:53:31 +0100 Subject: [gpfsug-discuss] GPFS UG 10 RFE Questionnaire Results available Message-ID: <53D9857B.6030806@gpfsug.org> Hello all The GPFS UG 10 Survey results have been compiled and are now available in a handy PDF. Please see: http://www.gpfsug.org/2014/07/31/gpfs-ug-10-rfe-questionnaire-results-available/ Each of the survey questions has been entered as an RFE on the IBM GPFS Request for Enhancement site. (http://goo.gl/1K6LBa) On the RFE site you can vote any RFE you think has merit - or submit additional RFEs yourself. If you do submit an additional RFE, let us known on the mailing list so we can chip in accordingly. We actively encourage you to discuss the RFEs on list. Best regards to all Jez (Chair) & on behalf of Claire (Secretary) From t.king at qmul.ac.uk Tue Jul 1 17:39:38 2014 From: t.king at qmul.ac.uk (Tom King) Date: Tue, 1 Jul 2014 17:39:38 +0100 Subject: [gpfsug-discuss] GPFS AFM experience Message-ID: <53B2E44A.9040008@qmul.ac.uk> Hi I've recently adopted a small two node GPFS cluster which is likely to be relocated to an off-site datacentre in the near future and I'm concerned that latency times are going to be impacted for on-site users. My reading of AFM is that an Independent Writer cache held on-site could mitigate this and was wondering whether anybody had any experience of using AFM caches in this manner. Thanks Tom King -- Head of Research Infrastructure IT Services Queen Mary University of London From farid.chabane at ymail.com Thu Jul 3 14:54:40 2014 From: farid.chabane at ymail.com (FC) Date: Thu, 3 Jul 2014 14:54:40 +0100 Subject: [gpfsug-discuss] Introducing myself Message-ID: <1404395680.95829.YahooMailNeo@web171806.mail.ir2.yahoo.com> Hello , I'm happy to join this mailing list, my name is Farid Chabane, I'm a HPC Engineer at Serviware, a Bull group Company (French computer manufacturer and HPC Solutions). Serviware installs and upgrades clusters that can scale to hundreds of nodes with GPFS Filesystem as the main parallel filesystem. Our main customers are french universities and Industrial companies, some of them : university of Nice, GDF, Alstom, IFPEN, Total,... Kind regards, Farid CHABANE Serviware -------------- next part -------------- An HTML attachment was scrubbed... URL: From adean at ocf.co.uk Fri Jul 4 14:49:42 2014 From: adean at ocf.co.uk (Andrew Dean) Date: Fri, 4 Jul 2014 14:49:42 +0100 Subject: [gpfsug-discuss] GPFS AFM experience In-Reply-To: References: <53B2E44A.9040008@qmul.ac.uk> Message-ID: Hi Tom, Absolutely AFM could be used in this manner. A small GPFS instance on the local site would be used to cache the remote filesystem in the off-site data centre. There are sites already using AFM in this manner, it would also be relatively straightforward to demonstrate AFM working by using a WAN emulator to introduce latency/reduce bandwidth between ?cache' and ?remote? GPFS Clusters if you wanted to carry out at a POC prior to the move. Hope that helps. 
Regards, Andrew Dean HPC Business Development Manager OCF plc Tel: 0114 257 2200 Mob: 07508 033894 Fax: 0114 257 0022 Web: www.ocf.co.uk Blog: http://blog.ocf.co.uk Twitter: @ocfplc OCF plc is a company registered in England and Wales. Registered number 4132533, VAT number GB 780 6803 14. Registered office address: OCF plc, 5 Rotunda Business Centre, Thorncliffe Park, Chapeltown, Sheffield, S35 2PG. This message is private and confidential. If you have received this message in error, please notify us immediately and remove it from your system. > >-----Original Message----- >From: gpfsug-discuss-bounces at gpfsug.org >[mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Tom King >Sent: 01 July 2014 17:40 >To: gpfsug-discuss at gpfsug.org >Subject: [gpfsug-discuss] GPFS AFM experience > >Hi > >I've recently adopted a small two node GPFS cluster which is likely to be >relocated to an off-site datacentre in the near future and I'm concerned >that latency times are going to be impacted for on-site users. > >My reading of AFM is that an Independent Writer cache held on-site could >mitigate this and was wondering whether anybody had any experience of >using AFM caches in this manner. > >Thanks > >Tom King > > > >-- >Head of Research Infrastructure >IT Services >Queen Mary University of London >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at gpfsug.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bevans at pixitmedia.com Fri Jul 4 16:07:51 2014 From: bevans at pixitmedia.com (Barry Evans) Date: Fri, 04 Jul 2014 16:07:51 +0100 Subject: [gpfsug-discuss] GPFS AFM experience In-Reply-To: <53B2E44A.9040008@qmul.ac.uk> References: <53B2E44A.9040008@qmul.ac.uk> Message-ID: <53B6C347.5050008@pixitmedia.com> Hi Tom, Couple of quick ones: Do you know what the latency and line distance is likely to be? What's the rated bandwidth on the line? Is the workload at the 'local' site fairly mixed in terms of reads/writes? What is the kernel version on the current cluster? Regards, Barry > Tom King > 1 July 2014 17:39 > Hi > > I've recently adopted a small two node GPFS cluster which is likely to > be relocated to an off-site datacentre in the near future and I'm > concerned that latency times are going to be impacted for on-site users. > > My reading of AFM is that an Independent Writer cache held on-site > could mitigate this and was wondering whether anybody had any > experience of using AFM caches in this manner. > > Thanks > > Tom King > > > -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: compose-unknown-contact.jpg Type: image/jpeg Size: 770 bytes Desc: not available URL: From t.king at qmul.ac.uk Fri Jul 4 16:34:05 2014 From: t.king at qmul.ac.uk (Tom King) Date: Fri, 4 Jul 2014 16:34:05 +0100 Subject: [gpfsug-discuss] GPFS AFM experience In-Reply-To: <53B6C347.5050008@pixitmedia.com> References: <53B2E44A.9040008@qmul.ac.uk> <53B6C347.5050008@pixitmedia.com> Message-ID: <53B6C96D.7030007@qmul.ac.uk> Thanks both Barry and Andrew I'd love to be able to provide some numbers but a lot of this is hedge betting whilst there's budget available, I'm sure an environment we're all used to. I'll probably get 2*10Gb/s to the data centre for GPFS and all other traffic. I can perhaps convince our networks team to QoS this. At present, there's no latency figures though we're looking at a 30 mile point to point distance over a managed service, I won't go into the minutiae of higher education networking in the UK, we're hoping for <5ms. I expect a good mixture of read/write, I expect growth in IO will be higher on the local side than the DC. The GPFS nodes are running RHEL 6.5 so 2.6.32. It doesn't seem obvious to me how one should size the capacity of the local AFM cache as a proportion of the primary storage and whether it's self-purging. I expect that we'd be looking at ~ 1 PB at the DC within the lifetime of the existing hardware. Thanks again for assistance, knowing that there are people out there doing this makes me feel that it's worth running up a demo. Tom On 04/07/2014 16:07, Barry Evans wrote: > Hi Tom, > > Couple of quick ones: > > Do you know what the latency and line distance is likely to be? What's > the rated bandwidth on the line? Is the workload at the 'local' site > fairly mixed in terms of reads/writes? What is the kernel version on the > current cluster? > > Regards, > Barry > > >> Tom King >> 1 July 2014 17:39 >> Hi >> >> I've recently adopted a small two node GPFS cluster which is likely to >> be relocated to an off-site datacentre in the near future and I'm >> concerned that latency times are going to be impacted for on-site users. >> >> My reading of AFM is that an Independent Writer cache held on-site >> could mitigate this and was wondering whether anybody had any >> experience of using AFM caches in this manner. >> >> Thanks >> >> Tom King >> >> >> > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other > person. Please notify the sender immediately and delete this email from > your computer system. Any opinions expressed are not necessarily those > of the company from which this email was sent and, whilst to the best of > our knowledge no viruses or defects exist, no responsibility can be > accepted for any loss or damage arising from its receipt or subsequent > use of this email. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Head of Research Infrastructure IT Services Queen Mary University of London From luke.raimbach at oerc.ox.ac.uk Mon Jul 14 08:26:11 2014 From: luke.raimbach at oerc.ox.ac.uk (Luke Raimbach) Date: Mon, 14 Jul 2014 07:26:11 +0000 Subject: [gpfsug-discuss] Multicluster UID Mapping Message-ID: Dear GPFS Experts, I have two clusters, A and B where cluster A owns file system GPFS and cluster B owns no file systems. 
Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID maps between Windows and Linux environment resulting in a very high ID range (typically both UID/GID starting at 850000000) Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. This is fine for preventing remote root access to file system GPFS. However, cluster B may have untrusted users who have root privileges on that cluster from time-to-time. Cluster B is "part-managed" by the admin on cluster A, who only provides tools for maintaining a consistent UID space with cluster A. In this scenario, what can be done to prevent untrusted root-privileged users on cluster B from creating local users with a UID matching one in cluster A and thus reading their data? Ideally, I want to remap all remote UIDs *except* a small subset which I might trust. Any thoughts? Cheers, Luke. -- Luke Raimbach IT Manager Oxford e-Research Centre 7 Keble Road, Oxford, OX1 3QG +44(0)1865 610639 From orlando.richards at ed.ac.uk Mon Jul 14 15:11:48 2014 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 14 Jul 2014 15:11:48 +0100 Subject: [gpfsug-discuss] Multicluster UID Mapping In-Reply-To: References: Message-ID: <53C3E524.9010406@ed.ac.uk> On 14/07/14 08:26, Luke Raimbach wrote: > Dear GPFS Experts, > > I have two clusters, A and B where cluster A owns file system GPFS and cluster B owns no file systems. > > Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID maps between Windows and Linux environment resulting in a very high ID range (typically both UID/GID starting at 850000000) > > Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. This is fine for preventing remote root access to file system GPFS. However, cluster B may have untrusted users who have root privileges on that cluster from time-to-time. Cluster B is "part-managed" by the admin on cluster A, who only provides tools for maintaining a consistent UID space with cluster A. > > In this scenario, what can be done to prevent untrusted root-privileged users on cluster B from creating local users with a UID matching one in cluster A and thus reading their data? > > Ideally, I want to remap all remote UIDs *except* a small subset which I might trust. Any thoughts? > I'm not aware of any easy way to accommodate this. GPFS has machine-based authentication and authorisation, but not user-based. A bit like NFSv3, but with "proper" machine auth at least. This has stopped us exporting GPFS file systems outside a management domain - except where the file system is built solely for that purpose. You could look at gpfs native encryption, which should allow you to share keys between the clusters for specific areas - but that'd be a heavyweight fix. Failing that - you could drop GPFS and use something else to cross export specific areas (NFS, etc). You could possibly look at pNFS to make that slightly less disappointing... > Cheers, > Luke. > > -- > > Luke Raimbach > IT Manager > Oxford e-Research Centre > 7 Keble Road, > Oxford, > OX1 3QG > > +44(0)1865 610639 > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Research Facilities (ECDF) Systems Leader Information Services IT Infrastructure Division Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
From rltodd.ml1 at gmail.com Mon Jul 14 19:45:22 2014 From: rltodd.ml1 at gmail.com (Lindsay Todd) Date: Mon, 14 Jul 2014 14:45:22 -0400 Subject: [gpfsug-discuss] Multicluster UID Mapping In-Reply-To: References: Message-ID: Luke, Without fully knowing your use case... If your data partitions so that what cluster B users only need a subset of the file system, such that it doesn't matter if they read anything on it, and the remainder can be kept completely away from them, then a possibility is to have two file systems on cluster A, only one of which is exported to B. (For example, we have a general user file system going to all clusters, as well as a smaller file system of VM images restricted to hypervisors only.) The lack of user authentication (such as found in AFS) has handicapped our use of GPFS. With not completely trusted users (we provide general HPC compute services), someone with a privilege escalation exploit can own the file system, and GPFS provides no defense against this. I am hoping that maybe native encryption can be bent to provide better protection, but I haven't had opportunity to explore this yet. /Lindsay On Mon, Jul 14, 2014 at 3:26 AM, Luke Raimbach wrote: > Dear GPFS Experts, > > I have two clusters, A and B where cluster A owns file system GPFS and > cluster B owns no file systems. > > Cluster A is mixed Linux/Windows and has IMU keeping consistent UID/GID > maps between Windows and Linux environment resulting in a very high ID > range (typically both UID/GID starting at 850000000) > > Cluster B remote mounts file system GPFS with UID/GID=0 remapped to 99. > This is fine for preventing remote root access to file system GPFS. > However, cluster B may have untrusted users who have root privileges on > that cluster from time-to-time. Cluster B is "part-managed" by the admin on > cluster A, who only provides tools for maintaining a consistent UID space > with cluster A. > > In this scenario, what can be done to prevent untrusted root-privileged > users on cluster B from creating local users with a UID matching one in > cluster A and thus reading their data? > > Ideally, I want to remap all remote UIDs *except* a small subset which I > might trust. Any thoughts? > > Cheers, > Luke. > > -- > > Luke Raimbach > IT Manager > Oxford e-Research Centre > 7 Keble Road, > Oxford, > OX1 3QG > > +44(0)1865 610639 > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From L.A.Hurst at bham.ac.uk Wed Jul 16 15:21:31 2014 From: L.A.Hurst at bham.ac.uk (Laurence Alexander Hurst) Date: Wed, 16 Jul 2014 14:21:31 +0000 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Message-ID: Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. 
It appears that the GPFS-FPO product is intended to provide HDFS's performance benefits for highly distributed data-intensive workloads with the same ease of use of a traditional GPFS filesystem. One of the things I'm wondering is; can we link this with our existing GPFS cluster sanely? This would avoid having to have additional filesystem gateway servers for our users to import/export their data from outside the system and allow, as seemlessly as possible, a clear workflow from generating large datasets on the HPC facility to analysing them (e.g. with a MapReduce function) on the Hadoop facility. Looking at FPO it appears to require being setup as a separate 'shared-nothing' cluster, with additional FPO and (at least 3) server licensing costs attached. Presumably we could then use AFM to ingest(/copy/sync) data from a Hadoop-specific fileset on our existing GPFS cluster to the FPO cluster, removing the requirement for additional gateway/heads for user (data) access? At least, based on what I've read so far this would be the way we would have to do it but it seems convoluted and not ideal. Or am I completely barking up the wrong tree with FPO? Has anyone else run Hadoop alongside, or on top of, an existing san-based GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if you have? How does it (traditional GPFS or GPFS-FPO) compare to HDFS, especial regards performance (I know IBM have produced lots of pretty graphs showing how much more performant than HDFS GPFS-FPO is for particular use cases)? Many thanks, Laurence -- Laurence Hurst, IT Services, University of Birmingham, Edgbaston, B15 2TT From ewahl at osc.edu Wed Jul 16 16:29:49 2014 From: ewahl at osc.edu (Ed Wahl) Date: Wed, 16 Jul 2014 15:29:49 +0000 Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) In-Reply-To: References: Message-ID: It seems to me that neither IBM nor Intel have done a good job with the marketing and pre-sales of their hadoop connectors. As my site hosts both GPFS and Lustre I've been paying attention to this. Soon enough I'll need some hadoop and I've been rather interested in who tells a convincing story. With IBM it's been like pulling teeth, so far, to get FPO info. (other than pricing) Intel has only been slightly better with EE. It was better with Panache, aka AFM, and there are now quite a few external folks doing all kinds of interesting things with it. From standard caching to trying local only burst buffers. I'm hopeful that we'll start to see the same with FPO and EE soon. I'll be very interested to hear more in this vein. Ed OSC ----- Reply message ----- From: "Laurence Alexander Hurst" To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance) Date: Wed, Jul 16, 2014 10:21 AM Dear GPFSUG, I've been looking into the possibility of using GPFS with Hadoop, especially as we already have experience with GPFS (traditional san-based) cluster for our HPC provision (which is part of the same network fabric, so integration should be possible and would be desirable). The proof-of-concept Hadoop cluster I've setup has HDFS as well as our current GPFS file system exposed (to allow users to import/export their data from HDFS to the shared filestore). HDFS is a pain to get data in and out of and also precludes us using many deployment tools to mass-update the nodes (I know this would also be a problem with GPFS-FPO) by reimage and/or reinstall. 
From ewahl at osc.edu Wed Jul 16 16:29:49 2014
From: ewahl at osc.edu (Ed Wahl)
Date: Wed, 16 Jul 2014 15:29:49 +0000
Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance)
In-Reply-To:
References:
Message-ID:

It seems to me that neither IBM nor Intel has done a good job with the
marketing and pre-sales of their Hadoop connectors. As my site hosts both
GPFS and Lustre, I've been paying attention to this. Soon enough I'll need
some Hadoop, and I've been rather interested in who tells a convincing story.
With IBM it's been like pulling teeth, so far, to get FPO info (other than
pricing); Intel has only been slightly better with EE.

It was better with Panache, aka AFM, and there are now quite a few external
folks doing all kinds of interesting things with it, from standard caching to
trying local-only burst buffers. I'm hopeful that we'll start to see the same
with FPO and EE soon.

I'll be very interested to hear more in this vein.

Ed
OSC

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From oehmes at us.ibm.com Thu Jul 17 03:14:23 2014
From: oehmes at us.ibm.com (Sven Oehme)
Date: Wed, 16 Jul 2014 19:14:23 -0700
Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance)
In-Reply-To:
References:
Message-ID:

Laurence, Ed,

The GPFS team is very well aware that there is a trend towards running
analytics on primary data, versus exporting and importing the data somewhere
else to run analytics on it, especially if the primary data architecture is
already scalable (e.g. GPFS based). We also understand the need to
use/support shared storage for analytics, as in many areas it is economically
as well as performance-wise superior to shared-nothing systems, particularly
if you have mixed non-sequential workloads, significant write content, high
utilization, etc.

I assume you understand that we can't share future plans/capabilities on a
mailing list, but if you are interested in how/when you can enable an
existing GPFS file system to be used with Hadoop/HDFS, please either ask your
IBM rep to put you in contact with me or send me a direct email and we'll set
something up.

------------------------------------------
Sven Oehme
Scalable Storage Research
email: oehmes at us.ibm.com
IBM Almaden Research Lab
------------------------------------------

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From seanlee at tw.ibm.com Thu Jul 17 10:27:27 2014
From: seanlee at tw.ibm.com (Sean S Lee)
Date: Thu, 17 Jul 2014 17:27:27 +0800
Subject: [gpfsug-discuss] GPFS GPFS-FPO and Hadoop (specifically MapReduce in the first instance)
In-Reply-To:
References:
Message-ID:

Hello,

> Looking at FPO, it appears to require being set up as a separate
> 'shared-nothing' cluster, with additional FPO and (at least 3) server
> licensing costs attached. Presumably we could then use AFM to
> ingest(/copy/sync) data from a Hadoop-specific fileset on our existing
> GPFS cluster to the FPO cluster, removing the requirement for additional
> gateway/heads for user (data) access? At least, based on what I've read
> so far, this would be the way we would have to do it, but it seems
> convoluted and not ideal.

GPFS FPO nodes can become part of your existing cluster. Have you read this
document? If not, take a look; it contains quite a lot of detail on how it's
done.

http://public.dhe.ibm.com/common/ssi/ecm/en/dcw03051usen/DCW03051USEN.PDF

Also take a look at the public GPFS FAQs, which contain some recommendations
related to GPFS FPO.

> Has anyone else run Hadoop alongside, or on top of, an existing SAN-based
> GPFS cluster (and wanted to use data stored on that cluster)? Any tips, if
> you have? How does it (traditional GPFS or GPFS-FPO) compare to HDFS,
> especially as regards performance (I know IBM have produced lots of pretty
> graphs showing how much more performant GPFS-FPO is than HDFS for
> particular use cases)?

Yes, there are GPFS users who run MapReduce workloads against multi-purpose
GPFS clusters that contain both "classic" and FPO filesystems.
Performance-wise, a lot depends on the workload. But also don't forget that
avoiding the back-and-forth copying and moving of your data isn't directly
measured as better performance, even though it too can make turnaround times
faster.

Regards
Sean
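For anyone wondering what "becoming part of your existing cluster" looks like
on the ground: the FPO-specific behaviour mostly lives in a storage pool
definition with write affinity enabled, declared in the NSD stanza file for
the locally attached disks on the FPO nodes. A heavily simplified sketch with
invented node, device and pool names; block size, failure-group topology
vectors and replication settings all need real design work, so treat this
purely as an illustration of the shape of the stanzas rather than a
recommended configuration.

    # fpo_disks.stanza -- disks local to the FPO nodes, in an FPO-enabled pool
    %pool:
      pool=fpodata
      blockSize=2M
      layoutMap=cluster
      allowWriteAffinity=yes
      writeAffinityDepth=1
      blockGroupFactor=128

    %nsd:
      nsd=fpo01_sdb
      device=/dev/sdb
      servers=fponode01
      usage=dataOnly
      failureGroup=1,0,1
      pool=fpodata

    # create the NSDs, then add them to an existing file system
    mmcrnsd -F fpo_disks.stanza
    mmadddisk fpofs -F fpo_disks.stanza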
From zander at ebi.ac.uk Wed Jul 30 11:27:35 2014
From: zander at ebi.ac.uk (Zander Mears)
Date: Wed, 30 Jul 2014 11:27:35 +0100
Subject: [gpfsug-discuss] Hello!
Message-ID: <53D8C897.9000902@ebi.ac.uk>

Hi All,

As suggested when joining, here's a quick intro to me, who I work for and how
we use GPFS.

I'm currently a Storage System Administrator for EBI, the European
Bioinformatics Institute. We've recently started using GPFS via a storage
cluster of three GSS26 systems connecting to our general-purpose compute
farm.

GPFS is new for EBI and very new to me, as is HPC in general: I come from a
background of ISP networking > general/Windows sysadmin > VMware sysadmin >
NetApp sysadmin > storage sysadmin, with a smattering of Linux spread
throughout this time of around 17 years.

We are using the GPFS system as a scratch area for our users to run
IO-intensive workloads on. We have been impressed with some of the figures
we've seen but are still in the tweaking stages, for example trying to
resolve excessive node expulsions (now looking resolved), frame overruns and
bonding issues.

We monitor a number of metrics via Zabbix, which a colleague (the main admin)
set up. We work closely with the compute team, who manage the compute farm
and the client GPFS cluster configured on it. We're a RHEL shop.

I've joined mainly to lurk, tbh, to see if I can pick up any tips and tweaks.

Hope that's enough!

cheers

Zander
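Since custom GPFS metrics for Zabbix come up again just below, here is
roughly what such a check tends to look like: a UserParameter that shells out
to mmpmon and parses its machine-readable output. Everything in this sketch
is an invented example rather than anything EBI actually runs; the mmpmon
field parsing and the privilege handling (mmpmon needs root, so in practice
you would wrap it with sudo or an equivalent) should be verified on your own
cluster first.

    # /etc/zabbix/zabbix_agentd.d/gpfs.conf
    # gpfs.bytes_read[<fsname>] -- cumulative bytes read for one file system
    UserParameter=gpfs.bytes_read[*],/usr/local/bin/gpfs_bytes_read.sh $1

    # /usr/local/bin/gpfs_bytes_read.sh  (hypothetical helper script)
    #!/bin/bash
    # Ask mmpmon for per-file-system I/O counters in its parseable (-p)
    # format and pull out the _br_ (bytes read) value for the requested fs.
    fs="$1"
    echo fs_io_s | /usr/lpp/mmfs/bin/mmpmon -p 2>/dev/null | \
        awk -v fs="$fs" '$1 == "_fs_io_s_" {
            ok = 0
            for (i = 1; i < NF; i++) {
                if ($i == "_fs_" && $(i + 1) == fs) ok = 1
                if (ok && $i == "_br_") { print $(i + 1); exit }
            }
        }'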
From chair at gpfsug.org Thu Jul 31 00:38:23 2014
From: chair at gpfsug.org (Jez Tucker (Chair))
Date: Thu, 31 Jul 2014 00:38:23 +0100
Subject: [gpfsug-discuss] Hello!
In-Reply-To: <53D8C897.9000902@ebi.ac.uk>
References: <53D8C897.9000902@ebi.ac.uk>
Message-ID: <53D981EF.3020000@gpfsug.org>

Hi Zander,

We have a git repository. Would you be interested in adding any Zabbix custom
metrics gathering for GPFS to it?

https://github.com/gpfsug/gpfsug-tools

Best,

Jez

From chair at gpfsug.org Thu Jul 31 00:53:31 2014
From: chair at gpfsug.org (Jez Tucker (Chair))
Date: Thu, 31 Jul 2014 00:53:31 +0100
Subject: [gpfsug-discuss] GPFS UG 10 RFE Questionnaire Results available
Message-ID: <53D9857B.6030806@gpfsug.org>

Hello all

The GPFS UG 10 survey results have been compiled and are now available in a
handy PDF. Please see:

http://www.gpfsug.org/2014/07/31/gpfs-ug-10-rfe-questionnaire-results-available/

Each of the survey questions has been entered as an RFE on the IBM GPFS
Request for Enhancement site (http://goo.gl/1K6LBa). On the RFE site you can
vote for any RFE you think has merit, or submit additional RFEs yourself. If
you do submit an additional RFE, let us know on the mailing list so we can
chip in accordingly. We actively encourage you to discuss the RFEs on list.

Best regards to all

Jez (Chair) & on behalf of Claire (Secretary)