[gpfsug-discuss] Kernel > 4.10, python >= 3.8 issue

Ray Coetzee coetzee.ray at gmail.com
Wed May 26 21:21:32 BST 2021


Thank you for the info Felipe.

I'll test with 5.0.5-7 in the morning.

Kind regards

Ray Coetzee




On Wed, May 26, 2021 at 7:36 PM Felipe Knop <knop at us.ibm.com> wrote:

> Ray,
>
> Apologies; I should have added more details. My records show that there is
> a fix for IJ28891 and IJ29942, which was delivered in
>
> 5.1.0.2
> 5.0.5.5
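
A minimal sketch for checking an installed level against the two fix levels quoted above. The helper names are hypothetical, and it assumes that every level at or above 5.0.5.5 in the 5.0.5 stream, and every level from 5.1.0.2 upward, carries the fix; the thread only states the two delivery levels.

```python
import re

# Fix levels quoted in this thread for IJ28891 / IJ29942.
FIX_IN_505_STREAM = (5, 0, 5, 5)
FIX_IN_510_STREAM = (5, 1, 0, 2)


def parse_level(text):
    """Turn '5.0.5-2' or '5.0.5.5' into a 4-tuple of ints."""
    parts = [int(p) for p in re.split(r"[.-]", text.strip())]
    if len(parts) != 4:
        raise ValueError(f"expected four numeric fields, got {text!r}")
    return tuple(parts)


def has_sendfile_fix(text):
    """Return True if the level is at or above a quoted fix level.

    Assumption (not confirmed in the thread): later levels keep the fix.
    """
    level = parse_level(text)
    if level[:3] == (5, 0, 5):
        return level >= FIX_IN_505_STREAM
    return level >= FIX_IN_510_STREAM
```

With this sketch, `has_sendfile_fix("5.0.5-2")` is false for the level where the problem was reported, and `has_sendfile_fix("5.0.5-7")` is true.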
>
> "Program error", I believe, is a code that indicates "needs to be fixed in
> our product".
>
> I agree that the workaround mentioned is not "actionable". The APAR page
> should have made clear that a fix is available.
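
Until the fixed level is installed, an application-side workaround may be more actionable than avoiding page-sized files. The sketch below assumes CPython 3.8 or later, where shutil consults the private module flag `_USE_CP_SENDFILE` before using its sendfile fast path. That flag is undocumented and may change between releases; `copy_without_sendfile` is a hypothetical helper name.

```python
import shutil


def copy_without_sendfile(src, dst):
    """Copy src to dst with shutil.copy, bypassing the sendfile fast path.

    Relies on the private flag shutil._USE_CP_SENDFILE (an implementation
    detail of CPython >= 3.8). Not thread-safe: the flag is process-wide.
    """
    saved = getattr(shutil, "_USE_CP_SENDFILE", None)
    if saved is not None:
        shutil._USE_CP_SENDFILE = False  # force the plain read/write path
    try:
        return shutil.copy(src, dst)
    finally:
        if saved is not None:
            shutil._USE_CP_SENDFILE = saved  # restore the original setting
```

Called as `copy_without_sendfile('test.file', 'test2.file')`, this should take the ordinary read/write loop instead of `os.sendfile`, at some cost in copy speed.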
>
> Regards,
>
>   Felipe
>
> ----
> Felipe Knop knop at us.ibm.com
> GPFS Development and Security
> IBM Systems
> IBM Building 008
> 2455 South Rd, Poughkeepsie, NY 12601
> (845) 433-9314 T/L 293-9314
>
>
>
>
> ----- Original message -----
> From: Ray Coetzee <coetzee.ray at gmail.com>
> To: Felipe Knop <knop at us.ibm.com>
> Cc: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel > 4.10, python >= 3.8 issue
> Date: Wed, May 26, 2021 2:27 PM
>
> Hi Felipe
>
> We looked at APARs IJ28891 and IJ29942. Both look identical, but both are
> closed as a "program error" with a workaround of "Do not use a file size
> which that is a multiple of the page size."
>
> Kind regards
>
> Ray Coetzee
>
>
> On Wed, May 26, 2021 at 6:45 PM Felipe Knop <knop at us.ibm.com> wrote:
>
> Ray,
>
> I wonder if you are hitting the problem that was fixed by the following
> APAR:
>
> https://www.ibm.com/support/pages/apar/IJ28891
>
>
>   Felipe
>
>
> ----- Original message -----
> From: Ray Coetzee <coetzee.ray at gmail.com>
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Cc:
> Subject: [EXTERNAL] [gpfsug-discuss] Kernel > 4.10, python >= 3.8 issue
> Date: Wed, May 26, 2021 1:38 PM
>
> Hello all
>
> I'd be interested to know if anyone else has experienced a problem with
> kernel > 4.10, Python >= 3.8 and Spectrum Scale (5.0.5-2).
>
>
> We noticed that Python's shutil.copy() is failing against a GPFS mount with:
>
> BlockingIOError: [Errno 11] Resource temporarily unavailable: 'test.file'
> -> 'test2.file'
>
> To reproduce the error:
>
> ```
> [user@login01]$ module load python-3.8.9-gcc-9.3.0-soqwnzh
>
> [user@login01]$ truncate --size 640MB test.file
> [user@login01]$ python3 -c "import shutil; shutil.copy('test.file', 'test2.file')"
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "/hps/software/spack/opt/spack/linux-centos8-sandybridge/gcc-9.3.0/python-3.8.9-soqwnzhndvqpk3mly3w6z6zx6cdv45sn/lib/python3.8/shutil.py", line 418, in copy
>     copyfile(src, dst, follow_symlinks=follow_symlinks)
>   File "/hps/software/spack/opt/spack/linux-centos8-sandybridge/gcc-9.3.0/python-3.8.9-soqwnzhndvqpk3mly3w6z6zx6cdv45sn/lib/python3.8/shutil.py", line 275, in copyfile
>     _fastcopy_sendfile(fsrc, fdst)
>   File "/hps/software/spack/opt/spack/linux-centos8-sandybridge/gcc-9.3.0/python-3.8.9-soqwnzhndvqpk3mly3w6z6zx6cdv45sn/lib/python3.8/shutil.py", line 172, in _fastcopy_sendfile
>     raise err
>   File "/hps/software/spack/opt/spack/linux-centos8-sandybridge/gcc-9.3.0/python-3.8.9-soqwnzhndvqpk3mly3w6z6zx6cdv45sn/lib/python3.8/shutil.py", line 152, in _fastcopy_sendfile
>     sent = os.sendfile(outfd, infd, offset, blocksize)
> BlockingIOError: [Errno 11] Resource temporarily unavailable: 'test.file' -> 'test2.file'
> ```
>
> Investigating why this happens revealed that:
>
> 1. It fails for Python 3.8 and above.
> 2. It happens only on a GPFS mount.
> 3. It happens with files whose size is a multiple of 4 KB (the OS page size).
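
The three findings above can be probed with a small script along these lines. It is only a sketch: `probe_copy` is a hypothetical helper, the directory to test is supplied by the caller, and the sizes are chosen around one page so that exact multiples can be compared with their neighbours.

```python
import mmap
import os
import shutil


def probe_copy(directory, sizes):
    """Try shutil.copy on a sparse file of each size in `directory`.

    Returns a dict mapping size -> None on success, or the OSError raised.
    """
    results = {}
    for size in sizes:
        src = os.path.join(directory, f"probe_{size}.src")
        dst = os.path.join(directory, f"probe_{size}.dst")
        with open(src, "wb") as f:
            f.truncate(size)  # sparse file, like `truncate --size`
        try:
            shutil.copy(src, dst)
            results[size] = None
        except OSError as err:
            results[size] = err
        finally:
            # Clean up whatever was created, even after a failed copy.
            for path in (src, dst):
                if os.path.exists(path):
                    os.remove(path)
    return results


def sizes_around_page(pages=1):
    """Sizes just below, at, and just above `pages` OS pages."""
    exact = pages * mmap.PAGESIZE
    return [exact - 1, exact, exact + 1]
```

Running `probe_copy('/path/on/gpfs', sizes_around_page())` and the same call against a local directory would be expected, if the findings hold, to show an error only for the exact page multiple on the GPFS mount.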
>
> Relevant links:
> https://bugs.python.org/issue43743
> https://www.ibm.com/support/pages/apar/IJ28891
>
>
> An strace revealed that, at the lower level, the failure seems to come from
> the Linux "sendfile" syscall, which seems to fail in these cases on GPFS.
>
>
> Strace for a 4096 B file:
>
> ```
> sendfile(4, 3, [0] => [4096], 8388608) = 4096
> sendfile(4, 3, [4096], 8388608) = -1 EAGAIN (Resource temporarily unavailable)
> ```
>
> The same file on another disk:
>
> ```
> sendfile(4, 3, [0] => [4096], 8388608) = 4096
> sendfile(4, 3, [4096], 8388608) = 0
> ```
>
> IBM's "fix" for the problem, "Do not use a file size which that is a
> multiple of the page size.", sounds really blasé.
>
>
> Kind regards
>
> Ray Coetzee
>
>