Commit 37d5a12

Author: Benjamin Moody
Implement reading signals in format 24.

In signal format 24, each sample is stored in three bytes, as a little-endian signed integer. Since most architectures don't natively support 24-bit integers, neither does numpy, and therefore in order to read the signal, it must be converted from the on-disk format into some format that numpy can work with (i.e., 32-bit signed integers).

Previously, when loading a format-24 record using rdrecord or rdsamp, the internal function _rd_dat_file made a halfhearted attempt to read the signals, by extracting the two most significant bytes of each sample, discarding the least significant byte, and storing the result as a 16-bit integer. This meant not only that the precision was silently reduced, but also that the resulting physical sample values were incorrect by a factor of 256.

(Additionally, the implementation of _rd_dat_file was broken if the signal file contained a prolog, if the signal file could not be mmapped, or if the entire signal file was not meant to be read at once.)

Fix this by handling format 24 in the same way as the other unaligned formats (212, 310, and 311):

- Read the data initially as an array of unsigned bytes; this means the 'dtype' passed to numpy.fromfile is numpy.dtype('<u1') and the 'count' is the number of bytes rather than the number of samples.

- Reformat the array of bytes into an array of integers in the _blocks_to_samples function.

This means format 24 must be considered an "unaligned" rather than an "aligned" format.

In order to avoid making unnecessary copies of the data, rather than using numpy arithmetic operations, the middle and low bytes are *copied* from the input array into the corresponding locations in an 8-bit view of the 32-bit output array. This is dependent on the system byte order.
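
To make the "factor of 256" point concrete, here is a minimal standard-library sketch. It is not code from this commit: the helper names `decode_fmt24` and `decode_fmt24_old` are hypothetical, and the second one only imitates the previous behaviour as the message describes it (keeping the two most significant bytes as a 16-bit integer).

```python
def decode_fmt24(raw):
    """Decode little-endian signed 24-bit samples from a bytes object."""
    return [int.from_bytes(raw[i:i + 3], 'little', signed=True)
            for i in range(0, len(raw), 3)]


def decode_fmt24_old(raw):
    """Imitate the old behaviour: drop the low byte, keep the top two as int16."""
    return [int.from_bytes(raw[i + 1:i + 3], 'little', signed=True)
            for i in range(0, len(raw), 3)]


# Illustrative sample values, encoded the way format 24 stores them on disk.
samples = [0, 256, -256, 1000, -70000, 8388607, -8388608]
raw = b''.join(v.to_bytes(3, 'little', signed=True) for v in samples)

full = decode_fmt24(raw)      # recovers the original samples
old = decode_fmt24_old(raw)   # each value is the sample divided by 256, rounded down

for original, truncated in zip(full, old):
    print(original, truncated)
```

Under these assumptions the truncated values equal the true samples shifted right by 8 bits, which matches the scale error and loss of precision described above.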
1 parent 5bb2e3c commit 37d5a12

1 file changed: +29 -5 lines changed

wfdb/io/_signal.py

Lines changed: 29 additions & 5 deletions
@@ -1,5 +1,6 @@
 import math
 import os
+import sys

 import numpy as np

@@ -10,10 +11,10 @@
 MAX_I32 = 2147483647
 MIN_I32 = -2147483648

-# Formats in which all samples align with byte boundaries
-ALIGNED_FMTS = ['8', '16', '24', '32', '61', '80', '160']
-# Formats in which not all samples align with byte boundaries
-UNALIGNED_FMTS = ['212', '310', '311']
+# Formats in which all samples align with integer (power-of-two) boundaries
+ALIGNED_FMTS = ['8', '16', '32', '61', '80', '160']
+# Formats in which not all samples align with integer boundaries
+UNALIGNED_FMTS = ['212', '310', '311', '24']
 # Formats which are stored in offset binary form
 OFFSET_FMTS = ['80', '160']
 # All WFDB dat formats - https://www.physionet.org/physiotools/wag/signal-5.htm
@@ -29,7 +30,7 @@
            '160': 16, '212': 12, '310': 10, '311': 10}

 # Numpy dtypes used to load dat files of each format.
-DATA_LOAD_TYPES = {'8': '<i1', '16': '<i2', '24': '<i3', '32': '<i4',
+DATA_LOAD_TYPES = {'8': '<i1', '16': '<i2', '24': '<u1', '32': '<i4',
                    '61': '>i2', '80': '<u1', '160': '<u2', '212': '<u1',
                    '310': '<u1', '311': '<u1'}

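
As a side note on the dtype change above, the following sketch illustrates why a 3-byte integer dtype string cannot be used with NumPy and why loading as unsigned bytes gives three array elements per sample. The helper `is_valid_dtype` is hypothetical and not part of wfdb.

```python
import numpy as np


def is_valid_dtype(spec):
    """Return True if NumPy accepts `spec` as a dtype string."""
    try:
        np.dtype(spec)
    except TypeError:
        return False
    return True


print(is_valid_dtype('<i3'))  # NumPy has no 3-byte integer type
print(is_valid_dtype('<u1'))  # unsigned bytes are always available

# Three format-24 samples occupy nine bytes, i.e. nine '<u1' elements.
three_samples = bytes([0x01, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x80])
raw = np.frombuffer(three_samples, dtype='<u1')
n_samp = raw.size // 3
```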
@@ -1398,6 +1399,9 @@ def _rd_dat_file(file_name, dir_name, pn_dir, fmt, start_byte, n_samp):
     elif fmt in ['310', '311']:
         byte_count = _required_byte_num('read', fmt, n_samp)
         element_count = byte_count
+    elif fmt == '24':
+        byte_count = n_samp * 3
+        element_count = byte_count
     else:
         element_count = n_samp
         byte_count = n_samp * BYTES_PER_SAMPLE[fmt]
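
The byte-count logic in this hunk can be illustrated with a small stand-alone sketch. It is not the body of `_rd_dat_file`: the helper `read_fmt24_bytes`, the temporary file, and the 5-byte prolog are assumptions made up for the example. It only shows that, for format 24, the `count` passed to `numpy.fromfile` is a number of bytes (three per sample) and that a prolog can be skipped with `offset`.

```python
import os
import tempfile

import numpy as np


def read_fmt24_bytes(path, start_byte, n_samp):
    """Read the raw bytes of n_samp format-24 samples (hypothetical helper)."""
    byte_count = n_samp * 3      # three bytes per sample
    element_count = byte_count   # elements are single bytes ('<u1')
    return np.fromfile(path, dtype=np.dtype('<u1'),
                       count=element_count, offset=start_byte)


# Build a throwaway file: an assumed 5-byte prolog followed by three samples.
samples = [1, -2, 70000]
prolog = b'\x00' * 5
payload = b''.join(v.to_bytes(3, 'little', signed=True) for v in samples)

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(prolog + payload)
    path = f.name

try:
    raw = read_fmt24_bytes(path, start_byte=len(prolog), n_samp=len(samples))
finally:
    os.remove(path)
```

The resulting array still holds raw bytes; turning them into 32-bit samples is a separate step.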
@@ -1535,6 +1539,26 @@ def _blocks_to_samples(sig_data, n_samp, fmt):
         # Loaded values as un_signed. Convert to 2's complement form.
         # Values > 2^9-1 are negative.
         sig[sig > 511] -= 1024
+
+    elif fmt == '24':
+        # The following is equivalent to:
+        #   sig = (sig_data[2::3].view('int8').astype('int32') * 65536
+        #          + sig_data[1::3].astype('uint16') * 256
+        #          + sig_data[0::3])
+
+        # Treat the high byte as signed and shift it by 16 bits.
+        sig = np.left_shift(sig_data[2::3].view('int8'), 16, dtype='int32')
+
+        # Directly copy the low and middle bytes.
+        if sys.byteorder == 'little':
+            sig.view('uint8')[0::4] = sig_data[0::3]
+            sig.view('uint8')[1::4] = sig_data[1::3]
+        elif sys.byteorder == 'big':
+            sig.view('uint8')[3::4] = sig_data[0::3]
+            sig.view('uint8')[2::4] = sig_data[1::3]
+        else:
+            raise NotImplementedError
+
     return sig


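
The equivalence claimed in the code comment of the last hunk can be checked with a short stand-alone sketch. The two function names below are hypothetical wrappers written for this illustration, and the test samples are arbitrary. The byte-copy version mirrors the added lines, while the arithmetic version follows the commented-out expression.

```python
import sys

import numpy as np


def decode_fmt24_arithmetic(sig_data):
    """Reference decoder: combine the three bytes with integer arithmetic."""
    return (sig_data[2::3].view('int8').astype('int32') * 65536
            + sig_data[1::3].astype('uint16') * 256
            + sig_data[0::3])


def decode_fmt24_bytecopy(sig_data):
    """Decoder mirroring the diff: shift the signed high byte, then copy bytes."""
    # The high byte carries the sign; widening to int32 sign-extends it.
    sig = np.left_shift(sig_data[2::3].view('int8'), 16, dtype='int32')
    out_bytes = sig.view('uint8')  # 8-bit view of the 32-bit output
    if sys.byteorder == 'little':
        out_bytes[0::4] = sig_data[0::3]  # low byte
        out_bytes[1::4] = sig_data[1::3]  # middle byte
    elif sys.byteorder == 'big':
        out_bytes[3::4] = sig_data[0::3]  # low byte
        out_bytes[2::4] = sig_data[1::3]  # middle byte
    else:
        raise NotImplementedError
    return sig


# Arbitrary test samples covering zero, small values, and both 24-bit extremes.
samples = [0, 1, -1, 256, -256, 70000, -70000, 8388607, -8388608]
payload = b''.join(v.to_bytes(3, 'little', signed=True) for v in samples)
sig_data = np.frombuffer(payload, dtype='<u1')

by_copy = decode_fmt24_bytecopy(sig_data)
by_math = decode_fmt24_arithmetic(sig_data)
```

Both decoders should return the same `int32` values. The byte-copy variant avoids the intermediate arrays that the arithmetic expression allocates, at the cost of depending on `sys.byteorder`.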