Python UnicodeDecodeError with SAS file

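If you get the error below while reading a SAS file with Python, or the output shows row values as b'  value', the cause is the same: without an encoding argument, pandas.read_sas returns text columns as raw bytes, and some of those bytes are not valid UTF-8.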


[13:47:17] [INFO] [dku.utils]  -     return lib.map_infer(values, mapper, convert=convert)
[13:47:17] [INFO] [dku.utils]  -   File "lib.pyx", line 2972, in pandas._libs.lib.map_infer
[13:47:17] [INFO] [dku.utils]  -   File "<string>", line 15, in <lambda>
[13:47:17] [INFO] [dku.utils]  - UnicodeDecodeError: 'utf-8' codec can't decode byte 0x95 in position 20: invalid start byte


 Solution:

Use the code below to resolve it. It decodes every bytes value in the text columns as UTF-8 and skips any bytes that cannot be decoded:


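import pandas as pd

# Without an encoding argument, read_sas returns text columns as raw bytes.
df = pd.read_sas('path//20240710.sas7bdat')

# Decode bytes values in object (text) columns; 'ignore' drops bytes that are not valid UTF-8.
for col in df.columns:
    if df[col].dtype == 'object':
        df[col] = df[col].apply(lambda x: x.decode('utf-8', 'ignore') if isinstance(x, bytes) else x)
print(df)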
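
One thing to keep in mind: byte 0x95 is the bullet character in Windows-1252, so the file in the error above is probably not UTF-8, and the 'ignore' handler silently drops such characters. If you would rather keep them, here is a minimal sketch of a fallback decoder. The helper name decode_value and the choice of cp1252 as the second encoding are my assumptions, not something the SAS file tells you; adjust the list to whatever encoding your data was written in.

```python
def decode_value(value, encodings=("utf-8", "cp1252")):
    """Decode a bytes value by trying each encoding in order.

    decode_value is a hypothetical helper; the encoding list is an assumption.
    Non-bytes values (numbers, None, already-decoded strings) are returned unchanged.
    """
    if not isinstance(value, bytes):
        return value
    for encoding in encodings:
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Nothing decoded cleanly: keep the text and mark undecodable bytes with U+FFFD.
    return value.decode(encodings[0], errors="replace")


# The byte from the error message is invalid UTF-8 but a bullet in cp1252.
print(decode_value(b"item \x95 one"))
# Valid UTF-8 is decoded by the first encoding in the list.
print(decode_value(b"caf\xc3\xa9"))
```

To use it with the loop above, pass it to apply in place of the lambda: df[col] = df[col].apply(decode_value). pandas.read_sas also accepts an encoding argument, so if you know the file's encoding, passing it there may make the loop unnecessary.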
