python 3.x - Pandas to Oracle via SQL Alchemy: UnicodeEncodeError: 'ascii' codec can't encode character -
using pandas 18.1...
i'm trying iterate through folder of csvs read each csv , send oracle database table. there non-ascii character lurking in 1 of many csvs (more reveling in anguish). keep getting error:
unicodeencodeerror: 'ascii' codec can't encode character '\xab' in position 8: ordinal not in range(128)
here's code:
import pandas pd import pandas.io.sql psql sqlalchemy import create_engine import cx_oracle cx engine = create_engine('oracle+cx_oracle://schema:'+pwd+'@server:port/service_name' ,encoding='latin1') name='table' path=r'path_to_folder' filelist = os.listdir(path) file in filelist: df = pd.read_csv(pathc+'\\'+file,encoding='latin1',index_col=0) df=df.astype('unicode') df['date'] = pd.to_datetime(df['date']) df['date'] = pd.to_datetime(df['contract_effdt'],format='%yyyy-%mm-%dd') df.to_sql(name, engine, if_exists = 'append')
i've tried following:
- encoding=utf-8 (in engine, if in read_csv, throws error)
- adding ?encoding=utf8 after "service_name" in engine
- using df=df.astype('unicode') (and not)
what want do: replace unreadable character else and, importantly, proceed sending data oracle.
note:
the data file i'm using cms.gov site. here's zip file example. i'm using "contracts_info" file.
thanks in advance!
i encoded string fields utf-8 individually , may have helped (a new error occurred, assume not related this):
dfc['organization type'] = dfc['organization type'].str.encode('utf-8')
new error:
databaseerror: (cx_oracle.databaseerror) ora-00904: "contract_id": invalid identifier
this because "contract_id" not set index. once did that, went (except being slower molasses, begins next adventure).
Comments
Post a Comment