Closed
115 commits
3f84840
delete code blocks containing TextFileReader
ludwiglierhammer Mar 15, 2024
5b3f169
get rid of chunksize
ludwiglierhammer Mar 15, 2024
bad91e7
pre-commit
ludwiglierhammer Mar 15, 2024
7b69471
delete chunked tests
ludwiglierhammer Mar 15, 2024
d11912c
Merge branch 'main' into no_textfilereader
ludwiglierhammer Apr 10, 2024
ae1998c
do not use chunksize
ludwiglierhammer Apr 10, 2024
46a4669
do not import unused packages
ludwiglierhammer Apr 10, 2024
50fd66d
get rid of TextFileReader objects
ludwiglierhammer Apr 10, 2024
4650eca
deleted
ludwiglierhammer Apr 10, 2024
98762eb
delete chunksizes
ludwiglierhammer Apr 10, 2024
3e3812c
do not differentiate between DataFrame and TextFileReader
ludwiglierhammer Apr 10, 2024
70f2e25
stop testing with TextFileReader objects
ludwiglierhammer Apr 10, 2024
22aef16
delete packages for testing TextFileReader objects
ludwiglierhammer Apr 10, 2024
39e7a03
Merge branch 'main' into no_textfilereader
ludwiglierhammer Jun 20, 2024
d0dbbe4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 20, 2024
ba851ed
delete correction parser test
ludwiglierhammer Jun 20, 2024
c00812a
fix with main branch
ludwiglierhammer Jun 21, 2024
9f38d7b
get rid of TextFileReader loop
ludwiglierhammer Jun 21, 2024
22a92d8
simplify
ludwiglierhammer Jun 21, 2024
d3baecf
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 21, 2024
7fef4f4
use header information
ludwiglierhammer Jun 21, 2024
a829dab
Merge branch 'main' into no_textfilereader
ludwiglierhammer Jun 26, 2024
7b2bad7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 26, 2024
eb5ae0a
Merge branch 'glamod:main' into no_textfilereader
ludwiglierhammer Jun 26, 2024
d5fb27e
remove unused import
ludwiglierhammer Jun 26, 2024
dba2265
set default encoding to utf-8
ludwiglierhammer Jun 26, 2024
7475d08
remove StringIO import
ludwiglierhammer Jun 26, 2024
86c7f6c
if statement adjustment
ludwiglierhammer Jun 26, 2024
a9a6183
remove blank lines
ludwiglierhammer Jun 27, 2024
c126c35
gather duplicated lines
ludwiglierhammer Jun 27, 2024
a0dba69
remove if statement which is always True
ludwiglierhammer Jun 27, 2024
db67457
do not use StringIO buffer
ludwiglierhammer Jun 27, 2024
0b5bbbc
Merge branch 'main' into no_textfilereader
ludwiglierhammer Jul 1, 2024
4cf7046
Merge branch 'glamod:main' into no_textfilereader
ludwiglierhammer Jul 2, 2024
3f6f69c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 2, 2024
e877e65
Merge branch 'main' into no_textfilereader
ludwiglierhammer Jul 4, 2024
5854100
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 4, 2024
140a025
make use of ast
ludwiglierhammer Jul 4, 2024
29e3aec
Merge branch 'main' into no_textfilereader
ludwiglierhammer Jul 8, 2024
58473e0
Merge branch 'glamod:main' into no_textfilereader
ludwiglierhammer Jul 8, 2024
21b77a4
remove print statement
ludwiglierhammer Jul 12, 2024
9dcb7fa
remove table from decimal_places function
ludwiglierhammer Jul 12, 2024
0113b0a
convert float to stringprint_float; directly convert list to str for …
ludwiglierhammer Jul 12, 2024
134e66c
Merge branch 'main' into no_textfilereader
ludwiglierhammer Jul 19, 2024
bb83aff
Merge branch 'glamod:main' into no_textfilereader
ludwiglierhammer Aug 13, 2024
049e1df
Merge branch 'glamod:main' into no_textfilereader
ludwiglierhammer Aug 27, 2024
ba3255b
Merge branch 'main' into no_textfilereader
ludwiglierhammer Sep 19, 2024
96c7b2f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 19, 2024
e166472
Merge branch 'main' into no_textfilereader
ludwiglierhammer Sep 27, 2024
97d0c78
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 27, 2024
be4de72
fixing pre-commit hook
ludwiglierhammer Sep 27, 2024
04fe3ef
Merge branch 'glamod:main' into no_textfilereader
ludwiglierhammer Sep 27, 2024
febc03f
Merge branch 'glamod:main' into no_textfilereader
ludwiglierhammer Sep 30, 2024
016b66e
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 1, 2024
62a3020
fixing decimal_places
ludwiglierhammer Oct 1, 2024
9249acd
rename _writing_csv_files to _to_map
ludwiglierhammer Oct 1, 2024
df188dd
write strings and floats adjusted
ludwiglierhammer Oct 2, 2024
5515a90
remove
ludwiglierhammer Oct 2, 2024
22b7262
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 2, 2024
461b96b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 2, 2024
ab3ee89
delete argument TextParser
ludwiglierhammer Oct 2, 2024
6c5cbac
select first list entry
ludwiglierhammer Oct 2, 2024
41fdcd3
Merge branch 'glamod:main' into no_textfilereader
ludwiglierhammer Oct 2, 2024
a7d85fa
Merge branch 'glamod:main' into no_textfilereader
ludwiglierhammer Oct 7, 2024
269dccc
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 7, 2024
4670e10
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 8, 2024
e92778a
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 9, 2024
a886081
import os
ludwiglierhammer Oct 9, 2024
8b26ddd
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 9, 2024
4d9902f
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 14, 2024
7f80d90
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 15, 2024
bc5f8be
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 18, 2024
037756a
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 18, 2024
c782537
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 21, 2024
9213adc
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 21, 2024
a2dc9fe
Merge branch 'main' into no_textfilereader
ludwiglierhammer Oct 23, 2024
f3de2a4
Merge branch 'main' into no_textfilereader
ludwiglierhammer Nov 4, 2024
f09096b
Merge branch 'main' into no_textfilereader
ludwiglierhammer Dec 16, 2024
9dd6b1f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 16, 2024
fab17a0
use default decimal places from properties
ludwiglierhammer Dec 16, 2024
b7f701b
Merge branch 'no_textfilereader' of https://github.com/ludwiglierhamm…
ludwiglierhammer Dec 16, 2024
2816995
Merge branch 'main' into no_textfilereader
ludwiglierhammer Dec 18, 2024
89318e2
Merge branch 'main' into no_textfilereader
ludwiglierhammer Dec 18, 2024
dfb406f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 18, 2024
303356e
delete syntax errors
ludwiglierhammer Dec 18, 2024
5f74783
remove TextFileReader elements from main
ludwiglierhammer Dec 18, 2024
b97c738
merge conflicts
ludwiglierhammer Jan 6, 2025
005560a
Merge branch 'ludwiglierhammer-no_textfilereader'
ludwiglierhammer Jan 6, 2025
316f688
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 6, 2025
7de069d
no hdlr
ludwiglierhammer Jan 6, 2025
a780a85
delete more TextFileReaders
ludwiglierhammer Jan 6, 2025
f59641b
import from workflow_suite
ludwiglierhammer Jan 6, 2025
13829a6
solve merge conflict
ludwiglierhammer Jan 6, 2025
2e15ae9
Merge branch 'main' into no_textfilereader
ludwiglierhammer Jan 13, 2025
cea6c5b
solve merge conflicts
ludwiglierhammer Jan 14, 2025
0a3edb0
remove unused imports
ludwiglierhammer Jan 14, 2025
fa72165
make copy
ludwiglierhammer Jan 14, 2025
8424c70
solving pylint issues
ludwiglierhammer Jan 14, 2025
7a61fde
Merge branch 'main' into no_textfilereader
ludwiglierhammer Jan 17, 2025
9a6b71a
Merge branch 'main' into no_textfilereader
ludwiglierhammer Jan 17, 2025
23d60cd
delete TextFileReader from write_data
ludwiglierhammer Jan 17, 2025
f8a5630
update indentation
ludwiglierhammer Jan 17, 2025
b035081
update more indentation
ludwiglierhammer Jan 17, 2025
b71c552
Merge branch 'main' into no_textfilereader
ludwiglierhammer Feb 5, 2025
40742ed
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 5, 2025
06026ab
data_df -> data
ludwiglierhammer Feb 5, 2025
6353df9
remove unused import
ludwiglierhammer Feb 5, 2025
82250f4
no need for utility function convert_str_boolean
ludwiglierhammer Feb 6, 2025
1573f89
Merge branch 'main' into no_textfilereader
ludwiglierhammer Feb 7, 2025
2f31be5
update encoding
ludwiglierhammer Feb 7, 2025
c16d3c8
resolve merge conflicts
ludwiglierhammer Feb 13, 2025
33d576c
Merge branch 'll-no_textfilereader' into no_textfilereader
ludwiglierhammer Feb 13, 2025
68a3a23
delete unused modules
ludwiglierhammer Feb 13, 2025
3350023
remove comments
ludwiglierhammer Feb 13, 2025
cc35964
Merge branch 'main' into no_textfilereader
ludwiglierhammer Mar 14, 2025
90 changes: 25 additions & 65 deletions cdm_reader_mapper/cdm_mapper/mapper.py
@@ -3,23 +3,19 @@

Created on Thu Apr 11 13:45:38 2019

Maps data contained in a pandas DataFrame (or pd.io.parsers.TextFileReader) to
the C3S Climate Data Store Common Data Model (CDM) header and observational
tables using the mapping information available in the tool's mapping library
Maps data contained in a pandas DataFrame to the C3S Climate Data Store Common Data Model (CDM)
header and observational tables using the mapping information available in the tool's mapping library
for the input data model.

@author: iregon
"""

from __future__ import annotations

from copy import deepcopy
from io import StringIO

import numpy as np
import pandas as pd

from cdm_reader_mapper.common import logging_hdlr, pandas_TextParser_hdlr
from cdm_reader_mapper.common import logging_hdlr

from . import properties
from .codes.codes import get_code_table
@@ -223,11 +219,11 @@ def _map_and_convert(
null_label,
imodel_functions,
codes_subset,
cdm_tables,
cdm_complete,
cdm_atts,
logger,
):
atts = deepcopy(cdm_tables[table]["atts"])
atts = cdm_atts.get(table)
columns = (
[x for x in atts.keys() if x in idata.columns]
if not cdm_complete
@@ -258,10 +254,7 @@ def _map_and_convert(

table_df_i.columns = pd.MultiIndex.from_product([[table], columns])
table_df_i = drop_duplicates(table_df_i)
table_df_i = table_df_i.fillna(null_label)
table_df_i.to_csv(cdm_tables[table]["buffer"], header=False, index=False, mode="a")
cdm_tables[table]["columns"] = table_df_i.columns
return cdm_tables
return table_df_i.fillna(null_label)


def map_and_convert(
@@ -284,11 +277,6 @@ def map_and_convert(

imodel_functions = mapping_functions("_".join([data_model] + list(sub_models)))

# Initialize dictionary to store temporal tables (buffer) and table attributes
cdm_tables = {
k: {"buffer": StringIO(), "atts": cdm_atts.get(k)} for k in imodel_maps.keys()
}

date_columns = {}
for table, values in imodel_maps.items():
date_columns[table] = [
@@ -297,39 +285,22 @@ def map_and_convert(
if "timestamp" in cdm_atts.get(table, {}).get(x, {}).get("data_type")
]

for idata in data:
cols = [x for x in idata]
for table, mapping in imodel_maps.items():
cdm_tables = _map_and_convert(
idata,
mapping,
table,
cols,
null_label,
imodel_functions,
codes_subset,
cdm_tables,
cdm_complete,
logger,
)

table_list = []
for table in cdm_tables.keys():
# Convert dtime to object to be parsed by the reader
logger.debug(
f"\tParse datetime by reader; Table: {table}; Columns: {date_columns[table]}"
)
cdm_tables[table]["buffer"].seek(0)
data = pd.read_csv(
cdm_tables[table]["buffer"],
names=cdm_tables[table]["columns"],
na_values=[],
dtype="object",
keep_default_na=False,
for table in cdm_subset:
mapping = imodel_maps[table]
table_df = _map_and_convert(
data,
mapping,
table,
data.columns,
null_label,
imodel_functions,
codes_subset,
cdm_complete,
cdm_atts,
logger,
)
cdm_tables[table]["buffer"].close()
cdm_tables[table].pop("buffer")
table_list.append(data)
table_list.append(table_df)

merged = pd.concat(table_list, axis=1, join="outer")
return merged.reset_index(drop=True)
@@ -377,25 +348,14 @@ def map_model(
return

# Check input data type and content (empty?)
# Make sure data is an iterable: this is to homogenize how we handle
# dataframes and textreaders
if isinstance(data, pd.DataFrame):
logger.debug("Input data is a pd.DataFrame")
if len(data) == 0:
logger.error("Input data is empty")
return
else:
data = [data]
elif isinstance(data, pd.io.parsers.TextFileReader):
logger.debug("Input is a pd.TextFileReader")
not_empty = pandas_TextParser_hdlr.is_not_empty(data)
if not not_empty:
logger.error("Input data is empty")
return
else:
if not isinstance(data, pd.DataFrame):
logger.error("Input data type " f"{type(data)}" " not supported")
return

if data.empty:
logger.error("Input data is empty")
return

return map_and_convert(
imodel[0],
*imodel[1:],
Expand Down
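
The mapper.py hunks above replace the per-chunk loop and the `StringIO` round trip with one DataFrame per CDM table, concatenated column-wise. The following is a minimal sketch of that flow under stated assumptions: `map_tables` and the `table_columns` dictionary are hypothetical stand-ins for `map_model`/`map_and_convert` and the real mapping library, and the sketch raises exceptions where the PR logs an error and returns.

```python
import pandas as pd


def map_tables(data, table_columns, null_label="null"):
    """Hypothetical stand-in for the simplified mapping flow (not the real API)."""
    # Only DataFrames are accepted now; TextFileReader support is gone.
    if not isinstance(data, pd.DataFrame):
        raise TypeError(f"Input data type {type(data)} not supported")
    if data.empty:
        raise ValueError("Input data is empty")

    table_list = []
    for table, columns in table_columns.items():
        # Assumption: a plain column selection stands in for the real mapping step.
        table_df = data[columns].copy()
        # Label every column with its table name, as the diff does.
        table_df.columns = pd.MultiIndex.from_product([[table], columns])
        # Return the filled frame directly instead of writing it to a buffer.
        table_list.append(table_df.fillna(null_label))

    merged = pd.concat(table_list, axis=1, join="outer")
    return merged.reset_index(drop=True)


data = pd.DataFrame({"report_id": ["a", "b"], "value": ["1.5", None]})
result = map_tables(
    data,
    {"header": ["report_id"], "observations": ["report_id", "value"]},
)
```

Because each table is built in memory and returned as a DataFrame, there is no need to re-parse CSV text with `pd.read_csv`, which is what the removed buffer code did.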
3 changes: 1 addition & 2 deletions cdm_reader_mapper/cdm_mapper/writer.py
@@ -4,8 +4,7 @@
Created on Thu Apr 11 13:45:38 2019

Exports tables written in the C3S Climate Data Store Common Data Model (CDM) format to ascii files,
The tables format is contained in a python dictionary, stored as an attribute in a pandas.DataFrame
(or pd.io.parsers.TextFileReader).
The tables format is contained in a python dictionary, stored as an attribute in a pandas.DataFrame.

This module uses a set of printer functions to "print" element values to a
string object before exporting them to a final ascii file.
31 changes: 4 additions & 27 deletions cdm_reader_mapper/common/inspect.py
@@ -9,9 +9,6 @@
from __future__ import annotations

import numpy as np
import pandas as pd

from cdm_reader_mapper.common import pandas_TextParser_hdlr


def count_by_cat_i(series):
@@ -34,10 +31,7 @@ def get_length(data):
int
Total row count
"""
if not isinstance(data, pd.io.parsers.TextFileReader):
return len(data)
else:
return pandas_TextParser_hdlr.get_length(data)
return len(data)


def count_by_cat(data, columns=None):
@@ -60,23 +54,6 @@ def count_by_cat(data, columns=None):
if not isinstance(columns, list):
columns = [columns]
counts = {}
if not isinstance(data, pd.io.parsers.TextFileReader):
for column in columns:
counts[column] = count_by_cat_i(data[column])
return counts
else:
for column in columns:
data_cp = pandas_TextParser_hdlr.make_copy(data)
count_dicts = []
for df in data_cp:
count_dicts.append(count_by_cat_i(df[column]))

data_cp.close()
cats = [list(x.keys()) for x in count_dicts]
cats = list({x for y in cats for x in y})
cats.sort
count_dict = {}
for cat in cats:
count_dict[cat] = sum([x.get(cat) for x in count_dicts if x.get(cat)])
counts[column] = count_dict
return counts
for column in columns:
counts[column] = count_by_cat_i(data[column])
return counts
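
With the TextFileReader branch removed, `count_by_cat` reduces to a single loop over columns. The sketch below shows how the simplified function might behave on a small DataFrame. The body of `count_by_cat_i` is not visible in this diff, so the `value_counts`-based version here is an assumption, as is the handling of `columns=None`.

```python
import pandas as pd


def count_by_cat_i(series):
    # Assumption: the real helper is not shown in the diff; this version
    # simply counts occurrences of each distinct value.
    return series.value_counts().to_dict()


def count_by_cat(data, columns=None):
    # Assumption: default to all columns when none are given.
    if columns is None:
        columns = data.columns.to_list()
    if not isinstance(columns, list):
        columns = [columns]
    counts = {}
    for column in columns:
        counts[column] = count_by_cat_i(data[column])
    return counts


data = pd.DataFrame({"platform": ["ship", "buoy", "ship"]})
counts = count_by_cat(data, "platform")
```

The removed branch had to merge per-chunk dictionaries by hand; on a single DataFrame the per-column count is already complete.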
94 changes: 0 additions & 94 deletions cdm_reader_mapper/common/pandas_TextParser_hdlr.py

This file was deleted.
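
Since the TextFileReader helpers are deleted, callers that still read input in chunks would presumably need to build a DataFrame before calling the mapper. One possible approach, using only standard pandas calls, is sketched below. The CSV content and column names are made up for illustration, and this is not something the PR itself prescribes.

```python
from io import StringIO

import pandas as pd

# Hypothetical input; in practice this would be a file path.
csv_text = "report_id,value\na,1\nb,2\nc,3\n"

# read_csv with chunksize returns a TextFileReader, which iterates over DataFrames.
with pd.read_csv(StringIO(csv_text), chunksize=2) as reader:
    # Concatenate the chunks into the single DataFrame the mapper now expects.
    data = pd.concat(reader, ignore_index=True)
```

This loads the whole input into memory, which is the trade-off the PR accepts by dropping chunked processing.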
