Also, when trying #2, I’m still running into a #VALUE! error, similar to:
https://community.anaconda.cloud/t/value-error-for-cells-with-a-dict-over-a-certain-size
However, in this case I’m now also getting an error in the log:
---------------------------------
Traceback (most recent call last):
File "/lib/python3.12/site-packages/pandas/core/internals/managers.py", line 1693, in as_array
arr = self._interleave(dtype=dtype, na_value=na_value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/site-packages/pandas/core/internals/managers.py", line 1734, in _interleave
arr = blk.get_values(dtype)
^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/site-packages/pandas/core/internals/blocks.py", line 2244, in get_values
values = values.astype(object)
^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/site-packages/pandas/core/arrays/datetimes.py", line 739, in astype
return dtl.DatetimeLikeArrayMixin.astype(self, dtype, copy)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/site-packages/pandas/core/arrays/datetimelike.py", line 458, in astype
converted = ints_to_pydatetime(
^^^^^^^^^^^^^^^^^^^
MemoryError
My code is roughly:
merged_data = []
for year in historical_data:
for month in historical_data[year]:
for peak in historical_data[year][month]:
act_group = historical_data[year][month][period].get('ACT')
fc_group = historical_data[year][month][period].get('FC')
if act_group is not None and fc_group is not None:
merged_df = pd.merge(act_group, fc_group, on=['Year', 'Month', 'Period'], how='inner', suffixes=('_ACT', '_FC'))
merged_df['Relative_Delta'] = (merged_df['ActVal'] - merged_df['FCVal']) / merged_df['FCVal']
merged_df = merged_df.assign(Year=year, Month=month, Period=period)
merged_data.append(merged_df)
merged_data = pd.concat(merged_data, ignore_index=True)
merged_data
(this results in a #VALUE!) error with the log output above
However, if I change that last line to
"merged_data"
so that it just returns a string then it works fine without errors.
But the fact that that works and “returning” the data structure (to be used as a reference) fails with an error tells me it’s something associated with how it is represented in the spreadsheet
EDIT:
I also tried implementing your first suggestion, by defining ‘foo’ in “Imports and Definitions”. The problem I run into here is that I use the REF() function to define the source for the data that goes into ‘foo’, but REF() is not available yet.
I tried simply defining ‘foo’ in there, eg
foo = pd.DataFrame()
Then populating it in cell A1 for example. But although the variable itself persists, the data does not - every time A1 recalulates ‘foo’ starts empty. I thought perhaps adding
global foo
at the top might solve that, but it does not