"saving" / "loading" object data in python

gsteele13 · 31 March 2021 19:13

Hi all,

I’d like to be able to “save the settings” of an object and reload them later.

The attributes I want to save/load are small and pretty simple data formats (nothing fancy: int, float, dict, strings, lists of those), so I’d like to use JSON to keep things simple.

So far, the best I’ve come up with is to hack through the dir() listing of self and pluck out non-callable items not starting with _ into a dict, which I then serialise. And for loading, I hack it a bit using exec():

import json

class Foo:
    a = 1
    b = 1.5
    c = "a string"
    d = ["a", "list", "of", "strings"]
    
    def save_settings(self):
        settings = {}
        attributes = dir(self)
        for k in attributes: 
            if k[0] != "_" and not callable(eval("self."+k)):
                settings[k] = eval("self."+k)
        print(settings)
        with open("settings.json", "w") as f:
            json.dump(settings, f, indent=4)
    
    def load_settings(self):
        with open("settings.json", "r") as f:
            settings = json.load(f)
        print(settings)
        # Backward compatibility could be handled here by 
        # using a try/except block
        for k in settings.keys():
            exec(f"self.{k} = settings['{k}']")

Does this seem reasonable? Or is there a less “dirty” way to do this? I’m a bit of a noob, so this is mostly googling and reverse engineering.

Thanks!
Gary

slavoutich · 1 April 2021 21:03

At a glance, a less hacky way to load settings would be:

self.__dict__.update(settings)
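
As a minimal sketch of that idea (the class Foo and the settings dict below are illustrative stand-ins, not code from the thread), update() overwrites the instance attributes named in the dict and leaves the others alone. A setattr() loop does the same job one attribute at a time, without needing exec():

```python
class Foo:
    def __init__(self):
        self.a = 1
        self.b = 1.5


# Stand-in for the dict that json.load() would return.
settings = {"a": 42}

foo = Foo()
foo.__dict__.update(settings)  # sets foo.a, leaves foo.b untouched

# Same effect, one attribute at a time, and without exec():
bar = Foo()
for key, value in settings.items():
    setattr(bar, key, value)
```

One design note: both variants accept any key found in the dict, so a check against an allowed list of names may be worth adding if the file is not trusted.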

But I would, of course, not invent anything new and would look around PyPI first, since this is a very common task; say, this looks reasonable for dataclasses. In general, the problem you are solving is called “Python object JSON serialization”, and it is very googleable.
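
For a rough idea of what the dataclass route can look like, here is a sketch using only the standard library (dataclasses.asdict plus json), not the PyPI package referred to above. The Settings class and its fields are assumptions that mirror the attributes from the original question:

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Settings:
    a: int = 1
    b: float = 1.5
    c: str = "a string"
    d: list = field(default_factory=lambda: ["a", "list", "of", "strings"])


original = Settings(a=7)

# dataclass -> dict -> JSON text
text = json.dumps(asdict(original), indent=4)

# JSON text -> dict -> dataclass; missing keys would fall back to defaults
restored = Settings(**json.loads(text))
```

This only works directly for flat, JSON-friendly fields like the ones listed in the question; nested dataclasses would need extra handling on the way back in.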

Of course, I can’t go away without mentioning traitlets, because Jupyter configuration is stored in JSON and all the required machinery is there, but I can’t provide a direct recipe without reading the code or thinking a lot myself.

gsteele13 · 1 April 2021 21:35

cool, thanks slava, i’ll give it a try for sure!

i had a look around for serialisation, you can pickle a lot of things, even classes (with some strangeness about pickling methods), but typically more oriented for saving a full object and loading a full object, rather than a (possibly incomplete) list of attributes

I think I should do some reading on dataclasses too

gsteele13 · 1 April 2021 21:39

I also did not know about the update() function of dict objects, awesome!

gsteele13 · 1 April 2021 21:51

I also learned something weird about python class attributes and self.__dict__: apparently, they are not added into self.__dict__ until they are set once? even though they do have a value…
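
A small sketch of that behaviour (the class Foo here is a cut-down stand-in): reading foo.a falls back to the class, and the attribute only appears in the instance's __dict__ once it has been assigned on the instance.

```python
class Foo:
    a = 1  # class attribute, stored in Foo.__dict__


foo = Foo()

before = dict(foo.__dict__)  # {}  nothing stored on the instance yet
value = foo.a                # 1   lookup falls back to the class
foo.a = 2                    # assignment creates an instance attribute
after = dict(foo.__dict__)   # {'a': 2}
```

The class attribute itself is unchanged by the assignment: Foo.a is still 1 afterwards.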

But if you set them in __init__(), then they are in the dict:

import json

class Foo:
    def __init__(self):    
        self.a = 1
        self.b = 1.5
        self.c = "a string"
        self.d = ["a", "list", "of", "strings"]

    def save_settings(self):
        with open("settings.json", "w") as f:
            json.dump(self.__dict__, f, indent=4)
    
    def load_settings(self):
        with open("settings.json", "r") as f:
            settings = json.load(f)
        self.__dict__.update(settings)

mega handy! and “reverse compatibility” for free
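
To illustrate the compatibility point, here is a sketch under a couple of assumptions: load_settings() takes a path argument (an addition for this example, the code above hard-codes the file name), and the "old" settings file only knows about a. Attributes missing from the file simply keep the defaults set in __init__():

```python
import json
import os
import tempfile


class Foo:
    def __init__(self):
        self.a = 1
        self.b = 1.5

    def load_settings(self, path):
        with open(path, "r") as f:
            self.__dict__.update(json.load(f))


# Write an "old" settings file that only contains `a`.
path = os.path.join(tempfile.mkdtemp(), "settings.json")
with open(path, "w") as f:
    json.dump({"a": 10}, f)

foo = Foo()
foo.load_settings(path)  # foo.a comes from the file, foo.b keeps its default
```

The flip side is that keys in the file which the class no longer uses also end up as attributes, so filtering against the existing keys may be worthwhile.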

gsteele13 · 2 April 2021 18:51

curiously, with traits, if you declare them only in the __init__ function then they are not picked up by the traits() function; you must declare them outside of __init__, at the “top level” of the class body… so kind of the opposite behaviour to __dict__…

gsteele13 · 2 April 2021 19:14

and it seems that traitlets already support configuration files for traits of classes:

https://traitlets.readthedocs.io/en/stable/config.html#python-configuration-files

which is perfect since the things I want to serialise are going to be traits that are (optionally) linked to ipywidget gui controls (see other post… )

anton-akhmerov · 2 April 2021 19:33

Great that you found the solution!

To comment on the other question, the difference is between class attributes, which are defined in the class body and are stored in Bar.__dict__, and instance attributes, which are stored in B.__dict__ (where B is an instance of Bar). One way to get all data associated with an object is dir(B); however, that will return all the methods as well.
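
A sketch of how the two sources could be combined while skipping the methods (the helper public_data() is a hypothetical name made up for this example): walk dir(B), drop names starting with an underscore, and drop anything callable.

```python
class Bar:
    a = 1  # class attribute, lives in Bar.__dict__

    def __init__(self):
        self.b = 2  # instance attribute, lives in the instance's __dict__

    def method(self):
        pass


def public_data(obj):
    """Return the public, non-callable attributes of obj as a dict."""
    return {
        name: getattr(obj, name)
        for name in dir(obj)
        if not name.startswith("_") and not callable(getattr(obj, name))
    }


B = Bar()
instance_only = dict(B.__dict__)  # {'b': 2}
data = public_data(B)             # {'a': 1, 'b': 2}
```

This is essentially the dir() approach from the first post, with getattr() in place of eval().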

slavoutich · 4 April 2021 17:39

My concern here (and why I didn’t point to that directly) was mostly about another thing: if you want to dynamically save/load configuration, you would rather go for JSON configuration files, and I am not sure how these are organized. This is a distinction in Jupyter configs: JSON configs are mostly machine-generated and Python configs are user-generated. I don’t remember what the best way to use dynamic JSON configs is; I would dig into the Jupyter code first to see how they do it.

gsteele13 · 5 April 2021 09:02

Indeed, it seems that traitlets also support JSON config files, but it is not super-clear from the docs if load_config() can also accept JSON files. Will have to dig a bit.