Skip to content Skip to sidebar Skip to footer

Most Efficient And Most Pythonic Way To Deep Copy Class Instance And Perform Additional Manipulation

Say I have a Python class instance, here's a really simple example: class Book: def __init__(self, genre): self.genre = genre self.author = self.title = None

Solution 1:

You may be looking for the cls.__new__() method. It allows you to control instance creation. You can call this function to create a blank instance, then update the dict of this instance with the dict of your original instance:

Here's an example:

classA:
    def__init__(self, a):
        self.a = a

    defnewA(self):
        b = type(self).__new__(type(self))
        b.__dict__.update(self.__dict__)
        return b


x = A(42)
x.a # 42
y = x.newA()
y.a # 42

This is very different from a deepcopy, but I'm suggesting it because of your example and the way you worded your question. No copying is actually being done--the new instance contains references to the same data as the old instance, making it susceptible to side-effects. but I offer it because in the example you gave, the new class you created was based on self.genre, which indicates that you aren't as concerned about whether or not the data in the new instance is a copy so much as about the computational cost of generating the new instance.

Solution 2:

  1. Use patterns matter a lot. If I'm just starting to think about efficiency for deep copies of large objects in a project, I rather enjoy lenses. The idea is that instead of interacting with a Book directly you'll interact with something that's acts like a Book and internally keeps track of immutable deltas and changesets when you modify it.

  2. This is up for debate, but copy.deepcopy from the standard library is hard to beat in terms of pythonic code.

  3. Absolutely, no questions.

3a. As a somewhat pathological example, if your class is only storing a few bytes then straight up copying the entire class isn't any more expensive than any modification you were about to do to it. Your deepcopy strategy would be to just copy it and do what you need to do -- the pythonic solution would also be the "best" solution by most metrics.

3b. If you're storing 10**10 bytes per class and only modifying 5 then a full copy is probably incredibly expensive compared to the work you're doing. Lenses can look really attractive (since they'll just hold a reference to the original object and the 5 byte change).

3c. You mentioned data access times as a potential factor. I'll use a long linked list as an example here -- both a full deepcopy and lenses will perform poorly here since they need to walk the entire list for any naive implementation. If you knew you needed to deepcopy linked lists though you could come up with some kind of immutable tree structure bisecting the list and use that to adopt a quasi-lenses strategy, only swapping out the leaves/branches of the tree that you change.

3*. Elaborating on the examples a little more, you're choosing a representation for a collection of objects in your code that can be conceptualized as a chain of deep copies and modifications. There are some classic approaches (copy all the data and make changes to the copies, leave the data alone and record all the changes) that work on arbitrary objects that have certain performance characteristics, but if you have additional information about the problem (e.g., these are all big arrays and only one coordinate changes) you can use that to design a representation that best fits your problem, and more importantly your desired performance trade-offs.

Post a Comment for "Most Efficient And Most Pythonic Way To Deep Copy Class Instance And Perform Additional Manipulation"