why not .append here? #7

Open
dneise opened this issue Jun 29, 2020 · 3 comments

Comments

@dneise

dneise commented Jun 29, 2020

Hello, sorry for the noise, I was just curious why .append() was not used here.

@initialed85
Contributor

Hey there, no reason I can think of; it's just the way I normally do it, i.e. concatenating two lists (one of which has a length of 1) rather than appending to a list.

You got me interested, though, so I did some profiling (natively, on my MacBook with Python 2.7 and Python 3.7).

Test code

import cProfile

import sys

print("{}\n".format(sys.version))


def work_a():
    thing = []
    for i in range(0, 1024000):
        thing += [1]
    print("list length is {}".format(len(thing)))


def work_b():
    thing = []
    for i in range(0, 1024000):
        thing.append(1)
    print("list length is {}".format(len(thing)))


cProfile.run("work_a()")
print("----")
cProfile.run("work_b()")

Python 2.7 result

2.7.16 (default, Jul  5 2020, 02:24:03) 
[GCC 4.2.1 Compatible Apple LLVM 11.0.3 (clang-1103.0.29.21) (-macos10.15-objc-

list length is 1024000
         6 function calls in 0.194 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.007    0.007    0.194    0.194 <string>:1(<module>)
        1    0.165    0.165    0.187    0.187 scratch_186.py:8(work_a)
        1    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'format' of 'str' objects}
        1    0.022    0.022    0.022    0.022 {range}


----
list length is 1024000
         1024006 function calls in 0.215 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.215    0.215 <string>:1(<module>)
        1    0.136    0.136    0.209    0.209 scratch_186.py:15(work_b)
        1    0.000    0.000    0.000    0.000 {len}
  1024000    0.064    0.000    0.064    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'format' of 'str' objects}
        1    0.009    0.009    0.009    0.009 {range}

Python 3.7 result

3.7.7 (default, Mar 10 2020, 15:43:33) 
[Clang 11.0.0 (clang-1100.0.33.17)]

list length is 1024000
         7 function calls in 0.091 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.091    0.091 <string>:1(<module>)
        1    0.085    0.085    0.085    0.085 scratch_187.py:8(work_a)
        1    0.000    0.000    0.091    0.091 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.print}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'format' of 'str' objects}


----
list length is 1024000
         1024007 function calls in 0.182 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.004    0.004    0.182    0.182 <string>:1(<module>)
        1    0.118    0.118    0.179    0.179 scratch_187.py:15(work_b)
        1    0.000    0.000    0.182    0.182 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.print}
  1024000    0.061    0.000    0.061    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'format' of 'str' objects}

Oddly, it seems the list concatenation approach is faster, which is complete luck on my part. It doesn't really make sense though, from what I've read: .append should be appending to the existing list, whereas += should expand to thing = thing + [1] (which reassigns thing).

/shrug
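
A quick way to check the expectation about += is to keep a second reference to the list and see whether it still points at the same object afterwards. The sketch below is only an illustration (the function names are made up for this example); it relies on the fact that for built-in lists, += goes through list.__iadd__, which extends the existing list in place, while thing = thing + [1] builds a new list and rebinds the name.

```python
def iadd_keeps_identity():
    """Use += and report whether `thing` is still the original list object."""
    thing = []
    alias = thing          # second reference to the same list
    thing += [1]           # list.__iadd__: extends the existing list in place
    return thing is alias, alias


def add_rebinds_name():
    """Use thing = thing + [1] and report whether `thing` is still the original."""
    thing = []
    alias = thing
    thing = thing + [1]    # builds a new list and rebinds `thing` to it
    return thing is alias, alias


if __name__ == "__main__":
    print(iadd_keeps_identity())   # (True, [1])
    print(add_rebinds_name())      # (False, [])
```

So for lists, += behaves like an in-place extend rather than a reassignment to a freshly built list, which makes it less surprising that it is in the same ballpark as .append.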

@dneise
Author

dneise commented Aug 24, 2020

Hello, wow, thanks for taking the time to answer my question. The results confused me, and I was unable to reproduce them.
I used timeit while you used cProfile, and I found a note in the cProfile documentation saying:

The profiler modules are designed to provide an execution profile for a given program, not for benchmarking purposes (for that, there is timeit for reasonably accurate results). This particularly applies to benchmarking Python code against C code: the profilers introduce overhead for Python code, but not for C-level functions, and so the C code would seem faster than any Python one.
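
One way to see where the profiler's bookkeeping goes is to count how many calls cProfile records for each variant. The following is a minimal sketch, not a benchmark; the helper names (fill_with_append, fill_with_iadd, profiled_call_count) are hypothetical. It mirrors the call counts in the profiles above: every .append() is recorded as a separate call, while the += loop produces no per-iteration call entries.

```python
import cProfile
import pstats


def fill_with_append(n):
    items = []
    for _ in range(n):
        items.append(1)    # each iteration is a method call the profiler records
    return items


def fill_with_iadd(n):
    items = []
    for _ in range(n):
        items += [1]       # an operator, not a call; no per-iteration profile entry
    return items


def profiled_call_count(func, n):
    """Run func(n) under cProfile and return the total number of recorded calls."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(n)
    profiler.disable()
    return pstats.Stats(profiler).total_calls


if __name__ == "__main__":
    n = 100000
    print("append:", profiled_call_count(fill_with_append, n))
    print("+=    :", profiled_call_count(fill_with_iadd, n))
```

Since the profiler does extra work for every call it records, the .append() loop plausibly carries about a million more units of profiling overhead than the += loop, which would skew a cProfile-based comparison.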

Python 3.7.4 (default, Aug 13 2019, 20:35:49) 

In [1]: %%timeit thing = []
   ...:     thing += [1]
   ...: 
51.1 ns ± 0.101 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [2]: %%timeit thing = []
   ...:     thing.append(1)
   ...: 
44.8 ns ± 0.198 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Note: in a %%timeit cell magic, the first line is setup code, which is not timed; only the body is timed.
https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit
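
For anyone without IPython, roughly the same measurement can be made with the standard-library timeit module. This is a sketch under the assumption that taking the best of a few repeats is an acceptable summary; the helper name time_statement and the repeat/number values are arbitrary choices for this example. As with the cell magic, the setup string is not timed.

```python
import timeit


def time_statement(stmt, number=1000000, repeat=5):
    """Return the best observed time per execution of stmt, in seconds."""
    # setup runs once per repeat and is excluded from the timing;
    # stmt is executed `number` times per repeat.
    runs = timeit.repeat(stmt=stmt, setup="thing = []", number=number, repeat=repeat)
    return min(runs) / number


if __name__ == "__main__":
    for stmt in ("thing += [1]", "thing.append(1)"):
        per_loop_ns = time_statement(stmt) * 1e9
        print("{:<16} {:.1f} ns per loop".format(stmt, per_loop_ns))
```

The absolute numbers will vary by machine and Python version, so only the relative difference between the two statements is worth comparing.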

@dneise
Author

dneise commented Aug 24, 2020

Python 2.7.18 |Anaconda, Inc.| (default, Apr 23 2020, 22:42:48) 
In [1]: %%timeit thing = []
   ...:     thing.append(1)
   ...: 
The slowest run took 61.94 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 50 ns per loop

In [2]: %%timeit thing = []
   ...:     thing += [1]
   ...: 
10000000 loops, best of 3: 68 ns per loop

# Using a loop to avoid any kind of potential caching:

In [3]: %%timeit thing = []
    ...: for i in range(1000000):
    ...:     thing.append(i)
    ...: 
10 loops, best of 3: 67.9 ms per loop

In [4]: %%timeit thing = []
    ...: for i in range(1000000):
    ...:     thing += [i]
    ...: 
10 loops, best of 3: 86.8 ms per loop
