Block storage rewrites
Summary
In a block storage system, new data is written in blocks. We represent the flash memory as one sequential array. We have a list of block writes coming in as arrays of size 2: writes[i] = [first_block_written, last_block_written].
Each block has a rewrite limit. If rewrites on a block reach a specified threshold, we should run a special diagnostic.
Given blockCount (an integer representing the total number of blocks), writes (the list of block-write arrays of size 2), and threshold, your task is to return the list of disjoint block segments, each consisting of blocks that have reached the rewrite threshold. The list of block segments should be sorted in increasing order by their left ends.
Examples
For blockCount = 10, writes = [[0, 4], [3, 5], [2, 6]], and threshold = 2, the output should be blockStorageRewrites(blockCount, writes, threshold) = [[2, 5]].
After the first write, blocks 0, 1, 2, 3 and 4 have each been written to once;
After the second write, blocks 0, 1, 2 and 5 have been written to once, and blocks 3 and 4 have reached the rewrite threshold;
After the final write, blocks 2 and 5 have reached the rewrite threshold as well, so the blocks that should be diagnosed are 2, 3, 4 and 5.
Blocks 2, 3, 4 and 5 form one consecutive segment, [2, 5].
For blockCount = 10, writes = [[0, 4], [3, 5], [2, 6]], and threshold = 3, the output should be blockStorageRewrites(blockCount, writes, threshold) = [[3, 4]].
For blockCount = 10, writes = [[3, 4], [0, 1], [6, 6]], and threshold = 1, the output should be blockStorageRewrites(blockCount, writes, threshold) = [[0, 1], [3, 4], [6, 6]].
Constraints
1 ≤ blockCount ≤ 10**5
0 ≤ writes.length ≤ 10**5
writes[i].length = 2
0 ≤ writes[i][0] ≤ writes[i][1] < blockCount
First try
from itertools import groupby

def group_blocks(num_writes, threshold):
    i = 0
    for key, group in groupby(num_writes, lambda x: x >= threshold):
        # This is faster compared to len(list(group))
        length = sum(1 for _ in group)
        if key:
            yield [i, i + length - 1]
        i += length

def blockStorageRewrites(blockCount, writes, threshold):
    num_writes = [0] * blockCount
    for lower, upper in writes:
        for i in range(lower, upper + 1):
            num_writes[i] += 1
    return list(group_blocks(num_writes, threshold))
Here I just do exactly what is asked: I create an array of size blockCount, loop over the writes, and lastly group the consecutive ranges with itertools.groupby.
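As a quick sanity check, the three worked examples from the problem statement can be run directly against this first implementation (the snippet restates the code above so it is self-contained):

```python
from itertools import groupby

def group_blocks(num_writes, threshold):
    # Yield [start, end] for each maximal run of blocks at or above threshold.
    i = 0
    for key, group in groupby(num_writes, lambda x: x >= threshold):
        length = sum(1 for _ in group)
        if key:
            yield [i, i + length - 1]
        i += length

def blockStorageRewrites(blockCount, writes, threshold):
    # Count writes per block the naive way, then group consecutive runs.
    num_writes = [0] * blockCount
    for lower, upper in writes:
        for i in range(lower, upper + 1):
            num_writes[i] += 1
    return list(group_blocks(num_writes, threshold))

# The three worked examples from the problem statement:
assert blockStorageRewrites(10, [[0, 4], [3, 5], [2, 6]], 2) == [[2, 5]]
assert blockStorageRewrites(10, [[0, 4], [3, 5], [2, 6]], 3) == [[3, 4]]
assert blockStorageRewrites(10, [[3, 4], [0, 1], [6, 6]], 1) == [[0, 1], [3, 4], [6, 6]]
```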
After trying to optimize
I'm not quite sure what the complexity was, but I tried to lessen the load; I still didn't pass the time-limit constraints.
def get_bad_blocks(blockCount, writes, threshold):
    cons = False
    u = l = -1
    for i in range(blockCount):
        count = 0
        for lower, upper in writes:
            if lower <= i <= upper:
                count += 1
                if count >= threshold:
                    if not cons:
                        cons = True
                        l = i
                    u = -1
                    break
        else:
            u = i - 1
            cons = False
        if u != -1 and l != -1:
            yield [l, u]
            l = -1
    if cons:
        yield [l, i]

def blockStorageRewrites(blockCount, writes, threshold):
    return list(get_bad_blocks(blockCount, writes, threshold))
Questions
You can review any and all, but preferably I'm looking for answers that:
- Tell me if my first example is readable
- I'm less concerned about readability in my second try, and more interested in speed
- Please ignore the PEP8 naming violations, as this is an issue with the programming challenge site
python python-3.x programming-challenge time-limit-exceeded
I don't think it's possible to give an "answer" addressing the performance issue — your current algorithm is unsalvageable. Consider what would happen in a realistic scenario where blockCount = 1000000. The very first thing you'd do is loop for i in range(1000000)! That cannot possibly be part of a valid solution. You need to come up with an algorithm that doesn't take O(blockCount) time.
– Quuxplusone
Mar 13 at 15:47
The comment "should be faster compared to len(list(g))" makes me think you should be doing some performance tests :)
– Peilonrayz
Mar 13 at 15:51
@Quuxplusone probably, but I fail to see how this can be done without looping over the entire blockcounts... hence the asking for a review.
– Ludisposed
Mar 13 at 15:52
@Peilonrayz I have tested that locally, it should be faster because it doesn't have to construct the list as far as I can tell
– Ludisposed
Mar 13 at 15:53
I don't have time to check now, but my guess is that you would want to use a (default)dict with only those blocks that are encountered in writes. If it is a starting block, add +1; if it's the end block, subtract 1. Then go over the values with itertools.accumulate to get the numbers of writes between blocks.
– Georgy
Mar 13 at 16:10
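Georgy's idea can be sketched as a difference array over all blocks, a simpler array-based variant of the same prefix-sum trick (a hypothetical sketch, not submitted to the challenge site): each write costs O(1), and one accumulate pass recovers the per-block counts.

```python
from itertools import accumulate, groupby

def blockStorageRewrites(blockCount, writes, threshold):
    # Difference array: +1 where a write starts, -1 just past where it ends.
    diff = [0] * (blockCount + 1)
    for lower, upper in writes:
        diff[lower] += 1
        diff[upper + 1] -= 1
    # A running sum turns the deltas back into per-block write counts.
    num_writes = list(accumulate(diff))[:blockCount]
    # Group maximal runs of blocks at or above the threshold.
    segments, i = [], 0
    for key, group in groupby(num_writes, lambda x: x >= threshold):
        length = sum(1 for _ in group)
        if key:
            segments.append([i, i + length - 1])
        i += length
    return segments
```

This makes the counting step O(blockCount + len(writes)) instead of O(blockCount × total write width).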
asked Mar 13 at 15:33 – Ludisposed
edited Mar 13 at 16:09
1 Answer
First off, when you're optimizing code you need to know what to optimize. At first I thought the problem was not the groupby, but instead the creation of num_writes. And so I changed your code to be able to measure its performance.
import cProfile
import random
from itertools import groupby

def group_blocks(num_writes, threshold):
    i = 0
    for key, group in groupby(num_writes, lambda x: x >= threshold):
        # This is faster compared to len(list(group))
        length = sum(1 for _ in group)
        if key:
            yield [i, i + length - 1]
        i += length

def build_writes(block_count, writes):
    num_writes = [0] * block_count
    for lower, upper in writes:
        for i in range(lower, upper + 1):
            num_writes[i] += 1
    return num_writes

def blockStorageRewrites(blockCount, writes, threshold):
    num_writes = build_writes(blockCount, writes)
    return list(group_blocks(num_writes, threshold))

block_count = 10**5
writes = []
for _ in range(10**4):
    a = random.randrange(0, block_count)
    b = random.randrange(a, block_count)
    writes.append([a, b])

cProfile.run('blockStorageRewrites(block_count, writes, 10**4)')
Resulting in the following profile:
200008 function calls in 25.377 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 25.377 25.377 <string>:1(<module>)
100001 0.019 0.000 0.025 0.000 lus.py:10(<genexpr>)
1 25.342 25.342 25.342 25.342 lus.py:16(build_writes)
1 0.000 0.000 25.375 25.375 lus.py:24(blockStorageRewrites)
1 0.000 0.000 0.033 0.033 lus.py:6(group_blocks)
100000 0.007 0.000 0.007 0.000 lus.py:8(<lambda>)
1 0.000 0.000 25.377 25.377 built-in method builtins.exec
1 0.007 0.007 0.033 0.033 built-in method builtins.sum
1 0.000 0.000 0.000 0.000 method 'disable' of '_lsprof.Profiler' objects
Changing the code as per Georgy's comment to:
from itertools import accumulate

def build_writes(block_count, writes):
    num_writes = dict(enumerate([0] * block_count))
    for lower, upper in writes:
        num_writes[lower] += 1
        num_writes[upper] -= 1
    return list(accumulate(num_writes))
Gets the following profile, which is orders of magnitude faster:
200011 function calls in 0.066 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 0.066 0.066 <string>:1(<module>)
100002 0.021 0.000 0.028 0.000 lus.py:10(<genexpr>)
1 0.025 0.025 0.025 0.025 lus.py:16(build_writes)
1 0.003 0.003 0.064 0.064 lus.py:24(blockStorageRewrites)
2 0.000 0.000 0.036 0.018 lus.py:6(group_blocks)
100000 0.008 0.000 0.008 0.000 lus.py:8(<lambda>)
1 0.000 0.000 0.066 0.066 built-in method builtins.exec
2 0.008 0.004 0.036 0.018 built-in method builtins.sum
1 0.000 0.000 0.000 0.000 method 'disable' of '_lsprof.Profiler' objects
The return list(accumulate(num_writes)) line should be return list(accumulate(num_writes.values())), else I'd be accumulating the keys. Other than that, this is a great teaching moment for me. I should know what the bottlenecks are before optimizing :)
– Ludisposed
Mar 14 at 8:13
And I did have to do (block_count + 1) and num_writes[upper + 1] -= 1 to make the upper bound work correctly.
– Ludisposed
Mar 14 at 8:17
@Ludisposed Ah oops :( My main aim was to show how to go about improving performance, so I didn't test at all.
– Peilonrayz
Mar 14 at 14:06
No problem; the "You should have profiled the code before starting to optimize" point was (for me) the important part of the answer. I do wonder: did you find any other mistakes or simplifications, or did you just quit after profiling?
– Ludisposed
Mar 14 at 14:14
@Ludisposed I quit after profiling, but the first code looks good otherwise. If there wasn't a TLE then I'd say the original code is good and say to stop there. If there's another TLE then I'd look into it more, but I think 25 seconds -> 0.5 would mean there isn't one.
– Peilonrayz
Mar 14 at 14:19
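Putting the two corrections reported in the comments above together (accumulating over .values() rather than the dict keys, and sizing the structure to block_count + 1 so upper + 1 always has a slot), the fixed helper might read:

```python
from itertools import accumulate

def build_writes(block_count, writes):
    # One extra slot so the upper + 1 delta never falls off the end.
    num_writes = dict(enumerate([0] * (block_count + 1)))
    for lower, upper in writes:
        num_writes[lower] += 1
        num_writes[upper + 1] -= 1
    # Accumulate the *values* (insertion-ordered), then trim the sentinel.
    return list(accumulate(num_writes.values()))[:block_count]

# First worked example: per-block counts for writes [[0,4],[3,5],[2,6]]
assert build_writes(10, [[0, 4], [3, 5], [2, 6]]) == [1, 1, 2, 3, 3, 2, 1, 0, 0, 0]
```

(A plain list of length block_count + 1 would do the same job; the dict form just follows the shape of the answer's code.)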
|
show 2 more comments
Your Answer
StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215355%2fblock-storage-rewrites%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
First off when you're optimizing code you need to know what to optimize. At first I thought the problem code was not the groupby
, but instead the creation of num_writes
. And so I changed your code to be able to find the performance of it.
import cProfile
import random
from itertools import groupby
def group_blocks(num_writes, threshold):
i = 0
for key, group in groupby(num_writes, lambda x : x >= threshold):
# This is faster compared to len(list(g))
length = sum(1 for _ in group)
if key:
yield [i, i + length - 1]
i += length
def build_writes(block_count, writes):
num_writes = [0] * block_count
for lower, upper in writes:
for i in range(lower, upper + 1):
num_writes[i] += 1
return num_writes
def blockStorageRewrites(blockCount, writes, threshold):
num_writes = build_writes(blockCount, writes)
return list(group_blocks(num_writes, threshold))
block_count = 10**5
writes = []
for _ in range(10**4):
a = random.randrange(0, block_count)
b = random.randrange(a, block_count)
writes.append([a, b])
cProfile.run('blockStorageRewrites(block_count, writes, 10**4)')
Resulting in the following profile:
200008 function calls in 25.377 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
$begingroup$
First off when you're optimizing code you need to know what to optimize. At first I thought the problem code was not the groupby
, but instead the creation of num_writes
. And so I changed your code to be able to find the performance of it.
import cProfile
import random
from itertools import groupby

def group_blocks(num_writes, threshold):
    i = 0
    for key, group in groupby(num_writes, lambda x: x >= threshold):
        # This is faster compared to len(list(group))
        length = sum(1 for _ in group)
        if key:
            yield [i, i + length - 1]
        i += length

def build_writes(block_count, writes):
    num_writes = [0] * block_count
    for lower, upper in writes:
        for i in range(lower, upper + 1):
            num_writes[i] += 1
    return num_writes

def blockStorageRewrites(blockCount, writes, threshold):
    num_writes = build_writes(blockCount, writes)
    return list(group_blocks(num_writes, threshold))

block_count = 10**5
writes = []
for _ in range(10**4):
    a = random.randrange(0, block_count)
    b = random.randrange(a, block_count)
    writes.append([a, b])

cProfile.run('blockStorageRewrites(block_count, writes, 10**4)')
Resulting in the following profile:
200008 function calls in 25.377 seconds

Ordered by: standard name

 ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
      1    0.002    0.002   25.377   25.377  <string>:1(<module>)
 100001    0.019    0.000    0.025    0.000  lus.py:10(<genexpr>)
      1   25.342   25.342   25.342   25.342  lus.py:16(build_writes)
      1    0.000    0.000   25.375   25.375  lus.py:24(blockStorageRewrites)
      1    0.000    0.000    0.033    0.033  lus.py:6(group_blocks)
 100000    0.007    0.000    0.007    0.000  lus.py:8(<lambda>)
      1    0.000    0.000   25.377   25.377  {built-in method builtins.exec}
      1    0.007    0.007    0.033    0.033  {built-in method builtins.sum}
      1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
Changing the code as per Georgy's comment to:
from itertools import accumulate

def build_writes(block_count, writes):
    num_writes = dict(enumerate([0] * block_count))
    for lower, upper in writes:
        num_writes[lower] += 1
        num_writes[upper] -= 1
    return list(accumulate(num_writes))
Gets the following profile, which is orders of magnitude faster:
200011 function calls in 0.066 seconds

Ordered by: standard name

 ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
      1    0.002    0.002    0.066    0.066  <string>:1(<module>)
 100002    0.021    0.000    0.028    0.000  lus.py:10(<genexpr>)
      1    0.025    0.025    0.025    0.025  lus.py:16(build_writes)
      1    0.003    0.003    0.064    0.064  lus.py:24(blockStorageRewrites)
      2    0.000    0.000    0.036    0.018  lus.py:6(group_blocks)
 100000    0.008    0.000    0.008    0.000  lus.py:8(<lambda>)
      1    0.000    0.000    0.066    0.066  {built-in method builtins.exec}
      2    0.008    0.004    0.036    0.018  {built-in method builtins.sum}
      1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
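For completeness: the comments point out two fixes this snippet still needs (accumulate over the values rather than the dict's keys, and shift the decrement to upper + 1). A corrected sketch of the same difference-array idea, using a plain list instead of a dict, might look like this (verified on a small example, but not the exact code from the thread):

```python
from itertools import accumulate, groupby

def build_writes(block_count, writes):
    # Difference array: +1 where a write starts, -1 just past where it ends.
    # The extra slot means deltas[upper + 1] never falls out of range.
    deltas = [0] * (block_count + 1)
    for lower, upper in writes:
        deltas[lower] += 1
        deltas[upper + 1] -= 1
    # A running total turns the deltas back into per-block write counts.
    return list(accumulate(deltas))[:block_count]

def group_blocks(num_writes, threshold):
    i = 0
    for key, group in groupby(num_writes, lambda x: x >= threshold):
        length = sum(1 for _ in group)
        if key:
            yield [i, i + length - 1]
        i += length

def blockStorageRewrites(block_count, writes, threshold):
    return list(group_blocks(build_writes(block_count, writes), threshold))
```

For example, with writes [[0, 2], [1, 3], [5, 5]] and threshold 2, only blocks 1 and 2 are written at least twice, so this returns [[1, 2]].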
answered Mar 13 at 16:17
– Peilonrayz
The return list(accumulate(num_writes)) line should be return list(accumulate(num_writes.values())), else I'd be accumulating the keys. Other than that, this is a great teaching moment for me. I should know where the bottlenecks are before optimizing :)
– Ludisposed, Mar 14 at 8:13
And I did have to do (block_count + 1) and num_writes[upper + 1] -= 1 to make the upper bound work correctly.
– Ludisposed, Mar 14 at 8:17
@Ludisposed Ah oops :( My main aim was to show how to go about improving performance, so I didn't test at all.
– Peilonrayz, Mar 14 at 14:06
No problem; the "you should have profiled the code before starting to optimize" lesson was (for me) the important part of the answer. I do wonder: did you find any other mistakes or simplifications, or did you just stop after profiling?
– Ludisposed, Mar 14 at 14:14
@Ludisposed I quit after profiling, but the first code looks good otherwise. If there wasn't a TLE then I'd say the original code is good and stop there. If there's another TLE then I'd look into it more, but I think going from 25 seconds to 0.5 means there isn't one.
– Peilonrayz, Mar 14 at 14:19
I don't think it's possible to give an "answer" addressing the performance issue: your current algorithm is unsalvageable. Consider what would happen in a realistic scenario where blockCount = 1000000. The very first thing you'd do is loop for i in range(1000000)! That cannot possibly be part of a valid solution. You need to come up with an algorithm that doesn't take O(blockCount) time.
– Quuxplusone, Mar 13 at 15:47
The comment "should be faster compared to len(list(g))" makes me think you should be doing some performance tests :)
– Peilonrayz, Mar 13 at 15:51
@Quuxplusone Probably, but I fail to see how this can be done without looping over the entire block count... hence asking for a review.
– Ludisposed, Mar 13 at 15:52
@Peilonrayz I have tested that locally; it should be faster because it doesn't have to construct the list, as far as I can tell.
– Ludisposed, Mar 13 at 15:53
I don't have time to check now, but my guess is that you would want to use a (default)dict with only those blocks that are encountered in writes. If it is a starting block, add +1; if it's the end block, subtract 1. Then go over the values with itertools.accumulate to get the numbers of writes between blocks.
– Georgy, Mar 13 at 16:10