specklepy/example/many_children.py

64 lines
1.6 KiB
Python

import os
import random
import string
import time
from pathlib import Path
from typing import List
from specklepy.api import operations
from specklepy.objects import Base
from specklepy.transports.sqlite import SQLiteTransport


class Sub(Base):
    """Small child object attached many times to a single parent."""

    bar: List[str]


def random_string():
    """Return ten random lowercase letters so each child serializes differently."""
    letters = string.ascii_lowercase
    return "".join(random.choice(letters) for _ in range(10))


BASE_PATH = SQLiteTransport.get_base_path("Speckle")


def clean_db():
    """Delete the local object cache so the next send starts from an empty DB."""
    db_path = Path(BASE_PATH, "Objects.db")
    # Guard against a missing file (e.g. the very first run on a fresh machine).
    if db_path.exists():
        os.remove(db_path)


def one_pass(clean: bool, randomize: bool, child_count: int):
    """Send and receive one parent with `child_count` detached children.

    Returns the elapsed (send_time, receive_time) in seconds.
    """
    foo = Base()
    for i in range(child_count):
        stuff = random_string() if randomize else "stuff"
        # The "@" prefix marks the child as detached.
        foo[f"@child_{i}"] = Sub(bar=["asdf", "bar", i, stuff])

    if clean:
        clean_db()
    transport = SQLiteTransport()

    start = time.perf_counter()
    obj_id = operations.send(base=foo, transports=[transport])
    send_time = time.perf_counter() - start

    receive_start = time.perf_counter()
    operations.receive(obj_id, transport)
    receive_time = time.perf_counter() - receive_start

    return send_time, receive_time


if __name__ == "__main__":
    sample_size = 4
    # (clean, randomize) combinations to benchmark.
    test_permutations = [
        (True, True),
        (False, False),
        (False, True),
        (True, False),
    ]
    for clean, randomize in test_permutations:
        print(f"CLEAN: {clean}, RANDOMIZE: {randomize}")
        for child_count in [10, 100, 1000, 10000]:
            print(f"\tCHILD COUNT: {child_count}")
            for _ in range(sample_size):
                send_time, receive_time = one_pass(clean, randomize, child_count)
                print(f"\t\tSend: {send_time} Receive: {receive_time}")
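
The script prints every sample individually, which makes runs hard to compare at a glance. As a rough sketch only, the `(send_time, receive_time)` pairs returned by `one_pass` could be collected per child count and reduced to a mean and median. The `summarize` helper below is hypothetical and not part of specklepy; it uses only the standard library.

```python
import statistics
from typing import Dict, List, Tuple


def summarize(samples: List[Tuple[float, float]]) -> Dict[str, float]:
    """Reduce (send_time, receive_time) pairs to mean and median values.

    Hypothetical helper for illustration; not part of specklepy.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # Split the pairs into two flat lists, one per measured operation.
    send_times = [send for send, _ in samples]
    receive_times = [receive for _, receive in samples]
    return {
        "send_mean": statistics.mean(send_times),
        "send_median": statistics.median(send_times),
        "receive_mean": statistics.mean(receive_times),
        "receive_median": statistics.median(receive_times),
    }
```

In the benchmark loop, one would presumably append each `one_pass(...)` result to a list and call `summarize` once per child count instead of printing every sample.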