knossos-ksc

Commit graph

Author	SHA1	Message	Date
Andrew Fitzgibbon	8fa75e67c0	More documentation (#1059) * Improve intro, add MNIST * Generate relu.cpp * Add nnModule, mnist, Lambda	2021-09-09 12:17:04 +01:00
Andrew Fitzgibbon	a4e2d73c15	Fix sqrl compile failure (#1057) * Ensure script fails if one run fails * Fix convert_to_ks per #1048	2021-09-07 09:05:23 +01:00
toelli-msft	bf4c417c5f	sqrl upper bounds (#968) Co-authored-by: Tom Ellis <tom-git@jaguarpaw.co.uk> Co-authored-by: Andrew Fitzgibbon <awf@microsoft.com>	2021-09-06 19:04:44 +01:00
Andrew Fitzgibbon	9a87228a54	Rename, add comments, per #1023 (#1048)	2021-09-02 15:22:07 +01:00
Andrew Fitzgibbon	c482f9c6b4	Doc: Add "under construction" notice	2021-09-01 20:47:02 +01:00
David Collier	0327e1a4ae	Correct order of arguments in vrelu3 aten implementation (#1034) > The bug is not found by CI because we have some benchmarks which are _expected_ to give incorrect results (the `embedded_INCORRECT` benchmarks). Maybe we should disable those benchmarks in CI so that we can fail the build if another example breaks? Made #1045	2021-09-01 10:11:14 +01:00
Andrew Fitzgibbon	a79b395997	Implement vmap (#1019)	2021-08-27 22:04:37 +01:00
Andrew Fitzgibbon	a26a38f67d	WIP Documentation tidies (#1012) * Update AST cpp * Move glossary into sphinx * rm cg-todos.txt * Restructure doc folders	2021-08-24 13:44:26 +01:00
David Collier	2d00e379c6	Fix PyTorch implementation of sqrl (#1027)	2021-08-24 11:17:53 +01:00
Tom Ellis	150f3d3c73	Remove TupledAD	2021-08-18 14:50:08 +01:00
Andrew Fitzgibbon	c529d57017	Initial plumbing for vmap (#1010) https://github.com/microsoft/knossos-ksc/pull/1010	2021-08-17 14:34:55 +01:00
David Collier	3310934e34	Rename revmap to map2 (#1016)	2021-08-17 09:08:26 +01:00
Tom Ellis	d222e25ce7	Float sqrt(2.0) outside loop Doesn't actually speed things up	2021-08-11 12:51:04 +01:00
Tom Ellis	adf0f22e0f	Add vgelu_embedded_cpp_aten	2021-08-11 12:51:04 +01:00
David Collier	57f32e9f4a	Add #includes to generated C++ (#1000)	2021-08-06 17:02:04 +01:00
David Collier	7f05afab62	Remove special case for ATen benchmark (#1001)	2021-08-06 16:58:59 +01:00
David Collier	146aeced61	Remove call to renamed function (#989)	2021-07-28 17:59:09 +01:00
Andrew Fitzgibbon	ddd91c4a2a	vmap interface to elementwise ops (#970)	2021-07-28 16:59:30 +01:00
Tom Ellis	937d4ea8c3	Use sum in sqrl instead of mean in a place where the two have the same effect but sum is likely to be faster. There is a lot of noise in the benchmarks so it's not completely obvious it is faster. Fixes https://github.com/microsoft/knossos-ksc/issues/967	2021-07-26 10:01:58 +01:00
Andrew Fitzgibbon	6cff51b0a5	Knossos.register, and delayed compilation (#960)	2021-07-24 13:38:12 +01:00
Andrew Fitzgibbon	0daf53e905	Clean pytorch_nice doc (#971)	2021-07-24 13:15:23 +01:00
Tom Ellis	e79ac1db9f	Embedded C++ gelu benchmarks	2021-07-23 10:32:02 +01:00
Tom Ellis	0ab1a292db	Allow to set compiler flags in benchmarks and extend relu3	2021-07-21 10:42:14 +01:00
Andrew Fitzgibbon	a498df8a59	Move pytorch examples higher (#957) PyTorch examples were hidden below the optimized examples	2021-07-19 22:21:21 +01:00
David Collier	1305e89be8	Use torch::tensor instead of ks::tensor in entry points (#931)	2021-07-16 14:13:37 +01:00
David Collier	82aea2c764	Generate C++ entry points (#925)	2021-07-15 10:54:08 +01:00
David Collier	e232149ee1	Rename module entry points (#947)	2021-07-13 15:57:56 +01:00
Andrew Fitzgibbon	f179c8e485	Code health: torch.autograd Function, torch zeros, jax version (#945)	2021-07-13 15:22:16 +01:00
Andrew Fitzgibbon	90012065ec	Compile changed files only (#944) When we make a new extension, emit the C++ file only if it has changed. Ninja will then not rebuild the extension. Gives a 7x speedup on full tests including benchmark, and a 19x speedup on regular tests Sends generated files to build/torch_extensions	2021-07-13 12:35:14 +01:00
Tom Ellis	e115c770cf	relu3 upper bounds	2021-07-12 14:27:11 +01:00
David Collier	76afa818d3	Move common parts of elementwise kernel into new header	2021-07-09 17:11:59 +01:00
David Collier	4f043d7343	Make boilerplate for elementwise operations reusable	2021-07-09 17:11:59 +01:00
David Collier	224041d409	Only support tensor elements of type ks_float	2021-07-09 17:11:59 +01:00
David Collier	940532dc50	Move checking of arguments into .cu file	2021-07-09 17:11:59 +01:00
Tom Ellis	2efb8dd94b	Change embedding prefix to support embedding C++ in the future Instead of calling the embedded ks function `vrelu3_ks_embeded_...` it will be `vrelu3_embedded_ks_...`. This will allow us to add `vrelu3_embedded_cpp...`.	2021-06-22 16:38:19 +01:00
Colin Gravill	bafd9c1dea	Use internal PyTorch vmap for vrelu3 (#881) * Use internal vmap implementation * Adjust how candidates functions are filtered Wrappers values will sometimes show internal function name. Better to filter over the identifier * Comment vrelu3_pytorch_nice for now Uses functionality currently unsupported by vmap "RuntimeError: Batching rule not implemented for aten::is_nonzero. We could not generate a fallback." * Add note about vmap stablity * More explicit call, discussion: https://github.com/microsoft/knossos-ksc/pull/881#discussion_r654523527	2021-06-21 16:14:04 +02:00
Tom Ellis	205daf422a	Support "ks_embedded" benchmarks and add two of them	2021-06-15 18:59:10 +01:00
Tom Ellis	a7cd2db5cf	Complete vgelu example	2021-06-15 17:59:45 +01:00
Tom Ellis	1b48afacfe	Use explicit floating point literal Otherwise it is sent to ksc as an Integer	2021-06-15 17:58:46 +01:00
David Collier	14a1563a81	Comment out largest sqrl example due to OOM issue	2021-06-11 09:40:25 +01:00
Colin Gravill	fe0771d745	Allow pytest benchmark to measure with or without CUDA transfer times (#854) Allow benchmarking with and without CUDA transfers Move device responsibility out of example	2021-06-07 09:28:51 +01:00
David Collier	663a5c07af	Alternative version of ts2mod based on recursively compiling needed functions (#838) Co-authored-by: Andrew Fitzgibbon <awf@microsoft.com>	2021-06-03 16:21:31 +01:00
David Collier	f20f21813a	Support tensor rank either 1 or 2 in CUDA vrelu3 (#815)	2021-05-24 15:24:33 +01:00
David Collier	7392c718ae	Add handwritten CUDA vrelu3 to benchmarks (#785)	2021-05-24 12:00:46 +01:00
Colin Gravill	edb05342ad	Assorted linting fixes (#790) * Flake8 linting fixes * Spelling fixes * Add imports * Remove unused imports * Apply autoflake, unused import mainly, but few redundant pass statements as well	2021-05-20 13:31:16 +01:00
Colin Gravill	35b9910851	Remove Black line length override (#774) * Test removing Black line length override i.e. make it 88	2021-05-12 15:10:41 +01:00
Colin Gravill	2f20004af4	Apply Black formatting (#757) * Apply Black formatting * Ensure Black formatting on PR	2021-05-06 16:21:12 +01:00
Andrew Fitzgibbon	387c9228cb	KSC Benchmarking (#645) For each function foo defined in examples/folder/myeg.py, define: foo: Torchscript implementation foo_pt: PyTorch reference implementation, as fast as possible, possibly ugly foo_bench_configs: return a list of arguments at which to benchmark the function Then call run-bench myeg foo Some steps towards #725 Co-authored-by: Colin Gravill <colin@gravill.com>	2021-04-22 16:09:56 +01:00
Tom Ellis	32d3dc7980	Change edefs to take a single type rather than a list of types (#685) Now that we are one-arg we don't need the latter Related to https://github.com/microsoft/knossos-ksc/issues/681 Co-authored-by: Tom Ellis <tom-git@jaguarpaw.co.uk> Co-authored-by: Andrew Fitzgibbon <awf@microsoft.com>	2021-04-01 10:12:10 +01:00
David Collier	00e33b8b2e	Add hand-optimized ks for GMM using buildFromSparse (#600)	2021-03-09 11:58:45 +00:00


79 commits